* [PATCH 00/36] Arm cache coloring
@ 2022-03-04 17:46 Marco Solieri
  2022-03-04 17:46 ` [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules" Marco Solieri
                   ` (35 more replies)
  0 siblings, 36 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni

Shared caches in multi-core CPU architectures represent a problem for
predictability of memory access latency.  This jeopardizes applicability
of many Arm platforms in real-time critical and mixed-criticality
scenarios.  We introduce support for cache partitioning with page
coloring, a transparent software technique that enables isolation
between domains and Xen, and thus avoids cache interference.

When creating a domain, a simple syntax (e.g. `0-3` or `4-11`) allows
the user to assign cache partition IDs, called colors, to it.
Assigning disjoint sets of colors to two domains guarantees that they
never evict each other's cache lines.  The assignment instructs the Xen
memory allocator to provide the i-th color assignee only with pages
that map to color i, i.e. that are indexed in the i-th cache partition.
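
As a rough illustration of what "a page maps to color i" means, the
sketch below derives a color from a physical address.  It is not code
from this series: page_color() is a hypothetical helper, and it assumes
4 KiB pages and a power-of-two number of colors.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: color of the page that contains paddr.
 * The color is taken from the address bits just above the page offset.
 */
static inline unsigned int page_color(uint64_t paddr, unsigned int num_colors)
{
    /* The masking trick below only works for a power-of-two color count. */
    assert(num_colors != 0 && (num_colors & (num_colors - 1)) == 0);

    return (unsigned int)((paddr >> PAGE_SHIFT) & (num_colors - 1));
}
```

With 16 colors, consecutive pages cycle through colors 0..15, and the
pattern repeats every 64 KiB, so pages 0x3000 and 0x13000 share color 3.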

The proposed implementation supports the dom0less experimental feature.
The solution has been tested in several scenarios, including Xilinx Zynq
MPSoCs.


Overview of implementation and commits structure
------------------------------------------------

- Coloring support is added for dom0 and domU by defining the core
  logic, as well as the hardware inspection functionalities used for
  getting needed coloring information [4-17].
- A new memory page allocator that implements the cache coloring
  mechanism is introduced.  The allocation algorithm follows the given
  coloring scheme specified for each domain, and maximizes contiguity in
  the page selection [18-21].
- Coloring support is added to Xen .text region [22-29], as well as to
  dom0less domains [30].
- Extensive documentation details the technique and gently explains
  usage [33-36].


Known limitations
-----------------

- We need to bring back the relocation feature [1-3] in order to move
  Xen memory to a colored space where the hypervisor can be isolated
  from VM interference.
- When cache coloring is used, static memory assignment is disabled to
  avoid incompatibility. [31]
- Due to an assert failure [32], the number of supported colors is
  currently limited to 64, which should be satisfactory for most chips.
  In particular, the problem lies in the cache coloring configuration
  data structure that belongs to each domain.  We are aware that this is
  not a clean solution, but we hope that it can be discussed and solved
  within this patch series.


Acknowledgements
----------------

This work is sponsored by Xilinx Inc., and supported by University of
Modena and Reggio Emilia and Minerva Systems.

***

Luca Miccio (36):
  Revert "xen/arm: setup: Add Xen as boot module before printing all
    boot modules"
  Revert "xen/arm: mm: Initialize page-tables earlier"
  xen/arm: restore xen_paddr argument in setup_pagetables
  xen/arm: add parsing function for cache coloring configuration
  xen/arm: compute LLC way size by hardware inspection
  xen/arm: add coloring basic initialization
  xen/arm: add coloring data to domains
  xen/arm: add colored flag to page struct
  xen/arch: add default colors selection function
  xen/arch: check color selection function
  xen/include: define hypercall parameter for coloring
  xen/arm: initialize cache coloring data for Dom0/U
  xen/arm: A domain is not direct mapped when coloring is enabled
  xen/arch: add dump coloring info for domains
  tools: add support for cache coloring configuration
  xen/color alloc: implement color_from_page for ARM64
  xen/arm: add get_max_color function
  Alloc: introduce page_list_for_each_reverse
  xen/arch: introduce cache-coloring allocator
  xen/common: introduce buddy required reservation
  xen/common: add colored allocator initialization
  xen/arch: init cache coloring conf for Xen
  xen/arch: coloring: manually calculate Xen physical addresses
  xen/arm: enable consider_modules for coloring
  xen/arm: bring back get_xen_paddr
  xen/arm: add argument to remove_early_mappings
  xen/arch: add coloring support for Xen
  xen/arm: introduce xen_map_text_rw
  xen/arm: add dump function for coloring info
  xen/arm: add coloring support to dom0less
  Disable coloring if static memory support is selected
  xen/arm: reduce the number of supported colors
  doc, xen-command-line: introduce coloring options
  doc, xl.cfg: introduce coloring configuration option
  doc, device-tree: introduce 'colors' property
  doc, arm: add usage documentation for cache coloring support

 docs/man/xl.cfg.5.pod.in              |  14 +
 docs/misc/arm/cache_coloring.rst      | 191 +++++++++++
 docs/misc/arm/device-tree/booting.txt |   3 +
 docs/misc/xen-command-line.pandoc     |  51 ++-
 tools/libs/light/libxl_arm.c          |  11 +
 tools/libs/light/libxl_types.idl      |   1 +
 tools/xl/xl_parse.c                   |  59 +++-
 xen/arch/arm/Kconfig                  |   6 +
 xen/arch/arm/Makefile                 |   2 +-
 xen/arch/arm/alternative.c            |   8 +-
 xen/arch/arm/coloring.c               | 469 ++++++++++++++++++++++++++
 xen/arch/arm/domain.c                 |  56 +++
 xen/arch/arm/domain_build.c           |  42 ++-
 xen/arch/arm/include/asm/coloring.h   |  98 ++++++
 xen/arch/arm/include/asm/mm.h         |  18 +-
 xen/arch/arm/mm.c                     | 245 +++++++++++++-
 xen/arch/arm/psci.c                   |   4 +-
 xen/arch/arm/setup.c                  |  94 +++++-
 xen/arch/arm/smpboot.c                |  19 +-
 xen/common/page_alloc.c               | 321 +++++++++++++++++-
 xen/common/vmap.c                     |   4 +-
 xen/include/public/arch-arm.h         |   8 +
 xen/include/xen/mm.h                  |   7 +
 xen/include/xen/sched.h               |   4 +
 xen/include/xen/vmap.h                |   2 +
 25 files changed, 1689 insertions(+), 48 deletions(-)
 create mode 100644 docs/misc/arm/cache_coloring.rst
 create mode 100644 xen/arch/arm/coloring.c
 create mode 100644 xen/arch/arm/include/asm/coloring.h

-- 
2.30.2




* [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules"
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 18:50   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 02/36] Revert "xen/arm: mm: Initialize page-tables earlier" Marco Solieri
                   ` (34 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

This reverts commit 48fb2a9deba11ee48dde21c5c1aa93b4d4e1043b.
---
 xen/arch/arm/setup.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index d5d0792ed4..c5a556855e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -888,18 +888,18 @@ void __init start_xen(unsigned long boot_phys_offset,
               "Please check your bootloader.\n",
               fdt_paddr);
 
-    /* Register Xen's load address as a boot module. */
-    xen_bootmodule = add_boot_module(BOOTMOD_XEN,
-                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
-                             (paddr_t)(uintptr_t)(_end - _start), false);
-    BUG_ON(!xen_bootmodule);
-
     fdt_size = boot_fdt_info(device_tree_flattened, fdt_paddr);
 
     cmdline = boot_fdt_cmdline(device_tree_flattened);
     printk("Command line: %s\n", cmdline);
     cmdline_parse(cmdline);
 
+    /* Register Xen's load address as a boot module. */
+    xen_bootmodule = add_boot_module(BOOTMOD_XEN,
+                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
+                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
+    BUG_ON(!xen_bootmodule);
+
     setup_mm();
 
     /* Parse the ACPI tables for possible boot-time configuration */
-- 
2.30.2




* [PATCH 02/36] Revert "xen/arm: mm: Initialize page-tables earlier"
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
  2022-03-04 17:46 ` [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules" Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 03/36] xen/arm: restore xen_paddr argument in setup_pagetables Marco Solieri
                   ` (33 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

This reverts commit 3a5d341681af650825bbe3bee9be5d187da35080.

The coloring support will be configurable from the Xen command line,
but it will be initialized before the page-tables.  This is necessary
for coloring the hypervisor itself, because we will create a specific
mapping for it that can be configured using some options.
In order to parse all the needed information from the device tree, we
need to revert the above commit and restore the previous order for
page-tables initialization.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/mm.c    | 11 +++++++++--
 xen/arch/arm/setup.c |  4 ++--
 2 files changed, 11 insertions(+), 4 deletions(-)

diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index b1eae767c2..e6381e46e6 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -551,6 +551,7 @@ static inline lpae_t pte_of_xenaddr(vaddr_t va)
     return mfn_to_xen_entry(maddr_to_mfn(ma), MT_NORMAL);
 }
 
+/* Map the FDT in the early boot page table */
 void * __init early_fdt_map(paddr_t fdt_paddr)
 {
     /* We are using 2MB superpage for mapping the FDT */
@@ -573,7 +574,7 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
     /* The FDT is mapped using 2MB superpage */
     BUILD_BUG_ON(BOOT_FDT_VIRT_START % SZ_2M);
 
-    create_mappings(xen_second, BOOT_FDT_VIRT_START, paddr_to_pfn(base_paddr),
+    create_mappings(boot_second, BOOT_FDT_VIRT_START, paddr_to_pfn(base_paddr),
                     SZ_2M >> PAGE_SHIFT, SZ_2M);
 
     offset = fdt_paddr % SECOND_SIZE;
@@ -588,7 +589,7 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
 
     if ( (offset + size) > SZ_2M )
     {
-        create_mappings(xen_second, BOOT_FDT_VIRT_START + SZ_2M,
+        create_mappings(boot_second, BOOT_FDT_VIRT_START + SZ_2M,
                         paddr_to_pfn(base_paddr + SZ_2M),
                         SZ_2M >> PAGE_SHIFT, SZ_2M);
     }
@@ -699,6 +700,12 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
     pte.pt.table = 1;
     xen_second[second_table_offset(FIXMAP_ADDR(0))] = pte;
 
+    /* ... DTB */
+    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START)];
+    xen_second[second_table_offset(BOOT_FDT_VIRT_START)] = pte;
+    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)];
+    xen_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)] = pte;
+
 #ifdef CONFIG_ARM_64
     ttbr = (uintptr_t) xen_pgtable + phys_offset;
 #else
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index c5a556855e..100b322b3e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -877,8 +877,6 @@ void __init start_xen(unsigned long boot_phys_offset,
     /* Initialize traps early allow us to get backtrace when an error occurred */
     init_traps();
 
-    setup_pagetables(boot_phys_offset);
-
     smp_clear_cpu_maps();
 
     device_tree_flattened = early_fdt_map(fdt_paddr);
@@ -900,6 +898,8 @@ void __init start_xen(unsigned long boot_phys_offset,
                              (paddr_t)(uintptr_t)(_end - _start + 1), false);
     BUG_ON(!xen_bootmodule);
 
+    setup_pagetables(boot_phys_offset);
+
     setup_mm();
 
     /* Parse the ACPI tables for possible boot-time configuration */
-- 
2.30.2




* [PATCH 03/36] xen/arm: restore xen_paddr argument in setup_pagetables
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
  2022-03-04 17:46 ` [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules" Marco Solieri
  2022-03-04 17:46 ` [PATCH 02/36] Revert "xen/arm: mm: Initialize page-tables earlier" Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration Marco Solieri
                   ` (32 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Coloring support will re-enable part of the Xen relocation since the
underlying idea is to "relocate using coloring" for the hypervisor
itself.
We set up a target region that will be used exclusively by Xen and
it will be mapped using the coloring configuration of the hypervisor.
Part of the relocation we need to bring back is the usage of the
xen_paddr variable, which tells us the physical start address where Xen
is located.
Add this variable to the setup_pagetables function and set it properly
when coloring is not enabled.
Later on it will be initialized depending on whether the coloring
support is enabled or not.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/include/asm/mm.h | 2 +-
 xen/arch/arm/mm.c             | 2 +-
 xen/arch/arm/setup.c          | 6 +++---
 3 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 424aaf2823..487be7cf59 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -176,7 +176,7 @@ extern unsigned long total_pages;
 #define PDX_GROUP_SHIFT SECOND_SHIFT
 
 /* Boot-time pagetable setup */
-extern void setup_pagetables(unsigned long boot_phys_offset);
+extern void setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr);
 /* Map FDT in boot pagetable */
 extern void *early_fdt_map(paddr_t fdt_paddr);
 /* Remove early mappings */
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index e6381e46e6..fd7a313d88 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -634,7 +634,7 @@ static void clear_table(void *table)
 
 /* Boot-time pagetable setup.
  * Changes here may need matching changes in head.S */
-void __init setup_pagetables(unsigned long boot_phys_offset)
+void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
 {
     uint64_t ttbr;
     lpae_t pte, *p;
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 100b322b3e..b8d4f50d90 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -867,6 +867,7 @@ void __init start_xen(unsigned long boot_phys_offset,
     struct bootmodule *xen_bootmodule;
     struct domain *d;
     int rc;
+    paddr_t xen_paddr = (paddr_t)(_start + boot_phys_offset);
 
     dcache_line_bytes = read_dcache_line_bytes();
 
@@ -893,12 +894,11 @@ void __init start_xen(unsigned long boot_phys_offset,
     cmdline_parse(cmdline);
 
     /* Register Xen's load address as a boot module. */
-    xen_bootmodule = add_boot_module(BOOTMOD_XEN,
-                             (paddr_t)(uintptr_t)(_start + boot_phys_offset),
+    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
                              (paddr_t)(uintptr_t)(_end - _start + 1), false);
     BUG_ON(!xen_bootmodule);
 
-    setup_pagetables(boot_phys_offset);
+    setup_pagetables(boot_phys_offset, xen_paddr);
 
     setup_mm();
 
-- 
2.30.2




* [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (2 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 03/36] xen/arm: restore xen_paddr argument in setup_pagetables Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 19:09   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection Marco Solieri
                   ` (31 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Add three new bootargs allowing configuration of cache coloring support
for Xen:
- way_size: The size of an LLC way in bytes. This value is mainly used
  to calculate the maximum available colors on the platform.
- dom0_colors: The coloring configuration for Dom0, which also acts as
  default configuration for any DomU without an explicit configuration.
- xen_colors: The coloring configuration for the Xen hypervisor itself.

A cache coloring configuration consists of a selection of colors to be
assigned to a VM or to the hypervisor. It is represented by a set of
ranges. Add a common function that parses a string with a
comma-separated set of hyphen-separated ranges like "0-7,15-16" and
returns both the number of chosen colors and a bitmask with one bit set
per chosen color.
Currently we support platforms with up to 128 colors.
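
For readers who want to try the range syntax outside of Xen, here is a
minimal user-space sketch of such a parser.  It is an illustration
only, not the code added by this patch: parse_colors() is a
hypothetical name, it uses the C library strtoul() instead of Xen's
simple_strtoul(), and it is slightly stricter than the patch (it
rejects reversed ranges and colors above the 128-color limit up front).

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COLORS_CELLS 4
#define MAX_COLORS (32 * MAX_COLORS_CELLS)

/*
 * Hypothetical user-space version of the range parser.
 * Returns 0 on success, -1 on malformed input or out-of-range colors.
 */
static int parse_colors(const char *buf, uint32_t mask[MAX_COLORS_CELLS],
                        unsigned int *num)
{
    const char *s = buf;

    assert(buf && mask && num);
    memset(mask, 0, MAX_COLORS_CELLS * sizeof(mask[0]));
    *num = 0;

    while ( *s != '\0' )
    {
        char *next;
        unsigned long start, end, i;

        start = strtoul(s, &next, 0);
        if ( next == s || *next != '-' )
            return -1;                 /* missing number or hyphen */
        s = next + 1;

        end = strtoul(s, &next, 0);
        if ( next == s || start > end || end >= MAX_COLORS )
            return -1;                 /* missing number or bad range */
        s = next;

        for ( i = start; i <= end; i++ )
        {
            uint32_t bit = UINT32_C(1) << (i % 32);

            if ( !(mask[i / 32] & bit) )
                (*num)++;              /* count each color only once */
            mask[i / 32] |= bit;
        }

        if ( *s == ',' )
            s++;                       /* move on to the next range */
        else if ( *s != '\0' )
            return -1;                 /* trailing garbage */
    }

    return 0;
}
```

Parsing "2-6,15-16" with this sketch yields 7 colors and a first mask
cell of 0x1807C; overlapping ranges such as "0-3,2-5" count each color
once and yield 6 colors.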

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/Kconfig                |   5 ++
 xen/arch/arm/Makefile               |   2 +-
 xen/arch/arm/coloring.c             | 131 ++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/coloring.h |  28 ++++++
 4 files changed, 165 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/coloring.c
 create mode 100644 xen/arch/arm/include/asm/coloring.h

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index ecfa6822e4..f0f999d172 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -97,6 +97,11 @@ config HARDEN_BRANCH_PREDICTOR
 
 	  If unsure, say Y.
 
+config COLORING
+	bool "L2 cache coloring"
+	default n
+	depends on ARM_64
+
 config TEE
 	bool "Enable TEE mediators support (UNSUPPORTED)" if UNSUPPORTED
 	default n
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index c993ce72a3..581896a528 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -66,7 +66,7 @@ obj-$(CONFIG_SBSA_VUART_CONSOLE) += vpl011.o
 obj-y += vsmc.o
 obj-y += vpsci.o
 obj-y += vuart.o
-
+obj-$(CONFIG_COLORING) += coloring.o
 extra-y += xen.lds
 
 #obj-bin-y += ....o
diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
new file mode 100644
index 0000000000..8f1cff6efb
--- /dev/null
+++ b/xen/arch/arm/coloring.c
@@ -0,0 +1,131 @@
+/*
+ * xen/arch/arm/coloring.c
+ *
+ * Coloring support for ARM
+ *
+ * Copyright (C) 2019 Xilinx Inc.
+ *
+ * Authors:
+ *    Luca Miccio <lucmiccio@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <xen/init.h>
+#include <xen/types.h>
+#include <xen/lib.h>
+#include <xen/errno.h>
+#include <xen/param.h>
+#include <asm/coloring.h>
+
+/* Number of color(s) assigned to Xen */
+static uint32_t xen_col_num;
+/* Coloring configuration of Xen as bitmask */
+static uint32_t xen_col_mask[MAX_COLORS_CELLS];
+
+/* Number of color(s) assigned to Dom0 */
+static uint32_t dom0_col_num;
+/* Coloring configuration of Dom0 as bitmask */
+static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
+
+static uint64_t way_size;
+
+/*************************
+ * PARSING COLORING BOOTARGS
+ */
+
+/*
+ * Parse the coloring configuration given in the buf string, following the
+ * syntax below, and store the number of colors and a corresponding mask in
+ * the last two given pointers.
+ *
+ * COLOR_CONFIGURATION ::= RANGE,...,RANGE
+ * RANGE               ::= COLOR-COLOR
+ *
+ * Example: "2-6,15-16" represents the set of colors: 2,3,4,5,6,15,16.
+ */
+static int parse_color_config(
+    const char *buf, uint32_t *col_mask, uint32_t *col_num)
+{
+    int start, end, i;
+    const char* s = buf;
+    unsigned int offset;
+
+    if ( !col_mask || !col_num )
+        return -EINVAL;
+
+    *col_num = 0;
+    for ( i = 0; i < MAX_COLORS_CELLS; i++ )
+        col_mask[i] = 0;
+
+    while ( *s != '\0' )
+    {
+        if ( *s != ',' )
+        {
+            start = simple_strtoul(s, &s, 0);
+
+            /* Ranges are hyphen-separated */
+            if ( *s != '-' )
+                goto fail;
+            s++;
+
+            end = simple_strtoul(s, &s, 0);
+
+            for ( i = start; i <= end; i++ )
+            {
+                offset = i / 32;
+                if ( offset >= MAX_COLORS_CELLS )
+                    goto fail;
+
+                if ( !(col_mask[offset] & (1U << (i % 32))) )
+                    *col_num += 1;
+                col_mask[offset] |= (1U << (i % 32));
+            }
+        }
+        else
+            s++;
+    }
+
+    return *s ? -EINVAL : 0;
+fail:
+    return -EINVAL;
+}
+
+static int __init parse_way_size(const char *s)
+{
+    way_size = simple_strtoull(s, &s, 0);
+
+    return *s ? -EINVAL : 0;
+}
+custom_param("way_size", parse_way_size);
+
+static int __init parse_dom0_colors(const char *s)
+{
+    return parse_color_config(s, dom0_col_mask, &dom0_col_num);
+}
+custom_param("dom0_colors", parse_dom0_colors);
+
+static int __init parse_xen_colors(const char *s)
+{
+    return parse_color_config(s, xen_col_mask, &xen_col_num);
+}
+custom_param("xen_colors", parse_xen_colors);
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
new file mode 100644
index 0000000000..60958d1244
--- /dev/null
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -0,0 +1,28 @@
+/*
+ * xen/arm/include/asm/coloring.h
+ *
+ * Coloring support for ARM
+ *
+ * Copyright (C) 2019 Xilinx Inc.
+ *
+ * Authors:
+ *    Luca Miccio <lucmiccio@gmail.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_ARM_COLORING_H__
+#define __ASM_ARM_COLORING_H__
+
+#define MAX_COLORS_CELLS 4
+
+#endif /* !__ASM_ARM_COLORING_H__ */
-- 
2.30.2




* [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (3 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 20:12   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 06/36] xen/arm: add coloring basic initialization Marco Solieri
                   ` (30 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

The size of the LLC way is a crucial parameter for the cache coloring
support, since it determines the maximum number of available colors on
the platform.  This parameter can currently be retrieved only from
the way_size bootarg, which is prone to misconfiguration that nullifies
the coloring mechanism and breaks cache isolation.

Add an alternative and safer method to retrieve the way size by
directly asking the hardware, namely using the CCSIDR_EL1 and
CSSELR_EL1 registers.

This method also has to check that at least L2 is implemented in the
hardware, since there are scenarios where only the L1 cache is
available, e.g. QEMU.
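
The arithmetic behind the hardware inspection can be sketched on a raw
register value, without touching any system register.  The helper
below is hypothetical and is not part of this patch.  It assumes the
32-bit CCSIDR_EL1 layout (LineSize in bits [2:0], NumSets in bits
[27:13]), which is my reading of the Arm architecture manual and should
be checked against it.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout of the 32-bit CCSIDR_EL1 format. */
#define CCSIDR_LINESIZE_MASK 0x7u
#define CCSIDR_NUMSETS_SHIFT 13
#define CCSIDR_NUMSETS_MASK  0x7FFFu

/* Hypothetical helper: way size in bytes = line size * number of sets. */
static uint64_t way_size_from_ccsidr(uint32_t ccsidr)
{
    /* LineSize encodes log2(bytes per line) - 4. */
    uint32_t line_bytes = UINT32_C(1) << ((ccsidr & CCSIDR_LINESIZE_MASK) + 4);
    /* NumSets encodes (number of sets) - 1. */
    uint32_t num_sets =
        ((ccsidr >> CCSIDR_NUMSETS_SHIFT) & CCSIDR_NUMSETS_MASK) + 1;

    assert(line_bytes >= 16);

    return (uint64_t)line_bytes * num_sets;
}
```

For example, a cache with 64-byte lines and 1024 sets has a 64 KiB
way, regardless of its associativity.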

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/coloring.c | 76 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index 8f1cff6efb..e3d490b453 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -25,7 +25,10 @@
 #include <xen/lib.h>
 #include <xen/errno.h>
 #include <xen/param.h>
+
+#include <asm/sysregs.h>
 #include <asm/coloring.h>
+#include <asm/io.h>
 
 /* Number of color(s) assigned to Xen */
 static uint32_t xen_col_num;
@@ -39,6 +42,79 @@ static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
 
 static uint64_t way_size;
 
+#define CTR_LINESIZE_MASK 0x7
+#define CTR_SIZE_SHIFT 13
+#define CTR_SIZE_MASK 0x7FFF
+#define CTR_SELECT_L2 (1 << 1)
+#define CTR_SELECT_L3 (1 << 2)
+#define CTR_CTYPEn_MASK 0x7
+#define CTR_CTYPE2_SHIFT 3
+#define CTR_CTYPE3_SHIFT 6
+#define CTR_LLC_ON (1 << 2)
+#define CTR_LOC_SHIFT 24
+#define CTR_LOC_MASK 0x7
+#define CTR_LOC_L2 (1 << 1)
+#define CTR_LOC_NOT_IMPLEMENTED (1 << 0)
+
+
+/* Return the way size of last level cache by asking the hardware */
+static uint64_t get_llc_way_size(void)
+{
+    uint32_t cache_sel = READ_SYSREG64(CSSELR_EL1);
+    uint32_t cache_global_info = READ_SYSREG64(CLIDR_EL1);
+    uint32_t cache_info;
+    uint32_t cache_line_size;
+    uint32_t cache_set_num;
+    uint32_t cache_sel_tmp;
+
+    printk(XENLOG_INFO "Get information on LLC\n");
+    printk(XENLOG_INFO "Cache CLIDR_EL1: 0x%"PRIx32"\n", cache_global_info);
+
+    /* Check if at least L2 is implemented */
+    if ( ((cache_global_info >> CTR_LOC_SHIFT) & CTR_LOC_MASK)
+        == CTR_LOC_NOT_IMPLEMENTED )
+    {
+        printk(XENLOG_ERR "ERROR: L2 Cache not implemented\n");
+        return 0;
+    }
+
+    /* Save old value of CSSELR_EL1 */
+    cache_sel_tmp = cache_sel;
+
+    /* Get LLC index */
+    if ( ((cache_global_info >> CTR_CTYPE2_SHIFT) & CTR_CTYPEn_MASK)
+        == CTR_LLC_ON )
+        cache_sel = CTR_SELECT_L2;
+    else
+        cache_sel = CTR_SELECT_L3;
+
+    printk(XENLOG_INFO "LLC selection: %u\n", cache_sel);
+    /* Select the correct LLC in CSSELR_EL1 */
+    WRITE_SYSREG64(cache_sel, CSSELR_EL1);
+
+    /* Ensure write */
+    isb();
+
+    /* Get info about the LLC */
+    cache_info = READ_SYSREG64(CCSIDR_EL1);
+
+    /* ARM TRM: (Log2(Number of bytes in cache line)) - 4. */
+    cache_line_size = 1 << ((cache_info & CTR_LINESIZE_MASK) + 4);
+    /* ARM TRM: (Number of sets in cache) - 1 */
+    cache_set_num = ((cache_info >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK) + 1;
+
+    printk(XENLOG_INFO "Cache line size: %u bytes\n", cache_line_size);
+    printk(XENLOG_INFO "Cache sets num: %u\n", cache_set_num);
+
+    /* Restore value in CSSELR_EL1 */
+    WRITE_SYSREG64(cache_sel_tmp, CSSELR_EL1);
+
+    /* Ensure write */
+    isb();
+
+    return (cache_line_size * cache_set_num);
+}
+
 /*************************
  * PARSING COLORING BOOTARGS
  */
-- 
2.30.2




* [PATCH 06/36] xen/arm: add coloring basic initialization
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (4 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 07/36] xen/arm: add coloring data to domains Marco Solieri
                   ` (29 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Introduce a first and simple initialization function for the cache
coloring support. A helper function computes 'addr_col_mask', the
platform-dependent bitmask asserting the bits in memory addresses that
can be used for the coloring mechanism. This, in turn, is used to
determine the total number of available colors.
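
As a worked example of the relation between way size, color mask and
color count, consider the sketch below.  It is not the code of this
patch: both helpers are hypothetical, and it assumes 4 KiB pages and a
power-of-two way size, so the mask can be computed without a loop.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT 12u  /* assumption: 4 KiB pages */

/*
 * Hypothetical helper: address bits usable for coloring, i.e. the bits
 * from PAGE_SHIFT up to log2(way_size) - 1.
 */
static uint64_t addr_color_mask(uint64_t way_size)
{
    /* Assumption: the way size is a non-zero power of two. */
    assert(way_size != 0 && (way_size & (way_size - 1)) == 0);

    return (way_size - 1) & ~((UINT64_C(1) << PAGE_SHIFT) - 1);
}

/* Hypothetical helper: number of colors encoded by a color mask. */
static unsigned int max_colors(uint64_t mask)
{
    return (unsigned int)((mask >> PAGE_SHIFT) + 1);
}
```

A 64 KiB way gives the mask 0xF000 (address bits 12 to 15) and
therefore 16 colors; a way no larger than a page gives an empty mask,
which is the case coloring_init() rejects.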

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/coloring.c             | 83 +++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/coloring.h |  8 +++
 xen/arch/arm/setup.c                |  4 ++
 3 files changed, 95 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index e3d490b453..af75b536a7 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -39,8 +39,13 @@ static uint32_t xen_col_mask[MAX_COLORS_CELLS];
 static uint32_t dom0_col_num;
 /* Coloring configuration of Dom0 as bitmask */
 static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
+/* Maximum number of available color(s) */
+static uint32_t max_col_num;
+/* Maximum available coloring configuration as bitmask */
+static uint32_t max_col_mask[MAX_COLORS_CELLS];
 
 static uint64_t way_size;
+static uint64_t addr_col_mask;
 
 #define CTR_LINESIZE_MASK 0x7
 #define CTR_SIZE_SHIFT 13
@@ -115,6 +120,84 @@ static uint64_t get_llc_way_size(void)
     return (cache_line_size * cache_set_num);
 }
 
+/*
+ * Return the coloring mask based on the value of @param llc_way_size.
+ * This mask represents the bits in the address that can be used
+ * for defining available colors.
+ *
+ * @param llc_way_size		Last level cache way size.
+ * @return unsigned long	The coloring bitmask.
+ */
+static __init uint64_t calculate_addr_col_mask(uint64_t llc_way_size)
+{
+    uint64_t addr_col_mask = 0;
+    unsigned int i;
+    unsigned int low_idx, high_idx;
+
+    low_idx = PAGE_SHIFT;
+    high_idx = get_count_order(llc_way_size) - 1;
+
+    for ( i = low_idx; i <= high_idx; i++ )
+        addr_col_mask |= (1ULL << i);
+
+    return addr_col_mask;
+}
+
+bool __init coloring_init(void)
+{
+    int i;
+
+    printk(XENLOG_INFO "Initialize XEN coloring: \n");
+    /*
+     * If the way size is not provided by the configuration, try to get
+     * this information from hardware.
+     */
+    if ( !way_size )
+    {
+        way_size = get_llc_way_size();
+
+        if ( !way_size )
+        {
+            printk(XENLOG_ERR "ERROR: way size is null\n");
+            return false;
+        }
+    }
+
+    addr_col_mask = calculate_addr_col_mask(way_size);
+    if ( !addr_col_mask )
+    {
+        printk(XENLOG_ERR "ERROR: addr_col_mask is null\n");
+        return false;
+    }
+
+    max_col_num = ((addr_col_mask >> PAGE_SHIFT) + 1);
+
+    /*
+     * If the user or the platform itself provide a way_size
+     * configuration that corresponds to a number of max.
+     * colors greater than the one we support, we cannot
+     * continue. So the check on offset value is necessary.
+     */
+    if ( max_col_num > 32 * MAX_COLORS_CELLS )
+    {
+        printk(XENLOG_ERR "ERROR: max. color value not supported\n");
+        return false;
+    }
+
+    for ( i = 0; i < max_col_num; i++ )
+    {
+        unsigned int offset = i / 32;
+
+        max_col_mask[offset] |= (1U << (i % 32));
+    }
+
+    printk(XENLOG_INFO "Way size: 0x%"PRIx64"\n", way_size);
+    printk(XENLOG_INFO "Color bits in address: 0x%"PRIx64"\n", addr_col_mask);
+    printk(XENLOG_INFO "Max number of colors: %u\n", max_col_num);
+
+    return true;
+}
+
 /*************************
  * PARSING COLORING BOOTARGS
  */
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 60958d1244..70e1dbd09b 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -25,4 +25,12 @@
 
 #define MAX_COLORS_CELLS 4
 
+#ifdef CONFIG_COLORING
+bool __init coloring_init(void);
+#else /* !CONFIG_COLORING */
+static inline bool __init coloring_init(void)
+{
+    return true;
+}
+#endif /* CONFIG_COLORING */
 #endif /* !__ASM_ARM_COLORING_H__ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index b8d4f50d90..f39c62ea70 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -53,6 +53,7 @@
 #include <asm/setup.h>
 #include <xsm/xsm.h>
 #include <asm/acpi.h>
+#include <asm/coloring.h>
 
 struct bootinfo __initdata bootinfo;
 
@@ -893,6 +894,9 @@ void __init start_xen(unsigned long boot_phys_offset,
     printk("Command line: %s\n", cmdline);
     cmdline_parse(cmdline);
 
+    if ( !coloring_init() )
+        panic("Xen Coloring support: setup failed\n");
+
     /* Register Xen's load address as a boot module. */
     xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
                              (paddr_t)(uintptr_t)(_end - _start + 1), false);
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 07/36] xen/arm: add coloring data to domains
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (5 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 06/36] xen/arm: add coloring basic initialization Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:22   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 08/36] xen/arm: add colored flag to page struct Marco Solieri
                   ` (28 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

We want to be able to associate an assignment of cache colors to each
domain.  Add a configurable-length array containing a set of color
indices in the domain data.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/include/xen/sched.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 10ea969c7a..bfbe72b3ea 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -388,6 +388,10 @@ struct domain
     atomic_t         shr_pages;         /* shared pages */
     atomic_t         paged_pages;       /* paged-out pages */
 
+    /* Coloring. */
+    uint32_t        *colors;
+    uint32_t        max_colors;
+
     /* Scheduling. */
     void            *sched_priv;    /* scheduler-specific data */
     struct sched_unit *sched_unit_list;
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 08/36] xen/arm: add colored flag to page struct
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (6 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 07/36] xen/arm: add coloring data to domains Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 20:13   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 09/36] xen/arch: add default colors selection function Marco Solieri
                   ` (27 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

A new allocator enforcing a cache-coloring configuration is going to be
introduced.  We thus need to distinguish the memory pages assigned to,
and managed by, the colored allocator from those of the ordinary buddy
allocator.  Add a color flag to the page structure.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/include/asm/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 487be7cf59..9ac1767595 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -88,6 +88,10 @@ struct page_info
          */
         u32 tlbflush_timestamp;
     };
+
+    /* Is page managed by the cache-colored allocator? */
+    bool colored;
+
     u64 pad;
 };
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 09/36] xen/arch: add default colors selection function
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (7 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 08/36] xen/arm: add colored flag to page struct Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:28   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 10/36] xen/arch: check color " Marco Solieri
                   ` (26 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

When cache coloring support is enabled, a color assignment is needed for
every domain. Introduce a function computing a default configuration
with a safe and common value -- the dom0 color selection.

Do not directly access the array of color indices of dom0. Instead, use
the dom0 color configuration as a bitmask.
Add a helper function that converts the color configuration bitmask into
the indices array.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
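Note for reviewers: the bitmask-to-list conversion can be sketched on its
own as follows. This is a minimal illustration under assumptions, not the
helper added by this patch: `sketch_mask_to_list` and `SKETCH_CELLS` are
hypothetical names, and the mask is assumed to be four 32-bit words as with
MAX_COLORS_CELLS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_CELLS 4   /* mirrors MAX_COLORS_CELLS: 4 x 32 = 128 colors */

/*
 * Write the indices of the bits set in col_mask into col_list, lowest
 * index first. At most col_num entries are written; the number of
 * entries actually written is returned.
 */
size_t sketch_mask_to_list(const uint32_t col_mask[SKETCH_CELLS],
                           uint32_t *col_list, size_t col_num)
{
    size_t k = 0;

    assert(col_mask && col_list);

    for ( unsigned int i = 0; i < SKETCH_CELLS; i++ )
        for ( unsigned int c = 0; c < 32 && k < col_num; c++ )
            /* Test bit c of word i, so the shift never exceeds 31. */
            if ( col_mask[i] & (UINT32_C(1) << c) )
                col_list[k++] = c + i * 32;

    return k;
}
```

The bit is tested relative to its own word, while the stored index is the
global color number (c + i * 32).
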
 xen/arch/arm/coloring.c             | 36 +++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/coloring.h |  7 ++++++
 2 files changed, 43 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index af75b536a7..f6e6d09477 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -143,6 +143,42 @@ static __init uint64_t calculate_addr_col_mask(uint64_t llc_way_size)
     return addr_col_mask;
 }
 
+static int copy_mask_to_list(
+    uint32_t *col_mask, uint32_t *col_list, uint64_t col_num)
+{
+    unsigned int i, k, c;
+
+    if ( !col_list )
+        return -EINVAL;
+
+    for ( i = 0, k = 0; i < MAX_COLORS_CELLS; i++ )
+        for ( c = 0; k < col_num && c < 32; c++ )
+            if ( col_mask[i] & (1U << c) )
+                col_list[k++] = c + (i * 32);
+
+    return 0;
+}
+
+uint32_t *setup_default_colors(uint32_t *col_num)
+{
+    uint32_t *col_list;
+
+    if ( dom0_col_num )
+    {
+        *col_num = dom0_col_num;
+        col_list = xzalloc_array(uint32_t, dom0_col_num);
+        if ( !col_list )
+        {
+            printk(XENLOG_ERR "setup_default_colors: Alloc failed\n");
+            return NULL;
+        }
+        copy_mask_to_list(dom0_col_mask, col_list, dom0_col_num);
+        return col_list;
+    }
+
+    return NULL;
+}
+
 bool __init coloring_init(void)
 {
     int i;
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 70e1dbd09b..8f24acf082 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -27,6 +27,13 @@
 
 #ifdef CONFIG_COLORING
 bool __init coloring_init(void);
+
+/*
+ * Return an array with default colors selection and store the number of
+ * colors in @param col_num. The array selection will be equal to the dom0
+ * color configuration.
+ */
+uint32_t *setup_default_colors(uint32_t *col_num);
 #else /* !CONFIG_COLORING */
 static inline bool __init coloring_init(void)
 {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 10/36] xen/arch: check color selection function
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (8 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 09/36] xen/arch: add default colors selection function Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 20:17   ` Julien Grall
  2022-03-14  6:06   ` Henry Wang
  2022-03-04 17:46 ` [PATCH 11/36] xen/include: define hypercall parameter for coloring Marco Solieri
                   ` (25 subsequent siblings)
  35 siblings, 2 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Dom0 color configuration is parsed from the Xen command line. Add a
helper function to check the user selection. If no configuration is
provided by the user, all the available colors supported by the
hardware will be assigned to dom0.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
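Note for reviewers: the validity check amounts to the predicate sketched
below. It is an illustration only, with hypothetical names
(`sketch_colors_valid` and its parameters), written with an early return
instead of accumulating a flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * A selection is valid when it holds no more colors than the hardware
 * provides and every color index is below max_col_num.
 */
bool sketch_colors_valid(const uint32_t *colors, uint32_t num_colors,
                         uint32_t max_col_num)
{
    /* A null list is only acceptable when it is empty. */
    assert(colors || !num_colors);

    if ( num_colors > max_col_num )
        return false;

    for ( uint32_t i = 0; i < num_colors; i++ )
        if ( colors[i] >= max_col_num )
            return false;

    return true;
}
```

With 16 available colors, the list { 0, 1, 15 } passes and { 0, 16 } is
rejected.
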
 xen/arch/arm/coloring.c             | 17 +++++++++++++++++
 xen/arch/arm/include/asm/coloring.h |  8 ++++++++
 2 files changed, 25 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index f6e6d09477..382d558021 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -179,6 +179,23 @@ uint32_t *setup_default_colors(uint32_t *col_num)
     return NULL;
 }
 
+bool check_domain_colors(struct domain *d)
+{
+    int i;
+    bool ret = false;
+
+    if ( !d )
+        return ret;
+
+    if ( d->max_colors > max_col_num )
+        return ret;
+
+    for ( i = 0; i < d->max_colors; i++ )
+        ret |= (d->colors[i] > (max_col_num - 1));
+
+    return !ret;
+}
+
 bool __init coloring_init(void)
 {
     int i;
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 8f24acf082..fdd46448d7 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -26,8 +26,16 @@
 #define MAX_COLORS_CELLS 4
 
 #ifdef CONFIG_COLORING
+#include <xen/sched.h>
+
 bool __init coloring_init(void);
 
+/*
+ * Check colors of a given domain.
+ * Return true if check passed, false otherwise.
+ */
+bool check_domain_colors(struct domain *d);
+
 /*
  * Return an array with default colors selection and store the number of
  * colors in @param col_num. The array selection will be equal to the dom0
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 11/36] xen/include: define hypercall parameter for coloring
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (9 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 10/36] xen/arch: check color " Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:31   ` Jan Beulich
  2022-03-09 20:29   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U Marco Solieri
                   ` (24 subsequent siblings)
  35 siblings, 2 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

During the domU creation process, the color selection has to be passed to
the Xen hypercall.
This is generally done using what Xen calls GUEST_HANDLE_PARAMS. In this
case a simple bitmask for the coloring configuration suffices.
Currently the maximum number of supported colors is 128.
Add a new parameter that allows us to pass both the color bitmask
and the number of elements in it.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/include/asm/coloring.h | 2 --
 xen/include/public/arch-arm.h       | 8 ++++++++
 2 files changed, 8 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index fdd46448d7..1f7e0dde79 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -23,8 +23,6 @@
 #ifndef __ASM_ARM_COLORING_H__
 #define __ASM_ARM_COLORING_H__
 
-#define MAX_COLORS_CELLS 4
-
 #ifdef CONFIG_COLORING
 #include <xen/sched.h>
 
diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 94b31511dd..627cc42164 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -303,6 +303,12 @@ struct vcpu_guest_context {
 typedef struct vcpu_guest_context vcpu_guest_context_t;
 DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
 
+#define MAX_COLORS_CELLS 4
+struct color_guest_config {
+    uint32_t max_colors;
+    uint32_t colors[MAX_COLORS_CELLS];
+};
+
 /*
  * struct xen_arch_domainconfig's ABI is covered by
  * XEN_DOMCTL_INTERFACE_VERSION.
@@ -335,6 +341,8 @@ struct xen_arch_domainconfig {
      *
      */
     uint32_t clock_frequency;
+    /* IN */
+    struct color_guest_config colors;
 };
 #endif /* __XEN__ || __XEN_TOOLS__ */
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (10 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 11/36] xen/include: define hypercall parameter for coloring Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-11 19:05   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled Marco Solieri
                   ` (23 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Initialize cache coloring configuration during domain creation. If no
color assignment is provided by the user, use the default one.
The default configuration is the one assigned to Dom0. The latter is
configured as a standard domain with default configuration.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/domain.c       | 53 +++++++++++++++++++++++++++++++++++++
 xen/arch/arm/domain_build.c |  5 +++-
 2 files changed, 57 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 8110c1df86..33471b3c58 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -38,6 +38,7 @@
 #include <asm/vfp.h>
 #include <asm/vgic.h>
 #include <asm/vtimer.h>
+#include <asm/coloring.h>
 
 #include "vpci.h"
 #include "vuart.h"
@@ -782,6 +783,58 @@ int arch_domain_create(struct domain *d,
     if ( (rc = domain_vpci_init(d)) != 0 )
         goto fail;
 
+    d->max_colors = 0;
+#ifdef CONFIG_COLORING
+    /* Setup domain colors */
+    if ( !config->arch.colors.max_colors )
+    {
+        if ( !is_hardware_domain(d) )
+            printk(XENLOG_INFO "Color configuration not found for dom%u, using default\n",
+                   d->domain_id);
+        d->colors = setup_default_colors(&d->max_colors);
+        if ( !d->colors )
+        {
+            rc = -ENOMEM;
+            printk(XENLOG_ERR "Color array allocation failed for dom%u\n",
+                   d->domain_id);
+            goto fail;
+        }
+    }
+    else
+    {
+        int i, k;
+
+        d->colors = xzalloc_array(uint32_t, config->arch.colors.max_colors);
+        if ( !d->colors )
+        {
+            rc = -ENOMEM;
+            printk(XENLOG_ERR "Failed to alloc colors for dom%u\n",
+                   d->domain_id);
+            goto fail;
+        }
+
+        d->max_colors = config->arch.colors.max_colors;
+        for ( i = 0, k = 0;
+              k < d->max_colors && i < sizeof(config->arch.colors.colors) * 8;
+              i++ )
+        {
+            if ( config->arch.colors.colors[i / 32] & (1U << (i % 32)) )
+                d->colors[k++] = i;
+        }
+    }
+
+    printk("Dom%u colors: [ ", d->domain_id);
+    for ( int i = 0; i < d->max_colors; i++ )
+        printk("%u ", d->colors[i]);
+    printk("]\n");
+
+    if ( !check_domain_colors(d) )
+    {
+        rc = -EINVAL;
+        printk(XENLOG_ERR "Failed to check colors for dom%u\n", d->domain_id);
+        goto fail;
+    }
+#endif
     return 0;
 
 fail:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 8be01678de..9630d00066 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3344,7 +3344,10 @@ void __init create_dom0(void)
         printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n");
     dom0_cfg.arch.tee_type = tee_get_type();
     dom0_cfg.max_vcpus = dom0_max_vcpus();
-
+#ifdef CONFIG_COLORING
+    /* Colors are set after domain_create */
+    dom0_cfg.arch.colors.max_colors = 0;
+#endif
     if ( iommu_enabled )
         dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
 
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (11 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 20:34   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 14/36] xen/arch: add dump coloring info for domains Marco Solieri
                   ` (22 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

By the very nature of cache coloring, a domain that is colored cannot
also be direct mapped.
Set the directmap variable to false when coloring is enabled.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/domain.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 33471b3c58..80a6f39464 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -785,6 +785,8 @@ int arch_domain_create(struct domain *d,
 
     d->max_colors = 0;
 #ifdef CONFIG_COLORING
+    d->arch.directmap = false;
+
     /* Setup domain colors */
     if ( !config->arch.colors.max_colors )
     {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 14/36] xen/arch: add dump coloring info for domains
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (12 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 15/36] tools: add support for cache coloring configuration Marco Solieri
                   ` (21 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Print the color assignment for each domain when requested.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/coloring.c             | 12 ++++++++++++
 xen/arch/arm/domain.c               |  1 +
 xen/arch/arm/include/asm/coloring.h |  7 +++++++
 3 files changed, 20 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index 382d558021..8061c3824f 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -332,6 +332,18 @@ static int __init parse_xen_colors(const char *s)
 }
 custom_param("xen_colors", parse_xen_colors);
 
+void coloring_dump_info(struct domain *d)
+{
+    int i;
+
+    printk("Domain %d has %u color(s) [ ", d->domain_id, d->max_colors);
+    for ( i = 0; i < d->max_colors; i++ )
+    {
+        printk("%"PRIu32" ", d->colors[i]);
+    }
+    printk("]\n");
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 80a6f39464..fc12c79488 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -1131,6 +1131,7 @@ int domain_relinquish_resources(struct domain *d)
 void arch_dump_domain_info(struct domain *d)
 {
     p2m_dump_info(d);
+    coloring_dump_info(d);
 }
 
 
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 1f7e0dde79..8609e17e80 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -40,10 +40,17 @@ bool check_domain_colors(struct domain *d);
  * color configuration.
  */
 uint32_t *setup_default_colors(uint32_t *col_num);
+
+void coloring_dump_info(struct domain *d);
 #else /* !CONFIG_COLORING */
 static inline bool __init coloring_init(void)
 {
     return true;
 }
+
+static inline void coloring_dump_info(struct domain *d)
+{
+    return;
+}
 #endif /* CONFIG_COLORING */
 #endif /* !__ASM_ARM_COLORING_H__ */
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 15/36] tools: add support for cache coloring configuration
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (13 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 14/36] xen/arch: add dump coloring info for domains Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64 Marco Solieri
                   ` (20 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Add a new "colors" parameter that defines the color assignment for a
domain. The user can specify one or more color ranges using the same
syntax as the command line color selection (e.g. 0-4).
The parameter is defined as a list of strings that represent the
color ranges.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
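Note for reviewers: the range syntax and the bitmask handed to the
hypervisor can be related with the sketch below. It is not the xl/libxl
code in this patch: `sketch_add_color_range` and the `SKETCH_` constants
are hypothetical, and the 128-color limit is assumed from MAX_COLORS_CELLS.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define SKETCH_CELLS      4                    /* 4 x 32 bits */
#define SKETCH_MAX_COLORS (SKETCH_CELLS * 32)  /* 128 colors  */

/*
 * Parse one "start-end" range and set the matching bits in mask.
 * Returns the number of colors added, or -1 if the range is malformed,
 * out of bounds, or overlaps colors that are already set.
 */
int sketch_add_color_range(const char *range, uint32_t mask[SKETCH_CELLS])
{
    unsigned int start, end, c;
    int added = 0;

    assert(range && mask);

    if ( sscanf(range, "%u-%u", &start, &end) != 2 )
        return -1;
    if ( start > end || end >= SKETCH_MAX_COLORS )
        return -1;

    /* Reject overlaps before modifying the mask. */
    for ( c = start; c <= end; c++ )
        if ( mask[c / 32] & (UINT32_C(1) << (c % 32)) )
            return -1;

    for ( c = start; c <= end; c++ )
    {
        mask[c / 32] |= UINT32_C(1) << (c % 32);
        added++;
    }

    return added;
}
```

Calling it once per list item accumulates the total color count, and a
range such as 30-33 simply spans two words of the mask.
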
 tools/libs/light/libxl_arm.c     | 11 ++++++
 tools/libs/light/libxl_types.idl |  1 +
 tools/xl/xl_parse.c              | 59 ++++++++++++++++++++++++++++++--
 3 files changed, 69 insertions(+), 2 deletions(-)

diff --git a/tools/libs/light/libxl_arm.c b/tools/libs/light/libxl_arm.c
index eef1de0939..8944b250d9 100644
--- a/tools/libs/light/libxl_arm.c
+++ b/tools/libs/light/libxl_arm.c
@@ -101,6 +101,17 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc,
         return ERROR_FAIL;
     }
 
+    config->arch.colors.max_colors = d_config->b_info.num_colors;
+    for (i = 0; i < sizeof(config->arch.colors.colors) / 4; i++)
+        config->arch.colors.colors[i] = 0;
+    for (i = 0; i < d_config->b_info.num_colors; i++) {
+        unsigned int j = d_config->b_info.colors[i] / 32;
+        if (j >= sizeof(config->arch.colors.colors) / 4)
+            return ERROR_FAIL;
+        config->arch.colors.colors[j] |= (1U << (d_config->b_info.colors[i] % 32));
+    }
+    LOG(DEBUG, "Setup domain colors");
+
     return 0;
 }
 
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 2a42da2f7d..2a39012369 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -545,6 +545,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("ioports",          Array(libxl_ioport_range, "num_ioports")),
     ("irqs",             Array(uint32, "num_irqs")),
     ("iomem",            Array(libxl_iomem_range, "num_iomem")),
+    ("colors",           Array(uint32, "num_colors")),
     ("claim_mode",	     libxl_defbool),
     ("event_channels",   uint32),
     ("kernel",           string),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 117fcdcb2b..9b6ab1c2e4 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1216,8 +1216,9 @@ void parse_config_data(const char *config_source,
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
                    *usbctrls, *usbdevs, *p9devs, *vdispls, *pvcallsifs_devs;
     XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
-                   *mca_caps;
-    int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
+                   *mca_caps, *colors;
+    int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps,
+                    num_colors;
     int pci_power_mgmt = 0;
     int pci_msitranslate = 0;
     int pci_permissive = 0;
@@ -1366,6 +1367,60 @@ void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
         b_info->max_memkb = l * 1024;
 
+    if (!xlu_cfg_get_list(config, "colors", &colors, &num_colors, 0)) {
+        int ret, k, p, cur_index;
+
+        b_info->num_colors = 0;
+        /* Get number of colors based on ranges */
+        for (i = 0; i < num_colors; i++) {
+            uint32_t start, end;
+
+            buf = xlu_cfg_get_listitem (colors, i);
+            if (!buf) {
+                fprintf(stderr,
+                    "xl: Unable to get element %d in colors range list\n", i);
+                exit(1);
+            }
+
+            ret = sscanf(buf, "%u-%u", &start, &end);
+            if (ret < 2) {
+                fprintf(stderr,
+                    "xl: Invalid argument parsing colors range: %s\n", buf);
+                exit(1);
+            }
+
+            if (start > end) {
+                fprintf(stderr,
+                    "xl: invalid range: S:%u > E:%u \n", start,end);
+                exit(1);
+            }
+
+            /*
+             * Alloc a first array and then increase its size with realloc based
+             * on the number of ranges
+             */
+
+            /* Check for overlaps */
+            for (k = start; k <= end; k++) {
+                 for (p = 0; p < b_info->num_colors; p++)
+                    if(b_info->colors[p] == k) {
+                        fprintf(stderr,
+                            "xl: overlapped ranges not allowed\n");
+                        exit(1);
+                    }
+            }
+
+            cur_index = b_info->num_colors;
+            b_info->num_colors += (end - start) + 1;
+            b_info->colors = (uint32_t *)realloc(b_info->colors,
+                             sizeof(*b_info->colors) * b_info->num_colors);
+
+            for (k = start; cur_index < b_info->num_colors;
+                cur_index++, k++)
+                b_info->colors[cur_index] = k;
+        }
+    }
+
     if (!xlu_cfg_get_long (config, "vcpus", &l, 0)) {
         vcpus = l;
         if (libxl_cpu_bitmap_alloc(ctx, &b_info->avail_vcpus, l)) {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (14 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 15/36] tools: add support for cache coloring configuration Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 20:54   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 17/36] xen/arm: add get_max_color function Marco Solieri
                   ` (19 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

The colored allocator should not make any assumptions on how a color is
defined, since the definition may change depending on the architecture.
Use a generic function "color_from_page" that returns the color id based
on the page address.
Add a definition for ARMv8 architectures.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
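Note for reviewers: the color of a page follows directly from its machine
address, as in the sketch below. This is an illustration only; the
`sketch_` names are hypothetical, 4 KiB pages are assumed, and the mask is
passed as a parameter rather than read from a global.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12   /* assumption: 4 KiB pages */

/* Color of the page that contains machine address maddr. */
unsigned long sketch_color_from_maddr(uint64_t maddr, uint64_t addr_col_mask)
{
    /* The mask must not cover any page-offset bit. */
    assert(!(addr_col_mask & ((UINT64_C(1) << SKETCH_PAGE_SHIFT) - 1)));

    return (unsigned long)((maddr & addr_col_mask) >> SKETCH_PAGE_SHIFT);
}
```

With a mask of 0xF000 (16 colors), consecutive pages cycle through colors
0 to 15, and the page at 0x10000 wraps back to color 0.
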
 xen/arch/arm/coloring.c             | 13 +++++++++++++
 xen/arch/arm/include/asm/coloring.h |  7 +++++++
 2 files changed, 20 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index 8061c3824f..4748d717d6 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -196,6 +196,19 @@ bool check_domain_colors(struct domain *d)
     return !ret;
 }
 
+/*
+ * Compute color id from the page @param pg.
+ * Page size determines the lowest available bit, while addr_col_mask is used to
+ * select the rest.
+ *
+ * @param pg              Page address
+ * @return unsigned long  Color id
+ */
+unsigned long color_from_page(struct page_info *pg)
+{
+  return ((addr_col_mask & page_to_maddr(pg)) >> PAGE_SHIFT);
+}
+
 bool __init coloring_init(void)
 {
     int i;
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 8609e17e80..318e2a4521 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -42,6 +42,13 @@ bool check_domain_colors(struct domain *d);
 uint32_t *setup_default_colors(uint32_t *col_num);
 
 void coloring_dump_info(struct domain *d);
+
+/*
+ * Compute the color of the given page address.
+ * This function should change depending on the cache architecture
+ * specifications.
+ */
+unsigned long color_from_page(struct page_info *pg);
 #else /* !CONFIG_COLORING */
 static inline bool __init coloring_init(void)
 {
-- 
2.30.2



^ permalink raw reply related	[flat|nested] 79+ messages in thread

* [PATCH 17/36] xen/arm: add get_max_color function
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (15 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64 Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-11 19:09   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 18/36] Alloc: introduce page_list_for_each_reverse Marco Solieri
                   ` (18 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

In order to initialize the colored allocator data structure, the maximum
number of colors defined by the hardware has to be known.
Add a helper function that returns this information.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/coloring.c             | 5 +++++
 xen/arch/arm/include/asm/coloring.h | 8 ++++++++
 2 files changed, 13 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index 4748d717d6..d1ac193a80 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -209,6 +209,11 @@ unsigned long color_from_page(struct page_info *pg)
   return ((addr_col_mask & page_to_maddr(pg)) >> PAGE_SHIFT);
 }
 
+uint32_t get_max_colors(void)
+{
+    return max_col_num;
+}
+
 bool __init coloring_init(void)
 {
     int i;
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 318e2a4521..22e67dc9d8 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -49,6 +49,9 @@ void coloring_dump_info(struct domain *d);
  * specifications.
  */
 unsigned long color_from_page(struct page_info *pg);
+
+/* Return the maximum available number of colors supported by the hardware */
+uint32_t get_max_colors(void);
 #else /* !CONFIG_COLORING */
 static inline bool __init coloring_init(void)
 {
@@ -59,5 +62,10 @@ static inline void coloring_dump_info(struct domain *d)
 {
     return;
 }
+
+static inline uint32_t get_max_colors(void)
+{
+    return 0;
+}
 #endif /* CONFIG_COLORING */
 #endif /* !__ASM_ARM_COLORING_H__ */
-- 
2.30.2




* [PATCH 18/36] Alloc: introduce page_list_for_each_reverse
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (16 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 17/36] xen/arm: add get_max_color function Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:35   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 19/36] xen/arch: introduce cache-coloring allocator Marco Solieri
                   ` (17 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/include/xen/mm.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index 3be754da92..f0861ed5bb 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -488,6 +488,8 @@ page_list_splice(struct page_list_head *list, struct page_list_head *head)
     list_for_each_entry_safe(pos, tmp, head, list)
 # define page_list_for_each_safe_reverse(pos, tmp, head) \
     list_for_each_entry_safe_reverse(pos, tmp, head, list)
+# define page_list_for_each_reverse(pos, head) \
+    list_for_each_entry_reverse(pos, head, list)
 #endif
 
 static inline unsigned int get_order_from_bytes(paddr_t size)
-- 
2.30.2




* [PATCH 19/36] xen/arch: introduce cache-coloring allocator
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (17 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 18/36] Alloc: introduce page_list_for_each_reverse Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 14:35   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 20/36] xen/common: introduce buddy required reservation Marco Solieri
                   ` (16 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Introduce a new memory page allocator that implements the cache coloring
mechanism. The allocation algorithm follows the coloring scheme
specified for each guest, and maximizes contiguity in the page
selection.

Pages are stored by color in separate, address-ordered lists that
are collectively called the colored heap.  These lists are populated
by a simple initialisation function which, for each available page,
computes its color and inserts it in the corresponding list.  When a
domain requests a page, the allocator takes one from the subset of lists
whose colors match the domain configuration.  It chooses the page with
the highest address among the last elements of such lists.  This ordering
guarantees that contiguous pages are allocated sequentially, provided
the color assignment includes adjacent ids.

The allocator can only handle requests of order 0, since the
single color granularity is represented in memory by one page.

A dump function is added to allow inspection of colored heap
information.
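
The selection step described above can be sketched as follows. This is a
simplified model, not the code from the diff: the free lists are plain
sorted arrays instead of Xen page lists, and the names
`color_list_sketch` and `pick_highest` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-color free list: addresses kept in ascending order. */
struct color_list_sketch {
    const uint64_t *addrs; /* sorted page addresses */
    size_t count;          /* number of free pages in this list */
};

/*
 * Return the index (within colors[]) of the color whose list holds the
 * highest free page, or -1 if every list assigned to the domain is empty.
 */
static int pick_highest(const struct color_list_sketch *heap,
                        const unsigned int *colors, size_t nr_colors)
{
    int best = -1;
    uint64_t best_addr = 0;
    size_t i;

    assert(heap != NULL && colors != NULL);

    for ( i = 0; i < nr_colors; i++ )
    {
        const struct color_list_sketch *l = &heap[colors[i]];
        uint64_t last;

        if ( l->count == 0 )
            continue; /* nothing free for this color */

        /* Only the tail of each list is a candidate. */
        last = l->addrs[l->count - 1];
        if ( best < 0 || last > best_addr )
        {
            best = (int)i;
            best_addr = last;
        }
    }

    return best;
}
```

Because only list tails are compared, each allocation costs one pass over
the domain's colors, independently of how many pages are free.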

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/common/page_alloc.c | 264 +++++++++++++++++++++++++++++++++++++++-
 xen/include/xen/mm.h    |   5 +
 2 files changed, 268 insertions(+), 1 deletion(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 4635718237..82f6e8330a 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -150,6 +150,9 @@
 #define p2m_pod_offline_or_broken_hit(pg) 0
 #define p2m_pod_offline_or_broken_replace(pg) BUG_ON(pg != NULL)
 #endif
+#ifdef CONFIG_COLORING
+#include <asm/coloring.h>
+#endif
 
 #ifndef PGC_reserved
 #define PGC_reserved 0
@@ -438,6 +441,263 @@ mfn_t __init alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
 
 
 
+static DEFINE_SPINLOCK(heap_lock);
+
+#ifdef CONFIG_COLORING
+/*************************
+ * COLORED SIDE-ALLOCATOR
+ *
+ * Pages are stored by their color in separate lists. Each list defines a color
+ * and it is initialized during end_boot_allocator, where each page's color
+ * is calculated and the page itself is put in the correct list.
+ * After initialization we have N lists, where N is the maximum number of
+ * colors available on the platform.
+ * The heads of all the lists are stored as elements of an array of size N
+ * using the following schema:
+ * array[X] = head of color X, where X goes from 0 to N-1
+ */
+typedef struct page_list_head color_list;
+static color_list *color_heap;
+static long total_avail_col_pages;
+static u64 col_num_max;
+static bool color_init_state = true;
+
+#define page_to_head(pg) (&color_heap[color_from_page(pg)])
+#define color_to_head(col) (&color_heap[col])
+
+/* Add page in list in order depending on its physical address. */
+static void page_list_add_order(struct page_info *pg, struct list_head *head)
+{
+    struct page_info *pos;
+
+    /* Add first page after head */
+    if ( page_list_empty(head) )
+    {
+        page_list_add(pg, head);
+        return;
+    }
+
+    /* Add non-first page in list in ascending order */
+    page_list_for_each_reverse(pos, head)
+    {
+        /* Get pg position */
+        if ( page_to_maddr(pos) <= page_to_maddr(pg) )
+        {
+            /* Insert pg between pos and pos->list.next */
+            page_list_add(pg, &pos->list);
+            break;
+        }
+
+        /*
+         * If pos is the first element it means that pg <= pos so we have
+         * to insert pg after head.
+         */
+        if ( page_list_first(head) == pos )
+        {
+            page_list_add(pg, head);
+            break;
+        }
+    }
+}
+
+/* Alloc one page based on domain color configuration */
+static struct page_info *alloc_col_heap_page(
+    unsigned int memflags, struct domain *d)
+{
+    struct page_info *pg, *tmp;
+    bool need_tlbflush = false;
+    uint32_t cur_color;
+    uint32_t tlbflush_timestamp = 0;
+    uint32_t *colors = 0;
+    int max_colors;
+    int i;
+
+    colors = d->colors;
+    max_colors = d->max_colors;
+
+    spin_lock(&heap_lock);
+
+    tmp = pg = NULL;
+
+    /* Check for the first pg on non-empty list */
+    for ( i = 0; i < max_colors; i++ )
+    {
+        if ( !page_list_empty(color_to_head(colors[i])) )
+        {
+            tmp = pg = page_list_last(color_to_head(colors[i]));
+            cur_color = d->colors[i];
+            break;
+        }
+    }
+
+    /* If all lists are empty, no requests can be satisfied */
+    if ( !pg )
+    {
+        spin_unlock(&heap_lock);
+        return NULL;
+    }
+
+    /* Get the highest page from the lists compliant to the domain color(s) */
+    for ( i += 1; i < max_colors; i++ )
+    {
+        if ( page_list_empty(color_to_head(colors[i])) )
+        {
+            printk(XENLOG_INFO "List empty\n");
+            continue;
+        }
+        tmp = page_list_last(color_to_head(colors[i]));
+        if ( page_to_maddr(tmp) > page_to_maddr(pg) )
+        {
+            pg = tmp;
+            cur_color = colors[i];
+        }
+    }
+
+    if ( !pg )
+    {
+        spin_unlock(&heap_lock);
+        return NULL;
+    }
+
+    pg->count_info = PGC_state_inuse;
+
+    if ( !(memflags & MEMF_no_tlbflush) )
+        accumulate_tlbflush(&need_tlbflush, pg,
+                            &tlbflush_timestamp);
+
+    /* Initialise fields which have other uses for free pages. */
+    pg->u.inuse.type_info = 0;
+    page_set_owner(pg, NULL);
+
+    flush_page_to_ram(mfn_x(page_to_mfn(pg)),
+                      !(memflags & MEMF_no_icache_flush));
+
+    page_list_del(pg, page_to_head(pg));
+    total_avail_col_pages--;
+
+    spin_unlock(&heap_lock);
+
+    if ( need_tlbflush )
+        filtered_flush_tlb_mask(tlbflush_timestamp);
+
+    return pg;
+}
+
+struct page_info *alloc_col_domheap_page(
+    struct domain *d, unsigned int memflags)
+{
+    struct page_info *pg;
+
+    ASSERT(!in_irq());
+
+    /* Get page based on color selection */
+    pg = alloc_col_heap_page(memflags, d);
+
+    if ( !pg )
+    {
+        printk(XENLOG_INFO "ERROR: Colored Page is null\n");
+        return NULL;
+    }
+
+    /* Assign page to domain */
+    if ( d && !(memflags & MEMF_no_owner) &&
+        assign_page(pg, 0, d, memflags) )
+    {
+        free_col_heap_page(pg);
+        return NULL;
+    }
+
+    return pg;
+}
+
+void free_col_heap_page(struct page_info *pg)
+{
+    /* This page is not a guest frame any more. */
+    pg->count_info = PGC_state_free;
+
+    page_set_owner(pg, NULL);
+    total_avail_col_pages++;
+    page_list_add_order( pg, page_to_head(pg) );
+}
+
+static inline void init_col_heap_pages(struct page_info *pg, unsigned long nr_pages)
+{
+    int i;
+
+    if ( color_init_state )
+    {
+        col_num_max = get_max_colors();
+        color_heap = xmalloc_array(color_list, col_num_max);
+        BUG_ON(!color_heap);
+
+        for ( i = 0; i < col_num_max; i++ )
+        {
+            printk(XENLOG_INFO "Init list for color: %u\n", i);
+            INIT_PAGE_LIST_HEAD(&color_heap[i]);
+        }
+
+        color_init_state = false;
+    }
+
+    printk(XENLOG_INFO "Init color heap pages with %lu pages for a given size of 0x%"PRIx64"\n",
+            nr_pages, nr_pages * PAGE_SIZE);
+    printk(XENLOG_INFO "Paging starting from: 0x%"PRIx64"\n", page_to_maddr(pg));
+    total_avail_col_pages += nr_pages;
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        pg->colored = true;
+        page_list_add_order(pg, page_to_head(pg));
+        pg++;
+    }
+}
+
+static inline bool is_page_colored(struct page_info *pg)
+{
+        return pg->colored;
+}
+
+static void dump_col_heap(unsigned char key)
+{
+    struct page_info *pg;
+    unsigned long size;
+    unsigned int i;
+
+    printk("Colored heap info\n");
+    for ( i = 0; i < col_num_max; i++ )
+    {
+        printk("Heap[%u]: ", i);
+        size = 0;
+        page_list_for_each( pg, color_to_head(i) )
+        {
+            BUG_ON(!(color_from_page(pg) == i));
+            size++;
+        }
+        printk("%lu pages -> %lukB free\n", size, size << (PAGE_SHIFT - 10));
+    }
+
+    printk("Total number of pages: %lu\n", total_avail_col_pages);
+}
+#else /* !CONFIG_COLORING */
+#define init_col_heap_pages(x, y) init_heap_pages(x, y)
+
+inline struct page_info *alloc_col_domheap_page(
+	struct domain *d, unsigned int memflags)
+{
+	return NULL;
+}
+
+inline void free_col_heap_page(struct page_info *pg)
+{
+	return;
+}
+
+static inline bool is_page_colored(struct page_info *pg)
+{
+        return false;
+}
+#endif /* CONFIG_COLORING */
+
 /*************************
  * BINARY BUDDY ALLOCATOR
  */
@@ -458,7 +718,6 @@ static unsigned long node_need_scrub[MAX_NUMNODES];
 static unsigned long *avail[MAX_NUMNODES];
 static long total_avail_pages;
 
-static DEFINE_SPINLOCK(heap_lock);
 static long outstanding_claims; /* total outstanding claims by all domains */
 
 unsigned long domain_adjust_tot_pages(struct domain *d, long pages)
@@ -2600,6 +2859,9 @@ static void cf_check dump_heap(unsigned char key)
 static __init int cf_check register_heap_trigger(void)
 {
     register_keyhandler('H', dump_heap, "dump heap info", 1);
+#ifdef CONFIG_COLORING
+    register_keyhandler('c', dump_col_heap, "dump coloring heap info", 1);
+#endif
     return 0;
 }
 __initcall(register_heap_trigger);
diff --git a/xen/include/xen/mm.h b/xen/include/xen/mm.h
index f0861ed5bb..63288e537c 100644
--- a/xen/include/xen/mm.h
+++ b/xen/include/xen/mm.h
@@ -131,6 +131,11 @@ unsigned int online_page(mfn_t mfn, uint32_t *status);
 int offline_page(mfn_t mfn, int broken, uint32_t *status);
 int query_page_offline(mfn_t mfn, uint32_t *status);
 
+/* Colored suballocator. */
+struct page_info *alloc_col_domheap_page(
+    struct domain *d, unsigned int memflags);
+void free_col_heap_page(struct page_info *pg);
+
 void heap_init_late(void);
 
 int assign_pages(
-- 
2.30.2




* [PATCH 20/36] xen/common: introduce buddy required reservation
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (18 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 19/36] xen/arch: introduce cache-coloring allocator Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 14:45   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 21/36] xen/common: add colored allocator initialization Marco Solieri
                   ` (15 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

When cache coloring is enabled, a certain amount of memory is reserved
for buddy allocation because the current coloring implementation does not
support Xen heap memory. As of this commit, the colored allocator is used
for dom0 and domUs, while the buddy allocator manages only Xen memory. The
memory reserved for the buddy allocator is thus lowered to a reasonably
small value (64 MiB by default).
Introduce a new variable that specifies the amount of memory reserved
for the buddy allocator, settable through the "buddy_size" parameter.
The current default value will still be sufficient once coloring for Xen
is added in the following patches.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/common/page_alloc.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 82f6e8330a..fffa438029 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -230,6 +230,13 @@ static bool __read_mostly scrub_debug;
 #define scrub_debug    false
 #endif
 
+#ifdef CONFIG_COLORING
+/* Minimum size required for buddy allocator to work with colored one */
+unsigned long buddy_required_size __read_mostly = MB(64);
+#else
+unsigned long buddy_required_size __read_mostly = 0;
+#endif
+
 /*
  * Bit width of the DMA heap -- used to override NUMA-node-first.
  * allocation strategy, which can otherwise exhaust low memory.
@@ -678,6 +685,13 @@ static void dump_col_heap(unsigned char key)
 
     printk("Total number of pages: %lu\n", total_avail_col_pages);
 }
+static int __init parse_buddy_required_size(const char *s)
+{
+    buddy_required_size = simple_strtoull(s, &s, 0);
+
+    return *s ? -EINVAL : 0;
+}
+custom_param("buddy_size", parse_buddy_required_size);
 #else /* !CONFIG_COLORING */
 #define init_col_heap_pages(x, y) init_heap_pages(x, y)
 
-- 
2.30.2




* [PATCH 21/36] xen/common: add colored allocator initialization
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (19 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 20/36] xen/common: introduce buddy required reservation Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-09 14:58   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 22/36] xen/arch: init cache coloring conf for Xen Marco Solieri
                   ` (14 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Initialize colored heap and allocator data structures. It is assumed
that pages are given to the init function in ascending order. To
ensure that, pages are retrieved from bootmem_regions starting from the
first one. Moreover, this allows quick insertion of freed pages into
the colored allocator's internal data structures -- sorted lists.
If coloring is disabled, changing the free page order should affect
neither the performance nor the functionality of the buddy allocator.

Do not allocate Dom0 memory with direct mapping if coloring is enabled.
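
While walking the regions in ascending order, the diff below also sets
aside the memory reserved for the buddy allocator. A minimal model of
that step, assuming regions are half-open ranges of page frames, might
look like this; `region_sketch` and `carve_buddy` are hypothetical names
used only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical boot region, as a half-open range [s, e) of page frames. */
struct region_sketch {
    unsigned long s, e;
};

/*
 * Walk the regions in ascending order. The first region strictly larger
 * than 'buddy_frames' gives up its lowest frames to the buddy allocator;
 * everything that remains is left for the colored heap.
 * Returns the index of the region that was split, or -1 if none was.
 */
static int carve_buddy(struct region_sketch *r, size_t nr,
                       unsigned long buddy_frames,
                       unsigned long *buddy_start)
{
    size_t i;

    assert(r != NULL && buddy_start != NULL);

    for ( i = 0; i < nr; i++ )
    {
        if ( buddy_frames && (r[i].e - r[i].s) > buddy_frames )
        {
            *buddy_start = r[i].s;   /* frames handed to the buddy */
            r[i].s += buddy_frames;  /* remainder stays colored */
            return (int)i;
        }
    }

    return -1;
}
```

Note that a region of exactly the requested size is skipped in this
model, mirroring the strict comparison used in the diff.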

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/domain_build.c |  7 +++++-
 xen/common/page_alloc.c     | 43 +++++++++++++++++++++++++++++++------
 2 files changed, 42 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 9630d00066..03a2573d67 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -3292,7 +3292,12 @@ static int __init construct_dom0(struct domain *d)
     /* type must be set before allocate_memory */
     d->arch.type = kinfo.type;
 #endif
-    allocate_memory_11(d, &kinfo);
+#ifdef CONFIG_COLORING
+    if ( d->max_colors )
+        allocate_memory(d, &kinfo);
+    else
+#endif
+        allocate_memory_11(d, &kinfo);
     find_gnttab_region(d, &kinfo);
 
     /* Map extra GIC MMIO, irqs and other hw stuffs to dom0. */
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index fffa438029..dea14bc39f 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2154,11 +2154,26 @@ void __init end_boot_allocator(void)
             break;
         }
     }
-    for ( i = nr_bootmem_regions; i-- > 0; )
+
+    for ( i = 0; i < nr_bootmem_regions; i++ )
     {
         struct bootmem_region *r = &bootmem_region_list[i];
-        if ( r->s < r->e )
-            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
+
+        /*
+         * Find the first region that can fill the buddy allocator memory
+         * specified by buddy_required_size.
+         */
+        if ( buddy_required_size && (r->e - r->s) >
+            PFN_DOWN(buddy_required_size) )
+        {
+            init_heap_pages(mfn_to_page(_mfn(r->s)),
+                PFN_DOWN(buddy_required_size));
+
+            r->s += PFN_DOWN(buddy_required_size);
+            buddy_required_size = 0;
+        }
+
+        init_col_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
     }
     nr_bootmem_regions = 0;
 
@@ -2619,9 +2634,12 @@ int assign_pages(
         page_set_owner(&pg[i], d);
         smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
         pg[i].count_info =
-            (pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 1;
+             (pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 1;
 
-        page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
+        if ( is_page_colored(&pg[i]) )
+            page_list_add(&pg[i], page_to_list(d, &pg[i]));
+        else
+            page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
     }
 
  out:
@@ -2642,6 +2660,15 @@ struct page_info *alloc_domheap_pages(
     unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
     unsigned int dma_zone;
 
+    /* Only Dom0 and DomUs are supported for coloring */
+    if ( d && d->max_colors > 0 )
+    {
+        /* Colored allocation must be done on 0 order */
+        if (order)
+            return NULL;
+
+        return alloc_col_domheap_page(d, memflags);
+    }
     ASSERT(!in_irq());
 
     bits = domain_clamp_alloc_bitsize(memflags & MEMF_no_owner ? NULL : d,
@@ -2761,8 +2788,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
             scrub = 1;
         }
 
-        free_heap_pages(pg, order, scrub);
-    }
+        if ( is_page_colored(pg) )
+            free_col_heap_page(pg);
+        else
+            free_heap_pages(pg, order, scrub);}
 
     if ( drop_dom_ref )
         put_domain(d);
-- 
2.30.2




* [PATCH 22/36] xen/arch: init cache coloring conf for Xen
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (20 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 21/36] xen/common: add colored allocator initialization Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-14 18:59   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses Marco Solieri
                   ` (13 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Add initialization for Xen coloring data. By default, use the lowest
color index available.

Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task.  Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness.

More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it. The default number of Xen colors is thus set to one.
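
For reference, the amount of cache covered by one color follows from the
cache geometry. The sketch below assumes a 1 MiB, 16-way LLC and 4 KiB
pages, figures chosen here because they are consistent with the 64 KiB
per color quoted above; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Number of colors: how many pages fit in one cache way. */
static uint32_t nr_colors(uint64_t cache_size, uint32_t ways,
                          uint32_t page_size)
{
    assert(ways != 0 && page_size != 0);

    return (uint32_t)((cache_size / ways) / page_size);
}

/* Cache capacity covered by a single color. */
static uint64_t bytes_per_color(uint64_t cache_size, uint32_t ways,
                                uint32_t page_size)
{
    return cache_size / nr_colors(cache_size, ways, page_size);
}
```

Under these assumptions a way is 64 KiB, which gives 16 colors of
64 KiB each.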

Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/coloring.c | 31 ++++++++++++++++++++++++++++++-
 1 file changed, 30 insertions(+), 1 deletion(-)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index d1ac193a80..761414fcd7 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -30,10 +30,18 @@
 #include <asm/coloring.h>
 #include <asm/io.h>
 
+/* By default Xen uses the lowest color */
+#define XEN_COLOR_DEFAULT_MASK 0x0001
+#define XEN_COLOR_DEFAULT_NUM 1
+/* Current maximum useful colors */
+#define MAX_XEN_COLOR   128
+
 /* Number of color(s) assigned to Xen */
 static uint32_t xen_col_num;
 /* Coloring configuration of Xen as bitmask */
 static uint32_t xen_col_mask[MAX_COLORS_CELLS];
+/* Xen colors IDs */
+static uint32_t xen_col_list[MAX_XEN_COLOR];
 
 /* Number of color(s) assigned to Dom0 */
 static uint32_t dom0_col_num;
@@ -216,7 +224,7 @@ uint32_t get_max_colors(void)
 
 bool __init coloring_init(void)
 {
-    int i;
+    int i, rc;
 
     printk(XENLOG_INFO "Initialize XEN coloring: \n");
     /*
@@ -266,6 +274,27 @@ bool __init coloring_init(void)
     printk(XENLOG_INFO "Color bits in address: 0x%"PRIx64"\n", addr_col_mask);
     printk(XENLOG_INFO "Max number of colors: %u\n", max_col_num);
 
+    if ( !xen_col_num )
+    {
+        xen_col_mask[0] = XEN_COLOR_DEFAULT_MASK;
+        xen_col_num = XEN_COLOR_DEFAULT_NUM;
+        printk(XENLOG_WARNING "Xen color configuration not found. Using default\n");
+    }
+
+    printk(XENLOG_INFO "Xen color configuration: 0x%"PRIx32"%"PRIx32"%"PRIx32"%"PRIx32"\n",
+            xen_col_mask[3], xen_col_mask[2], xen_col_mask[1], xen_col_mask[0]);
+    rc = copy_mask_to_list(xen_col_mask, xen_col_list, xen_col_num);
+
+    if ( rc )
+        return false;
+
+    for ( i = 0; i < xen_col_num; i++ )
+        if ( xen_col_list[i] > (max_col_num - 1) )
+        {
+            printk(XENLOG_ERR "ERROR: max. color value not supported\n");
+            return false;
+        }
+
     return true;
 }
 
-- 
2.30.2




* [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (21 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 22/36] xen/arch: init cache coloring conf for Xen Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-14 19:23   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 24/36] xen/arm: enable consider_modules for coloring Marco Solieri
                   ` (12 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

During the Xen coloring procedure, we need to manually calculate consecutive
physical addresses that conform to the color selection. Add a helper
function that performs this operation: given a physical address, it returns
the next address that conforms to the Xen color selection.

The next_xen_colored function is architecture dependent and the provided
implementation is for ARMv8.
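
The computation can be modelled outside Xen as shown below. This is a
sketch under stated assumptions, not the function added by the diff: the
color list is passed explicitly and sorted in ascending order, the color
bits sit directly above the page offset, and `way_size` is the span
after which colors repeat. All names ending in `_sketch` and the
`SKETCH_` constants are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)

/*
 * Return the lowest address >= phys whose color is in colors[].
 * colors[] must be sorted in ascending order and hold nr > 0 entries.
 */
static uint64_t next_colored_sketch(uint64_t phys,
                                    const unsigned int *colors, size_t nr,
                                    uint64_t col_mask, uint64_t way_size)
{
    unsigned int cur = (unsigned int)((phys & col_mask) >> SKETCH_PAGE_SHIFT);
    uint64_t base = phys - SKETCH_PAGE_SIZE * cur; /* color 0 of this way */
    size_t i;

    assert(nr > 0);

    /* Look for the first selected color that is >= the current one. */
    for ( i = 0; i < nr; i++ )
        if ( colors[i] >= cur )
            return colors[i] == cur ? phys
                                    : base + SKETCH_PAGE_SIZE * colors[i];

    /* None left in this way: wrap to the first color of the next way. */
    return base + way_size + SKETCH_PAGE_SIZE * colors[0];
}
```

For example, with colors {2, 3}, a mask of 0xF000 and a way size of
0x10000, an address of color 4 is moved to color 2 of the following way.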

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/coloring.c             | 43 +++++++++++++++++++++++++++++
 xen/arch/arm/include/asm/coloring.h | 14 ++++++++++
 2 files changed, 57 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index 761414fcd7..aae3c77a7b 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -222,6 +222,49 @@ uint32_t get_max_colors(void)
     return max_col_num;
 }
 
+paddr_t next_xen_colored(paddr_t phys)
+{
+    int i; /* signed: the descending loop below must be able to reach -1 */
+    unsigned int col_next_number = xen_col_list[0];
+    unsigned int col_cur_number = (phys & addr_col_mask) >> PAGE_SHIFT;
+    int overrun = 0;
+    paddr_t ret;
+
+    /*
+     * Check if address color conforms to Xen selection. If it does, return
+     * the address as is.
+     */
+    for( i = 0; i < xen_col_num; i++)
+        if ( col_cur_number == xen_col_list[i] )
+            return phys;
+
+    /* Find next col */
+    for( i = xen_col_num -1 ; i >= 0; i--)
+    {
+        if ( col_cur_number > xen_col_list[i])
+        {
+            /* Need to start to first element and add a way_size */
+            if ( i == (xen_col_num - 1) )
+            {
+                col_next_number = xen_col_list[0];
+                overrun = 1;
+            }
+            else
+            {
+                col_next_number = xen_col_list[i+1];
+                overrun = 0;
+            }
+            break;
+        }
+    }
+
+    /* Align phys to way_size */
+    ret = phys - (PAGE_SIZE * col_cur_number);
+    /* Add the offset based on color selection*/
+    ret += (PAGE_SIZE * (col_next_number)) + (way_size*overrun);
+    return ret;
+}
+
 bool __init coloring_init(void)
 {
     int i, rc;
diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 22e67dc9d8..8c4525677c 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -28,6 +28,20 @@
 
 bool __init coloring_init(void);
 
+/*
+ * Return the lowest physical page address that conforms to the Xen color
+ * selection, searching upwards from @param phys.
+ *
+ * The bits available for coloring (addr_col_mask) and the colors assigned
+ * to Xen are internal state of coloring.c and are not passed as arguments.
+ *
+ * @param phys         Physical address start.
+ *
+ * @return The lowest physical page address greater than or equal to
+ * 'phys' and belonging to the Xen color selection.
+ */
+paddr_t next_xen_colored(paddr_t phys);
+
 /*
  * Check colors of a given domain.
  * Return true if check passed, false otherwise.
-- 
2.30.2




* [PATCH 24/36] xen/arm: enable consider_modules for coloring
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (22 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-14 19:24   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 25/36] xen/arm: bring back get_xen_paddr Marco Solieri
                   ` (11 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

In order to relocate Xen, the function get_xen_paddr will be used in the
following patches. It relies on "consider_modules", which therefore has
to be enabled for both ARM32 and coloring.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index f39c62ea70..0bfe12da57 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -442,7 +442,7 @@ static void * __init relocate_fdt(paddr_t dtb_paddr, size_t dtb_size)
     return fdt;
 }
 
-#ifdef CONFIG_ARM_32
+#if defined(CONFIG_ARM_32) || defined(CONFIG_COLORING)
 /*
  * Returns the end address of the highest region in the range s..e
  * with required size and alignment that does not conflict with the
-- 
2.30.2




* [PATCH 25/36] xen/arm: bring back get_xen_paddr
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (23 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 24/36] xen/arm: enable consider_modules for coloring Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 26/36] xen/arm: add argument to remove_early_mappings Marco Solieri
                   ` (10 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

To color Xen efficiently, we need to relocate it and move the Xen code
to a dedicated memory region that will be marked as colored for Xen
itself. This region is our target region, and it will be placed as high
as possible in RAM. To do that we need the old get_xen_paddr function
that was part of the relocation feature.
Moreover, because of coloring, the size of the region we relocate to is
no longer equal to the Xen code size.
In the worst case the target region must be at least Xen code
size * available colors. However, get_xen_paddr assumes a region whose
size equals the Xen code size only.
Add a new "size" parameter to handle the coloring case as well.
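
As an illustration of the sizing and placement described above, here is
a minimal standalone sketch. It is not the patch code: the type and
constant definitions are stand-ins, the helper names
(colored_region_size, highest_fit) are hypothetical, the 2 MiB alignment
is an assumption, and boot modules are ignored, whereas the real
function goes through consider_modules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins for Xen's types and constants (assumed values). */
typedef uint64_t paddr_t;
#define XEN_PADDR_ALIGN (1ULL << 21) /* assumed 2 MiB alignment */

struct membank {
    paddr_t start;
    paddr_t size;
};

/* Round size up to a multiple of align, which must be a power of two. */
static paddr_t align_up(paddr_t size, paddr_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (size + (align - 1)) & ~(align - 1);
}

/*
 * Worst case: Xen owns one color out of max_colors, so only one page in
 * every max_colors pages of the target region is usable by Xen.
 */
static paddr_t colored_region_size(paddr_t xen_code_size,
                                   unsigned int max_colors)
{
    return align_up(xen_code_size * max_colors, XEN_PADDR_ALIGN);
}

/*
 * Return the highest aligned start address at which region_size bytes
 * fit at the top of a bank, or 0 if no bank is large enough.
 */
static paddr_t highest_fit(const struct membank *banks, size_t nr_banks,
                           paddr_t region_size)
{
    paddr_t best = 0;

    for (size_t i = 0; i < nr_banks; i++) {
        /* Align the end of the bank down so the start is aligned too. */
        paddr_t end = (banks[i].start + banks[i].size) &
                      ~(XEN_PADDR_ALIGN - 1);
        paddr_t start;

        if (end < region_size)
            continue;
        start = end - region_size;
        if (start < banks[i].start)
            continue;
        if (start > best)
            best = start;
    }
    return best;
}
```

With 1 MiB of Xen code and 16 colors the region is 16 MiB, so a 4 MiB
bank is skipped even if it sits higher in the address space than a
larger one.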

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Acked-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/setup.c | 54 ++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 54 insertions(+)

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 0bfe12da57..8d980ce18d 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -570,6 +570,60 @@ static paddr_t __init next_module(paddr_t s, paddr_t *end)
     return lowest;
 }
 
+#ifdef CONFIG_COLORING
+/**
+ * get_xen_paddr - get physical address to relocate Xen to
+ *
+ * Xen is relocated to as near to the top of RAM as possible and
+ * aligned to a XEN_PADDR_ALIGN boundary.
+ */
+static paddr_t __init get_xen_paddr(uint32_t xen_size)
+{
+    struct meminfo *mi = &bootinfo.mem;
+    paddr_t min_size;
+    paddr_t paddr = 0;
+    int i;
+
+    min_size = (xen_size + (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1);
+
+    /* Find the highest bank with enough space. */
+    for ( i = 0; i < mi->nr_banks; i++ )
+    {
+        const struct membank *bank = &mi->bank[i];
+        paddr_t s, e;
+
+        if ( bank->size >= min_size )
+        {
+            e = consider_modules(bank->start, bank->start + bank->size,
+                                 min_size, XEN_PADDR_ALIGN, 0);
+            if ( !e )
+                continue;
+
+#ifdef CONFIG_ARM_32
+            /* Xen must be under 4GB */
+            if ( e > 0x100000000ULL )
+                e = 0x100000000ULL;
+            if ( e < bank->start )
+                continue;
+#endif
+
+            s = e - min_size;
+
+            if ( s > paddr )
+                paddr = s;
+        }
+    }
+
+    if ( !paddr )
+        panic("Not enough memory to relocate Xen\n");
+
+    printk("Placing Xen at 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+           paddr, paddr + min_size);
+
+    return paddr;
+}
+#endif
+
 static void __init init_pdx(void)
 {
     paddr_t bank_start, bank_size, bank_end;
-- 
2.30.2




* [PATCH 26/36] xen/arm: add argument to remove_early_mappings
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (24 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 25/36] xen/arm: bring back get_xen_paddr Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-14 19:59   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
                   ` (9 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Upcoming patches will need to remove temporary mappings created during
the Xen coloring process. The function remove_early_mappings does what
we need, but it is hard-coded for the boot FDT mapping. Parametrize the
function to avoid code duplication.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Acked-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/include/asm/mm.h | 2 +-
 xen/arch/arm/mm.c             | 8 ++++----
 xen/arch/arm/setup.c          | 3 ++-
 3 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 9ac1767595..041ec4ee70 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -184,7 +184,7 @@ extern void setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr);
 /* Map FDT in boot pagetable */
 extern void *early_fdt_map(paddr_t fdt_paddr);
 /* Remove early mappings */
-extern void remove_early_mappings(void);
+extern void remove_early_mappings(unsigned long va, unsigned long size);
 /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
  * new page table */
 extern int init_secondary_pagetables(int cpu);
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index fd7a313d88..d69f18b5d2 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -597,13 +597,13 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
     return fdt_virt;
 }
 
-void __init remove_early_mappings(void)
+void __init remove_early_mappings(unsigned long va, unsigned long size)
 {
     lpae_t pte = {0};
-    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START), pte);
-    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START + SZ_2M),
+    write_pte(xen_second + second_table_offset(va), pte);
+    write_pte(xen_second + second_table_offset(va + size),
               pte);
-    flush_xen_tlb_range_va(BOOT_FDT_VIRT_START, BOOT_FDT_SLOT_SIZE);
+    flush_xen_tlb_range_va(va, size);
 }
 
 /*
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 8d980ce18d..13b10515a8 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -41,6 +41,7 @@
 #include <xen/libfdt/libfdt.h>
 #include <xen/acpi.h>
 #include <xen/warning.h>
+#include <xen/sizes.h>
 #include <asm/alternative.h>
 #include <asm/page.h>
 #include <asm/current.h>
@@ -426,7 +427,7 @@ void __init discard_initial_modules(void)
 
     mi->nr_mods = 0;
 
-    remove_early_mappings();
+    remove_early_mappings(BOOT_FDT_VIRT_START, SZ_2M);
 }
 
 /* Relocate the FDT in Xen heap */
-- 
2.30.2




* [PATCH 27/36] xen/arch: add coloring support for Xen
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (25 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 26/36] xen/arm: add argument to remove_early_mappings Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 19:47   ` Julien Grall
                     ` (2 more replies)
  2022-03-04 17:46 ` [PATCH 28/36] xen/arm: introduce xen_map_text_rw Marco Solieri
                   ` (8 subsequent siblings)
  35 siblings, 3 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Introduce a new implementation of setup_pagetables that uses coloring
logic to isolate Xen code according to its color selection.
Page table construction is essentially copied from the existing
implementation, except for the xenmap table, where coloring logic is
needed.  Given the absence of a contiguous physical mapping, pointers to
next-level tables need to be calculated manually.

Xen code is relocated in strided mode using the same coloring logic as
the one in xenmap table by using a temporary colored mapping that will
be destroyed after switching the TTBR register.
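
To make the "strided" layout concrete, here is a minimal standalone
sketch of how consecutive virtual pages can be backed by physical pages
of selected colors only. It is an illustration, not the series' code:
it assumes the color of a page is given by the low bits of its frame
number and that the number of colors is a power of two, and the names
addr_to_color, next_colored and colored_layout are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t paddr_t;
#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Assumption: color = low bits of the page frame number. */
static unsigned int addr_to_color(paddr_t pa, unsigned int max_colors)
{
    return (unsigned int)((pa >> PAGE_SHIFT) & (max_colors - 1));
}

/*
 * Return the first page at or after pa whose color is selected.
 * Bit c of color_mask set means color c belongs to Xen.
 */
static paddr_t next_colored(paddr_t pa, uint32_t color_mask,
                            unsigned int max_colors)
{
    /* max_colors must be a power of two no larger than 32. */
    assert(max_colors >= 1 && max_colors <= 32 &&
           (max_colors & (max_colors - 1)) == 0);
    /* At least one usable color, otherwise the loop never ends. */
    assert(max_colors == 32 || (color_mask & ((1u << max_colors) - 1)));
    assert(color_mask != 0);

    pa &= ~(PAGE_SIZE - 1);
    while (!(color_mask & (1u << addr_to_color(pa, max_colors))))
        pa += PAGE_SIZE;
    return pa;
}

/* Fill out[] with the physical pages backing nr_pages virtual pages. */
static void colored_layout(paddr_t base, uint32_t color_mask,
                           unsigned int max_colors,
                           paddr_t *out, unsigned int nr_pages)
{
    for (unsigned int i = 0; i < nr_pages; i++) {
        base = next_colored(base, color_mask, max_colors);
        out[i] = base;
        base += PAGE_SIZE;
    }
}
```

With 4 colors and only color 0 selected, consecutive virtual pages land
on every fourth physical page, which is why the relocated image spans
more physical memory than its own size.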

Keep the original Xen text section mapped in the newly created
pagetables. The boot process computes the physical addresses of Xen code
by applying a fixed offset, but the colored mapping is not linear and
not easily computable. Therefore, the old Xen code is temporarily kept
and used to boot secondary CPUs until they switch to the colored
mapping. The old copy is accessed using the virt_boot_xen macro.  After
the boot process, the old Xen code memory is reset and its mapping is
destroyed.
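
The address arithmetic behind the virt_boot_xen macro can be sketched as
follows. This is a standalone illustration with assumed constant values
and hypothetical helper names (to_boot_alias, boot_alias_to_phys); it
relies on the old copy being mapped contiguously at
BOOT_RELOC_VIRT_START, as the patch does in setup_pagetables.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t vaddr_t;
typedef uint64_t paddr_t;

/* Illustrative values only; the real ones come from Xen's headers. */
#define XEN_VIRT_START        0x00200000ULL
#define BOOT_RELOC_VIRT_START 0x00800000ULL

/* Translate a Xen virtual address to its alias in the old (boot) copy. */
static vaddr_t to_boot_alias(vaddr_t virt)
{
    assert(virt >= XEN_VIRT_START);
    return (virt - XEN_VIRT_START) + BOOT_RELOC_VIRT_START;
}

/*
 * The old copy is physically contiguous, so its physical address is a
 * plain offset from the original load address.
 */
static paddr_t boot_alias_to_phys(vaddr_t alias, paddr_t old_xen_paddr)
{
    assert(alias >= BOOT_RELOC_VIRT_START);
    return old_xen_paddr + (alias - BOOT_RELOC_VIRT_START);
}
```

This is the property the secondary CPU boot path depends on: a linear
offset still works for the old copy, while it does not for the colored
one.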

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/include/asm/coloring.h |  13 ++
 xen/arch/arm/include/asm/mm.h       |   7 ++
 xen/arch/arm/mm.c                   | 186 +++++++++++++++++++++++++++-
 xen/arch/arm/psci.c                 |   4 +-
 xen/arch/arm/setup.c                |  21 +++-
 xen/arch/arm/smpboot.c              |  19 ++-
 6 files changed, 241 insertions(+), 9 deletions(-)

diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
index 8c4525677c..424f6c2b04 100644
--- a/xen/arch/arm/include/asm/coloring.h
+++ b/xen/arch/arm/include/asm/coloring.h
@@ -26,6 +26,17 @@
 #ifdef CONFIG_COLORING
 #include <xen/sched.h>
 
+/*
+ * Amount of memory that we need to map in order to color Xen.  The value
+ * depends on the maximum number of available colors of the hardware.  The
+ * memory size is pessimistically calculated assuming only one color is used,
+ * which means that pages belonging to any other color have to be skipped.
+ */
+#define XEN_COLOR_MAP_SIZE \
+	((((_end - _start) * get_max_colors())\
+		+ (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1))
+#define XEN_COLOR_MAP_SIZE_M (XEN_COLOR_MAP_SIZE >> 20)
+
 bool __init coloring_init(void);
 
 /*
@@ -67,6 +78,8 @@ unsigned long color_from_page(struct page_info *pg);
 /* Return the maximum available number of colors supported by the hardware */
 uint32_t get_max_colors(void);
 #else /* !CONFIG_COLORING */
+#define XEN_COLOR_MAP_SIZE (_end - _start)
+
 static inline bool __init coloring_init(void)
 {
     return true;
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 041ec4ee70..1422091436 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -362,6 +362,13 @@ void clear_and_clean_page(struct page_info *page);
 
 unsigned int arch_get_dma_bitsize(void);
 
+#ifdef CONFIG_COLORING
+#define virt_boot_xen(virt)\
+    (vaddr_t)((virt - XEN_VIRT_START) + BOOT_RELOC_VIRT_START)
+#else
+#define virt_boot_xen(virt) virt
+#endif
+
 #endif /*  __ARCH_ARM_MM__ */
 /*
  * Local variables:
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index d69f18b5d2..53ea13641b 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -42,6 +42,7 @@
 #include <xen/libfdt/libfdt.h>
 
 #include <asm/setup.h>
+#include <asm/coloring.h>
 
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef virt_to_mfn
@@ -110,6 +111,9 @@ DEFINE_BOOT_PAGE_TABLE(boot_second_id);
 DEFINE_BOOT_PAGE_TABLE(boot_third_id);
 DEFINE_BOOT_PAGE_TABLE(boot_second);
 DEFINE_BOOT_PAGE_TABLE(boot_third);
+#ifdef CONFIG_COLORING
+DEFINE_BOOT_PAGE_TABLE(boot_colored_xen);
+#endif
 
 /* Main runtime page tables */
 
@@ -632,6 +636,166 @@ static void clear_table(void *table)
     clean_and_invalidate_dcache_va_range(table, PAGE_SIZE);
 }
 
+#ifdef CONFIG_COLORING
+/*
+ * Translate a Xen (.text) virtual address to the colored physical one
+ * depending on the hypervisor configuration.
+ * N.B: this function must be used only when migrating from non-colored to
+ * colored pagetables, since it relies on the temporary mappings created
+ * during setup_pagetables, which start at BOOT_RELOC_VIRT_START.
+ * After the migration we have to use virt_to_maddr.
+ */
+static paddr_t virt_to_maddr_colored(vaddr_t virt)
+{
+    unsigned int va_offset;
+
+    va_offset = virt - XEN_VIRT_START;
+    return __pa(BOOT_RELOC_VIRT_START + va_offset);
+}
+
+static void __init coloring_temp_mappings(paddr_t xen_paddr, vaddr_t virt_start)
+{
+    int i;
+    lpae_t pte;
+    unsigned int xen_text_size = (_end - _start);
+
+    xen_text_size = PAGE_ALIGN(xen_text_size);
+
+    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_second)), MT_NORMAL);
+    pte.pt.table = 1;
+    boot_first[first_table_offset(virt_start)] = pte;
+
+    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_colored_xen)), MT_NORMAL);
+    pte.pt.table = 1;
+    boot_second[second_table_offset(virt_start)] = pte;
+
+    for ( i = 0; i < (xen_text_size/PAGE_SIZE); i++ )
+    {
+        mfn_t mfn;
+        xen_paddr = next_xen_colored(xen_paddr);
+        mfn = maddr_to_mfn(xen_paddr);
+        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
+        pte.pt.table = 1; /* 4k mappings always have this bit set */
+        boot_colored_xen[i] = pte;
+        xen_paddr += PAGE_SIZE;
+    }
+
+    flush_xen_tlb_local();
+}
+
+/*
+ * Boot-time pagetable setup with coloring support
+ * Changes here may need matching changes in head.S
+ *
+ * The process can be explained as follows:
+ * - Create a temporary colored mapping that conforms to Xen color selection.
+ * - Update all the pagetables links that point to the next level table(s):
+ * this process is crucial because the translation tables are not physically
+ * contiguous and we cannot calculate the physical addresses by using the
+ * standard method (physical offset). In order to get the correct physical
+ * address we use virt_to_maddr_colored that translates the virtual address
+ * into a physical one based on the Xen coloring configuration.
+ * - Copy Xen to the new location.
+ * - Update TTBR0_EL2 with the new root page table address.
+ */
+void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
+{
+    int i;
+    lpae_t pte, *p;
+    paddr_t pt_phys;
+    mfn_t pt_phys_mfn;
+    paddr_t _xen_paddr = xen_paddr;
+
+    phys_offset = boot_phys_offset;
+
+    ASSERT((_end - _start) < SECOND_SIZE);
+    /* Create temporary mappings */
+    coloring_temp_mappings(xen_paddr, BOOT_RELOC_VIRT_START);
+
+    /* Build pagetables links */
+    p = (void *)xen_pgtable;
+    pt_phys = virt_to_maddr_colored((vaddr_t)xen_first);
+    pt_phys_mfn = maddr_to_mfn(pt_phys);
+    p[0] = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
+    p[0].pt.table = 1;
+    p[0].pt.xn = 0;
+    p = (void *)xen_first;
+
+    for ( i = 0; i < 2; i++ )
+    {
+        pt_phys = virt_to_maddr_colored((vaddr_t)(xen_second + i * LPAE_ENTRIES));
+        pt_phys_mfn = maddr_to_mfn(pt_phys);
+        p[i] = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
+        p[i].pt.table = 1;
+        p[i].pt.xn = 0;
+    }
+
+    for ( i = 0; i < LPAE_ENTRIES; i++ )
+    {
+        mfn_t mfn;
+        vaddr_t va = XEN_VIRT_START + (i << PAGE_SHIFT);
+        _xen_paddr = next_xen_colored(_xen_paddr);
+        mfn = maddr_to_mfn(_xen_paddr);
+        if ( !is_kernel(va) )
+            break;
+        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
+        pte.pt.table = 1; /* 4k mappings always have this bit set */
+        if ( is_kernel_text(va) || is_kernel_inittext(va) )
+        {
+            pte.pt.xn = 0;
+            pte.pt.ro = 1;
+        }
+        if ( is_kernel_rodata(va) )
+            pte.pt.ro = 1;
+        xen_xenmap[i] = pte;
+        _xen_paddr += PAGE_SIZE;
+    }
+
+    /* Initialise xen second level entries ... */
+    /* ... Xen's text etc */
+    pt_phys = virt_to_maddr_colored((vaddr_t)(xen_xenmap));
+    pt_phys_mfn = maddr_to_mfn(pt_phys);
+    pte = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
+    pte.pt.table = 1;
+    xen_second[second_table_offset(XEN_VIRT_START)] = pte;
+
+    /* ... Fixmap */
+    pt_phys = virt_to_maddr_colored((vaddr_t)(xen_fixmap));
+    pt_phys_mfn = maddr_to_mfn(pt_phys);
+    pte = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
+    pte.pt.table = 1;
+    xen_second[second_table_offset(FIXMAP_ADDR(0))] = pte;
+
+    /* ... DTB */
+    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START)];
+    xen_second[second_table_offset(BOOT_FDT_VIRT_START)] = pte;
+    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)];
+    xen_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)] = pte;
+
+    /* Update the value of init_ttbr */
+    init_ttbr = virt_to_maddr_colored((vaddr_t)xen_pgtable);
+    clean_dcache(init_ttbr);
+
+    /* Copy Xen to the new location */
+    memcpy((void*)BOOT_RELOC_VIRT_START,
+        (const void*)XEN_VIRT_START, (_end - _start));
+    clean_dcache_va_range((void*)BOOT_RELOC_VIRT_START, (_end - _start));
+
+    /* Change ttbr */
+    switch_ttbr(init_ttbr);
+
+    /*
+     * Keep mapped old Xen memory in a contiguous mapping
+     * for other cpus to boot. This mapping will also replace the
+     * one created at the beginning of setup_pagetables.
+     */
+    create_mappings(xen_second, BOOT_RELOC_VIRT_START,
+                paddr_to_pfn(XEN_VIRT_START + phys_offset),
+                SZ_2M >> PAGE_SHIFT, SZ_2M);
+
+    xen_pt_enforce_wnx();
+}
+#else
 /* Boot-time pagetable setup.
  * Changes here may need matching changes in head.S */
 void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
@@ -721,6 +885,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     per_cpu(xen_dommap, 0) = cpu0_dommap;
 #endif
 }
+#endif /* !CONFIG_COLORING */
 
 static void clear_boot_pagetables(void)
 {
@@ -735,6 +900,9 @@ static void clear_boot_pagetables(void)
 #endif
     clear_table(boot_second);
     clear_table(boot_third);
+#ifdef CONFIG_COLORING
+    clear_table(boot_colored_xen);
+#endif
 }
 
 #ifdef CONFIG_ARM_64
@@ -742,10 +910,16 @@ int init_secondary_pagetables(int cpu)
 {
     clear_boot_pagetables();
 
+    /*
+     * For coloring the value of the ttbr was already set up during
+     * setup_pagetables.
+     */
+#ifndef CONFIG_COLORING
     /* Set init_ttbr for this CPU coming up. All CPus share a single setof
      * pagetables, but rewrite it each time for consistency with 32 bit. */
     init_ttbr = (uintptr_t) xen_pgtable + phys_offset;
     clean_dcache(init_ttbr);
+#endif
     return 0;
 }
 #else
@@ -859,12 +1033,20 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
         else if ( xenheap_first_first_slot == -1)
         {
             /* Use xenheap_first_first to bootstrap the mappings */
-            first = xenheap_first_first;
+            paddr_t phys_addr;
+
+            /*
+             * At this stage it is safe to use virt_to_maddr because the Xen
+             * mapping is already in place. Using virt_to_maddr allows us to
+             * unify the codepath with and without cache coloring enabled.
+             */
+            phys_addr = virt_to_maddr((vaddr_t)xenheap_first_first);
+            pte = mfn_to_xen_entry(maddr_to_mfn(phys_addr), MT_NORMAL);
 
-            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
             pte.pt.table = 1;
             write_pte(p, pte);
 
+            first = xenheap_first_first;
             xenheap_first_first_slot = slot;
         }
         else
diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
index 0c90c2305c..d443fac6a2 100644
--- a/xen/arch/arm/psci.c
+++ b/xen/arch/arm/psci.c
@@ -25,6 +25,7 @@
 #include <asm/cpufeature.h>
 #include <asm/psci.h>
 #include <asm/acpi.h>
+#include <asm/coloring.h>
 
 /*
  * While a 64-bit OS can make calls with SMC32 calling conventions, for
@@ -49,7 +50,8 @@ int call_psci_cpu_on(int cpu)
 {
     struct arm_smccc_res res;
 
-    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary),
+    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
+                  __pa(virt_boot_xen((vaddr_t)init_secondary)),
                   &res);
 
     return PSCI_RET(res);
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 13b10515a8..294b806120 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -924,6 +924,7 @@ void __init start_xen(unsigned long boot_phys_offset,
     struct domain *d;
     int rc;
     paddr_t xen_paddr = (paddr_t)(_start + boot_phys_offset);
+    uint32_t xen_size = (_end - _start);
 
     dcache_line_bytes = read_dcache_line_bytes();
 
@@ -952,13 +953,16 @@ void __init start_xen(unsigned long boot_phys_offset,
     if ( !coloring_init() )
         panic("Xen Coloring support: setup failed\n");
 
+    xen_size = XEN_COLOR_MAP_SIZE;
+#ifdef CONFIG_COLORING
+    xen_paddr = get_xen_paddr(xen_size);
+#endif
+
     /* Register Xen's load address as a boot module. */
-    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
-                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
+    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr, xen_size, false);
     BUG_ON(!xen_bootmodule);
 
     setup_pagetables(boot_phys_offset, xen_paddr);
-
     setup_mm();
 
     /* Parse the ACPI tables for possible boot-time configuration */
@@ -1072,6 +1076,17 @@ void __init start_xen(unsigned long boot_phys_offset,
 
     setup_virt_paging();
 
+    /*
+     * This removal is useful if cache coloring is enabled but
+     * it should not affect non coloring configuration.
+     * The removal is done earlier than discard_initial_modules
+     * because in do_initcalls there is the livepatch support
+     * setup which uses the virtual addresses starting from
+     * BOOT_RELOC_VIRT_START.
+     * Remove coloring mappings to expose a clear state to the
+     * livepatch module.
+     */
+    remove_early_mappings(BOOT_RELOC_VIRT_START, SZ_2M);
     do_initcalls();
 
     /*
diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
index 7bfd0a73a7..5ef68976c9 100644
--- a/xen/arch/arm/smpboot.c
+++ b/xen/arch/arm/smpboot.c
@@ -438,6 +438,7 @@ int __cpu_up(unsigned int cpu)
 {
     int rc;
     s_time_t deadline;
+    vaddr_t *smp_up_cpu_addr;
 
     printk("Bringing up CPU%d\n", cpu);
 
@@ -453,10 +454,22 @@ int __cpu_up(unsigned int cpu)
     /* Tell the remote CPU what its logical CPU ID is. */
     init_data.cpuid = cpu;
 
+    /*
+     * If coloring is enabled, non-Master CPUs boot using the old Xen code.
+     * During the boot process each cpu is booted one after another using the
+     * smp_up_cpu variable. This variable is accessed in head.S using its
+     * physical address.
+     * That address is calculated using the physical offset of the old Xen
+     * code. With coloring we can not rely anymore on that offset. For this
+     * reason in order to boot the other cpus we rely on the old xen code that
+     * was mapped during tables setup in mm.c so that we can use the old physical
+     * offset and the old head.S code also. In order to modify the old Xen code
+     * we need to access it using the mapping created in setup_pagetables.
+     */
+    smp_up_cpu_addr = (vaddr_t *)virt_boot_xen((vaddr_t)&smp_up_cpu);
+    *smp_up_cpu_addr = cpu_logical_map(cpu);
     /* Open the gate for this CPU */
-    smp_up_cpu = cpu_logical_map(cpu);
-    clean_dcache(smp_up_cpu);
-
+    clean_dcache(*smp_up_cpu_addr);
     rc = arch_cpu_up(cpu);
 
     console_end_sync();
-- 
2.30.2




* [PATCH 28/36] xen/arm: introduce xen_map_text_rw
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (26 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:39   ` Jan Beulich
  2022-03-04 17:46 ` [PATCH 29/36] xen/arm: add dump function for coloring info Marco Solieri
                   ` (7 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Introduce two new Arm-specific functions, xen_map_text_rw and
xen_unmap_text_rw, to temporarily map/unmap the Xen text read-write
(the Xen text is mapped read-only by default by setup_pagetables).

There is only one caller in the alternative framework.

The non-colored implementation simply uses __vmap to do the mapping. In
other words, there are no changes to the non-colored case.

The colored implementation calculates Xen text physical addresses
appropriately, according to the coloring configuration.
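
The colored implementation maps 1 << order pages one at a time, where
order is derived from the size of the Xen image. A minimal standalone
sketch of that page-count computation follows. The helper names
(order_from_bytes, pages_to_map) are hypothetical, and the order helper
is only modelled on what get_order_from_bytes is expected to return
(the smallest power-of-two number of pages covering the size).

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT 12

/* Smallest order such that (1 << order) pages cover size bytes. */
static unsigned int order_from_bytes(size_t size)
{
    unsigned int order = 0;

    if (size == 0)
        return 0;

    /* Index of the last page needed, then count its significant bits. */
    size = (size - 1) >> PAGE_SHIFT;
    while (size) {
        size >>= 1;
        order++;
    }
    return order;
}

/* Number of 4 KiB pages the read-write alias has to cover. */
static unsigned int pages_to_map(size_t xen_bytes)
{
    unsigned int order = order_from_bytes(xen_bytes);

    assert(order < 32);
    return 1u << order;
}
```

Because the count is rounded up to a power of two, the alias can cover
more pages than the image actually occupies.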

Export vm_alloc because it is needed by the colored implementation of
xen_map_text_rw.

Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/alternative.c    |  8 ++------
 xen/arch/arm/include/asm/mm.h |  3 +++
 xen/arch/arm/mm.c             | 38 +++++++++++++++++++++++++++++++++++
 xen/common/vmap.c             |  4 ++--
 xen/include/xen/vmap.h        |  2 ++
 5 files changed, 47 insertions(+), 8 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 237c4e5642..2481521c9c 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -185,9 +185,6 @@ static int __apply_alternatives_multi_stop(void *unused)
     {
         int ret;
         struct alt_region region;
-        mfn_t xen_mfn = virt_to_mfn(_start);
-        paddr_t xen_size = _end - _start;
-        unsigned int xen_order = get_order_from_bytes(xen_size);
         void *xenmap;
 
         BUG_ON(patched);
@@ -196,8 +193,7 @@ static int __apply_alternatives_multi_stop(void *unused)
          * The text and inittext section are read-only. So re-map Xen to
          * be able to patch the code.
          */
-        xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
-                        VMAP_DEFAULT);
+        xenmap = xen_map_text_rw();
         /* Re-mapping Xen is not expected to fail during boot. */
         BUG_ON(!xenmap);
 
@@ -208,7 +204,7 @@ static int __apply_alternatives_multi_stop(void *unused)
         /* The patching is not expected to fail during boot. */
         BUG_ON(ret != 0);
 
-        vunmap(xenmap);
+        xen_unmap_text_rw(xenmap);
 
         /* Barriers provided by the cache flushing */
         write_atomic(&patched, 1);
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 1422091436..defb1efaad 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -195,6 +195,9 @@ extern void mmu_init_secondary_cpu(void);
 extern void setup_xenheap_mappings(unsigned long base_mfn, unsigned long nr_mfns);
 /* Map a frame table to cover physical addresses ps through pe */
 extern void setup_frametable_mappings(paddr_t ps, paddr_t pe);
+/* Create temporary Xen text read-write mapping */
+extern void *xen_map_text_rw(void);
+extern void xen_unmap_text_rw(void *va);
 /* Map a 4k page in a fixmap entry */
 extern void set_fixmap(unsigned map, mfn_t mfn, unsigned attributes);
 /* Remove a mapping from a fixmap entry */
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index 53ea13641b..b18c7cd373 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -637,6 +637,31 @@ static void clear_table(void *table)
 }
 
 #ifdef CONFIG_COLORING
+void *__init xen_map_text_rw(void)
+{
+    paddr_t xen_paddr = __pa(_start);
+    unsigned int xen_size = 1 << get_order_from_bytes(_end - _start);
+    void *va = vm_alloc(xen_size, 1, VMAP_DEFAULT);
+    unsigned long cur = (unsigned long)va;
+    mfn_t mfn_col;
+    unsigned int i;
+
+    for ( i = 0; i < xen_size; i++, cur += PAGE_SIZE )
+    {
+        xen_paddr = next_xen_colored(xen_paddr);
+        mfn_col = maddr_to_mfn(xen_paddr);
+        if ( map_pages_to_xen(cur, mfn_col, 1, PAGE_HYPERVISOR) )
+            return NULL;
+        xen_paddr += PAGE_SIZE;
+    }
+    return va;
+}
+
+void __init xen_unmap_text_rw(void *va)
+{
+    vunmap(va);
+}
+
 /*
  * Translate a Xen (.text) virtual address to the colored physical one
  * depending on the hypervisor configuration.
@@ -796,6 +821,19 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
     xen_pt_enforce_wnx();
 }
 #else
+void *__init xen_map_text_rw(void)
+{
+    unsigned int xen_order = get_order_from_bytes(_end - _start);
+    mfn_t xen_mfn = virt_to_mfn(_start);
+    return __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
+                  VMAP_DEFAULT);
+}
+
+void __init xen_unmap_text_rw(void *va)
+{
+    vunmap(va);
+}
+
 /* Boot-time pagetable setup.
  * Changes here may need matching changes in head.S */
 void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
diff --git a/xen/common/vmap.c b/xen/common/vmap.c
index 4fd6b3067e..bedfc9d418 100644
--- a/xen/common/vmap.c
+++ b/xen/common/vmap.c
@@ -45,8 +45,8 @@ void __init vm_init_type(enum vmap_region type, void *start, void *end)
     populate_pt_range(va, vm_low[type] - nr);
 }
 
-static void *vm_alloc(unsigned int nr, unsigned int align,
-                      enum vmap_region t)
+void *vm_alloc(unsigned int nr, unsigned int align,
+               enum vmap_region t)
 {
     unsigned int start, bit;
 
diff --git a/xen/include/xen/vmap.h b/xen/include/xen/vmap.h
index b0f7632e89..dcf2be692f 100644
--- a/xen/include/xen/vmap.h
+++ b/xen/include/xen/vmap.h
@@ -12,6 +12,8 @@ enum vmap_region {
 
 void vm_init_type(enum vmap_region type, void *start, void *end);
 
+void *vm_alloc(unsigned int nr, unsigned int align,
+               enum vmap_region t);
 void *__vmap(const mfn_t *mfn, unsigned int granularity, unsigned int nr,
              unsigned int align, unsigned int flags, enum vmap_region);
 void *vmap(const mfn_t *mfn, unsigned int nr);
-- 
2.30.2




* [PATCH 29/36] xen/arm: add dump function for coloring info
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (27 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 28/36] xen/arm: introduce xen_map_text_rw Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 30/36] xen/arm: add coloring support to dom0less Marco Solieri
                   ` (6 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio, Stefano Stabellini

From: Luca Miccio <lucmiccio@gmail.com>

Display general information about coloring support both during boot and
when requested by the user.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
---
 xen/arch/arm/coloring.c | 24 ++++++++++++++++++++++++
 1 file changed, 24 insertions(+)

diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
index aae3c77a7b..c590e1629a 100644
--- a/xen/arch/arm/coloring.c
+++ b/xen/arch/arm/coloring.c
@@ -24,6 +24,7 @@
 #include <xen/types.h>
 #include <xen/lib.h>
 #include <xen/errno.h>
+#include <xen/keyhandler.h>
 #include <xen/param.h>
 
 #include <asm/sysregs.h>
@@ -434,6 +435,29 @@ void coloring_dump_info(struct domain *d)
     printk("]\n");
 }
 
+static void dump_coloring_info(unsigned char key)
+{
+    int i;
+
+    printk("Coloring general information\n");
+    printk("Way size: %"PRIu64"kB\n", way_size >> 10);
+    printk("Max. number of colors available: %"PRIu32"\n", max_col_num);
+
+    printk("Xen color(s):\t[");
+    for ( i = 0; i < xen_col_num; i++ )
+        printk(" %"PRIu32" ", xen_col_list[i]);
+    printk("]\n");
+}
+
+static __init int register_heap_trigger(void)
+{
+    register_keyhandler('C', dump_coloring_info, "dump coloring general info", 1);
+
+    /* Also print general information once at boot */
+    dump_coloring_info('C');
+    return 0;
+}
+__initcall(register_heap_trigger);
 /*
  * Local variables:
  * mode: C
-- 
2.30.2




* [PATCH 30/36] xen/arm: add coloring support to dom0less
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (28 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 29/36] xen/arm: add dump function for coloring info Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 31/36] Disable coloring if static memory support is selected Marco Solieri
                   ` (5 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Dom0less color assignment is performed via Device Tree with a new
attribute "colors". In this case the color assignment is represented by
a bitmask where it suffices to set all and only the bits having a
position equal to the chosen colors, leaving unset all the others.
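
As a rough illustration of that encoding (a sketch only, not the code in
this patch): the helper names below are hypothetical, and the mask is
assumed to be already converted to CPU byte order and stored as 32-bit
words with the least significant word first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: is the bit for `color` set in the mask? */
bool color_is_selected(const uint32_t *words, unsigned int nr_words,
                       unsigned int color)
{
    unsigned int word = color / 32;

    /* Colors beyond the end of the mask are never selected. */
    if ( word >= nr_words )
        return false;

    return words[word] & (UINT32_C(1) << (color % 32));
}

/* Hypothetical helper: number of colors selected by the whole mask. */
unsigned int count_selected_colors(const uint32_t *words,
                                   unsigned int nr_words)
{
    unsigned int color, count = 0;

    for ( color = 0; color < nr_words * 32; color++ )
        if ( color_is_selected(words, nr_words, color) )
            count++;

    return count;
}
```

With words = { 0x0f00, 0x0 }, which corresponds to a property value of
<0x0 0x0f00>, bits 8 to 11 are set, so four colors are selected.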

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 xen/arch/arm/domain_build.c | 30 ++++++++++++++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 03a2573d67..c7ca45c0c4 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -27,6 +27,7 @@
 #include <asm/setup.h>
 #include <asm/cpufeature.h>
 #include <asm/domain_build.h>
+#include <asm/coloring.h>
 
 #include <xen/irq.h>
 #include <xen/grant_table.h>
@@ -3173,6 +3174,10 @@ void __init create_domUs(void)
 {
     struct dt_device_node *node;
     const struct dt_device_node *chosen = dt_find_node_by_path("/chosen");
+    u32 col_val;
+    const u32 *cells;
+    u32 len;
+    int cell, i, k;
 
     BUG_ON(chosen == NULL);
     dt_for_each_child_node(chosen, node)
@@ -3241,6 +3246,31 @@ void __init create_domUs(void)
                                          vpl011_virq - 32 + 1);
         }
 
+        d_cfg.arch.colors.max_colors = 0;
+        memset(&d_cfg.arch.colors.colors, 0x0, sizeof(d_cfg.arch.colors.colors));
+
+        cells = dt_get_property(node, "colors", &len);
+        if ( cells != NULL && len > 0 )
+        {
+            if ( !get_max_colors() )
+                panic("Coloring requested but no colors configuration found!\n");
+
+            if ( len > sizeof(d_cfg.arch.colors.colors) )
+                panic("Dom0less DomU color information is invalid\n");
+
+            for ( k = 0, cell = len/4 - 1; cell >= 0; cell--, k++ )
+            {
+                col_val = be32_to_cpup(&cells[cell]);
+                if ( col_val )
+                {
+                    /* Count the set bits, i.e. the number of colors */
+                    for ( i = 0; i < 32; i++ )
+                        if ( col_val & (1U << i) )
+                            d_cfg.arch.colors.max_colors++;
+                    d_cfg.arch.colors.colors[k] = col_val;
+                }
+            }
+        }
         /*
          * The variable max_init_domid is initialized with zero, so here it's
          * very important to use the pre-increment operator to call
-- 
2.30.2




* [PATCH 31/36] Disable coloring if static memory support is selected
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (29 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 30/36] xen/arm: add coloring support to dom0less Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-14 20:04   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 32/36] xen/arm: reduce the number of supported colors Marco Solieri
                   ` (4 subsequent siblings)
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Static memory assumes that physically contiguous memory is mapped to
domains. This assumption does not hold when coloring is enabled, so the
two features have to be mutually exclusive.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/arch/arm/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index f0f999d172..8f8be9d754 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -100,6 +100,7 @@ config HARDEN_BRANCH_PREDICTOR
 config COLORING
 	bool "L2 cache coloring"
 	default n
+	depends on !STATIC_MEMORY
 	depends on ARM_64
 
 config TEE
-- 
2.30.2




* [PATCH 32/36] xen/arm: reduce the number of supported colors
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (30 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 31/36] Disable coloring if static memory support is selected Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:46 ` [PATCH 33/36] doc, xen-command-line: introduce coloring options Marco Solieri
                   ` (3 subsequent siblings)
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Coloring support currently breaks an assertion at domctl.c:892 because
of the data structure used for the color configuration. The array is
currently sized to support up to 128 colors.
Lower the number of supported colors to 64 until a better solution is
found.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
---
 xen/include/public/arch-arm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
index 627cc42164..5e2eaa02ad 100644
--- a/xen/include/public/arch-arm.h
+++ b/xen/include/public/arch-arm.h
@@ -303,7 +303,7 @@ struct vcpu_guest_context {
 typedef struct vcpu_guest_context vcpu_guest_context_t;
 DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
 
-#define MAX_COLORS_CELLS 4
+#define MAX_COLORS_CELLS 2
 struct color_guest_config {
     uint32_t max_colors;
     uint32_t colors[MAX_COLORS_CELLS];
-- 
2.30.2




* [PATCH 33/36] doc, xen-command-line: introduce coloring options
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (31 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 32/36] xen/arm: reduce the number of supported colors Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-07  7:42   ` Jan Beulich
  2022-03-14 22:07   ` Julien Grall
  2022-03-04 17:46 ` [PATCH 34/36] doc, xl.cfg: introduce coloring configuration option Marco Solieri
                   ` (2 subsequent siblings)
  35 siblings, 2 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Four additional parameters in the Xen command line are used to define
the underlying coloring policy, which is not directly configurable
otherwise.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 docs/misc/xen-command-line.pandoc | 51 +++++++++++++++++++++++++++++--
 1 file changed, 49 insertions(+), 2 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index efda335652..a472d51cf9 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -299,6 +299,20 @@ can be maintained with the pv-shim mechanism.
     cause Xen not to use Indirect Branch Tracking even when support is
     available in hardware.
 
+### buddy\_size (arm64)
+> `= <size in megabytes>`
+
+> Default: `64 MB`
+
+Amount of memory reserved for the buddy allocator when the colored allocator
+is active. This option is useful only if coloring support is enabled.
+The colored allocator is meant as an alternative to the buddy allocator,
+since its allocation policy is by definition incompatible with the
+generic one. Since the Xen heap system is not colored yet, we need to
+support the coexistence of the two allocators for now. This parameter, which is
+optional and for experts only, is used to set the amount of memory reserved for
+the buddy allocator.
+
 ### clocksource (x86)
 > `= pit | hpet | acpi | tsc`
 
@@ -884,7 +898,17 @@ Controls for the dom0 IOMMU setup.
 
     Incorrect use of this option may result in a malfunctioning system.
 
-### dom0_ioports_disable (x86)
+### dom0\_colors (arm64)
+> `= List of <integer>-<integer>`
+
+> Default: `All available colors`
+
+Specify dom0 color configuration. If the parameter is not set, all available
+colors are chosen and the user is warned on Xen's serial console. This color
+configuration also acts as the default one for all DomUs that do not have any
+explicit color assignment in their configuration file.
+
+### dom0\_ioports\_disable (x86)
 > `= List of <hex>-<hex>`
 
 Specify a list of IO ports to be excluded from dom0 access.
@@ -2625,6 +2649,20 @@ unknown NMIs will still be processed.
 Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
 the watchdog.
 
+### way\_size (arm64)
+> `= <size in bytes>`
+
+> Default: `Obtained from the hardware`
+
+Specify the way size of the Last Level Cache. This parameter is only useful with
+coloring support enabled. It is an optional, expert-only parameter and it is
+used to calculate which bits in the physical address can be used by the coloring
+algorithm, and thus the maximum available colors on the platform. It can be
+obtained by dividing the total LLC size by the number of associativity ways.
+By default, the value is automatically computed during coloring
+initialization to avoid any kind of misconfiguration. For this reason, it is
+highly recommended to use this boot argument only when there is a specific need.
+
 ### x2apic (x86)
 > `= <boolean>`
 
@@ -2642,7 +2680,16 @@ In the case that x2apic is in use, this option switches between physical and
 clustered mode.  The default, given no hint from the **FADT**, is cluster
 mode.
 
-### xenheap_megabytes (arm32)
+### xen\_colors (arm64)
+> `= List of <integer>-<integer>`
+
+> Default: `0-0: the lowermost color`
+
+Specify Xen color configuration.
+Two colors are most likely needed on platforms where private caches are
+physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
+
+### xenheap\_megabytes (arm32)
 > `= <size>`
 
 > Default: `0` (1/32 of RAM)
-- 
2.30.2




* [PATCH 34/36] doc, xl.cfg: introduce coloring configuration option
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (32 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 33/36] doc, xen-command-line: introduce coloring options Marco Solieri
@ 2022-03-04 17:46 ` Marco Solieri
  2022-03-04 17:47 ` [PATCH 35/36] doc, device-tree: introduce 'colors' property Marco Solieri
  2022-03-04 17:47 ` [PATCH 36/36] doc, arm: add usage documentation for cache coloring support Marco Solieri
  35 siblings, 0 replies; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:46 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

The color selection has to be specified in the configuration file of the
virtual machine with the new parameter 'colors'. This parameter
defines the colors to be assigned to that particular VM, expressed as a
list of ranges.
Add documentation for the new 'colors' parameter.
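
To make the "list of ranges" format concrete, here is a minimal sketch of
a parser. It is not the parser used by this series: the function name,
the 64-color limit of the result mask, and the rule that every color must
be below max_colors are assumptions made for illustration.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical parser: turn a selection such as "1-2,5-8" into a 64-bit
 * mask with one bit per color. Returns 0 on success, -EINVAL otherwise.
 */
int sketch_parse_color_ranges(const char *s, unsigned int max_colors,
                              uint64_t *mask)
{
    *mask = 0;

    if ( max_colors > 64 )
        return -EINVAL;

    for ( ; ; )
    {
        char *end;
        unsigned long start, stop;

        /* Require a digit so that spaces and signs are rejected. */
        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = strtoul(s, &end, 10);

        if ( *end != '-' )
            return -EINVAL;
        s = end + 1;

        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        stop = strtoul(s, &end, 10);

        if ( start > stop || stop >= max_colors )
            return -EINVAL;

        for ( unsigned long c = start; c <= stop; c++ )
            *mask |= UINT64_C(1) << c;

        /* Either the string ends here or another range follows. */
        if ( *end == '\0' )
            return 0;
        if ( *end != ',' )
            return -EINVAL;
        s = end + 1;
    }
}
```

Overlapping ranges simply set the same bits twice, so "0-8,3-8" yields
the same mask as "0-8".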

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 docs/man/xl.cfg.5.pod.in | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index b98d161398..98c2da0c9e 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -2865,6 +2865,20 @@ Currently, only the "sbsa_uart" model is supported for ARM.
 
 =back
 
+=over 4
+
+=item B<colors=[ "COLORS_RANGE", "COLORS_RANGE", ...]>
+
+Specify the color configuration for the guest. B<COLORS_RANGE> is expressed
+using color numbers. Valid color numbers start at 0, and the upper bound is
+set by the number of available colors.
+The number of available colors depends on the LLC layout of the specific
+platform and determines the maximum allowed value.  This number can be either
+calculated or read from the output given by the hypervisor during boot, if
+DEBUG logging is enabled.
+
+=back
+
 =head3 x86
 
 =over 4
-- 
2.30.2




* [PATCH 35/36] doc, device-tree: introduce 'colors' property
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (33 preceding siblings ...)
  2022-03-04 17:46 ` [PATCH 34/36] doc, xl.cfg: introduce coloring configuration option Marco Solieri
@ 2022-03-04 17:47 ` Marco Solieri
  2022-03-14 22:17   ` Julien Grall
  2022-03-04 17:47 ` [PATCH 36/36] doc, arm: add usage documentation for cache coloring support Marco Solieri
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Dom0less uses device tree for DomUs when booting them without using
Dom0. Add a new device tree property 'colors' that specifies the
coloring configuration for DomUs when using Dom0less.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 docs/misc/arm/device-tree/booting.txt | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index a94125394e..44971bfe60 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -162,6 +162,9 @@ with the following properties:
 
     An integer specifying the number of vcpus to allocate to the guest.
 
+- colors
+    A 64-bit bitmask specifying the color configuration for the guest.
+
 - vpl011
 
     An empty property to enable/disable a virtual pl011 for the guest to
-- 
2.30.2




* [PATCH 36/36] doc, arm: add usage documentation for cache coloring support
  2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
                   ` (34 preceding siblings ...)
  2022-03-04 17:47 ` [PATCH 35/36] doc, device-tree: introduce 'colors' property Marco Solieri
@ 2022-03-04 17:47 ` Marco Solieri
  2022-03-15 19:23   ` Julien Grall
  35 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-04 17:47 UTC (permalink / raw)
  To: xen-devel
  Cc: Marco Solieri, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Luca Miccio

From: Luca Miccio <lucmiccio@gmail.com>

Add basic documentation that shows how cache coloring support can be
used in Xen. It introduces the basic concepts behind cache coloring,
defines the color selection format, and explains how to assign colors to
the supported domains: Dom0, DomUs and Xen itself. Known issues are
also reported.
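
The arithmetic behind the color count can be sketched as follows. This is
only an illustration under stated assumptions: 4 KiB pages, an LLC that is
indexed by the low physical address bits, and hypothetical function names
that do not appear in the series.

```c
#include <stdint.h>

#define SKETCH_PAGE_SIZE UINT64_C(4096)   /* assumption: 4 KiB pages */

/* Number of colors: way size divided by page size (0 if the way is too small). */
unsigned int sketch_nr_colors(uint64_t way_size)
{
    return (unsigned int)(way_size / SKETCH_PAGE_SIZE);
}

/* Color of the page containing paddr; returns 0 when coloring is not possible. */
unsigned int sketch_color_of(uint64_t paddr, uint64_t way_size)
{
    unsigned int nr = sketch_nr_colors(way_size);

    if ( nr == 0 )
        return 0;

    /* Consecutive pages cycle through the colors, region after region. */
    return (unsigned int)((paddr / SKETCH_PAGE_SIZE) % nr);
}
```

For a 1 MiB, 16-way LLC the way size is 64 KiB, which gives 16 colors;
the page at physical address 0x10000 then has color 0 again.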

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
 docs/misc/arm/cache_coloring.rst | 191 +++++++++++++++++++++++++++++++
 1 file changed, 191 insertions(+)
 create mode 100644 docs/misc/arm/cache_coloring.rst

diff --git a/docs/misc/arm/cache_coloring.rst b/docs/misc/arm/cache_coloring.rst
new file mode 100644
index 0000000000..082afb1b6c
--- /dev/null
+++ b/docs/misc/arm/cache_coloring.rst
@@ -0,0 +1,191 @@
+Xen coloring support user's guide
+=================================
+
+The cache coloring support in Xen allows reserving last level cache partitions
+for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
+
+In order to enable and use it, a few steps are needed.
+
+- Enable coloring in the Xen configuration file.
+
+        CONFIG_COLORING=y
+
+- Enable/disable debug information (optional).
+
+        CONFIG_COLORING_DEBUG=y/n
+
+Before digging into configuration instructions, users should first
+understand the basics of cache coloring.
+
+Background
+**********
+
+The cache hierarchy of a modern multi-core CPU typically has its first levels
+dedicated to each core (hence using multiple cache units), while the last level
+is shared among all of them. Such a configuration implies that memory
+operations on one core (e.g. running a DomU) are able to generate interference
+on another core (e.g. hosting another DomU). Cache coloring allows eliminating
+this mutual interference, and thus guaranteeing higher and more predictable
+performance for memory accesses.
+The key concept underlying cache coloring is a fragmentation of the memory
+space into a set of sub-spaces called colors that are mapped to disjoint cache
+partitions. Technically, the whole memory space is first divided into a number
+of consecutive regions. Then each region is in turn divided into a number of
+consecutive sub-colors. The generic i-th color is then obtained by taking the
+i-th sub-color of every region.
+
+.. raw:: html
+
+    <pre>
+                            Region j            Region j+1
+                .....................   ............
+                .                     . .
+                .                       .
+            _ _ _______________ _ _____________________ _ _
+                |     |     |     |     |     |     |
+                | c_0 | c_1 |     | c_n | c_0 | c_1 |
+           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
+                    :                       :
+                    :                       :...         ... .
+                    :                            color 0
+                    :...........................         ... .
+                                                :
+          . . ..................................:
+    </pre>
+
+There are two pragmatic lessons to be learnt.
+
+1. If one wants to avoid cache interference between two domains, different
+   colors need to be used for their memory.
+
+2. Color assignment must favor contiguity in the partitioning. E.g.,
+   assigning colors (0,1) to domain I and (2,3) to domain J is better than
+   assigning colors (0,2) to I and (1,3) to J.
+
+
+Color(s) selection format
+**************************
+
+Regardless of the domain that has to be colored (Dom0, DomUs and Xen),
+the color selection can be expressed using the same syntax.  In particular,
+the latter is expressed as a comma-separated list of hyphen-separated intervals
+of color numbers, as in `0-4,5-8,10-15`.  Ranges are always represented using
+strings. Note that no spaces are allowed.
+
+The number of available colors depends on the LLC layout of the specific
+platform and determines the maximum allowed value.  This number can be either
+calculated [#f1]_ or read from the output given by the hypervisor during boot,
+if DEBUG logging is enabled.
+
+Examples:
+
++---------------------+-----------------------------------+
+|**Configuration**    |**Actual selection**               |
++---------------------+-----------------------------------+
+|  1-2,5-8            | [1, 2, 5, 6, 7, 8]                |
++---------------------+-----------------------------------+
+|  0-8,3-8            | [0, 1, 2, 3, 4, 5, 6, 7, 8]       |
++---------------------+-----------------------------------+
+|  0-0                | [0]                               |
++---------------------+-----------------------------------+
+
+General coloring parameters
+***************************
+
+Four additional parameters in the Xen command line are used to define the
+underlying coloring policy, which is not directly configurable otherwise.
+
+Please refer to the relevant documentation in docs/misc/xen-command-line.pandoc.
+
+Dom0less support
+****************
+Support for the Dom0less experimental feature is provided. Color selection for
+a virtual machine is defined by the attribute `colors`, whose format is not a
+string listing ranges, but a bitmask. It suffices to set all and only the bits
+having a position equal to the chosen colors, leaving unset all the others. For
+example, if we choose 8 colors out of 16, we can use a bitmask with 8 bits set
+and 8 bits unset, like:
+
+- `0xff00` -> `1111 1111 0000 0000`
+- `0x0ff0` -> `0000 1111 1111 0000`
+- `0x3c3c` -> `0011 1100 0011 1100`
+
+Configuration example:
+
+.. raw:: html
+
+    <pre>
+        xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G dom0_max_vcpus=1 sched=null way_size=65536 xen_colors=0-1 dom0_colors=2-6";
+        xen,dom0-bootargs = "console=hvc0 earlycon=xen earlyprintk=xen root=/dev/ram0";
+
+        dom0 {
+            compatible = "xen,linux-zimage", "xen,multiboot-module";
+            reg = <0x0 0x1000000 0x0 15858176>;
+        };
+
+        dom0-ramdisk {
+            compatible = "xen,linux-initrd", "xen,multiboot-module";
+            reg = <0x0 0x2000000 0x0 20638062>;
+        };
+
+        domU0 {
+            #address-cells = <0x1>;
+            #size-cells = <0x1>;
+            compatible = "xen,domain";
+            memory = <0x0 0x40000>;
+            colors = <0x0 0x0f00>;
+            cpus = <0x1>;
+            vpl011 = <0x1>;
+
+            module@2000000 {
+                compatible = "multiboot,kernel", "multiboot,module";
+                reg = <0x2000000 0xffffff>;
+                bootargs = "console=ttyAMA0";
+            };
+
+            module@3000000 {
+                compatible = "multiboot,ramdisk", "multiboot,module";
+                reg = <0x3000000 0xffffff>;
+            };
+        };
+    </pre>
+
+Please refer to the relevant documentation in
+docs/misc/arm/device-tree/booting.txt.
+
+
+Known issues
+************
+
+Explicitly define way_size in QEMU
+##################################
+
+Currently, QEMU does not have a comprehensive cache model, so the cache coloring
+support fails to detect a cache geometry to operate on. In this case, the
+boot hangs as soon as the Xen image is loaded. To overcome this issue, it is
+enough to specify the way_size parameter in the command line. Any multiple
+greater than 1 of the page size allows the coloring mechanism to work, but the
+precise behavior on the system that QEMU is emulating can be obtained with its
+way_size. For instance, set way_size=65536.
+
+
+Fail to boot colored DomUs with large memory size
+#################################################
+
+If the kernel used for Dom0 does not contain the upstream commit
+3941552aec1e04d63999988a057ae09a1c56ebeb and uses the hypercall buffer device,
+colored DomUs with memory size larger than 127 MB cannot be created. This is
+caused by the default limit of this buffer of 64 pages. The solution is to
+manually apply the above patch, or to check if there is an updated version of
+the kernel in use for Dom0 that contains this change.
+
+Notes:
+******
+
+.. [#f1] To compute the number of available colors on a platform, one can simply
+  divide `way_size` by `page_size`, where: `page_size` is the size of the page
+  used on the system (usually 4 KiB); `way_size` is the size of each LLC way.
+  For example, an Arm Cortex-A53 with a 16-way set-associative 1 MiB LLC enables
+  16 colors, when pages are 4 KiB.
+
+
-- 
2.30.2




* Re: [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules"
  2022-03-04 17:46 ` [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules" Marco Solieri
@ 2022-03-04 18:50   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-04 18:50 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> This reverts commit 48fb2a9deba11ee48dde21c5c1aa93b4d4e1043b.
Can you explain why you need to revert this patch?

Also, there is a missing signed-off-by for both Luca and you.

Cheers,

-- 
Julien Grall



* Re: [PATCH 27/36] xen/arch: add coloring support for Xen
  2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
@ 2022-03-04 19:47   ` Julien Grall
  2022-03-09 11:28     ` Julien Grall
  2022-03-14  3:47   ` Henry Wang
  2022-03-14 21:58   ` Julien Grall
  2 siblings, 1 reply; 79+ messages in thread
From: Julien Grall @ 2022-03-04 19:47 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Introduce a new implementation of setup_pagetables that uses coloring
> logic in order to isolate Xen code using its color selection.
> Page tables construction is essentially copied, except for the xenmap
> table, where coloring logic is needed.  Given the absence of a contiguous
> physical mapping, pointers to next level tables need to be manually
> calculated.

The implementation of setup_pagetables() is not compliant with the Arm
Arm, and I plan to completely get rid of it.

The main part that is not compliant is switch_ttbr() because it keeps 
the MMU on. We should switch the MMU off, update the TTBR and then 
switch on the MMU. This implies that we need an identity mapping of the 
part of Xen that will run with MMU off.

I understand that rebuilding the page-tables and therefore switching the 
TTBR will be necessary for cache coloring. So before any new use, I 
would like the implementation of switch_ttbr() to be fixed.

What we will need to do is find space in the virtual layout that also
matches a physical address. With that in place, we could use the mapping
to switch between TTBR.

[...]

>   void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> @@ -721,6 +885,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>       per_cpu(xen_dommap, 0) = cpu0_dommap;
>   #endif
>   }
> +#endif /* !CONFIG_COLORING */
>   
>   static void clear_boot_pagetables(void)
>   {
> @@ -735,6 +900,9 @@ static void clear_boot_pagetables(void)
>   #endif
>       clear_table(boot_second);
>       clear_table(boot_third);
> +#ifdef CONFIG_COLORING
> +    clear_table(boot_colored_xen);
> +#endif

AFAICT, this is going to clear the boot pagetables in the cache coloring 
version of Xen. However, the secondary CPUs will build their page-tables 
using the version in the old Xen.

So you will need to update the code to clear the correct boot page tables.

[...]

> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index 0c90c2305c..d443fac6a2 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -25,6 +25,7 @@
>   #include <asm/cpufeature.h>
>   #include <asm/psci.h>
>   #include <asm/acpi.h>
> +#include <asm/coloring.h>
>   
>   /*
>    * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,7 +50,8 @@ int call_psci_cpu_on(int cpu)
>   {
>       struct arm_smccc_res res;
>   
> -    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary),
> +    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
> +                  __pa(virt_boot_xen((vaddr_t)init_secondary)),
>                     &res);
>   
>       return PSCI_RET(res);
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 13b10515a8..294b806120 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -924,6 +924,7 @@ void __init start_xen(unsigned long boot_phys_offset,
>       struct domain *d;
>       int rc;
>       paddr_t xen_paddr = (paddr_t)(_start + boot_phys_offset);
> +    uint32_t xen_size = (_end - _start);
>   
>       dcache_line_bytes = read_dcache_line_bytes();
>   
> @@ -952,13 +953,16 @@ void __init start_xen(unsigned long boot_phys_offset,
>       if ( !coloring_init() )
>           panic("Xen Coloring support: setup failed\n");
>   
> +    xen_size = XEN_COLOR_MAP_SIZE;
> +#ifdef CONFIG_COLORING
> +    xen_paddr = get_xen_paddr(xen_size);
> +#endif
> +
>       /* Register Xen's load address as a boot module. */
> -    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
> -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> +    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr, xen_size, false);

How do you plan to exclude the memory allocated for the cache coloring version?

Cheers,

-- 
Julien Grall



* Re: [PATCH 08/36] xen/arm: add colored flag to page struct
  2022-03-04 17:46 ` [PATCH 08/36] xen/arm: add colored flag to page struct Marco Solieri
@ 2022-03-04 20:13   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-04 20:13 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> A new allocator enforcing a cache-coloring configuration is going to be
> introduced.  We thus need to distinguish the memory pages assigned to,
> and managed by, such colored allocator from the ordinary buddy
> allocator's ones.  Add a color flag to the page structure.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/include/asm/mm.h | 4 ++++
>   1 file changed, 4 insertions(+)
> 
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 487be7cf59..9ac1767595 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -88,6 +88,10 @@ struct page_info
>            */
>           u32 tlbflush_timestamp;
>       };
> +
> +    /* Is page managed by the cache-colored allocator? */
> +    bool colored;

struct page_info is going to be used quite a lot. In fact, there is one 
per RAM page. So we need to avoid growing the structure.

For Arm64, there are 4 bytes of padding here. But for arm32, there is
none. So the size will increase by another 8 bytes.

In this case, I would use a bit in count_info.
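
To illustrate the idea, here is a minimal sketch. It is not Xen code: the
structure is a stand-in for struct page_info, and the flag name and bit
position are hypothetical. Real code would have to pick a bit that is not
already used by another count_info flag or by the reference count.

```c
#include <stdbool.h>

/* Stand-in for struct page_info; only the field used here is modelled. */
struct sketch_page_info {
    unsigned long count_info;
};

/* Hypothetical flag bit, chosen arbitrarily for this sketch. */
#define SKETCH_PGC_COLORED (1UL << (sizeof(unsigned long) * 8 - 4))

/* Mark or unmark a page as managed by the colored allocator. */
void sketch_page_set_colored(struct sketch_page_info *pg, bool colored)
{
    if ( colored )
        pg->count_info |= SKETCH_PGC_COLORED;
    else
        pg->count_info &= ~SKETCH_PGC_COLORED;
}

/* Is the page managed by the colored allocator? */
bool sketch_page_is_colored(const struct sketch_page_info *pg)
{
    return pg->count_info & SKETCH_PGC_COLORED;
}
```

Because the flag lives in an existing field, the structure does not grow
on either arm32 or arm64.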

Cheers,

-- 
Julien Grall



* Re: [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64
  2022-03-04 17:46 ` [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64 Marco Solieri
@ 2022-03-04 20:54   ` Julien Grall
  2022-03-11 17:39     ` Marco Solieri
  0 siblings, 1 reply; 79+ messages in thread
From: Julien Grall @ 2022-03-04 20:54 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The colored allocator should not make any assumptions on how a color is
> defined, since the definition may change depending on the architecture.
IIUC, you are saying that the mapping from a physical address to a 
way is the same on every Armv8 processor.

Can you provide a reference from the Arm Arm which confirms this statement?
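
For context, a minimal sketch of how a color can be derived from a physical
address once the LLC way size is known, following the description in this
series (way size = LLC size / number of ways). The toy_* names and the 4 KiB
page size are assumptions, and the sketch presumes a physically indexed cache
with a linear index function, which is exactly what the question above asks
to have confirmed:

```c
#include <stdint.h>

#define TOY_PAGE_SHIFT 12 /* assumes 4 KiB pages */

/* Number of colors = way size / page size (way size assumed power of two). */
static uint64_t toy_nr_colors(uint64_t llc_way_size)
{
    return llc_way_size >> TOY_PAGE_SHIFT;
}

/* Mask of the physical-address bits above the page offset that pick a color. */
static uint64_t toy_addr_col_mask(uint64_t llc_way_size)
{
    return (llc_way_size - 1) & ~((UINT64_C(1) << TOY_PAGE_SHIFT) - 1);
}

/* Color of the page containing maddr. */
static unsigned int toy_color_from_maddr(uint64_t maddr, uint64_t llc_way_size)
{
    return (maddr & toy_addr_col_mask(llc_way_size)) >> TOY_PAGE_SHIFT;
}
```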

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 07/36] xen/arm: add coloring data to domains
  2022-03-04 17:46 ` [PATCH 07/36] xen/arm: add coloring data to domains Marco Solieri
@ 2022-03-07  7:22   ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:22 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> We want to be able to associate an assignment of cache colors to each
> domain.  Add a configurable-length array containing a set of color
> indices in the domain data.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>  xen/include/xen/sched.h | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 10ea969c7a..bfbe72b3ea 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -388,6 +388,10 @@ struct domain
>      atomic_t         shr_pages;         /* shared pages */
>      atomic_t         paged_pages;       /* paged-out pages */
>  
> +    /* Coloring. */
> +    uint32_t        *colors;
> +    uint32_t        max_colors;

You will want to justify why this needs to live in struct domain, and
not in struct arch_domain (as the title would suggest). You will also
want to check whether uint32_t is actually appropriate to use here -
see ./CODING_STYLE. Finally, a comment this short (and hence ambiguous)
isn't worthwhile to have, imo. It (as well as the title) doesn't even
include the word "cache".

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 09/36] xen/arch: add default colors selection function
  2022-03-04 17:46 ` [PATCH 09/36] xen/arch: add default colors selection function Marco Solieri
@ 2022-03-07  7:28   ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:28 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> When cache coloring support is enabled, a color assignment is needed for
> every domain. Introduce a function computing a default configuration
> with a safe and common value -- the dom0 color selection.
> 
> Do not access directly the array of color indices of dom0. Instead make
> use of the dom0 color configuration as a bitmask.
> Add a helper function that converts the color configuration bitmask into
> the indices array.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>

Nit (but relevant because it might be misleading to people just glancing over
the series): Did you mean "xen/arm:" rather than "xen/arch:" in the title
here as well as in that of the next patch?

Jan

> ---
>  xen/arch/arm/coloring.c             | 36 +++++++++++++++++++++++++++++
>  xen/arch/arm/include/asm/coloring.h |  7 ++++++
>  2 files changed, 43 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index af75b536a7..f6e6d09477 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -143,6 +143,42 @@ static __init uint64_t calculate_addr_col_mask(uint64_t llc_way_size)
>      return addr_col_mask;
>  }
>  
> +static int copy_mask_to_list(
> +    uint32_t *col_mask, uint32_t *col_list, uint64_t col_num)
> +{
> +    unsigned int i, k, c;
> +
> +    if ( !col_list )
> +        return -EINVAL;
> +
> +    for ( i = 0, k = 0; i < MAX_COLORS_CELLS; i++ )
> +        for ( c = 0; k < col_num && c < 32; c++ )
> +            if ( col_mask[i] & (1 << (c + (i*32))) )
> +                col_list[k++] = c + (i * 32);
> +
> +    return 0;
> +}
> +
> +uint32_t *setup_default_colors(uint32_t *col_num)
> +{
> +    uint32_t *col_list;
> +
> +    if ( dom0_col_num )
> +    {
> +        *col_num = dom0_col_num;
> +        col_list = xzalloc_array(uint32_t, dom0_col_num);
> +        if ( !col_list )
> +        {
> +            printk(XENLOG_ERR "setup_default_colors: Alloc failed\n");
> +            return NULL;
> +        }
> +        copy_mask_to_list(dom0_col_mask, col_list, dom0_col_num);
> +        return col_list;
> +    }
> +
> +    return NULL;
> +}
> +
>  bool __init coloring_init(void)
>  {
>      int i;
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 70e1dbd09b..8f24acf082 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -27,6 +27,13 @@
>  
>  #ifdef CONFIG_COLORING
>  bool __init coloring_init(void);
> +
> +/*
> + * Return an array with default colors selection and store the number of
> + * colors in @param col_num. The array selection will be equal to the dom0
> + * color configuration.
> + */
> +uint32_t *setup_default_colors(uint32_t *col_num);
>  #else /* !CONFIG_COLORING */
>  static inline bool __init coloring_init(void)
>  {



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 11/36] xen/include: define hypercall parameter for coloring
  2022-03-04 17:46 ` [PATCH 11/36] xen/include: define hypercall parameter for coloring Marco Solieri
@ 2022-03-07  7:31   ` Jan Beulich
  2022-03-09 20:29   ` Julien Grall
  1 sibling, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:31 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -303,6 +303,12 @@ struct vcpu_guest_context {
>  typedef struct vcpu_guest_context vcpu_guest_context_t;
>  DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
>  
> +#define MAX_COLORS_CELLS 4
> +struct color_guest_config {
> +    uint32_t max_colors;
> +    uint32_t colors[MAX_COLORS_CELLS];
> +};
> +
>  /*
>   * struct xen_arch_domainconfig's ABI is covered by
>   * XEN_DOMCTL_INTERFACE_VERSION.
> @@ -335,6 +341,8 @@ struct xen_arch_domainconfig {
>       *
>       */
>      uint32_t clock_frequency;
> +    /* IN */
> +    struct color_guest_config colors;
>  };
>  #endif /* __XEN__ || __XEN_TOOLS__ */
>  

Please no new additions to the public interface without proper XEN_ / xen_
name prefixes on anything going in some global name space. (Personally I
also wonder whether a separate struct is warranted, but I'm not a
maintainer here, so I've got little say.)
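
A minimal sketch of what prefixed names could look like. The exact
identifiers below are hypothetical; only the XEN_ / xen_ prefixing is the
point:

```c
#include <stdint.h>

/* Hypothetical prefixed variant of MAX_COLORS_CELLS. */
#define XEN_MAX_COLORS_CELLS 4

/* Hypothetical prefixed variant of struct color_guest_config. */
struct xen_color_guest_config {
    uint32_t max_colors;
    uint32_t colors[XEN_MAX_COLORS_CELLS];
};
```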

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 18/36] Alloc: introduce page_list_for_each_reverse
  2022-03-04 17:46 ` [PATCH 18/36] Alloc: introduce page_list_for_each_reverse Marco Solieri
@ 2022-03-07  7:35   ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:35 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -488,6 +488,8 @@ page_list_splice(struct page_list_head *list, struct page_list_head *head)
>      list_for_each_entry_safe(pos, tmp, head, list)
>  # define page_list_for_each_safe_reverse(pos, tmp, head) \
>      list_for_each_entry_safe_reverse(pos, tmp, head, list)
> +# define page_list_for_each_reverse(pos, head) \
> +    list_for_each_entry_reverse(pos, head, list)
>  #endif

There are two sets of macros (for there being two flavors of lists),
and hence - even if you need only one form on Arm - the other form
should be introduced right away. I also think it would be far better
to merge this into the patch actually first needing the new
construct, as only then will it be possible to judge whether none of the
existing constructs would be a reasonable fit.

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 28/36] xen/arm: introduce xen_map_text_rw
  2022-03-04 17:46 ` [PATCH 28/36] xen/arm: introduce xen_map_text_rw Marco Solieri
@ 2022-03-07  7:39   ` Jan Beulich
  2022-03-11 22:28     ` Julien Grall
  0 siblings, 1 reply; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:39 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Introduce two new arm specific functions to temporarily map/unmap the
> Xen text read-write (the Xen text is mapped read-only by default by
> setup_pagetables): xen_map_text_rw and xen_unmap_text_rw.
> 
> There is only one caller in the alternative framework.
> 
> The non-colored implementation simply uses __vmap to do the mapping. In
> other words, there are no changes to the non-colored case.
> 
> The colored implementation calculates Xen text physical addresses
> appropriately, according to the coloring configuration.
> 
> Export vm_alloc because it is needed by the colored implementation of
> xen_map_text_rw.

I'm afraid I view vm_alloc() as strictly an internal function to
vmap.c. Even livepatching infrastructure has got away without making
it non-static.

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 33/36] doc, xen-command-line: introduce coloring options
  2022-03-04 17:46 ` [PATCH 33/36] doc, xen-command-line: introduce coloring options Marco Solieri
@ 2022-03-07  7:42   ` Jan Beulich
  2022-03-14 22:07   ` Julien Grall
  1 sibling, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-07  7:42 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Four additional parameters in the Xen command line are used to define
> the underlying coloring policy, which is not directly configurable
> otherwise.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>  docs/misc/xen-command-line.pandoc | 51 +++++++++++++++++++++++++++++--
>  1 file changed, 49 insertions(+), 2 deletions(-)

Documentation of new command line options should be added in the same
patch which adds support for the options.

> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index efda335652..a472d51cf9 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -299,6 +299,20 @@ can be maintained with the pv-shim mechanism.
>      cause Xen not to use Indirect Branch Tracking even when support is
>      available in hardware.
>  
> +### buddy\_size (arm64)

In new options we generally prefer - over _. I also don't think the name
is making clear enough what is actually being controlled.

Jan

> +> `= <size in megabyte>`
> +
> +> Default: `64 MB`
> +
> +Amount of memory reserved for the buddy allocator when colored allocator is
> +active. This options is useful only if coloring support is enabled.
> +The colored allocator is meant as an alternative to the buddy allocator,
> +since its allocation policy is by definition incompatible with the
> +generic one. Since the Xen heap systems is not colored yet, we need to
> +support the coexistence of the two allocators for now. This parameter, which is
> +optional and for expert only, is used to set the amount of memory reserved to
> +the buddy allocator.
> +
>  ### clocksource (x86)
>  > `= pit | hpet | acpi | tsc`
>  
> @@ -884,7 +898,17 @@ Controls for the dom0 IOMMU setup.
>  
>      Incorrect use of this option may result in a malfunctioning system.
>  
> -### dom0_ioports_disable (x86)
> +### dom0\_colors (arm64)
> +> `= List of <integer>-<integer>`
> +
> +> Default: `All available colors`
> +
> +Specify dom0 color configuration. If the parameter is not set, all available
> +colors are chosen and the user is warned on Xen's serial console. This color
> +configuration acts also as the default one for all DomUs that do not have any
> +explicit color assignment in their configuration file.
> +
> +### dom0\_ioports\_disable (x86)
>  > `= List of <hex>-<hex>`
>  
>  Specify a list of IO ports to be excluded from dom0 access.
> @@ -2625,6 +2649,20 @@ unknown NMIs will still be processed.
>  Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
>  the watchdog.
>  
> +### way\_size (arm64)
> +> `= <size in byte>`
> +
> +> Default: `Obtained from the hardware`
> +
> +Specify the way size of the Last Level Cache. This parameter is only useful with
> +coloring support enabled. It is an optional, expert-only parameter and it is
> +used to calculate what bits in the physical address can be used by the coloring
> +algorithm, and thus the maximum available colors on the platform. It can be
> +obtained by dividing the total LLC size by the number of associativity ways.
> +By default, the value is also automatically computed during coloring
> +initialization to avoid any kind of misconfiguration. For this reason, it is
> +highly recommended to use this boot argument with specific needs only.
> +
>  ### x2apic (x86)
>  > `= <boolean>`
>  
> @@ -2642,7 +2680,16 @@ In the case that x2apic is in use, this option switches between physical and
>  clustered mode.  The default, given no hint from the **FADT**, is cluster
>  mode.
>  
> -### xenheap_megabytes (arm32)
> +### xen\_colors (arm64)
> +> `= List of <integer>-<integer>`
> +
> +> Default: `0-0: the lowermost color`
> +
> +Specify Xen color configuration. 
> +Two colors are most likely needed on platforms where private caches are
> +physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
> +
> +### xenheap\_megabytes (arm32)
>  > `= <size>`
>  
>  > Default: `0` (1/32 of RAM)



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 27/36] xen/arch: add coloring support for Xen
  2022-03-04 19:47   ` Julien Grall
@ 2022-03-09 11:28     ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-09 11:28 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 19:47, Julien Grall wrote:
> On 04/03/2022 17:46, Marco Solieri wrote:
>> From: Luca Miccio <lucmiccio@gmail.com>
>>
>> Introduce a new implementation of setup_pagetables that uses coloring
>> logic in order to isolate Xen code using its color selection.
>> Page tables construction is essentially copied, except for the xenmap
>> table, where coloring logic is needed.  Given the absence of a contiguous
>> physical mapping, pointers to next level tables need to be manually
>> calculated.
> 
> The implementation of setup_pagetables() is not compliant to the Arm 
> Arm. And I have plan to completely get rid of it.
> 
> The main part that is not compliant is switch_ttbr() because it keeps 
> the MMU on. We should switch the MMU off, update the TTBR and then 
> switch on the MMU. This implies that we need an identity mapping of the 
> part of Xen that will run with MMU off.
> 
> I understand that rebuilding the page-tables and therefore switching the 
> TTBR will be necessary for cache coloring. So before any new use, I 
> would like the implementation of switch_ttbr() to be fixed.
> 
> What we will need to do is find space in the virtual layout that also 
> match a physical address. With that in place, we could use the mapping 
> to switch between TTBR.

I have posted an early RFC [1] to reshuffle the memory layout on Arm so 
we have space for the identity mapping. I have also reworked 
switch_ttbr() to turn off/on the MMU before/after updating the TTBR.

The series should work on arm64. The arm32 side requires a bit more 
effort as we have less virtual space.

I haven't killed setup_pagetables() yet so you have a base to write the 
cache coloring version. There may be also some tweak necessary for cache 
coloring (e.g. flush the instruction cache).

Cheers,

[1] https://lore.kernel.org/xen-devel/20220309112048.17377-1-julien@xen.org/

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 19/36] xen/arch: introduce cache-coloring allocator
  2022-03-04 17:46 ` [PATCH 19/36] xen/arch: introduce cache-coloring allocator Marco Solieri
@ 2022-03-09 14:35   ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-09 14:35 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> @@ -438,6 +441,263 @@ mfn_t __init alloc_boot_pages(unsigned long nr_pfns, unsigned long pfn_align)
>  
>  
>  
> +static DEFINE_SPINLOCK(heap_lock);

Please take the opportunity and shrink the number of consecutive blank lines
above here.

> +
> +#ifdef CONFIG_COLORING
> +/*************************
> + * COLORED SIDE-ALLOCATOR
> + *
> + * Pages are stored by their color in separated lists. Each list defines a color
> + * and it is initialized during end_boot_allocator, where each page's color
> + * is calculated and the page itself is put in the correct list.
> + * After initialization we have N list where N is the number of maximum
> + * available colors on the platform.
> + * All the lists' heads are stored as element in an array with size N-1 using
> + * the following schema:
> + * array[X] = head of color X, where X goes from 0 to N-1
> + */
> +typedef struct page_list_head color_list;

color_list_t, to make it easy to recognize this as a type (rather than
a variable name)?

> +static color_list *color_heap;

__ro_after_init or, if Arm still hasn't got support for that, at least
__read_mostly.

> +static long total_avail_col_pages;

Can this go negative? If not, unsigned long please.

> +static u64 col_num_max;

No new uses of u<N> or s<N> please. They're being phased out, and C99
types should be used instead.

> +static bool color_init_state = true;

This variable looks to be local to init_col_heap_pages() - please move
it there (provided it's needed in the first place: it's not clear why
you don't use an initcall instead).

> +#define page_to_head(pg) (&color_heap[color_from_page(pg)])
> +#define color_to_head(col) (&color_heap[col])
> +
> +/* Add page in list in order depending on its physical address. */
> +static void page_list_add_order(struct page_info *pg, struct list_head *head)

I guess you mean struct page_list_head * here?

> +{
> +    struct page_info *pos;
> +
> +    /* Add first page after head */
> +    if ( page_list_empty(head) )
> +    {
> +        page_list_add(pg, head);
> +        return;
> +    }
> +
> +    /* Add non-first page in list in ascending order */
> +    page_list_for_each_reverse(pos, head)
> +    {
> +        /* Get pg position */
> +        if ( page_to_maddr(pos) <= page_to_maddr(pg) )

Wouldn't it be a bug if the two were equal? If so, perhaps better
ASSERT() or even BUG_ON() accordingly?

> +        {
> +            /* Insert pg between pos and pos->list.next */
> +            page_list_add(pg, &pos->list);

The 2nd parameter of page_list_add() is struct page_list_head *, not
struct page_list_entry *. I guess you won't get away without introducing
a new accessor.

> +            break;
> +        }
> +
> +        /*
> +         * If pos is the first element it means that pg <= pos so we have
> +         * to insert pg after head.
> +         */
> +        if ( page_list_first(head) == pos )
> +        {
> +            page_list_add(pg, head);
> +            break;
> +        }

The way it's written it's not immediately obvious that the passed in page
would actually be put anywhere on the list by the time the function
returns. Furthermore this if() doesn't look to be necessary to be
evaluated on every loop iteration. Instead it could apparently live
_after_ the loop, requiring use of "return" instead of "break" further up.
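
A minimal sketch of that restructuring, using a toy circular list rather
than Xen's page lists. All toy_* names are hypothetical, and assert() stands
in for ASSERT()/BUG_ON():

```c
#include <assert.h>
#include <stddef.h>

/* Toy circular doubly-linked list, standing in for Xen's page lists. */
struct toy_page {
    unsigned long maddr;
    struct toy_page *prev, *next;
};

static void toy_list_init(struct toy_page *head)
{
    head->prev = head->next = head;
}

/* Link pg immediately after pos. */
static void toy_list_add_after(struct toy_page *pg, struct toy_page *pos)
{
    pg->prev = pos;
    pg->next = pos->next;
    pos->next->prev = pg;
    pos->next = pg;
}

/* Insert pg so that the list stays sorted by ascending address. */
static void toy_list_add_order(struct toy_page *pg, struct toy_page *head)
{
    struct toy_page *pos;

    /* Walk backwards from the tail until a lower address is found. */
    for ( pos = head->prev; pos != head; pos = pos->prev )
    {
        /* The same page being inserted twice would be a bug. */
        assert(pos->maddr != pg->maddr);

        if ( pos->maddr < pg->maddr )
        {
            toy_list_add_after(pg, pos);
            return;
        }
    }

    /* Empty list, or pg is below every element: insert right after head. */
    toy_list_add_after(pg, head);
}
```

Written this way, every path visibly ends with the page on the list, and
the "first element" check is no longer evaluated on each iteration.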

> +    }
> +}

This function dealing with a page at a time while linearly scanning the
list looks to be pretty inefficient, all the more so since it'll be a
common case that callers have to deal with multiple pages at a time. Sadly
it is not clear how many colors there may be (without hunting down the
origin of max_col_num, which get_max_colors() returns, in the earlier 18
patches), and hence how long these lists may grow.

> +/* Alloc one page based on domain color configuration */
> +static struct page_info *alloc_col_heap_page(
> +    unsigned int memflags, struct domain *d)
> +{
> +    struct page_info *pg, *tmp;
> +    bool need_tlbflush = false;
> +    uint32_t cur_color;
> +    uint32_t tlbflush_timestamp = 0;
> +    uint32_t *colors = 0;

Please consult ./CODING_STYLE for when it is appropriate to use fixed-
width types.

> +    int max_colors;
> +    int i;

I don't suppose either of these can go negative, so unsigned int please.
(Just like other remarks - please consider applicable to the entire
series.)

> +    colors = d->colors;
> +    max_colors = d->max_colors;

Please make these the initializers of the variables, which will also
avoid using 0 where NULL is meant above. It also looks as if these were
the only two uses of "d" in the function. If so, please consider making
colors and max_colors the function parameters instead. Or else d likely
wants to be pointer to const (and colors as well as it looks; generally
please use const on pointed-to types wherever possible).

> +    spin_lock(&heap_lock);
> +
> +    tmp = pg = NULL;
> +
> +    /* Check for the first pg on non-empty list */
> +    for ( i = 0; i < max_colors; i++ )
> +    {
> +        if ( !page_list_empty(color_to_head(colors[i])) )
> +        {
> +            tmp = pg = page_list_last(color_to_head(colors[i]));
> +            cur_color = d->colors[i];
> +            break;
> +        }
> +    }
> +
> +    /* If all lists are empty, no requests can be satisfied */
> +    if ( !pg )
> +    {
> +        spin_unlock(&heap_lock);
> +        return NULL;
> +    }

I'm not convinced this is a useful thing to have. The identical
construct below the subsequent loop will deal with this case quite fine
afaict.
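
A minimal sketch of the selection done in a single loop, with the color
array passed in directly as pointer-to-const. The toy_* names are
hypothetical, and tails[] stands in for calling page_list_last() on each
color's list:

```c
#include <stddef.h>

struct toy_page {
    unsigned long maddr;
};

/*
 * tails[c] is the highest-addressed free page of color c, or NULL if that
 * color's list is empty.  Returns the highest page among the given colors.
 */
static struct toy_page *toy_pick_highest(struct toy_page *const *tails,
                                         const unsigned int *colors,
                                         unsigned int nr_colors)
{
    struct toy_page *pg = NULL;
    unsigned int i;

    for ( i = 0; i < nr_colors; i++ )
    {
        struct toy_page *tmp = tails[colors[i]];

        /* Skip empty lists; keep the candidate with the highest address. */
        if ( tmp && (!pg || tmp->maddr > pg->maddr) )
            pg = tmp;
    }

    return pg; /* NULL if every list was empty */
}
```

A single NULL check after the loop then covers the "all lists empty" case.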

> +    /* Get the highest page from the lists compliant to the domain color(s) */
> +    for ( i += 1; i < max_colors; i++ )

Perhaps easier as

    while ( ++i < max_colors )

?

> +    {
> +        if ( page_list_empty(color_to_head(colors[i])) )
> +        {
> +            printk(XENLOG_INFO "List empty\n");

This is liable to be too noisy even if converted to dprintk(). Please
drop.

> +            continue;
> +        }
> +        tmp = page_list_last(color_to_head(colors[i]));
> +        if ( page_to_maddr(tmp) > page_to_maddr(pg) )
> +        {
> +            pg = tmp;
> +            cur_color = colors[i];

You only ever write this variable - please drop such, or introduce them
only once actually needed (if e.g. in a later patch).

> +        }
> +    }
> +
> +    if ( !pg )
> +    {
> +        spin_unlock(&heap_lock);
> +        return NULL;
> +    }
> +
> +    pg->count_info = PGC_state_inuse;
> +
> +    if ( !(memflags & MEMF_no_tlbflush) )
> +        accumulate_tlbflush(&need_tlbflush, pg,
> +                            &tlbflush_timestamp);
> +
> +    /* Initialise fields which have other uses for free pages. */
> +    pg->u.inuse.type_info = 0;
> +    page_set_owner(pg, NULL);

This would now become the 3rd instance - time to consider a small
helper function?

> +    flush_page_to_ram(mfn_x(page_to_mfn(pg)),
> +                      !(memflags & MEMF_no_icache_flush));
> +
> +    page_list_del(pg, page_to_head(pg));
> +    total_avail_col_pages--;
> +
> +    spin_unlock(&heap_lock);
> +
> +    if ( need_tlbflush )
> +        filtered_flush_tlb_mask(tlbflush_timestamp);
> +
> +    return pg;
> +}
> +
> +struct page_info *alloc_col_domheap_page(
> +    struct domain *d, unsigned int memflags)
> +{
> +    struct page_info *pg;
> +
> +    ASSERT(!in_irq());
> +
> +    /* Get page based on color selection */
> +    pg = alloc_col_heap_page(memflags, d);
> +
> +    if ( !pg )
> +    {
> +        printk(XENLOG_INFO "ERROR: Colored Page is null\n");
> +        return NULL;
> +    }
> +
> +    /* Assign page to domain */
> +    if ( d && !(memflags & MEMF_no_owner) &&
> +        assign_page(pg, 0, d, memflags) )
> +    {
> +        free_col_heap_page(pg);
> +        return NULL;
> +    }
> +
> +    return pg;
> +}

So this is really only providing a single order-0 page. From the cover
letter it didn't sound like you were aiming at such a limited use case.
It's also not listed under "Known limitations" there that no provisions
are made even for large pages.

> +void free_col_heap_page(struct page_info *pg)
> +{
> +    /* This page is not a guest frame any more. */
> +    pg->count_info = PGC_state_free;
> +
> +    page_set_owner(pg, NULL);
> +    total_avail_col_pages++;
> +    page_list_add_order( pg, page_to_head(pg) );

Nit: Stray blanks immediately inside the parentheses.

> +}

How does this fit into the get_page() / put_page() machinery? You
don't alter free_domheap_pages(), after all.

> +static inline void init_col_heap_pages(struct page_info *pg, unsigned long nr_pages)

Why inline? I guess this might be to silence the compiler warning
about the function being unused, but then this only means that you
want to introduce the function once it's needed. Then it would
also be possible to tell whether the function wants to be __init.

Additionally the line is too long.

> +{
> +    int i;
> +
> +    if ( color_init_state )
> +    {
> +        col_num_max = get_max_colors();
> +        color_heap = xmalloc_array(color_list, col_num_max);
> +        BUG_ON(!color_heap);
> +
> +        for ( i = 0; i < col_num_max; i++ )
> +        {
> +            printk(XENLOG_INFO "Init list for color: %u\n", i);

Again too noisy. Such may be okay in a RFC series, but should have been
dropped for a "normal" submission.

> +            INIT_PAGE_LIST_HEAD(&color_heap[i]);
> +        }
> +
> +        color_init_state = false;
> +    }
> +
> +    printk(XENLOG_INFO "Init color heap pages with %lu pages for a given size of 0x%"PRIx64"\n",

While you shouldn't split the format string across lines, you should
take all other available measures to limit line length. Furthermore
please consider using the shorter %#x form here and elsewhere. Overall:

    printk(XENLOG_INFO
           "Init color heap with %lu pages for a given size of 0x%"PRIx64"\n",

And even then the two values logged are redundant with one another,
so things can further be shortened here.
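
A minimal sketch of the shorter form, with snprintf() standing in for
printk(). The function name and message text are hypothetical, and Xen's
printk() is assumed to treat the # flag like standard C does:

```c
#include <stdio.h>

/* Log each quantity once, and let the # flag emit the 0x prefix. */
static int toy_format_init_msg(char *buf, size_t len,
                               unsigned long nr_pages, unsigned long start)
{
    return snprintf(buf, len, "Init color heap: %lu pages from %#lx\n",
                    nr_pages, start);
}
```

Note that in standard C the # flag adds no prefix when the value is zero.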

> +            nr_pages, nr_pages * PAGE_SIZE);

Nit: Indentation.

> +    printk(XENLOG_INFO "Paging starting from: 0x%"PRIx64"\n", page_to_maddr(pg));
> +    total_avail_col_pages += nr_pages;
> +
> +    for ( i = 0; i < nr_pages; i++ )
> +    {
> +        pg->colored = true;
> +        page_list_add_order(pg, page_to_head(pg));
> +        pg++;
> +    }
> +}
> +
> +static inline bool is_page_colored(struct page_info *pg)
> +{
> +        return pg->colored;

Nit: Indentation again (and more instances below).

> +}
> +
> +static void dump_col_heap(unsigned char key)
> +{
> +    struct page_info *pg;
> +    unsigned long size;
> +    unsigned int i;
> +
> +    printk("Colored heap info\n");
> +    for ( i = 0; i < col_num_max; i++ )
> +    {
> +        printk("Heap[%u]: ", i);
> +        size = 0;
> +        page_list_for_each( pg, color_to_head(i) )
> +        {
> +            BUG_ON(!(color_from_page(pg) == i));
> +            size++;
> +        }

How long is this going to take on a decently sized system? At the
very least you'll need to call process_pending_softirqs() every
once in a while. But this may be taking too long this way anyway.

> +        printk("%lu pages -> %lukB free\n", size, size << (PAGE_SHIFT - 10));

Again the same information is being logged twice.

> +    }
> +
> +    printk("Total number of pages: %lu\n", total_avail_col_pages);

Since the value logged isn't calculated locally, may I suggest to
merge this into the initial printk()?

> +}
> +#else /* !CONFIG_COLORING */
> +#define init_col_heap_pages(x, y) init_heap_pages(x, y)
> +
> +inline struct page_info *alloc_col_domheap_page(
> +	struct domain *d, unsigned int memflags)
> +{
> +	return NULL;
> +}
> +
> +inline void free_col_heap_page(struct page_info *pg)
> +{
> +	return;
> +}
> +
> +static inline bool is_page_colored(struct page_info *pg)
> +{
> +        return false;
> +}

Why are any of these needed? And if any are needed, please drop
inline once again.

> @@ -2600,6 +2859,9 @@ static void cf_check dump_heap(unsigned char key)
>  static __init int cf_check register_heap_trigger(void)
>  {
>      register_keyhandler('H', dump_heap, "dump heap info", 1);
> +#ifdef CONFIG_COLORING
> +    register_keyhandler('c', dump_col_heap, "dump coloring heap info", 1);

'c' already has a use on x86. Please avoid such collisions, even if
initially you're targeting Arm only. I don't see why a separate key
is needed anyway - can't you just extend dump_heap()?

> --- a/xen/include/xen/mm.h
> +++ b/xen/include/xen/mm.h
> @@ -131,6 +131,11 @@ unsigned int online_page(mfn_t mfn, uint32_t *status);
>  int offline_page(mfn_t mfn, int broken, uint32_t *status);
>  int query_page_offline(mfn_t mfn, uint32_t *status);
>  
> +/* Colored suballocator. */
> +struct page_info *alloc_col_domheap_page(
> +    struct domain *d, unsigned int memflags);
> +void free_col_heap_page(struct page_info *pg);

These two should imo represent a pair, i.e. be named similarly.

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 20/36] xen/common: introduce buddy required reservation
  2022-03-04 17:46 ` [PATCH 20/36] xen/common: introduce buddy required reservation Marco Solieri
@ 2022-03-09 14:45   ` Jan Beulich
  2022-03-09 14:47     ` Jan Beulich
  0 siblings, 1 reply; 79+ messages in thread
From: Jan Beulich @ 2022-03-09 14:45 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -230,6 +230,13 @@ static bool __read_mostly scrub_debug;
>  #define scrub_debug    false
>  #endif
>  
> +#ifdef CONFIG_COLORING
> +/* Minimum size required for buddy allocator to work with colored one */
> +unsigned long buddy_required_size __read_mostly = MB(64);
> +#else
> +unsigned long buddy_required_size __read_mostly = 0;
> +#endif

Please avoid such redundancy when possible. Here perhaps easiest
by having the value come from Kconfig. By giving that separate
option a prompt, it would even become configurable at build time.

> @@ -678,6 +685,13 @@ static void dump_col_heap(unsigned char key)
>  
>      printk("Total number of pages: %lu\n", total_avail_col_pages);
>  }
> +static int __init parse_buddy_required_size(const char *s)
> +{
> +    buddy_required_size = simple_strtoull(s, &s, 0);
> +
> +    return *s ? -EINVAL : 0;
> +}
> +custom_param("buddy_size", parse_buddy_required_size);

Why not integer_param() or, even better fitting the purpose,
size_param()? Also (I may have said so elsewhere already) please
prefer - over _ in new command line option names. And of course
the name needs to be unambiguous enough for it to be easy to
associate the purpose.
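
To illustrate what a size-style parameter buys over a bare integer: it
accepts a unit suffix, so "64M" can be written instead of a byte count.
Below is a standalone sketch of that kind of parsing. It is not Xen's
implementation; parse_size() and the set of suffixes it accepts are
assumptions made for the example.

```c
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse "<number>[k|K|m|M|g|G]" into bytes.
 * Returns 0 on success, -EINVAL on missing digits or trailing garbage.
 */
static int parse_size(const char *s, unsigned long long *out)
{
    char *end;
    unsigned long long val = strtoull(s, &end, 0);

    if ( end == s )
        return -EINVAL;

    /* Each suffix multiplies by a further factor of 1024. */
    switch ( *end )
    {
    case 'g': case 'G':
        val <<= 10;
        /* fall through */
    case 'm': case 'M':
        val <<= 10;
        /* fall through */
    case 'k': case 'K':
        val <<= 10;
        end++;
        break;
    }

    if ( *end != '\0' )
        return -EINVAL;

    *out = val;
    return 0;
}
```

With a helper of this shape the option value stays readable on the
command line, and the error path for trailing characters is kept.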

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 20/36] xen/common: introduce buddy required reservation
  2022-03-09 14:45   ` Jan Beulich
@ 2022-03-09 14:47     ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-09 14:47 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, xen-devel

On 09.03.2022 15:45, Jan Beulich wrote:
> On 04.03.2022 18:46, Marco Solieri wrote:
>> --- a/xen/common/page_alloc.c
>> +++ b/xen/common/page_alloc.c
>> @@ -230,6 +230,13 @@ static bool __read_mostly scrub_debug;
>>  #define scrub_debug    false
>>  #endif
>>  
>> +#ifdef CONFIG_COLORING
>> +/* Minimum size required for buddy allocator to work with colored one */
>> +unsigned long buddy_required_size __read_mostly = MB(64);
>> +#else
>> +unsigned long buddy_required_size __read_mostly = 0;
>> +#endif
> 
> Please avoid such redundancy when possible. Here perhaps easiest
> by having the value come from Kconfig. By giving that separate
> option a prompt, it would even become configurable at build time.

Oh, and: Why is this not static? And without seeing what it's going
to be used for it's quite hard to judge whether the initial value
chosen is actually sufficient. I could imagine that this would
rather want to be derived from total memory size.

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 21/36] xen/common: add colored allocator initialization
  2022-03-04 17:46 ` [PATCH 21/36] xen/common: add colored allocator initialization Marco Solieri
@ 2022-03-09 14:58   ` Jan Beulich
  0 siblings, 0 replies; 79+ messages in thread
From: Jan Beulich @ 2022-03-09 14:58 UTC (permalink / raw)
  To: Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, Luca Miccio,
	Stefano Stabellini, xen-devel

On 04.03.2022 18:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Initialize colored heap and allocator data structures. It is assumed
> that pages are given to the init function is in ascending order.

I don't think this is a good assumption to make.

> To
> ensure that, pages are retrieved from bootmem_regions starting from the
> first one. Moreover, this allows quickly insertion of freed pages into
> the colored allocator's internal data structures -- sorted lists.

I wouldn't call insertion by linear scan "quick", to be honest.

> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -2154,11 +2154,26 @@ void __init end_boot_allocator(void)
>              break;
>          }
>      }
> -    for ( i = nr_bootmem_regions; i-- > 0; )
> +
> +    for ( i = 0; i < nr_bootmem_regions; i++ )
>      {
>          struct bootmem_region *r = &bootmem_region_list[i];
> -        if ( r->s < r->e )
> -            init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
> +
> +        /*
> +         * Find the first region that can fill the buddy allocator memory
> +         * specified by buddy_required_size.
> +         */

Why would all of this memory need to come from a single region? And
why would any region - regardless of address - be okay?

> +        if ( buddy_required_size && (r->e - r->s) >
> +            PFN_DOWN(buddy_required_size) )

I think >= will do here?

Also - nit: Indentation.
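
For illustration only, here is a standalone sketch of a reservation
step that is not tied to a single region: it takes pages from the front
of each region in turn until the request is met, so a region that fits
exactly is consumed in full. struct region and reserve_pages() are
hypothetical stand-ins, not the bootmem structures, and the sketch says
nothing about which addresses would be acceptable.

```c
#include <stddef.h>

/* Hypothetical region of page frames: [s, e). */
struct region {
    unsigned long s, e;
};

/*
 * Take pages from the front of each region, in array order, until
 * 'want' pages have been reserved. Empty regions are skipped.
 * Returns the number of pages that could not be reserved.
 */
static unsigned long reserve_pages(struct region *r, size_t nr,
                                   unsigned long want)
{
    size_t i;

    for ( i = 0; i < nr && want; i++ )
    {
        unsigned long avail, take;

        if ( r[i].s >= r[i].e )
            continue;

        avail = r[i].e - r[i].s;
        take = avail >= want ? want : avail;

        r[i].s += take;
        want -= take;
    }

    return want;
}
```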

> +        {
> +            init_heap_pages(mfn_to_page(_mfn(r->s)),
> +                PFN_DOWN(buddy_required_size));

And again - indentation.

> +            r->s += PFN_DOWN(buddy_required_size);
> +            buddy_required_size = 0;
> +        }
> +
> +        init_col_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);

Judging from this, buddy_required_size can actually be __initdata in
the previous patch. Being able to spot such is another reason to not
split patches like this.

> @@ -2619,9 +2634,12 @@ int assign_pages(
>          page_set_owner(&pg[i], d);
>          smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
>          pg[i].count_info =
> -            (pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 1;
> +             (pg[i].count_info & (PGC_extra | PGC_reserved)) | PGC_allocated | 1;

Why the change?

> @@ -2642,6 +2660,15 @@ struct page_info *alloc_domheap_pages(
>      unsigned int bits = memflags >> _MEMF_bits, zone_hi = NR_ZONES - 1;
>      unsigned int dma_zone;
>  
> +    /* Only Dom0 and DomUs are supported for coloring */
> +    if ( d && d->max_colors > 0 )
> +    {
> +        /* Colored allocation must be done on 0 order */
> +        if (order)

Nit: Missing blanks.

> @@ -2761,8 +2788,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>              scrub = 1;
>          }
>  
> -        free_heap_pages(pg, order, scrub);
> -    }
> +        if ( is_page_colored(pg) )
> +            free_col_heap_page(pg);
> +        else
> +            free_heap_pages(pg, order, scrub);}

Very interesting brace placement.

Jan



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-03-04 17:46 ` [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration Marco Solieri
@ 2022-03-09 19:09   ` Julien Grall
  2022-03-22  9:17     ` Luca Miccio
  2022-05-13 14:22     ` Carlo Nonato
  0 siblings, 2 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-09 19:09 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Add three new bootargs allowing configuration of cache coloring support
> for Xen:

I would prefer if the documentation of each command line option were
part of the patch introducing it. This would help in understanding some
of the parameters.

> - way_size: The size of a LLC way in bytes. This value is mainly used
>    to calculate the maximum available colors on the platform.

We should only add command line options when there is a strong use
case. In the documentation, you wrote that someone may want to
overwrite the way size for "specific needs".

Can you explain what those needs would be?

> - dom0_colors: The coloring configuration for Dom0, which also acts as
>    default configuration for any DomU without an explicit configuration.
> - xen_colors: The coloring configuration for the Xen hypervisor itself.
> 
> A cache coloring configuration consists of a selection of colors to be
> assigned to a VM or to the hypervisor. It is represented by a set of
> ranges. Add a common function that parses a string with a
> comma-separated set of hyphen-separated ranges like "0-7,15-16" and
> returns both: the number of chosen colors, and an array containing their
> ids.
> Currently we support platforms with up to 128 colors.

Is there any reason this value is hardcoded in Xen rather than part of 
the Kconfig?

> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> ---
>   xen/arch/arm/Kconfig                |   5 ++
>   xen/arch/arm/Makefile               |   2 +-
>   xen/arch/arm/coloring.c             | 131 ++++++++++++++++++++++++++++
>   xen/arch/arm/include/asm/coloring.h |  28 ++++++
>   4 files changed, 165 insertions(+), 1 deletion(-)
>   create mode 100644 xen/arch/arm/coloring.c
>   create mode 100644 xen/arch/arm/include/asm/coloring.h
> 
> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> index ecfa6822e4..f0f999d172 100644
> --- a/xen/arch/arm/Kconfig
> +++ b/xen/arch/arm/Kconfig
> @@ -97,6 +97,11 @@ config HARDEN_BRANCH_PREDICTOR
>   
>   	  If unsure, say Y.
>   
> +config COLORING
> +	bool "L2 cache coloring"
> +	default n

SUPPORT.MD would [...]. Furthermore, I think this wants to be gated
with EXPERT for the time being.

> +	depends on ARM_64

Why is this limited to arm64?

> +
>   config TEE
>   	bool "Enable TEE mediators support (UNSUPPORTED)" if UNSUPPORTED
>   	default n
> diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> index c993ce72a3..581896a528 100644
> --- a/xen/arch/arm/Makefile
> +++ b/xen/arch/arm/Makefile
> @@ -66,7 +66,7 @@ obj-$(CONFIG_SBSA_VUART_CONSOLE) += vpl011.o
>   obj-y += vsmc.o
>   obj-y += vpsci.o
>   obj-y += vuart.o
> -
> +obj-$(CONFIG_COLORING) += coloring.o

Please keep the newline before extra-y. The files are meant to be
ordered alphabetically. So this should be inserted in the correct
position.

>   extra-y += xen.lds
>   
>   #obj-bin-y += ....o
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> new file mode 100644
> index 0000000000..8f1cff6efb
> --- /dev/null
> +++ b/xen/arch/arm/coloring.c
> @@ -0,0 +1,131 @@
> +/*
> + * xen/arch/arm/coloring.c
> + *
> + * Coloring support for ARM
> + *
> + * Copyright (C) 2019 Xilinx Inc.
> + *
> + * Authors:
> + *    Luca Miccio <lucmiccio@gmail.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#include <xen/init.h>
> +#include <xen/types.h>
> +#include <xen/lib.h>
> +#include <xen/errno.h>
> +#include <xen/param.h>
> +#include <asm/coloring.h>

The includes should be ordered so <xen/...> are first, then <asm/...>.
They are also ordered alphabetically within their own category.

> +
> +/* Number of color(s) assigned to Xen */
> +static uint32_t xen_col_num;
> +/* Coloring configuration of Xen as bitmask */
> +static uint32_t xen_col_mask[MAX_COLORS_CELLS];

Xen provides helpers to create and use bitmaps (see
include/xen/bitmap.h). Can you use them?

> +
> +/* Number of color(s) assigned to Dom0 */
> +static uint32_t dom0_col_num;
> +/* Coloring configuration of Dom0 as bitmask */
> +static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
> +
> +static uint64_t way_size;
> +
> +/*************************
> + * PARSING COLORING BOOTARGS
> + */
> +
> +/*
> + * Parse the coloring configuration given in the buf string, following the
> + * syntax below, and store the number of colors and a corresponding mask in
> + * the last two given pointers.
> + *
> + * COLOR_CONFIGURATION ::= RANGE,...,RANGE
> + * RANGE               ::= COLOR-COLOR
> + *
> + * Example: "2-6,15-16" represents the set of colors: 2,3,4,5,6,15,16.
> + */
> +static int parse_color_config(
> +    const char *buf, uint32_t *col_mask, uint32_t *col_num)


Coding style. We usually declare parameters on the same line as the
function name. If they can't fit on the same line, then we split in two
with the parameters aligned to the first parameter.

> +{
> +    int start, end, i;

AFAICT, none of the 3 variables will store negative values. So can they 
be unsigned?

> +    const char* s = buf;
> +    unsigned int offset;
> +
> +    if ( !col_mask || !col_num )
> +        return -EINVAL;
> +
> +    *col_num = 0;
> +    for ( i = 0; i < MAX_COLORS_CELLS; i++ )
> +        col_mask[i] = 0;
dom0_col_mask and xen_col_mask are already zeroed. I would also expect 
the same for dynamically allocated bitmask. So can this be dropped?
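
As a point of comparison, here is a standalone sketch of the range
parsing with unsigned variables and with the color id bounded before it
is used as an index. It is not the patch's code: MAX_COLORS,
parse_ranges() and the open-coded bit operations are assumptions for
the example (in Xen the helpers from include/xen/bitmap.h would take
their place). It also rejects inverted ranges such as "5-2", which is a
choice made here rather than something the patch specifies.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

#define MAX_COLORS 128 /* assumed limit, matching "up to 128 colors" */
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define NR_WORDS ((MAX_COLORS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/*
 * Parse "A-B,C-D,..." into mask (NR_WORDS words) and store the number
 * of distinct colors in *num. Returns 0 on success, -EINVAL otherwise.
 */
static int parse_ranges(const char *s, unsigned long *mask,
                        unsigned int *num)
{
    memset(mask, 0, NR_WORDS * sizeof(*mask));
    *num = 0;

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop, i;

        start = strtoul(s, &end, 0);
        /* Ranges are hyphen-separated. */
        if ( end == s || *end != '-' )
            return -EINVAL;
        s = end + 1;

        stop = strtoul(s, &end, 0);
        /* Bound the id before it is ever used as an index. */
        if ( end == s || start > stop || stop >= MAX_COLORS )
            return -EINVAL;

        for ( i = start; i <= stop; i++ )
        {
            unsigned long bit = 1UL << (i % BITS_PER_WORD);

            /* Count a color only the first time it is seen. */
            if ( !(mask[i / BITS_PER_WORD] & bit) )
                (*num)++;
            mask[i / BITS_PER_WORD] |= bit;
        }

        s = end;
        if ( *s == ',' )
            s++;
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return 0;
}
```

With this shape, "2-6,15-16" yields seven colors and overlapping ranges
such as "0-3,2-5" are counted once per color.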

> +
> +    while ( *s != '\0' )
> +    {
> +        if ( *s != ',' )
> +        {
> +            start = simple_strtoul(s, &s, 0);
> +
> +            /* Ranges are hyphen-separated */
> +            if ( *s != '-' )
> +                goto fail;
> +            s++;
> +
> +            end = simple_strtoul(s, &s, 0);
> +
> +            for ( i = start; i <= end; i++ )
> +            {
> +                offset = i / 32;
> +                if ( offset > MAX_COLORS_CELLS )
> +                    goto fail;
> +
> +                if ( !(col_mask[offset] & (1 << i % 32)) )
> +                    *col_num += 1;
> +                col_mask[offset] |= (1 << i % 32);
> +            }
> +        }
> +        else
> +            s++;
> +    }
> +
> +    return *s ? -EINVAL : 0;
> +fail:
> +    return -EINVAL;
> +}
> +
> +static int __init parse_way_size(const char *s)
> +{
> +    way_size = simple_strtoull(s, &s, 0);
> +
> +    return *s ? -EINVAL : 0;
> +}
> +custom_param("way_size", parse_way_size);
> +
> +static int __init parse_dom0_colors(const char *s)
> +{
> +    return parse_color_config(s, dom0_col_mask, &dom0_col_num);
> +}
> +custom_param("dom0_colors", parse_dom0_colors);
> +
> +static int __init parse_xen_colors(const char *s)
> +{
> +    return parse_color_config(s, xen_col_mask, &xen_col_num);
> +}
> +custom_param("xen_colors", parse_xen_colors);
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> new file mode 100644
> index 0000000000..60958d1244
> --- /dev/null
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -0,0 +1,28 @@
> +/*
> + * xen/arm/include/asm/coloring.h
> + *
> + * Coloring support for ARM
> + *
> + * Copyright (C) 2019 Xilinx Inc.
> + *
> + * Authors:
> + *    Luca Miccio <lucmiccio@gmail.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_ARM_COLORING_H__
> +#define __ASM_ARM_COLORING_H__
> +
> +#define MAX_COLORS_CELLS 4
> +
> +#endif /* !__ASM_ARM_COLORING_H__ */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection
  2022-03-04 17:46 ` [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection Marco Solieri
@ 2022-03-09 20:12   ` Julien Grall
  2022-05-13 14:34     ` Carlo Nonato
  0 siblings, 1 reply; 79+ messages in thread
From: Julien Grall @ 2022-03-09 20:12 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> The size of the LLC way is a crucial parameter for the cache coloring
> support, since it determines the maximum number of available colors on
> the platform.  This parameter can currently be retrieved only from
> the way_size bootarg and it is prone to misconfiguration nullifying the
> coloring mechanism and breaking cache isolation.

Reading this sentence, I think the command line option should be 
introduced after this patch (assuming this is necessary). This will 
avoid undoing/fixing a "bug" that was introduced by the same series.

> 
> Add an alternative and more safe method to retrieve the way size by
> directly asking the hardware, namely using CCSIDR_EL1 and CSSELR_EL1
> registers.
> 
> This method has to check also if at least L2 is implemented in the
> hardware since there are scenarios where only L1 cache is availble, e.g,

In the previous patch, the description for the Kconfig suggests that the 
cache coloring will only happen on L2. But here you are also adding L1. 
So I think the documentation needs to be updated.

Typo: s/availble/available/

> QEMU.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> ---
>   xen/arch/arm/coloring.c | 76 +++++++++++++++++++++++++++++++++++++++++
>   1 file changed, 76 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index 8f1cff6efb..e3d490b453 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -25,7 +25,10 @@
>   #include <xen/lib.h>
>   #include <xen/errno.h>
>   #include <xen/param.h>
> +


NIT: I think this belongs to patch #4.

> +#include <asm/sysregs.h>

Please order the include alphabetically.

>   #include <asm/coloring.h>
> +#include <asm/io.h>

You don't seem to use the read*/write* helpers. So why do you need this?

>   
>   /* Number of color(s) assigned to Xen */
>   static uint32_t xen_col_num;
> @@ -39,6 +42,79 @@ static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
>   
>   static uint64_t way_size;
>   
> +#define CTR_LINESIZE_MASK 0x7
> +#define CTR_SIZE_SHIFT 13
> +#define CTR_SIZE_MASK 0x3FFF
> +#define CTR_SELECT_L2 1 << 1
> +#define CTR_SELECT_L3 1 << 2
> +#define CTR_CTYPEn_MASK 0x7
> +#define CTR_CTYPE2_SHIFT 3
> +#define CTR_CTYPE3_SHIFT 6
> +#define CTR_LLC_ON 1 << 2
> +#define CTR_LOC_SHIFT 24
> +#define CTR_LOC_MASK 0x7
> +#define CTR_LOC_L2 1 << 1
> +#define CTR_LOC_NOT_IMPLEMENTED 1 << 0

We already define some CTR_* in processor.h. Please add any extra ones there.

> +
> +
> +/* Return the way size of last level cache by asking the hardware */
> +static uint64_t get_llc_way_size(void)

This will break compilation as you are introducing get_llc_way_size()
but not using it.

I would suggest folding this patch into the next one.

> +{
> +    uint32_t cache_sel = READ_SYSREG64(CSSELR_EL1);

The return type for READ_SYSREG64() is uint64_t. That said, the 
equivalent register on 32bit is CSSELR which is 32-bit. So this should 
be READ_SYSREG() and the matching type is register_t.

> +    uint32_t cache_global_info = READ_SYSREG64(CLIDR_EL1);

Same remark here. Except the matching register is CLIDR.

> +    uint32_t cache_info;
> +    uint32_t cache_line_size;
> +    uint32_t cache_set_num;
> +    uint32_t cache_sel_tmp;
> +
> +    printk(XENLOG_INFO "Get information on LLC\n");
> +    printk(XENLOG_INFO "Cache CLIDR_EL1: 0x%"PRIx32"\n", cache_global_info);
> +
> +    /* Check if at least L2 is implemented */
> +    if ( ((cache_global_info >> CTR_LOC_SHIFT) & CTR_LOC_MASK)

This is a bit confusing. cache_global_info is storing CLIDR_* but you 
are using macro starting with CTR_*.

Did you intend to name the macros CLIDR_*?

The same remark goes for the other use of CTR_ below. The name of the 
macros should match the register they are meant to be used on.

> +        == CTR_LOC_NOT_IMPLEMENTED )

I am a bit confused by the check here. Shouldn't you check that Ctype2
is not 0 instead?

> +    {
> +        printk(XENLOG_ERR "ERROR: L2 Cache not implemented\n");
> +        return 0;
> +    }
> +
> +    /* Save old value of CSSELR_EL1 */
> +    cache_sel_tmp = cache_sel;
> +
> +    /* Get LLC index */
> +    if ( ((cache_global_info >> CTR_CTYPE2_SHIFT) & CTR_CTYPEn_MASK)
> +        == CTR_LLC_ON )

I don't understand this check. You define CTR_LLC_ON to 1 << 2. So it
would be 0b10. From the field you checked, this value means "Data Cache
Only". How does this indicate which level to choose?

But then in patch #4 you wrote we will do cache coloring on L2. So why 
are we selecting L3?
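
For reference, the Arm ARM describes CLIDR_EL1.Ctype<n> as a 3-bit
field per level starting at bit 0, with 0b000 meaning that no cache is
reported at that level, and CSSELR_EL1 as holding the level minus one
in bits [3:1] with InD in bit 0. A sketch of finding the last reported
level from a raw register value under that layout follows. The names
are made up for the example, and the sketch works on plain integers
rather than reading any system register.

```c
#include <stdint.h>

#define EX_CLIDR_CTYPE_BITS 3
#define EX_CLIDR_CTYPE_MASK 0x7U
#define EX_CLIDR_MAX_LEVELS 7

/* Ctype<n> for cache level n (1-based) from a raw CLIDR value. */
static unsigned int ex_clidr_ctype(uint64_t clidr, unsigned int level)
{
    return (clidr >> (EX_CLIDR_CTYPE_BITS * (level - 1))) &
           EX_CLIDR_CTYPE_MASK;
}

/*
 * Highest level (1-based) reported before the first level whose Ctype
 * is 0, or 0 if no cache is reported at all.
 */
static unsigned int ex_clidr_last_level(uint64_t clidr)
{
    unsigned int level;

    for ( level = 1; level <= EX_CLIDR_MAX_LEVELS; level++ )
        if ( ex_clidr_ctype(clidr, level) == 0 )
            break;

    return level - 1;
}

/* CSSELR value selecting the data or unified cache at 'level' (>= 1). */
static uint32_t ex_csselr_for_level(unsigned int level)
{
    return (level - 1) << 1; /* Level in bits [3:1], InD (bit 0) = 0 */
}
```

Deriving the selector from the last non-zero Ctype avoids hard-coding a
choice between L2 and L3.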

> +        cache_sel = CTR_SELECT_L2;
> +    else
> +        cache_sel = CTR_SELECT_L3;
> +
> +    printk(XENLOG_INFO "LLC selection: %u\n", cache_sel);
> +    /* Select the correct LLC in CSSELR_EL1 */
> +    WRITE_SYSREG64(cache_sel, CSSELR_EL1);

This should be WRITE_SYSREG().

> +
> +    /* Ensure write */
> +    isb();
> +
> +    /* Get info about the LLC */
> +    cache_info = READ_SYSREG64(CCSIDR_EL1);
> +
> +    /* ARM TRM: (Log2(Number of bytes in cache line)) - 4. */

From my understanding "TRM" in the Arm world refers to a specific
processor. In this case we want to quote the spec. So we usually say 
"Arm Arm".

> +    cache_line_size = 1 << ((cache_info & CTR_LINESIZE_MASK) + 4);
> +    /* ARM TRM: (Number of sets in cache) - 1 */
> +    cache_set_num = ((cache_info >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK) + 1;

The shifts here are assuming that FEAT_CCIDX is not implemented. I would
be OK if we decide to not support cache coloring on such platforms.
However, we need to return an error if a user tries to use cache
coloring on such a platform.
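
To make the assumption concrete: without FEAT_CCIDX, CCSIDR_EL1 holds
LineSize in bits [2:0], Associativity in bits [12:3] and NumSets in
bits [27:13]. Below is a sketch of the decoding on a raw value, with
names invented for the example. A platform implementing FEAT_CCIDX uses
a different layout, so it would have to be detected before anything
like this is applied.

```c
#include <stdint.h>

/* Hypothetical container for the decoded geometry. */
struct ex_cache_geometry {
    uint32_t line_size; /* bytes */
    uint32_t ways;
    uint32_t sets;
};

/* Decode a CCSIDR value in the format used without FEAT_CCIDX. */
static struct ex_cache_geometry ex_ccsidr_decode(uint32_t ccsidr)
{
    struct ex_cache_geometry g;

    /* LineSize holds log2(bytes per line) - 4. */
    g.line_size = 1U << ((ccsidr & 0x7) + 4);
    /* Associativity holds (number of ways) - 1. */
    g.ways = ((ccsidr >> 3) & 0x3FF) + 1;
    /* NumSets holds (number of sets) - 1. */
    g.sets = ((ccsidr >> 13) & 0x7FFF) + 1;

    return g;
}

/* Way size in bytes: one line per set. */
static uint64_t ex_way_size(struct ex_cache_geometry g)
{
    return (uint64_t)g.line_size * g.sets;
}
```

For a 1MB, 16-way cache with 64-byte lines this gives 1024 sets and a
way size of 64KB.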

> +
> +    printk(XENLOG_INFO "Cache line size: %u bytes\n", cache_line_size);
> +    printk(XENLOG_INFO "Cache sets num: %u\n", cache_set_num);
> +
> +    /* Restore value in CSSELR_EL1 */
> +    WRITE_SYSREG64(cache_sel_tmp, CSSELR_EL1);
> +
> +    /* Ensure write */
> +    isb();
> +
> +    return (cache_line_size * cache_set_num);
> +}
> +
>   /*************************
>    * PARSING COLORING BOOTARGS
>    */

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 10/36] xen/arch: check color selection function
  2022-03-04 17:46 ` [PATCH 10/36] xen/arch: check color " Marco Solieri
@ 2022-03-09 20:17   ` Julien Grall
  2022-03-14  6:06   ` Henry Wang
  1 sibling, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-09 20:17 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Dom0 color configuration is parsed in the Xen command line. Add an
> helper function to check the user selection. If no configuration is
> provided by the user, all the available colors supported by the
> hardware will be assigned to dom0.

From the commit message, I was expecting the function to be used. Can
this be introduced when you introduce its user?

> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/coloring.c             | 17 +++++++++++++++++
>   xen/arch/arm/include/asm/coloring.h |  8 ++++++++
>   2 files changed, 25 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index f6e6d09477..382d558021 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -179,6 +179,23 @@ uint32_t *setup_default_colors(uint32_t *col_num)
>       return NULL;
>   }
>   
> +bool check_domain_colors(struct domain *d)
> +{
> +    int i;
> +    bool ret = false;
> +
> +    if ( !d )
> +        return ret;
> +
> +    if ( d->max_colors > max_col_num )
> +        return ret;
> +
> +    for ( i = 0; i < d->max_colors; i++ )
> +        ret |= (d->colors[i] > (max_col_num - 1));
> +
> +    return !ret;
> +}
> +
>   bool __init coloring_init(void)
>   {
>       int i;
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 8f24acf082..fdd46448d7 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -26,8 +26,16 @@
>   #define MAX_COLORS_CELLS 4
>   
>   #ifdef CONFIG_COLORING
> +#include <xen/sched.h>
> +
>   bool __init coloring_init(void);
>   
> +/*
> + * Check colors of a given domain.
> + * Return true if check passed, false otherwise.
> + */
> +bool check_domain_colors(struct domain *d);
> +
>   /*
>    * Return an array with default colors selection and store the number of
>    * colors in @param col_num. The array selection will be equal to the dom0

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 11/36] xen/include: define hypercall parameter for coloring
  2022-03-04 17:46 ` [PATCH 11/36] xen/include: define hypercall parameter for coloring Marco Solieri
  2022-03-07  7:31   ` Jan Beulich
@ 2022-03-09 20:29   ` Julien Grall
  1 sibling, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-09 20:29 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> During domU creation process the colors selection has to be passed to
> the Xen hypercall.
> This is generally done using what Xen calls GUEST_HANDLE_PARAMS. In this
> case a simple bitmask for the coloring configuration suffices.
> Currently the maximum amount of supported colors is 128.
> Add a new parameter that allows us to pass both the colors bitmask
> and the number of elements in it.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> ---
>   xen/arch/arm/include/asm/coloring.h | 2 --
>   xen/include/public/arch-arm.h       | 8 ++++++++

I would prefer if the structure is defined in the same patch that will
use it. This would make it easier to figure out if the structure is
indeed suitable.

>   2 files changed, 8 insertions(+), 2 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index fdd46448d7..1f7e0dde79 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -23,8 +23,6 @@
>   #ifndef __ASM_ARM_COLORING_H__
>   #define __ASM_ARM_COLORING_H__
>   
> -#define MAX_COLORS_CELLS 4
> -

In general, we should avoid moving code that was introduced within the 
same series.

In this case, I am not convinced we should use a static array to 
communicate the information between the toolstack and Xen.

This would make it more difficult for the user to tweak the number of
colors.

Instead, I think it would be better to expose to the toolstack the
number of colors supported and allocate a dynamic array.

>   #ifdef CONFIG_COLORING
>   #include <xen/sched.h>
>   
> diff --git a/xen/include/public/arch-arm.h b/xen/include/public/arch-arm.h
> index 94b31511dd..627cc42164 100644
> --- a/xen/include/public/arch-arm.h
> +++ b/xen/include/public/arch-arm.h
> @@ -303,6 +303,12 @@ struct vcpu_guest_context {
>   typedef struct vcpu_guest_context vcpu_guest_context_t;
>   DEFINE_XEN_GUEST_HANDLE(vcpu_guest_context_t);
>   
> +#define MAX_COLORS_CELLS 4
> +struct color_guest_config {
> +    uint32_t max_colors;
> +    uint32_t colors[MAX_COLORS_CELLS];
> +};

This looks like an open-coded version of xenctl_bitmap. Can you have a
look at using it?

I would expect this to reduce how much code you introduce in the next
patch.

> +
>   /*
>    * struct xen_arch_domainconfig's ABI is covered by
>    * XEN_DOMCTL_INTERFACE_VERSION.
> @@ -335,6 +341,8 @@ struct xen_arch_domainconfig {
>        *
>        */
>       uint32_t clock_frequency;
> +    /* IN */
> +    struct color_guest_config colors;
>   };
>   #endif /* __XEN__ || __XEN_TOOLS__ */
>   

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled
  2022-03-04 17:46 ` [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled Marco Solieri
@ 2022-03-09 20:34   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-09 20:34 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Based on the intrinsic nature of cache coloring, it is trivial to state
> that each domain that is colored is also not direct mapped.
> Set the directmap variable to false when coloring is enabled.

This is basically fixing a bug that was introduced in the previous 
patch. Please fold it.

> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/domain.c | 2 ++
>   1 file changed, 2 insertions(+)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 33471b3c58..80a6f39464 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -785,6 +785,8 @@ int arch_domain_create(struct domain *d,
>   
>       d->max_colors = 0;
>   #ifdef CONFIG_COLORING
> +    d->arch.directmap = false;

We should avoid silently overwriting what the user requested. Instead,
we should add a check in arch_sanitise_domain_config() to forbid the
case where CDF_directmap is set *and* the number of colors is > 0.
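
A sketch of such a check, for illustration only: the flag value, the
struct and the function name below are placeholders for the example,
not the real Xen definitions.

```c
#include <errno.h>
#include <stdint.h>

#define EX_CDF_directmap (1U << 1) /* placeholder value for the example */

/* Placeholder for the relevant parts of the domain configuration. */
struct ex_domain_config {
    unsigned int flags;
    uint32_t nr_colors;
};

/* Reject direct-mapped domains that also request cache colors. */
static int ex_sanitise_config(const struct ex_domain_config *cfg)
{
    if ( (cfg->flags & EX_CDF_directmap) && cfg->nr_colors > 0 )
        return -EINVAL;

    return 0;
}
```

Failing the configuration early reports the conflict to the user
instead of dropping one of the two requests.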

> +
>       /* Setup domain colors */
>       if ( !config->arch.colors.max_colors )
>       {

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64
  2022-03-04 20:54   ` Julien Grall
@ 2022-03-11 17:39     ` Marco Solieri
  2022-03-11 17:57       ` Julien Grall
  0 siblings, 1 reply; 79+ messages in thread
From: Marco Solieri @ 2022-03-11 17:39 UTC (permalink / raw)
  To: Julien Grall
  Cc: xen-devel, Andrew Cooper, George Dunlap, Jan Beulich,
	Stefano Stabellini, Wei Liu, Marco Solieri, Andrea Bastoni,
	Luca Miccio

[-- Attachment #1: Type: text/plain, Size: 2246 bytes --]

On Fri, Mar 04, 2022 at 08:54:35PM +0000, Julien Grall wrote:
> On 04/03/2022 17:46, Marco Solieri wrote:
> > The colored allocator should not make any assumptions on how a color
> > is defined, since the definition may change depending on the
> > architecture.
> IIUC, you are saying that the mapping between a physical address to a
> way is the same on every Armv8 processor.
> 
> Can you provide a reference from the Arm Arm which confirm this
> statement?

We are actually stating quite the opposite.  Generally speaking, the Arm
ARM leaves as IMPLEMENTATION DEFINED many details that are needed to
determine how colouring should be defined, most notably:
- the physical vs virtual indexing, which determines whether colouring
  is possible;
- the cache line length and the degree of associativity, which determine
  the way size, which in turn, together with the page size selected by
  the OS/hypervisor, allows computing the number of available colours;
- the number of levels of shared caches, which determines the number of
  different colour sets.

For the sake of simplicity, we wanted to decouple the notion of colour
from the many hardware features that enable/suggest one of the
(sometimes many) instantiations.

All these details are usually reported in the processor TRM.  E.g., in
the A53 TRM [DDI 0500J] we read (Sec. 7.1):

| Optional tightly-coupled L2 cache that includes:
| — Configurable L2 cache size of 128KB, 256KB, 512KB, 1MB and 2MB.
| — Fixed line length of 64 bytes.
| — Physically indexed and tagged cache.
| — 16-way set-associative cache structure.

A simplified version of this configuration is implemented in PATCH
06/36, where the fallback automatic configuration assumes that colouring
targets the LLC, and that this is the only shared cache level.

I hope that this clarifies some points.


Cheers.

-- 
Marco Solieri, Ph.D.
CEO & Founder
Tel: +39-059-205-5182 -- Cell: +39-349-678-66-65 -- OpenPGP: 0x75822E7E

Minerva Systems SRL -- http://www.minervasys.tech
Via Campi 213/B, 41125, Modena, Italy -- PIVA/CF 03996890368


* Re: [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64
  2022-03-11 17:39     ` Marco Solieri
@ 2022-03-11 17:57       ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-11 17:57 UTC (permalink / raw)
  To: Marco Solieri
  Cc: xen-devel, Andrew Cooper, George Dunlap, Jan Beulich,
	Stefano Stabellini, Wei Liu, Marco Solieri, Andrea Bastoni,
	Luca Miccio

Hi Marco,

On 11/03/2022 17:39, Marco Solieri wrote:
> On Fri, Mar 04, 2022 at 08:54:35PM +0000, Julien Grall wrote:
>> On 04/03/2022 17:46, Marco Solieri wrote:
>>> The colored allocator should not make any assumptions on how a color
>>> is defined, since the definition may change depending on the
>>> architecture.
>> IIUC, you are saying that the mapping between a physical address to a
>> way is the same on every Armv8 processor.
>>
>> Can you provide a reference from the Arm Arm which confirm this
>> statement?
> 
> We are actually stating quite the opposite.  Generally speaking, the Arm
> ARM leaves as IMPLEMENTATION DEFINED many details that are needed to
> determine how colouring should be defined, most notably:
> - the physical vs virtual indexing, which determines whether colouring
>    is possible;
> - the cache line length and the degree of associativity, which determine
>    the way size, which in turn, together with the page size selected by
>    the OS/hypervisor, allows to compute the number of available colours;
> - the number of levels of shared caches, which determines the number of
>    different colour sets.
> 
> For the sake of simplicity, we wanted to decouple the notion of colour
> from the many hardware features that enable/suggest one of the
> (sometimes many) instantiations.
> 
> All these details are usually reported in the processor TRM.  E.g., in
> the A53 TRM [DDI 0500J] we read (Sec. 7.1):
> 
> | Optional tightly-coupled L2 cache that includes:
> | — Configurable L2 cache size of 128KB, 256KB, 512KB, 1MB and 2MB.
> | — Fixed line length of 64 bytes.
> | — Physically indexed and tagged cache.
> | — 16-way set-associative cache structure.
Thanks for the details. They are all about the variables of an equation. 
What I am looking for is how the equation calculate_addr_col_mask() in 
patch #6 was defined.

Cheers,

-- 
Julien Grall


* Re: [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U
  2022-03-04 17:46 ` [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U Marco Solieri
@ 2022-03-11 19:05   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-11 19:05 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Initialize cache coloring configuration during domain creation. If no
> colors assignment is provided by the user, use the default one.
> The default configuration is the one assigned to Dom0. The latter is
> configured as a standard domain with default configuration.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/domain.c       | 53 +++++++++++++++++++++++++++++++++++++
>   xen/arch/arm/domain_build.c |  5 +++-
>   2 files changed, 57 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
> index 8110c1df86..33471b3c58 100644
> --- a/xen/arch/arm/domain.c
> +++ b/xen/arch/arm/domain.c
> @@ -38,6 +38,7 @@
>   #include <asm/vfp.h>
>   #include <asm/vgic.h>
>   #include <asm/vtimer.h>
> +#include <asm/coloring.h>
>   
>   #include "vpci.h"
>   #include "vuart.h"
> @@ -782,6 +783,58 @@ int arch_domain_create(struct domain *d,
>       if ( (rc = domain_vpci_init(d)) != 0 )
>           goto fail;
>   
> +    d->max_colors = 0;

NIT: d is always zeroed when allocated. So it is not necessary to 
initialize the field again.

> +#ifdef CONFIG_COLORING

Please move this code into a separate helper. The new helper could be
defined in coloring.c.

Furthermore, I would initialize the coloring information earlier in 
arch_domain_create(). This could be useful if we want to allocate 
internal structures from a color assigned to the domain.

> +    /* Setup domain colors */
> +    if ( !config->arch.colors.max_colors )
> +    {
> +        if ( !is_hardware_domain(d) )
> +            printk(XENLOG_INFO "Color configuration not found for dom%u, using default\n",

This message and the others below want to be rate-limited. I would use
XENLOG_G_{INFO, ERROR}.

Please use %pd instead of dom%u. This remark applies to all the other
uses below.

> +                   d->domain_id);

This would need to be changed to 'd'.

> +        d->colors = setup_default_colors(&d->max_colors);

Looking at setup_default_colors(), it is using "dom0_col_num". This implies
we are using the dom0 color. Shouldn't we return an error if d is not 
the hardware domain?

Also, AFAICT, you allocate the memory but never free it.

> +        if ( !d->colors )
> +        {
> +            rc = -ENOMEM;
> +            printk(XENLOG_ERR "Color array allocation failed for dom%u\n",
> +                   d->domain_id);
> +            goto fail;
> +        }
> +    }
> +    else
> +    {
> +        int i, k;
> +
> +        d->colors = xzalloc_array(uint32_t, config->arch.colors.max_colors);

Same here.

> +        if ( !d->colors )
> +        {
> +            rc = -ENOMEM;
> +            printk(XENLOG_ERR "Failed to alloc colors for dom%u\n",
> +                   d->domain_id);
> +            goto fail;
> +        }
> +
> +        d->max_colors = config->arch.colors.max_colors;
> +        for ( i = 0, k = 0;
> +              k < d->max_colors && i < sizeof(config->arch.colors.colors) * 8;
> +              i++ )
> +        {
> +            if ( config->arch.colors.colors[i / 32] & (1 << (i % 32)) )
> +                d->colors[k++] = i;
> +        }
> +    }
> +
> +    printk("Dom%u colors: [ ", d->domain_id);
> +    for ( int i = 0; i < d->max_colors; i++ )
> +        printk("%u ", d->colors[i]);
> +    printk("]\n");

You will be able to get the same information using the debug-key. So I
am not convinced that printing it here is warranted, all the more since
the configuration should always be the same as what the user requested.

> +
> +    if ( !check_domain_colors(d) )
> +    {
> +        rc = -EINVAL;
> +        printk(XENLOG_ERR "Failed to check colors for dom%u\n", d->domain_id);
> +        goto fail;
> +    }
> +#endif
>       return 0;
>   
>   fail:
> diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
> index 8be01678de..9630d00066 100644
> --- a/xen/arch/arm/domain_build.c
> +++ b/xen/arch/arm/domain_build.c
> @@ -3344,7 +3344,10 @@ void __init create_dom0(void)
>           printk(XENLOG_WARNING "Maximum number of vGIC IRQs exceeded.\n");
>       dom0_cfg.arch.tee_type = tee_get_type();
>       dom0_cfg.max_vcpus = dom0_max_vcpus();
> -
> +#ifdef CONFIG_COLORING
> +    /* Colors are set after domain_create */

Do you instead mean 'by'?

> +    dom0_cfg.arch.colors.max_colors = 0;
> +#endif
>       if ( iommu_enabled )
>           dom0_cfg.flags |= XEN_DOMCTL_CDF_iommu;
>   

Cheers,

-- 
Julien Grall


* Re: [PATCH 17/36] xen/arm: add get_max_color function
  2022-03-04 17:46 ` [PATCH 17/36] xen/arm: add get_max_color function Marco Solieri
@ 2022-03-11 19:09   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-11 19:09 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> In order to initialize the colored allocator data structure, the maximum
> amount of colors defined by the hardware has to be know.
> Add a helper function that returns this information.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> ---
>   xen/arch/arm/coloring.c             | 5 +++++
>   xen/arch/arm/include/asm/coloring.h | 8 ++++++++

This helper is simple enough that I think it would be better to fold it
into the first patch using it.

>   2 files changed, 13 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index 4748d717d6..d1ac193a80 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -209,6 +209,11 @@ unsigned long color_from_page(struct page_info *pg)
>     return ((addr_col_mask & page_to_maddr(pg)) >> PAGE_SHIFT);
>   }
>   
> +uint32_t get_max_colors(void)
> +{
> +    return max_col_num;
> +}
> +
>   bool __init coloring_init(void)
>   {
>       int i;
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 318e2a4521..22e67dc9d8 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -49,6 +49,9 @@ void coloring_dump_info(struct domain *d);
>    * specifications.
>    */
>   unsigned long color_from_page(struct page_info *pg);
> +
> +/* Return the maximum available number of colors supported by the hardware */
> +uint32_t get_max_colors(void);
>   #else /* !CONFIG_COLORING */
>   static inline bool __init coloring_init(void)
>   {
> @@ -59,5 +62,10 @@ static inline void coloring_dump_info(struct domain *d)
>   {
>       return;
>   }
> +
> +static inline uint32_t get_max_colors(void)
> +{
> +    return 0;
> +}
>   #endif /* CONFIG_COLORING */
>   #endif /* !__ASM_ARM_COLORING_H__ */

Cheers,

-- 
Julien Grall


* Re: [PATCH 28/36] xen/arm: introduce xen_map_text_rw
  2022-03-07  7:39   ` Jan Beulich
@ 2022-03-11 22:28     ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-11 22:28 UTC (permalink / raw)
  To: Jan Beulich, Marco Solieri
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Marco Solieri, Andrea Bastoni, Luca Miccio, Stefano Stabellini,
	xen-devel

Hi,

On 07/03/2022 07:39, Jan Beulich wrote:
> On 04.03.2022 18:46, Marco Solieri wrote:
>> From: Luca Miccio <lucmiccio@gmail.com>
>>
>> Introduce two new arm specific functions to temporarily map/unmap the
>> Xen text read-write (the Xen text is mapped read-only by default by
>> setup_pagetables): xen_map_text_rw and xen_unmap_text_rw.
>>
>> There is only one caller in the alternative framework.
>>
>> The non-colored implementation simply uses __vmap to do the mapping. In
>> other words, there are no changes to the non-colored case.
>>
>> The colored implementation calculates Xen text physical addresses
>> appropriately, according to the coloring configuration.
>>
>> Export vm_alloc because it is needed by the colored implementation of
>> xen_map_text_rw.
> 
> I'm afraid I view vm_alloc() as strictly an internal function to
> vmap.c. Even livepatching infrastructure has got away without making
> it non-static.

I think we can get away from using vmap() here. We were using it because 
Xen text mappings are RX and this is enforced by the processor via 
SCTLR_EL2.WXN.

The bit is cached in the TLB. Back then it wasn't very clear what would 
happen if we clear the bit. Looking at the latest Arm Arm (ARM DDI 
0487H.a D5.10), there is now a section "TLB invalidation and System 
register control fields" providing more details.

Reading the section, it should be safe to temporarily disable WXN on every
CPU and make Xen text writable.

@Marco, would you be able to have a look?

Cheers,

-- 
Julien Grall


* RE: [PATCH 27/36] xen/arch: add coloring support for Xen
  2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
  2022-03-04 19:47   ` Julien Grall
@ 2022-03-14  3:47   ` Henry Wang
  2022-03-14 21:58   ` Julien Grall
  2 siblings, 0 replies; 79+ messages in thread
From: Henry Wang @ 2022-03-14  3:47 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, Andrea Bastoni,
	Luca Miccio

Hi,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Marco Solieri
> Sent: Saturday, March 5, 2022 1:47 AM
> To: xen-devel@lists.xenproject.org
> Cc: Marco Solieri <marco.solieri@minervasys.tech>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>;
> Jan Beulich <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Marco Solieri
> <marco.solieri@unimore.it>; Andrea Bastoni
> <andrea.bastoni@minervasys.tech>; Luca Miccio <lucmiccio@gmail.com>
> Subject: [PATCH 27/36] xen/arch: add coloring support for Xen
> 
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Introduce a new implementation of setup_pagetables that uses coloring
> logic in order to isolate Xen code using its color selection.
> Page tables construction is essentially copied, except for the xenmap
> table, where coloring logic is needed.  Given the absence of a contiguous
> physical mapping, pointers to next level tables need to be manually
> calculated.
> 
> Xen code is relocated in strided mode using the same coloring logic as
> the one in xenmap table by using a temporary colored mapping that will
> be destroyed after switching the TTBR register.
> 
> Keep Xen text section mapped in the newly created pagetables.
> The boot process relies on computing needed physical addresses of Xen
> code by using a shift, but colored mapping is not linear and not easily
> computable. Therefore, the old Xen code is temporarily kept and used to
> boot secondary CPUs until they switch to the colored mapping, which is
> accessed using the handy macro virt_old.  After the boot process, the old
> Xen code memory is reset and its mapping is destroyed.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>  xen/arch/arm/include/asm/coloring.h |  13 ++
>  xen/arch/arm/include/asm/mm.h       |   7 ++
>  xen/arch/arm/mm.c                   | 186 +++++++++++++++++++++++++++-
>  xen/arch/arm/psci.c                 |   4 +-
>  xen/arch/arm/setup.c                |  21 +++-
>  xen/arch/arm/smpboot.c              |  19 ++-
>  6 files changed, 241 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/coloring.h
> b/xen/arch/arm/include/asm/coloring.h
> index 8c4525677c..424f6c2b04 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -26,6 +26,17 @@
>  #ifdef CONFIG_COLORING
>  #include <xen/sched.h>
> 
> +/*
> + * Amount of memory that we need to map in order to color Xen.  The value
> + * depends on the maximum number of available colors of the hardware.
> The
> + * memory size is pessimistically calculated assuming only one color is used,
> + * which means that any pages belonging to any other color has to be
> skipped.
> + */
> +#define XEN_COLOR_MAP_SIZE \
> +	((((_end - _start) * get_max_colors())\
> +		+ (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1))
> +#define XEN_COLOR_MAP_SIZE_M (XEN_COLOR_MAP_SIZE >> 20)
> +
>  bool __init coloring_init(void);
> 
>  /*
> @@ -67,6 +78,8 @@ unsigned long color_from_page(struct page_info *pg);
>  /* Return the maximum available number of colors supported by the
> hardware */
>  uint32_t get_max_colors(void);
>  #else /* !CONFIG_COLORING */
> +#define XEN_COLOR_MAP_SIZE (_end - _start)
> +
>  static inline bool __init coloring_init(void)
>  {
>      return true;
> diff --git a/xen/arch/arm/include/asm/mm.h
> b/xen/arch/arm/include/asm/mm.h
> index 041ec4ee70..1422091436 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -362,6 +362,13 @@ void clear_and_clean_page(struct page_info *page);
> 
>  unsigned int arch_get_dma_bitsize(void);
> 
> +#ifdef CONFIG_COLORING
> +#define virt_boot_xen(virt)\
> +    (vaddr_t)((virt - XEN_VIRT_START) + BOOT_RELOC_VIRT_START)

Apart from Julien's comments, one small issue: I am afraid that with commit

0c18fb7632 xen/arm: Remove unused BOOT_RELOC_VIRT_START

merged into the staging branch, directly applying the Arm cache coloring
series on top of staging breaks the build of Xen.

Therefore, please take care of this issue when sending the next version.
Thanks :)

Kind regards,

Henry

> +#else
> +#define virt_boot_xen(virt) virt
> +#endif
> +
>  #endif /*  __ARCH_ARM_MM__ */
>  /*
>   * Local variables:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d69f18b5d2..53ea13641b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -42,6 +42,7 @@
>  #include <xen/libfdt/libfdt.h>
> 
>  #include <asm/setup.h>
> +#include <asm/coloring.h>
> 
>  /* Override macros from asm/page.h to make them work with mfn_t */
>  #undef virt_to_mfn
> @@ -110,6 +111,9 @@ DEFINE_BOOT_PAGE_TABLE(boot_second_id);
>  DEFINE_BOOT_PAGE_TABLE(boot_third_id);
>  DEFINE_BOOT_PAGE_TABLE(boot_second);
>  DEFINE_BOOT_PAGE_TABLE(boot_third);
> +#ifdef CONFIG_COLORING
> +DEFINE_BOOT_PAGE_TABLE(boot_colored_xen);
> +#endif
> 
>  /* Main runtime page tables */
> 
> @@ -632,6 +636,166 @@ static void clear_table(void *table)
>      clean_and_invalidate_dcache_va_range(table, PAGE_SIZE);
>  }
> 
> +#ifdef CONFIG_COLORING
> +/*
> + * Translate a Xen (.text) virtual address to the colored physical one
> + * depending on the hypervisor configuration.
> + * N.B: this function must be used only when migrating from non colored to
> + * colored pagetables since it assumes to have the temporary mappings
> created
> + * during setup_pagetables that starts from BOOT_RELOC_VIRT_START.
> + * After the migration we have to use virt_to_maddr.
> + */
> +static paddr_t virt_to_maddr_colored(vaddr_t virt)
> +{
> +    unsigned int va_offset;
> +
> +    va_offset = virt - XEN_VIRT_START;
> +    return __pa(BOOT_RELOC_VIRT_START + va_offset);
> +}
> +
> +static void __init coloring_temp_mappings(paddr_t xen_paddr, vaddr_t
> virt_start)
> +{
> +    int i;
> +    lpae_t pte;
> +    unsigned int xen_text_size = (_end - _start);
> +
> +    xen_text_size = PAGE_ALIGN(xen_text_size);
> +
> +    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_second)),
> MT_NORMAL);
> +    pte.pt.table = 1;
> +    boot_first[first_table_offset(virt_start)] = pte;
> +
> +    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_colored_xen)),
> MT_NORMAL);
> +    pte.pt.table = 1;
> +    boot_second[second_table_offset(virt_start)] = pte;
> +
> +    for ( i = 0; i < (xen_text_size/PAGE_SIZE); i++ )
> +    {
> +        mfn_t mfn;
> +        xen_paddr = next_xen_colored(xen_paddr);
> +        mfn = maddr_to_mfn(xen_paddr);
> +        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
> +        pte.pt.table = 1; /* 4k mappings always have this bit set */
> +        boot_colored_xen[i] = pte;
> +        xen_paddr += PAGE_SIZE;
> +    }
> +
> +   flush_xen_tlb_local();
> +}
> +
> +/*
> + * Boot-time pagetable setup with coloring support
> + * Changes here may need matching changes in head.S
> + *
> + * The process can be explained as follows:
> + * - Create a temporary colored mapping that conforms to Xen color
> selection.
> + * - Update all the pagetables links that point to the next level table(s):
> + * this process is crucial beacause the translation tables are not physically
> + * contiguous and we cannot calculate the physical addresses by using the
> + * standard method (physical offset). In order to get the correct physical
> + * address we use virt_to_maddr_colored that translates the virtual address
> + * into a physical one based on the Xen coloring configuration.
> + * - Copy Xen to the new location.
> + * - Update TTBR0_EL2 with the new root page table address.
> + */
> +void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t
> xen_paddr)
> +{
> +    int i;
> +    lpae_t pte, *p;
> +    paddr_t pt_phys;
> +    mfn_t pt_phys_mfn;
> +    paddr_t _xen_paddr = xen_paddr;
> +
> +    phys_offset = boot_phys_offset;
> +
> +    ASSERT((_end - _start) < SECOND_SIZE);
> +    /* Create temporary mappings */
> +    coloring_temp_mappings(xen_paddr, BOOT_RELOC_VIRT_START);
> +
> +    /* Build pagetables links */
> +    p = (void *)xen_pgtable;
> +    pt_phys = virt_to_maddr_colored((vaddr_t)xen_first);
> +    pt_phys_mfn = maddr_to_mfn(pt_phys);
> +    p[0] = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
> +    p[0].pt.table = 1;
> +    p[0].pt.xn = 0;
> +    p = (void *)xen_first;
> +
> +    for ( i = 0; i < 2; i++ )
> +    {
> +        pt_phys = virt_to_maddr_colored((vaddr_t)(xen_second + i *
> LPAE_ENTRIES));
> +        pt_phys_mfn = maddr_to_mfn(pt_phys);
> +        p[i] = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
> +        p[i].pt.table = 1;
> +        p[i].pt.xn = 0;
> +    }
> +
> +    for ( i = 0; i < LPAE_ENTRIES; i++ )
> +    {
> +        mfn_t mfn;
> +        vaddr_t va = XEN_VIRT_START + (i << PAGE_SHIFT);
> +        _xen_paddr = next_xen_colored(_xen_paddr);
> +        mfn = maddr_to_mfn(_xen_paddr);
> +        if ( !is_kernel(va) )
> +            break;
> +        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
> +        pte.pt.table = 1; /* 4k mappings always have this bit set */
> +        if ( is_kernel_text(va) || is_kernel_inittext(va) )
> +        {
> +            pte.pt.xn = 0;
> +            pte.pt.ro = 1;
> +        }
> +        if ( is_kernel_rodata(va) )
> +            pte.pt.ro = 1;
> +        xen_xenmap[i] = pte;
> +        _xen_paddr += PAGE_SIZE;
> +    }
> +
> +    /* Initialise xen second level entries ... */
> +    /* ... Xen's text etc */
> +    pt_phys = virt_to_maddr_colored((vaddr_t)(xen_xenmap));
> +    pt_phys_mfn = maddr_to_mfn(pt_phys);
> +    pte = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
> +    pte.pt.table = 1;
> +    xen_second[second_table_offset(XEN_VIRT_START)] = pte;
> +
> +    /* ... Fixmap */
> +    pt_phys = virt_to_maddr_colored((vaddr_t)(xen_fixmap));
> +    pt_phys_mfn = maddr_to_mfn(pt_phys);
> +    pte = mfn_to_xen_entry(pt_phys_mfn, MT_NORMAL);
> +    pte.pt.table = 1;
> +    xen_second[second_table_offset(FIXMAP_ADDR(0))] = pte;
> +
> +    /* ... DTB */
> +    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START)];
> +    xen_second[second_table_offset(BOOT_FDT_VIRT_START)] = pte;
> +    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START +
> SZ_2M)];
> +    xen_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)] =
> pte;
> +
> +    /* Update the value of init_ttbr */
> +    init_ttbr = virt_to_maddr_colored((vaddr_t)xen_pgtable);
> +    clean_dcache(init_ttbr);
> +
> +    /* Copy Xen to the new location */
> +    memcpy((void*)BOOT_RELOC_VIRT_START,
> +        (const void*)XEN_VIRT_START, (_end - _start));
> +    clean_dcache_va_range((void*)BOOT_RELOC_VIRT_START, (_end -
> _start));
> +
> +    /* Change ttbr */
> +    switch_ttbr(init_ttbr);
> +
> +    /*
> +     * Keep mapped old Xen memory in a contiguous mapping
> +     * for other cpus to boot. This mapping will also replace the
> +     * one created at the beginning of setup_pagetables.
> +     */
> +    create_mappings(xen_second, BOOT_RELOC_VIRT_START,
> +                paddr_to_pfn(XEN_VIRT_START + phys_offset),
> +                SZ_2M >> PAGE_SHIFT, SZ_2M);
> +
> +    xen_pt_enforce_wnx();
> +}
> +#else
>  /* Boot-time pagetable setup.
>   * Changes here may need matching changes in head.S */
>  void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t
> xen_paddr)
> @@ -721,6 +885,7 @@ void __init setup_pagetables(unsigned long
> boot_phys_offset, paddr_t xen_paddr)
>      per_cpu(xen_dommap, 0) = cpu0_dommap;
>  #endif
>  }
> +#endif /* !CONFIG_COLORING */
> 
>  static void clear_boot_pagetables(void)
>  {
> @@ -735,6 +900,9 @@ static void clear_boot_pagetables(void)
>  #endif
>      clear_table(boot_second);
>      clear_table(boot_third);
> +#ifdef CONFIG_COLORING
> +    clear_table(boot_colored_xen);
> +#endif
>  }
> 
>  #ifdef CONFIG_ARM_64
> @@ -742,10 +910,16 @@ int init_secondary_pagetables(int cpu)
>  {
>      clear_boot_pagetables();
> 
> +    /*
> +     * For coloring the value of the ttbr was already set up during
> +     * setup_pagetables.
> +     */
> +#ifndef CONFIG_COLORING
>      /* Set init_ttbr for this CPU coming up. All CPus share a single setof
>       * pagetables, but rewrite it each time for consistency with 32 bit. */
>      init_ttbr = (uintptr_t) xen_pgtable + phys_offset;
>      clean_dcache(init_ttbr);
> +#endif
>      return 0;
>  }
>  #else
> @@ -859,12 +1033,20 @@ void __init setup_xenheap_mappings(unsigned
> long base_mfn,
>          else if ( xenheap_first_first_slot == -1)
>          {
>              /* Use xenheap_first_first to bootstrap the mappings */
> -            first = xenheap_first_first;
> +            paddr_t phys_addr;
> +
> +            /*
> +             * At this stage is safe to use virt_to_maddr because Xen mapping
> +             * is already in place. Using virt_to_maddr allows us to unify
> +             * codepath with and without cache coloring enabled.
> +             */
> +            phys_addr = virt_to_maddr((vaddr_t)xenheap_first_first);
> +            pte = mfn_to_xen_entry(maddr_to_mfn(phys_addr),MT_NORMAL);
> 
> -            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
>              pte.pt.table = 1;
>              write_pte(p, pte);
> 
> +            first = xenheap_first_first;
>              xenheap_first_first_slot = slot;
>          }
>          else
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index 0c90c2305c..d443fac6a2 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -25,6 +25,7 @@
>  #include <asm/cpufeature.h>
>  #include <asm/psci.h>
>  #include <asm/acpi.h>
> +#include <asm/coloring.h>
> 
>  /*
>   * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,7 +50,8 @@ int call_psci_cpu_on(int cpu)
>  {
>      struct arm_smccc_res res;
> 
> -    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
> __pa(init_secondary),
> +    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
> +                  __pa(virt_boot_xen((vaddr_t)init_secondary)),
>                    &res);
> 
>      return PSCI_RET(res);
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 13b10515a8..294b806120 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -924,6 +924,7 @@ void __init start_xen(unsigned long boot_phys_offset,
>      struct domain *d;
>      int rc;
>      paddr_t xen_paddr = (paddr_t)(_start + boot_phys_offset);
> +    uint32_t xen_size = (_end - _start);
> 
>      dcache_line_bytes = read_dcache_line_bytes();
> 
> @@ -952,13 +953,16 @@ void __init start_xen(unsigned long
> boot_phys_offset,
>      if ( !coloring_init() )
>          panic("Xen Coloring support: setup failed\n");
> 
> +    xen_size = XEN_COLOR_MAP_SIZE;
> +#ifdef CONFIG_COLORING
> +    xen_paddr = get_xen_paddr(xen_size);
> +#endif
> +
>      /* Register Xen's load address as a boot module. */
> -    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
> -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> +    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
> xen_size, false);
>      BUG_ON(!xen_bootmodule);
> 
>      setup_pagetables(boot_phys_offset, xen_paddr);
> -
>      setup_mm();
> 
>      /* Parse the ACPI tables for possible boot-time configuration */
> @@ -1072,6 +1076,17 @@ void __init start_xen(unsigned long
> boot_phys_offset,
> 
>      setup_virt_paging();
> 
> +    /*
> +     * This removal is useful if cache coloring is enabled but
> +     * it should not affect non coloring configuration.
> +     * The removal is done earlier than discard_initial_modules
> +     * beacuse in do_initcalls there is the livepatch support
> +     * setup which uses the virtual addresses starting from
> +     * BOOT_RELOC_VIRT_START.
> +     * Remove coloring mappings to expose a clear state to the
> +     * livepatch module.
> +     */
> +    remove_early_mappings(BOOT_RELOC_VIRT_START, SZ_2M);
>      do_initcalls();
> 
>      /*
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 7bfd0a73a7..5ef68976c9 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -438,6 +438,7 @@ int __cpu_up(unsigned int cpu)
>  {
>      int rc;
>      s_time_t deadline;
> +    vaddr_t *smp_up_cpu_addr;
> 
>      printk("Bringing up CPU%d\n", cpu);
> 
> @@ -453,10 +454,22 @@ int __cpu_up(unsigned int cpu)
>      /* Tell the remote CPU what its logical CPU ID is. */
>      init_data.cpuid = cpu;
> 
> +    /*
> +     * If coloring is enabled, non-Master CPUs boot using the old Xen code.
> +     * During the boot process each cpu is booted one after another using the
> +     * smp_cpu_cpu variable. This variable is accessed in head.S using its
> +     * physical address.
> +     * That address is calculated using the physical offset of the old Xen
> +     * code. With coloring we can not rely anymore on that offset. For this
> +     * reason in order to boot the other cpus we rely on the old xen code that
> +     * was mapped during tables setup in mm.c so that we can use the old
> physical
> +     * offset and the old head.S code also. In order to modify the old Xen
> code
> +     * we need to access it using the mapped done in color_xen.
> +     */
> +    smp_up_cpu_addr = (vaddr_t *)virt_boot_xen((vaddr_t)&smp_up_cpu);
> +    *smp_up_cpu_addr = cpu_logical_map(cpu);
>      /* Open the gate for this CPU */
> -    smp_up_cpu = cpu_logical_map(cpu);
> -    clean_dcache(smp_up_cpu);
> -
> +    clean_dcache(*smp_up_cpu_addr);
>      rc = arch_cpu_up(cpu);
> 
>      console_end_sync();
> --
> 2.30.2
> 



* RE: [PATCH 10/36] xen/arch: check color selection function
  2022-03-04 17:46 ` [PATCH 10/36] xen/arch: check color " Marco Solieri
  2022-03-09 20:17   ` Julien Grall
@ 2022-03-14  6:06   ` Henry Wang
  1 sibling, 0 replies; 79+ messages in thread
From: Henry Wang @ 2022-03-14  6:06 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, Andrea Bastoni,
	Luca Miccio

Hi Marco,

> -----Original Message-----
> From: Xen-devel <xen-devel-bounces@lists.xenproject.org> On Behalf Of
> Marco Solieri
> Sent: Saturday, March 5, 2022 1:47 AM
> To: xen-devel@lists.xenproject.org
> Cc: Marco Solieri <marco.solieri@minervasys.tech>; Andrew Cooper
> <andrew.cooper3@citrix.com>; George Dunlap <george.dunlap@citrix.com>;
> Jan Beulich <jbeulich@suse.com>; Julien Grall <julien@xen.org>; Stefano
> Stabellini <sstabellini@kernel.org>; Wei Liu <wl@xen.org>; Marco Solieri
> <marco.solieri@unimore.it>; Andrea Bastoni
> <andrea.bastoni@minervasys.tech>; Luca Miccio <lucmiccio@gmail.com>
> Subject: [PATCH 10/36] xen/arch: check color selection function
> 
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Dom0 color configuration is parsed in the Xen command line. Add an
> helper function to check the user selection. If no configuration is
> provided by the user, all the available colors supported by the
> hardware will be assigned to dom0.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>

For the first 10 commits:

Tested-by: Henry Wang <Henry.Wang@arm.com>

> ---
>  xen/arch/arm/coloring.c             | 17 +++++++++++++++++
>  xen/arch/arm/include/asm/coloring.h |  8 ++++++++
>  2 files changed, 25 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index f6e6d09477..382d558021 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -179,6 +179,23 @@ uint32_t *setup_default_colors(uint32_t *col_num)
>      return NULL;
>  }
> 
> +bool check_domain_colors(struct domain *d)
> +{
> +    int i;
> +    bool ret = false;
> +
> +    if ( !d )
> +        return ret;
> +
> +    if ( d->max_colors > max_col_num )
> +        return ret;
> +
> +    for ( i = 0; i < d->max_colors; i++ )
> +        ret |= (d->colors[i] > (max_col_num - 1));
> +
> +    return !ret;
> +}
> +
>  bool __init coloring_init(void)
>  {
>      int i;
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 8f24acf082..fdd46448d7 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -26,8 +26,16 @@
>  #define MAX_COLORS_CELLS 4
> 
>  #ifdef CONFIG_COLORING
> +#include <xen/sched.h>
> +
>  bool __init coloring_init(void);
> 
> +/*
> + * Check colors of a given domain.
> + * Return true if check passed, false otherwise.
> + */
> +bool check_domain_colors(struct domain *d);
> +
>  /*
>   * Return an array with default colors selection and store the number of
>   * colors in @param col_num. The array selection will be equal to the dom0
> --
> 2.30.2
> 




* Re: [PATCH 22/36] xen/arch: init cache coloring conf for Xen
  2022-03-04 17:46 ` [PATCH 22/36] xen/arch: init cache coloring conf for Xen Marco Solieri
@ 2022-03-14 18:59   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 18:59 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Add initialization for Xen coloring data. By default, use the lowest
> color index available.
> 
> Benchmarking the VM interrupt response time provides an estimation of
> LLC usage by Xen's most latency-critical runtime task.  Results on Arm
> Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
> reserves 64 KiB of L2, is enough to attain best responsiveness.
> 
> More colors are instead very likely to be needed on processors whose L1
> cache is physically-indexed and physically-tagged, such as Cortex-A57.
> In such cases, coloring applies to L1 also, and there typically are two
> distinct L1-colors. Therefore, reserving only one color for Xen would
> senselessly partitions a cache memory that is already private, i.e.
> underutilize it. The default amount of Xen colors is thus set to one.
> 
> Signed-off-by: Luca Miccio <206497@studenti.unimore.it>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/coloring.c | 31 ++++++++++++++++++++++++++++++-
>   1 file changed, 30 insertions(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index d1ac193a80..761414fcd7 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -30,10 +30,18 @@
>   #include <asm/coloring.h>
>   #include <asm/io.h>
>   
> +/* By default Xen uses the lowestmost color */
> +#define XEN_COLOR_DEFAULT_MASK 0x0001

You are setting a uint32_t value. So it should be 0x00000001.

But I think it is a bit confusing to define a mask here. Instead, I 
would define the color ID and set the bit.
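
A minimal sketch of that suggestion, assuming the mask stays an array of
32-bit cells as in this series. XEN_COLOR_DEFAULT_ID, BITS_PER_CELL and
color_mask_set() are hypothetical names used only for illustration; they
are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_COLORS_CELLS      4    /* same value as in asm/coloring.h */
#define BITS_PER_CELL         32U  /* hypothetical helper constant */
#define XEN_COLOR_DEFAULT_ID  0U   /* hypothetical: Xen defaults to color 0 */

/* Set the bit that represents 'color' in a mask made of 32-bit cells. */
static void color_mask_set(uint32_t mask[MAX_COLORS_CELLS], unsigned int color)
{
    assert(color < MAX_COLORS_CELLS * BITS_PER_CELL);

    mask[color / BITS_PER_CELL] |= UINT32_C(1) << (color % BITS_PER_CELL);
}
```

With this shape the default would be expressed as
color_mask_set(xen_col_mask, XEN_COLOR_DEFAULT_ID) rather than as a
hand-written mask constant.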

> +#define XEN_COLOR_DEFAULT_NUM 1
> +/* Current maximum useful colors */
> +#define MAX_XEN_COLOR   128
> +
>   /* Number of color(s) assigned to Xen */
>   static uint32_t xen_col_num;
>   /* Coloring configuration of Xen as bitmask */
>   static uint32_t xen_col_mask[MAX_COLORS_CELLS];
> +/* Xen colors IDs */
> +static uint32_t xen_col_list[MAX_XEN_COLOR];

Why do we need to store Xen colors both as a bitmask and as an array 
of color IDs?

>   
>   /* Number of color(s) assigned to Dom0 */
>   static uint32_t dom0_col_num;
> @@ -216,7 +224,7 @@ uint32_t get_max_colors(void)
>   
>   bool __init coloring_init(void)
>   {
> -    int i;
> +    int i, rc;
>   
>       printk(XENLOG_INFO "Initialize XEN coloring: \n");
>       /*
> @@ -266,6 +274,27 @@ bool __init coloring_init(void)
>       printk(XENLOG_INFO "Color bits in address: 0x%"PRIx64"\n", addr_col_mask);
>       printk(XENLOG_INFO "Max number of colors: %u\n", max_col_num);
>   
> +    if ( !xen_col_num )
> +    {
> +        xen_col_mask[0] = XEN_COLOR_DEFAULT_MASK;
> +        xen_col_num = XEN_COLOR_DEFAULT_NUM;
> +        printk(XENLOG_WARNING "Xen color configuration not found. Using default\n");
> +    }
> +
> +    printk(XENLOG_INFO "Xen color configuration: 0x%"PRIx32"%"PRIx32"%"PRIx32"%"PRIx32"\n",
> +            xen_col_mask[3], xen_col_mask[2], xen_col_mask[1], xen_col_mask[0]);

You are making the assumption that MAX_COLORS_CELLS is always 4. It 
may be more or, worse, less. So this should be reworked to avoid making 
any assumption about the size.

I expect switching to the generic bitmask will help here.
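
For illustration only, here is a size-agnostic way to format such a mask,
assuming it remains an array of uint32_t cells. format_color_mask() is a
hypothetical helper, written against standard C rather than Xen's printk,
and it zero-pads each cell so that the cell boundaries stay unambiguous.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Write "0x" followed by every cell of 'mask', most significant cell first,
 * each zero-padded to 8 hex digits. Returns the number of characters written
 * (excluding the terminating NUL), or -1 if 'buf' is too small.
 */
static int format_color_mask(const uint32_t *mask, size_t cells,
                             char *buf, size_t len)
{
    size_t used;
    int n;

    assert(mask != NULL && buf != NULL);

    n = snprintf(buf, len, "0x");
    if ( n < 0 || (size_t)n >= len )
        return -1;
    used = (size_t)n;

    /* 'i-- > 0' walks cells - 1 down to 0 without underflowing. */
    for ( size_t i = cells; i-- > 0; )
    {
        n = snprintf(buf + used, len - used, "%08" PRIx32, mask[i]);
        if ( n < 0 || (size_t)n >= len - used )
            return -1;
        used += (size_t)n;
    }

    return (int)used;
}
```

The loop bound comes from the caller, so changing MAX_COLORS_CELLS would
not require touching the format string.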

> +    rc = copy_mask_to_list(xen_col_mask, xen_col_list, xen_col_num);
> +
> +    if ( rc )
> +        return false;
> +
> +    for ( i = 0; i < xen_col_num; i++ )
> +        if ( xen_col_list[i] > (max_col_num - 1) )
> +        {
> +            printk(XENLOG_ERR "ERROR: max. color value not supported\n");
> +            return false;
> +        }
> +
>       return true;
>   }
>   

Cheers,

-- 
Julien Grall



* Re: [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses
  2022-03-04 17:46 ` [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses Marco Solieri
@ 2022-03-14 19:23   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 19:23 UTC (permalink / raw)
  To: Marco Solieri
  Cc: xen-devel, Andrew Cooper, George Dunlap, Jan Beulich,
	Stefano Stabellini, Wei Liu, Marco Solieri, Andrea Bastoni,
	lucmiccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> During Xen coloring procedure, we need to manually calculate consecutive
> physical addresses that conform to the color selection. Add an helper
> function that does this operation. The latter will return the next
> address that conforms to Xen color selection.
> 
> The next_colored function is architecture dependent and the provided
> implementation is for ARMv8.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/coloring.c             | 43 +++++++++++++++++++++++++++++
>   xen/arch/arm/include/asm/coloring.h | 14 ++++++++++
>   2 files changed, 57 insertions(+)
> 
> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> index 761414fcd7..aae3c77a7b 100644
> --- a/xen/arch/arm/coloring.c
> +++ b/xen/arch/arm/coloring.c
> @@ -222,6 +222,49 @@ uint32_t get_max_colors(void)
>       return max_col_num;
>   }
>   
> +paddr_t next_xen_colored(paddr_t phys)
> +{
> +    unsigned int i;
> +    unsigned int col_next_number = 0;
> +    unsigned int col_cur_number = (phys & addr_col_mask) >> PAGE_SHIFT;
> +    int overrun = 0;

This only looks to be used as an unsigned value. So please use 'unsigned 
int'.

> +    paddr_t ret;
> +
> +    /*
> +     * Check if address color conforms to Xen selection. If it does, return
> +     * the address as is.
> +     */
> +    for( i = 0; i < xen_col_num; i++)

Coding style: missing space after 'for' and before ')'.

> +        if ( col_cur_number == xen_col_list[i] )
> +            return phys;
> +
> +    /* Find next col */
> +    for( i = xen_col_num -1 ; i >= 0; i--)

i is unsigned. So the 'i >= 0' will always be true as it will wrap to 
2^32 - 1. What did you intend to check?
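
If the intent was simply to walk the list from the last element down to
the first, one common idiom with an unsigned index is sketched below.
last_index_below() is a hypothetical helper for illustration and is not
taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the last element of 'list' that is strictly smaller
 * than 'value', or 'n' if there is none. 'i-- > 0' compares before
 * decrementing, so the loop terminates even though 'i' is unsigned.
 */
static size_t last_index_below(const uint32_t *list, size_t n, uint32_t value)
{
    assert(list != NULL || n == 0);

    for ( size_t i = n; i-- > 0; )
        if ( list[i] < value )
            return i;

    return n;
}
```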

Coding style: missing space after 'for', '-' and before ')'.

> +    {
> +        if ( col_cur_number > xen_col_list[i])

Coding style: missing space before ')'.

> +        {
> +            /* Need to start to first element and add a way_size */
> +            if ( i == (xen_col_num - 1) )
> +            {
> +                col_next_number = xen_col_list[0];
> +                overrun = 1;
> +            }
> +            else
> +            {
> +                col_next_number = xen_col_list[i+1];

Coding style: Missing space before and after '+'.

> +                overrun = 0;
> +            }
> +            break;
> +        }
> +    }
> +
> +    /* Align phys to way_size */
> +    ret = phys - (PAGE_SIZE * col_cur_number);

I am not sure I understand how the comment matches the code. 
From the comment, I would expect the expression to contain 'way_size'.
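
To make the question concrete, here is how I would expect a version that
really aligns to way_size to look. This is a sketch under assumptions, not
the patch's code: the color list is sorted in ascending order, the number
of colors is a power of two, way_size is max_colors * PAGE_SIZE, and the
helper name and parameters are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (UINT64_C(1) << PAGE_SHIFT)

/*
 * Return the lowest address >= phys whose color is in 'colors'.
 * If phys already has a selected color it is returned unchanged,
 * otherwise the result is the start of the next selected page.
 */
static uint64_t next_colored(uint64_t phys, const unsigned int *colors,
                             size_t n, unsigned int max_colors)
{
    uint64_t way_size = (uint64_t)max_colors * PAGE_SIZE;
    uint64_t way_base = phys & ~(way_size - 1);      /* align to way_size */
    unsigned int cur = (unsigned int)((phys & (way_size - 1)) >> PAGE_SHIFT);

    assert(n > 0);
    assert(max_colors > 0 && (max_colors & (max_colors - 1)) == 0);

    for ( size_t i = 0; i < n; i++ )
    {
        if ( colors[i] == cur )
            return phys;
        if ( colors[i] > cur )
            return way_base + (uint64_t)colors[i] * PAGE_SIZE;
    }

    /* No selected color left in this way: wrap to the first one. */
    return way_base + way_size + (uint64_t)colors[0] * PAGE_SIZE;
}
```

For a page-aligned phys, 'phys - PAGE_SIZE * col_cur_number' and
'phys & ~(way_size - 1)' give the same value, which may be what the
comment meant.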

> +    /* Add the offset based on color selection*/

Coding style: missing space before '*/'.

> +    ret += (PAGE_SIZE * (col_next_number)) + (way_size*overrun);
Coding style: Missing space before and after '*'.

> +    return ret;
> +}
> +
>   bool __init coloring_init(void)
>   {
>       int i, rc;
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 22e67dc9d8..8c4525677c 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -28,6 +28,20 @@
>   
>   bool __init coloring_init(void);
>   
> +/*
> + * Return physical page address that conforms to the colors selection
> + * given in col_selection_mask after @param phys.
> + *
> + * @param phys         Physical address start.
> + * @param addr_col_mask        Mask specifying the bits available for coloring.
> + * @param col_selection_mask   Mask asserting the color bits to be used,
> + * must not be 0.

The function below has only one parameter. Yet, you are describing 3 
parameters here.

> + *
> + * @return The lowest physical page address being greater or equal than
> + * 'phys' and belonging to Xen color selection
> + */
> +paddr_t next_xen_colored(paddr_t phys);
> +
>   /*
>    * Check colors of a given domain.
>    * Return true if check passed, false otherwise.

Cheers,

-- 
Julien Grall



* Re: [PATCH 24/36] xen/arm: enable consider_modules for coloring
  2022-03-04 17:46 ` [PATCH 24/36] xen/arm: enable consider_modules for coloring Marco Solieri
@ 2022-03-14 19:24   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 19:24 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> In order to relocate Xen the function get_xen_paddr will be used in the
> following patches. The method has "consider_modules" as a prerequisite
> so it has to be enabled both for ARM32 and coloring.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> ---
>   xen/arch/arm/setup.c | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index f39c62ea70..0bfe12da57 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -442,7 +442,7 @@ static void * __init relocate_fdt(paddr_t dtb_paddr, size_t dtb_size)
>       return fdt;
>   }
>   
> -#ifdef CONFIG_ARM_32
> +#if defined (CONFIG_ARM_32) || (CONFIG_COLORING)

Please fold this change in the first use of consider_modules().

Cheers,

-- 
Julien Grall



* Re: [PATCH 26/36] xen/arm: add argument to remove_early_mappings
  2022-03-04 17:46 ` [PATCH 26/36] xen/arm: add argument to remove_early_mappings Marco Solieri
@ 2022-03-14 19:59   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 19:59 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Upcoming patches will need to remove temporary mappings created during
> Xen coloring process. The function remove_early_mappings does what we
> need but it is case-specific. Parametrize the function to avoid code
> replication.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> Acked-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> ---
>   xen/arch/arm/include/asm/mm.h | 2 +-
>   xen/arch/arm/mm.c             | 8 ++++----
>   xen/arch/arm/setup.c          | 3 ++-
>   3 files changed, 7 insertions(+), 6 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 9ac1767595..041ec4ee70 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -184,7 +184,7 @@ extern void setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr);
>   /* Map FDT in boot pagetable */
>   extern void *early_fdt_map(paddr_t fdt_paddr);
>   /* Remove early mappings */
> -extern void remove_early_mappings(void);
> +extern void remove_early_mappings(unsigned long va, unsigned long size);
>   /* Allocate and initialise pagetables for a secondary CPU. Sets init_ttbr to the
>    * new page table */
>   extern int init_secondary_pagetables(int cpu);
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index fd7a313d88..d69f18b5d2 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -597,13 +597,13 @@ void * __init early_fdt_map(paddr_t fdt_paddr)
>       return fdt_virt;
>   }
>   
> -void __init remove_early_mappings(void)
> +void __init remove_early_mappings(unsigned long va, unsigned long size)
>   {
>       lpae_t pte = {0};
> -    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START), pte);
> -    write_pte(xen_second + second_table_offset(BOOT_FDT_VIRT_START + SZ_2M),
> +    write_pte(xen_second + second_table_offset(va), pte);
> +    write_pte(xen_second + second_table_offset(va + size),
>                 pte);

The original goal of this code was to remove 2 entries. Each entry 
covering 2MB.

Anyone calling with size == 2MB will expect a single mapping to be 
removed. But 4MB worth of memory will be removed.
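
To illustrate what a caller would likely expect from a (va, size)
interface, here is a sketch with half-open range semantics, operating on a
simulated table rather than on real page tables. clear_2m_range() and
second_index() are hypothetical names; this is not a proposal for the
final code.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_2M           (UINT64_C(2) << 20)
#define SECOND_ENTRIES  512U   /* entries in one level-2 table, 4KB granule */

/* Index of the level-2 entry that covers 'va'. */
static unsigned int second_index(uint64_t va)
{
    return (unsigned int)((va >> 21) & (SECOND_ENTRIES - 1));
}

/*
 * Clear every 2MB entry in [va, va + size) of a simulated level-2 table
 * and return how many entries were cleared. Both values must be 2MB
 * aligned.
 */
static unsigned int clear_2m_range(uint64_t table[SECOND_ENTRIES],
                                   uint64_t va, uint64_t size)
{
    unsigned int cleared = 0;

    assert((va % SZ_2M) == 0 && (size % SZ_2M) == 0);

    for ( uint64_t off = 0; off < size; off += SZ_2M, cleared++ )
        table[second_index(va + off)] = 0;

    return cleared;
}
```

With these semantics size == 2MB clears one entry and size == 4MB clears
two, whereas the hunk above always clears the entries at va and va + size.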

Effectively, remove_early_mappings() is not generic enough to be 
parametrized. I also don't think this function should be parametrized.
The goal is to remove any mappings that were created during early boot.

I will have a look at how you use it before making any suggestions.

Cheers,

-- 
Julien Grall



* Re: [PATCH 31/36] Disable coloring if static memory support is selected
  2022-03-04 17:46 ` [PATCH 31/36] Disable coloring if static memory support is selected Marco Solieri
@ 2022-03-14 20:04   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 20:04 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Static memory assumes to have physically contiguous memory mapped to
> domains. This assumption cannot be made when coloring is enabled.
> These two features have to be mutually exclusive.

I understand that at runtime, you want them to be mutually exclusive.
But I am not sure I understand why they need to be mutually exclusive 
at compile time.

In fact, I think it would be nice if we had a single Xen binary that can 
be used with/without coloring. Could you outline any reasons that would 
make this goal difficult to achieve?

Cheers,

-- 
Julien Grall



* Re: [PATCH 27/36] xen/arch: add coloring support for Xen
  2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
  2022-03-04 19:47   ` Julien Grall
  2022-03-14  3:47   ` Henry Wang
@ 2022-03-14 21:58   ` Julien Grall
  2 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 21:58 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Introduce a new implementation of setup_pagetables that uses coloring
> logic in order to isolate Xen code using its color selection.
> Page tables construction is essentially copied, except for the xenmap
> table, where coloring logic is needed.  Given the absence of a contiguous
> physical mapping, pointers to next level tables need to be manually
> calculated.
> 
> Xen code is relocated in strided mode using the same coloring logic as
> the one in xenmap table by using a temporary colored mapping that will
> be destroyed after switching the TTBR register.
> 
> Keep Xen text section mapped in the newly created pagetables.
> The boot process relies on computing needed physical addresses of Xen
> code by using a shift, but colored mapping is not linear and not easily
> computable. Therefore, the old Xen code is temporarily kept and used to
> boot secondary CPUs until they switch to the colored mapping, which is
> accessed using the handy macro virt_old.  After the boot process, the old
> Xen code memory is reset and its mapping is destroyed.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   xen/arch/arm/include/asm/coloring.h |  13 ++
>   xen/arch/arm/include/asm/mm.h       |   7 ++
>   xen/arch/arm/mm.c                   | 186 +++++++++++++++++++++++++++-
>   xen/arch/arm/psci.c                 |   4 +-
>   xen/arch/arm/setup.c                |  21 +++-
>   xen/arch/arm/smpboot.c              |  19 ++-
>   6 files changed, 241 insertions(+), 9 deletions(-)
> 
> diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> index 8c4525677c..424f6c2b04 100644
> --- a/xen/arch/arm/include/asm/coloring.h
> +++ b/xen/arch/arm/include/asm/coloring.h
> @@ -26,6 +26,17 @@
>   #ifdef CONFIG_COLORING
>   #include <xen/sched.h>
>   
> +/*
> + * Amount of memory that we need to map in order to color Xen.  The value
> + * depends on the maximum number of available colors of the hardware.  The
> + * memory size is pessimistically calculated assuming only one color is used,
> + * which means that any pages belonging to any other color has to be skipped.
> + */
> +#define XEN_COLOR_MAP_SIZE \
> +	((((_end - _start) * get_max_colors())\
> +		+ (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1))

This is an open-coded version of ROUNDUP. Looking at the numbers, if we 
assume the maximum number of colors (128) and a Xen image of 2MB, we 
would end up reserving 256MB of memory for Xen.

This sounds like quite a lot to me. This might be acceptable for a first 
approach, but I am wondering if there is a way to reduce the size?
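
The arithmetic can be checked with a small standalone sketch. The
ROUNDUP() definition below is the usual power-of-two form, the 2MB value
for XEN_PADDR_ALIGN is an assumption for illustration, and
xen_color_map_size() is a hypothetical function standing in for the macro.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'x' up to a multiple of 'a'; 'a' must be a power of two. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

#define MB(n)            ((uint64_t)(n) << 20)
#define XEN_PADDR_ALIGN  MB(2)   /* assumed value, for illustration only */

/* Worst case: only one color out of 'max_colors' is usable by Xen. */
static uint64_t xen_color_map_size(uint64_t xen_size, unsigned int max_colors)
{
    assert(max_colors > 0);

    return ROUNDUP(xen_size * max_colors, XEN_PADDR_ALIGN);
}
```

With a 2MB image and 128 colors this gives 256MB, i.e. the reservation
grows linearly with the number of hardware colors.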

> +#define XEN_COLOR_MAP_SIZE_M (XEN_COLOR_MAP_SIZE >> 20)
> +
>   bool __init coloring_init(void);
>   
>   /*
> @@ -67,6 +78,8 @@ unsigned long color_from_page(struct page_info *pg);
>   /* Return the maximum available number of colors supported by the hardware */
>   uint32_t get_max_colors(void);
>   #else /* !CONFIG_COLORING */
> +#define XEN_COLOR_MAP_SIZE (_end - _start)
> +
>   static inline bool __init coloring_init(void)
>   {
>       return true;
> diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
> index 041ec4ee70..1422091436 100644
> --- a/xen/arch/arm/include/asm/mm.h
> +++ b/xen/arch/arm/include/asm/mm.h
> @@ -362,6 +362,13 @@ void clear_and_clean_page(struct page_info *page);
>   
>   unsigned int arch_get_dma_bitsize(void);
>   
> +#ifdef CONFIG_COLORING
> +#define virt_boot_xen(virt)\
> +    (vaddr_t)((virt - XEN_VIRT_START) + BOOT_RELOC_VIRT_START)
> +#else
> +#define virt_boot_xen(virt) virt
> +#endif
> +
>   #endif /*  __ARCH_ARM_MM__ */
>   /*
>    * Local variables:
> diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
> index d69f18b5d2..53ea13641b 100644
> --- a/xen/arch/arm/mm.c
> +++ b/xen/arch/arm/mm.c
> @@ -42,6 +42,7 @@
>   #include <xen/libfdt/libfdt.h>
>   
>   #include <asm/setup.h>
> +#include <asm/coloring.h>
>   
>   /* Override macros from asm/page.h to make them work with mfn_t */
>   #undef virt_to_mfn
> @@ -110,6 +111,9 @@ DEFINE_BOOT_PAGE_TABLE(boot_second_id);
>   DEFINE_BOOT_PAGE_TABLE(boot_third_id);
>   DEFINE_BOOT_PAGE_TABLE(boot_second);
>   DEFINE_BOOT_PAGE_TABLE(boot_third);
> +#ifdef CONFIG_COLORING
> +DEFINE_BOOT_PAGE_TABLE(boot_colored_xen);
> +#endif

DEFINE_BOOT_PAGE_TABLE() should only be used for page-tables that will 
be used before switching to the C world. AFAICT, boot_colored_xen is 
only going to be accessed in the C world, so you should use 
DEFINE_PAGE_TABLE().

>   
>   /* Main runtime page tables */
>   
> @@ -632,6 +636,166 @@ static void clear_table(void *table)
>       clean_and_invalidate_dcache_va_range(table, PAGE_SIZE);
>   }
>   
> +#ifdef CONFIG_COLORING
> +/*
> + * Translate a Xen (.text) virtual address to the colored physical one
> + * depending on the hypervisor configuration.
> + * N.B: this function must be used only when migrating from non colored to
> + * colored pagetables since it assumes to have the temporary mappings created
> + * during setup_pagetables that starts from BOOT_RELOC_VIRT_START.
> + * After the migration we have to use virt_to_maddr.
> + */
> +static paddr_t virt_to_maddr_colored(vaddr_t virt)
> +{
> +    unsigned int va_offset;
> +
> +    va_offset = virt - XEN_VIRT_START;
> +    return __pa(BOOT_RELOC_VIRT_START + va_offset);
> +}
> +
> +static void __init coloring_temp_mappings(paddr_t xen_paddr, vaddr_t virt_start)

This code is making some assumptions about virt_start (e.g. it has to be 
in the first 512MB of the address space). Given that the address is 
always the same, I think it would be better to drop the parameter and 
hardcode the value.

> +{
> +    int i;

unsigned int please.

> +    lpae_t pte;
> +    unsigned int xen_text_size = (_end - _start);
> +
> +    xen_text_size = PAGE_ALIGN(xen_text_size);

None of the memory between _end and the next page can really be used for 
other purposes. So I think it would be better to make sure that _end is 
always page aligned.

To do that, you would need to modify xen.lds.S.

> +
> +    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_second)), MT_NORMAL);

maddr_to_mfn(__pa()) is an open-coded version of virt_to_mfn().

> +    pte.pt.table = 1;
> +    boot_first[first_table_offset(virt_start)] = pte;
> +
> +    pte = mfn_to_xen_entry(maddr_to_mfn(__pa(boot_colored_xen)), MT_NORMAL);
> +    pte.pt.table = 1;
> +    boot_second[second_table_offset(virt_start)] = pte;

This is live page-table. So you want to use write_pte(). Also, I would 
link the page table *after* boot_colored_xen has been fully populated.

If you don't do that, you would need to use write_pte() as well below.

> +
> +    for ( i = 0; i < (xen_text_size/PAGE_SIZE); i++ )

Coding style: missing space before and after '/'.

> +    {
> +        mfn_t mfn;

Coding style: we usually add a blank line after the declarations.

That said, this variable seems to be a bit pointless as you use it only 
once below.

> +        xen_paddr = next_xen_colored(xen_paddr);
> +        mfn = maddr_to_mfn(xen_paddr);
> +        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
> +        pte.pt.table = 1; /* 4k mappings always have this bit set */

For new code, I would prefer that we avoid mentioning a particular page 
size and instead refer to the level at which the mapping happens. In this 
case, it would be a level 3 mapping.

> +        boot_colored_xen[i] = pte;
> +        xen_paddr += PAGE_SIZE;
> +    }
> +
> +    flush_xen_tlb_local();

The flush is not necessary as the entry in boot_second should be freed. 
If it were not, then you would have to use the break-before-make 
sequence (i.e. remove the entry, flush, add the entry) to avoid any issue.

> +}
> +
> +/*
> + * Boot-time pagetable setup with coloring support
> + * Changes here may need matching changes in head.S

I am not overly happy to see setup_pagetables() completely duplicated. I 
hope we can drop the non-coloring version at some point, but in the 
meantime I think we should try to adapt the existing one.

This would also help to review/maintain the code.

> + *
> + * The process can be explained as follows:
> + * - Create a temporary colored mapping that conforms to Xen color selection.
> + * - Update all the pagetables links that point to the next level table(s):
> + * this process is crucial beacause the translation tables are not physically
> + * contiguous and we cannot calculate the physical addresses by using the
> + * standard method (physical offset). In order to get the correct physical
> + * address we use virt_to_maddr_colored that translates the virtual address
> + * into a physical one based on the Xen coloring configuration.
> + * - Copy Xen to the new location.
> + * - Update TTBR0_EL2 with the new root page table address.
> + */
> +void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> +{
> +    int i;
> +    lpae_t pte, *p;
> +    paddr_t pt_phys;
> +    mfn_t pt_phys_mfn;
> +    paddr_t _xen_paddr = xen_paddr;
> +
> +    phys_offset = boot_phys_offset;
> +
> +    ASSERT((_end - _start) < SECOND_SIZE);

_end and _start are not going to change. So I think this should be an 
ASSERT as part of the linker script.

[...]

> +    /* ... DTB */
> +    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START)];
> +    xen_second[second_table_offset(BOOT_FDT_VIRT_START)] = pte;
> +    pte = boot_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)];
> +    xen_second[second_table_offset(BOOT_FDT_VIRT_START + SZ_2M)] = pte;

The DTB will not be used right after the switch_ttbr(). So I would 
prefer if we call early_fdt_map() again after the switch.

> +
> +    /* Update the value of init_ttbr */
> +    init_ttbr = virt_to_maddr_colored((vaddr_t)xen_pgtable);
> +    clean_dcache(init_ttbr);

I don't much like the idea of setting init_ttbr in a different place for 
coloring. See more below.

> +
> +    /* Copy Xen to the new location */
> +    memcpy((void*)BOOT_RELOC_VIRT_START,
> +        (const void*)XEN_VIRT_START, (_end - _start));
> +    clean_dcache_va_range((void*)BOOT_RELOC_VIRT_START, (_end - _start));
> +
> +    /* Change ttbr */
> +    switch_ttbr(init_ttbr);
> +
> +    /*
> +     * Keep mapped old Xen memory in a contiguous mapping
> +     * for other cpus to boot. This mapping will also replace the
> +     * one created at the beginning of setup_pagetables.
> +     */

AFAICT, the secondary CPUs will never run using the virtual address 
BOOT_RELOC_VIRT_START. You only seem to use it so you can conveniently 
translate the address using the virt_to_* helpers.

> +    create_mappings(xen_second, BOOT_RELOC_VIRT_START,
> +                paddr_to_pfn(XEN_VIRT_START + phys_offset),
> +                SZ_2M >> PAGE_SHIFT, SZ_2M);

This call will create a 2MB mapping. We only guarantee that Xen will be 
4KB aligned.

In addition, we don't know what will be in memory right after Xen 
(assuming it is less than 2MB). This may be device memory and could 
result in weird issues.
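
As a rough illustration of the exposure, the sketch below computes how
many bytes outside the Xen image end up covered when a 4KB-aligned image
is mapped with whole 2MB blocks. It assumes the mapping is widened to the
enclosing 2MB-aligned range, which is not necessarily what
create_mappings() does with an unaligned base, and extra_bytes_mapped() is
a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Round 'x' up to a multiple of 'a'; 'a' must be a power of two. */
#define ROUNDUP(x, a) (((x) + (a) - 1) & ~((a) - 1))

#define SZ_2M (UINT64_C(2) << 20)

/*
 * Bytes covered by the enclosing 2MB-aligned range that do not belong to
 * the image located at [xen_paddr, xen_paddr + xen_size).
 */
static uint64_t extra_bytes_mapped(uint64_t xen_paddr, uint64_t xen_size)
{
    uint64_t start = xen_paddr & ~(SZ_2M - 1);
    uint64_t end = ROUNDUP(xen_paddr + xen_size, SZ_2M);

    assert(xen_size > 0);

    return (end - start) - xen_size;
}
```

For example, a 1MB image loaded 512KB into a 2MB block would leave 1MB of
unrelated memory reachable through the mapping.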

But, as I wrote above, the virtual mapping doesn't look to be used by 
the secondary CPUs. So I would prefer that we don't introduce that 
extra mapping and instead map/unmap it temporarily if CPU0 needs to 
access it.

> +
> +    xen_pt_enforce_wnx();
> +}
> +#else
>   /* Boot-time pagetable setup.
>    * Changes here may need matching changes in head.S */
>   void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
> @@ -721,6 +885,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset, paddr_t xen_paddr)
>       per_cpu(xen_dommap, 0) = cpu0_dommap;
>   #endif
>   }
> +#endif /* !CONFIG_COLORING */
>   
>   static void clear_boot_pagetables(void)
>   {
> @@ -735,6 +900,9 @@ static void clear_boot_pagetables(void)
>   #endif
>       clear_table(boot_second);
>       clear_table(boot_third);
> +#ifdef CONFIG_COLORING
> +    clear_table(boot_colored_xen);
> +#endif

The boot_colored_xen is going to be touched by the secondary CPUs. So 
this doesn't need to be cleared.

>   }
>   
>   #ifdef CONFIG_ARM_64
> @@ -742,10 +910,16 @@ int init_secondary_pagetables(int cpu)
>   {
>       clear_boot_pagetables();
>   
> +    /*
> +     * For coloring the value of the ttbr was already set up during
> +     * setup_pagetables.
> +     */
> +#ifndef CONFIG_COLORING

This is not necessary if...

>       /* Set init_ttbr for this CPU coming up. All CPus share a single setof
>        * pagetables, but rewrite it each time for consistency with 32 bit. */
>       init_ttbr = (uintptr_t) xen_pgtable + phys_offset;

... you use virt_to_maddr() here.

>       clean_dcache(init_ttbr);
> +#endif
>       return 0;
>   }
>   #else
> @@ -859,12 +1033,20 @@ void __init setup_xenheap_mappings(unsigned long base_mfn,
>           else if ( xenheap_first_first_slot == -1)
>           {
>               /* Use xenheap_first_first to bootstrap the mappings */
> -            first = xenheap_first_first;
> +            paddr_t phys_addr;
> +
> +            /*
> +             * At this stage is safe to use virt_to_maddr because Xen mapping
> +             * is already in place. Using virt_to_maddr allows us to unify
> +             * codepath with and without cache coloring enabled.
> +             */
> +            phys_addr = virt_to_maddr((vaddr_t)xenheap_first_first);
> +            pte = mfn_to_xen_entry(maddr_to_mfn(phys_addr),MT_NORMAL);
>   
> -            pte = pte_of_xenaddr((vaddr_t)xenheap_first_first);
>               pte.pt.table = 1;
>               write_pte(p, pte);
>   
> +            first = xenheap_first_first;
>               xenheap_first_first_slot = slot;
>           }
>           else
> diff --git a/xen/arch/arm/psci.c b/xen/arch/arm/psci.c
> index 0c90c2305c..d443fac6a2 100644
> --- a/xen/arch/arm/psci.c
> +++ b/xen/arch/arm/psci.c
> @@ -25,6 +25,7 @@
>   #include <asm/cpufeature.h>
>   #include <asm/psci.h>
>   #include <asm/acpi.h>
> +#include <asm/coloring.h>
>   
>   /*
>    * While a 64-bit OS can make calls with SMC32 calling conventions, for
> @@ -49,7 +50,8 @@ int call_psci_cpu_on(int cpu)
>   {
>       struct arm_smccc_res res;
>   
> -    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu), __pa(init_secondary),
> +    arm_smccc_smc(psci_cpu_on_nr, cpu_logical_map(cpu),
> +                  __pa(virt_boot_xen((vaddr_t)init_secondary)),
>                     &res);
>   
>       return PSCI_RET(res);
> diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
> index 13b10515a8..294b806120 100644
> --- a/xen/arch/arm/setup.c
> +++ b/xen/arch/arm/setup.c
> @@ -924,6 +924,7 @@ void __init start_xen(unsigned long boot_phys_offset,
>       struct domain *d;
>       int rc;
>       paddr_t xen_paddr = (paddr_t)(_start + boot_phys_offset);
> +    uint32_t xen_size = (_end - _start);

You are setting the size here but...

>   
>       dcache_line_bytes = read_dcache_line_bytes();
>   
> @@ -952,13 +953,16 @@ void __init start_xen(unsigned long boot_phys_offset,
>       if ( !coloring_init() )
>           panic("Xen Coloring support: setup failed\n");
>   
> +    xen_size = XEN_COLOR_MAP_SIZE;

... overwrite it before using it. Shouldn't this be in the '#ifdef' below?

> +#ifdef CONFIG_COLORING
> +    xen_paddr = get_xen_paddr(xen_size);
> +#endif
> +
>       /* Register Xen's load address as a boot module. */
> -    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr,
> -                             (paddr_t)(uintptr_t)(_end - _start + 1), false);
> +    xen_bootmodule = add_boot_module(BOOTMOD_XEN, xen_paddr, xen_size, false);
>       BUG_ON(!xen_bootmodule);
>   
>       setup_pagetables(boot_phys_offset, xen_paddr);
> -

Spurious change.

>       setup_mm();
>   
>       /* Parse the ACPI tables for possible boot-time configuration */
> @@ -1072,6 +1076,17 @@ void __init start_xen(unsigned long boot_phys_offset,
>   
>       setup_virt_paging();
>   
> +    /*
> +     * This removal is useful if cache coloring is enabled but
> +     * it should not affect non coloring configuration.
> +     * The removal is done earlier than discard_initial_modules
> +     * beacuse in do_initcalls there is the livepatch support
> +     * setup which uses the virtual addresses starting from
> +     * BOOT_RELOC_VIRT_START.
> +     * Remove coloring mappings to expose a clear state to the
> +     * livepatch module.
> +     */
> +    remove_early_mappings(BOOT_RELOC_VIRT_START, SZ_2M);
>       do_initcalls();
>   
>       /*
> diff --git a/xen/arch/arm/smpboot.c b/xen/arch/arm/smpboot.c
> index 7bfd0a73a7..5ef68976c9 100644
> --- a/xen/arch/arm/smpboot.c
> +++ b/xen/arch/arm/smpboot.c
> @@ -438,6 +438,7 @@ int __cpu_up(unsigned int cpu)
>   {
>       int rc;
>       s_time_t deadline;
> +    vaddr_t *smp_up_cpu_addr;
>   
>       printk("Bringing up CPU%d\n", cpu);
>   
> @@ -453,10 +454,22 @@ int __cpu_up(unsigned int cpu)
>       /* Tell the remote CPU what its logical CPU ID is. */
>       init_data.cpuid = cpu;
>   
> +    /*
> +     * If coloring is enabled, non-Master CPUs boot using the old Xen code.
> +     * During the boot process each cpu is booted one after another using the
> +     * smp_cpu_cpu variable. This variable is accessed in head.S using its
> +     * physical address.
> +     * That address is calculated using the physical offset of the old Xen
> +     * code. With coloring we can not rely anymore on that offset. For this
> +     * reason in order to boot the other cpus we rely on the old xen code that
> +     * was mapped during tables setup in mm.c so that we can use the old physical
> +     * offset and the old head.S code also. In order to modify the old Xen code
> +     * we need to access it using the mapped done in color_xen.
> +     */
> +    smp_up_cpu_addr = (vaddr_t *)virt_boot_xen((vaddr_t)&smp_up_cpu);

smp_up_cpu is defined as unsigned long. So shouldn't the cast be 
(unsigned long *)?

> +    *smp_up_cpu_addr = cpu_logical_map(cpu);

Why is this line moved before the comment: "/* Open ... */"?

>       /* Open the gate for this CPU */
> -    smp_up_cpu = cpu_logical_map(cpu);
> -    clean_dcache(smp_up_cpu);
> -
> +    clean_dcache(*smp_up_cpu_addr);
>       rc = arch_cpu_up(cpu);
>   
>       console_end_sync();

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 33/36] doc, xen-command-line: introduce coloring options
  2022-03-04 17:46 ` [PATCH 33/36] doc, xen-command-line: introduce coloring options Marco Solieri
  2022-03-07  7:42   ` Jan Beulich
@ 2022-03-14 22:07   ` Julien Grall
  1 sibling, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 22:07 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:46, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Four additional parameters in the Xen command line are used to define
> the underlying coloring policy, which is not directly configurable
> otherwise.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   docs/misc/xen-command-line.pandoc | 51 +++++++++++++++++++++++++++++--
>   1 file changed, 49 insertions(+), 2 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index efda335652..a472d51cf9 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -299,6 +299,20 @@ can be maintained with the pv-shim mechanism.
>       cause Xen not to use Indirect Branch Tracking even when support is
>       available in hardware.
>   
> +### buddy\_size (arm64)
> +> `= <size in megabyte>`
> +
> +> Default: `64 MB`
> +
> +Amount of memory reserved for the buddy allocator when colored allocator is
> +active. This options is useful only if coloring support is enabled.
> +The colored allocator is meant as an alternative to the buddy allocator,
> +since its allocation policy is by definition incompatible with the
> +generic one. Since the Xen heap systems is not colored yet, we need to
> +support the coexistence of the two allocators for now. This parameter, which is
> +optional and for expert only, is used to set the amount of memory reserved to
> +the buddy allocator.

A few questions:
   - How did you choose the default?
   - How can a user decide on the value of buddy_size?

> +
>   ### clocksource (x86)
>   > `= pit | hpet | acpi | tsc`
>   
> @@ -884,7 +898,17 @@ Controls for the dom0 IOMMU setup.
>   
>       Incorrect use of this option may result in a malfunctioning system.
>   
> -### dom0_ioports_disable (x86)

This change sounds unrelated to the patch itself. I would also expect 
that we would want to backport it. So can this be backported?

> +### dom0\_colors (arm64)
> +> `= List of <integer>-<integer>`
> +
> +> Default: `All available colors`
> +
> +Specify dom0 color configuration. If the parameter is not set, all available
> +colors are chosen and the user is warned on Xen's serial console. This color
> +configuration acts also as the default one for all DomUs that do not have any
> +explicit color assignment in their configuration file.
> +
> +### dom0\_ioports\_disable (x86)
>   > `= List of <hex>-<hex>`
>   
>   Specify a list of IO ports to be excluded from dom0 access.
> @@ -2625,6 +2649,20 @@ unknown NMIs will still be processed.
>   Set the NMI watchdog timeout in seconds.  Specifying `0` will turn off
>   the watchdog.
>   
> +### way\_size (arm64)
> +> `= <size in byte>`
> +
> +> Default: `Obtained from the hardware`
> +
> +Specify the way size of the Last Level Cache. This parameter is only useful with
> +coloring support enabled. It is an optional, expert-only parameter and it is
> +used to calculate what bits in the physical address can be used by the coloring
> +algorithm, and thus the maximum available colors on the platform. It can be
> +obtained by dividing the total LLC size by the number of associativity ways.
> +By default, the value is also automatically computed during coloring
> +initialization to avoid any kind of misconfiguration. For this reason, it is
> +highly recommended to use this boot argument with specific needs only.

Given the last two sentences, why would someone want to use it?

> +
>   ### x2apic (x86)
>   > `= <boolean>`
>   
> @@ -2642,7 +2680,16 @@ In the case that x2apic is in use, this option switches between physical and
>   clustered mode.  The default, given no hint from the **FADT**, is cluster
>   mode.
>   
> -### xenheap_megabytes (arm32)

Same here.

> +### xen\_colors (arm64)
> +> `= List of <integer>-<integer>`
> +
> +> Default: `0-0: the lowermost color`
> +
> +Specify Xen color configuration.
> +Two colors are most likely needed on platforms where private caches are
> +physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.

How can someone decide the number of colors to be used for Xen?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 35/36] doc, device-tree: introduce 'colors' property
  2022-03-04 17:47 ` [PATCH 35/36] doc, device-tree: introduce 'colors' property Marco Solieri
@ 2022-03-14 22:17   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-14 22:17 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi,

On 04/03/2022 17:47, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Dom0less uses device tree for DomUs when booting them without using
> Dom0. Add a new device tree property 'colors' that specifies the
> coloring configuration for DomUs when using Dom0less.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>

The documentation is small enough that I would prefer it to be folded 
into the patch parsing the property.

> ---
>   docs/misc/arm/device-tree/booting.txt | 3 +++
>   1 file changed, 3 insertions(+)
> 
> diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
> index a94125394e..44971bfe60 100644
> --- a/docs/misc/arm/device-tree/booting.txt
> +++ b/docs/misc/arm/device-tree/booting.txt
> @@ -162,6 +162,9 @@ with the following properties:
>   
>       An integer specifying the number of vcpus to allocate to the guest.
>   
> +- colors
> +    A 64 bit bitmask specifying the color configuration for the guest.

Why are we limiting dom0less domUs to 64 colors when Xen can support up 
to 128 colors (potentially more in the future)?

To avoid tying the bindings too closely to Xen, I would instead explicitly 
list the colors. Something like:

colors = < 10 20 30 >

This would also help users that manually write the DT to confirm they 
put the correct numbers.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 36/36] doc, arm: add usage documentation for cache coloring support
  2022-03-04 17:47 ` [PATCH 36/36] doc, arm: add usage documentation for cache coloring support Marco Solieri
@ 2022-03-15 19:23   ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-15 19:23 UTC (permalink / raw)
  To: Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio

Hi Marco,

On 04/03/2022 17:47, Marco Solieri wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Add basic documentation that shows how cache coloring support can be
> used in Xen. It introduces the basic concepts behind cache coloring,
> defines the cache selection format, and explains how to assign colors to
> the supported domains: Dom0, DomUs and Xen itself. Known issues are
> also reported.
> 
> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
>   docs/misc/arm/cache_coloring.rst | 191 +++++++++++++++++++++++++++++++
>   1 file changed, 191 insertions(+)
>   create mode 100644 docs/misc/arm/cache_coloring.rst
> 
> diff --git a/docs/misc/arm/cache_coloring.rst b/docs/misc/arm/cache_coloring.rst
> new file mode 100644
> index 0000000000..082afb1b6c
> --- /dev/null
> +++ b/docs/misc/arm/cache_coloring.rst
> @@ -0,0 +1,191 @@
> +Xen coloring support user's guide
> +=================================
> +
> +The cache coloring support in Xen allows to reserve last level cache partition

AFAICT, the code is assuming that the last level cache is L3. However, 
this documentation looks generic enough that someone could think it can 
be used on any platform.

> +for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.

What is missing to support arm32?

> +
> +In order to enable and use it, few steps are needed.
> +
> +- Enable coloring in XEN configuration file.
> +
> +        CONFIG_COLORING=y
> +
> +- Enable/disable debug information (optional).
> +
> +        CONFIG_COLORING_DEBUG=y/n

This option doesn't seem to exist.

> +
> +Before digging into configuration instructions, configurers should first

I would write "system integrator" rather than "configurers".

> +understand the basics of cache coloring.

I read this as a suggestion to read external documentation. Do you have 
a good pointer?

> +
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> +to each core (hence using multiple cache units), while the last level is shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g .hosting another DomU). Cache coloring allows eliminating this
> +mutual interference, and thus guaranteeing higher and more predictable
> +performances for memory accesses.
> +The key concept underlying cache coloring is a fragmentation of the memory
> +space into a set of sub-spaces called colors that are mapped to disjoint cache
> +partitions. Technically, the whole memory space is first divided into a number
> +of subsequent regions. Then each region is in turn divided into a number of
> +subsequent sub-colors. The generic i-th color is then obtained by all the
> +i-th sub-colors in each region.
> +
> +.. raw:: html
> +
> +    <pre>
> +                            Region j            Region j+1
> +                .....................   ............
> +                .                     . .
> +                .                       .
> +            _ _ _______________ _ _____________________ _ _
> +                |     |     |     |     |     |     |
> +                | c_0 | c_1 |     | c_n | c_0 | c_1 |
> +           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
> +                    :                       :
> +                    :                       :...         ... .
> +                    :                            color 0
> +                    :...........................         ... .
> +                                                :
> +          . . ..................................:
> +    </pre>
> +
> +There are two pragmatic lesson to be learnt.
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors needs to be used for their memory.
> +
> +2. Color assignment must privilege contiguity in the partitioning. E.g.,
> +   assigning colors (0,1) to domain I  and (2,3) to domain  J is better than
> +   assigning colors (0,2) to I and (1,3) to J.
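
As an aside for readers of this section: the region/sub-color layout 
described above boils down to simple address arithmetic. Below is a 
minimal sketch in plain C, assuming that a region is one LLC way in size 
and that way_size and page_size are powers of two. The function name is 
made up for illustration and is not part of the series.

```c
#include <stdint.h>

/*
 * Illustrative only, not taken from the patch series.
 * Assumes way_size >= page_size and both are non-zero powers of two.
 */
static unsigned int sketch_addr_to_color(uint64_t paddr, uint64_t way_size,
                                         uint64_t page_size)
{
    /* Offset of the address inside its region (one way-sized chunk)... */
    uint64_t offset_in_region = paddr % way_size;

    /* ...then the index of the page inside that region is the color. */
    return (unsigned int)(offset_in_region / page_size);
}
```

With way_size = 64 KiB and 4 KiB pages, physical addresses 0x0 and 
0x10000 both land in color 0, while 0x1000 lands in color 1.
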
> +
> +
> +Color(s) selection format
> +**************************
> +
> +Regardless of the domain that has to be colored (Dom0, DomUs and Xen),

Xen is not really a domain. How about 'memory pool'?

> +the color selection can be expressed using the same syntax.  In particular,

Here you are saying the syntax is the same for everyone. But below, you 
provide a new syntax for dom0less domUs.

> +the latter is expressed as a comma-separated list of hyphen-separated intervals
> +of color numbers, as in `0-4,5-8,10-15`.  Ranges are always represented using
> +strings. Note that no spaces are allowed.
> +
> +The number of available colors depends on the LLC layout of the specific
> +platform and determines the maximum allowed value.  This number can be either
> +calculated [#f1]_ or read from the output given by the hypervisor during boot,
> +if DEBUG logging is enabled.

I think it would be good to print the number of colors even in non-debug 
build.

> +
> +Examples:
> +
> ++---------------------+-----------------------------------+
> +|**Configuration**    |**Actual selection**               |
> ++---------------------+-----------------------------------+
> +|  1-2,5-8            | [1, 2, 5, 6, 7, 8]                |
> ++---------------------+-----------------------------------+
> +|  0-8,3-8            | [0, 1, 2, 3, 4, 5, 6, 7, 8]       |
> ++---------------------+-----------------------------------+
> +|  0-0                | [0]                               |
> ++---------------------+-----------------------------------+

0-0 is a bit odd to write. I would consider allowing a system integrator 
to simply write '0'.
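
To make the suggestion concrete, here is a rough sketch of a parser that 
accepts both a bare color ('0') and a range ('0-4'). It is standalone C 
using strtoul() rather than Xen's helpers, the function name and the 
SKETCH_MAX_COLORS constant are hypothetical, and the limit of 128 is 
only borrowed from the commit message of patch 04.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_MAX_COLORS 128U /* hypothetical limit for this sketch */

/*
 * Parse "1-2,5-8" or "0" into selected[]. Returns the number of distinct
 * colors, or -EINVAL on malformed input. No spaces are allowed.
 */
static int sketch_parse_colors(const char *s, bool selected[SKETCH_MAX_COLORS])
{
    int count = 0;

    for ( unsigned int i = 0; i < SKETCH_MAX_COLORS; i++ )
        selected[i] = false;

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop;

        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = strtoul(s, &end, 10);
        stop = start;               /* a bare "N" means the range N-N */
        s = end;

        if ( *s == '-' )
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            stop = strtoul(s, &end, 10);
            s = end;
        }

        /* Reject reversed ranges and colors beyond the limit. */
        if ( start > stop || stop >= SKETCH_MAX_COLORS )
            return -EINVAL;

        for ( unsigned long c = start; c <= stop; c++ )
        {
            if ( !selected[c] )
                count++;
            selected[c] = true;
        }

        if ( *s == ',' )
        {
            s++;
            if ( *s == '\0' )       /* trailing comma */
                return -EINVAL;
        }
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return count;
}
```

Overlapping ranges such as '0-8,3-8' are counted once per color, which 
matches the second row of the table above.
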

> +
> +General coloring parameters
> +***************************
> +
> +Four additional parameters in the Xen command line are used to define the
> +underlying coloring policy, which is not directly configurable otherwise.
> +
> +Please refer to the relative documentation in docs/man/xl.cfg.pod.5.in.
> +
> +Dom0less support
> +****************
> +Support for the Dom0less experimental features is provided. Color selection for

I don't understand the first sentence. Are you saying dom0less domUs 
support is experimental or the support for coloring dom0less domUs is 
experimental?

> +a virtual machine is defined by the attribute `colors`, whose format is not a
> +string for ranges list, but a bitmask. It suffices to set all and only the bits
> +having a position equal to the chosen colors, leaving unset all the others. For
> +example, if we choose 8 colors out of 16, we can use a bitmask with 8 bits set
> +and 8 bit unset, like:
> +

[...]

> +Known issues
> +************
> +
> +Explicitly define way_size in QEMU
> +##################################
> +
> +Currently, QEMU does not have a comprehensive cache model, so the cache coloring
> +support fails to detect a cache geometry where to operate. In this case, the
> +boot hangs as soon as the Xen image is loaded. To overcome this issue, it is
> +enough to specify the way_size parameter in the command line. Any multiple
> +greater than 1 of the page size allows the coloring mechanism to work, but the
> +precise behavior on the system that QEMU is emulating can be obtained with its
> +way_size. For instance, set way_size=65536.

Can we consider fixing QEMU?

> +
> +
> +Fail to boot colored DomUs with large memory size
> +#################################################
> +
> +If the kernel used for Dom0 does not contain the upstream commit

Dom0 is technically not tied to Linux. So please be explicit and write 
"Linux kernel".

> +3941552aec1e04d63999988a057ae09a1c56ebeb and uses the hypercall buffer device,
> +colored DomUs with memory size larger then 127 MB cannot be created. This is
> +caused by the default limit of this buffer of 64 pages. The solution is to
> +manually apply the above patch, or to check if there is an updated version of
> +the kernel in use for Dom0 that contains this change.

I don't understand how coloring comes into the equation here. Can you 
provide more details?

> +
> +Notes:
> +******
> +
> +.. [#f1] To compute the number of available colors on a platform, one can simply
> +  divide `way_size` by `page_size`, where: `page_size` is the size of the page
> +  used on the system (usually 4 KiB);

It is fairly common for a CPU to support multiple page granularities 
(i.e. 4KB, 16KB, 64KB). The Arm architecture allows each level to use 
a different granularity.

For instance, dom0 may use 64KB page granularity, domU 4KB. Xen will 
always use 4KB for now.

So can you clarify what you mean by "page used on the system"? Is it Xen 
page granularity?
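
For reference, the arithmetic in the footnote can be sketched as below. 
This is plain C for illustration only; the function name is hypothetical 
and page_size is left as a parameter precisely because the answer 
depends on which granularity is meant.

```c
#include <stdint.h>

/*
 * Number of colors = way_size / page_size, where
 * way_size = total LLC size / number of associativity ways.
 * Returns 0 for nonsensical input.
 */
static unsigned int sketch_num_colors(uint64_t llc_size, unsigned int ways,
                                      uint64_t page_size)
{
    uint64_t way_size;

    if ( ways == 0 || page_size == 0 )
        return 0;

    way_size = llc_size / ways;

    return (unsigned int)(way_size / page_size);
}
```

For the 16-way 1 MiB LLC quoted in the footnote, this gives 16 colors 
with 4 KiB pages but only 1 color with 64 KiB pages.
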


>  `way_size` is size of each LLC way.  For
> +  example, an Arm Cortex-A53 with a 16-ways associative 1 MiB LLC enable 16
> +  colors, when pages are 4 KiB.
> +
> +

NIT: One newline should be sufficient here.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-03-09 19:09   ` Julien Grall
@ 2022-03-22  9:17     ` Luca Miccio
  2022-03-23 19:02       ` Julien Grall
  2022-05-13 14:22     ` Carlo Nonato
  1 sibling, 1 reply; 79+ messages in thread
From: Luca Miccio @ 2022-03-22  9:17 UTC (permalink / raw)
  To: Julien Grall
  Cc: Marco Solieri, xen-devel, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Stefano Stabellini


Hi Julien,


On Wed, 9 Mar 2022 at 20:09, Julien Grall <julien@xen.org> wrote:

> Hi,
>
> On 04/03/2022 17:46, Marco Solieri wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> >
> > Add three new bootargs allowing configuration of cache coloring support
> > for Xen:
>
> I would prefer if documentation of each command line is part of the
> patch introducing them. This would help understanding some of the
> parameters.
>
>
Ok.


> > - way_size: The size of a LLC way in bytes. This value is mainly used
> >    to calculate the maximum available colors on the platform.
>
> We should only add a command line option when there is a strong use case.
> In documentation, you wrote that someone may want to overwrite the way
> size for "specific needs".
>
> Can you explain what would be those needs?

> - dom0_colors: The coloring configuration for Dom0, which also acts as
> >    default configuration for any DomU without an explicit configuration.
> > - xen_colors: The coloring configuration for the Xen hypervisor itself.
> >
> > A cache coloring configuration consists of a selection of colors to be
> > assigned to a VM or to the hypervisor. It is represented by a set of
> > ranges. Add a common function that parses a string with a
> > comma-separated set of hyphen-separated ranges like "0-7,15-16" and
> > returns both: the number of chosen colors, and an array containing their
> > ids.
> > Currently we support platforms with up to 128 colors.
>
> Is there any reason this value is hardcoded in Xen rather than part of
> the Kconfig?
>
>
Not really, at the time this patch was created. But as we note in
patch 32, there is an assert that fails if we use a certain amount of colors.
Maybe we should find a better way to store the color information.

Luca.

> >
> > Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
> > Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> > Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
> > ---
> >   xen/arch/arm/Kconfig                |   5 ++
> >   xen/arch/arm/Makefile               |   2 +-
> >   xen/arch/arm/coloring.c             | 131 ++++++++++++++++++++++++++++
> >   xen/arch/arm/include/asm/coloring.h |  28 ++++++
> >   4 files changed, 165 insertions(+), 1 deletion(-)
> >   create mode 100644 xen/arch/arm/coloring.c
> >   create mode 100644 xen/arch/arm/include/asm/coloring.h
> >
> > diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
> > index ecfa6822e4..f0f999d172 100644
> > --- a/xen/arch/arm/Kconfig
> > +++ b/xen/arch/arm/Kconfig
> > @@ -97,6 +97,11 @@ config HARDEN_BRANCH_PREDICTOR
> >
> >         If unsure, say Y.
> >
> > +config COLORING
> > +     bool "L2 cache coloring"
> > +     default n
>
> This wants to be gated with EXPERT for time-being. SUPPORT.MD woudl
> Furthermore, I think this wants to be gated with EXPERT for the time-being.
>
> > +     depends on ARM_64
>
> Why is this limited to arm64?
>
> > +
> >   config TEE
> >       bool "Enable TEE mediators support (UNSUPPORTED)" if UNSUPPORTED
> >       default n
> > diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
> > index c993ce72a3..581896a528 100644
> > --- a/xen/arch/arm/Makefile
> > +++ b/xen/arch/arm/Makefile
> > @@ -66,7 +66,7 @@ obj-$(CONFIG_SBSA_VUART_CONSOLE) += vpl011.o
> >   obj-y += vsmc.o
> >   obj-y += vpsci.o
> >   obj-y += vuart.o
> > -
> > +obj-$(CONFIG_COLORING) += coloring.o
>
> Please keep the newline before extra-y. The file are meant to be ordered
> alphabetically. So this should be inserted in the correct position.
>
> >   extra-y += xen.lds
> >
> >   #obj-bin-y += ....o
> > diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
> > new file mode 100644
> > index 0000000000..8f1cff6efb
> > --- /dev/null
> > +++ b/xen/arch/arm/coloring.c
> > @@ -0,0 +1,131 @@
> > +/*
> > + * xen/arch/arm/coloring.c
> > + *
> > + * Coloring support for ARM
> > + *
> > + * Copyright (C) 2019 Xilinx Inc.
> > + *
> > + * Authors:
> > + *    Luca Miccio <lucmiccio@gmail.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +#include <xen/init.h>
> > +#include <xen/types.h>
> > +#include <xen/lib.h>
> > +#include <xen/errno.h>
> > +#include <xen/param.h>
> > +#include <asm/coloring.h>
>
> The includes should be ordered so <xen/...> are first, then <asm/...>.
> They are also ordered alphabetically within their own category.
>
> > +
> > +/* Number of color(s) assigned to Xen */
> > +static uint32_t xen_col_num;
> > +/* Coloring configuration of Xen as bitmask */
> > +static uint32_t xen_col_mask[MAX_COLORS_CELLS];
> Xen provides helpers to create and use bitmaps (see
> include/xen/bitmap.h). Can you use them?
>
> > +
> > +/* Number of color(s) assigned to Dom0 */
> > +static uint32_t dom0_col_num;
> > +/* Coloring configuration of Dom0 as bitmask */
> > +static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
> > +
> > +static uint64_t way_size;
> > +
> > +/*************************
> > + * PARSING COLORING BOOTARGS
> > + */
> > +
> > +/*
> > + * Parse the coloring configuration given in the buf string, following the
> > + * syntax below, and store the number of colors and a corresponding mask in
> > + * the last two given pointers.
> > + *
> > + * COLOR_CONFIGURATION ::= RANGE,...,RANGE
> > + * RANGE               ::= COLOR-COLOR
> > + *
> > + * Example: "2-6,15-16" represents the set of colors: 2,3,4,5,6,15,16.
> > + */
> > +static int parse_color_config(
> > +    const char *buf, uint32_t *col_mask, uint32_t *col_num)
>
>
> Coding style. We usually declare parameters on the same line as the
> function name. If they can't fit on the same line, then we split in two
> with the parameter aligned to the first parameter.
>
> > +{
> > +    int start, end, i;
>
> AFAICT, none of the 3 variables will store negative values. So can they
> be unsigned?
>
> > +    const char* s = buf;
> > +    unsigned int offset;
> > +
> > +    if ( !col_mask || !col_num )
> > +        return -EINVAL;
> > +
> > +    *col_num = 0;
> > +    for ( i = 0; i < MAX_COLORS_CELLS; i++ )
> > +        col_mask[i] = 0;
> dom0_col_mask and xen_col_mask are already zeroed. I would also expect
> the same for dynamically allocated bitmask. So can this be dropped?
>
> > +
> > +    while ( *s != '\0' )
> > +    {
> > +        if ( *s != ',' )
> > +        {
> > +            start = simple_strtoul(s, &s, 0);
> > +
> > +            /* Ranges are hyphen-separated */
> > +            if ( *s != '-' )
> > +                goto fail;
> > +            s++;
> > +
> > +            end = simple_strtoul(s, &s, 0);
> > +
> > +            for ( i = start; i <= end; i++ )
> > +            {
> > +                offset = i / 32;
> > +                if ( offset > MAX_COLORS_CELLS )
> > +                    goto fail;
> > +
> > +                if ( !(col_mask[offset] & (1 << i % 32)) )
> > +                    *col_num += 1;
> > +                col_mask[offset] |= (1 << i % 32);
> > +            }
> > +        }
> > +        else
> > +            s++;
> > +    }
> > +
> > +    return *s ? -EINVAL : 0;
> > +fail:
> > +    return -EINVAL;
> > +}
> > +
> > +static int __init parse_way_size(const char *s)
> > +{
> > +    way_size = simple_strtoull(s, &s, 0);
> > +
> > +    return *s ? -EINVAL : 0;
> > +}
> > +custom_param("way_size", parse_way_size);
> > +
> > +static int __init parse_dom0_colors(const char *s)
> > +{
> > +    return parse_color_config(s, dom0_col_mask, &dom0_col_num);
> > +}
> > +custom_param("dom0_colors", parse_dom0_colors);
> > +
> > +static int __init parse_xen_colors(const char *s)
> > +{
> > +    return parse_color_config(s, xen_col_mask, &xen_col_num);
> > +}
> > +custom_param("xen_colors", parse_xen_colors);
> > +
> > +/*
> > + * Local variables:
> > + * mode: C
> > + * c-file-style: "BSD"
> > + * c-basic-offset: 4
> > + * tab-width: 4
> > + * indent-tabs-mode: nil
> > + * End:
> > + */
> > diff --git a/xen/arch/arm/include/asm/coloring.h b/xen/arch/arm/include/asm/coloring.h
> > new file mode 100644
> > index 0000000000..60958d1244
> > --- /dev/null
> > +++ b/xen/arch/arm/include/asm/coloring.h
> > @@ -0,0 +1,28 @@
> > +/*
> > + * xen/arm/include/asm/coloring.h
> > + *
> > + * Coloring support for ARM
> > + *
> > + * Copyright (C) 2019 Xilinx Inc.
> > + *
> > + * Authors:
> > + *    Luca Miccio <lucmiccio@gmail.com>
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +#ifndef __ASM_ARM_COLORING_H__
> > +#define __ASM_ARM_COLORING_H__
> > +
> > +#define MAX_COLORS_CELLS 4
> > +
> > +#endif /* !__ASM_ARM_COLORING_H__ */
>
> Cheers,
>
> --
> Julien Grall
>


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-03-22  9:17     ` Luca Miccio
@ 2022-03-23 19:02       ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-03-23 19:02 UTC (permalink / raw)
  To: Luca Miccio
  Cc: Marco Solieri, xen-devel, Andrew Cooper, George Dunlap,
	Jan Beulich, Stefano Stabellini, Wei Liu, Marco Solieri,
	Andrea Bastoni, Stefano Stabellini



On 22/03/2022 09:17, Luca Miccio wrote:
> Hi Julien,

Hi Luca,

>>> - way_size: The size of a LLC way in bytes. This value is mainly used
>>>     to calculate the maximum available colors on the platform.
>>
>> We should only add a command line option when there is a strong use case.
>> In documentation, you wrote that someone may want to overwrite the way
>> size for "specific needs".
>>
>> Can you explain what would be those needs?
> 
>> - dom0_colors: The coloring configuration for Dom0, which also acts as
>>>     default configuration for any DomU without an explicit configuration.
>>> - xen_colors: The coloring configuration for the Xen hypervisor itself.
>>>
>>> A cache coloring configuration consists of a selection of colors to be
>>> assigned to a VM or to the hypervisor. It is represented by a set of
>>> ranges. Add a common function that parses a string with a
>>> comma-separated set of hyphen-separated ranges like "0-7,15-16" and
>>> returns both: the number of chosen colors, and an array containing their
>>> ids.
>>> Currently we support platforms with up to 128 colors.
>>
>> Is there any reason this value is hardcoded in Xen rather than part of
>> the Kconfig?
>>
>>
> Not really at the time when this patch was created. But as we note in
> patch 32, there is an assert that fails if we use a certain amount of
> colors. Maybe we should find a better way to store the color information.

You could use a bitmap. Xen already provides facilities to use them in 
the public interface (see xenctl_bitmap) and convert the Xen internal 
bitmap (see DECLARE_BITMAP).
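
As a rough illustration of the idea only, here is a minimal standalone
sketch of a color set stored as one bit per color. It does not use Xen's
own bitmap helpers; MAX_COLORS, struct color_bitmap and the color_*()
functions are hypothetical names made up for this example.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical limit, mirroring the 128 colors mentioned in the series. */
#define MAX_COLORS    128U
#define BITS_PER_WORD (sizeof(unsigned long) * CHAR_BIT)
#define COLOR_WORDS   ((MAX_COLORS + BITS_PER_WORD - 1) / BITS_PER_WORD)

/* One bit per color: bit i set means color i is assigned. */
struct color_bitmap {
    unsigned long bits[COLOR_WORDS];
};

/* Mark a color as assigned; returns false if the id is out of range. */
static bool color_set(struct color_bitmap *bm, unsigned int color)
{
    if ( color >= MAX_COLORS )
        return false;

    bm->bits[color / BITS_PER_WORD] |= 1UL << (color % BITS_PER_WORD);
    return true;
}

/* Check whether a color is assigned; out-of-range ids are never set. */
static bool color_test(const struct color_bitmap *bm, unsigned int color)
{
    if ( color >= MAX_COLORS )
        return false;

    return (bm->bits[color / BITS_PER_WORD] >> (color % BITS_PER_WORD)) & 1UL;
}

/* Number of assigned colors. */
static unsigned int color_count(const struct color_bitmap *bm)
{
    unsigned int i, n = 0;

    for ( i = 0; i < MAX_COLORS; i++ )
        n += color_test(bm, i);

    return n;
}
```

Compared with an array of color ids, the storage is fixed at
MAX_COLORS bits regardless of how many colors are selected.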

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-03-09 19:09   ` Julien Grall
  2022-03-22  9:17     ` Luca Miccio
@ 2022-05-13 14:22     ` Carlo Nonato
  2022-05-13 17:41       ` Julien Grall
  1 sibling, 1 reply; 79+ messages in thread
From: Carlo Nonato @ 2022-05-13 14:22 UTC (permalink / raw)
  To: Julien Grall, Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Stefano Stabellini

Hi Julien,

I'm Carlo, the new developer that will work on this patch set and on the 
review.

Thanks for all the comments. I'll try to answer all the open points and
also ask for feedback.

On 09/03/22 20:09, Julien Grall wrote:
>> - way_size: The size of a LLC way in bytes. This value is mainly used
>>    to calculate the maximum available colors on the platform.
>
> We should only add command line option when they are a strong use 
> case. In documentation, you wrote that someone may want to overwrite 
> the way size for "specific needs".
>
> Can you explain what would be those needs?
This parameter is here mainly to support QEMU, on which the automatic
probing of the LLC size doesn't work properly.

Also, since we compute the maximum number of colors the architecture
supports from this value, you may want to fix the way size so as to
simulate a different use case for debugging purposes.
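
To make the relation between way size and number of colors concrete,
here is a small sketch. It assumes the usual page-coloring arithmetic
(colors = way size / page size) and 4 KiB pages; the thread does not
spell the formula out, and max_colors() and color_of() are hypothetical
names for this example.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define PAGE_SIZE_BYTES 4096ULL

/* Number of distinct page colors available for a given LLC way size. */
static uint64_t max_colors(uint64_t way_size)
{
    return way_size / PAGE_SIZE_BYTES;
}

/* Color of the page containing a physical address (0 if no colors). */
static uint64_t color_of(uint64_t paddr, uint64_t way_size)
{
    uint64_t n = max_colors(way_size);

    return n ? (paddr / PAGE_SIZE_BYTES) % n : 0;
}
```

With these assumptions a 64 KiB way gives 16 colors, and overriding the
way size on the command line changes that number directly.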

Should I add those notes somewhere (doc, commit messages, etc.)?

>> A cache coloring configuration consists of a selection of colors to be
>> assigned to a VM or to the hypervisor. It is represented by a set of
>> ranges. Add a common function that parses a string with a
>> comma-separated set of hyphen-separated ranges like "0-7,15-16" and
>> returns both: the number of chosen colors, and an array containing their
>> ids.
>> Currently we support platforms with up to 128 colors.
>
> Is there any reason this value is hardcoded in Xen rather than part of 
> the Kconfig?
Having another parameter to configure can complicate things from
the user perspective. Also 128 is more than enough for the current ARM
processors we tested.
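
For reference, the string format described above ("0-7,15-16") can be
parsed along these lines. This is a standalone sketch and not the code
from the patch: parse_color_config() and its error convention are
hypothetical, and it uses strtoul() from the C library rather than Xen's
own helpers.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical limit, mirroring the 128 colors mentioned above. */
#define MAX_COLORS 128U

/*
 * Parse a comma-separated list of colors or hyphen-separated ranges,
 * e.g. "0-7,15-16", into colors[]. Returns the number of colors stored,
 * or -1 on malformed input, an out-of-range id, or too many colors.
 * Duplicate ids are not rejected in this sketch.
 */
static int parse_color_config(const char *s, uint32_t *colors,
                              unsigned int max)
{
    unsigned int n = 0;

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop, c;

        if ( *s < '0' || *s > '9' )
            return -1;
        start = strtoul(s, &end, 10);
        stop = start;
        s = end;

        if ( *s == '-' )
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -1;
            stop = strtoul(s, &end, 10);
            s = end;
        }

        if ( stop < start || stop >= MAX_COLORS )
            return -1;

        for ( c = start; c <= stop; c++ )
        {
            if ( n >= max )
                return -1;
            colors[n++] = (uint32_t)c;
        }

        if ( *s == ',' )
        {
            s++;
            if ( *s == '\0' )
                return -1; /* trailing comma */
        }
        else if ( *s != '\0' )
            return -1;
    }

    return (int)n;
}
```
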
>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>> ---
>>   xen/arch/arm/Kconfig                |   5 ++
>>   xen/arch/arm/Makefile               |   2 +-
>>   xen/arch/arm/coloring.c             | 131 ++++++++++++++++++++++++++++
>>   xen/arch/arm/include/asm/coloring.h |  28 ++++++
>>   4 files changed, 165 insertions(+), 1 deletion(-)
>>   create mode 100644 xen/arch/arm/coloring.c
>>   create mode 100644 xen/arch/arm/include/asm/coloring.h
>>
>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>> index ecfa6822e4..f0f999d172 100644
>> --- a/xen/arch/arm/Kconfig
>> +++ b/xen/arch/arm/Kconfig
>> @@ -97,6 +97,11 @@ config HARDEN_BRANCH_PREDICTOR
>>           If unsure, say Y.
>>   +config COLORING
>> +    bool "L2 cache coloring"
>> +    default n
>
> This wants to be gated with EXPERT for time-being. SUPPORT.MD woudl
> Furthermore, I think this wants to be gated with EXPERT for the 
> time-being.
>
>> +    depends on ARM_64
>
> Why is this limited to arm64?
Because arm32 isn't an "interesting" architecture on which to have coloring,
since there are locking primitives that provide sufficient isolation and so
the problem is not common.
On x86, instead, the functions that map memory into caches are not so easy
to exploit to achieve isolation.

Thanks.

- Carlo Nonato



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection
  2022-03-09 20:12   ` Julien Grall
@ 2022-05-13 14:34     ` Carlo Nonato
  2022-05-13 19:08       ` Julien Grall
  0 siblings, 1 reply; 79+ messages in thread
From: Carlo Nonato @ 2022-05-13 14:34 UTC (permalink / raw)
  To: Julien Grall, Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini

Hi Julien

On 09/03/22 21:12, Julien Grall wrote:

>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>> ---
>>   xen/arch/arm/coloring.c | 76 +++++++++++++++++++++++++++++++++++++++++
>>   1 file changed, 76 insertions(+)
>>
>> diff --git a/xen/arch/arm/coloring.c b/xen/arch/arm/coloring.c
>> index 8f1cff6efb..e3d490b453 100644
>> --- a/xen/arch/arm/coloring.c
>> +++ b/xen/arch/arm/coloring.c
>> @@ -25,7 +25,10 @@
>>   #include <xen/lib.h>
>>   #include <xen/errno.h>
>>   #include <xen/param.h>
>> +
>
>
> NIT: I think this belongs to patch #4.
>
>> +#include <asm/sysregs.h>
>
> Please order the include alphabetically.
>
>>   #include <asm/coloring.h>
>> +#include <asm/io.h>
>
> You don't seem to use read*/write* helper. So why do you need this?
>>     /* Number of color(s) assigned to Xen */
>>   static uint32_t xen_col_num;
>> @@ -39,6 +42,79 @@ static uint32_t dom0_col_mask[MAX_COLORS_CELLS];
>>     static uint64_t way_size;
>>   +#define CTR_LINESIZE_MASK 0x7
>> +#define CTR_SIZE_SHIFT 13
>> +#define CTR_SIZE_MASK 0x3FFF
>> +#define CTR_SELECT_L2 1 << 1
>> +#define CTR_SELECT_L3 1 << 2
>> +#define CTR_CTYPEn_MASK 0x7
>> +#define CTR_CTYPE2_SHIFT 3
>> +#define CTR_CTYPE3_SHIFT 6
>> +#define CTR_LLC_ON 1 << 2
>> +#define CTR_LOC_SHIFT 24
>> +#define CTR_LOC_MASK 0x7
>> +#define CTR_LOC_L2 1 << 1
>> +#define CTR_LOC_NOT_IMPLEMENTED 1 << 0
>
> We already define some CTR_* in processor.h. Please add any extra ones there.
>
>> +
>> +
>> +/* Return the way size of last level cache by asking the hardware */
>> +static uint64_t get_llc_way_size(void)
>
> This will break compilation as you are introducing get_llc_way_size() 
> but not using it.
>
> I would suggest to fold this patch in the next one.
>
>> +{
>> +    uint32_t cache_sel = READ_SYSREG64(CSSELR_EL1);
>
> The return type for READ_SYSREG64() is uint64_t. That said, the 
> equivalent register on 32bit is CSSELR which is 32-bit. So this should 
> be READ_SYSREG() and the matching type is register_t.
Since we don't want to support arm32, should I stick with READ_SYSREG64()
or switch to the generic one you pointed out?
>> +    uint32_t cache_global_info = READ_SYSREG64(CLIDR_EL1);
>
> Same remark here. Except the matching register is CLIDR.
>
>> +    uint32_t cache_info;
>> +    uint32_t cache_line_size;
>> +    uint32_t cache_set_num;
>> +    uint32_t cache_sel_tmp;
>> +
>> +    printk(XENLOG_INFO "Get information on LLC\n");
>> +    printk(XENLOG_INFO "Cache CLIDR_EL1: 0x%"PRIx32"\n", 
>> cache_global_info);
>> +
>> +    /* Check if at least L2 is implemented */
>> +    if ( ((cache_global_info >> CTR_LOC_SHIFT) & CTR_LOC_MASK)
>
> This is a bit confusing. cache_global_info is storing CLIDR_* but you 
> are using macro starting with CTR_*.
>
> Did you intend to name the macros CLIDR_*?
>
> The same remark goes for the other use of CTR_ below. The name of the 
> macros should match the register they are meant to be used on.
You are right about the naming mistakes. Should I add those defines in
some specific file or can they stay here?
>> +        == CTR_LOC_NOT_IMPLEMENTED )
>
> I am a bit confused by the check here. Shouldn't you check that 
> Ctype2 is not 0 instead?
I should check more carefully how this automatic probing actually works,
and we also have to clarify what the LLC is for us, so that I know what we
should really test for in this function. You're probably right, though.
>> +    {
>> +        printk(XENLOG_ERR "ERROR: L2 Cache not implemented\n");
>> +        return 0;
>> +    }
>> +
>> +    /* Save old value of CSSELR_EL1 */
>> +    cache_sel_tmp = cache_sel;
>> +
>> +    /* Get LLC index */
>> +    if ( ((cache_global_info >> CTR_CTYPE2_SHIFT) & CTR_CTYPEn_MASK)
>> +        == CTR_LLC_ON )
>
> I don't understand this check. You define CTR_LLC_ON to 1 << 2. So it 
> would be 0b10. From the field you checked, this value means "Data Cache 
> Only". How is this indicating which level to choose?
>
> But then in patch #4 you wrote we will do cache coloring on L2. So why 
> are we selecting L3?
1 << 2 is actually 0b100, which stands for "Unified cache". Still, I don't
know if this is the best way to test what we want.
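
One possible way to find the last cache level, sketched here as an
assumption rather than what the patch does, is to walk the Ctype<n>
fields of CLIDR_EL1 (3 bits each, starting at bit 0) until the first
0b000. The CLIDR_* names and the helper functions below are hypothetical;
the field encodings and the CSSELR_EL1 layout (level minus one in bits
[3:1], InD in bit 0) follow the Arm Arm.

```c
#include <stdint.h>

#define CLIDR_CTYPE_BITS    3U
#define CLIDR_CTYPE_MASK    0x7U
#define CLIDR_CTYPE_NONE    0x0U  /* 0b000: no cache at this level */
#define CLIDR_CTYPE_UNIFIED 0x4U  /* 0b100, i.e. (1 << 2) */
#define CLIDR_MAX_LEVELS    7U

/* Ctype<level> field of a CLIDR_EL1 value; level is 1-based. */
static unsigned int clidr_ctype(uint64_t clidr, unsigned int level)
{
    return (clidr >> (CLIDR_CTYPE_BITS * (level - 1))) & CLIDR_CTYPE_MASK;
}

/* Highest cache level reported before the first empty Ctype, 0 if none. */
static unsigned int clidr_last_level(uint64_t clidr)
{
    unsigned int level, last = 0;

    for ( level = 1; level <= CLIDR_MAX_LEVELS; level++ )
    {
        if ( clidr_ctype(clidr, level) == CLIDR_CTYPE_NONE )
            break;
        last = level;
    }

    return last;
}

/* CSSELR_EL1 value selecting the data/unified cache at a 1-based level. */
static uint64_t csselr_for_level(unsigned int level)
{
    return (uint64_t)(level - 1) << 1;
}
```

Note that the parentheses matter if such values are written as macros:
an unparenthesized "1 << 2" can bind unexpectedly when used inside a
larger expression.
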
>> +        cache_sel = CTR_SELECT_L2;
>> +    else
>> +        cache_sel = CTR_SELECT_L3;
>> +
>> +    printk(XENLOG_INFO "LLC selection: %u\n", cache_sel);
>> +    /* Select the correct LLC in CSSELR_EL1 */
>> +    WRITE_SYSREG64(cache_sel, CSSELR_EL1);
>
> This should be WRITE_SYSREG().
>
>> +
>> +    /* Ensure write */
>> +    isb();
>> +
>> +    /* Get info about the LLC */
>> +    cache_info = READ_SYSREG64(CCSIDR_EL1);
>> +
>> +    /* ARM TRM: (Log2(Number of bytes in cache line)) - 4. */
>
> From my understanding "TRM" in the Arm world refers to a specific 
> processor. In this case we want to quote the spec. So we usually say 
> "Arm Arm".
>
>> +    cache_line_size = 1 << ((cache_info & CTR_LINESIZE_MASK) + 4);
>> +    /* ARM TRM: (Number of sets in cache) - 1 */
>> +    cache_set_num = ((cache_info >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK) 
>> + 1;
>
> The shifts here are assuming that FEAT_CCIDX is not implemented. I 
> would be OK if we decide to not support cache coloring on such 
> platform. However, we need to return an error if a user tries to use 
> cache coloring on such platform.
>
In my understanding, if FEAT_CCIDX is implemented then CCSIDR_EL1 is a
64-bit register. So it's just a matter of probing for FEAT_CCIDX and, in
that case, changing the way we access that register (since the layout
changes too).
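
To illustrate the two layouts, here is a standalone decoding sketch. The
field positions are taken from the Arm Arm description of CCSIDR_EL1
(LineSize in bits [2:0] in both cases; Associativity [12:3] and NumSets
[27:13] without FEAT_CCIDX; Associativity [23:3] and NumSets [55:32]
with it). struct cache_geometry, ccsidr_decode() and llc_way_size() are
hypothetical names, and the caller is assumed to already know whether
FEAT_CCIDX is implemented.

```c
#include <stdbool.h>
#include <stdint.h>

struct cache_geometry {
    uint64_t line_size; /* bytes per cache line */
    uint64_t ways;      /* associativity */
    uint64_t sets;      /* number of sets */
};

/* Decode a raw CCSIDR_EL1 value according to the layout in use. */
static struct cache_geometry ccsidr_decode(uint64_t ccsidr, bool ccidx)
{
    struct cache_geometry g;

    /* LineSize encodes log2(bytes per line) - 4. */
    g.line_size = 1ULL << ((ccsidr & 0x7) + 4);

    if ( ccidx )
    {
        g.ways = ((ccsidr >> 3) & 0x1FFFFF) + 1;  /* bits [23:3] */
        g.sets = ((ccsidr >> 32) & 0xFFFFFF) + 1; /* bits [55:32] */
    }
    else
    {
        g.ways = ((ccsidr >> 3) & 0x3FF) + 1;     /* bits [12:3] */
        g.sets = ((ccsidr >> 13) & 0x7FFF) + 1;   /* bits [27:13] */
    }

    return g;
}

/* Size of one way in bytes: line size times number of sets. */
static uint64_t llc_way_size(const struct cache_geometry *g)
{
    return g->line_size * g->sets;
}
```

For example, a 1 MiB, 16-way cache with 64-byte lines has 1024 sets and
therefore a 64 KiB way in either layout.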

Thanks.

- Carlo Nonato



^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration
  2022-05-13 14:22     ` Carlo Nonato
@ 2022-05-13 17:41       ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-05-13 17:41 UTC (permalink / raw)
  To: Carlo Nonato, Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Stefano Stabellini



On 13/05/2022 15:22, Carlo Nonato wrote:
> Hi Julien,

Hi Carlo,

> I'm Carlo, the new developer that will work on this patch set and on the 
> review.
> 
> Thanks for all the comments. I'll try to answer to all the open points 
> and also
> ask for feedback.
> 
> On 09/03/22 20:09, Julien Grall wrote:
>>> - way_size: The size of a LLC way in bytes. This value is mainly used
>>>    to calculate the maximum available colors on the platform.
>>
>> We should only add command line option when they are a strong use 
>> case. In documentation, you wrote that someone may want to overwrite 
>> the way size for "specific needs".
>>
>> Can you explain what would be those needs?
> This parameter is here mainly to support QEMU on which the automatic 
> probing
> of the LLC size doesn't work properly.

I am not in favor of adding a command line option just for QEMU. But...

> 
> Also, since from this value we compute the maximum number of colors
> the architecture supports, you may want to fix the way size so as to 
> simulate
> a different use case for debugging purposes.

... this reason is more compelling to me.

> 
> Should I add those notes somewhere (doc, commit messages, etc.)?

So I would mention it in the commit message and also in the doc describing 
the options.

> 
>>> A cache coloring configuration consists of a selection of colors to be
>>> assigned to a VM or to the hypervisor. It is represented by a set of
>>> ranges. Add a common function that parses a string with a
>>> comma-separated set of hyphen-separated ranges like "0-7,15-16" and
>>> returns both: the number of chosen colors, and an array containing their
>>> ids.
>>> Currently we support platforms with up to 128 colors.
>>
>> Is there any reason this value is hardcoded in Xen rather than part of 
>> the Kconfig?
> Having another parameter to configure can complicate things from
> the user perspective. 

I don't think it would be more complicated. The default would still be 
128 and would help the user to easily modify the value if...

> Also 128 is more than enough for the current ARM
> processors we tested.

... they are using a processor you didn't test on.

>>> Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
>>> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
>>> Signed-off-by: Stefano Stabellini <stefano.stabellini@xilinx.com>
>>> ---
>>>   xen/arch/arm/Kconfig                |   5 ++
>>>   xen/arch/arm/Makefile               |   2 +-
>>>   xen/arch/arm/coloring.c             | 131 ++++++++++++++++++++++++++++
>>>   xen/arch/arm/include/asm/coloring.h |  28 ++++++
>>>   4 files changed, 165 insertions(+), 1 deletion(-)
>>>   create mode 100644 xen/arch/arm/coloring.c
>>>   create mode 100644 xen/arch/arm/include/asm/coloring.h
>>>
>>> diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
>>> index ecfa6822e4..f0f999d172 100644
>>> --- a/xen/arch/arm/Kconfig
>>> +++ b/xen/arch/arm/Kconfig
>>> @@ -97,6 +97,11 @@ config HARDEN_BRANCH_PREDICTOR
>>>           If unsure, say Y.
>>>   +config COLORING
>>> +    bool "L2 cache coloring"
>>> +    default n
>>
>> This wants to be gated with EXPERT for time-being. SUPPORT.MD woudl
>> Furthermore, I think this wants to be gated with EXPERT for the 
>> time-being.
>>
>>> +    depends on ARM_64
>>
>> Why is this limited to arm64?
> Because arm32 isn't an "interesting" architecture where to have coloring
> since there are locking primitives that provides sufficient isolation 
> and so
> the problem is not common.

I am afraid I don't understand this rationale. What sort of locking are 
you talking about?

That said, I am not asking to implement the 32-bit side. I am more 
interested to know what's the effort required here. IOW, is it disabled 
because you haven't tested?

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection
  2022-05-13 14:34     ` Carlo Nonato
@ 2022-05-13 19:08       ` Julien Grall
  0 siblings, 0 replies; 79+ messages in thread
From: Julien Grall @ 2022-05-13 19:08 UTC (permalink / raw)
  To: Carlo Nonato, Marco Solieri, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Marco Solieri, Andrea Bastoni, Luca Miccio,
	Stefano Stabellini



On 13/05/2022 15:34, Carlo Nonato wrote:
> Hi Julien

Hi,

> On 09/03/22 21:12, Julien Grall wrote:
>>> +
>>> +
>>> +/* Return the way size of last level cache by asking the hardware */
>>> +static uint64_t get_llc_way_size(void)
>>
>> This will break compilation as you are introducing get_llc_way_size() 
>> but not using it.
>>
>> I would suggest to fold this patch in the next one.
>>
>>> +{
>>> +    uint32_t cache_sel = READ_SYSREG64(CSSELR_EL1);
>>
>> The return type for READ_SYSREG64() is uint64_t. That said, the 
>> equivalent register on 32bit is CSSELR which is 32-bit. So this should 
>> be READ_SYSREG() and the matching type is register_t.
> Since we don't want to support arm32, should I stick with 
> READ_SYSREG64() or switch to the generic one you
> pointed me out?

If this code is meant to only work on 64-bit, then I would prefer if we 
use READ_SYSREG() and register_t (see more below about READ_SYSREG64 vs 
READ_SYSREG()).

>>> +    uint32_t cache_global_info = READ_SYSREG64(CLIDR_EL1);
>>
>> Same remark here. Except the matching register is CLIDR.
>>
>>> +    uint32_t cache_info;
>>> +    uint32_t cache_line_size;
>>> +    uint32_t cache_set_num;
>>> +    uint32_t cache_sel_tmp;
>>> +
>>> +    printk(XENLOG_INFO "Get information on LLC\n");
>>> +    printk(XENLOG_INFO "Cache CLIDR_EL1: 0x%"PRIx32"\n", 
>>> cache_global_info);
>>> +
>>> +    /* Check if at least L2 is implemented */
>>> +    if ( ((cache_global_info >> CTR_LOC_SHIFT) & CTR_LOC_MASK)
>>
>> This is a bit confusing. cache_global_info is storing CLIDR_* but you 
>> are using macro starting with CTR_*.
>>
>> Did you intend to name the macros CLIDR_*?
>>
>> The same remark goes for the other use of CTR_ below. The name of the 
>> macros should match the register they are meant to be used on.
> You are right for the naming mistakes. Should I add those defines in 
> some specific file or
> can they stay here?

I would define them in arch/arm/include/asm/processor.h where we already 
define all the masks for system registers.

>>> +        == CTR_LOC_NOT_IMPLEMENTED )
>>
>> I am a bit confused by the check here. Shouldn't you check that 
>> Ctype2 is not 0 instead?
> I should check a little bit better how this automatic probing thing 
> actually works
> and we also have to clarify better what is the LLC for us, so that I 
> know what we
> should really test for in this function. Probably you're right though.
>>> +    {
>>> +        printk(XENLOG_ERR "ERROR: L2 Cache not implemented\n");
>>> +        return 0;
>>> +    }
>>> +
>>> +    /* Save old value of CSSELR_EL1 */
>>> +    cache_sel_tmp = cache_sel;
>>> +
>>> +    /* Get LLC index */
>>> +    if ( ((cache_global_info >> CTR_CTYPE2_SHIFT) & CTR_CTYPEn_MASK)
>>> +        == CTR_LLC_ON )
>>
>> I don't understand this check. You define CTR_LLC_ON to 1 << 2. So it 
>> would be 0b10. From the field you checked, this value means "Data Cache 
>> Only". How is this indicating which level to choose?
>>
>> But then in patch #4 you wrote we will do cache coloring on L2. So why 
>> are we selecting L3?
> 1 << 2 is actually 0b100 which stands for "Unified cache".

Oh yes. Sorry, I miscalculated the field.

>  Still I don't 
> know if this is
> the best way to test what we want.

Would you be able to explain what you want to test?

>>> +        cache_sel = CTR_SELECT_L2;
>>> +    else
>>> +        cache_sel = CTR_SELECT_L3;
>>> +
>>> +    printk(XENLOG_INFO "LLC selection: %u\n", cache_sel);
>>> +    /* Select the correct LLC in CSSELR_EL1 */
>>> +    WRITE_SYSREG64(cache_sel, CSSELR_EL1);
>>
>> This should be WRITE_SYSREG().
>>
>>> +
>>> +    /* Ensure write */
>>> +    isb();
>>> +
>>> +    /* Get info about the LLC */
>>> +    cache_info = READ_SYSREG64(CCSIDR_EL1);
>>> +
>>> +    /* ARM TRM: (Log2(Number of bytes in cache line)) - 4. */
>>
>> From my understanding "TRM" in the Arm world refers to a specific 
>> processor. In this case we want to quote the spec. So we usually say 
>> "Arm Arm".
>>
>>> +    cache_line_size = 1 << ((cache_info & CTR_LINESIZE_MASK) + 4);
>>> +    /* ARM TRM: (Number of sets in cache) - 1 */
>>> +    cache_set_num = ((cache_info >> CTR_SIZE_SHIFT) & CTR_SIZE_MASK) 
>>> + 1;
>>
>> The shifts here are assuming that FEAT_CCIDX is not implemented. I 
>> would be OK if we decide to not support cache coloring on such 
>> platform. However, we need to return an error if a user tries to use 
>> cache coloring on such platform.
>>
> In my understanding, if FEAT_CCIDX is implemented then CCSIDR_EL1 is a 
> 64-bit register.

Technically all the system registers on arm64 are 64-bit registers. That 
said, earlier versions of the Arm Arm suggested that some were 32-bit 
when in fact the top bits were RES0.

In Xen, we should try to use register_t and READ_SYSREG() when using 
system registers so we don't end up masking the top bits by mistake (a 
future revision of the spec may define them).

If the co-processor register is also 64-bit on 32-bit, then we should 
use register_t and READ_SYSREG64().
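
A tiny standalone illustration of the masking problem: storing a 64-bit
system register value in a uint32_t silently drops bits [63:32]. Here
sysreg_t is a hypothetical stand-in for a 64-bit register_t, and
top_bits_lost() is made up for this example.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit register_t on arm64. */
typedef uint64_t sysreg_t;

/* Return the bits that are lost when the value is kept in a uint32_t. */
static uint64_t top_bits_lost(sysreg_t value)
{
    uint32_t narrow = (uint32_t)value; /* what a uint32_t local would hold */

    return value ^ (uint64_t)narrow;
}
```

With the FEAT_CCIDX layout of CCSIDR_EL1, for instance, the NumSets
field lives entirely in the upper half and would be lost this way.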

> So it's just a matter of probing for FEAT_CCIDX and in that case 
> changing the way we access
> that register (since the layout changes too).

Yes. I will review it if you want to implement it. But I am equally fine 
if you just want to add a check and return an error if FEAT_CCIDX is 
implemented.

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 79+ messages in thread

end of thread, other threads:[~2022-05-13 19:08 UTC | newest]

Thread overview: 79+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-03-04 17:46 [PATCH 00/36] Arm cache coloring Marco Solieri
2022-03-04 17:46 ` [PATCH 01/36] Revert "xen/arm: setup: Add Xen as boot module before printing all boot modules" Marco Solieri
2022-03-04 18:50   ` Julien Grall
2022-03-04 17:46 ` [PATCH 02/36] Revert "xen/arm: mm: Initialize page-tables earlier" Marco Solieri
2022-03-04 17:46 ` [PATCH 03/36] xen/arm: restore xen_paddr argument in setup_pagetables Marco Solieri
2022-03-04 17:46 ` [PATCH 04/36] xen/arm: add parsing function for cache coloring configuration Marco Solieri
2022-03-09 19:09   ` Julien Grall
2022-03-22  9:17     ` Luca Miccio
2022-03-23 19:02       ` Julien Grall
2022-05-13 14:22     ` Carlo Nonato
2022-05-13 17:41       ` Julien Grall
2022-03-04 17:46 ` [PATCH 05/36] xen/arm: compute LLC way size by hardware inspection Marco Solieri
2022-03-09 20:12   ` Julien Grall
2022-05-13 14:34     ` Carlo Nonato
2022-05-13 19:08       ` Julien Grall
2022-03-04 17:46 ` [PATCH 06/36] xen/arm: add coloring basic initialization Marco Solieri
2022-03-04 17:46 ` [PATCH 07/36] xen/arm: add coloring data to domains Marco Solieri
2022-03-07  7:22   ` Jan Beulich
2022-03-04 17:46 ` [PATCH 08/36] xen/arm: add colored flag to page struct Marco Solieri
2022-03-04 20:13   ` Julien Grall
2022-03-04 17:46 ` [PATCH 09/36] xen/arch: add default colors selection function Marco Solieri
2022-03-07  7:28   ` Jan Beulich
2022-03-04 17:46 ` [PATCH 10/36] xen/arch: check color " Marco Solieri
2022-03-09 20:17   ` Julien Grall
2022-03-14  6:06   ` Henry Wang
2022-03-04 17:46 ` [PATCH 11/36] xen/include: define hypercall parameter for coloring Marco Solieri
2022-03-07  7:31   ` Jan Beulich
2022-03-09 20:29   ` Julien Grall
2022-03-04 17:46 ` [PATCH 12/36] xen/arm: initialize cache coloring data for Dom0/U Marco Solieri
2022-03-11 19:05   ` Julien Grall
2022-03-04 17:46 ` [PATCH 13/36] xen/arm: A domain is not direct mapped when coloring is enabled Marco Solieri
2022-03-09 20:34   ` Julien Grall
2022-03-04 17:46 ` [PATCH 14/36] xen/arch: add dump coloring info for domains Marco Solieri
2022-03-04 17:46 ` [PATCH 15/36] tools: add support for cache coloring configuration Marco Solieri
2022-03-04 17:46 ` [PATCH 16/36] xen/color alloc: implement color_from_page for ARM64 Marco Solieri
2022-03-04 20:54   ` Julien Grall
2022-03-11 17:39     ` Marco Solieri
2022-03-11 17:57       ` Julien Grall
2022-03-04 17:46 ` [PATCH 17/36] xen/arm: add get_max_color function Marco Solieri
2022-03-11 19:09   ` Julien Grall
2022-03-04 17:46 ` [PATCH 18/36] Alloc: introduce page_list_for_each_reverse Marco Solieri
2022-03-07  7:35   ` Jan Beulich
2022-03-04 17:46 ` [PATCH 19/36] xen/arch: introduce cache-coloring allocator Marco Solieri
2022-03-09 14:35   ` Jan Beulich
2022-03-04 17:46 ` [PATCH 20/36] xen/common: introduce buddy required reservation Marco Solieri
2022-03-09 14:45   ` Jan Beulich
2022-03-09 14:47     ` Jan Beulich
2022-03-04 17:46 ` [PATCH 21/36] xen/common: add colored allocator initialization Marco Solieri
2022-03-09 14:58   ` Jan Beulich
2022-03-04 17:46 ` [PATCH 22/36] xen/arch: init cache coloring conf for Xen Marco Solieri
2022-03-14 18:59   ` Julien Grall
2022-03-04 17:46 ` [PATCH 23/36] xen/arch: coloring: manually calculate Xen physical addresses Marco Solieri
2022-03-14 19:23   ` Julien Grall
2022-03-04 17:46 ` [PATCH 24/36] xen/arm: enable consider_modules for coloring Marco Solieri
2022-03-14 19:24   ` Julien Grall
2022-03-04 17:46 ` [PATCH 25/36] xen/arm: bring back get_xen_paddr Marco Solieri
2022-03-04 17:46 ` [PATCH 26/36] xen/arm: add argument to remove_early_mappings Marco Solieri
2022-03-14 19:59   ` Julien Grall
2022-03-04 17:46 ` [PATCH 27/36] xen/arch: add coloring support for Xen Marco Solieri
2022-03-04 19:47   ` Julien Grall
2022-03-09 11:28     ` Julien Grall
2022-03-14  3:47   ` Henry Wang
2022-03-14 21:58   ` Julien Grall
2022-03-04 17:46 ` [PATCH 28/36] xen/arm: introduce xen_map_text_rw Marco Solieri
2022-03-07  7:39   ` Jan Beulich
2022-03-11 22:28     ` Julien Grall
2022-03-04 17:46 ` [PATCH 29/36] xen/arm: add dump function for coloring info Marco Solieri
2022-03-04 17:46 ` [PATCH 30/36] xen/arm: add coloring support to dom0less Marco Solieri
2022-03-04 17:46 ` [PATCH 31/36] Disable coloring if static memory support is selected Marco Solieri
2022-03-14 20:04   ` Julien Grall
2022-03-04 17:46 ` [PATCH 32/36] xen/arm: reduce the number of supported colors Marco Solieri
2022-03-04 17:46 ` [PATCH 33/36] doc, xen-command-line: introduce coloring options Marco Solieri
2022-03-07  7:42   ` Jan Beulich
2022-03-14 22:07   ` Julien Grall
2022-03-04 17:46 ` [PATCH 34/36] doc, xl.cfg: introduce coloring configuration option Marco Solieri
2022-03-04 17:47 ` [PATCH 35/36] doc, device-tree: introduce 'colors' property Marco Solieri
2022-03-14 22:17   ` Julien Grall
2022-03-04 17:47 ` [PATCH 36/36] doc, arm: add usage documentation for cache coloring support Marco Solieri
2022-03-15 19:23   ` Julien Grall
