* [PATCH v7 00/14] Arm cache coloring
@ 2024-03-15 10:58 Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
                   ` (13 more replies)
  0 siblings, 14 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Anthony PERARD, Juergen Gross

Shared caches in multi-core CPU architectures are a problem for the
predictability of memory access latency. This jeopardizes the applicability
of many Arm platforms in real-time critical and mixed-criticality
scenarios. We introduce support for cache partitioning with page
coloring, a transparent software technique that enables isolation
between domains and Xen, and thus avoids cache interference.

When creating a domain, a simple syntax (e.g. `0-3` or `4-11`) allows
the user to assign cache partition IDs, called colors, where assigning
different colors guarantees that no mutual cache eviction will ever
happen. This instructs the Xen memory allocator to provide the i-th
color assignee only with pages that map to color i, i.e. that are
indexed in the i-th cache partition.
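
As an illustration of what "pages that map to color i" means, here is a
minimal sketch. It is not code from this series: it assumes 4 KiB pages, a
physically-indexed cache with linear set indexing and a power-of-2 number
of colors. The names SKETCH_PAGE_SHIFT and sketch_addr_to_color() are
hypothetical.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define SKETCH_PAGE_SHIFT 12

/*
 * Color of the page containing physical address paddr. With linear set
 * indexing and a power-of-2 number of colors, the color is given by the
 * low bits of the page frame number.
 */
static unsigned int sketch_addr_to_color(uint64_t paddr,
                                         unsigned int nr_colors)
{
    return (unsigned int)((paddr >> SKETCH_PAGE_SHIFT) & (nr_colors - 1));
}
```

With 16 colors, consecutive pages cycle through colors 0..15 and then wrap
around, so a domain assigned `0-3` would only receive pages whose frame
number has its low four bits in the range 0 to 3.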

The proposed implementation supports the dom0less feature.
The proposed implementation doesn't support the static-mem feature.
The solution has been tested in several scenarios, including Xilinx Zynq
MPSoCs.

Carlo Nonato (13):
  xen/common: add cache coloring common code
  xen/arm: add initial support for LLC coloring on arm64
  xen/arm: permit non direct-mapped Dom0 construction
  xen/arm: add Dom0 cache coloring support
  xen: extend domctl interface for cache coloring
  tools: add support for cache coloring configuration
  xen/arm: add support for cache coloring configuration via device-tree
  xen/page_alloc: introduce preserved page flags macro
  xen/page_alloc: introduce page flag to stop buddy merging
  xen: add cache coloring allocator for domains
  xen/arm: use domain memory to allocate p2m page tables
  xen/arm: make consider_modules() available for xen relocation
  xen/arm: add cache coloring support for Xen

Luca Miccio (1):
  xen/arm: add Xen cache colors command line parameter

 SUPPORT.md                              |   7 +
 docs/man/xl.cfg.5.pod.in                |  10 +
 docs/misc/arm/device-tree/booting.txt   |   4 +
 docs/misc/cache-coloring.rst            | 255 +++++++++++++++++
 docs/misc/xen-command-line.pandoc       |  70 +++++
 tools/include/libxl.h                   |   5 +
 tools/include/xenctrl.h                 |   9 +
 tools/libs/ctrl/xc_domain.c             |  35 +++
 tools/libs/light/libxl_create.c         |   9 +
 tools/libs/light/libxl_types.idl        |   1 +
 tools/xl/xl_parse.c                     |  38 ++-
 xen/arch/Kconfig                        |  28 ++
 xen/arch/arm/Kconfig                    |   1 +
 xen/arch/arm/Makefile                   |   1 +
 xen/arch/arm/alternative.c              |  30 +-
 xen/arch/arm/arm32/mmu/mm.c             |  93 +-----
 xen/arch/arm/arm64/mmu/head.S           |  58 +++-
 xen/arch/arm/arm64/mmu/mm.c             |  28 +-
 xen/arch/arm/dom0less-build.c           |  59 ++--
 xen/arch/arm/domain_build.c             | 109 ++++++-
 xen/arch/arm/include/asm/domain_build.h |   1 +
 xen/arch/arm/include/asm/mm.h           |   5 +
 xen/arch/arm/include/asm/mmu/layout.h   |   3 +
 xen/arch/arm/include/asm/processor.h    |  16 ++
 xen/arch/arm/include/asm/setup.h        |   3 +
 xen/arch/arm/llc-coloring.c             | 136 +++++++++
 xen/arch/arm/mmu/p2m.c                  |   4 +-
 xen/arch/arm/mmu/setup.c                | 199 ++++++++++++-
 xen/arch/arm/setup.c                    |  13 +-
 xen/common/Kconfig                      |   3 +
 xen/common/Makefile                     |   1 +
 xen/common/domain.c                     |   3 +
 xen/common/domctl.c                     |   8 +
 xen/common/keyhandler.c                 |   3 +
 xen/common/llc-coloring.c               | 360 ++++++++++++++++++++++++
 xen/common/page_alloc.c                 | 210 +++++++++++++-
 xen/include/public/domctl.h             |   9 +
 xen/include/xen/llc-coloring.h          |  64 +++++
 xen/include/xen/sched.h                 |   5 +
 39 files changed, 1721 insertions(+), 175 deletions(-)
 create mode 100644 docs/misc/cache-coloring.rst
 create mode 100644 xen/arch/arm/llc-coloring.c
 create mode 100644 xen/common/llc-coloring.c
 create mode 100644 xen/include/xen/llc-coloring.h

-- 
2.34.1




* [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-15 11:39   ` Carlo Nonato
  2024-03-19 14:58   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 02/14] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
                   ` (12 subsequent siblings)
  13 siblings, 2 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri

Last Level Cache (LLC) coloring allows partitioning the cache into smaller
chunks called cache colors. Since not all architectures can actually
implement it, add a HAS_LLC_COLORING Kconfig and put other options under
xen/arch.

LLC colors are a property of the domain, so the domain struct has to be
extended.
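
For readers who want the arithmetic behind the number of colors, the
following is a rough sketch that mirrors the checks described in the
documentation added by this patch (way size divided by page size, power of
2, within a build-time bound). It is an illustration under those
assumptions, not the code in the patch; sketch_is_pow2() and
sketch_nr_colors() are hypothetical names.

```c
#include <stdbool.h>

/* True if x is a non-zero power of 2. */
static bool sketch_is_pow2(unsigned long x)
{
    return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Number of colors for an LLC of llc_size bytes with nr_ways ways and the
 * given page size. Returns 0 if the result is not a power of 2 or falls
 * outside the range [2, max_colors].
 */
static unsigned long sketch_nr_colors(unsigned long llc_size,
                                      unsigned int nr_ways,
                                      unsigned int page_size,
                                      unsigned long max_colors)
{
    unsigned long n;

    if ( nr_ways == 0 || page_size == 0 )
        return 0;

    /* Way size divided by page size. */
    n = llc_size / nr_ways / page_size;

    if ( !sketch_is_pow2(n) || n < 2 || n > max_colors )
        return 0;

    return n;
}
```

For example, a 1 MiB 16-way LLC with 4 KiB pages gives 16 colors, and an
8 MiB 16-way LLC gives 128, which matches the default of
CONFIG_NR_LLC_COLORS.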

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- SUPPORT.md changes added to this patch
- extended documentation to better address applicability of cache coloring
- "llc-nr-ways" and "llc-size" params introduced in favor of "llc-way-size"
- moved dump_llc_coloring_info() call in 'm' keyhandler (pagealloc_info())
v6:
- moved almost all code in common
- moved documentation in this patch
- reintroduced range for CONFIG_NR_LLC_COLORS
- reintroduced some stub functions to reduce the number of checks on
  llc_coloring_enabled
- moved domain_llc_coloring_free() in same patch where allocation happens
- turned "d->llc_colors" to pointer-to-const
- llc_coloring_init() now returns void and panics if errors are found
v5:
- used - instead of _ for filenames
- removed domain_create_llc_colored()
- removed stub functions
- coloring domain fields are now #ifdef protected
v4:
- Kconfig options moved to xen/arch
- removed range for CONFIG_NR_LLC_COLORS
- added "llc_coloring_enabled" global to later implement the boot-time
  switch
- added domain_create_llc_colored() to be able to pass colors
- added is_domain_llc_colored() macro
---
 SUPPORT.md                        |   7 ++
 docs/misc/cache-coloring.rst      | 125 ++++++++++++++++++++++++++++++
 docs/misc/xen-command-line.pandoc |  37 +++++++++
 xen/arch/Kconfig                  |  20 +++++
 xen/common/Kconfig                |   3 +
 xen/common/Makefile               |   1 +
 xen/common/keyhandler.c           |   3 +
 xen/common/llc-coloring.c         | 102 ++++++++++++++++++++++++
 xen/common/page_alloc.c           |   3 +
 xen/include/xen/llc-coloring.h    |  36 +++++++++
 xen/include/xen/sched.h           |   5 ++
 11 files changed, 342 insertions(+)
 create mode 100644 docs/misc/cache-coloring.rst
 create mode 100644 xen/common/llc-coloring.c
 create mode 100644 xen/include/xen/llc-coloring.h

diff --git a/SUPPORT.md b/SUPPORT.md
index 510bb02190..456abd42bf 100644
--- a/SUPPORT.md
+++ b/SUPPORT.md
@@ -364,6 +364,13 @@ by maintaining multiple physical to machine (p2m) memory mappings.
     Status, x86 HVM: Tech Preview
     Status, ARM: Tech Preview
 
+### Cache coloring
+
+Allows reserving Last Level Cache (LLC) partitions for Dom0, DomUs and Xen
+itself.
+
+    Status, Arm64: Experimental
+
 ## Resource Management
 
 ### CPU Pools
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
new file mode 100644
index 0000000000..52ce52ffbd
--- /dev/null
+++ b/docs/misc/cache-coloring.rst
@@ -0,0 +1,125 @@
+Xen cache coloring user guide
+=============================
+
+The cache coloring support in Xen allows reserving Last Level Cache (LLC)
+partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
+Cache coloring realizes per-set cache partitioning in software and is applicable
+to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.
+
+To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
+
+If needed, change the maximum number of colors with
+``CONFIG_NR_LLC_COLORS=<n>``.
+
+Runtime configuration is done via `Command line parameters`_.
+
+Background
+**********
+
+Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
+to each core (hence using multiple cache units), while the last level is shared
+among all of them. Such configuration implies that memory operations on one
+core (e.g. running a DomU) are able to generate interference on another core
+(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
+in software and mitigates this, guaranteeing higher and more predictable
+performance for memory accesses.
+Software-based cache coloring is particularly useful in those situations where
+no hardware mechanisms (e.g., DSU-based way partitioning) are available to
+partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
+feature an L2 LLC shared among all cores.
+
+The key concept underlying cache coloring is a fragmentation of the memory
+space into a set of sub-spaces called colors that are mapped to disjoint cache
+partitions. Technically, the whole memory space is first divided into a number
+of subsequent regions. Then each region is in turn divided into a number of
+subsequent sub-colors. The generic i-th color is then obtained by all the
+i-th sub-colors in each region.
+
+::
+
+                            Region j            Region j+1
+                .....................   ............
+                .                     . .
+                .                       .
+            _ _ _______________ _ _____________________ _ _
+                |     |     |     |     |     |     |
+                | c_0 | c_1 |     | c_n | c_0 | c_1 |
+           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
+                    :                       :
+                    :                       :...         ... .
+                    :                            color 0
+                    :...........................         ... .
+                                                :
+          . . ..................................:
+
+How colors are actually defined depends on the function that maps memory to
+cache lines. In case of physically-indexed, physically-tagged caches with linear
+mapping, the set index is found by extracting some contiguous bits from the
+physical address. This allows colors to be defined as shown in figure: they
+appear in memory as subsequent blocks of equal size and repeat themselves after
+``n`` different colors, where ``n`` is the total number of colors.
+
+If some kind of bit shuffling appears in the mapping function, then colors
+assume a different layout in memory. Caches of this kind aren't supported by
+the current implementation.
+
+**Note**: Finding the exact cache mapping function can be a really difficult
+task since it's not always documented in the CPU manual. As noted above,
+Cortex-A53, A57 and A72 are known to work with the current implementation.
+
+How to compute the number of colors
+###################################
+
+Taking the linear mapping from physical memory to cache lines for granted, the
+number of available colors for a specific platform is computed using three
+parameters:
+
+- the size of the LLC.
+- the number of the LLC ways.
+- the page size used by Xen.
+
+The first two parameters can be found in the processor manual, while the third
+one is the minimum mapping granularity. Dividing the cache size by the number of
+its ways we obtain the size of a way. Dividing this number by the page size,
+the number of total cache colors is found. So for example an Arm Cortex-A53
+with a 16-way associative 1 MiB LLC can isolate up to 16 colors when pages are
+4 KiB in size.
+
+LLC size and number of ways are probed automatically by default so there
+should be no need to compute the number of colors by yourself.
+
+Effective colors assignment
+###########################
+
+When assigning colors:
+
+1. If one wants to avoid cache interference between two domains, different
+   colors need to be used for their memory.
+
+2. To improve spatial locality, color assignment should favor contiguity in
+   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
+   domain J is better than assigning colors (0,2) to I and (1,3) to J.
+
+Command line parameters
+***********************
+
+Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
+
++----------------------+-------------------------------+
+| **Parameter**        | **Description**               |
++----------------------+-------------------------------+
+| ``llc-coloring``     | enable coloring at runtime    |
++----------------------+-------------------------------+
+| ``llc-size``         | set the LLC size              |
++----------------------+-------------------------------+
+| ``llc-nr-ways``      | set the LLC number of ways    |
++----------------------+-------------------------------+
+
+Auto-probing of LLC specs
+#########################
+
+LLC size and number of ways are probed automatically by default.
+
+LLC specs can be manually set via the above command line parameters. This
+bypasses any auto-probing and it's used to overcome failing situations or for
+debugging/testing purposes.
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 54edbc0fbc..2936abea2c 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
 in hypervisor context to be able to dump the Last Interrupt/Exception To/From
 record with other registers.
 
+### llc-coloring
+> `= <boolean>`
+
+> Default: `false`
+
+Flag to enable or disable LLC coloring support at runtime. This option is
+available only when `CONFIG_LLC_COLORING` is enabled. See the general
+cache coloring documentation for more info.
+
+### llc-nr-ways
+> `= <integer>`
+
+> Default: `Obtained from hardware`
+
+Specify the number of ways of the Last Level Cache. This option is available
+only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
+to find the number of supported cache colors. By default the value is
+automatically computed by probing the hardware, but in case of specific needs,
+it can be manually set. Those include failed probing and debugging/testing
+purposes so that it's possible to emulate platforms with a different number of
+supported colors. If set, "llc-size" must also be set, otherwise the default
+will be used.
+
+### llc-size
+> `= <size>`
+
+> Default: `Obtained from hardware`
+
+Specify the size of the Last Level Cache. This option is available only when
+`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
+the number of supported cache colors. By default the value is automatically
+computed by probing the hardware, but in case of specific needs, it can be
+manually set. Those include failed probing and debugging/testing purposes so
+that it's possible to emulate platforms with a different number of supported
+colors. If set, "llc-nr-ways" must also be set, otherwise the default will be
+used.
+
 ### lock-depth-size
 > `= <integer>`
 
diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index 67ba38f32f..a65c38e53e 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -31,3 +31,23 @@ config NR_NUMA_NODES
 	  associated with multiple-nodes management. It is the upper bound of
 	  the number of NUMA nodes that the scheduler, memory allocation and
 	  other NUMA-aware components can handle.
+
+config LLC_COLORING
+	bool "Last Level Cache (LLC) coloring" if EXPERT
+	depends on HAS_LLC_COLORING
+	depends on !NUMA
+
+config NR_LLC_COLORS
+	int "Maximum number of LLC colors"
+	range 2 1024
+	default 128
+	depends on LLC_COLORING
+	help
+	  Controls the build-time size of various arrays associated with LLC
+	  coloring. Refer to cache coloring documentation for how to compute the
+	  number of colors supported by the platform. This is only an upper
+	  bound. The runtime value is autocomputed or manually set via cmdline.
+	  The default value corresponds to an 8 MiB 16-way LLC, which should be
+	  more than what's needed in the general case. Use only power of 2 values.
+	  1024 is the number of colors that fit in a 4 KiB page when integers are 4
+	  bytes long.
diff --git a/xen/common/Kconfig b/xen/common/Kconfig
index a5c3d5a6bf..1e467178bd 100644
--- a/xen/common/Kconfig
+++ b/xen/common/Kconfig
@@ -71,6 +71,9 @@ config HAS_IOPORTS
 config HAS_KEXEC
 	bool
 
+config HAS_LLC_COLORING
+	bool
+
 config HAS_PMAP
 	bool
 
diff --git a/xen/common/Makefile b/xen/common/Makefile
index e5eee19a85..3054254a7d 100644
--- a/xen/common/Makefile
+++ b/xen/common/Makefile
@@ -23,6 +23,7 @@ obj-y += keyhandler.o
 obj-$(CONFIG_KEXEC) += kexec.o
 obj-$(CONFIG_KEXEC) += kimage.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
+obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
 obj-$(CONFIG_MEM_ACCESS) += mem_access.o
 obj-y += memory.o
 obj-y += multicall.o
diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
index 127ca50696..778f93e063 100644
--- a/xen/common/keyhandler.c
+++ b/xen/common/keyhandler.c
@@ -5,6 +5,7 @@
 #include <asm/regs.h>
 #include <xen/delay.h>
 #include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
 #include <xen/param.h>
 #include <xen/shutdown.h>
 #include <xen/event.h>
@@ -303,6 +304,8 @@ static void cf_check dump_domains(unsigned char key)
 
         arch_dump_domain_info(d);
 
+        domain_dump_llc_colors(d);
+
         rangeset_domain_printk(d);
 
         dump_pageframe_info(d);
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
new file mode 100644
index 0000000000..db96a83ddd
--- /dev/null
+++ b/xen/common/llc-coloring.c
@@ -0,0 +1,102 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common code
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ */
+#include <xen/keyhandler.h>
+#include <xen/llc-coloring.h>
+#include <xen/param.h>
+
+static bool __ro_after_init llc_coloring_enabled;
+boolean_param("llc-coloring", llc_coloring_enabled);
+
+static unsigned int __initdata llc_size;
+size_param("llc-size", llc_size);
+static unsigned int __initdata llc_nr_ways;
+integer_param("llc-nr-ways", llc_nr_ways);
+/* Number of colors available in the LLC */
+static unsigned int __ro_after_init max_nr_colors;
+
+static void print_colors(const unsigned int *colors, unsigned int num_colors)
+{
+    unsigned int i;
+
+    printk("{ ");
+    for ( i = 0; i < num_colors; i++ )
+    {
+        unsigned int start = colors[i], end = start;
+
+        printk("%u", start);
+
+        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
+            ;
+
+        if ( start != end )
+            printk("-%u", end);
+
+        if ( i < num_colors - 1 )
+            printk(", ");
+    }
+    printk(" }\n");
+}
+
+void __init llc_coloring_init(void)
+{
+    unsigned int way_size;
+
+    if ( !llc_coloring_enabled )
+        return;
+
+    if ( llc_size && llc_nr_ways )
+        way_size = llc_size / llc_nr_ways;
+    else
+    {
+        way_size = get_llc_way_size();
+        if ( !way_size )
+            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
+    }
+
+    /*
+     * The maximum number of colors must be a power of 2 in order to correctly
+     * map them to bits of an address.
+     */
+    max_nr_colors = way_size >> PAGE_SHIFT;
+
+    if ( max_nr_colors & (max_nr_colors - 1) )
+        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
+
+    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
+        panic("Number of LLC colors (%u) not in range [2, %u]\n",
+              max_nr_colors, CONFIG_NR_LLC_COLORS);
+
+    arch_llc_coloring_init();
+}
+
+void cf_check dump_llc_coloring_info(void)
+{
+    if ( !llc_coloring_enabled )
+        return;
+
+    printk("LLC coloring info:\n");
+    printk("    Number of LLC colors supported: %u\n", max_nr_colors);
+}
+
+void cf_check domain_dump_llc_colors(const struct domain *d)
+{
+    if ( !llc_coloring_enabled )
+        return;
+
+    printk("%u LLC colors: ", d->num_llc_colors);
+    print_colors(d->llc_colors, d->num_llc_colors);
+}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 2ec17df9b4..c38edb9a58 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -126,6 +126,7 @@
 #include <xen/irq.h>
 #include <xen/keyhandler.h>
 #include <xen/lib.h>
+#include <xen/llc-coloring.h>
 #include <xen/mm.h>
 #include <xen/nodemask.h>
 #include <xen/numa.h>
@@ -2623,6 +2624,8 @@ static void cf_check pagealloc_info(unsigned char key)
     }
 
     printk("    Dom heap: %lukB free\n", total << (PAGE_SHIFT-10));
+
+    dump_llc_coloring_info();
 }
 
 static __init int cf_check pagealloc_keyhandler_init(void)
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
new file mode 100644
index 0000000000..c60c8050c5
--- /dev/null
+++ b/xen/include/xen/llc-coloring.h
@@ -0,0 +1,36 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring common header
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ */
+#ifndef __COLORING_H__
+#define __COLORING_H__
+
+#include <xen/sched.h>
+#include <public/domctl.h>
+
+#ifdef CONFIG_LLC_COLORING
+void llc_coloring_init(void);
+void dump_llc_coloring_info(void);
+void domain_dump_llc_colors(const struct domain *d);
+#else
+static inline void llc_coloring_init(void) {}
+static inline void dump_llc_coloring_info(void) {}
+static inline void domain_dump_llc_colors(const struct domain *d) {}
+#endif
+
+unsigned int get_llc_way_size(void);
+void arch_llc_coloring_init(void);
+
+#endif /* __COLORING_H__ */
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index 37f5922f32..96cc934fc3 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -627,6 +627,11 @@ struct domain
 
     /* Holding CDF_* constant. Internal flags for domain creation. */
     unsigned int cdf;
+
+#ifdef CONFIG_LLC_COLORING
+    unsigned int num_llc_colors;
+    const unsigned int *llc_colors;
+#endif
 };
 
 static inline struct page_list_head *page_to_list(
-- 
2.34.1




* [PATCH v7 02/14] xen/arm: add initial support for LLC coloring on arm64
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 03/14] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
                   ` (11 subsequent siblings)
  13 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Marco Solieri

LLC coloring needs to know the last level cache layout in order to make the
best use of it. This can be probed by inspecting the CLIDR_EL1 register,
so the Last Level is defined as the last level visible to this register.
Note that this excludes system caches on some platforms.
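
To make the probing arithmetic concrete, here is a small sketch that
decodes a raw CCSIDR_EL1 value into a way size (line size times number of
sets). The shift and mask values are copied from the definitions this
patch adds to processor.h for the layout without FEAT_CCIDX; the
SKETCH_-prefixed names and sketch_way_size() are hypothetical, and the
sketch takes the register value as a plain argument instead of reading
system registers.

```c
#include <stdint.h>

/* Values copied from this patch (layout without FEAT_CCIDX). */
#define SKETCH_CCSIDR_LINESIZE_MASK 0x7UL
#define SKETCH_CCSIDR_NUMSETS_SHIFT 13
#define SKETCH_CCSIDR_NUMSETS_MASK  0x3fffUL

/* Way size in bytes: bytes per cache line times number of sets. */
static unsigned int sketch_way_size(uint64_t ccsidr)
{
    /* LineSize field: (Log2(number of bytes in cache line)) - 4 */
    unsigned int line_size =
        1U << ((ccsidr & SKETCH_CCSIDR_LINESIZE_MASK) + 4);
    /* NumSets field: (number of sets in cache) - 1 */
    unsigned int num_sets =
        (unsigned int)((ccsidr >> SKETCH_CCSIDR_NUMSETS_SHIFT) &
                       SKETCH_CCSIDR_NUMSETS_MASK) + 1;

    return line_size * num_sets;
}
```

For a cache with 64-byte lines and 1024 sets (LineSize field 2, NumSets
field 1023), this yields a 64 KiB way, i.e. 16 colors with 4 KiB pages.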

Static memory allocation and cache coloring are incompatible because static
memory can't be guaranteed to use only colors assigned to the domain.
Panic during DomU creation when both are enabled.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- only minor changes
v6:
- get_llc_way_size() now checks for at least separate I/D caches
v5:
- used - instead of _ for filenames
- moved static-mem check in this patch
- moved dom0 colors parsing in next patch
- moved color allocation and configuration in next patch
- moved check_colors() in next patch
- colors are now printed in short form
v4:
- added "llc-coloring" cmdline option for the boot-time switch
- dom0 colors are now checked during domain init as for any other domain
- fixed processor.h masks bit width
- check for overflow in parse_color_config()
- check_colors() now checks also that colors are sorted and unique
---
 docs/misc/cache-coloring.rst         | 14 ++++++
 xen/arch/arm/Kconfig                 |  1 +
 xen/arch/arm/Makefile                |  1 +
 xen/arch/arm/dom0less-build.c        |  6 +++
 xen/arch/arm/include/asm/processor.h | 16 ++++++
 xen/arch/arm/llc-coloring.c          | 75 ++++++++++++++++++++++++++++
 xen/arch/arm/setup.c                 |  3 ++
 xen/common/llc-coloring.c            |  2 +-
 xen/include/xen/llc-coloring.h       |  4 ++
 9 files changed, 121 insertions(+), 1 deletion(-)
 create mode 100644 xen/arch/arm/llc-coloring.c

diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 52ce52ffbd..871e7a3ddb 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -120,6 +120,20 @@ Auto-probing of LLC specs
 
 LLC size and number of ways are probed automatically by default.
 
+In the Arm implementation, this is done by inspecting the CLIDR_EL1 register.
+This means that other system caches that aren't visible there are ignored.
+
 LLC specs can be manually set via the above command line parameters. This
 bypasses any auto-probing and it's used to overcome failing situations or for
 debugging/testing purposes.
+
+Known issues and limitations
+****************************
+
+"xen,static-mem" isn't supported when coloring is enabled
+#########################################################
+
+In the domain configuration, "xen,static-mem" allows memory to be statically
+allocated to the domain. This isn't possible when LLC coloring is enabled,
+because that memory can't be guaranteed to use only colors assigned to the
+domain.
diff --git a/xen/arch/arm/Kconfig b/xen/arch/arm/Kconfig
index 40f834bb71..fa96d8247e 100644
--- a/xen/arch/arm/Kconfig
+++ b/xen/arch/arm/Kconfig
@@ -8,6 +8,7 @@ config ARM_64
 	depends on !ARM_32
 	select 64BIT
 	select HAS_FAST_MULTIPLY
+	select HAS_LLC_COLORING
 
 config ARM
 	def_bool y
diff --git a/xen/arch/arm/Makefile b/xen/arch/arm/Makefile
index 7b1350e2ef..18ae566521 100644
--- a/xen/arch/arm/Makefile
+++ b/xen/arch/arm/Makefile
@@ -37,6 +37,7 @@ obj-$(CONFIG_IOREQ_SERVER) += ioreq.o
 obj-y += irq.o
 obj-y += kernel.init.o
 obj-$(CONFIG_LIVEPATCH) += livepatch.o
+obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
 obj-y += mem_access.o
 obj-y += mm.o
 obj-y += monitor.o
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index fb63ec6fd1..1142f7f74a 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -5,6 +5,7 @@
 #include <xen/grant_table.h>
 #include <xen/iocap.h>
 #include <xen/libfdt/libfdt.h>
+#include <xen/llc-coloring.h>
 #include <xen/sched.h>
 #include <xen/serial.h>
 #include <xen/sizes.h>
@@ -879,7 +880,12 @@ void __init create_domUs(void)
             panic("No more domain IDs available\n");
 
         if ( dt_find_property(node, "xen,static-mem", NULL) )
+        {
+            if ( llc_coloring_enabled )
+                panic("LLC coloring and static memory are incompatible\n");
+
             flags |= CDF_staticmem;
+        }
 
         if ( dt_property_read_bool(node, "direct-map") )
         {
diff --git a/xen/arch/arm/include/asm/processor.h b/xen/arch/arm/include/asm/processor.h
index 8e02410465..ef33ea198c 100644
--- a/xen/arch/arm/include/asm/processor.h
+++ b/xen/arch/arm/include/asm/processor.h
@@ -18,6 +18,22 @@
 #define CTR_IDC_SHIFT       28
 #define CTR_DIC_SHIFT       29
 
+/* CCSIDR Current Cache Size ID Register */
+#define CCSIDR_LINESIZE_MASK            _AC(0x7, UL)
+#define CCSIDR_NUMSETS_SHIFT            13
+#define CCSIDR_NUMSETS_MASK             _AC(0x3fff, UL)
+#define CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX 32
+#define CCSIDR_NUMSETS_MASK_FEAT_CCIDX  _AC(0xffffff, UL)
+
+/* CSSELR Cache Size Selection Register */
+#define CSSELR_LEVEL_MASK  _AC(0x7, UL)
+#define CSSELR_LEVEL_SHIFT 1
+
+/* CLIDR Cache Level ID Register */
+#define CLIDR_CTYPEn_SHIFT(n) (3 * ((n) - 1))
+#define CLIDR_CTYPEn_MASK     _AC(0x7, UL)
+#define CLIDR_CTYPEn_LEVELS   7
+
 #define ICACHE_POLICY_VPIPT  0
 #define ICACHE_POLICY_AIVIVT 1
 #define ICACHE_POLICY_VIPT   2
diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
new file mode 100644
index 0000000000..b83540ff41
--- /dev/null
+++ b/xen/arch/arm/llc-coloring.c
@@ -0,0 +1,75 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Last Level Cache (LLC) coloring support for ARM
+ *
+ * Copyright (C) 2022 Xilinx Inc.
+ */
+#include <xen/llc-coloring.h>
+#include <xen/types.h>
+
+#include <asm/processor.h>
+#include <asm/sysregs.h>
+
+/* Return the LLC way size by probing the hardware */
+unsigned int __init get_llc_way_size(void)
+{
+    register_t ccsidr_el1;
+    register_t clidr_el1 = READ_SYSREG(CLIDR_EL1);
+    register_t csselr_el1 = READ_SYSREG(CSSELR_EL1);
+    register_t id_aa64mmfr2_el1 = READ_SYSREG(ID_AA64MMFR2_EL1);
+    uint32_t ccsidr_numsets_shift = CCSIDR_NUMSETS_SHIFT;
+    uint32_t ccsidr_numsets_mask = CCSIDR_NUMSETS_MASK;
+    unsigned int n, line_size, num_sets;
+
+    for ( n = CLIDR_CTYPEn_LEVELS; n != 0; n-- )
+    {
+        uint8_t ctype_n = (clidr_el1 >> CLIDR_CTYPEn_SHIFT(n)) &
+                          CLIDR_CTYPEn_MASK;
+
+        /* Unified cache (see Arm ARM DDI 0487J.a D19.2.27) */
+        if ( ctype_n == 0b100 )
+            break;
+    }
+
+    if ( n == 0 )
+        return 0;
+
+    WRITE_SYSREG((n - 1) << CSSELR_LEVEL_SHIFT, CSSELR_EL1);
+    isb();
+
+    ccsidr_el1 = READ_SYSREG(CCSIDR_EL1);
+
+    /* Arm ARM: (Log2(Number of bytes in cache line)) - 4 */
+    line_size = 1U << ((ccsidr_el1 & CCSIDR_LINESIZE_MASK) + 4);
+
+    /* If FEAT_CCIDX is enabled, CCSIDR_EL1 has a different bit layout */
+    if ( (id_aa64mmfr2_el1 >> ID_AA64MMFR2_CCIDX_SHIFT) & 0x7 )
+    {
+        ccsidr_numsets_shift = CCSIDR_NUMSETS_SHIFT_FEAT_CCIDX;
+        ccsidr_numsets_mask = CCSIDR_NUMSETS_MASK_FEAT_CCIDX;
+    }
+
+    /* Arm ARM: (Number of sets in cache) - 1 */
+    num_sets = ((ccsidr_el1 >> ccsidr_numsets_shift) & ccsidr_numsets_mask) + 1;
+
+    printk(XENLOG_INFO "LLC found: L%u (line size: %u bytes, sets num: %u)\n",
+           n, line_size, num_sets);
+
+    /* Restore value in CSSELR_EL1 */
+    WRITE_SYSREG(csselr_el1, CSSELR_EL1);
+    isb();
+
+    return line_size * num_sets;
+}
+
+void __init arch_llc_coloring_init(void) {}
+
+/*
+ * Local variables:
+ * mode: C
+ * c-file-style: "BSD"
+ * c-basic-offset: 4
+ * tab-width: 4
+ * indent-tabs-mode: nil
+ * End:
+ */
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 424744ad5e..c72c90302e 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -12,6 +12,7 @@
 #include <xen/device_tree.h>
 #include <xen/domain_page.h>
 #include <xen/grant_table.h>
+#include <xen/llc-coloring.h>
 #include <xen/types.h>
 #include <xen/string.h>
 #include <xen/serial.h>
@@ -746,6 +747,8 @@ void asmlinkage __init start_xen(unsigned long boot_phys_offset,
     printk("Command line: %s\n", cmdline);
     cmdline_parse(cmdline);
 
+    llc_coloring_init();
+
     setup_mm();
 
     vm_init();
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index db96a83ddd..51eae90ad5 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -8,7 +8,7 @@
 #include <xen/llc-coloring.h>
 #include <xen/param.h>
 
-static bool __ro_after_init llc_coloring_enabled;
+bool __ro_after_init llc_coloring_enabled;
 boolean_param("llc-coloring", llc_coloring_enabled);
 
 static unsigned int __initdata llc_size;
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index c60c8050c5..67b27c995b 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -11,10 +11,14 @@
 #include <public/domctl.h>
 
 #ifdef CONFIG_LLC_COLORING
+extern bool llc_coloring_enabled;
+
 void llc_coloring_init(void);
 void dump_llc_coloring_info(void);
 void domain_dump_llc_colors(const struct domain *d);
 #else
+#define llc_coloring_enabled false
+
 static inline void llc_coloring_init(void) {}
 static inline void dump_llc_coloring_info(void) {}
 static inline void domain_dump_llc_colors(const struct domain *d) {}
-- 
2.34.1




* [PATCH v7 03/14] xen/arm: permit non direct-mapped Dom0 construction
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 02/14] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-15 10:58 ` [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support Carlo Nonato
                   ` (10 subsequent siblings)
  13 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

Cache coloring requires Dom0 not to be direct-mapped because of its
non-contiguous mapping nature, so allocate_memory() is needed in this case.
8d2c3ab18cc1 ("arm/dom0less: put dom0less feature code in a separate module")
moved allocate_memory() to dom0less-build.c. In order to use it in Dom0
construction, bring it back to domain_build.c and declare it in
domain_build.h.

Take the opportunity to adapt the implementation of allocate_memory() so
that it uses the host layout when called on the hwdom, via
find_unallocated_memory().
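
For illustration only, the hwdom bank selection can be modelled in plain
C as below. This is a simplified sketch, not the code in this patch:
struct bank, MB() and assign_from_banks() are hypothetical stand-ins, and
every accepted bank is assumed to be satisfied in full, whereas the real
allocate_bank_memory() can fail.

```c
#include <stddef.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/* Hypothetical stand-in for one entry of a meminfo bank list. */
struct bank {
    uint64_t start;
    uint64_t size;
};

/*
 * Walk the free host banks, skip those smaller than
 * min(remaining, 128MB), and carve min(bank size, remaining) out of each
 * of the others. "out" must have room for "nr" entries.
 * Returns the amount of memory left unassigned.
 */
static uint64_t assign_from_banks(const struct bank *free_banks, size_t nr,
                                  uint64_t remaining, struct bank *out,
                                  size_t *nr_out)
{
    size_t i;

    *nr_out = 0;

    for ( i = 0; i < nr && remaining > 0; i++ )
    {
        uint64_t threshold = remaining < MB(128) ? remaining : MB(128);
        uint64_t size = free_banks[i].size;

        if ( size < threshold )
            continue;               /* bank too small to be worth using */

        if ( size > remaining )
            size = remaining;       /* never hand out more than requested */

        out[*nr_out].start = free_banks[i].start;
        out[*nr_out].size = size;
        (*nr_out)++;
        remaining -= size;
    }

    return remaining;
}
```

A non-zero return value corresponds to the case where the patch panics
because some memory could not be assigned.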

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- allocate_memory() now uses the host layout when called on the hwdom
v6:
- new patch
---
 xen/arch/arm/dom0less-build.c           | 43 -----------
 xen/arch/arm/domain_build.c             | 99 ++++++++++++++++++++++++-
 xen/arch/arm/include/asm/domain_build.h |  1 +
 3 files changed, 96 insertions(+), 47 deletions(-)

diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 1142f7f74a..992080e61a 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -49,49 +49,6 @@ bool __init is_dom0less_mode(void)
     return ( !dom0found && domUfound );
 }
 
-static void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
-{
-    unsigned int i;
-    paddr_t bank_size;
-
-    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
-           /* Don't want format this as PRIpaddr (16 digit hex) */
-           (unsigned long)(kinfo->unassigned_mem >> 20), d);
-
-    kinfo->mem.nr_banks = 0;
-    bank_size = MIN(GUEST_RAM0_SIZE, kinfo->unassigned_mem);
-    if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM0_BASE),
-                               bank_size) )
-        goto fail;
-
-    bank_size = MIN(GUEST_RAM1_SIZE, kinfo->unassigned_mem);
-    if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(GUEST_RAM1_BASE),
-                               bank_size) )
-        goto fail;
-
-    if ( kinfo->unassigned_mem )
-        goto fail;
-
-    for( i = 0; i < kinfo->mem.nr_banks; i++ )
-    {
-        printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
-               d,
-               i,
-               kinfo->mem.bank[i].start,
-               kinfo->mem.bank[i].start + kinfo->mem.bank[i].size,
-               /* Don't want format this as PRIpaddr (16 digit hex) */
-               (unsigned long)(kinfo->mem.bank[i].size >> 20));
-    }
-
-    return;
-
-fail:
-    panic("Failed to allocate requested domain memory."
-          /* Don't want format this as PRIpaddr (16 digit hex) */
-          " %ldKB unallocated. Fix the VMs configurations.\n",
-          (unsigned long)kinfo->unassigned_mem >> 10);
-}
-
 #ifdef CONFIG_VGICV2
 static int __init make_gicv2_domU_node(struct kernel_info *kinfo)
 {
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index 085d88671e..d21be2c57b 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -415,7 +415,6 @@ static void __init allocate_memory_11(struct domain *d,
     }
 }
 
-#ifdef CONFIG_DOM0LESS_BOOT
 bool __init allocate_bank_memory(struct domain *d, struct kernel_info *kinfo,
                                  gfn_t sgfn, paddr_t tot_size)
 {
@@ -477,7 +476,96 @@ bool __init allocate_bank_memory(struct domain *d, struct kernel_info *kinfo,
 
     return true;
 }
-#endif
+
+/* Forward declaration */
+static int __init find_unallocated_memory(const struct kernel_info *kinfo,
+                                          struct meminfo *ext_regions);
+
+void __init allocate_memory(struct domain *d, struct kernel_info *kinfo)
+{
+    unsigned int i = 0;
+    unsigned int nr_banks = 2;
+    paddr_t bank_start, bank_size;
+    struct meminfo *hwdom_free_mem = NULL;
+
+    printk(XENLOG_INFO "Allocating mappings totalling %ldMB for %pd:\n",
+           /* Don't want format this as PRIpaddr (16 digit hex) */
+           (unsigned long)(kinfo->unassigned_mem >> 20), d);
+
+    kinfo->mem.nr_banks = 0;
+    /*
+     * Use host memory layout for hwdom. Only case for this is when LLC coloring
+     * is enabled.
+     */
+    if ( is_hardware_domain(d) )
+    {
+        ASSERT(llc_coloring_enabled);
+
+        hwdom_free_mem = xzalloc(struct meminfo);
+        if ( !hwdom_free_mem )
+            goto fail;
+
+        if ( find_unallocated_memory(kinfo, hwdom_free_mem) )
+            goto fail;
+
+        nr_banks = hwdom_free_mem->nr_banks;
+    }
+
+    for ( ; kinfo->unassigned_mem > 0 && nr_banks > 0; i++, nr_banks-- )
+    {
+        if ( is_hardware_domain(d) )
+        {
+            bank_start = hwdom_free_mem->bank[i].start;
+            bank_size = hwdom_free_mem->bank[i].size;
+
+            if ( bank_size < min_t(paddr_t, kinfo->unassigned_mem, MB(128)) )
+                continue;
+        }
+        else
+        {
+            if ( i == 0 )
+            {
+                bank_start = GUEST_RAM0_BASE;
+                bank_size = GUEST_RAM0_SIZE;
+            }
+            else if ( i == 1 )
+            {
+                bank_start = GUEST_RAM1_BASE;
+                bank_size = GUEST_RAM1_SIZE;
+            }
+            else
+                goto fail;
+        }
+
+        bank_size = MIN(bank_size, kinfo->unassigned_mem);
+        if ( !allocate_bank_memory(d, kinfo, gaddr_to_gfn(bank_start),
+                                   bank_size) )
+            goto fail;
+    }
+
+    if ( kinfo->unassigned_mem )
+        goto fail;
+
+    for( i = 0; i < kinfo->mem.nr_banks; i++ )
+    {
+        printk(XENLOG_INFO "%pd BANK[%d] %#"PRIpaddr"-%#"PRIpaddr" (%ldMB)\n",
+               d,
+               i,
+               kinfo->mem.bank[i].start,
+               kinfo->mem.bank[i].start + kinfo->mem.bank[i].size,
+               /* Don't want format this as PRIpaddr (16 digit hex) */
+               (unsigned long)(kinfo->mem.bank[i].size >> 20));
+    }
+
+    xfree(hwdom_free_mem);
+    return;
+
+fail:
+    panic("Failed to allocate requested domain memory."
+          /* Don't want format this as PRIpaddr (16 digit hex) */
+          " %ldKB unallocated. Fix the VMs configurations.\n",
+          (unsigned long)kinfo->unassigned_mem >> 10);
+}
 
 /*
  * When PCI passthrough is available we want to keep the
@@ -1161,7 +1249,7 @@ int __init make_hypervisor_node(struct domain *d,
         if ( !ext_regions )
             return -ENOMEM;
 
-        if ( is_domain_direct_mapped(d) )
+        if ( domain_use_host_layout(d) )
         {
             if ( !is_iommu_enabled(d) )
                 res = find_unallocated_memory(kinfo, ext_regions);
@@ -2073,7 +2161,10 @@ static int __init construct_dom0(struct domain *d)
     /* type must be set before allocate_memory */
     d->arch.type = kinfo.type;
 #endif
-    allocate_memory_11(d, &kinfo);
+    if ( is_domain_direct_mapped(d) )
+        allocate_memory_11(d, &kinfo);
+    else
+        allocate_memory(d, &kinfo);
     find_gnttab_region(d, &kinfo);
 
     rc = process_shm_chosen(d, &kinfo);
diff --git a/xen/arch/arm/include/asm/domain_build.h b/xen/arch/arm/include/asm/domain_build.h
index da9e6025f3..b8e171e5cc 100644
--- a/xen/arch/arm/include/asm/domain_build.h
+++ b/xen/arch/arm/include/asm/domain_build.h
@@ -8,6 +8,7 @@ typedef __be32 gic_interrupt_t[3];
 
 bool allocate_bank_memory(struct domain *d, struct kernel_info *kinfo,
                           gfn_t sgfn, paddr_t tot_size);
+void allocate_memory(struct domain *d, struct kernel_info *kinfo);
 int construct_domain(struct domain *d, struct kernel_info *kinfo);
 int domain_fdt_begin_node(void *fdt, const char *name, uint64_t unit);
 int make_chosen_node(const struct kernel_info *kinfo);
-- 
2.34.1




* [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (2 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 03/14] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 15:30   ` Jan Beulich
  2024-03-19 15:45   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 05/14] xen: extend domctl interface for cache coloring Carlo Nonato
                   ` (9 subsequent siblings)
  13 siblings, 2 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Marco Solieri

Add a command line parameter to allow the user to set the coloring
configuration for Dom0.
A common configuration syntax for cache colors is introduced and
documented.
Take the opportunity to also add:
 - default configuration notion.
 - function to check well-formed configurations.
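
As a rough illustration of the syntax, a string such as "0,2-6,15-16"
expands to the colors 0,2,3,4,5,6,15,16. The user-space sketch below is
not the parser added by this patch: parse_colors() is a hypothetical
helper built on strtoul(), it reads decimal values only, and it rejects
an empty string. Like the syntax itself, it does not check for ordering
or duplicates.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: expand a comma-separated list of colors and
 * inclusive ranges into colors[], which holds up to max entries.
 * Returns 0 on success, -EINVAL on malformed or oversized input.
 */
static int parse_colors(const char *s, unsigned int *colors,
                        unsigned int max, unsigned int *num)
{
    *num = 0;

    for ( ; ; )
    {
        char *end;
        unsigned long first, last, i, count;

        if ( *s < '0' || *s > '9' )     /* also rejects leading commas */
            return -EINVAL;
        first = last = strtoul(s, &end, 10);
        s = end;

        if ( *s == '-' )                /* range: FIRST-LAST, inclusive */
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            last = strtoul(s, &end, 10);
            s = end;
        }

        if ( first > last || last > UINT_MAX ||
             last - first >= max - *num )
            return -EINVAL;

        count = last - first + 1;
        for ( i = 0; i < count; i++ )
            colors[(*num)++] = (unsigned int)(first + i);

        if ( *s == '\0' )
            return 0;
        if ( *s != ',' )
            return -EINVAL;
        s++;                            /* a trailing comma fails above */
    }
}
```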

Direct mapping Dom0 isn't possible when coloring is enabled, so the
CDF_directmap flag is dropped when creating it.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- parse_color_config() doesn't accept leading/trailing commas anymore
- removed alloc_colors() helper
v6:
- moved domain_llc_coloring_free() in this patch
- removed domain_alloc_colors() in favor of a more explicit allocation
- parse_color_config() now accepts the size of the array to be filled
- allocate_memory() moved in another patch
v5:
- Carlo Nonato as the new author
- moved dom0 colors parsing (parse_colors()) in this patch
- added dom0_set_llc_colors() to set dom0 colors after creation
- moved color allocation and checking in this patch
- error handling when allocating color arrays
- FIXME: copy pasted allocate_memory() because it got moved
v4:
- dom0 colors are dynamically allocated as for any other domain
  (colors are duplicated in dom0_colors and in the new array, but logic
  is simpler)
---
 docs/misc/cache-coloring.rst      |  29 +++++++
 docs/misc/xen-command-line.pandoc |   9 +++
 xen/arch/arm/domain_build.c       |  10 ++-
 xen/common/domain.c               |   3 +
 xen/common/llc-coloring.c         | 128 ++++++++++++++++++++++++++++++
 xen/include/xen/llc-coloring.h    |   3 +
 6 files changed, 181 insertions(+), 1 deletion(-)

diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 871e7a3ddb..4c859135cb 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -114,6 +114,35 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
 +----------------------+-------------------------------+
 | ``llc-nr-ways``      | set the LLC number of ways    |
 +----------------------+-------------------------------+
+| ``dom0-llc-colors``  | Dom0 color configuration      |
++----------------------+-------------------------------+
+
+Colors selection format
+***********************
+
+Regardless of the memory pool that has to be colored (Xen, Dom0/DomUs),
+the color selection can be expressed using the same syntax. In particular a
+comma-separated list of colors or ranges of colors is used.
+Ranges are hyphen-separated intervals (such as `0-4`) and are inclusive on both
+sides.
+
+Note that:
+
+- no spaces are allowed between values.
+- no overlapping ranges or duplicated colors are allowed.
+- values must be written in ascending order.
+
+Examples:
+
++-------------------+-----------------------------+
+| **Configuration** | **Actual selection**        |
++-------------------+-----------------------------+
+| 1-2,5-8           | [1, 2, 5, 6, 7, 8]          |
++-------------------+-----------------------------+
+| 4-8,10,11,12      | [4, 5, 6, 7, 8, 10, 11, 12] |
++-------------------+-----------------------------+
+| 0                 | [0]                         |
++-------------------+-----------------------------+
 
 Auto-probing of LLC specs
 #########################
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 2936abea2c..28035a214d 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
 
 Specify a list of IO ports to be excluded from dom0 access.
 
+### dom0-llc-colors
+> `= List of [ <integer> | <integer>-<integer> ]`
+
+> Default: `All available LLC colors`
+
+Specify dom0 LLC color configuration. This option is available only when
+`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
+colors are used.
+
 ### dom0_max_vcpus
 
 Either:
diff --git a/xen/arch/arm/domain_build.c b/xen/arch/arm/domain_build.c
index d21be2c57b..3de1659836 100644
--- a/xen/arch/arm/domain_build.c
+++ b/xen/arch/arm/domain_build.c
@@ -2,6 +2,7 @@
 #include <xen/init.h>
 #include <xen/compile.h>
 #include <xen/lib.h>
+#include <xen/llc-coloring.h>
 #include <xen/mm.h>
 #include <xen/param.h>
 #include <xen/domain_page.h>
@@ -2208,6 +2209,7 @@ void __init create_dom0(void)
         .max_maptrack_frames = -1,
         .grant_opts = XEN_DOMCTL_GRANT_version(opt_gnttab_max_version),
     };
+    unsigned int flags = CDF_privileged;
     int rc;
 
     /* The vGIC for DOM0 is exactly emulating the hardware GIC */
@@ -2235,10 +2237,16 @@ void __init create_dom0(void)
             panic("SVE vector length error\n");
     }
 
-    dom0 = domain_create(0, &dom0_cfg, CDF_privileged | CDF_directmap);
+    if ( !llc_coloring_enabled )
+        flags |= CDF_directmap;
+
+    dom0 = domain_create(0, &dom0_cfg, flags);
     if ( IS_ERR(dom0) )
         panic("Error creating domain 0 (rc = %ld)\n", PTR_ERR(dom0));
 
+    if ( llc_coloring_enabled && (rc = dom0_set_llc_colors(dom0)) )
+        panic("Error initializing LLC coloring for domain 0 (rc = %d)\n", rc);
+
     if ( alloc_dom0_vcpu0(dom0) == NULL )
         panic("Error creating domain 0 vcpu0\n");
 
diff --git a/xen/common/domain.c b/xen/common/domain.c
index f6f5574996..f144b54f4f 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -33,6 +33,7 @@
 #include <xen/xenoprof.h>
 #include <xen/irq.h>
 #include <xen/argo.h>
+#include <xen/llc-coloring.h>
 #include <asm/p2m.h>
 #include <asm/processor.h>
 #include <public/sched.h>
@@ -1208,6 +1209,8 @@ void domain_destroy(struct domain *d)
 
     BUG_ON(!d->is_dying);
 
+    domain_llc_coloring_free(d);
+
     /* May be already destroyed, or get_domain() can race us. */
     if ( atomic_cmpxchg(&d->refcnt, 0, DOMAIN_DESTROYED) != 0 )
         return;
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 51eae90ad5..ebd7087dc2 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -18,6 +18,63 @@ integer_param("llc-nr-ways", llc_nr_ways);
 /* Number of colors available in the LLC */
 static unsigned int __ro_after_init max_nr_colors;
 
+static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
+static unsigned int __initdata dom0_num_colors;
+
+/*
+ * Parse the coloring configuration given in the buf string, following the
+ * syntax below.
+ *
+ * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
+ * RANGE               ::= COLOR-COLOR
+ *
+ * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
+ */
+static int __init parse_color_config(const char *buf, unsigned int *colors,
+                                     unsigned int max_num_colors,
+                                     unsigned int *num_colors)
+{
+    const char *s = buf;
+
+    *num_colors = 0;
+
+    while ( *s != '\0' )
+    {
+        unsigned int color, start, end;
+
+        start = simple_strtoul(s, &s, 0);
+
+        if ( *s == '-' )    /* Range */
+        {
+            s++;
+            end = simple_strtoul(s, &s, 0);
+        }
+        else                /* Single value */
+            end = start;
+
+        if ( start > end || (end - start) > (UINT_MAX - *num_colors) ||
+             (*num_colors + (end - start)) >= max_num_colors )
+            return -EINVAL;
+
+        for ( color = start; color <= end; color++ )
+            colors[(*num_colors)++] = color;
+
+        if ( *s == ',' )
+            s++;
+        else if ( *s != '\0' )
+            break;
+    }
+
+    return *s ? -EINVAL : 0;
+}
+
+static int __init parse_dom0_colors(const char *s)
+{
+    return parse_color_config(s, dom0_colors, ARRAY_SIZE(dom0_colors),
+                              &dom0_num_colors);
+}
+custom_param("dom0-llc-colors", parse_dom0_colors);
+
 static void print_colors(const unsigned int *colors, unsigned int num_colors)
 {
     unsigned int i;
@@ -41,6 +98,22 @@ static void print_colors(const unsigned int *colors, unsigned int num_colors)
     printk(" }\n");
 }
 
+static bool check_colors(const unsigned int *colors, unsigned int num_colors)
+{
+    unsigned int i;
+
+    for ( i = 0; i < num_colors; i++ )
+    {
+        if ( colors[i] >= max_nr_colors )
+        {
+            printk(XENLOG_ERR "LLC color %u >= %u\n", colors[i], max_nr_colors);
+            return false;
+        }
+    }
+
+    return true;
+}
+
 void __init llc_coloring_init(void)
 {
     unsigned int way_size;
@@ -91,6 +164,61 @@ void cf_check domain_dump_llc_colors(const struct domain *d)
     print_colors(d->llc_colors, d->num_llc_colors);
 }
 
+static int domain_set_default_colors(struct domain *d)
+{
+    unsigned int *colors = xmalloc_array(unsigned int, max_nr_colors);
+    unsigned int i;
+
+    if ( !colors )
+        return -ENOMEM;
+
+    printk(XENLOG_WARNING
+           "LLC color config not found for %pd, using all colors\n", d);
+
+    for ( i = 0; i < max_nr_colors; i++ )
+        colors[i] = i;
+
+    d->llc_colors = colors;
+    d->num_llc_colors = max_nr_colors;
+
+    return 0;
+}
+
+int __init dom0_set_llc_colors(struct domain *d)
+{
+    unsigned int *colors;
+
+    if ( !dom0_num_colors )
+        return domain_set_default_colors(d);
+
+    if ( !check_colors(dom0_colors, dom0_num_colors) )
+    {
+        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
+        return -EINVAL;
+    }
+
+    colors = xmalloc_array(unsigned int, dom0_num_colors);
+    if ( !colors )
+        return -ENOMEM;
+
+    /* Static type checking */
+    (void)(colors == dom0_colors);
+    memcpy(colors, dom0_colors, sizeof(*colors) * dom0_num_colors);
+    d->llc_colors = colors;
+    d->num_llc_colors = dom0_num_colors;
+
+    return 0;
+}
+
+void domain_llc_coloring_free(struct domain *d)
+{
+    if ( !llc_coloring_enabled )
+        return;
+
+    /* free pointer-to-const using __va(__pa()) */
+    xfree(__va(__pa(d->llc_colors)));
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index 67b27c995b..ee82932266 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -16,16 +16,19 @@ extern bool llc_coloring_enabled;
 void llc_coloring_init(void);
 void dump_llc_coloring_info(void);
 void domain_dump_llc_colors(const struct domain *d);
+void domain_llc_coloring_free(struct domain *d);
 #else
 #define llc_coloring_enabled false
 
 static inline void llc_coloring_init(void) {}
 static inline void dump_llc_coloring_info(void) {}
 static inline void domain_dump_llc_colors(const struct domain *d) {}
+static inline void domain_llc_coloring_free(struct domain *d) {}
 #endif
 
 unsigned int get_llc_way_size(void);
 void arch_llc_coloring_init(void);
+int dom0_set_llc_colors(struct domain *d);
 
 #endif /* __COLORING_H__ */
 
-- 
2.34.1




* [PATCH v7 05/14] xen: extend domctl interface for cache coloring
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (3 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 15:37   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 06/14] tools: add support for cache coloring configuration Carlo Nonato
                   ` (8 subsequent siblings)
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri

Add a new domctl hypercall to allow the user to set LLC coloring
configurations. Colors can be set only once, just after domain creation,
since recoloring isn't supported.
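
The "set only once" rule and the order of the input checks can be
modelled in user space roughly as follows. This is a sketch under stated
assumptions, not the hypervisor code: struct dom and set_colors_once()
are hypothetical names, calloc()/free() stand in for the Xen allocator,
the request is already a local array rather than a guest handle, and
max_colors is assumed to be non-zero.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the coloring fields of struct domain. */
struct dom {
    unsigned int *colors;
    unsigned int num_colors;
};

static int set_colors_once(struct dom *d, const uint32_t *req, uint32_t num,
                           uint32_t pad, unsigned int max_colors)
{
    unsigned int n = num ? num : max_colors;
    unsigned int *colors, i;

    if ( !max_colors )
        return -EINVAL;

    if ( d->num_colors )
        return -EEXIST;                 /* already colored: no recoloring */

    /* Padding and bounds are only checked for a non-empty request. */
    if ( num && (num > max_colors || pad) )
        return -EINVAL;

    colors = calloc(n, sizeof(*colors));
    if ( !colors )
        return -ENOMEM;

    for ( i = 0; i < n; i++ )
    {
        colors[i] = num ? req[i] : i;   /* empty request: use all colors */
        if ( colors[i] >= max_colors )
        {
            free(colors);               /* release the buffer on error */
            return -EINVAL;
        }
    }

    d->colors = colors;
    d->num_colors = n;

    return 0;
}
```

Note that the sketch frees the temporary array on its error path before
returning.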

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- -EOPNOTSUPP returned in case of hypercall called without llc_coloring_enabled
- domain_set_llc_colors_domctl() renamed to domain_set_llc_colors()
- added padding and input bound checks to domain_set_llc_colors()
- removed alloc_colors() helper usage from domain_set_llc_colors()
v6:
- reverted the XEN_DOMCTL_INTERFACE_VERSION bump
- reverted to uint32 for the guest handle
- explicit padding added to the domctl struct
- rewrote domain_set_llc_colors_domctl() to be more explicit
v5:
- added a new hypercall to set colors
- uint for the guest handle
v4:
- updated XEN_DOMCTL_INTERFACE_VERSION
---
 xen/common/domctl.c            |  8 ++++++++
 xen/common/llc-coloring.c      | 34 ++++++++++++++++++++++++++++++++++
 xen/include/public/domctl.h    |  9 +++++++++
 xen/include/xen/llc-coloring.h |  2 ++
 4 files changed, 53 insertions(+)

diff --git a/xen/common/domctl.c b/xen/common/domctl.c
index f5a71ee5f7..6c940ac833 100644
--- a/xen/common/domctl.c
+++ b/xen/common/domctl.c
@@ -8,6 +8,7 @@
 
 #include <xen/types.h>
 #include <xen/lib.h>
+#include <xen/llc-coloring.h>
 #include <xen/err.h>
 #include <xen/mm.h>
 #include <xen/sched.h>
@@ -858,6 +859,13 @@ long do_domctl(XEN_GUEST_HANDLE_PARAM(xen_domctl_t) u_domctl)
                 __HYPERVISOR_domctl, "h", u_domctl);
         break;
 
+    case XEN_DOMCTL_set_llc_colors:
+        if ( llc_coloring_enabled )
+            ret = domain_set_llc_colors(d, &op->u.set_llc_colors);
+        else
+            ret = -EOPNOTSUPP;
+        break;
+
     default:
         ret = arch_do_domctl(op, d, u_domctl);
         break;
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index ebd7087dc2..9c1f152b96 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -4,6 +4,7 @@
  *
  * Copyright (C) 2022 Xilinx Inc.
  */
+#include <xen/guest_access.h>
 #include <xen/keyhandler.h>
 #include <xen/llc-coloring.h>
 #include <xen/param.h>
@@ -219,6 +220,39 @@ void domain_llc_coloring_free(struct domain *d)
     xfree(__va(__pa(d->llc_colors)));
 }
 
+int domain_set_llc_colors(struct domain *d,
+                          const struct xen_domctl_set_llc_colors *config)
+{
+    unsigned int *colors;
+
+    if ( d->num_llc_colors )
+        return -EEXIST;
+
+    if ( !config->num_llc_colors )
+        return domain_set_default_colors(d);
+
+    if ( config->num_llc_colors > max_nr_colors || config->pad )
+        return -EINVAL;
+
+    colors = xmalloc_array(unsigned int, config->num_llc_colors);
+    if ( !colors )
+        return -ENOMEM;
+
+    if ( copy_from_guest(colors, config->llc_colors, config->num_llc_colors) )
+        return -EFAULT;
+
+    if ( !check_colors(colors, config->num_llc_colors) )
+    {
+        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
+        return -EINVAL;
+    }
+
+    d->llc_colors = colors;
+    d->num_llc_colors = config->num_llc_colors;
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/public/domctl.h b/xen/include/public/domctl.h
index a33f9ec32b..d44eac8775 100644
--- a/xen/include/public/domctl.h
+++ b/xen/include/public/domctl.h
@@ -1190,6 +1190,13 @@ struct xen_domctl_vmtrace_op {
 typedef struct xen_domctl_vmtrace_op xen_domctl_vmtrace_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_domctl_vmtrace_op_t);
 
+struct xen_domctl_set_llc_colors {
+    /* IN LLC coloring parameters */
+    uint32_t num_llc_colors;
+    uint32_t pad;
+    XEN_GUEST_HANDLE_64(uint32) llc_colors;
+};
+
 struct xen_domctl {
     uint32_t cmd;
 #define XEN_DOMCTL_createdomain                   1
@@ -1277,6 +1284,7 @@ struct xen_domctl {
 #define XEN_DOMCTL_vmtrace_op                    84
 #define XEN_DOMCTL_get_paging_mempool_size       85
 #define XEN_DOMCTL_set_paging_mempool_size       86
+#define XEN_DOMCTL_set_llc_colors                87
 #define XEN_DOMCTL_gdbsx_guestmemio            1000
 #define XEN_DOMCTL_gdbsx_pausevcpu             1001
 #define XEN_DOMCTL_gdbsx_unpausevcpu           1002
@@ -1339,6 +1347,7 @@ struct xen_domctl {
         struct xen_domctl_vuart_op          vuart_op;
         struct xen_domctl_vmtrace_op        vmtrace_op;
         struct xen_domctl_paging_mempool    paging_mempool;
+        struct xen_domctl_set_llc_colors    set_llc_colors;
         uint8_t                             pad[128];
     } u;
 };
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index ee82932266..b3801fca00 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -29,6 +29,8 @@ static inline void domain_llc_coloring_free(struct domain *d) {}
 unsigned int get_llc_way_size(void);
 void arch_llc_coloring_init(void);
 int dom0_set_llc_colors(struct domain *d);
+int domain_set_llc_colors(struct domain *d,
+                          const struct xen_domctl_set_llc_colors *config);
 
 #endif /* __COLORING_H__ */
 
-- 
2.34.1




* [PATCH v7 06/14] tools: add support for cache coloring configuration
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (4 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 05/14] xen: extend domctl interface for cache coloring Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-25 10:55   ` Anthony PERARD
  2024-03-15 10:58 ` [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
                   ` (7 subsequent siblings)
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Wei Liu, Anthony PERARD, Juergen Gross, Marco Solieri

Add a new "llc_colors" parameter that defines the LLC color assignment for
a domain. The user can specify one or more color ranges using the same
syntax used everywhere else for color config described in the
documentation.
The parameter is defined as a list of strings that represent the color
ranges.

Documentation is also added.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- removed unneeded NULL check before xc_hypercall_buffer_free() in
  xc_domain_set_llc_colors()
v6:
- no edits
v5:
- added LIBXL_HAVE_BUILDINFO_LLC_COLORS
- moved color configuration in xc_domain_set_llc_colors() because of the
  new hypercall
v4:
- removed overlapping color ranges checks during parsing
- moved hypercall buffer initialization in libxenctrl
---
 docs/man/xl.cfg.5.pod.in         | 10 +++++++++
 tools/include/libxl.h            |  5 +++++
 tools/include/xenctrl.h          |  9 ++++++++
 tools/libs/ctrl/xc_domain.c      | 35 +++++++++++++++++++++++++++++
 tools/libs/light/libxl_create.c  |  9 ++++++++
 tools/libs/light/libxl_types.idl |  1 +
 tools/xl/xl_parse.c              | 38 +++++++++++++++++++++++++++++++-
 7 files changed, 106 insertions(+), 1 deletion(-)

diff --git a/docs/man/xl.cfg.5.pod.in b/docs/man/xl.cfg.5.pod.in
index 039e057318..941de07408 100644
--- a/docs/man/xl.cfg.5.pod.in
+++ b/docs/man/xl.cfg.5.pod.in
@@ -3070,6 +3070,16 @@ raised.
 
 =back
 
+=over 4
+
+=item B<llc_colors=[ "RANGE", "RANGE", ...]>
+
+Specify the Last Level Cache (LLC) color configuration for the guest.
+B<RANGE> can be either a single color value or a hyphen-separated closed
+interval of colors (such as "0-4").
+
+=back
+
 =head3 x86
 
 =over 4
diff --git a/tools/include/libxl.h b/tools/include/libxl.h
index 62cb07dea6..49521e5da4 100644
--- a/tools/include/libxl.h
+++ b/tools/include/libxl.h
@@ -1368,6 +1368,11 @@ void libxl_mac_copy(libxl_ctx *ctx, libxl_mac *dst, const libxl_mac *src);
  */
 #define LIBXL_HAVE_BUILDINFO_HVM_SYSTEM_FIRMWARE
 
+/*
+ * The libxl_domain_build_info has the llc_colors array.
+ */
+#define LIBXL_HAVE_BUILDINFO_LLC_COLORS 1
+
 /*
  * ERROR_REMUS_XXX error code only exists from Xen 4.5, Xen 4.6 and it
  * is changed to ERROR_CHECKPOINT_XXX in Xen 4.7
diff --git a/tools/include/xenctrl.h b/tools/include/xenctrl.h
index 2ef8b4e054..2c2a5c4bd4 100644
--- a/tools/include/xenctrl.h
+++ b/tools/include/xenctrl.h
@@ -2653,6 +2653,15 @@ int xc_livepatch_replace(xc_interface *xch, char *name, uint32_t timeout, uint32
 int xc_domain_cacheflush(xc_interface *xch, uint32_t domid,
                          xen_pfn_t start_pfn, xen_pfn_t nr_pfns);
 
+/*
+ * Set LLC colors for a domain.
+ * It can only be used directly after domain creation. An attempt to use it
+ * afterwards will result in an error.
+ */
+int xc_domain_set_llc_colors(xc_interface *xch, uint32_t domid,
+                             const unsigned int *llc_colors,
+                             unsigned int num_llc_colors);
+
 #if defined(__arm__) || defined(__aarch64__)
 int xc_dt_overlay(xc_interface *xch, void *overlay_fdt,
                   uint32_t overlay_fdt_size, uint8_t overlay_op);
diff --git a/tools/libs/ctrl/xc_domain.c b/tools/libs/ctrl/xc_domain.c
index f2d9d14b4d..d315cfa6c1 100644
--- a/tools/libs/ctrl/xc_domain.c
+++ b/tools/libs/ctrl/xc_domain.c
@@ -2180,6 +2180,41 @@ int xc_domain_soft_reset(xc_interface *xch,
     domctl.domain = domid;
     return do_domctl(xch, &domctl);
 }
+
+int xc_domain_set_llc_colors(xc_interface *xch, uint32_t domid,
+                             const unsigned int *llc_colors,
+                             unsigned int num_llc_colors)
+{
+    struct xen_domctl domctl = {};
+    DECLARE_HYPERCALL_BUFFER(uint32_t, local);
+    int ret = -1;
+
+    if ( num_llc_colors )
+    {
+        size_t bytes = sizeof(uint32_t) * num_llc_colors;
+
+        local = xc_hypercall_buffer_alloc(xch, local, bytes);
+        if ( local == NULL )
+        {
+            PERROR("Could not allocate LLC colors for set_llc_colors");
+            ret = -ENOMEM;
+            goto out;
+        }
+        memcpy(local, llc_colors, bytes);
+        set_xen_guest_handle(domctl.u.set_llc_colors.llc_colors, local);
+    }
+
+    domctl.cmd = XEN_DOMCTL_set_llc_colors;
+    domctl.domain = domid;
+    domctl.u.set_llc_colors.num_llc_colors = num_llc_colors;
+
+    ret = do_domctl(xch, &domctl);
+
+out:
+    xc_hypercall_buffer_free(xch, local);
+
+    return ret;
+}
 /*
  * Local variables:
  * mode: C
diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
index 5546335973..79f206f616 100644
--- a/tools/libs/light/libxl_create.c
+++ b/tools/libs/light/libxl_create.c
@@ -726,6 +726,15 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
             /* A new domain now exists */
             *domid = local_domid;
 
+            ret = xc_domain_set_llc_colors(ctx->xch, local_domid,
+                                           b_info->llc_colors,
+                                           b_info->num_llc_colors);
+            if (ret < 0 && errno != EOPNOTSUPP) {
+                LOGED(ERROR, local_domid, "LLC colors allocation failed");
+                rc = ERROR_FAIL;
+                goto out;
+            }
+
             rc = libxl__is_domid_recent(gc, local_domid, &recent);
             if (rc)
                 goto out;
diff --git a/tools/libs/light/libxl_types.idl b/tools/libs/light/libxl_types.idl
index 470122e768..79118e1582 100644
--- a/tools/libs/light/libxl_types.idl
+++ b/tools/libs/light/libxl_types.idl
@@ -616,6 +616,7 @@ libxl_domain_build_info = Struct("domain_build_info",[
     ("ioports",          Array(libxl_ioport_range, "num_ioports")),
     ("irqs",             Array(uint32, "num_irqs")),
     ("iomem",            Array(libxl_iomem_range, "num_iomem")),
+    ("llc_colors",       Array(uint32, "num_llc_colors")),
     ("claim_mode",	     libxl_defbool),
     ("event_channels",   uint32),
     ("kernel",           string),
diff --git a/tools/xl/xl_parse.c b/tools/xl/xl_parse.c
index 80ffe85f5e..aa9623a6b9 100644
--- a/tools/xl/xl_parse.c
+++ b/tools/xl/xl_parse.c
@@ -1294,7 +1294,7 @@ void parse_config_data(const char *config_source,
     XLU_ConfigList *cpus, *vbds, *nics, *pcis, *cvfbs, *cpuids, *vtpms,
                    *usbctrls, *usbdevs, *p9devs, *vdispls, *pvcallsifs_devs;
     XLU_ConfigList *channels, *ioports, *irqs, *iomem, *viridian, *dtdevs,
-                   *mca_caps, *smbios;
+                   *mca_caps, *smbios, *llc_colors;
     int num_ioports, num_irqs, num_iomem, num_cpus, num_viridian, num_mca_caps;
     int num_smbios;
     int pci_power_mgmt = 0;
@@ -1302,6 +1302,7 @@ void parse_config_data(const char *config_source,
     int pci_permissive = 0;
     int pci_seize = 0;
     int i, e;
+    int num_llc_colors;
     char *kernel_basename;
 
     libxl_domain_create_info *c_info = &d_config->c_info;
@@ -1445,6 +1446,41 @@ void parse_config_data(const char *config_source,
     if (!xlu_cfg_get_long (config, "maxmem", &l, 0))
         b_info->max_memkb = l * 1024;
 
+    if (!xlu_cfg_get_list(config, "llc_colors", &llc_colors, &num_llc_colors, 0)) {
+        int cur_index = 0;
+
+        b_info->num_llc_colors = 0;
+        for (i = 0; i < num_llc_colors; i++) {
+            uint32_t start = 0, end = 0, k;
+
+            buf = xlu_cfg_get_listitem(llc_colors, i);
+            if (!buf) {
+                fprintf(stderr,
+                        "xl: Can't get element %d in LLC color list\n", i);
+                exit(1);
+            }
+
+            if (sscanf(buf, "%" SCNu32 "-%" SCNu32, &start, &end) != 2) {
+                if (sscanf(buf, "%" SCNu32, &start) != 1) {
+                    fprintf(stderr, "xl: Invalid LLC color range: %s\n", buf);
+                    exit(1);
+                }
+                end = start;
+            } else if (start > end) {
+                fprintf(stderr,
+                        "xl: Start LLC color is greater than end: %s\n", buf);
+                exit(1);
+            }
+
+            b_info->num_llc_colors += (end - start) + 1;
+            b_info->llc_colors = xrealloc(b_info->llc_colors,
+                        sizeof(*b_info->llc_colors) * b_info->num_llc_colors);
+
+            for (k = start; k <= end; k++)
+                b_info->llc_colors[cur_index++] = k;
+        }
+    }
+
     if (!xlu_cfg_get_long (config, "vcpus", &l, 0)) {
         vcpus = l;
         if (libxl_cpu_bitmap_alloc(ctx, &b_info->avail_vcpus, l)) {
-- 
2.34.1




* [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (5 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 06/14] tools: add support for cache coloring configuration Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 15:41   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
                   ` (6 subsequent siblings)
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Andrew Cooper, George Dunlap,
	Jan Beulich, Wei Liu, Marco Solieri

Add the "llc-colors" Device Tree attribute to express DomUs and Dom0less
color configurations.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- removed alloc_colors() helper usage from domain_set_llc_colors_from_str()
v6:
- rewrote domain_set_llc_colors_from_str() to be more explicit
v5:
- static-mem check has been moved in a previous patch
- added domain_set_llc_colors_from_str() to set colors after domain creation
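
For readers unfamiliar with the "llc-colors" string syntax (comma-separated
colors or closed ranges, e.g. "4-8,10,11,12"), the expansion into a flat color
array can be sketched as below. This is a stand-alone illustration, not Xen's
parse_color_config(): the name expand_color_string() and its error handling
are assumptions made for the example, and no check against the platform's
number of colors is done.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of this series): expand a string such as
 * "4-8,10,11,12" into colors[]. Returns 0 and stores the count in *num,
 * or -EINVAL on malformed input or when more than max colors are needed.
 */
static int expand_color_string(const char *s, unsigned int *colors,
                               unsigned int max, unsigned int *num)
{
    *num = 0;

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop, n;

        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = stop = strtoul(s, &end, 10);
        s = end;

        if ( *s == '-' )
        {
            /* Closed range "start-stop". */
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            stop = strtoul(s, &end, 10);
            s = end;
        }

        /* Reject reversed, oversized or overflowing ranges. */
        if ( start > stop || stop > UINT_MAX || stop - start >= max - *num )
            return -EINVAL;

        for ( n = stop - start + 1; n; n-- )
            colors[(*num)++] = start++;

        if ( *s == ',' )
        {
            s++;
            if ( *s == '\0' )
                return -EINVAL; /* trailing comma */
        }
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return *num ? 0 : -EINVAL;
}
```

With the string from the documentation example, "4-8,10,11,12", the helper
fills the array with the eight colors 4, 5, 6, 7, 8, 10, 11 and 12.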
---
 docs/misc/arm/device-tree/booting.txt |  4 +++
 docs/misc/cache-coloring.rst          | 48 +++++++++++++++++++++++++++
 xen/arch/arm/dom0less-build.c         | 10 ++++++
 xen/common/llc-coloring.c             | 31 +++++++++++++++++
 xen/include/xen/llc-coloring.h        |  1 +
 5 files changed, 94 insertions(+)

diff --git a/docs/misc/arm/device-tree/booting.txt b/docs/misc/arm/device-tree/booting.txt
index bbd955e9c2..bbe49faadc 100644
--- a/docs/misc/arm/device-tree/booting.txt
+++ b/docs/misc/arm/device-tree/booting.txt
@@ -162,6 +162,10 @@ with the following properties:
 
     An integer specifying the number of vcpus to allocate to the guest.
 
+- llc-colors
+    A string specifying the LLC color configuration for the guest.
+    Refer to docs/misc/cache-coloring.rst for syntax.
+
 - vpl011
 
     An empty property to enable/disable a virtual pl011 for the guest to
diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 4c859135cb..028aecda28 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -12,6 +12,7 @@ If needed, change the maximum number of colors with
 ``CONFIG_NR_LLC_COLORS=<n>``.
 
 Runtime configuration is done via `Command line parameters`_.
+For DomUs follow `DomUs configuration`_.
 
 Background
 **********
@@ -156,6 +157,53 @@ LLC specs can be manually set via the above command line parameters. This
 bypasses any auto-probing and it's used to overcome failing situations or for
 debugging/testing purposes.
 
+DomUs configuration
+*******************
+
+DomU colors can be set either in the ``xl`` configuration file (documentation
+at `docs/man/xl.cfg.5.pod.in`) or via Device Tree (documentation at
+`docs/misc/arm/device-tree/booting.txt`), including for Dom0less
+configurations, using the ``llc-colors`` option. For example:
+
+::
+
+    xen,xen-bootargs = "console=dtuart dtuart=serial0 dom0_mem=1G dom0_max_vcpus=1 sched=null llc-coloring=on dom0-llc-colors=2-6";
+    xen,dom0-bootargs = "console=hvc0 earlycon=xen earlyprintk=xen root=/dev/ram0";
+
+    dom0 {
+        compatible = "xen,linux-zimage", "xen,multiboot-module";
+        reg = <0x0 0x1000000 0x0 15858176>;
+    };
+
+    dom0-ramdisk {
+        compatible = "xen,linux-initrd", "xen,multiboot-module";
+        reg = <0x0 0x2000000 0x0 20638062>;
+    };
+
+    domU0 {
+        #address-cells = <0x1>;
+        #size-cells = <0x1>;
+        compatible = "xen,domain";
+        memory = <0x0 0x40000>;
+        llc-colors = "4-8,10,11,12";
+        cpus = <0x1>;
+        vpl011 = <0x1>;
+
+        module@2000000 {
+            compatible = "multiboot,kernel", "multiboot,module";
+            reg = <0x2000000 0xffffff>;
+            bootargs = "console=ttyAMA0";
+        };
+
+        module@30000000 {
+            compatible = "multiboot,ramdisk", "multiboot,module";
+            reg = <0x3000000 0xffffff>;
+        };
+    };
+
+**Note:** If no color configuration is provided for a domain, the default one,
+which corresponds to all available colors, is used instead.
+
 Known issues and limitations
 ****************************
 
diff --git a/xen/arch/arm/dom0less-build.c b/xen/arch/arm/dom0less-build.c
index 992080e61a..f7ac9b9900 100644
--- a/xen/arch/arm/dom0less-build.c
+++ b/xen/arch/arm/dom0less-build.c
@@ -807,6 +807,7 @@ void __init create_domUs(void)
     struct dt_device_node *node;
     const struct dt_device_node *cpupool_node,
                                 *chosen = dt_find_node_by_path("/chosen");
+    const char *llc_colors_str = NULL;
 
     BUG_ON(chosen == NULL);
     dt_for_each_child_node(chosen, node)
@@ -950,6 +951,10 @@ void __init create_domUs(void)
 #endif
         }
 
+        dt_property_read_string(node, "llc-colors", &llc_colors_str);
+        if ( !llc_coloring_enabled && llc_colors_str )
+            panic("'llc-colors' found, but LLC coloring is disabled\n");
+
         /*
          * The variable max_init_domid is initialized with zero, so here it's
          * very important to use the pre-increment operator to call
@@ -960,6 +965,11 @@ void __init create_domUs(void)
             panic("Error creating domain %s (rc = %ld)\n",
                   dt_node_name(node), PTR_ERR(d));
 
+        if ( llc_coloring_enabled &&
+             (rc = domain_set_llc_colors_from_str(d, llc_colors_str)) )
+            panic("Error initializing LLC coloring for domain %s (rc = %d)\n",
+                  dt_node_name(node), rc);
+
         d->is_console = true;
         dt_device_set_used_by(node, d->domain_id);
 
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 9c1f152b96..77d24553e0 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -253,6 +253,37 @@ int domain_set_llc_colors(struct domain *d,
     return 0;
 }
 
+int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
+{
+    int err;
+    unsigned int *colors, num_colors;
+
+    if ( !str )
+        return domain_set_default_colors(d);
+
+    colors = xmalloc_array(unsigned int, max_nr_colors);
+    if ( !colors )
+        return -ENOMEM;
+
+    err = parse_color_config(str, colors, max_nr_colors, &num_colors);
+    if ( err )
+    {
+        printk(XENLOG_ERR "Error parsing LLC color configuration\n");
+        return err;
+    }
+
+    if ( !check_colors(colors, num_colors) )
+    {
+        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
+        return -EINVAL;
+    }
+
+    d->llc_colors = colors;
+    d->num_llc_colors = num_colors;
+
+    return 0;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index b3801fca00..49ebd1e712 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -31,6 +31,7 @@ void arch_llc_coloring_init(void);
 int dom0_set_llc_colors(struct domain *d);
 int domain_set_llc_colors(struct domain *d,
                           const struct xen_domctl_set_llc_colors *config);
+int domain_set_llc_colors_from_str(struct domain *d, const char *str);
 
 #endif /* __COLORING_H__ */
 
-- 
2.34.1




* [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (6 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 15:47   ` Jan Beulich
  2024-03-21 16:07   ` Julien Grall
  2024-03-15 10:58 ` [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging Carlo Nonato
                   ` (5 subsequent siblings)
  13 siblings, 2 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

PGC_static and PGC_extra need to be preserved when assigning a page.
Define a new macro that groups those flags and use it instead of or'ing
every time.

To make preserved flags even more meaningful, they are also kept when
switching state in mark_page_free().

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- PGC_preserved used also in mark_page_free()
v6:
- preserved_flags renamed to PGC_preserved
- PGC_preserved is used only in assign_pages()
v5:
- new patch
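
As a rough illustration of what "preserved" means here, the sketch below
mimics the two count_info updates touched by this patch. The DEMO_* bit
positions and the demo_*() helpers are made up for the example and do not
match Xen's real PGC_* layout.

```c
/* Hypothetical layout: low bits hold the refcount, flags start at bit 8. */
#define DEMO_PGC_allocated  (1UL << 8)
#define DEMO_PGC_extra      (1UL << 9)
#define DEMO_PGC_static     (1UL << 10)
#define DEMO_PGC_state_free (1UL << 11)

/* Single mask grouping the flags that must survive state changes. */
#define DEMO_PGC_preserved  (DEMO_PGC_extra | DEMO_PGC_static)

/* Mirrors the assign_pages() update: keep preserved flags, take one ref. */
static unsigned long demo_assign(unsigned long count_info)
{
    return (count_info & DEMO_PGC_preserved) | DEMO_PGC_allocated | 1;
}

/* Mirrors the PGC_state_inuse case of mark_page_free(). */
static unsigned long demo_mark_free(unsigned long count_info)
{
    return DEMO_PGC_state_free | (count_info & DEMO_PGC_preserved);
}
```

In both helpers everything outside the preserved mask is dropped, so adding a
flag to the mask is enough to carry it through both paths.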
---
 xen/common/page_alloc.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index c38edb9a58..6a98d9013f 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -158,6 +158,8 @@
 #define PGC_static 0
 #endif
 
+#define PGC_preserved (PGC_extra | PGC_static)
+
 #ifndef PGT_TYPE_INFO_INITIALIZER
 #define PGT_TYPE_INFO_INITIALIZER 0
 #endif
@@ -1424,11 +1426,11 @@ static bool mark_page_free(struct page_info *pg, mfn_t mfn)
     {
     case PGC_state_inuse:
         BUG_ON(pg->count_info & PGC_broken);
-        pg->count_info = PGC_state_free;
+        pg->count_info = PGC_state_free | (pg->count_info & PGC_preserved);
         break;
 
     case PGC_state_offlining:
-        pg->count_info = (pg->count_info & PGC_broken) |
+        pg->count_info = (pg->count_info & (PGC_broken | PGC_preserved)) |
                          PGC_state_offlined;
         pg_offlined = true;
         break;
@@ -2363,7 +2365,7 @@ int assign_pages(
 
         for ( i = 0; i < nr; i++ )
         {
-            ASSERT(!(pg[i].count_info & ~(PGC_extra | PGC_static)));
+            ASSERT(!(pg[i].count_info & ~PGC_preserved));
             if ( pg[i].count_info & PGC_extra )
                 extra_pages++;
         }
@@ -2423,7 +2425,7 @@ int assign_pages(
         page_set_owner(&pg[i], d);
         smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
         pg[i].count_info =
-            (pg[i].count_info & (PGC_extra | PGC_static)) | PGC_allocated | 1;
+            (pg[i].count_info & PGC_preserved) | PGC_allocated | 1;
 
         page_list_add_tail(&pg[i], page_to_list(d, &pg[i]));
     }
-- 
2.34.1




* [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (7 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 15:49   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 10/14] xen: add cache coloring allocator for domains Carlo Nonato
                   ` (4 subsequent siblings)
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu

Add a new PGC_no_buddy_merge flag that prevents the buddy algorithm in
free_heap_pages() from merging pages that have it set. As of now, only
PGC_static has this feature, but future work can extend it more easily than
before.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- new patch
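
The merge condition changed by this patch can be pictured with the simplified
predicate below. struct demo_page, the DEMO_* flag and demo_can_merge() are
assumptions for illustration only; the real check in free_heap_pages() also
validates the neighbour's MFN and works on struct page_info.

```c
#include <stdbool.h>

/* Hypothetical flag bit, for illustration only. */
#define DEMO_PGC_static         (1UL << 0)
/* One mask decides which pages never take part in buddy merging. */
#define DEMO_PGC_no_buddy_merge DEMO_PGC_static

struct demo_page {
    unsigned long count_info;
    bool free;
    unsigned int order;
    unsigned int node;
};

/* True if the neighbouring block may be merged with a block of this order. */
static bool demo_can_merge(const struct demo_page *neighbour,
                           unsigned int order, unsigned int node)
{
    return neighbour->free &&
           !(neighbour->count_info & DEMO_PGC_no_buddy_merge) &&
           neighbour->order == order &&
           neighbour->node == node;
}
```

Extending the mask with another flag excludes those pages from merging
without touching the predicate itself.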
---
 xen/common/page_alloc.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 6a98d9013f..3adea713b7 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -159,6 +159,7 @@
 #endif
 
 #define PGC_preserved (PGC_extra | PGC_static)
+#define PGC_no_buddy_merge PGC_static
 
 #ifndef PGT_TYPE_INFO_INITIALIZER
 #define PGT_TYPE_INFO_INITIALIZER 0
@@ -1504,7 +1505,7 @@ static void free_heap_pages(
             /* Merge with predecessor block? */
             if ( !mfn_valid(page_to_mfn(predecessor)) ||
                  !page_state_is(predecessor, free) ||
-                 (predecessor->count_info & PGC_static) ||
+                 (predecessor->count_info & PGC_no_buddy_merge) ||
                  (PFN_ORDER(predecessor) != order) ||
                  (page_to_nid(predecessor) != node) )
                 break;
@@ -1528,7 +1529,7 @@ static void free_heap_pages(
             /* Merge with successor block? */
             if ( !mfn_valid(page_to_mfn(successor)) ||
                  !page_state_is(successor, free) ||
-                 (successor->count_info & PGC_static) ||
+                 (successor->count_info & PGC_no_buddy_merge) ||
                  (PFN_ORDER(successor) != order) ||
                  (page_to_nid(successor) != node) )
                 break;
-- 
2.34.1




* [PATCH v7 10/14] xen: add cache coloring allocator for domains
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (8 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-19 16:43   ` Jan Beulich
  2024-03-15 10:58 ` [PATCH v7 11/14] xen/arm: use domain memory to allocate p2m page tables Carlo Nonato
                   ` (3 subsequent siblings)
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Marco Solieri

Add a new memory page allocator that implements the cache coloring mechanism.
The allocation algorithm enforces equal frequency distribution of cache
partitions, following the coloring configuration of a domain. This allows
for an even utilization of cache sets for every domain.

Pages are stored in a color-indexed array of lists. Those lists are filled
by a simple init function which computes the color of each page.
When a domain requests a page, the allocator extracts the page from the list
with the maximum number of free pages among those that the domain can
access, given its coloring configuration.

The allocator can only handle requests for order-0 pages. This simplifies the
implementation and, since cache coloring targets only embedded systems,
is assumed not to be a major problem.

The buddy allocator must coexist with the colored one because the Xen heap
isn't colored. For this reason a new Kconfig option and a command line
parameter are added to let the user set the amount of memory reserved for
the buddy allocator. Even when cache coloring is enabled, this memory
isn't managed by the colored allocator.

Colored heap information is dumped in the dump_heap() debug-key function.

Based on original work from: Luca Miccio <lucmiccio@gmail.com>

Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- requests to alloc_color_heap_page() now fail if MEMF_bits is used
v6:
- colored allocator functions are now static
v5:
- Carlo Nonato as the new author
- the colored allocator balances color usage for each domain and it searches
  linearly only in the number of colors (FIXME removed)
- added scrub functionality
- removed stub functions (still requires some macro definition)
- addr_to_color turned to mfn_to_color for easier operations
- removed BUG_ON in init_color_heap_pages() in favor of panic()
- only non empty page lists are logged in dump_color_heap()
v4:
- moved colored allocator code after buddy allocator because it now has
  some dependencies on buddy functions
- buddy_alloc_size is now used only by the colored allocator
- fixed a bug that allowed the buddy to merge pages when they were colored
- free_color_heap_page() now calls mark_page_free()
- free_color_heap_page() uses the frametable array for faster searches
- added FIXME comment for the linear search in free_color_heap_page()
- removed alloc_color_domheap_page() to let the colored allocator exploit
  some more buddy allocator code
- alloc_color_heap_page() now allocs min address pages first
- reduced the mess in end_boot_allocator(): use the first loop for
  init_color_heap_pages()
- fixed page_list_add_prev() (list.h) since it was doing the opposite of
  what it was supposed to do
- fixed page_list_add_prev() (non list.h) to check also for next existence
- removed unused page_list_add_next()
- moved p2m code in another patch
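
The allocation policy described in the commit message (pick, among the
domain's colors, the list with the most free pages) and the MFN-to-color
mapping can be sketched in isolation as follows. This is only an illustration
under simplifying assumptions: demo_mfn_to_color() and demo_pick_color() are
hypothetical names, plain arrays stand in for the page lists, and no locking
is shown.

```c
/*
 * Color of a frame: the low bits of the MFN. As in the patch, this masking
 * is only meaningful when nr_colors is a power of two.
 */
static unsigned int demo_mfn_to_color(unsigned long mfn,
                                      unsigned int nr_colors)
{
    return mfn & (nr_colors - 1);
}

/*
 * Return the color, among dom_colors[], whose free list currently holds the
 * most pages, or -1 if all of the domain's colors are exhausted.
 * free_pages[] is indexed by color.
 */
static int demo_pick_color(const unsigned long *free_pages,
                           const unsigned int *dom_colors,
                           unsigned int num_dom_colors)
{
    unsigned long max = 0;
    int best = -1;
    unsigned int i;

    for ( i = 0; i < num_dom_colors; i++ )
    {
        unsigned long free = free_pages[dom_colors[i]];

        /* Strict comparison: on ties the earliest configured color wins. */
        if ( free > max )
        {
            max = free;
            best = dom_colors[i];
        }
    }

    return best;
}
```

Always taking from the fullest list keeps the free counts of a domain's
colors close to each other, which is what gives the even use of cache
partitions mentioned above. The search is linear in the number of colors
assigned to the domain.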
---
 docs/misc/cache-coloring.rst      |  37 ++++++
 docs/misc/xen-command-line.pandoc |  14 +++
 xen/arch/Kconfig                  |   8 ++
 xen/arch/arm/include/asm/mm.h     |   5 +
 xen/common/llc-coloring.c         |  13 ++
 xen/common/page_alloc.c           | 196 +++++++++++++++++++++++++++++-
 xen/include/xen/llc-coloring.h    |   4 +
 7 files changed, 271 insertions(+), 6 deletions(-)

diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 028aecda28..50b6d94ffc 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -11,6 +11,9 @@ To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
 If needed, change the maximum number of colors with
 ``CONFIG_NR_LLC_COLORS=<n>``.
 
+If needed, change the buddy allocator reserved size with
+``CONFIG_BUDDY_ALLOCATOR_SIZE=<n>``.
+
 Runtime configuration is done via `Command line parameters`_.
 For DomUs follow `DomUs configuration`_.
 
@@ -117,6 +120,8 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
 +----------------------+-------------------------------+
 | ``dom0-llc-colors``  | Dom0 color configuration      |
 +----------------------+-------------------------------+
+| ``buddy-alloc-size`` | Buddy allocator reserved size |
++----------------------+-------------------------------+
 
 Colors selection format
 ***********************
@@ -204,6 +209,17 @@ the ``llc-colors`` option. For example:
 **Note:** If no color configuration is provided for a domain, the default one,
 which corresponds to all available colors is used instead.
 
+Colored allocator and buddy allocator
+*************************************
+
+The colored allocator distributes pages based on color configurations of
+domains so that each domain only gets pages of its own colors.
+The colored allocator is meant as an alternative to the buddy allocator because
+its allocation policy is by definition incompatible with the generic one. Since
+the Xen heap is not colored yet, we need to support the coexistence of the two
+allocators and some memory must be left for the buddy one. Buddy memory
+reservation is configured via Kconfig or via command-line.
+
 Known issues and limitations
 ****************************
 
@@ -214,3 +230,24 @@ In the domain configuration, "xen,static-mem" allows memory to be statically
 allocated to the domain. This isn't possible when LLC coloring is enabled,
 because that memory can't be guaranteed to use only colors assigned to the
 domain.
+
+Cache coloring is intended only for embedded systems
+####################################################
+
+The current implementation aims to satisfy the need for predictability in
+embedded systems with a small amount of memory to be managed in a colored way.
+Given that, some shortcuts are taken in the development. Expect worse
+performance on larger systems.
+
+Colored allocator can only make use of order-0 pages
+####################################################
+
+The cache coloring technique relies on memory mappings and on the smallest
+mapping granularity to achieve the maximum number of colors (cache partitions)
+possible. This granularity is what is normally called a page and, in Xen
+terminology, the order-0 page is the smallest one. The fairly simple
+colored allocator currently implemented makes use only of such pages.
+It must be said that a more complex one could, in theory, adopt higher order
+pages if the colors selection contained adjacent colors. Two subsequent colors,
+for example, can be represented by an order-1 page, four colors correspond to
+an order-2 page, etc.
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 28035a214d..461403362f 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -270,6 +270,20 @@ and not running softirqs. Reduce this if softirqs are not being run frequently
 enough. Setting this to a high value may cause boot failure, particularly if
 the NMI watchdog is also enabled.
 
+### buddy-alloc-size (arm64)
+> `= <size>`
+
+> Default: `64M`
+
+Amount of memory reserved for the buddy allocator when the colored allocator is
+active. This option is parsed only when LLC coloring support is enabled.
+The colored allocator is meant as an alternative to the buddy allocator,
+because its allocation policy is by definition incompatible with the generic
+one. Since the Xen heap is not colored yet, we need to support the
+coexistence of the two allocators for now. This parameter, which is optional
+and for experts only, is used to set the amount of memory reserved for the
+buddy allocator.
+
 ### cet
     = List of [ shstk=<bool>, ibt=<bool> ]
 
diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
index a65c38e53e..6819a96f78 100644
--- a/xen/arch/Kconfig
+++ b/xen/arch/Kconfig
@@ -51,3 +51,11 @@ config NR_LLC_COLORS
 	  more than what's needed in the general case. Use only power of 2 values.
 	  1024 is the number of colors that fit in a 4 KiB page when integers are 4
 	  bytes long.
+
+config BUDDY_ALLOCATOR_SIZE
+	int "Buddy allocator reserved memory size (MiB)"
+	default "64"
+	depends on LLC_COLORING
+	help
+	  Amount of memory reserved for the buddy allocator to serve Xen heap,
+	  working alongside the colored one.
diff --git a/xen/arch/arm/include/asm/mm.h b/xen/arch/arm/include/asm/mm.h
index 48538b5337..68b7754bec 100644
--- a/xen/arch/arm/include/asm/mm.h
+++ b/xen/arch/arm/include/asm/mm.h
@@ -145,6 +145,11 @@ struct page_info
 #else
 #define PGC_static     0
 #endif
+#ifdef CONFIG_LLC_COLORING
+/* Page is cache colored */
+#define _PGC_colored      PG_shift(4)
+#define PGC_colored       PG_mask(1, 4)
+#endif
 /* ... */
 /* Page is broken? */
 #define _PGC_broken       PG_shift(7)
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index 77d24553e0..e34ba6b6ec 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -22,6 +22,9 @@ static unsigned int __ro_after_init max_nr_colors;
 static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
 static unsigned int __initdata dom0_num_colors;
 
+#define mfn_color_mask              (max_nr_colors - 1)
+#define mfn_to_color(mfn)           (mfn_x(mfn) & mfn_color_mask)
+
 /*
  * Parse the coloring configuration given in the buf string, following the
  * syntax below.
@@ -284,6 +287,16 @@ int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
     return 0;
 }
 
+unsigned int page_to_llc_color(const struct page_info *pg)
+{
+    return mfn_to_color(page_to_mfn(pg));
+}
+
+unsigned int get_max_nr_llc_colors(void)
+{
+    return max_nr_colors;
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 3adea713b7..8aab18d1fe 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -158,8 +158,12 @@
 #define PGC_static 0
 #endif
 
-#define PGC_preserved (PGC_extra | PGC_static)
-#define PGC_no_buddy_merge PGC_static
+#ifndef PGC_colored
+#define PGC_colored 0
+#endif
+
+#define PGC_preserved (PGC_extra | PGC_static | PGC_colored)
+#define PGC_no_buddy_merge (PGC_static | PGC_colored)
 
 #ifndef PGT_TYPE_INFO_INITIALIZER
 #define PGT_TYPE_INFO_INITIALIZER 0
@@ -1945,6 +1949,164 @@ static unsigned long avail_heap_pages(
     return free_pages;
 }
 
+/*************************
+ * COLORED SIDE-ALLOCATOR
+ *
+ * Pages are grouped by LLC color in lists which are globally referred to as the
+ * color heap. Lists are populated in end_boot_allocator().
+ * After initialization there will be N lists where N is the number of
+ * available colors on the platform.
+ */
+static struct page_list_head *__ro_after_init _color_heap;
+#define color_heap(color) (&_color_heap[color])
+
+static unsigned long *__ro_after_init free_colored_pages;
+
+/* Memory required for buddy allocator to work with colored one */
+#ifdef CONFIG_LLC_COLORING
+static unsigned long __initdata buddy_alloc_size =
+    MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
+size_param("buddy-alloc-size", buddy_alloc_size);
+
+#define domain_num_llc_colors(d) ((d)->num_llc_colors)
+#define domain_llc_color(d, i)   ((d)->llc_colors[i])
+#else
+static unsigned long __initdata buddy_alloc_size;
+
+#define domain_num_llc_colors(d) 0
+#define domain_llc_color(d, i)   0
+#endif
+
+static void free_color_heap_page(struct page_info *pg, bool need_scrub)
+{
+    unsigned int color = page_to_llc_color(pg);
+    struct page_list_head *head = color_heap(color);
+
+    spin_lock(&heap_lock);
+
+    mark_page_free(pg, page_to_mfn(pg));
+
+    if ( need_scrub )
+    {
+        pg->count_info |= PGC_need_scrub;
+        poison_one_page(pg);
+    }
+
+    free_colored_pages[color]++;
+    page_list_add(pg, head);
+
+    spin_unlock(&heap_lock);
+}
+
+static struct page_info *alloc_color_heap_page(unsigned int memflags,
+                                               const struct domain *d)
+{
+    struct page_info *pg = NULL;
+    unsigned int i, color = 0;
+    unsigned long max = 0;
+    bool need_tlbflush = false;
+    uint32_t tlbflush_timestamp = 0;
+    bool need_scrub;
+
+    if ( memflags >> _MEMF_bits )
+        return NULL;
+
+    spin_lock(&heap_lock);
+
+    for ( i = 0; i < domain_num_llc_colors(d); i++ )
+    {
+        unsigned long free = free_colored_pages[domain_llc_color(d, i)];
+
+        if ( free > max )
+        {
+            color = domain_llc_color(d, i);
+            pg = page_list_first(color_heap(color));
+            max = free;
+        }
+    }
+
+    if ( !pg )
+    {
+        spin_unlock(&heap_lock);
+        return NULL;
+    }
+
+    need_scrub = pg->count_info & PGC_need_scrub;
+    pg->count_info = PGC_state_inuse | (pg->count_info & PGC_colored);
+    free_colored_pages[color]--;
+    page_list_del(pg, color_heap(color));
+
+    if ( !(memflags & MEMF_no_tlbflush) )
+        accumulate_tlbflush(&need_tlbflush, pg, &tlbflush_timestamp);
+
+    init_free_page_fields(pg);
+
+    spin_unlock(&heap_lock);
+
+    if ( !(memflags & MEMF_no_scrub) )
+    {
+        if ( need_scrub )
+            scrub_one_page(pg);
+        else
+            check_one_page(pg);
+    }
+
+    if ( need_tlbflush )
+        filtered_flush_tlb_mask(tlbflush_timestamp);
+
+    flush_page_to_ram(mfn_x(page_to_mfn(pg)),
+                      !(memflags & MEMF_no_icache_flush));
+
+    return pg;
+}
+
+static void __init init_color_heap_pages(struct page_info *pg,
+                                         unsigned long nr_pages)
+{
+    unsigned int i;
+    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
+
+    if ( buddy_alloc_size )
+    {
+        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
+
+        init_heap_pages(pg, buddy_pages);
+        nr_pages -= buddy_pages;
+        buddy_alloc_size -= buddy_pages << PAGE_SHIFT;
+        pg += buddy_pages;
+    }
+
+    if ( !_color_heap )
+    {
+        unsigned int max_nr_colors = get_max_nr_llc_colors();
+
+        _color_heap = xmalloc_array(struct page_list_head, max_nr_colors);
+        free_colored_pages = xzalloc_array(unsigned long, max_nr_colors);
+        if ( !_color_heap || !free_colored_pages )
+            panic("Can't allocate colored heap. Buddy reserved size is too low");
+
+        for ( i = 0; i < max_nr_colors; i++ )
+            INIT_PAGE_LIST_HEAD(color_heap(i));
+    }
+
+    for ( i = 0; i < nr_pages; i++ )
+    {
+        pg[i].count_info = PGC_colored;
+        free_color_heap_page(&pg[i], need_scrub);
+    }
+}
+
+static void dump_color_heap(void)
+{
+    unsigned int color;
+
+    printk("Dumping color heap info\n");
+    for ( color = 0; color < get_max_nr_llc_colors(); color++ )
+        if ( free_colored_pages[color] > 0 )
+            printk("Color heap[%u]: %lu pages\n",
+                   color, free_colored_pages[color]);
+}
+
 void __init end_boot_allocator(void)
 {
     unsigned int i;
@@ -1964,7 +2126,13 @@ void __init end_boot_allocator(void)
     for ( i = nr_bootmem_regions; i-- > 0; )
     {
         struct bootmem_region *r = &bootmem_region_list[i];
-        if ( r->s < r->e )
+
+        if ( r->s >= r->e )
+            continue;
+
+        if ( llc_coloring_enabled )
+            init_color_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
+        else
             init_heap_pages(mfn_to_page(_mfn(r->s)), r->e - r->s);
     }
     nr_bootmem_regions = 0;
@@ -2460,7 +2628,14 @@ struct page_info *alloc_domheap_pages(
     if ( memflags & MEMF_no_owner )
         memflags |= MEMF_no_refcount;
 
-    if ( !dma_bitsize )
+    /* Only domains are supported for coloring */
+    if ( d && llc_coloring_enabled )
+    {
+        /* Colored allocation must be done on 0 order */
+        if ( order || (pg = alloc_color_heap_page(memflags, d)) == NULL )
+            return NULL;
+    }
+    else if ( !dma_bitsize )
         memflags &= ~MEMF_no_dma;
     else if ( (dma_zone = bits_to_zone(dma_bitsize)) < zone_hi )
         pg = alloc_heap_pages(dma_zone + 1, zone_hi, order, memflags, d);
@@ -2485,7 +2660,10 @@ struct page_info *alloc_domheap_pages(
         }
         if ( assign_page(pg, order, d, memflags) )
         {
-            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
+            if ( pg->count_info & PGC_colored )
+                free_color_heap_page(pg, memflags & MEMF_no_scrub);
+            else
+                free_heap_pages(pg, order, memflags & MEMF_no_scrub);
             return NULL;
         }
     }
@@ -2568,7 +2746,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
             scrub = 1;
         }
 
-        free_heap_pages(pg, order, scrub);
+        if ( pg->count_info & PGC_colored )
+            free_color_heap_page(pg, scrub);
+        else
+            free_heap_pages(pg, order, scrub);
     }
 
     if ( drop_dom_ref )
@@ -2677,6 +2858,9 @@ static void cf_check dump_heap(unsigned char key)
             continue;
         printk("Node %d has %lu unscrubbed pages\n", i, node_need_scrub[i]);
     }
+
+    if ( llc_coloring_enabled )
+        dump_color_heap();
 }
 
 static __init int cf_check register_heap_trigger(void)
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index 49ebd1e712..7f8218bfb2 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -33,6 +33,10 @@ int domain_set_llc_colors(struct domain *d,
                           const struct xen_domctl_set_llc_colors *config);
 int domain_set_llc_colors_from_str(struct domain *d, const char *str);
 
+struct page_info;
+unsigned int page_to_llc_color(const struct page_info *pg);
+unsigned int get_max_nr_llc_colors(void);
+
 #endif /* __COLORING_H__ */
 
 /*
-- 
2.34.1
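
The selection policy in alloc_color_heap_page() above can be read in
isolation: among the colors assigned to a domain, take the one whose free
list currently holds the most pages. Below is a minimal standalone sketch of
that loop, assuming plain arrays in place of Xen's free_colored_pages and
the domain's color list. pick_color() and NO_COLOR are hypothetical names
used only for illustration; they are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define NO_COLOR ((unsigned int)-1)   /* hypothetical "nothing free" marker */

/*
 * Return the color, among dom_colors[], with the most free pages, or
 * NO_COLOR if none of them has a free page. The comparison is strict, so
 * on a tie the first color encountered wins.
 */
static unsigned int pick_color(const unsigned long *free_pages,
                               const unsigned int *dom_colors,
                               size_t num_colors)
{
    unsigned int best = NO_COLOR;
    unsigned long max = 0;
    size_t i;

    assert(free_pages != NULL);

    for ( i = 0; i < num_colors; i++ )
    {
        unsigned long nr_free = free_pages[dom_colors[i]];

        if ( nr_free > max )
        {
            best = dom_colors[i];
            max = nr_free;
        }
    }

    return best;
}
```

Because max starts at zero, a domain whose colors are all exhausted gets no
page at all, which mirrors the NULL return in the patch.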




* [PATCH v7 11/14] xen/arm: use domain memory to allocate p2m page tables
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (9 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 10/14] xen: add cache coloring allocator for domains Carlo Nonato
@ 2024-03-15 10:58 ` Carlo Nonato
  2024-03-15 10:59 ` [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter Carlo Nonato
                   ` (2 subsequent siblings)
  13 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:58 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

Cache-colored domains can benefit from having p2m page tables allocated
with the same coloring scheme, so that isolation is also achieved for
that kind of memory access.
To do that, the domain struct is passed to the allocator and the
MEMF_no_owner flag is used.

This will also be useful once NUMA is supported on Arm.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Acked-by: Julien Grall <julien@xen.org>
---
v7:
- no changes
v6:
- Carlo Nonato as the only signed-off-by
v5:
- new patch
---
 xen/arch/arm/mmu/p2m.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/xen/arch/arm/mmu/p2m.c b/xen/arch/arm/mmu/p2m.c
index 41fcca011c..d02a478cb8 100644
--- a/xen/arch/arm/mmu/p2m.c
+++ b/xen/arch/arm/mmu/p2m.c
@@ -32,7 +32,7 @@ static struct page_info *p2m_alloc_page(struct domain *d)
      */
     if ( is_hardware_domain(d) )
     {
-        pg = alloc_domheap_page(NULL, 0);
+        pg = alloc_domheap_page(d, MEMF_no_owner);
         if ( pg == NULL )
             printk(XENLOG_G_ERR "Failed to allocate P2M pages for hwdom.\n");
     }
@@ -81,7 +81,7 @@ int p2m_set_allocation(struct domain *d, unsigned long pages, bool *preempted)
         if ( d->arch.paging.p2m_total_pages < pages )
         {
             /* Need to allocate more memory from domheap */
-            pg = alloc_domheap_page(NULL, 0);
+            pg = alloc_domheap_page(d, MEMF_no_owner);
             if ( pg == NULL )
             {
                 printk(XENLOG_ERR "Failed to allocate P2M pages.\n");
-- 
2.34.1




* [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (10 preceding siblings ...)
  2024-03-15 10:58 ` [PATCH v7 11/14] xen/arm: use domain memory to allocate p2m page tables Carlo Nonato
@ 2024-03-15 10:59 ` Carlo Nonato
  2024-03-19 15:54   ` Jan Beulich
  2024-03-15 10:59 ` [PATCH v7 13/14] xen/arm: make consider_modules() available for xen relocation Carlo Nonato
  2024-03-15 10:59 ` [PATCH v7 14/14] xen/arm: add cache coloring support for Xen Carlo Nonato
  13 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Luca Miccio, Andrew Cooper, George Dunlap, Jan Beulich,
	Julien Grall, Stefano Stabellini, Wei Liu, Marco Solieri,
	Carlo Nonato

From: Luca Miccio <lucmiccio@gmail.com>

Add a new command line parameter to configure Xen cache colors.
These colors can be dumped with the cache coloring info debug-key.

By default, Xen uses the first color.
Benchmarking the VM interrupt response time provides an estimation of
LLC usage by Xen's most latency-critical runtime task. Results on Arm
Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
reserves 64 KiB of L2, is enough to attain best responsiveness:
- Xen 1 color latency:  3.1 us
- Xen 2 color latency:  3.1 us

More colors are instead very likely to be needed on processors whose L1
cache is physically-indexed and physically-tagged, such as Cortex-A57.
In such cases, coloring applies to L1 also, and there typically are two
distinct L1-colors. Therefore, reserving only one color for Xen would
senselessly partition a cache memory that is already private, i.e.
underutilize it. The default number of Xen colors is thus set to one.

Signed-off-by: Luca Miccio <lucmiccio@gmail.com>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- removed XEN_DEFAULT_COLOR
- XEN_DEFAULT_NUM_COLORS is now used in a for loop to set xen default colors
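
For readers unfamiliar with the color list syntax accepted by
xen-llc-colors (single integers and <integer>-<integer> ranges), here is a
minimal standalone sketch of a parser for it. It is not Xen's
parse_color_config(), whose body is not part of this patch:
parse_color_list() is a hypothetical name, the comma separator is an
assumption, and strtoul() stands in for Xen's own string helpers.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Parse a list such as "0-3,5" into colors[].
 * Returns 0 on success, -EINVAL on malformed input or a descending range,
 * -ENOSPC if the list expands to more than max_colors entries.
 * An empty string yields zero colors, i.e. "no configuration given".
 */
static int parse_color_list(const char *s, unsigned int *colors,
                            unsigned int max_colors,
                            unsigned int *num_colors)
{
    assert(s != NULL && colors != NULL && num_colors != NULL);

    *num_colors = 0;

    while ( *s != '\0' )
    {
        char *end;
        unsigned long start, stop;

        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = stop = strtoul(s, &end, 10);
        s = end;

        if ( *s == '-' )
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            stop = strtoul(s, &end, 10);
            s = end;
        }

        if ( start > stop )
            return -EINVAL;

        /* Expand the range; the bound check also stops runaway ranges. */
        for ( ; start <= stop; start++ )
        {
            if ( *num_colors >= max_colors )
                return -ENOSPC;
            colors[(*num_colors)++] = (unsigned int)start;
        }

        if ( *s == ',' )
        {
            s++;
            if ( *s == '\0' )
                return -EINVAL;   /* trailing comma */
        }
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return 0;
}
```

With an empty result the caller falls back to a default, which is what
llc_coloring_init() does below when xen_num_colors is zero.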
---
 docs/misc/cache-coloring.rst      |  2 ++
 docs/misc/xen-command-line.pandoc | 10 ++++++++++
 xen/common/llc-coloring.c         | 29 +++++++++++++++++++++++++++++
 3 files changed, 41 insertions(+)

diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
index 50b6d94ffc..f427a14b65 100644
--- a/docs/misc/cache-coloring.rst
+++ b/docs/misc/cache-coloring.rst
@@ -122,6 +122,8 @@ Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
 +----------------------+-------------------------------+
 | ``buddy-alloc-size`` | Buddy allocator reserved size |
 +----------------------+-------------------------------+
+| ``xen-llc-colors``   | Xen color configuration       |
++----------------------+-------------------------------+
 
 Colors selection format
 ***********************
diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 461403362f..fa18ec942e 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -2885,6 +2885,16 @@ mode.
 **WARNING: `x2apic_phys` is deprecated and superseded by `x2apic-mode`.
 The latter takes precedence if both are set.**
 
+### xen-llc-colors (arm64)
+> `= List of [ <integer> | <integer>-<integer> ]`
+
+> Default: `0: the lowermost color`
+
+Specify Xen LLC color configuration. This option is available only when
+`CONFIG_LLC_COLORING` is enabled.
+Two colors are most likely needed on platforms where private caches are
+physically indexed, e.g. the L1 instruction cache of the Arm Cortex-A57.
+
 ### xenheap_megabytes (arm32)
 > `= <size>`
 
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index e34ba6b6ec..f1a7561d79 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -9,6 +9,8 @@
 #include <xen/llc-coloring.h>
 #include <xen/param.h>
 
+#define XEN_DEFAULT_NUM_COLORS 1
+
 bool __ro_after_init llc_coloring_enabled;
 boolean_param("llc-coloring", llc_coloring_enabled);
 
@@ -22,6 +24,9 @@ static unsigned int __ro_after_init max_nr_colors;
 static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
 static unsigned int __initdata dom0_num_colors;
 
+static unsigned int __ro_after_init xen_colors[CONFIG_NR_LLC_COLORS];
+static unsigned int __ro_after_init xen_num_colors;
+
 #define mfn_color_mask              (max_nr_colors - 1)
 #define mfn_to_color(mfn)           (mfn_x(mfn) & mfn_color_mask)
 
@@ -79,6 +84,13 @@ static int __init parse_dom0_colors(const char *s)
 }
 custom_param("dom0-llc-colors", parse_dom0_colors);
 
+static int __init parse_xen_colors(const char *s)
+{
+    return parse_color_config(s, xen_colors, ARRAY_SIZE(xen_colors),
+                              &xen_num_colors);
+}
+custom_param("xen-llc-colors", parse_xen_colors);
+
 static void print_colors(const unsigned int *colors, unsigned int num_colors)
 {
     unsigned int i;
@@ -147,6 +159,21 @@ void __init llc_coloring_init(void)
         panic("Number of LLC colors (%u) not in range [2, %u]\n",
               max_nr_colors, CONFIG_NR_LLC_COLORS);
 
+    if ( !xen_num_colors )
+    {
+        unsigned int i;
+
+        xen_num_colors = MIN(XEN_DEFAULT_NUM_COLORS, max_nr_colors);
+
+        printk(XENLOG_WARNING
+               "Xen LLC color config not found. Using first %u colors\n",
+               xen_num_colors);
+        for ( i = 0; i < xen_num_colors; i++ )
+            xen_colors[i] = i;
+    }
+    else if ( !check_colors(xen_colors, xen_num_colors) )
+        panic("Bad LLC color config for Xen\n");
+
     arch_llc_coloring_init();
 }
 
@@ -157,6 +184,8 @@ void cf_check dump_llc_coloring_info(void)
 
     printk("LLC coloring info:\n");
     printk("    Number of LLC colors supported: %u\n", max_nr_colors);
+    printk("    Xen LLC colors (%u): ", xen_num_colors);
+    print_colors(xen_colors, xen_num_colors);
 }
 
 void cf_check domain_dump_llc_colors(const struct domain *d)
-- 
2.34.1




* [PATCH v7 13/14] xen/arm: make consider_modules() available for xen relocation
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (11 preceding siblings ...)
  2024-03-15 10:59 ` [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter Carlo Nonato
@ 2024-03-15 10:59 ` Carlo Nonato
  2024-03-15 10:59 ` [PATCH v7 14/14] xen/arm: add cache coloring support for Xen Carlo Nonato
  13 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk

Cache coloring must physically relocate Xen in order to color the hypervisor,
and consider_modules() is the key function needed to find a new
available physical address.

672d67f339c0 ("xen/arm: Split MMU-specific setup_mm() and related code out")
moved consider_modules() under arm32. Move it to mmu/setup.c and make it
non-static so that it can be used outside.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
---
v7:
- moved consider_modules() to arm/mmu/setup.c
v6:
- new patch
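
The search performed by consider_modules() may be easier to follow on a
reduced model. The sketch below keeps the same recursion (try above a
conflicting range first, then below it) but works on a single flat array of
reserved ranges instead of boot modules, FDT reservations and
reserved-memory banks. struct range and highest_fit() are hypothetical
names introduced for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct range { uint64_t start, end; };   /* half-open: [start, end) */

/*
 * Return the end address of the highest region inside [s, e) that has at
 * least `size` bytes, respects `align` (a power of two) and does not
 * overlap resv[first..nr). Returns 0 if no such region exists.
 */
static uint64_t highest_fit(uint64_t s, uint64_t e, uint64_t size,
                            uint64_t align, const struct range *resv,
                            size_t nr, size_t first)
{
    size_t i;

    assert(align != 0 && (align & (align - 1)) == 0);

    s = (s + align - 1) & ~(align - 1);   /* round start up */
    e &= ~(align - 1);                    /* round end down */

    if ( s > e || e - s < size )
        return 0;

    for ( i = first; i < nr; i++ )
    {
        if ( s < resv[i].end && resv[i].start < e )
        {
            /* Conflict: prefer the space above it, then the space below. */
            uint64_t r = highest_fit(resv[i].end, e, size, align,
                                     resv, nr, i + 1);

            if ( r )
                return r;

            return highest_fit(s, resv[i].start, size, align,
                               resv, nr, i + 1);
        }
    }

    return e;
}
```

The caller obtains the start of the chosen region by subtracting the size
from the returned end address, as get_xen_paddr() does in the last patch of
the series.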
---
 xen/arch/arm/arm32/mmu/mm.c      | 93 +------------------------------
 xen/arch/arm/include/asm/setup.h |  3 +
 xen/arch/arm/mmu/setup.c         | 95 ++++++++++++++++++++++++++++++++
 3 files changed, 99 insertions(+), 92 deletions(-)

diff --git a/xen/arch/arm/arm32/mmu/mm.c b/xen/arch/arm/arm32/mmu/mm.c
index cb441ca87c..e9e1e48f9f 100644
--- a/xen/arch/arm/arm32/mmu/mm.c
+++ b/xen/arch/arm/arm32/mmu/mm.c
@@ -7,6 +7,7 @@
 #include <xen/pfn.h>
 #include <asm/fixmap.h>
 #include <asm/static-memory.h>
+#include <asm/setup.h>
 
 static unsigned long opt_xenheap_megabytes __initdata;
 integer_param("xenheap_megabytes", opt_xenheap_megabytes);
@@ -29,98 +30,6 @@ static void __init setup_directmap_mappings(unsigned long base_mfn,
     directmap_virt_end = XENHEAP_VIRT_START + nr_mfns * PAGE_SIZE;
 }
 
-/*
- * Returns the end address of the highest region in the range s..e
- * with required size and alignment that does not conflict with the
- * modules from first_mod to nr_modules.
- *
- * For non-recursive callers first_mod should normally be 0 (all
- * modules and Xen itself) or 1 (all modules but not Xen).
- */
-static paddr_t __init consider_modules(paddr_t s, paddr_t e,
-                                       uint32_t size, paddr_t align,
-                                       int first_mod)
-{
-    const struct bootmodules *mi = &bootinfo.modules;
-    int i;
-    int nr;
-
-    s = (s+align-1) & ~(align-1);
-    e = e & ~(align-1);
-
-    if ( s > e ||  e - s < size )
-        return 0;
-
-    /* First check the boot modules */
-    for ( i = first_mod; i < mi->nr_mods; i++ )
-    {
-        paddr_t mod_s = mi->module[i].start;
-        paddr_t mod_e = mod_s + mi->module[i].size;
-
-        if ( s < mod_e && mod_s < e )
-        {
-            mod_e = consider_modules(mod_e, e, size, align, i+1);
-            if ( mod_e )
-                return mod_e;
-
-            return consider_modules(s, mod_s, size, align, i+1);
-        }
-    }
-
-    /* Now check any fdt reserved areas. */
-
-    nr = fdt_num_mem_rsv(device_tree_flattened);
-
-    for ( ; i < mi->nr_mods + nr; i++ )
-    {
-        paddr_t mod_s, mod_e;
-
-        if ( fdt_get_mem_rsv_paddr(device_tree_flattened,
-                                   i - mi->nr_mods,
-                                   &mod_s, &mod_e ) < 0 )
-            /* If we can't read it, pretend it doesn't exist... */
-            continue;
-
-        /* fdt_get_mem_rsv_paddr returns length */
-        mod_e += mod_s;
-
-        if ( s < mod_e && mod_s < e )
-        {
-            mod_e = consider_modules(mod_e, e, size, align, i+1);
-            if ( mod_e )
-                return mod_e;
-
-            return consider_modules(s, mod_s, size, align, i+1);
-        }
-    }
-
-    /*
-     * i is the current bootmodule we are evaluating, across all
-     * possible kinds of bootmodules.
-     *
-     * When retrieving the corresponding reserved-memory addresses, we
-     * need to index the bootinfo.reserved_mem bank starting from 0, and
-     * only counting the reserved-memory modules. Hence, we need to use
-     * i - nr.
-     */
-    nr += mi->nr_mods;
-    for ( ; i - nr < bootinfo.reserved_mem.nr_banks; i++ )
-    {
-        paddr_t r_s = bootinfo.reserved_mem.bank[i - nr].start;
-        paddr_t r_e = r_s + bootinfo.reserved_mem.bank[i - nr].size;
-
-        if ( s < r_e && r_s < e )
-        {
-            r_e = consider_modules(r_e, e, size, align, i + 1);
-            if ( r_e )
-                return r_e;
-
-            return consider_modules(s, r_s, size, align, i + 1);
-        }
-    }
-    return e;
-}
-
 /*
  * Find a contiguous region that fits in the static heap region with
  * required size and alignment, and return the end address of the region
diff --git a/xen/arch/arm/include/asm/setup.h b/xen/arch/arm/include/asm/setup.h
index d15a88d2e0..37c0e345f0 100644
--- a/xen/arch/arm/include/asm/setup.h
+++ b/xen/arch/arm/include/asm/setup.h
@@ -207,6 +207,9 @@ struct init_info
     unsigned int cpuid;
 };
 
+paddr_t consider_modules(paddr_t s, paddr_t e, uint32_t size, paddr_t align,
+                         int first_mod);
+
 #endif
 /*
  * Local variables:
diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c
index 57f1b46499..de036c1f49 100644
--- a/xen/arch/arm/mmu/setup.c
+++ b/xen/arch/arm/mmu/setup.c
@@ -6,7 +6,10 @@
  */
 
 #include <xen/init.h>
+#include <xen/lib.h>
 #include <xen/libfdt/libfdt.h>
+#include <xen/libfdt/libfdt-xen.h>
+#include <xen/llc-coloring.h>
 #include <xen/sizes.h>
 #include <xen/vmap.h>
 
@@ -218,6 +221,98 @@ static void xen_pt_enforce_wnx(void)
     flush_xen_tlb_local();
 }
 
+/*
+ * Returns the end address of the highest region in the range s..e
+ * with required size and alignment that does not conflict with the
+ * modules from first_mod to nr_modules.
+ *
+ * For non-recursive callers first_mod should normally be 0 (all
+ * modules and Xen itself) or 1 (all modules but not Xen).
+ */
+paddr_t __init consider_modules(paddr_t s, paddr_t e,
+                                uint32_t size, paddr_t align,
+                                int first_mod)
+{
+    const struct bootmodules *mi = &bootinfo.modules;
+    int i;
+    int nr;
+
+    s = (s+align-1) & ~(align-1);
+    e = e & ~(align-1);
+
+    if ( s > e ||  e - s < size )
+        return 0;
+
+    /* First check the boot modules */
+    for ( i = first_mod; i < mi->nr_mods; i++ )
+    {
+        paddr_t mod_s = mi->module[i].start;
+        paddr_t mod_e = mod_s + mi->module[i].size;
+
+        if ( s < mod_e && mod_s < e )
+        {
+            mod_e = consider_modules(mod_e, e, size, align, i+1);
+            if ( mod_e )
+                return mod_e;
+
+            return consider_modules(s, mod_s, size, align, i+1);
+        }
+    }
+
+    /* Now check any fdt reserved areas. */
+
+    nr = fdt_num_mem_rsv(device_tree_flattened);
+
+    for ( ; i < mi->nr_mods + nr; i++ )
+    {
+        paddr_t mod_s, mod_e;
+
+        if ( fdt_get_mem_rsv_paddr(device_tree_flattened,
+                                   i - mi->nr_mods,
+                                   &mod_s, &mod_e ) < 0 )
+            /* If we can't read it, pretend it doesn't exist... */
+            continue;
+
+        /* fdt_get_mem_rsv_paddr returns length */
+        mod_e += mod_s;
+
+        if ( s < mod_e && mod_s < e )
+        {
+            mod_e = consider_modules(mod_e, e, size, align, i+1);
+            if ( mod_e )
+                return mod_e;
+
+            return consider_modules(s, mod_s, size, align, i+1);
+        }
+    }
+
+    /*
+     * i is the current bootmodule we are evaluating, across all
+     * possible kinds of bootmodules.
+     *
+     * When retrieving the corresponding reserved-memory addresses, we
+     * need to index the bootinfo.reserved_mem bank starting from 0, and
+     * only counting the reserved-memory modules. Hence, we need to use
+     * i - nr.
+     */
+    nr += mi->nr_mods;
+    for ( ; i - nr < bootinfo.reserved_mem.nr_banks; i++ )
+    {
+        paddr_t r_s = bootinfo.reserved_mem.bank[i - nr].start;
+        paddr_t r_e = r_s + bootinfo.reserved_mem.bank[i - nr].size;
+
+        if ( s < r_e && r_s < e )
+        {
+            r_e = consider_modules(r_e, e, size, align, i + 1);
+            if ( r_e )
+                return r_e;
+
+            return consider_modules(s, r_s, size, align, i + 1);
+        }
+    }
+    return e;
+}
+
 /*
  * Boot-time pagetable setup.
  * Changes here may need matching changes in head.S
-- 
2.34.1




* [PATCH v7 14/14] xen/arm: add cache coloring support for Xen
  2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
                   ` (12 preceding siblings ...)
  2024-03-15 10:59 ` [PATCH v7 13/14] xen/arm: make consider_modules() available for xen relocation Carlo Nonato
@ 2024-03-15 10:59 ` Carlo Nonato
  2024-03-19 15:58   ` Jan Beulich
  2024-03-19 16:03   ` Jan Beulich
  13 siblings, 2 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 10:59 UTC (permalink / raw)
  To: xen-devel
  Cc: Carlo Nonato, Stefano Stabellini, Julien Grall, Bertrand Marquis,
	Michal Orzel, Volodymyr Babchuk, Andrew Cooper, George Dunlap,
	Jan Beulich, Wei Liu, Marco Solieri

Add cache coloring support for the Xen physical space.

Since Xen must be relocated to a new physical space, some relocation
functionality must be brought back:
- the virtual address of the new space is taken from 0c18fb76323b
  ("xen/arm: Remove unused BOOT_RELOC_VIRT_START").
- relocate_xen() and get_xen_paddr() are taken from f60658c6ae47
  ("xen/arm: Stop relocating Xen").

setup_pagetables() must be adapted for coloring and for relocation. Runtime
page tables are used to map the colored space, but they are also linked in
boot tables so that the new space is temporarily available for relocation.
This implies that Xen protection must happen after the copy.

Finally, since the alternative framework needs to remap the Xen text and
inittext sections, this operation must be done in a coloring-aware way.
The function xen_remap_colored() is introduced for that.

Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
---
v7:
- added BUG_ON() checks to arch_llc_coloring_init() and
  create_llc_coloring_mappings()
v6:
- squashed with BOOT_RELOC_VIRT_START patch
- consider_modules() moved to another patch
- removed psci and smpboot code because the new idmap work already handles that
- moved xen_remap_colored() to alternative.c since it's only used there
- removed xen_colored_temp[] in favor of xen_xenmap[] usage for mapping
- use of boot_module_find_by_kind() to remove the need of extra parameter in
  setup_pagetables()
- moved get_xen_paddr() to arm/llc-coloring.c since it's only used there
v5:
- FIXME: consider_modules copy pasted since it got moved
v4:
- removed set_value_for_secondary() because it was wrongly cleaning cache
- relocate_xen() now calls switch_ttbr_id()
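
A colored Xen image is no longer physically contiguous: consecutive pages
of the image live in consecutive frames whose color belongs to Xen's color
set, which is why xen_remap_colored() has to build an explicit MFN array
before calling vmap(). The sketch below illustrates that walk under stated
assumptions. It reuses the mask-based color computation from
common/llc-coloring.c (mfn & (max_nr_colors - 1)), but next_colored_mfn()
and collect_colored_mfns() are hypothetical helpers, not the
for_each_xen_colored_mfn macro, whose definition is not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

static bool color_in_set(unsigned int color, const unsigned int *colors,
                         size_t num_colors)
{
    size_t i;

    for ( i = 0; i < num_colors; i++ )
        if ( colors[i] == color )
            return true;

    return false;
}

/* Smallest frame number >= mfn whose color is in the set. */
static unsigned long next_colored_mfn(unsigned long mfn,
                                      unsigned int max_nr_colors,
                                      const unsigned int *colors,
                                      size_t num_colors)
{
    /* At least one valid color is needed for the loop to terminate. */
    assert(num_colors > 0 && colors[0] < max_nr_colors);

    while ( !color_in_set(mfn & (max_nr_colors - 1), colors, num_colors) )
        mfn++;

    return mfn;
}

/* Fill mfns[0..nr_pages) with consecutive colored frames from `start`. */
static void collect_colored_mfns(unsigned long start, size_t nr_pages,
                                 unsigned int max_nr_colors,
                                 const unsigned int *colors,
                                 size_t num_colors, unsigned long *mfns)
{
    unsigned long mfn = start;
    size_t i;

    for ( i = 0; i < nr_pages; i++ )
    {
        mfn = next_colored_mfn(mfn, max_nr_colors, colors, num_colors);
        mfns[i] = mfn++;
    }
}
```

With 4 colors and Xen owning a single color, the image is spread over every
fourth frame, so the physical span that must be reserved grows accordingly.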
---
 xen/arch/arm/alternative.c            |  30 +++++++-
 xen/arch/arm/arm64/mmu/head.S         |  58 +++++++++++++-
 xen/arch/arm/arm64/mmu/mm.c           |  28 ++++++-
 xen/arch/arm/include/asm/mmu/layout.h |   3 +
 xen/arch/arm/llc-coloring.c           |  63 +++++++++++++++-
 xen/arch/arm/mmu/setup.c              | 104 ++++++++++++++++++++++----
 xen/arch/arm/setup.c                  |  10 ++-
 xen/common/llc-coloring.c             |  23 ++++++
 xen/include/xen/llc-coloring.h        |  14 ++++
 9 files changed, 310 insertions(+), 23 deletions(-)

diff --git a/xen/arch/arm/alternative.c b/xen/arch/arm/alternative.c
index 016e66978b..8ca649b55e 100644
--- a/xen/arch/arm/alternative.c
+++ b/xen/arch/arm/alternative.c
@@ -9,6 +9,7 @@
 #include <xen/init.h>
 #include <xen/types.h>
 #include <xen/kernel.h>
+#include <xen/llc-coloring.h>
 #include <xen/mm.h>
 #include <xen/vmap.h>
 #include <xen/smp.h>
@@ -191,6 +192,27 @@ static int __apply_alternatives_multi_stop(void *xenmap)
     return 0;
 }
 
+static void __init *xen_remap_colored(mfn_t xen_mfn, paddr_t xen_size)
+{
+    unsigned int i;
+    void *xenmap;
+    mfn_t *xen_colored_mfns, mfn;
+
+    xen_colored_mfns = xmalloc_array(mfn_t, xen_size >> PAGE_SHIFT);
+    if ( !xen_colored_mfns )
+        panic("Can't allocate LLC colored MFNs\n");
+
+    for_each_xen_colored_mfn ( xen_mfn, mfn, i )
+    {
+        xen_colored_mfns[i] = mfn;
+    }
+
+    xenmap = vmap(xen_colored_mfns, xen_size >> PAGE_SHIFT);
+    xfree(xen_colored_mfns);
+
+    return xenmap;
+}
+
 /*
  * This function should only be called during boot and before CPU0 jump
  * into the idle_loop.
@@ -209,8 +231,12 @@ void __init apply_alternatives_all(void)
      * The text and inittext section are read-only. So re-map Xen to
      * be able to patch the code.
      */
-    xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
-                    VMAP_DEFAULT);
+    if ( llc_coloring_enabled )
+        xenmap = xen_remap_colored(xen_mfn, xen_size);
+    else
+        xenmap = __vmap(&xen_mfn, 1U << xen_order, 1, 1, PAGE_HYPERVISOR,
+                        VMAP_DEFAULT);
+
     /* Re-mapping Xen is not expected to fail during boot. */
     BUG_ON(!xenmap);
 
diff --git a/xen/arch/arm/arm64/mmu/head.S b/xen/arch/arm/arm64/mmu/head.S
index fa40b696dd..7ad2c00fd5 100644
--- a/xen/arch/arm/arm64/mmu/head.S
+++ b/xen/arch/arm/arm64/mmu/head.S
@@ -427,6 +427,61 @@ fail:   PRINT("- Boot failed -\r\n")
         b     1b
 ENDPROC(fail)
 
+/*
+ * Copy Xen to new location and switch TTBR
+ * x0    ttbr
+ * x1    source address
+ * x2    destination address
+ * x3    length
+ *
+ * Source and destination must be word aligned, length is rounded up
+ * to a 16 byte boundary.
+ *
+ * MUST BE VERY CAREFUL when saving things to RAM over the copy
+ */
+ENTRY(relocate_xen)
+        /*
+         * Copy 16 bytes at a time using:
+         *   x9: counter
+         *   x10: data
+         *   x11: data
+         *   x12: source
+         *   x13: destination
+         */
+        mov     x9, x3
+        mov     x12, x1
+        mov     x13, x2
+
+1:      ldp     x10, x11, [x12], #16
+        stp     x10, x11, [x13], #16
+
+        subs    x9, x9, #16
+        bgt     1b
+
+        /*
+         * Flush destination from dcache using:
+         *   x9: counter
+         *   x10: step
+         *   x11: vaddr
+         *
+         * This is to ensure data is visible to the instruction cache
+         */
+        dsb   sy
+
+        mov   x9, x3
+        ldr   x10, =dcache_line_bytes /* x10 := step */
+        ldr   x10, [x10]
+        mov   x11, x2
+
+1:      dc    cvac, x11
+
+        add   x11, x11, x10
+        subs  x9, x9, x10
+        bgt   1b
+
+        /* No need for dsb/isb because they are already done in switch_ttbr_id */
+        b switch_ttbr_id
+
 /*
  * Switch TTBR
  *
@@ -452,7 +507,8 @@ ENTRY(switch_ttbr_id)
 
         /*
          * 5) Flush I-cache
-         * This should not be necessary but it is kept for safety.
+         * This should not be necessary in the general case, but it's needed
+         * for cache coloring because code is relocated in that case.
          */
         ic     iallu
         isb
diff --git a/xen/arch/arm/arm64/mmu/mm.c b/xen/arch/arm/arm64/mmu/mm.c
index d2651c9486..07cf8040a2 100644
--- a/xen/arch/arm/arm64/mmu/mm.c
+++ b/xen/arch/arm/arm64/mmu/mm.c
@@ -1,6 +1,7 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include <xen/init.h>
+#include <xen/llc-coloring.h>
 #include <xen/mm.h>
 #include <xen/pfn.h>
 
@@ -125,27 +126,46 @@ void update_identity_mapping(bool enable)
 }
 
 extern void switch_ttbr_id(uint64_t ttbr);
+extern void relocate_xen(uint64_t ttbr, void *src, void *dst, size_t len);
 
 typedef void (switch_ttbr_fn)(uint64_t ttbr);
+typedef void (relocate_xen_fn)(uint64_t ttbr, void *src, void *dst, size_t len);
 
 void __init switch_ttbr(uint64_t ttbr)
 {
-    vaddr_t id_addr = virt_to_maddr(switch_ttbr_id);
-    switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+    vaddr_t vaddr, id_addr;
     lpae_t pte;
 
+    if ( llc_coloring_enabled )
+        vaddr = (vaddr_t)relocate_xen;
+    else
+        vaddr = (vaddr_t)switch_ttbr_id;
+
+    id_addr = virt_to_maddr(vaddr);
+
     /* Enable the identity mapping in the boot page tables */
     update_identity_mapping(true);
 
     /* Enable the identity mapping in the runtime page tables */
-    pte = pte_of_xenaddr((vaddr_t)switch_ttbr_id);
+    pte = pte_of_xenaddr(vaddr);
     pte.pt.table = 1;
     pte.pt.xn = 0;
     pte.pt.ro = 1;
     write_pte(&xen_third_id[third_table_offset(id_addr)], pte);
 
     /* Switch TTBR */
-    fn(ttbr);
+    if ( llc_coloring_enabled )
+    {
+        relocate_xen_fn *fn = (relocate_xen_fn *)id_addr;
+
+        fn(ttbr, _start, (void *)BOOT_RELOC_VIRT_START, _end - _start);
+    }
+    else
+    {
+        switch_ttbr_fn *fn = (switch_ttbr_fn *)id_addr;
+
+        fn(ttbr);
+    }
 
     /*
      * Disable the identity mapping in the runtime page tables.
diff --git a/xen/arch/arm/include/asm/mmu/layout.h b/xen/arch/arm/include/asm/mmu/layout.h
index a3b546465b..19c0ec63a5 100644
--- a/xen/arch/arm/include/asm/mmu/layout.h
+++ b/xen/arch/arm/include/asm/mmu/layout.h
@@ -30,6 +30,7 @@
  *  10M -  12M   Fixmap: special-purpose 4K mapping slots
  *  12M -  16M   Early boot mapping of FDT
  *  16M -  18M   Livepatch vmap (if compiled in)
+ *  16M -  24M   Cache-colored Xen text, data, bss (temporary, if compiled in)
  *
  *   1G -   2G   VMAP: ioremap and early_ioremap
  *
@@ -74,6 +75,8 @@
 #define BOOT_FDT_VIRT_START     (FIXMAP_VIRT_START + FIXMAP_VIRT_SIZE)
 #define BOOT_FDT_VIRT_SIZE      _AT(vaddr_t, MB(4))
 
+#define BOOT_RELOC_VIRT_START   (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
+
 #ifdef CONFIG_LIVEPATCH
 #define LIVEPATCH_VMAP_START    (BOOT_FDT_VIRT_START + BOOT_FDT_VIRT_SIZE)
 #define LIVEPATCH_VMAP_SIZE    _AT(vaddr_t, MB(2))
diff --git a/xen/arch/arm/llc-coloring.c b/xen/arch/arm/llc-coloring.c
index b83540ff41..a072407e6c 100644
--- a/xen/arch/arm/llc-coloring.c
+++ b/xen/arch/arm/llc-coloring.c
@@ -9,6 +9,7 @@
 
 #include <asm/processor.h>
 #include <asm/sysregs.h>
+#include <asm/setup.h>
 
 /* Return the LLC way size by probing the hardware */
 unsigned int __init get_llc_way_size(void)
@@ -62,7 +63,67 @@ unsigned int __init get_llc_way_size(void)
     return line_size * num_sets;
 }
 
-void __init arch_llc_coloring_init(void) {}
+/**
+ * get_xen_paddr - get physical address to relocate Xen to
+ *
+ * Xen is relocated to as near to the top of RAM as possible and
+ * aligned to a XEN_PADDR_ALIGN boundary.
+ */
+static paddr_t __init get_xen_paddr(paddr_t xen_size)
+{
+    const struct meminfo *mi = &bootinfo.mem;
+    paddr_t min_size;
+    paddr_t paddr = 0;
+    unsigned int i;
+
+    min_size = (xen_size + (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1);
+
+    /* Find the highest bank with enough space. */
+    for ( i = 0; i < mi->nr_banks; i++ )
+    {
+        const struct membank *bank = &mi->bank[i];
+        paddr_t s, e;
+
+        if ( bank->size >= min_size )
+        {
+            e = consider_modules(bank->start, bank->start + bank->size,
+                                 min_size, XEN_PADDR_ALIGN, 0);
+            if ( !e )
+                continue;
+
+#ifdef CONFIG_ARM_32
+            /* Xen must be under 4GB */
+            if ( e > GB(4) )
+                e = GB(4);
+            if ( e < bank->start )
+                continue;
+#endif
+
+            s = e - min_size;
+
+            if ( s > paddr )
+                paddr = s;
+        }
+    }
+
+    if ( !paddr )
+        panic("Not enough memory to relocate Xen\n");
+
+    printk("Placing Xen at 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
+           paddr, paddr + min_size);
+
+    return paddr;
+}
+
+void __init arch_llc_coloring_init(void)
+{
+    struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
+
+    BUG_ON(!xen_bootmodule);
+
+    xen_bootmodule->size = xen_colored_map_size();
+    xen_bootmodule->start = get_xen_paddr(xen_bootmodule->size);
+}
 
 /*
  * Local variables:
diff --git a/xen/arch/arm/mmu/setup.c b/xen/arch/arm/mmu/setup.c
index de036c1f49..5823b7237f 100644
--- a/xen/arch/arm/mmu/setup.c
+++ b/xen/arch/arm/mmu/setup.c
@@ -18,6 +18,11 @@
 /* Override macros from asm/page.h to make them work with mfn_t */
 #undef mfn_to_virt
 #define mfn_to_virt(mfn) __mfn_to_virt(mfn_x(mfn))
+#undef virt_to_mfn
+#define virt_to_mfn(va) _mfn(__virt_to_mfn(va))
+
+#define virt_to_reloc_virt(virt) \
+    (((vaddr_t)virt) - XEN_VIRT_START + BOOT_RELOC_VIRT_START)
 
 /* Main runtime page tables */
 
@@ -72,6 +77,7 @@ static void __init __maybe_unused build_assertions(void)
     /* 2MB aligned regions */
     BUILD_BUG_ON(XEN_VIRT_START & ~SECOND_MASK);
     BUILD_BUG_ON(FIXMAP_ADDR(0) & ~SECOND_MASK);
+    BUILD_BUG_ON(BOOT_RELOC_VIRT_START & ~SECOND_MASK);
     /* 1GB aligned regions */
 #ifdef CONFIG_ARM_32
     BUILD_BUG_ON(XENHEAP_VIRT_START & ~FIRST_MASK);
@@ -135,7 +141,12 @@ static void __init __maybe_unused build_assertions(void)
 
 lpae_t __init pte_of_xenaddr(vaddr_t va)
 {
-    paddr_t ma = va + phys_offset;
+    paddr_t ma;
+
+    if ( llc_coloring_enabled )
+        ma = virt_to_maddr(virt_to_reloc_virt(va));
+    else
+        ma = va + phys_offset;
 
     return mfn_to_xen_entry(maddr_to_mfn(ma), MT_NORMAL);
 }
@@ -313,9 +324,44 @@ paddr_t __init consider_modules(paddr_t s, paddr_t e,
     return e;
 }
 
+static void __init create_llc_coloring_mappings(void)
+{
+    lpae_t pte;
+    unsigned int i;
+    struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
+    mfn_t start_mfn = maddr_to_mfn(xen_bootmodule->start), mfn;
+
+    for_each_xen_colored_mfn ( start_mfn, mfn, i )
+    {
+        pte = mfn_to_xen_entry(mfn, MT_NORMAL);
+        pte.pt.table = 1; /* level 3 mappings always have this bit set */
+        xen_xenmap[i] = pte;
+    }
+
+    for ( i = 0; i < XEN_NR_ENTRIES(2); i++ )
+    {
+        vaddr_t va = BOOT_RELOC_VIRT_START + (i << XEN_PT_LEVEL_SHIFT(2));
+
+        pte = mfn_to_xen_entry(virt_to_mfn(xen_xenmap +
+                                           i * XEN_PT_LPAE_ENTRIES),
+                               MT_NORMAL);
+        pte.pt.table = 1;
+        write_pte(&boot_second[second_table_offset(va)], pte);
+    }
+}
+
 /*
- * Boot-time pagetable setup.
+ * Boot-time pagetable setup with coloring support
  * Changes here may need matching changes in head.S
+ *
+ * The cache coloring support consists of:
+ * - Create colored mapping that conforms to Xen color selection in xen_xenmap[]
+ * - Link the mapping in boot page tables using BOOT_RELOC_VIRT_START as vaddr
+ * - pte_of_xenaddr() takes care of translating addresses to the new space
+ *   during runtime page tables creation
+ * - Relocate xen and update TTBR with the new address in the colored space
+ *   (see switch_ttbr())
+ * - Protect the new space
  */
 void __init setup_pagetables(unsigned long boot_phys_offset)
 {
@@ -325,6 +371,9 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
 
     phys_offset = boot_phys_offset;
 
+    if ( llc_coloring_enabled )
+        create_llc_coloring_mappings();
+
     arch_setup_page_tables();
 
 #ifdef CONFIG_ARM_64
@@ -352,13 +401,7 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
             break;
         pte = pte_of_xenaddr(va);
         pte.pt.table = 1; /* third level mappings always have this bit set */
-        if ( is_kernel_text(va) || is_kernel_inittext(va) )
-        {
-            pte.pt.xn = 0;
-            pte.pt.ro = 1;
-        }
-        if ( is_kernel_rodata(va) )
-            pte.pt.ro = 1;
+        pte.pt.xn = 0; /* Permissions will be enforced later. Allow execution */
         xen_xenmap[i] = pte;
     }
 
@@ -384,13 +427,48 @@ void __init setup_pagetables(unsigned long boot_phys_offset)
     ttbr = (uintptr_t) cpu0_pgtable + phys_offset;
 #endif
 
-    switch_ttbr(ttbr);
-
-    xen_pt_enforce_wnx();
-
 #ifdef CONFIG_ARM_32
     per_cpu(xen_pgtable, 0) = cpu0_pgtable;
 #endif
+
+    if ( llc_coloring_enabled )
+        ttbr = virt_to_maddr(virt_to_reloc_virt(THIS_CPU_PGTABLE));
+
+    switch_ttbr(ttbr);
+
+    /* Protect Xen */
+    for ( i = 0; i < XEN_NR_ENTRIES(3); i++ )
+    {
+        vaddr_t va = XEN_VIRT_START + (i << PAGE_SHIFT);
+        lpae_t *entry = xen_xenmap + i;
+
+        if ( !is_kernel(va) )
+            break;
+
+        pte = read_atomic(entry);
+
+        if ( is_kernel_text(va) || is_kernel_inittext(va) )
+        {
+            pte.pt.xn = 0;
+            pte.pt.ro = 1;
+        } else if ( is_kernel_rodata(va) ) {
+            pte.pt.ro = 1;
+            pte.pt.xn = 1;
+        } else {
+            pte.pt.xn = 1;
+            pte.pt.ro = 0;
+        }
+
+        write_pte(entry, pte);
+    }
+
+    /*
+     * We modified live page-tables. Ensure the TLBs are invalidated
+     * before enforcing the WnX permissions.
+     */
+    flush_xen_tlb_local();
+
+    xen_pt_enforce_wnx();
 }
 
 void *__init arch_vmap_virt_end(void)
diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index c72c90302e..9acec8e8b1 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -724,8 +724,6 @@ void asmlinkage __init start_xen(unsigned long boot_phys_offset,
     /* Initialize traps early allow us to get backtrace when an error occurred */
     init_traps();
 
-    setup_pagetables(boot_phys_offset);
-
     smp_clear_cpu_maps();
 
     device_tree_flattened = early_fdt_map(fdt_paddr);
@@ -749,6 +747,14 @@ void asmlinkage __init start_xen(unsigned long boot_phys_offset,
 
     llc_coloring_init();
 
+    /*
+     * Page tables must be set up after LLC coloring initialization because
+     * coloring info is required in order to create colored mappings
+     */
+    setup_pagetables(boot_phys_offset);
+    /* Device-tree was mapped in boot page tables, remap it in the new tables */
+    device_tree_flattened = early_fdt_map(fdt_paddr);
+
     setup_mm();
 
     vm_init();
diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
index f1a7561d79..246b0ca04d 100644
--- a/xen/common/llc-coloring.c
+++ b/xen/common/llc-coloring.c
@@ -29,6 +29,8 @@ static unsigned int __ro_after_init xen_num_colors;
 
 #define mfn_color_mask              (max_nr_colors - 1)
 #define mfn_to_color(mfn)           (mfn_x(mfn) & mfn_color_mask)
+#define get_mfn_with_color(mfn, color) \
+    (_mfn((mfn_x(mfn) & ~mfn_color_mask) | (color)))
 
 /*
  * Parse the coloring configuration given in the buf string, following the
@@ -326,6 +328,27 @@ unsigned int get_max_nr_llc_colors(void)
     return max_nr_colors;
 }
 
+paddr_t __init xen_colored_map_size(void)
+{
+    return ROUNDUP((_end - _start) * max_nr_colors, XEN_PADDR_ALIGN);
+}
+
+mfn_t __init xen_colored_mfn(mfn_t mfn)
+{
+    unsigned int i, color = mfn_to_color(mfn);
+
+    for ( i = 0; i < xen_num_colors; i++ )
+    {
+        if ( color == xen_colors[i] )
+            return mfn;
+        else if ( color < xen_colors[i] )
+            return get_mfn_with_color(mfn, xen_colors[i]);
+    }
+
+    /* Jump to next color space (max_nr_colors mfns) and use the first color */
+    return get_mfn_with_color(mfn_add(mfn, max_nr_colors), xen_colors[0]);
+}
+
 /*
  * Local variables:
  * mode: C
diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
index 7f8218bfb2..618833c5dc 100644
--- a/xen/include/xen/llc-coloring.h
+++ b/xen/include/xen/llc-coloring.h
@@ -26,6 +26,17 @@ static inline void domain_dump_llc_colors(const struct domain *d) {}
 static inline void domain_llc_coloring_free(struct domain *d) {}
 #endif
 
+/**
+ * Iterate over each Xen mfn in the colored space.
+ * @start_mfn:  the first mfn that needs to be colored.
+ * @mfn:        the current mfn.
+ * @i:          loop index.
+ */
+#define for_each_xen_colored_mfn(start_mfn, mfn, i) \
+    for ( i = 0, mfn = xen_colored_mfn(start_mfn);  \
+          i < (_end - _start) >> PAGE_SHIFT;        \
+          i++, mfn = xen_colored_mfn(mfn_add(mfn, 1)) )
+
 unsigned int get_llc_way_size(void);
 void arch_llc_coloring_init(void);
 int dom0_set_llc_colors(struct domain *d);
@@ -37,6 +48,9 @@ struct page_info;
 unsigned int page_to_llc_color(const struct page_info *pg);
 unsigned int get_max_nr_llc_colors(void);
 
+paddr_t xen_colored_map_size(void);
+mfn_t xen_colored_mfn(mfn_t mfn);
+
 #endif /* __COLORING_H__ */
 
 /*
-- 
2.34.1
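
As an illustration of the lookup done by xen_colored_mfn() in the patch
above, here is a minimal standalone sketch in plain C. It is not the code
from the series: it assumes an MFN can be modelled as an unsigned long, that
the color list is sorted in ascending order, and that max_nr_colors is a
power of two. The names next_colored_mfn and mfn_model_t are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model: an MFN is a plain integer in this sketch. */
typedef unsigned long mfn_model_t;

/*
 * Return the first frame number >= mfn whose color (its low bits) is in
 * colors[]. Assumes colors[] is sorted ascending and max_nr_colors is a
 * power of two, so that the color is simply mfn & (max_nr_colors - 1).
 */
static mfn_model_t next_colored_mfn(mfn_model_t mfn,
                                    const unsigned int *colors,
                                    size_t num_colors,
                                    unsigned int max_nr_colors)
{
    mfn_model_t mask = max_nr_colors - 1;
    unsigned int color = (unsigned int)(mfn & mask);
    size_t i;

    assert(num_colors > 0);
    assert(max_nr_colors && !(max_nr_colors & (max_nr_colors - 1)));

    for ( i = 0; i < num_colors; i++ )
    {
        /* The frame already has an allowed color: keep it. */
        if ( color == colors[i] )
            return mfn;
        /* Next allowed color in the same block of max_nr_colors frames. */
        if ( color < colors[i] )
            return (mfn & ~mask) | colors[i];
    }

    /* No allowed color left in this block: first color of the next block. */
    return ((mfn + max_nr_colors) & ~mask) | colors[0];
}
```

With four colors in total and colors { 1, 3 } selected, frames 0 and 1 both
resolve to frame 1, frame 2 resolves to frame 3, and frame 4 resolves to
frame 5. Iterating with "resolve, then add one" visits only frames of the
selected colors, which is the pattern for_each_xen_colored_mfn() relies on.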




* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
@ 2024-03-15 11:39   ` Carlo Nonato
  2024-03-19 14:58   ` Jan Beulich
  1 sibling, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-15 11:39 UTC (permalink / raw)
  To: xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri

Hi all,

unfortunately, this patch doesn't apply cleanly to the latest master.
The conflict is very small: just a reordering of two lines in
xen/common/Kconfig. Should I resend the whole series?

Thanks.

On Fri, Mar 15, 2024 at 11:59 AM Carlo Nonato
<carlo.nonato@minervasys.tech> wrote:
>
> Last Level Cache (LLC) coloring allows partitioning the cache into smaller
> chunks called cache colors. Since not all architectures can actually
> implement it, add a HAS_LLC_COLORING Kconfig and put other options under
> xen/arch.
>
> LLC colors are a property of the domain, so the domain struct has to be
> extended.
>
> Based on original work from: Luca Miccio <lucmiccio@gmail.com>
>
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> Signed-off-by: Marco Solieri <marco.solieri@minervasys.tech>
> ---
> v7:
> - SUPPORT.md changes added to this patch
> - extended documentation to better address applicability of cache coloring
> - "llc-nr-ways" and "llc-size" params introduced in favor of "llc-way-size"
> - moved dump_llc_coloring_info() call in 'm' keyhandler (pagealloc_info())
> v6:
> - moved almost all code in common
> - moved documentation in this patch
> - reintroduced range for CONFIG_NR_LLC_COLORS
> - reintroduced some stub functions to reduce the number of checks on
>   llc_coloring_enabled
> - moved domain_llc_coloring_free() in same patch where allocation happens
> - turned "d->llc_colors" to pointer-to-const
> - llc_coloring_init() now returns void and panics if errors are found
> v5:
> - used - instead of _ for filenames
> - removed domain_create_llc_colored()
> - removed stub functions
> - coloring domain fields are now #ifdef protected
> v4:
> - Kconfig options moved to xen/arch
> - removed range for CONFIG_NR_LLC_COLORS
> - added "llc_coloring_enabled" global to later implement the boot-time
>   switch
> - added domain_create_llc_colored() to be able to pass colors
> - added is_domain_llc_colored() macro
> ---
>  SUPPORT.md                        |   7 ++
>  docs/misc/cache-coloring.rst      | 125 ++++++++++++++++++++++++++++++
>  docs/misc/xen-command-line.pandoc |  37 +++++++++
>  xen/arch/Kconfig                  |  20 +++++
>  xen/common/Kconfig                |   3 +
>  xen/common/Makefile               |   1 +
>  xen/common/keyhandler.c           |   3 +
>  xen/common/llc-coloring.c         | 102 ++++++++++++++++++++++++
>  xen/common/page_alloc.c           |   3 +
>  xen/include/xen/llc-coloring.h    |  36 +++++++++
>  xen/include/xen/sched.h           |   5 ++
>  11 files changed, 342 insertions(+)
>  create mode 100644 docs/misc/cache-coloring.rst
>  create mode 100644 xen/common/llc-coloring.c
>  create mode 100644 xen/include/xen/llc-coloring.h
>
> diff --git a/SUPPORT.md b/SUPPORT.md
> index 510bb02190..456abd42bf 100644
> --- a/SUPPORT.md
> +++ b/SUPPORT.md
> @@ -364,6 +364,13 @@ by maintaining multiple physical to machine (p2m) memory mappings.
>      Status, x86 HVM: Tech Preview
>      Status, ARM: Tech Preview
>
> +### Cache coloring
> +
> +Allows reserving Last Level Cache (LLC) partitions for Dom0, DomUs and Xen
> +itself.
> +
> +    Status, Arm64: Experimental
> +
>  ## Resource Management
>
>  ### CPU Pools
> diff --git a/docs/misc/cache-coloring.rst b/docs/misc/cache-coloring.rst
> new file mode 100644
> index 0000000000..52ce52ffbd
> --- /dev/null
> +++ b/docs/misc/cache-coloring.rst
> @@ -0,0 +1,125 @@
> +Xen cache coloring user guide
> +=============================
> +
> +The cache coloring support in Xen allows reserving Last Level Cache (LLC)
> +partitions for Dom0, DomUs and Xen itself. Currently only ARM64 is supported.
> +Cache coloring realizes per-set cache partitioning in software and is applicable
> +to shared LLCs as implemented in Cortex-A53, Cortex-A72 and similar CPUs.
> +
> +To compile LLC coloring support set ``CONFIG_LLC_COLORING=y``.
> +
> +If needed, change the maximum number of colors with
> +``CONFIG_NR_LLC_COLORS=<n>``.
> +
> +Runtime configuration is done via `Command line parameters`_.
> +
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> +to each core (hence using multiple cache units), while the last level is shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> +in software and mitigates this, guaranteeing higher and more predictable
> +performances for memory accesses.
> +Software-based cache coloring is particularly useful in those situations where
> +no hardware mechanisms (e.g., DSU-based way partitioning) are available to
> +partition caches. This is the case for e.g., Cortex-A53, A57 and A72 CPUs that
> +feature a L2 LLC cache shared among all cores.
> +
> +The key concept underlying cache coloring is a fragmentation of the memory
> +space into a set of sub-spaces called colors that are mapped to disjoint cache
> +partitions. Technically, the whole memory space is first divided into a number
> +of subsequent regions. Then each region is in turn divided into a number of
> +subsequent sub-colors. The generic i-th color is then obtained by all the
> +i-th sub-colors in each region.
> +
> +::
> +
> +                            Region j            Region j+1
> +                .....................   ............
> +                .                     . .
> +                .                       .
> +            _ _ _______________ _ _____________________ _ _
> +                |     |     |     |     |     |     |
> +                | c_0 | c_1 |     | c_n | c_0 | c_1 |
> +           _ _ _|_____|_____|_ _ _|_____|_____|_____|_ _ _
> +                    :                       :
> +                    :                       :...         ... .
> +                    :                            color 0
> +                    :...........................         ... .
> +                                                :
> +          . . ..................................:
> +
> +How colors are actually defined depends on the function that maps memory to
> +cache lines. In case of physically-indexed, physically-tagged caches with linear
> +mapping, the set index is found by extracting some contiguous bits from the
> +physical address. This allows colors to be defined as shown in the figure: they
> +appear in memory as subsequent blocks of equal size and repeat themselves after
> +``n`` different colors, where ``n`` is the total number of colors.
> +
> +If some kind of bit shuffling appears in the mapping function, then colors
> +assume a different layout in memory. Those kinds of caches aren't supported by
> +the current implementation.
> +
> +**Note**: Finding the exact cache mapping function can be a really difficult
> +task since it's not always documented in the CPU manual. As said, Cortex-A53, A57
> +and A72 are known to work with the current implementation.
> +
> +How to compute the number of colors
> +###################################
> +
> +Taking the linear mapping from physical memory to cache lines for granted, the
> +number of available colors for a specific platform is computed using three
> +parameters:
> +
> +- the size of the LLC.
> +- the number of the LLC ways.
> +- the page size used by Xen.
> +
> +The first two parameters can be found in the processor manual, while the third
> +one is the minimum mapping granularity. Dividing the cache size by the number of
> +its ways we obtain the size of a way. Dividing this number by the page size,
> +the number of total cache colors is found. So for example an Arm Cortex-A53
> +with a 16-way associative 1 MiB LLC can isolate up to 16 colors when pages are
> +4 KiB in size.
> +
> +LLC size and number of ways are probed automatically by default, so there
> +should be no need to compute the number of colors yourself.
> +
> +Effective colors assignment
> +###########################
> +
> +When assigning colors:
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors need to be used for their memory.
> +
> +2. To improve spatial locality, color assignment should privilege continuity in
> +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> +   domain J is better than assigning colors (0,2) to I and (1,3) to J.
> +
> +Command line parameters
> +***********************
> +
> +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> +
> ++----------------------+-------------------------------+
> +| **Parameter**        | **Description**               |
> ++----------------------+-------------------------------+
> +| ``llc-coloring``     | enable coloring at runtime    |
> ++----------------------+-------------------------------+
> +| ``llc-size``         | set the LLC size              |
> ++----------------------+-------------------------------+
> +| ``llc-nr-ways``      | set the LLC number of ways    |
> ++----------------------+-------------------------------+
> +
> +Auto-probing of LLC specs
> +#########################
> +
> +LLC size and number of ways are probed automatically by default.
> +
> +LLC specs can be manually set via the above command line parameters. This
> +bypasses any auto-probing and it's used to overcome failing situations or for
> +debugging/testing purposes.
> diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
> index 54edbc0fbc..2936abea2c 100644
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>  record with other registers.
>
> +### llc-coloring
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Flag to enable or disable LLC coloring support at runtime. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> +cache coloring documentation for more info.
> +
> +### llc-nr-ways
> +> `= <integer>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the number of ways of the Last Level Cache. This option is available
> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> +to find the number of supported cache colors. By default the value is
> +automatically computed by probing the hardware, but in case of specific needs,
> +it can be manually set. Those include failing probing and debugging/testing
> +purposes so that it's possible to emulate platforms with different number of
> +supported colors. If set, also "llc-size" must be set, otherwise the default
> +will be used.
> +
> +### llc-size
> +> `= <size>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the size of the Last Level Cache. This option is available only when
> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> +the number of supported cache colors. By default the value is automatically
> +computed by probing the hardware, but in case of specific needs, it can be
> +manually set. Those include failing probing and debugging/testing purposes so
> +that it's possible to emulate platforms with different number of supported
> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> +used.
> +
>  ### lock-depth-size
>  > `= <integer>`
>
> diff --git a/xen/arch/Kconfig b/xen/arch/Kconfig
> index 67ba38f32f..a65c38e53e 100644
> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,23 @@ config NR_NUMA_NODES
>           associated with multiple-nodes management. It is the upper bound of
>           the number of NUMA nodes that the scheduler, memory allocation and
>           other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> +       bool "Last Level Cache (LLC) coloring" if EXPERT
> +       depends on HAS_LLC_COLORING
> +       depends on !NUMA
> +
> +config NR_LLC_COLORS
> +       int "Maximum number of LLC colors"
> +       range 2 1024
> +       default 128
> +       depends on LLC_COLORING
> +       help
> +         Controls the build-time size of various arrays associated with LLC
> +         coloring. Refer to cache coloring documentation for how to compute the
> +         number of colors supported by the platform. This is only an upper
> +         bound. The runtime value is autocomputed or manually set via cmdline.
> +         The default value corresponds to an 8 MiB 16-way LLC, which should be
> +         more than what's needed in the general case. Use only power of 2 values.
> +         1024 is the number of colors that fit in a 4 KiB page when integers are 4
> +         bytes long.
> diff --git a/xen/common/Kconfig b/xen/common/Kconfig
> index a5c3d5a6bf..1e467178bd 100644
> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -71,6 +71,9 @@ config HAS_IOPORTS
>  config HAS_KEXEC
>         bool
>
> +config HAS_LLC_COLORING
> +       bool
> +
>  config HAS_PMAP
>         bool
>
> diff --git a/xen/common/Makefile b/xen/common/Makefile
> index e5eee19a85..3054254a7d 100644
> --- a/xen/common/Makefile
> +++ b/xen/common/Makefile
> @@ -23,6 +23,7 @@ obj-y += keyhandler.o
>  obj-$(CONFIG_KEXEC) += kexec.o
>  obj-$(CONFIG_KEXEC) += kimage.o
>  obj-$(CONFIG_LIVEPATCH) += livepatch.o livepatch_elf.o
> +obj-$(CONFIG_LLC_COLORING) += llc-coloring.o
>  obj-$(CONFIG_MEM_ACCESS) += mem_access.o
>  obj-y += memory.o
>  obj-y += multicall.o
> diff --git a/xen/common/keyhandler.c b/xen/common/keyhandler.c
> index 127ca50696..778f93e063 100644
> --- a/xen/common/keyhandler.c
> +++ b/xen/common/keyhandler.c
> @@ -5,6 +5,7 @@
>  #include <asm/regs.h>
>  #include <xen/delay.h>
>  #include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
>  #include <xen/param.h>
>  #include <xen/shutdown.h>
>  #include <xen/event.h>
> @@ -303,6 +304,8 @@ static void cf_check dump_domains(unsigned char key)
>
>          arch_dump_domain_info(d);
>
> +        domain_dump_llc_colors(d);
> +
>          rangeset_domain_printk(d);
>
>          dump_pageframe_info(d);
> diff --git a/xen/common/llc-coloring.c b/xen/common/llc-coloring.c
> new file mode 100644
> index 0000000000..db96a83ddd
> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/param.h>
> +
> +static bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);
> +
> +static unsigned int __initdata llc_size;
> +size_param("llc-size", llc_size);
> +static unsigned int __initdata llc_nr_ways;
> +integer_param("llc-nr-ways", llc_nr_ways);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +    unsigned int i;
> +
> +    printk("{ ");
> +    for ( i = 0; i < num_colors; i++ )
> +    {
> +        unsigned int start = colors[i], end = start;
> +
> +        printk("%u", start);
> +
> +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> +            ;
> +
> +        if ( start != end )
> +            printk("-%u", end);
> +
> +        if ( i < num_colors - 1 )
> +            printk(", ");
> +    }
> +    printk(" }\n");
> +}
> +
> +void __init llc_coloring_init(void)
> +{
> +    unsigned int way_size;
> +
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    if ( llc_size && llc_nr_ways )
> +        way_size = llc_size / llc_nr_ways;
> +    else
> +    {
> +        way_size = get_llc_way_size();
> +        if ( !way_size )
> +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> +    }
> +
> +    /*
> +     * The maximum number of colors must be a power of 2 in order to correctly
> +     * map them to bits of an address.
> +     */
> +    max_nr_colors = way_size >> PAGE_SHIFT;
> +
> +    if ( max_nr_colors & (max_nr_colors - 1) )
> +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> +
> +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> +              max_nr_colors, CONFIG_NR_LLC_COLORS);
> +
> +    arch_llc_coloring_init();
> +}
> +
> +void cf_check dump_llc_coloring_info(void)
> +{
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    printk("LLC coloring info:\n");
> +    printk("    Number of LLC colors supported: %u\n", max_nr_colors);
> +}
> +
> +void cf_check domain_dump_llc_colors(const struct domain *d)
> +{
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    printk("%u LLC colors: ", d->num_llc_colors);
> +    print_colors(d->llc_colors, d->num_llc_colors);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
> index 2ec17df9b4..c38edb9a58 100644
> --- a/xen/common/page_alloc.c
> +++ b/xen/common/page_alloc.c
> @@ -126,6 +126,7 @@
>  #include <xen/irq.h>
>  #include <xen/keyhandler.h>
>  #include <xen/lib.h>
> +#include <xen/llc-coloring.h>
>  #include <xen/mm.h>
>  #include <xen/nodemask.h>
>  #include <xen/numa.h>
> @@ -2623,6 +2624,8 @@ static void cf_check pagealloc_info(unsigned char key)
>      }
>
>      printk("    Dom heap: %lukB free\n", total << (PAGE_SHIFT-10));
> +
> +    dump_llc_coloring_info();
>  }
>
>  static __init int cf_check pagealloc_keyhandler_init(void)
> diff --git a/xen/include/xen/llc-coloring.h b/xen/include/xen/llc-coloring.h
> new file mode 100644
> index 0000000000..c60c8050c5
> --- /dev/null
> +++ b/xen/include/xen/llc-coloring.h
> @@ -0,0 +1,36 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common header
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#ifndef __COLORING_H__
> +#define __COLORING_H__
> +
> +#include <xen/sched.h>
> +#include <public/domctl.h>
> +
> +#ifdef CONFIG_LLC_COLORING
> +void llc_coloring_init(void);
> +void dump_llc_coloring_info(void);
> +void domain_dump_llc_colors(const struct domain *d);
> +#else
> +static inline void llc_coloring_init(void) {}
> +static inline void dump_llc_coloring_info(void) {}
> +static inline void domain_dump_llc_colors(const struct domain *d) {}
> +#endif
> +
> +unsigned int get_llc_way_size(void);
> +void arch_llc_coloring_init(void);
> +
> +#endif /* __COLORING_H__ */
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * tab-width: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
> index 37f5922f32..96cc934fc3 100644
> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -627,6 +627,11 @@ struct domain
>
>      /* Holding CDF_* constant. Internal flags for domain creation. */
>      unsigned int cdf;
> +
> +#ifdef CONFIG_LLC_COLORING
> +    unsigned int num_llc_colors;
> +    const unsigned int *llc_colors;
> +#endif
>  };
>
>  static inline struct page_list_head *page_to_list(
> --
> 2.34.1
>



* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
  2024-03-15 11:39   ` Carlo Nonato
@ 2024-03-19 14:58   ` Jan Beulich
  2024-03-21 15:03     ` Carlo Nonato
  1 sibling, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 14:58 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> +Background
> +**********
> +
> +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> +to each core (hence using multiple cache units), while the last level is shared
> +among all of them. Such configuration implies that memory operations on one
> +core (e.g. running a DomU) are able to generate interference on another core
> +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> +in software and mitigates this, guaranteeing higher and more predictable
> +performances for memory accesses.

Are you sure about "higher"? On an otherwise idle system, a single domain (or
vCPU) may perform better when not partitioned, as more cache would be available
to it overall.

> +How to compute the number of colors
> +###################################
> +
> +Taking the linear mapping from physical memory to cache lines for granted, the
> +number of available colors for a specific platform is computed using three
> +parameters:
> +
> +- the size of the LLC.
> +- the number of the LLC ways.
> +- the page size used by Xen.
> +
> +The first two parameters can be found in the processor manual, while the third
> +one is the minimum mapping granularity. Dividing the cache size by the number of
> +its ways we obtain the size of a way. Dividing this number by the page size,
> +the number of total cache colors is found. So for example an Arm Cortex-A53
> +with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
> +4 KiB in size.
> +
> +LLC size and number of ways are probed automatically by default so there's
> +should be no need to compute the number of colors by yourself.

Is this a leftover from the earlier (single) command line option?
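
For reference, the arithmetic described in the quoted documentation can be sketched as below. This is a minimal standalone illustration, not Xen code; llc_nr_colors() is a name made up here, and sizes are assumed to be given in bytes.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): number of LLC colors for a
 * given cache size, associativity and page size. Returns 0 when the inputs
 * cannot yield a meaningful result.
 */
static unsigned int llc_nr_colors(size_t llc_size, unsigned int nr_ways,
                                  size_t page_size)
{
    size_t way_size;

    if ( !nr_ways || !page_size )
        return 0;

    way_size = llc_size / nr_ways;      /* bytes covered by a single way */

    return way_size / page_size;        /* pages, i.e. colors, per way */
}
```

With the values from the quoted text (1 MiB, 16 ways, 4 KiB pages) this yields 16, and the 8 MiB 16-way case mentioned in the Kconfig help yields 128.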

> +Effective colors assignment
> +###########################
> +
> +When assigning colors:
> +
> +1. If one wants to avoid cache interference between two domains, different
> +   colors needs to be used for their memory.
> +
> +2. To improve spatial locality, color assignment should privilege continuity in

s/privilege/prefer/ ?

> +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> +   domain J is better than assigning colors (0,2) to I and (1,3) to J.

While I consider 1 obvious without further explanation, the same isn't
the case for 2: What's the benefit of spatial locality? If there was
support for allocating higher order pages, I could certainly see the
point, but iirc that isn't supported (yet).

> +Command line parameters
> +***********************
> +
> +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> +
> ++----------------------+-------------------------------+
> +| **Parameter**        | **Description**               |
> ++----------------------+-------------------------------+
> +| ``llc-coloring``     | enable coloring at runtime    |
> ++----------------------+-------------------------------+
> +| ``llc-size``         | set the LLC size              |
> ++----------------------+-------------------------------+
> +| ``llc-nr-ways``      | set the LLC number of ways    |
> ++----------------------+-------------------------------+
> +
> +Auto-probing of LLC specs
> +#########################
> +
> +LLC size and number of ways are probed automatically by default.
> +
> +LLC specs can be manually set via the above command line parameters. This
> +bypasses any auto-probing and it's used to overcome failing situations or for
> +debugging/testing purposes.

As well as perhaps for cases where the auto-probing logic is flawed?

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>  record with other registers.
>  
> +### llc-coloring
> +> `= <boolean>`
> +
> +> Default: `false`
> +
> +Flag to enable or disable LLC coloring support at runtime. This option is
> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> +cache coloring documentation for more info.
> +
> +### llc-nr-ways
> +> `= <integer>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the number of ways of the Last Level Cache. This option is available
> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> +to find the number of supported cache colors. By default the value is
> +automatically computed by probing the hardware, but in case of specific needs,
> +it can be manually set. Those include failing probing and debugging/testing
> +purposes so that it's possibile to emulate platforms with different number of
> +supported colors. If set, also "llc-size" must be set, otherwise the default
> +will be used.
> +
> +### llc-size
> +> `= <size>`
> +
> +> Default: `Obtained from hardware`
> +
> +Specify the size of the Last Level Cache. This option is available only when
> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> +the number of supported cache colors. By default the value is automatically
> +computed by probing the hardware, but in case of specific needs, it can be
> +manually set. Those include failing probing and debugging/testing purposes so
> +that it's possibile to emulate platforms with different number of supported
> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> +used.

Wouldn't it make sense to infer "llc-coloring" when both of the latter options
were supplied?

> --- a/xen/arch/Kconfig
> +++ b/xen/arch/Kconfig
> @@ -31,3 +31,23 @@ config NR_NUMA_NODES
>  	  associated with multiple-nodes management. It is the upper bound of
>  	  the number of NUMA nodes that the scheduler, memory allocation and
>  	  other NUMA-aware components can handle.
> +
> +config LLC_COLORING
> +	bool "Last Level Cache (LLC) coloring" if EXPERT
> +	depends on HAS_LLC_COLORING
> +	depends on !NUMA
> +
> +config NR_LLC_COLORS
> +	int "Maximum number of LLC colors"
> +	range 2 1024
> +	default 128
> +	depends on LLC_COLORING
> +	help
> +	  Controls the build-time size of various arrays associated with LLC
> +	  coloring. Refer to cache coloring documentation for how to compute the
> +	  number of colors supported by the platform. This is only an upper
> +	  bound. The runtime value is autocomputed or manually set via cmdline.
> +	  The default value corresponds to an 8 MiB 16-ways LLC, which should be
> +	  more than what's needed in the general case. Use only power of 2 values.

I think I said so before: Rather than telling people to pick only power-of-2
values (and it remaining unclear what happens if they don't), why don't you
simply keep them from specifying anything bogus, by having them pass in the
value as an exponent of 2 (i.e. log2 of the number of colors)? I.e. "range 1 10"
and "default 7" for what you're currently putting in place.
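
To illustrate, a rough sketch of how the C side could consume such an option. CONFIG_LLC_COLORS_ORDER and NR_LLC_COLORS are hypothetical names used only for this example; the fallback #define merely stands in for what Kconfig would generate.

```c
/* Hypothetical Kconfig symbol: log2 of the maximum number of colors. */
#ifndef CONFIG_LLC_COLORS_ORDER
#define CONFIG_LLC_COLORS_ORDER 7
#endif

/* A power of 2 by construction: 1 << 7 == 128 with the default above. */
#define NR_LLC_COLORS (1U << CONFIG_LLC_COLORS_ORDER)

/* Build-time sized arrays would then use the derived value. */
static unsigned int dom0_colors[NR_LLC_COLORS];
```

With "range 1 10" the derived value spans 2 to 1024, matching the range the patch currently allows, and no non-power-of-2 value can be configured in the first place.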

> +	  1024 is the number of colors that fit in a 4 KiB page when integers are 4
> +	  bytes long.

How's this relevant here? As a justification it would make sense to have this
in the patch description instead.

I'm btw also not convinced this is a good place to put these options. Imo ...

> --- a/xen/common/Kconfig
> +++ b/xen/common/Kconfig
> @@ -71,6 +71,9 @@ config HAS_IOPORTS
>  config HAS_KEXEC
>  	bool
>  
> +config HAS_LLC_COLORING
> +	bool
> +
>  config HAS_PMAP
>  	bool

... they'd better live further down from here.

> --- /dev/null
> +++ b/xen/common/llc-coloring.c
> @@ -0,0 +1,102 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Last Level Cache (LLC) coloring common code
> + *
> + * Copyright (C) 2022 Xilinx Inc.
> + */
> +#include <xen/keyhandler.h>
> +#include <xen/llc-coloring.h>
> +#include <xen/param.h>
> +
> +static bool __ro_after_init llc_coloring_enabled;
> +boolean_param("llc-coloring", llc_coloring_enabled);
> +
> +static unsigned int __initdata llc_size;
> +size_param("llc-size", llc_size);
> +static unsigned int __initdata llc_nr_ways;
> +integer_param("llc-nr-ways", llc_nr_ways);
> +/* Number of colors available in the LLC */
> +static unsigned int __ro_after_init max_nr_colors;
> +
> +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +    unsigned int i;
> +
> +    printk("{ ");
> +    for ( i = 0; i < num_colors; i++ )
> +    {
> +        unsigned int start = colors[i], end = start;
> +
> +        printk("%u", start);
> +
> +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> +            ;
> +
> +        if ( start != end )
> +            printk("-%u", end);
> +
> +        if ( i < num_colors - 1 )
> +            printk(", ");
> +    }
> +    printk(" }\n");
> +}
> +
> +void __init llc_coloring_init(void)
> +{
> +    unsigned int way_size;
> +
> +    if ( !llc_coloring_enabled )
> +        return;
> +
> +    if ( llc_size && llc_nr_ways )
> +        way_size = llc_size / llc_nr_ways;
> +    else
> +    {
> +        way_size = get_llc_way_size();
> +        if ( !way_size )
> +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> +    }
> +
> +    /*
> +     * The maximum number of colors must be a power of 2 in order to correctly
> +     * map them to bits of an address.
> +     */
> +    max_nr_colors = way_size >> PAGE_SHIFT;
> +
> +    if ( max_nr_colors & (max_nr_colors - 1) )
> +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> +
> +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> +              max_nr_colors, CONFIG_NR_LLC_COLORS);

Rather than crashing when max_nr_colors is too large, couldn't you simply
halve it a number of times? That would still satisfy the requirement on
isolation, wouldn't it?
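
Something along these lines is what I have in mind. It is only a standalone sketch; clamp_nr_colors() is a made-up name, and it assumes the incoming value has already passed the power-of-2 check.

```c
/*
 * Hypothetical helper: reduce a power-of-2 color count until it fits the
 * build-time limit. Halving keeps the value a power of 2, so it can still
 * be mapped onto address bits; the partitions merely get coarser.
 */
static unsigned int clamp_nr_colors(unsigned int nr_colors, unsigned int limit)
{
    while ( nr_colors > limit )
        nr_colors >>= 1;

    return nr_colors;
}
```

The lower bound (fewer than 2 colors) would of course still need handling separately.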

> +    arch_llc_coloring_init();
> +}
> +
> +void cf_check dump_llc_coloring_info(void)

I don't think cf_check is needed here nor ...

> +void cf_check domain_dump_llc_colors(const struct domain *d)

... here anymore. You're using direct calls now.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-15 10:58 ` [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support Carlo Nonato
@ 2024-03-19 15:30   ` Jan Beulich
  2024-03-21 15:04     ` Carlo Nonato
  2024-03-19 15:45   ` Jan Beulich
  1 sibling, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:30 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>  
>  Specify a list of IO ports to be excluded from dom0 access.
>  
> +### dom0-llc-colors
> +> `= List of [ <integer> | <integer>-<integer> ]`
> +
> +> Default: `All available LLC colors`
> +
> +Specify dom0 LLC color configuration. This option is available only when
> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
> +colors are used.

My reservation towards this being a top-level option remains.

> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -18,6 +18,63 @@ integer_param("llc-nr-ways", llc_nr_ways);
>  /* Number of colors available in the LLC */
>  static unsigned int __ro_after_init max_nr_colors;
>  
> +static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
> +static unsigned int __initdata dom0_num_colors;
> +
> +/*
> + * Parse the coloring configuration given in the buf string, following the
> + * syntax below.
> + *
> + * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
> + * RANGE               ::= COLOR-COLOR
> + *
> + * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
> + */
> +static int __init parse_color_config(const char *buf, unsigned int *colors,
> +                                     unsigned int max_num_colors,
> +                                     unsigned int *num_colors)
> +{
> +    const char *s = buf;
> +
> +    *num_colors = 0;
> +
> +    while ( *s != '\0' )
> +    {
> +        unsigned int color, start, end;
> +
> +        start = simple_strtoul(s, &s, 0);
> +
> +        if ( *s == '-' )    /* Range */
> +        {
> +            s++;
> +            end = simple_strtoul(s, &s, 0);
> +        }
> +        else                /* Single value */
> +            end = start;
> +
> +        if ( start > end || (end - start) > (UINT_MAX - *num_colors) ||
> +             (*num_colors + (end - start)) >= max_num_colors )
> +            return -EINVAL;
> +
> +        for ( color = start; color <= end; color++ )
> +            colors[(*num_colors)++] = color;

I can't spot any range check on start/end/color itself. In fact I was first
meaning to ask why the return value of simple_strtoul() is silently clipped
from unsigned long to unsigned int. Don't forget that a range specification
may easily degenerate into a negative number (due to a simple oversight or
typo), which would then be converted to a huge positive one.
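
To show the kind of checking meant, here is a standalone sketch using the C library's strtoul() rather than Xen's simple_strtoul() (the two differ in details). parse_color_range() is a hypothetical name; the point is keeping the wide type until after the range check and refusing anything that doesn't start with a digit.

```c
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper: parse one COLOR or COLOR-COLOR element at *s.
 * On success stores the bounds, advances *s past the element and returns 0.
 */
static int parse_color_range(const char **s, unsigned int *start,
                             unsigned int *end)
{
    const char *p = *s;
    char *next;
    unsigned long lo, hi;

    /* strtoul() would silently accept a sign or leading blanks. */
    if ( !isdigit((unsigned char)*p) )
        return -EINVAL;
    lo = hi = strtoul(p, &next, 0);

    if ( *next == '-' )                 /* Range */
    {
        p = next + 1;
        if ( !isdigit((unsigned char)*p) )
            return -EINVAL;
        hi = strtoul(p, &next, 0);
    }

    /* Check while still in the wide type, before narrowing to unsigned int. */
    if ( lo > hi || hi > UINT_MAX )
        return -EINVAL;

    *start = lo;
    *end = hi;
    *s = next;

    return 0;
}
```

A typo like "3--1" is rejected here instead of turning into a huge upper bound. Where unsigned long is no wider than unsigned int the UINT_MAX comparison is a no-op, so the digit check is what catches the sign in that case.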

> @@ -41,6 +98,22 @@ static void print_colors(const unsigned int *colors, unsigned int num_colors)
>      printk(" }\n");
>  }
>  
> +static bool check_colors(const unsigned int *colors, unsigned int num_colors)
> +{
> +    unsigned int i;
> +
> +    for ( i = 0; i < num_colors; i++ )
> +    {
> +        if ( colors[i] >= max_nr_colors )
> +        {
> +            printk(XENLOG_ERR "LLC color %u >= %u\n", colors[i], max_nr_colors);
> +            return false;
> +        }
> +    }
> +
> +    return true;
> +}

Oh, here's the range checking of the color values themselves. Perhaps
a comment in parse_color_config() would help.

> @@ -91,6 +164,61 @@ void cf_check domain_dump_llc_colors(const struct domain *d)
>      print_colors(d->llc_colors, d->num_llc_colors);
>  }
>  
> +static int domain_set_default_colors(struct domain *d)
> +{
> +    unsigned int *colors = xmalloc_array(unsigned int, max_nr_colors);
> +    unsigned int i;
> +
> +    if ( !colors )
> +        return -ENOMEM;
> +
> +    printk(XENLOG_WARNING
> +           "LLC color config not found for %pd, using all colors\n", d);
> +
> +    for ( i = 0; i < max_nr_colors; i++ )
> +        colors[i] = i;
> +
> +    d->llc_colors = colors;
> +    d->num_llc_colors = max_nr_colors;
> +
> +    return 0;
> +}

If this function is expected to actually come into play, wouldn't it
make sense to set up such an array just once, and re-use it wherever
necessary?

Also right here both this and check_colors() could be __init. I
understand that subsequent patches will also want to use the
functions at runtime, but until then this looks slightly wrong. I'd
like to ask that such aspects be mentioned in the description, to
avoid respective questions.

> +int __init dom0_set_llc_colors(struct domain *d)
> +{
> +    unsigned int *colors;
> +
> +    if ( !dom0_num_colors )
> +        return domain_set_default_colors(d);
> +
> +    if ( !check_colors(dom0_colors, dom0_num_colors) )
> +    {
> +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> +        return -EINVAL;
> +    }
> +
> +    colors = xmalloc_array(unsigned int, dom0_num_colors);
> +    if ( !colors )
> +        return -ENOMEM;
> +
> +    /* Static type checking */
> +    (void)(colors == dom0_colors);

Btw, a means to avoid this would be to use typeof() in the declaration
of "colors".
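
Roughly like the standalone sketch below. It uses the GNU __typeof__ spelling and malloc() so that it builds outside of Xen; dup_dom0_colors() is a made-up name and the array size is arbitrary.

```c
#include <stdlib.h>
#include <string.h>

static unsigned int dom0_colors[128];

/*
 * Hypothetical helper: the element type of the copy follows the type of
 * dom0_colors automatically, so no separate "static type check" statement
 * is needed to keep the two in sync.
 */
static __typeof__(*dom0_colors) *dup_dom0_colors(size_t n)
{
    __typeof__(*dom0_colors) *colors;

    if ( n > sizeof(dom0_colors) / sizeof(dom0_colors[0]) )
        return NULL;

    colors = malloc(n * sizeof(*colors));
    if ( colors )
        memcpy(colors, dom0_colors, n * sizeof(*colors));

    return colors;
}
```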

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 05/14] xen: extend domctl interface for cache coloring
  2024-03-15 10:58 ` [PATCH v7 05/14] xen: extend domctl interface for cache coloring Carlo Nonato
@ 2024-03-19 15:37   ` Jan Beulich
  2024-03-21 15:11     ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:37 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> @@ -219,6 +220,39 @@ void domain_llc_coloring_free(struct domain *d)
>      xfree(__va(__pa(d->llc_colors)));
>  }
>  
> +int domain_set_llc_colors(struct domain *d,
> +                          const struct xen_domctl_set_llc_colors *config)
> +{
> +    unsigned int *colors;
> +
> +    if ( d->num_llc_colors )
> +        return -EEXIST;
> +
> +    if ( !config->num_llc_colors )
> +        return domain_set_default_colors(d);
> +
> +    if ( config->num_llc_colors > max_nr_colors || config->pad )

The check of "pad" wants carrying out in all cases; I expect it wants
moving to the caller.

> +        return -EINVAL;
> +
> +    colors = xmalloc_array(unsigned int, config->num_llc_colors);
> +    if ( !colors )
> +        return -ENOMEM;
> +
> +    if ( copy_from_guest(colors, config->llc_colors, config->num_llc_colors) )
> +        return -EFAULT;

You're leaking "colors" when taking this or ...

> +    if ( !check_colors(colors, config->num_llc_colors) )
> +    {
> +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> +        return -EINVAL;

... this error path.
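
One common shape for fixing this is a single exit path, sketched below in standalone form. struct dom and set_colors() are stand-ins invented for the example, malloc()/free() replace xmalloc_array()/xfree(), and memcpy() stands in for copy_from_guest().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for the relevant fields of struct domain. */
struct dom {
    unsigned int *llc_colors;
    unsigned int num_llc_colors;
};

static int set_colors(struct dom *d, const unsigned int *src,
                      unsigned int num, unsigned int max_nr_colors)
{
    unsigned int *colors, i;
    int rc = 0;

    if ( !num || num > max_nr_colors )
        return -EINVAL;

    colors = malloc(num * sizeof(*colors));
    if ( !colors )
        return -ENOMEM;

    memcpy(colors, src, num * sizeof(*colors));

    for ( i = 0; i < num; i++ )
        if ( colors[i] >= max_nr_colors )
        {
            rc = -EINVAL;
            goto out;
        }

    d->llc_colors = colors;
    d->num_llc_colors = num;
    colors = NULL;              /* ownership moved to the domain */

 out:
    free(colors);               /* no-op on success, frees on error */
    return rc;
}
```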

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree
  2024-03-15 10:58 ` [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
@ 2024-03-19 15:41   ` Jan Beulich
  2024-03-21 15:12     ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:41 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -253,6 +253,37 @@ int domain_set_llc_colors(struct domain *d,
>      return 0;
>  }
>  
> +int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
> +{
> +    int err;
> +    unsigned int *colors, num_colors;
> +
> +    if ( !str )
> +        return domain_set_default_colors(d);
> +
> +    colors = xmalloc_array(unsigned int, max_nr_colors);
> +    if ( !colors )
> +        return -ENOMEM;
> +
> +    err = parse_color_config(str, colors, max_nr_colors, &num_colors);
> +    if ( err )
> +    {
> +        printk(XENLOG_ERR "Error parsing LLC color configuration");
> +        return err;
> +    }
> +
> +    if ( !check_colors(colors, num_colors) )
> +    {
> +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> +        return -EINVAL;
> +    }

"colors" is again leaked on the error paths.

> +    d->llc_colors = colors;
> +    d->num_llc_colors = num_colors;

num_colors may be quite a bit smaller than max_nr_colors; worth re-
allocating the array to free up excess space?
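
A sketch of what such shrinking could look like, written against the C library's realloc() so it is self-contained; shrink_colors() is a hypothetical name, and whichever Xen allocation helper would be used instead is left open here.

```c
#include <stdlib.h>

/*
 * Hypothetical helper: trim an over-sized color array down to the number
 * of entries actually in use. If shrinking fails, the original (larger)
 * buffer is still valid and is kept.
 */
static unsigned int *shrink_colors(unsigned int *colors,
                                   unsigned int num_colors)
{
    unsigned int *tmp;

    if ( !num_colors )
        return colors;

    tmp = realloc(colors, num_colors * sizeof(*colors));

    return tmp ? tmp : colors;
}
```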

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-15 10:58 ` [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support Carlo Nonato
  2024-03-19 15:30   ` Jan Beulich
@ 2024-03-19 15:45   ` Jan Beulich
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:45 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> --- a/xen/common/llc-coloring.c
> +++ b/xen/common/llc-coloring.c
> @@ -18,6 +18,63 @@ integer_param("llc-nr-ways", llc_nr_ways);
>  /* Number of colors available in the LLC */
>  static unsigned int __ro_after_init max_nr_colors;
>  
> +static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
> +static unsigned int __initdata dom0_num_colors;
> +
> +/*
> + * Parse the coloring configuration given in the buf string, following the
> + * syntax below.
> + *
> + * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
> + * RANGE               ::= COLOR-COLOR
> + *
> + * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
> + */
> +static int __init parse_color_config(const char *buf, unsigned int *colors,
> +                                     unsigned int max_num_colors,
> +                                     unsigned int *num_colors)
> +{
> +    const char *s = buf;
> +
> +    *num_colors = 0;
> +
> +    while ( *s != '\0' )
> +    {
> +        unsigned int color, start, end;
> +
> +        start = simple_strtoul(s, &s, 0);
> +
> +        if ( *s == '-' )    /* Range */
> +        {
> +            s++;
> +            end = simple_strtoul(s, &s, 0);
> +        }
> +        else                /* Single value */
> +            end = start;
> +
> +        if ( start > end || (end - start) > (UINT_MAX - *num_colors) ||
> +             (*num_colors + (end - start)) >= max_num_colors )
> +            return -EINVAL;
> +
> +        for ( color = start; color <= end; color++ )
> +            colors[(*num_colors)++] = color;
> +
> +        if ( *s == ',' )
> +            s++;
> +        else if ( *s != '\0' )
> +            break;
> +    }
> +
> +    return *s ? -EINVAL : 0;
> +}
> +
> +static int __init parse_dom0_colors(const char *s)
> +{
> +    return parse_color_config(s, dom0_colors, ARRAY_SIZE(dom0_colors),

With it not being possible to pass max_nr_colors here (due to the value
not having been established yet), don't you need to check somewhere else
that ...

> +                              &dom0_num_colors);

... dom0_num_colors isn't too large?
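
I.e. once max_nr_colors is known, something like the following standalone sketch would want to run on the parsed data. colors_fit() is a made-up name; it checks the count in addition to the individual values.

```c
#include <stdbool.h>

/*
 * Hypothetical deferred check: both the number of colors and each color
 * value must be within what the platform actually provides.
 */
static bool colors_fit(const unsigned int *colors, unsigned int num_colors,
                       unsigned int max_nr_colors)
{
    unsigned int i;

    if ( num_colors > max_nr_colors )
        return false;

    for ( i = 0; i < num_colors; i++ )
        if ( colors[i] >= max_nr_colors )
            return false;

    return true;
}
```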

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-15 10:58 ` [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
@ 2024-03-19 15:47   ` Jan Beulich
  2024-03-21 16:07   ` Julien Grall
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:47 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> PGC_static and PGC_extra needs to be preserved when assigning a page.
> Define a new macro that groups those flags and use it instead of or'ing
> every time.
> 
> To make preserved flags even more meaningful, they are kept also when
> switching state in mark_page_free().
> 
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging
  2024-03-15 10:58 ` [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging Carlo Nonato
@ 2024-03-19 15:49   ` Jan Beulich
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:49 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> Add a new PGC_no_buddy_merge flag that prevents the buddy algorithm in
> free_heap_pages() from merging pages that have it set. As of now, only
> PGC_static has this feature, but future work can extend it easier than
> before.
> 

Suggested-by: Jan Beulich <jbeulich@suse.com>

> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>

Reviewed-by: Jan Beulich <jbeulich@suse.com>




^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter
  2024-03-15 10:59 ` [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter Carlo Nonato
@ 2024-03-19 15:54   ` Jan Beulich
  2024-03-21 15:36     ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:54 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Luca Miccio, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

On 15.03.2024 11:59, Carlo Nonato wrote:
> From: Luca Miccio <lucmiccio@gmail.com>
> 
> Add a new command line parameter to configure Xen cache colors.
> These colors can be dumped with the cache coloring info debug-key.
> 
> By default, Xen uses the first color.
> Benchmarking the VM interrupt response time provides an estimation of
> LLC usage by Xen's most latency-critical runtime task. Results on Arm
> Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
> reserves 64 KiB of L2, is enough to attain best responsiveness:
> - Xen 1 color latency:  3.1 us
> - Xen 2 color latency:  3.1 us
> 
> More colors are instead very likely to be needed on processors whose L1
> cache is physically-indexed and physically-tagged, such as Cortex-A57.
> In such cases, coloring applies to L1 also, and there typically are two
> distinct L1-colors. Therefore, reserving only one color for Xen would
> senselessly partitions a cache memory that is already private, i.e.
> underutilize it.

Here you say that using just a single color is undesirable on such systems.

> The default amount of Xen colors is thus set to one.

Yet then, without any further explanation you conclude that 1 is the
universal default.

> @@ -147,6 +159,21 @@ void __init llc_coloring_init(void)
>          panic("Number of LLC colors (%u) not in range [2, %u]\n",
>                max_nr_colors, CONFIG_NR_LLC_COLORS);
>  
> +    if ( !xen_num_colors )
> +    {
> +        unsigned int i;
> +
> +        xen_num_colors = MIN(XEN_DEFAULT_NUM_COLORS, max_nr_colors);
> +
> +        printk(XENLOG_WARNING
> +               "Xen LLC color config not found. Using first %u colors\n",
> +               xen_num_colors);
> +        for ( i = 0; i < xen_num_colors; i++ )
> +            xen_colors[i] = i;
> +    }
> +    else if ( !check_colors(xen_colors, xen_num_colors) )
> +        panic("Bad LLC color config for Xen\n");

This "else" branch again lacks a bounds check against max_nr_colors, if
I'm not mistaken.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 14/14] xen/arm: add cache coloring support for Xen
  2024-03-15 10:59 ` [PATCH v7 14/14] xen/arm: add cache coloring support for Xen Carlo Nonato
@ 2024-03-19 15:58   ` Jan Beulich
  2024-03-19 16:15     ` Jan Beulich
  2024-03-19 16:03   ` Jan Beulich
  1 sibling, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 15:58 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel

On 15.03.2024 11:59, Carlo Nonato wrote:
> @@ -326,6 +328,27 @@ unsigned int get_max_nr_llc_colors(void)
>      return max_nr_colors;
>  }
>  
> +paddr_t __init xen_colored_map_size(void)
> +{
> +    return ROUNDUP((_end - _start) * max_nr_colors, XEN_PADDR_ALIGN);
> +}

XEN_PADDR_ALIGN is an inherently Arm thing. It had better not appear
in common code.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 14/14] xen/arm: add cache coloring support for Xen
  2024-03-15 10:59 ` [PATCH v7 14/14] xen/arm: add cache coloring support for Xen Carlo Nonato
  2024-03-19 15:58   ` Jan Beulich
@ 2024-03-19 16:03   ` Jan Beulich
  1 sibling, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 16:03 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel

On 15.03.2024 11:59, Carlo Nonato wrote:
> @@ -62,7 +63,67 @@ unsigned int __init get_llc_way_size(void)
>      return line_size * num_sets;
>  }
>  
> -void __init arch_llc_coloring_init(void) {}

Btw, doing things this way isn't very nice. I was about to ask ...

> +/**
> + * get_xen_paddr - get physical address to relocate Xen to
> + *
> + * Xen is relocated to as near to the top of RAM as possible and
> + * aligned to a XEN_PADDR_ALIGN boundary.
> + */
> +static paddr_t __init get_xen_paddr(paddr_t xen_size)
> +{
> +    const struct meminfo *mi = &bootinfo.mem;
> +    paddr_t min_size;
> +    paddr_t paddr = 0;
> +    unsigned int i;
> +
> +    min_size = (xen_size + (XEN_PADDR_ALIGN-1)) & ~(XEN_PADDR_ALIGN-1);
> +
> +    /* Find the highest bank with enough space. */
> +    for ( i = 0; i < mi->nr_banks; i++ )
> +    {
> +        const struct membank *bank = &mi->bank[i];
> +        paddr_t s, e;
> +
> +        if ( bank->size >= min_size )
> +        {
> +            e = consider_modules(bank->start, bank->start + bank->size,
> +                                 min_size, XEN_PADDR_ALIGN, 0);
> +            if ( !e )
> +                continue;
> +
> +#ifdef CONFIG_ARM_32
> +            /* Xen must be under 4GB */
> +            if ( e > GB(4) )
> +                e = GB(4);
> +            if ( e < bank->start )
> +                continue;
> +#endif
> +
> +            s = e - min_size;
> +
> +            if ( s > paddr )
> +                paddr = s;
> +        }
> +    }
> +
> +    if ( !paddr )
> +        panic("Not enough memory to relocate Xen\n");
> +
> +    printk("Placing Xen at 0x%"PRIpaddr"-0x%"PRIpaddr"\n",
> +           paddr, paddr + min_size);
> +
> +    return paddr;
> +}
> +
> +void __init arch_llc_coloring_init(void)
> +{
> +    struct bootmodule *xen_bootmodule = boot_module_find_by_kind(BOOTMOD_XEN);
> +
> +    BUG_ON(!xen_bootmodule);
> +
> +    xen_bootmodule->size = xen_colored_map_size();
> +    xen_bootmodule->start = get_xen_paddr(xen_bootmodule->size);
> +}

... whether the build wouldn't have been broken until this function
is added. Since you know the function is going to gain a non-empty
body, please introduce it in the earlier patch as

void __init arch_llc_coloring_init(void)
{
}

instead.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 14/14] xen/arm: add cache coloring support for Xen
  2024-03-19 15:58   ` Jan Beulich
@ 2024-03-19 16:15     ` Jan Beulich
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 16:15 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel

On 19.03.2024 16:58, Jan Beulich wrote:
> On 15.03.2024 11:59, Carlo Nonato wrote:
>> @@ -326,6 +328,27 @@ unsigned int get_max_nr_llc_colors(void)
>>      return max_nr_colors;
>>  }
>>  
>> +paddr_t __init xen_colored_map_size(void)
>> +{
>> +    return ROUNDUP((_end - _start) * max_nr_colors, XEN_PADDR_ALIGN);
>> +}
> 
> XEN_PADDR_ALIGN is an inherently Arm thing. It had better not appear
> in common code.

And actually in patch 10 you introduce get_max_nr_llc_colors(). With
that, this calculation can move to Arm code.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 10/14] xen: add cache coloring allocator for domains
  2024-03-15 10:58 ` [PATCH v7 10/14] xen: add cache coloring allocator for domains Carlo Nonato
@ 2024-03-19 16:43   ` Jan Beulich
  2024-03-21 15:36     ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-19 16:43 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel

On 15.03.2024 11:58, Carlo Nonato wrote:
> Add a new memory page allocator that implements the cache coloring mechanism.
> The allocation algorithm enforces equal frequency distribution of cache
> partitions, following the coloring configuration of a domain. This allows
> for an even utilization of cache sets for every domain.
> 
> Pages are stored in a color-indexed array of lists. Those lists are filled
> by a simple init function which computes the color of each page.
> When a domain requests a page, the allocator extract the page from the list
> with the maximum number of free pages between those that the domain can
> access, given its coloring configuration.

Minor remark: I'm not a native speaker, but "between" here reads odd to
me. I'd have expected perhaps "among".

> --- a/docs/misc/xen-command-line.pandoc
> +++ b/docs/misc/xen-command-line.pandoc
> @@ -270,6 +270,20 @@ and not running softirqs. Reduce this if softirqs are not being run frequently
>  enough. Setting this to a high value may cause boot failure, particularly if
>  the NMI watchdog is also enabled.
>  
> +### buddy-alloc-size (arm64)
> +> `= <size>`
> +
> +> Default: `64M`
> +
> +Amount of memory reserved for the buddy allocator when colored allocator is
> +active. This options is parsed only when LLC coloring support is enabled.

Nit: s/parsed/used/ - the option is always parsed as long as LLC_COLORING=y.

> @@ -1945,6 +1949,164 @@ static unsigned long avail_heap_pages(
>      return free_pages;
>  }
>  
> +/*************************
> + * COLORED SIDE-ALLOCATOR
> + *
> + * Pages are grouped by LLC color in lists which are globally referred to as the
> + * color heap. Lists are populated in end_boot_allocator().
> + * After initialization there will be N lists where N is the number of
> + * available colors on the platform.
> + */
> +static struct page_list_head *__ro_after_init _color_heap;
> +#define color_heap(color) (&_color_heap[color])
> +
> +static unsigned long *__ro_after_init free_colored_pages;
> +
> +/* Memory required for buddy allocator to work with colored one */
> +#ifdef CONFIG_LLC_COLORING
> +static unsigned long __initdata buddy_alloc_size =
> +    MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
> +size_param("buddy-alloc-size", buddy_alloc_size);
> +
> +#define domain_num_llc_colors(d) (d)->num_llc_colors
> +#define domain_llc_color(d, i)   (d)->llc_colors[i]
> +#else
> +static unsigned long __initdata buddy_alloc_size;
> +
> +#define domain_num_llc_colors(d) 0
> +#define domain_llc_color(d, i)   0
> +#endif
> +
> +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> +{
> +    unsigned int color = page_to_llc_color(pg);
> +    struct page_list_head *head = color_heap(color);
> +
> +    spin_lock(&heap_lock);
> +
> +    mark_page_free(pg, page_to_mfn(pg));
> +
> +    if ( need_scrub )
> +    {
> +        pg->count_info |= PGC_need_scrub;
> +        poison_one_page(pg);
> +    }
> +
> +    free_colored_pages[color]++;
> +    page_list_add(pg, head);

May I please ask for a comment (or at least some wording in the description)
as to the choice made here between head or tail insertion? When assuming
that across a system there's no sharing of colors, preferably re-using
cache-hot pages is certainly good. Whereas when colors can reasonably be
expected to be shared, avoiding quick re-use of a freed page can also
have benefits.

> +static struct page_info *alloc_color_heap_page(unsigned int memflags,
> +                                               const struct domain *d)
> +{
> +    struct page_info *pg = NULL;
> +    unsigned int i, color = 0;
> +    unsigned long max = 0;
> +    bool need_tlbflush = false;
> +    uint32_t tlbflush_timestamp = 0;
> +    bool need_scrub;
> +
> +    if ( memflags >> _MEMF_bits )
> +        return NULL;

By mentioning MEMF_bits earlier on I meant to give an example. What
about MEMF_node and in particular MEMF_exact_node? Certain other flags
also aren't obvious as to being okay to silently ignore.

> +    spin_lock(&heap_lock);
> +
> +    for ( i = 0; i < domain_num_llc_colors(d); i++ )
> +    {
> +        unsigned long free = free_colored_pages[domain_llc_color(d, i)];
> +
> +        if ( free > max )
> +        {
> +            color = domain_llc_color(d, i);
> +            pg = page_list_first(color_heap(color));
> +            max = free;
> +        }
> +    }
> +
> +    if ( !pg )
> +    {
> +        spin_unlock(&heap_lock);
> +        return NULL;
> +    }
> +
> +    need_scrub = pg->count_info & (PGC_need_scrub);
> +    pg->count_info = PGC_state_inuse | (pg->count_info & PGC_colored);

Better PGC_preserved?

> +static void __init init_color_heap_pages(struct page_info *pg,
> +                                         unsigned long nr_pages)
> +{
> +    unsigned int i;
> +    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
> +
> +    if ( buddy_alloc_size )
> +    {
> +        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
> +
> +        init_heap_pages(pg, buddy_pages);

There's a corner case where init_heap_pages() would break when passed 0
as 2nd argument. I think you want to alter the enclosing if() to
"if ( buddy_alloc_size >= PAGE_SIZE )" to be entirely certain to avoid
that case.
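
To make the corner case concrete: any size below one page rounds down to
zero pages, so a plain non-zero check still lets 0 through as the page
count. A standalone sketch, assuming 4 KiB pages; the macros are local
stand-ins for Xen's and buddy_pages_to_init() is a hypothetical helper.

```c
/* Local stand-ins for Xen's macros, assuming 4 KiB pages. */
#define PAGE_SHIFT  12
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PFN_DOWN(x) ((unsigned long)(x) >> PAGE_SHIFT)

/* Hypothetical helper: pages to hand to the buddy allocator, 0 if none. */
static unsigned long buddy_pages_to_init(unsigned long buddy_alloc_size,
                                         unsigned long nr_pages)
{
    unsigned long pages;

    /*
     * Checking against PAGE_SIZE instead of 0 guarantees PFN_DOWN() below
     * yields at least one page.
     */
    if ( buddy_alloc_size < PAGE_SIZE )
        return 0;

    pages = PFN_DOWN(buddy_alloc_size);

    return pages < nr_pages ? pages : nr_pages;
}
```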

> +static void dump_color_heap(void)
> +{
> +    unsigned int color;
> +
> +    printk("Dumping color heap info\n");
> +    for ( color = 0; color < get_max_nr_llc_colors(); color++ )
> +        if ( free_colored_pages[color] > 0 )
> +            printk("Color heap[%u]: %lu pages\n",
> +                   color, free_colored_pages[color]);
> +}

While having all of the code above from here outside of any #ifdef is
helpful to prevent unintended breakage when changes are made and tested
only on non-Arm64 targets, I'd still like to ask: Do halfway recent
compilers manage to eliminate everything? I'd like to avoid e.g. x86
being left with traces of coloring despite not being able at all to use
it.

> @@ -2485,7 +2660,10 @@ struct page_info *alloc_domheap_pages(
>          }
>          if ( assign_page(pg, order, d, memflags) )
>          {
> -            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> +            if ( pg->count_info & PGC_colored )
> +                free_color_heap_page(pg, memflags & MEMF_no_scrub);
> +            else
> +                free_heap_pages(pg, order, memflags & MEMF_no_scrub);
>              return NULL;
>          }
>      }
> @@ -2568,7 +2746,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
>              scrub = 1;
>          }
>  
> -        free_heap_pages(pg, order, scrub);
> +        if ( pg->count_info & PGC_colored )
> +            free_color_heap_page(pg, scrub);
> +        else
> +            free_heap_pages(pg, order, scrub);
>      }

Instead of this, did you consider altering free_heap_pages() to forward
to free_color_heap_page()? That would then also allow having a single,
central comment and/or assertion that the "order" value here isn't lost.
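
A rough sketch of what such forwarding could look like, with stub types so
it builds on its own. The PGC_colored bit value, the counters and the
assumption that colored pages are only ever order 0 are placeholders for
illustration, not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

#define PGC_colored (1UL << 0)      /* placeholder bit, not Xen's value */

struct page_info { unsigned long count_info; };

static unsigned int colored_frees, buddy_frees;

static void free_color_heap_page(struct page_info *pg, bool need_scrub)
{
    (void)pg;
    (void)need_scrub;
    colored_frees++;                /* stub: real code re-lists the page */
}

static void free_heap_pages(struct page_info *pg, unsigned int order,
                            bool need_scrub)
{
    if ( pg->count_info & PGC_colored )
    {
        /* Single central place to state that "order" is not lost. */
        assert(order == 0);
        free_color_heap_page(pg, need_scrub);
        return;
    }

    (void)order;
    buddy_frees++;                  /* stub: real code merges buddies */
}
```

Callers would then keep calling free_heap_pages() unconditionally.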

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-19 14:58   ` Jan Beulich
@ 2024-03-21 15:03     ` Carlo Nonato
  2024-03-21 15:53       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:03 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrea Bastoni, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

Hi Jan,

(adding Andrea Bastoni in cc)

On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > +Background
> > +**********
> > +
> > +Cache hierarchy of a modern multi-core CPU typically has first levels dedicated
> > +to each core (hence using multiple cache units), while the last level is shared
> > +among all of them. Such configuration implies that memory operations on one
> > +core (e.g. running a DomU) are able to generate interference on another core
> > +(e.g. hosting another DomU). Cache coloring realizes per-set cache-partitioning
> > +in software and mitigates this, guaranteeing higher and more predictable
> > +performances for memory accesses.
>
> Are you sure about "higher"? On an otherwise idle system, a single domain (or
> vCPU) may perform better when not partitioned, as more cache would be available
> to it overall.

I'll drop "higher" and leave the rest.

> > +How to compute the number of colors
> > +###################################
> > +
> > +Given the linear mapping from physical memory to cache lines for granted, the
> > +number of available colors for a specific platform is computed using three
> > +parameters:
> > +
> > +- the size of the LLC.
> > +- the number of the LLC ways.
> > +- the page size used by Xen.
> > +
> > +The first two parameters can be found in the processor manual, while the third
> > +one is the minimum mapping granularity. Dividing the cache size by the number of
> > +its ways we obtain the size of a way. Dividing this number by the page size,
> > +the number of total cache colors is found. So for example an Arm Cortex-A53
> > +with a 16-ways associative 1 MiB LLC can isolate up to 16 colors when pages are
> > +4 KiB in size.
> > +
> > +LLC size and number of ways are probed automatically by default so there's
> > +should be no need to compute the number of colors by yourself.
>
> Is this a leftover from the earlier (single) command line option?

Nope, but I can drop it since it's already stated below.
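
For reference, the arithmetic from the quoted documentation written out as
code. nr_llc_colors() is a hypothetical helper name; the figures in the
comments are the ones used in the doc text and in the Kconfig help.

```c
/*
 * way size = LLC size / number of ways
 * colors   = way size / page size
 *
 * 1 MiB, 16 ways, 4 KiB pages -> 64 KiB way  -> 16 colors
 * 8 MiB, 16 ways, 4 KiB pages -> 512 KiB way -> 128 colors
 */
static unsigned int nr_llc_colors(unsigned long llc_size,
                                  unsigned int nr_ways,
                                  unsigned long page_size)
{
    unsigned long way_size;

    if ( !nr_ways || !page_size )
        return 0;

    way_size = llc_size / nr_ways;

    return (unsigned int)(way_size / page_size);
}
```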

> > +Effective colors assignment
> > +###########################
> > +
> > +When assigning colors:
> > +
> > +1. If one wants to avoid cache interference between two domains, different
> > +   colors needs to be used for their memory.
> > +
> > +2. To improve spatial locality, color assignment should privilege continuity in
>
> s/privilege/prefer/ ?
>
> > +   the partitioning. E.g., assigning colors (0,1) to domain I and (2,3) to
> > +   domain J is better than assigning colors (0,2) to I and (1,3) to J.
>
> While I consider 1 obvious without further explanation, the same isn't
> the case for 2: What's the benefit of spatial locality? If there was
> support for allocating higher order pages, I could certainly see the
> point, but iirc that isn't supported (yet).

I'll drop point 2.

> > +Command line parameters
> > +***********************
> > +
> > +Specific documentation is available at `docs/misc/xen-command-line.pandoc`.
> > +
> > ++----------------------+-------------------------------+
> > +| **Parameter**        | **Description**               |
> > ++----------------------+-------------------------------+
> > +| ``llc-coloring``     | enable coloring at runtime    |
> > ++----------------------+-------------------------------+
> > +| ``llc-size``         | set the LLC size              |
> > ++----------------------+-------------------------------+
> > +| ``llc-nr-ways``      | set the LLC number of ways    |
> > ++----------------------+-------------------------------+
> > +
> > +Auto-probing of LLC specs
> > +#########################
> > +
> > +LLC size and number of ways are probed automatically by default.
> > +
> > +LLC specs can be manually set via the above command line parameters. This
> > +bypasses any auto-probing and it's used to overcome failing situations or for
> > +debugging/testing purposes.
>
> As well as perhaps for cases where the auto-probing logic is flawed?

This is what I meant by "overcome failing situations", but I'll be more
explicit.

> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
> >  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
> >  record with other registers.
> >
> > +### llc-coloring
> > +> `= <boolean>`
> > +
> > +> Default: `false`
> > +
> > +Flag to enable or disable LLC coloring support at runtime. This option is
> > +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> > +cache coloring documentation for more info.
> > +
> > +### llc-nr-ways
> > +> `= <integer>`
> > +
> > +> Default: `Obtained from hardware`
> > +
> > +Specify the number of ways of the Last Level Cache. This option is available
> > +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> > +to find the number of supported cache colors. By default the value is
> > +automatically computed by probing the hardware, but in case of specific needs,
> > +it can be manually set. Those include failing probing and debugging/testing
> > +purposes so that it's possibile to emulate platforms with different number of
> > +supported colors. If set, also "llc-size" must be set, otherwise the default
> > +will be used.
> > +
> > +### llc-size
> > +> `= <size>`
> > +
> > +> Default: `Obtained from hardware`
> > +
> > +Specify the size of the Last Level Cache. This option is available only when
> > +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> > +the number of supported cache colors. By default the value is automatically
> > +computed by probing the hardware, but in case of specific needs, it can be
> > +manually set. Those include failing probing and debugging/testing purposes so
> > +that it's possibile to emulate platforms with different number of supported
> > +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> > +used.
>
> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
> were supplied?

To me it looks a bit strange that specifying some attributes of the cache
automatically enables cache coloring. Also it would require some changes in
how to express the auto-probing for such attributes.

> > --- a/xen/arch/Kconfig
> > +++ b/xen/arch/Kconfig
> > @@ -31,3 +31,23 @@ config NR_NUMA_NODES
> >         associated with multiple-nodes management. It is the upper bound of
> >         the number of NUMA nodes that the scheduler, memory allocation and
> >         other NUMA-aware components can handle.
> > +
> > +config LLC_COLORING
> > +     bool "Last Level Cache (LLC) coloring" if EXPERT
> > +     depends on HAS_LLC_COLORING
> > +     depends on !NUMA
> > +
> > +config NR_LLC_COLORS
> > +     int "Maximum number of LLC colors"
> > +     range 2 1024
> > +     default 128
> > +     depends on LLC_COLORING
> > +     help
> > +       Controls the build-time size of various arrays associated with LLC
> > +       coloring. Refer to cache coloring documentation for how to compute the
> > +       number of colors supported by the platform. This is only an upper
> > +       bound. The runtime value is autocomputed or manually set via cmdline.
> > +       The default value corresponds to an 8 MiB 16-ways LLC, which should be
> > +       more than what's needed in the general case. Use only power of 2 values.
>
> I think I said so before: Rather than telling people to pick only power-of-2
> values (and it remaining unclear what happens if they don't), why don't you
> simply keep them from specifying anything bogus, by having them pass in the
> value to use as a power of 2? I.e. "range 1 10" and "default 7" for what
> you're currently putting in place.

I'll do that.
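
A small sketch of how the exponent form would be consumed, assuming the
option becomes an order value. nr_colors_from_order() is a hypothetical
name; "range 1 10" and "default 7" are the values suggested above.

```c
/*
 * With the Kconfig value expressed as a power-of-2 exponent, every
 * accepted setting yields a power of 2 by construction:
 * order 1 -> 2 colors, order 7 -> 128 colors, order 10 -> 1024 colors.
 */
static unsigned int nr_colors_from_order(unsigned int order)
{
    return 1U << order;
}

/* True if x is a non-zero power of 2. */
static int is_pow2(unsigned int x)
{
    return x && !(x & (x - 1));
}
```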

> > +       1024 is the number of colors that fit in a 4 KiB page when integers are 4
> > +       bytes long.
>
> How's this relevant here? As a justification it would make sense to have in
> the description.

I'll move it.

> I'm btw also not convinced this is a good place to put these options. Imo ...
>
> > --- a/xen/common/Kconfig
> > +++ b/xen/common/Kconfig
> > @@ -71,6 +71,9 @@ config HAS_IOPORTS
> >  config HAS_KEXEC
> >       bool
> >
> > +config HAS_LLC_COLORING
> > +     bool
> > +
> >  config HAS_PMAP
> >       bool
>
> ... they'd better live further down from here.

Ok.

> > --- /dev/null
> > +++ b/xen/common/llc-coloring.c
> > @@ -0,0 +1,102 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Last Level Cache (LLC) coloring common code
> > + *
> > + * Copyright (C) 2022 Xilinx Inc.
> > + */
> > +#include <xen/keyhandler.h>
> > +#include <xen/llc-coloring.h>
> > +#include <xen/param.h>
> > +
> > +static bool __ro_after_init llc_coloring_enabled;
> > +boolean_param("llc-coloring", llc_coloring_enabled);
> > +
> > +static unsigned int __initdata llc_size;
> > +size_param("llc-size", llc_size);
> > +static unsigned int __initdata llc_nr_ways;
> > +integer_param("llc-nr-ways", llc_nr_ways);
> > +/* Number of colors available in the LLC */
> > +static unsigned int __ro_after_init max_nr_colors;
> > +
> > +static void print_colors(const unsigned int *colors, unsigned int num_colors)
> > +{
> > +    unsigned int i;
> > +
> > +    printk("{ ");
> > +    for ( i = 0; i < num_colors; i++ )
> > +    {
> > +        unsigned int start = colors[i], end = start;
> > +
> > +        printk("%u", start);
> > +
> > +        for ( ; i < num_colors - 1 && end + 1 == colors[i + 1]; i++, end++ )
> > +            ;
> > +
> > +        if ( start != end )
> > +            printk("-%u", end);
> > +
> > +        if ( i < num_colors - 1 )
> > +            printk(", ");
> > +    }
> > +    printk(" }\n");
> > +}
> > +
> > +void __init llc_coloring_init(void)
> > +{
> > +    unsigned int way_size;
> > +
> > +    if ( !llc_coloring_enabled )
> > +        return;
> > +
> > +    if ( llc_size && llc_nr_ways )
> > +        way_size = llc_size / llc_nr_ways;
> > +    else
> > +    {
> > +        way_size = get_llc_way_size();
> > +        if ( !way_size )
> > +            panic("LLC probing failed and 'llc-size' or 'llc-nr-ways' missing\n");
> > +    }
> > +
> > +    /*
> > +     * The maximum number of colors must be a power of 2 in order to correctly
> > +     * map them to bits of an address.
> > +     */
> > +    max_nr_colors = way_size >> PAGE_SHIFT;
> > +
> > +    if ( max_nr_colors & (max_nr_colors - 1) )
> > +        panic("Number of LLC colors (%u) isn't a power of 2\n", max_nr_colors);
> > +
> > +    if ( max_nr_colors < 2 || max_nr_colors > CONFIG_NR_LLC_COLORS )
> > +        panic("Number of LLC colors (%u) not in range [2, %u]\n",
> > +              max_nr_colors, CONFIG_NR_LLC_COLORS);
>
> Rather than crashing when max_nr_colors is too large, couldn't you simply
> halve it a number of times? That would still satisfy the requirement on
> isolation, wouldn't it?

Well I could simply set it to CONFIG_NR_LLC_COLORS at this point.
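
The two options differ only when the limit is not itself a power of 2. A
minimal sketch of the halving variant, with clamp_nr_colors() as a
hypothetical helper:

```c
/*
 * Halve nr until it fits under limit. If nr starts as a power of 2 the
 * result is still one, which plain assignment of the limit guarantees
 * only when the limit is a power of 2 as well.
 */
static unsigned int clamp_nr_colors(unsigned int nr, unsigned int limit)
{
    while ( nr > limit )
        nr >>= 1;

    return nr;
}
```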

> > +    arch_llc_coloring_init();
> > +}
> > +
> > +void cf_check dump_llc_coloring_info(void)
>
> I don't think cf_check is needed here nor ...
>
> > +void cf_check domain_dump_llc_colors(const struct domain *d)
>
> ... here anymore. You're using direct calls now.

Ok.

> Jan

Thanks.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-19 15:30   ` Jan Beulich
@ 2024-03-21 15:04     ` Carlo Nonato
  2024-03-21 15:57       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:04 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel, Andrea Bastoni

Hi Jan,

On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
> >
> >  Specify a list of IO ports to be excluded from dom0 access.
> >
> > +### dom0-llc-colors
> > +> `= List of [ <integer> | <integer>-<integer> ]`
> > +
> > +> Default: `All available LLC colors`
> > +
> > +Specify dom0 LLC color configuration. This option is available only when
> > +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
> > +colors are used.
>
> My reservation towards this being a top-level option remains.

How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
seem possible to me. How can I express a list (llc-colors) inside another list
(dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
before reaching other-param?

> > --- a/xen/common/llc-coloring.c
> > +++ b/xen/common/llc-coloring.c
> > @@ -18,6 +18,63 @@ integer_param("llc-nr-ways", llc_nr_ways);
> >  /* Number of colors available in the LLC */
> >  static unsigned int __ro_after_init max_nr_colors;
> >
> > +static unsigned int __initdata dom0_colors[CONFIG_NR_LLC_COLORS];
> > +static unsigned int __initdata dom0_num_colors;
> > +
> > +/*
> > + * Parse the coloring configuration given in the buf string, following the
> > + * syntax below.
> > + *
> > + * COLOR_CONFIGURATION ::= COLOR | RANGE,...,COLOR | RANGE
> > + * RANGE               ::= COLOR-COLOR
> > + *
> > + * Example: "0,2-6,15-16" represents the set of colors: 0,2,3,4,5,6,15,16.
> > + */
> > +static int __init parse_color_config(const char *buf, unsigned int *colors,
> > +                                     unsigned int max_num_colors,
> > +                                     unsigned int *num_colors)
> > +{
> > +    const char *s = buf;
> > +
> > +    *num_colors = 0;
> > +
> > +    while ( *s != '\0' )
> > +    {
> > +        unsigned int color, start, end;
> > +
> > +        start = simple_strtoul(s, &s, 0);
> > +
> > +        if ( *s == '-' )    /* Range */
> > +        {
> > +            s++;
> > +            end = simple_strtoul(s, &s, 0);
> > +        }
> > +        else                /* Single value */
> > +            end = start;
> > +
> > +        if ( start > end || (end - start) > (UINT_MAX - *num_colors) ||
> > +             (*num_colors + (end - start)) >= max_num_colors )
> > +            return -EINVAL;
> > +
> > +        for ( color = start; color <= end; color++ )
> > +            colors[(*num_colors)++] = color;
>
> I can't spot any range check on start/end/color itself. In fact I was first
> meaning to ask why the return value of simple_strtoul() is silently clipped
> from unsigned long to unsigned int. Don't forget that a range specification
> may easily degenerate into a negative number (due to a simple oversight or
> typo), which would then be converted to a huge positive one.
>
> > @@ -41,6 +98,22 @@ static void print_colors(const unsigned int *colors, unsigned int num_colors)
> >      printk(" }\n");
> >  }
> >
> > +static bool check_colors(const unsigned int *colors, unsigned int num_colors)
> > +{
> > +    unsigned int i;
> > +
> > +    for ( i = 0; i < num_colors; i++ )
> > +    {
> > +        if ( colors[i] >= max_nr_colors )
> > +        {
> > +            printk(XENLOG_ERR "LLC color %u >= %u\n", colors[i], max_nr_colors);
> > +            return false;
> > +        }
> > +    }
> > +
> > +    return true;
> > +}
>
> Oh, here's the range checking of the color values themselves. Perhaps
> a comment in parse_color_config() would help.

I'll add it.
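
For the narrowing and negative-number concern, one possible shape is to keep
the values as unsigned long until they have been checked. This is a
userspace sketch only: parse_colors() is a hypothetical analogue of
parse_color_config() and uses strtoul() in place of simple_strtoul().
Checking colors against the number of platform colors is still left to a
later step, as check_colors() does in the patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

static int parse_colors(const char *s, unsigned int *colors,
                        unsigned int max_num_colors,
                        unsigned int *num_colors)
{
    *num_colors = 0;

    while ( *s != '\0' )
    {
        char *next;
        unsigned long start, end, color;

        /* Require a digit: strtoul() would accept "-1" and wrap it. */
        if ( *s < '0' || *s > '9' )
            return -EINVAL;
        start = strtoul(s, &next, 0);
        s = next;

        if ( *s == '-' )    /* Range */
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -EINVAL;
            end = strtoul(s, &next, 0);
            s = next;
        }
        else                /* Single value */
            end = start;

        /* Validate at full width before narrowing to unsigned int. */
        if ( start > end || end > UINT_MAX )
            return -EINVAL;

        /* The range holds end - start + 1 entries; they must all fit. */
        if ( end - start >= max_num_colors - *num_colors )
            return -EINVAL;

        for ( color = start; ; color++ )
        {
            colors[(*num_colors)++] = (unsigned int)color;
            if ( color == end )
                break;
        }

        if ( *s == ',' )
            s++;
        else if ( *s != '\0' )
            return -EINVAL;
    }

    return 0;
}
```

With this shape a typo such as "3--1" is rejected at the second '-' instead
of turning into a huge positive value.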

> > @@ -91,6 +164,61 @@ void cf_check domain_dump_llc_colors(const struct domain *d)
> >      print_colors(d->llc_colors, d->num_llc_colors);
> >  }
> >
> > +static int domain_set_default_colors(struct domain *d)
> > +{
> > +    unsigned int *colors = xmalloc_array(unsigned int, max_nr_colors);
> > +    unsigned int i;
> > +
> > +    if ( !colors )
> > +        return -ENOMEM;
> > +
> > +    printk(XENLOG_WARNING
> > +           "LLC color config not found for %pd, using all colors\n", d);
> > +
> > +    for ( i = 0; i < max_nr_colors; i++ )
> > +        colors[i] = i;
> > +
> > +    d->llc_colors = colors;
> > +    d->num_llc_colors = max_nr_colors;
> > +
> > +    return 0;
> > +}
>
> If this function is expected to actually come into play, wouldn't it
> make sense to set up such an array just once, and re-use it wherever
> necessary?

How would domain_destroy() then distinguish when to free it and when not
to?

> Also right here both this and check_colors() could be __init. I
> understand that subsequent patches will also want to use the
> functions at runtime, but until then this looks slightly wrong. I'd
> like to ask that such aspects be mentioned in the description, to
> avoid respective questions.

Ok, I'll do that.

> > +int __init dom0_set_llc_colors(struct domain *d)
> > +{
> > +    unsigned int *colors;
> > +
> > +    if ( !dom0_num_colors )
> > +        return domain_set_default_colors(d);
> > +
> > +    if ( !check_colors(dom0_colors, dom0_num_colors) )
> > +    {
> > +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> > +        return -EINVAL;
> > +    }
> > +
> > +    colors = xmalloc_array(unsigned int, dom0_num_colors);
> > +    if ( !colors )
> > +        return -ENOMEM;
> > +
> > +    /* Static type checking */
> > +    (void)(colors == dom0_colors);
>
> Btw, a means to avoid this would be to use typeof() in the declaration
> of "colors".

Right.

> > +static int __init parse_dom0_colors(const char *s)
> > +{
> > +    return parse_color_config(s, dom0_colors, ARRAY_SIZE(dom0_colors),
>
> With it not being possible to pass max_nr_colors here (due to the value
> not having been established yet), don't you need to check somewhere else
> that ...
>
> > +                              &dom0_num_colors);
>
> ... dom0_num_colors isn't too large?

I can add it in dom0_set_llc_colors().

> Jan

Thanks.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 05/14] xen: extend domctl interface for cache coloring
  2024-03-19 15:37   ` Jan Beulich
@ 2024-03-21 15:11     ` Carlo Nonato
  0 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:11 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Marco Solieri, xen-devel

Hi Jan,

On Tue, Mar 19, 2024 at 4:37 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > @@ -219,6 +220,39 @@ void domain_llc_coloring_free(struct domain *d)
> >      xfree(__va(__pa(d->llc_colors)));
> >  }
> >
> > +int domain_set_llc_colors(struct domain *d,
> > +                          const struct xen_domctl_set_llc_colors *config)
> > +{
> > +    unsigned int *colors;
> > +
> > +    if ( d->num_llc_colors )
> > +        return -EEXIST;
> > +
> > +    if ( !config->num_llc_colors )
> > +        return domain_set_default_colors(d);
> > +
> > +    if ( config->num_llc_colors > max_nr_colors || config->pad )
>
> The check of "pad" wants carrying out in all cases; I expect it wants
> moving to the caller.

Ok.

> > +        return -EINVAL;
> > +
> > +    colors = xmalloc_array(unsigned int, config->num_llc_colors);
> > +    if ( !colors )
> > +        return -ENOMEM;
> > +
> > +    if ( copy_from_guest(colors, config->llc_colors, config->num_llc_colors) )
> > +        return -EFAULT;
>
> You're leaking "colors" when taking this or ...
>
> > +    if ( !check_colors(colors, config->num_llc_colors) )
> > +    {
> > +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> > +        return -EINVAL;
>
> ... this error path.

You're right.
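
One common way to close both paths is a single cleanup label. The following
is a userspace sketch under stated assumptions: set_colors() and
colors_valid() are hypothetical stand-ins, malloc()/free() replace
xmalloc_array()/xfree(), and memcpy() stands in for copy_from_guest().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for check_colors(): every color must be below max. */
static int colors_valid(const unsigned int *colors, unsigned int n,
                        unsigned int max)
{
    unsigned int i;

    for ( i = 0; i < n; i++ )
        if ( colors[i] >= max )
            return 0;

    return 1;
}

/*
 * On success ownership of the array passes to *out; on every failure
 * after the allocation the array is freed before returning.
 */
static int set_colors(const unsigned int *src, unsigned int n,
                      unsigned int max, unsigned int **out)
{
    unsigned int *colors;
    int rc;

    if ( !n || n > max )
        return -EINVAL;

    colors = malloc(n * sizeof(*colors));
    if ( !colors )
        return -ENOMEM;

    memcpy(colors, src, n * sizeof(*colors));

    if ( !colors_valid(colors, n, max) )
    {
        rc = -EINVAL;
        goto out_free;
    }

    *out = colors;
    return 0;

 out_free:
    free(colors);
    return rc;
}
```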

> Jan

Thanks.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree
  2024-03-19 15:41   ` Jan Beulich
@ 2024-03-21 15:12     ` Carlo Nonato
  0 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:12 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Stefano Stabellini, Julien Grall, Bertrand Marquis, Michal Orzel,
	Volodymyr Babchuk, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel

Hi Jan,

On Tue, Mar 19, 2024 at 4:41 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > --- a/xen/common/llc-coloring.c
> > +++ b/xen/common/llc-coloring.c
> > @@ -253,6 +253,37 @@ int domain_set_llc_colors(struct domain *d,
> >      return 0;
> >  }
> >
> > +int __init domain_set_llc_colors_from_str(struct domain *d, const char *str)
> > +{
> > +    int err;
> > +    unsigned int *colors, num_colors;
> > +
> > +    if ( !str )
> > +        return domain_set_default_colors(d);
> > +
> > +    colors = xmalloc_array(unsigned int, max_nr_colors);
> > +    if ( !colors )
> > +        return -ENOMEM;
> > +
> > +    err = parse_color_config(str, colors, max_nr_colors, &num_colors);
> > +    if ( err )
> > +    {
> > +        printk(XENLOG_ERR "Error parsing LLC color configuration");
> > +        return err;
> > +    }
> > +
> > +    if ( !check_colors(colors, num_colors) )
> > +    {
> > +        printk(XENLOG_ERR "Bad LLC color config for %pd\n", d);
> > +        return -EINVAL;
> > +    }
>
> "colors" is again leaked on the error paths.

Yep.

> > +    d->llc_colors = colors;
> > +    d->num_llc_colors = num_colors;
>
> num_colors may be quite a bit smaller than max_nr_colors; worth re-
> allocating the array to free up excess space?

Don't know if it's worth it, but it's a very small change so I think I can add
it.
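
A sketch of the trimming step in userspace terms, assuming standard
realloc() semantics; shrink_colors() is a hypothetical helper and the
hypervisor would use its own allocator interface instead.

```c
#include <stdlib.h>

/*
 * Shrink arr to new_n elements. If realloc() fails the original, larger
 * array is still valid, so it is simply kept.
 */
static unsigned int *shrink_colors(unsigned int *arr, unsigned int new_n)
{
    unsigned int *tmp;

    if ( !new_n )
        return arr;

    tmp = realloc(arr, new_n * sizeof(*arr));

    return tmp ? tmp : arr;
}
```

Since a failed shrink is harmless, the caller needs no extra error path.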

> Jan

Thanks.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 10/14] xen: add cache coloring allocator for domains
  2024-03-19 16:43   ` Jan Beulich
@ 2024-03-21 15:36     ` Carlo Nonato
  2024-03-21 16:03       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel

Hi Jan,

On Tue, Mar 19, 2024 at 5:43 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:58, Carlo Nonato wrote:
> > Add a new memory page allocator that implements the cache coloring mechanism.
> > The allocation algorithm enforces equal frequency distribution of cache
> > partitions, following the coloring configuration of a domain. This allows
> > for an even utilization of cache sets for every domain.
> >
> > Pages are stored in a color-indexed array of lists. Those lists are filled
> > by a simple init function which computes the color of each page.
> > When a domain requests a page, the allocator extract the page from the list
> > with the maximum number of free pages between those that the domain can
> > access, given its coloring configuration.
>
> Minor remark: I'm not a native speaker, but "between" here reads odd to
> me. I'd have expected perhaps "among".

Yes, I'm gonna change it.

> > --- a/docs/misc/xen-command-line.pandoc
> > +++ b/docs/misc/xen-command-line.pandoc
> > @@ -270,6 +270,20 @@ and not running softirqs. Reduce this if softirqs are not being run frequently
> >  enough. Setting this to a high value may cause boot failure, particularly if
> >  the NMI watchdog is also enabled.
> >
> > +### buddy-alloc-size (arm64)
> > +> `= <size>`
> > +
> > +> Default: `64M`
> > +
> > +Amount of memory reserved for the buddy allocator when colored allocator is
> > +active. This options is parsed only when LLC coloring support is enabled.
>
> Nit: s/parsed/used/ - the option is always parsed as long as LLC_COLORING=y.
>
> > @@ -1945,6 +1949,164 @@ static unsigned long avail_heap_pages(
> >      return free_pages;
> >  }
> >
> > +/*************************
> > + * COLORED SIDE-ALLOCATOR
> > + *
> > + * Pages are grouped by LLC color in lists which are globally referred to as the
> > + * color heap. Lists are populated in end_boot_allocator().
> > + * After initialization there will be N lists where N is the number of
> > + * available colors on the platform.
> > + */
> > +static struct page_list_head *__ro_after_init _color_heap;
> > +#define color_heap(color) (&_color_heap[color])
> > +
> > +static unsigned long *__ro_after_init free_colored_pages;
> > +
> > +/* Memory required for buddy allocator to work with colored one */
> > +#ifdef CONFIG_LLC_COLORING
> > +static unsigned long __initdata buddy_alloc_size =
> > +    MB(CONFIG_BUDDY_ALLOCATOR_SIZE);
> > +size_param("buddy-alloc-size", buddy_alloc_size);
> > +
> > +#define domain_num_llc_colors(d) (d)->num_llc_colors
> > +#define domain_llc_color(d, i)   (d)->llc_colors[i]
> > +#else
> > +static unsigned long __initdata buddy_alloc_size;
> > +
> > +#define domain_num_llc_colors(d) 0
> > +#define domain_llc_color(d, i)   0
> > +#endif
> > +
> > +static void free_color_heap_page(struct page_info *pg, bool need_scrub)
> > +{
> > +    unsigned int color = page_to_llc_color(pg);
> > +    struct page_list_head *head = color_heap(color);
> > +
> > +    spin_lock(&heap_lock);
> > +
> > +    mark_page_free(pg, page_to_mfn(pg));
> > +
> > +    if ( need_scrub )
> > +    {
> > +        pg->count_info |= PGC_need_scrub;
> > +        poison_one_page(pg);
> > +    }
> > +
> > +    free_colored_pages[color]++;
> > +    page_list_add(pg, head);
>
> May I please ask for a comment (or at least some wording in the description)
> as to the choice made here between head or tail insertion? When assuming
> that across a system there's no sharing of colors, preferably re-using
> cache-hot pages is certainly good. Whereas when colors can reasonably be
> expected to be shared, avoiding quick re-use of a freed page can also
> have benefits.

I'll add it.
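
A minimal sketch of the trade-off being discussed, using a hypothetical
stand-alone list (struct page, struct free_list and the list_* helpers are
illustrative names, not Xen's page_list API). Allocation takes the first
entry, so head insertion re-uses the most recently freed (likely cache-hot)
page, while tail insertion delays its re-use:

```c
#include <stddef.h>

/* Hypothetical stand-in for a per-color free list; not Xen's page_list API. */
struct page {
    unsigned long mfn;
    struct page *next;
};

struct free_list {
    struct page *head;
    struct page *tail;
};

/* Head insertion: the page freed last is the first one handed out again. */
static void list_add_head(struct free_list *l, struct page *pg)
{
    pg->next = l->head;
    l->head = pg;
    if ( !l->tail )
        l->tail = pg;
}

/* Tail insertion: a freed page waits behind all pages freed before it. */
static void list_add_tail(struct free_list *l, struct page *pg)
{
    pg->next = NULL;
    if ( l->tail )
        l->tail->next = pg;
    else
        l->head = pg;
    l->tail = pg;
}

/* Allocation removes and returns the first entry, or NULL if empty. */
static struct page *list_pop_first(struct free_list *l)
{
    struct page *pg = l->head;

    if ( pg )
    {
        l->head = pg->next;
        if ( !l->head )
            l->tail = NULL;
    }

    return pg;
}
```

Freeing pages A then B and allocating once returns B with head insertion
and A with tail insertion.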

> > +static struct page_info *alloc_color_heap_page(unsigned int memflags,
> > +                                               const struct domain *d)
> > +{
> > +    struct page_info *pg = NULL;
> > +    unsigned int i, color = 0;
> > +    unsigned long max = 0;
> > +    bool need_tlbflush = false;
> > +    uint32_t tlbflush_timestamp = 0;
> > +    bool need_scrub;
> > +
> > +    if ( memflags >> _MEMF_bits )
> > +        return NULL;
>
> By mentioning MEMF_bits earlier on I meant to give an example. What
> about MEMF_node and in particular MEMF_exact_node? Certain other flags
> also aren't obvious as to being okay to silently ignore.

You're right.

> > +    spin_lock(&heap_lock);
> > +
> > +    for ( i = 0; i < domain_num_llc_colors(d); i++ )
> > +    {
> > +        unsigned long free = free_colored_pages[domain_llc_color(d, i)];
> > +
> > +        if ( free > max )
> > +        {
> > +            color = domain_llc_color(d, i);
> > +            pg = page_list_first(color_heap(color));
> > +            max = free;
> > +        }
> > +    }
> > +
> > +    if ( !pg )
> > +    {
> > +        spin_unlock(&heap_lock);
> > +        return NULL;
> > +    }
> > +
> > +    need_scrub = pg->count_info & (PGC_need_scrub);
> > +    pg->count_info = PGC_state_inuse | (pg->count_info & PGC_colored);
>
> Better PGC_preserved?

Yeah.

> > +static void __init init_color_heap_pages(struct page_info *pg,
> > +                                         unsigned long nr_pages)
> > +{
> > +    unsigned int i;
> > +    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
> > +
> > +    if ( buddy_alloc_size )
> > +    {
> > +        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
> > +
> > +        init_heap_pages(pg, buddy_pages);
>
> There's a corner case where init_heap_pages() would break when passed 0
> as 2nd argument.

I don't see it. There's just a for-loop that would be skipped in that case...

> I think you want to alter the enclosing if() to
> "if ( buddy_alloc_size >= PAGE_SIZE )" to be entirely certain to avoid
> that case.

... anyway, ok.
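
A minimal sketch of why the suggested guard matters, assuming 4 KiB pages.
The macros mirror the names used in the patch, but the definitions and the
buddy_pages() helper are illustrative only. Any non-zero size below one page
passes the original "if ( buddy_alloc_size )" test yet rounds down to zero
pages:

```c
#define PAGE_SHIFT  12                    /* assumption: 4 KiB pages */
#define PAGE_SIZE   (1UL << PAGE_SHIFT)
#define PFN_DOWN(x) ((x) >> PAGE_SHIFT)

/*
 * Hypothetical helper: number of pages to hand to the buddy allocator.
 * Returns 0 when the buddy allocator should be skipped altogether, so the
 * caller never passes a zero page count on.
 */
static unsigned long buddy_pages(unsigned long buddy_alloc_size,
                                 unsigned long nr_pages)
{
    unsigned long pages;

    if ( buddy_alloc_size < PAGE_SIZE )
        return 0;

    pages = PFN_DOWN(buddy_alloc_size);

    return pages < nr_pages ? pages : nr_pages;
}
```

With this shape, buddy-alloc-size=1 yields 0 pages and is skipped, while the
64M default yields 16384 pages (capped at nr_pages).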

> > +static void dump_color_heap(void)
> > +{
> > +    unsigned int color;
> > +
> > +    printk("Dumping color heap info\n");
> > +    for ( color = 0; color < get_max_nr_llc_colors(); color++ )
> > +        if ( free_colored_pages[color] > 0 )
> > +            printk("Color heap[%u]: %lu pages\n",
> > +                   color, free_colored_pages[color]);
> > +}
>
> While having all of the code above from here outside of any #ifdef is
> helpful to prevent unintended breakage when changes are made and tested
> only on non-Arm64 targets, I'd still like to ask: Halfway recent
> compilers manage to eliminate everything? I'd like to avoid e.g. x86
> being left with traces of coloring despite not being able at all to use
> it.

I don't know the answer to this, sorry.

> > @@ -2485,7 +2660,10 @@ struct page_info *alloc_domheap_pages(
> >          }
> >          if ( assign_page(pg, order, d, memflags) )
> >          {
> > -            free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> > +            if ( pg->count_info & PGC_colored )
> > +                free_color_heap_page(pg, memflags & MEMF_no_scrub);
> > +            else
> > +                free_heap_pages(pg, order, memflags & MEMF_no_scrub);
> >              return NULL;
> >          }
> >      }
> > @@ -2568,7 +2746,10 @@ void free_domheap_pages(struct page_info *pg, unsigned int order)
> >              scrub = 1;
> >          }
> >
> > -        free_heap_pages(pg, order, scrub);
> > +        if ( pg->count_info & PGC_colored )
> > +            free_color_heap_page(pg, scrub);
> > +        else
> > +            free_heap_pages(pg, order, scrub);
> >      }
>
> Instead of this, did you consider altering free_heap_pages() to forward
> to free_color_heap_page()? That would then also allow to have a single,
> central comment and/or assertion that the "order" value here isn't lost.

Yes this can be easily done.

> Jan

Thanks.



* Re: [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter
  2024-03-19 15:54   ` Jan Beulich
@ 2024-03-21 15:36     ` Carlo Nonato
  0 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 15:36 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Luca Miccio, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

Hi Jan

On Tue, Mar 19, 2024 at 4:54 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 15.03.2024 11:59, Carlo Nonato wrote:
> > From: Luca Miccio <lucmiccio@gmail.com>
> >
> > Add a new command line parameter to configure Xen cache colors.
> > These colors can be dumped with the cache coloring info debug-key.
> >
> > By default, Xen uses the first color.
> > Benchmarking the VM interrupt response time provides an estimation of
> > LLC usage by Xen's most latency-critical runtime task. Results on Arm
> > Cortex-A53 on Xilinx Zynq UltraScale+ XCZU9EG show that one color, which
> > reserves 64 KiB of L2, is enough to attain best responsiveness:
> > - Xen 1 color latency:  3.1 us
> > - Xen 2 color latency:  3.1 us
> >
> > More colors are instead very likely to be needed on processors whose L1
> > cache is physically-indexed and physically-tagged, such as Cortex-A57.
> > In such cases, coloring applies to L1 also, and there typically are two
> > distinct L1-colors. Therefore, reserving only one color for Xen would
> > senselessly partition a cache memory that is already private, i.e.
> > underutilize it.
>
> Here you say that using just a single color is undesirable on such systems.
>
> > The default amount of Xen colors is thus set to one.
>
> Yet then, without any further explanation you conclude that 1 is the
> universal default.

A single default that suits every need doesn't exist, but we know that 1 is
good for the most widespread target we have (Cortex-A53). That said,
I think that a simple reorder of the description, while also making it more
explicit, solves the issue.

> > @@ -147,6 +159,21 @@ void __init llc_coloring_init(void)
> >          panic("Number of LLC colors (%u) not in range [2, %u]\n",
> >                max_nr_colors, CONFIG_NR_LLC_COLORS);
> >
> > +    if ( !xen_num_colors )
> > +    {
> > +        unsigned int i;
> > +
> > +        xen_num_colors = MIN(XEN_DEFAULT_NUM_COLORS, max_nr_colors);
> > +
> > +        printk(XENLOG_WARNING
> > +               "Xen LLC color config not found. Using first %u colors\n",
> > +               xen_num_colors);
> > +        for ( i = 0; i < xen_num_colors; i++ )
> > +            xen_colors[i] = i;
> > +    }
> > +    else if ( !check_colors(xen_colors, xen_num_colors) )
> > +        panic("Bad LLC color config for Xen\n");
>
> This "else" branch again lacks a bounds check against max_nr_colors, if
> I'm not mistaken.

Yep.
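
A minimal sketch of the missing bounds check, under the assumption that a
valid configuration has at most max_nr_colors entries, each below
max_nr_colors. The helper name colors_in_range() is hypothetical, and what
check_colors() itself verifies is not shown in this excerpt:

```c
#include <stdbool.h>

/*
 * Hypothetical helper: true if the list fits the platform, i.e. it holds
 * no more than max_nr_colors entries and every entry is a valid color.
 */
static bool colors_in_range(const unsigned int *colors,
                            unsigned int num_colors,
                            unsigned int max_nr_colors)
{
    unsigned int i;

    if ( num_colors > max_nr_colors )
        return false;

    for ( i = 0; i < num_colors; i++ )
        if ( colors[i] >= max_nr_colors )
            return false;

    return true;
}
```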

> Jan

Thanks.



* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-21 15:03     ` Carlo Nonato
@ 2024-03-21 15:53       ` Jan Beulich
  2024-03-21 17:22         ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-21 15:53 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrea Bastoni, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

On 21.03.2024 16:03, Carlo Nonato wrote:
> On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>>>  record with other registers.
>>>
>>> +### llc-coloring
>>> +> `= <boolean>`
>>> +
>>> +> Default: `false`
>>> +
>>> +Flag to enable or disable LLC coloring support at runtime. This option is
>>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
>>> +cache coloring documentation for more info.
>>> +
>>> +### llc-nr-ways
>>> +> `= <integer>`
>>> +
>>> +> Default: `Obtained from hardware`
>>> +
>>> +Specify the number of ways of the Last Level Cache. This option is available
>>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
>>> +to find the number of supported cache colors. By default the value is
>>> +automatically computed by probing the hardware, but in case of specific needs,
>>> +it can be manually set. Those include failing probing and debugging/testing
>>> +purposes so that it's possible to emulate platforms with a different number of
>>> +supported colors. If set, also "llc-size" must be set, otherwise the default
>>> +will be used.
>>> +
>>> +### llc-size
>>> +> `= <size>`
>>> +
>>> +> Default: `Obtained from hardware`
>>> +
>>> +Specify the size of the Last Level Cache. This option is available only when
>>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
>>> +the number of supported cache colors. By default the value is automatically
>>> +computed by probing the hardware, but in case of specific needs, it can be
>>> +manually set. Those include failing probing and debugging/testing purposes so
>>> +that it's possible to emulate platforms with a different number of supported
>>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
>>> +used.
>>
>> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
>> were supplied?
> 
> To me it looks a bit strange that specifying some attributes of the cache
> automatically enables cache coloring. Also it would require some changes in
> how to express the auto-probing for such attributes.

Whereas to me it looks strange that, when having llc-size and llc-nr-ways
provided, I'd need to add a 3rd option. What purpose other than enabling
coloring could there be when specifying those parameters?
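
For reference, a sketch of how the two options combine into a color count.
It assumes the usual page-coloring relation (colors = way size / page size);
the function name and the example figures are illustrative and not taken
from the patch:

```c
/*
 * Hypothetical helper: number of LLC colors for a given cache geometry.
 * Returns 0 when the inputs cannot describe a usable cache.
 */
static unsigned int nr_llc_colors(unsigned long llc_size,
                                  unsigned int nr_ways,
                                  unsigned long page_size)
{
    unsigned long way_size;

    if ( !nr_ways || !page_size )
        return 0;

    /* One way of the cache spans way_size bytes of physical address. */
    way_size = llc_size / nr_ways;

    /* Each page-sized slice of a way is one color. */
    return way_size / page_size;
}
```

For example, a 1 MiB, 16-way cache with 4 KiB pages gives 16 colors, each
covering 64 KiB of the cache.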

Jan



* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-21 15:04     ` Carlo Nonato
@ 2024-03-21 15:57       ` Jan Beulich
  2024-03-21 17:31         ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-21 15:57 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel, Andrea Bastoni

On 21.03.2024 16:04, Carlo Nonato wrote:
> On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>> --- a/docs/misc/xen-command-line.pandoc
>>> +++ b/docs/misc/xen-command-line.pandoc
>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>>>
>>>  Specify a list of IO ports to be excluded from dom0 access.
>>>
>>> +### dom0-llc-colors
>>> +> `= List of [ <integer> | <integer>-<integer> ]`
>>> +
>>> +> Default: `All available LLC colors`
>>> +
>>> +Specify dom0 LLC color configuration. This option is available only when
>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
>>> +colors are used.
>>
>> My reservation towards this being a top-level option remains.
> 
> How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
> seem possible to me. How can I express a list (llc-colors) inside another list
> (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
> before reaching other-param?

For example by using a different separator:

dom0=llc-colors=0-3+12-15,other-param=...
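
A minimal sketch of how such a nested list could be parsed, assuming '+'
separates entries, '-' denotes an inclusive range, and ',' (or end of
string) terminates the sub-option. parse_color_list() is a hypothetical
stand-alone helper built on the C library's strtoul(), not Xen's command
line parsing code:

```c
#include <stdlib.h>

/*
 * Parse "a-b+c+d-e" into colors[], stopping at ',' or end of string.
 * Returns the number of colors stored, or -1 on malformed input or when
 * more than max colors would be needed. On success *end points at the
 * first unparsed character (',' or the terminating NUL).
 */
static int parse_color_list(const char *s, unsigned int *colors,
                            unsigned int max, const char **end)
{
    unsigned int n = 0;

    for ( ; ; )
    {
        char *next;
        unsigned long lo, hi;

        lo = hi = strtoul(s, &next, 10);
        if ( next == s )
            return -1;                  /* no digits where a color is due */
        s = next;

        if ( *s == '-' )
        {
            hi = strtoul(s + 1, &next, 10);
            if ( next == s + 1 || hi < lo )
                return -1;              /* missing or reversed range end */
            s = next;
        }

        for ( ; lo <= hi; lo++ )
        {
            if ( n == max )
                return -1;              /* list does not fit */
            colors[n++] = (unsigned int)lo;
        }

        if ( *s != '+' )
            break;
        s++;                            /* skip the entry separator */
    }

    if ( *s != '\0' && *s != ',' )
        return -1;

    *end = s;

    return (int)n;
}
```

Fed "0-3+12-15,other-param=...", the helper stores eight colors and leaves
*end at the comma, so the outer "dom0=" parser can carry on with the next
sub-option.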

>>> @@ -91,6 +164,61 @@ void cf_check domain_dump_llc_colors(const struct domain *d)
>>>      print_colors(d->llc_colors, d->num_llc_colors);
>>>  }
>>>
>>> +static int domain_set_default_colors(struct domain *d)
>>> +{
>>> +    unsigned int *colors = xmalloc_array(unsigned int, max_nr_colors);
>>> +    unsigned int i;
>>> +
>>> +    if ( !colors )
>>> +        return -ENOMEM;
>>> +
>>> +    printk(XENLOG_WARNING
>>> +           "LLC color config not found for %pd, using all colors\n", d);
>>> +
>>> +    for ( i = 0; i < max_nr_colors; i++ )
>>> +        colors[i] = i;
>>> +
>>> +    d->llc_colors = colors;
>>> +    d->num_llc_colors = max_nr_colors;
>>> +
>>> +    return 0;
>>> +}
>>
>> If this function is expected to actually come into play, wouldn't it
>> make sense to set up such an array just once, and re-use it wherever
>> necessary?
> 
> Then how to distinguish when to free it in domain_destroy() and when not to do
> it?

By checking against that one special array instance.
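
A minimal sketch of that idea, using malloc()/free() and a cut-down
struct domain as stand-ins for Xen's allocator and real domain structure.
All function names below are hypothetical:

```c
#include <stdlib.h>

/* Cut-down stand-in for the fields the patch adds to struct domain. */
struct domain {
    unsigned int *llc_colors;
    unsigned int num_llc_colors;
};

static unsigned int *default_colors;    /* shared by all defaulted domains */
static unsigned int max_nr_colors;

/* Set up the shared "all colors" array once. Returns 0 on success. */
static int init_default_colors(unsigned int nr)
{
    unsigned int i;

    default_colors = malloc(nr * sizeof(*default_colors));
    if ( !default_colors )
        return -1;

    for ( i = 0; i < nr; i++ )
        default_colors[i] = i;
    max_nr_colors = nr;

    return 0;
}

/* No per-domain allocation: just point at the shared instance. */
static void domain_set_default_colors(struct domain *d)
{
    d->llc_colors = default_colors;
    d->num_llc_colors = max_nr_colors;
}

/* On destruction, free only arrays that are not the shared instance. */
static void domain_free_colors(struct domain *d)
{
    if ( d->llc_colors != default_colors )
        free(d->llc_colors);
    d->llc_colors = NULL;
    d->num_llc_colors = 0;
}
```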

Jan



* Re: [PATCH v7 10/14] xen: add cache coloring allocator for domains
  2024-03-21 15:36     ` Carlo Nonato
@ 2024-03-21 16:03       ` Jan Beulich
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-21 16:03 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel

On 21.03.2024 16:36, Carlo Nonato wrote:
> On Tue, Mar 19, 2024 at 5:43 PM Jan Beulich <jbeulich@suse.com> wrote:
>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>> +static void __init init_color_heap_pages(struct page_info *pg,
>>> +                                         unsigned long nr_pages)
>>> +{
>>> +    unsigned int i;
>>> +    bool need_scrub = opt_bootscrub == BOOTSCRUB_IDLE;
>>> +
>>> +    if ( buddy_alloc_size )
>>> +    {
>>> +        unsigned long buddy_pages = min(PFN_DOWN(buddy_alloc_size), nr_pages);
>>> +
>>> +        init_heap_pages(pg, buddy_pages);
>>
>> There's a corner case where init_heap_pages() would break when passed 0
>> as 2nd argument.
> 
> I don't see it. There's just a for-loop that would be skipped in that case...

Look at the first comment in the function and the if() following it. I
don't think that code would work very well when nr_pages == 0.

>>> +static void dump_color_heap(void)
>>> +{
>>> +    unsigned int color;
>>> +
>>> +    printk("Dumping color heap info\n");
>>> +    for ( color = 0; color < get_max_nr_llc_colors(); color++ )
>>> +        if ( free_colored_pages[color] > 0 )
>>> +            printk("Color heap[%u]: %lu pages\n",
>>> +                   color, free_colored_pages[color]);
>>> +}
>>
>> While having all of the code above from here outside of any #ifdef is
>> helpful to prevent unintended breakage when changes are made and tested
>> only on non-Arm64 targets, I'd still like to ask: Halfway recent
>> compilers manage to eliminate everything? I'd like to avoid e.g. x86
>> being left with traces of coloring despite not being able at all to use
>> it.
> 
> I don't know the answer to this, sorry.

Yet it is important to have.

Jan



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-15 10:58 ` [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
  2024-03-19 15:47   ` Jan Beulich
@ 2024-03-21 16:07   ` Julien Grall
  2024-03-21 16:10     ` Julien Grall
  1 sibling, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-21 16:07 UTC (permalink / raw)
  To: Carlo Nonato, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Roger Pau Monné

(+ Roger)

Hi Carlo,

On 15/03/2024 10:58, Carlo Nonato wrote:
> PGC_static and PGC_extra need to be preserved when assigning a page.
> Define a new macro that groups those flags and use it instead of or'ing
> every time.
> 
> To make preserved flags even more meaningful, they are kept also when
> switching state in mark_page_free().
> 
> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>

This patch is introducing a regression in OSStest (and possibly gitlab?):

Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0 
v=0xe40000010007ffff t=0x24
Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y 
Not tainted ]----
Mar 21 12:00:42.829857 (XEN) CPU:    12
Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>] 
common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT: 
hypervisor (d0v8)
Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx: 
000000000007ffff   rcx: 0000000000000028
Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi: 
ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp: 
ffff83047bec7b98   r8:  0000000000000000
Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10: 
000000000000000c   r11: 0000000000000010
Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13: 
0000000000000000   r14: ffff82e0044238a0
Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0: 
0000000080050033   cr4: 0000000000372660
Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb: 
ffff88801f200000   gss: 0000000000000000
Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000 
ss: e010   cs: e008
Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f> 
(common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9 
79 36 0f 0b 41 89 cd 48 c7 47 f0
Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20 
0000000000000001 ffff83046ccda000
Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016 
0000000000000000 0000000000000000
Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000 
0000000000000028 0000000000000021
Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000 
00007d2000000000 ffff83047bec7c48
Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000 
0000000000000100 0000000000000000
Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000 
ffff83047bec7c80 ffff82d0402f626c
Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640 
0000000000000000 0000000000000000
Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0 
ffff82d0402f65a0 ffff83046ccda000
Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000 
0000000000000000 ffff83047bec7cc0
Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0 
ffff82d0402bd543 ffff83046ccda000
Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08 
ffff82d04032c524 ffff83046ccda000
Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002 
ffff83047bec7d58 ffff82d040206750
Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5 
ffff83047bec7d48 0000000000000000
Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010 
ffff82d0405e9120 0000000000000001
Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728 
0000000000000007 ffff83023e3b3000
Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90 
ffff83023e38e000 ffff83023e2efb40
Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80 
0000000000000206 ffff83047bec7dc0
Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff 
e75aaa8d0000000c ac0d6d864e487f62
Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000 
ffffffff000003ff 00000002ffffffff
Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff 
0000000000000000 0000000000000000
Mar 21 12:00:43.093646 (XEN) Xen call trace:
Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R 
common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F 
alloc_domheap_pages+0x17d/0x1e4
Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F 
hap_set_allocation+0x73/0x23c
Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F 
hap_enable+0x138/0x33c
Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F 
paging_enable+0x2d/0x45
Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F 
hvm_domain_initialise+0x185/0x428
Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F 
arch_domain_create+0x3e7/0x4c1
Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F 
domain_create+0x4cc/0x7e2
Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F 
do_domctl+0x1850/0x192d
Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F 
pv_hypercall+0x617/0x6b5
Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F 
lstar_enter+0x13a/0x140
Mar 21 12:00:43.153689 (XEN)
Mar 21 12:00:43.153711 (XEN)
Mar 21 12:00:43.153731 (XEN) ****************************************
Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
Mar 21 12:00:43.165703 (XEN) ****************************************
Mar 21 12:00:43.177633 (XEN)
Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)

The code around the BUG is:

         /* Reference count must continuously be zero for free pages. */
         if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
         {
             printk(XENLOG_ERR
                    "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
                    i, mfn_x(page_to_mfn(pg + i)),
                    pg[i].count_info, pg[i].v.free.order,
                    pg[i].u.free.val, pg[i].tlbflush_timestamp);
             BUG();
         }

Now that you are preserving some flags, you also want to modify the 
condition. I haven't checked the rest of the code, so there might be 
some adjustments necessary.

For now I have reverted the patch to unblock the CI.

Cheers,

-- 
Julien Grall



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-21 16:07   ` Julien Grall
@ 2024-03-21 16:10     ` Julien Grall
  2024-03-21 16:22       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-21 16:10 UTC (permalink / raw)
  To: Carlo Nonato, xen-devel
  Cc: Andrew Cooper, George Dunlap, Jan Beulich, Stefano Stabellini,
	Wei Liu, Roger Pau Monné



On 21/03/2024 16:07, Julien Grall wrote:
> (+ Roger)
> 
> Hi Carlo,
> 
> On 15/03/2024 10:58, Carlo Nonato wrote:
>> PGC_static and PGC_extra need to be preserved when assigning a page.
>> Define a new macro that groups those flags and use it instead of or'ing
>> every time.
>>
>> To make preserved flags even more meaningful, they are kept also when
>> switching state in mark_page_free().
>>
>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> 
> This patch is introducing a regression in OSStest (and possibly gitlab?):
> 
> [...]
> 
> The code around the BUG is:
> 
>          /* Reference count must continuously be zero for free pages. */
>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>          {
>              printk(XENLOG_ERR
>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>                     i, mfn_x(page_to_mfn(pg + i)),
>                     pg[i].count_info, pg[i].v.free.order,
>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>              BUG();
>          }
> 
> Now that you are preserving some flags, you also want to modify the 
> condition. I haven't checked the rest of the code, so there might be 
> some adjustments necessary.

Actually maybe the condition should not be adjusted. I think it would be 
wrong if a free page has the flag PGC_extra set. Any thoughts?
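
A minimal sketch of the interaction in question. The bit positions are
hypothetical (the real PGC_* values are per-architecture), and it assumes
a page was freed while PGC_extra was set, which the log alone does not
confirm:

```c
#include <stdbool.h>

/* Hypothetical bit layout, for illustration only. */
#define PGC_need_scrub  (1UL << 0)
#define PGC_extra       (1UL << 1)
#define PGC_static      (1UL << 2)
#define PGC_state_free  (1UL << 3)
#define PGC_preserved   (PGC_extra | PGC_static)

/* Same shape as the check before BUG(): only the scrub bit may differ. */
static bool free_check_passes(unsigned long count_info)
{
    return (count_info & ~PGC_need_scrub) == PGC_state_free;
}

/* Freeing that carries the preserved flags over into the free state. */
static unsigned long mark_free_preserving(unsigned long count_info)
{
    return PGC_state_free | (count_info & PGC_preserved);
}

/* Freeing that drops every flag, so the free state is always "clean". */
static unsigned long mark_free_clearing(unsigned long count_info)
{
    (void)count_info;

    return PGC_state_free;
}
```

Under these assumptions a page freed with PGC_extra set fails the check
when the flag is carried over and passes when it is dropped, which is the
choice between relaxing the condition and clearing the flag on free.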

Cheers,

-- 
Julien Grall



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-21 16:10     ` Julien Grall
@ 2024-03-21 16:22       ` Jan Beulich
  2024-03-22 15:07         ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-21 16:22 UTC (permalink / raw)
  To: Julien Grall
  Cc: Carlo Nonato, Andrew Cooper, George Dunlap, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

On 21.03.2024 17:10, Julien Grall wrote:
> On 21/03/2024 16:07, Julien Grall wrote:
>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>> PGC_static and PGC_extra need to be preserved when assigning a page.
>>> Define a new macro that groups those flags and use it instead of or'ing
>>> every time.
>>>
>>> To make preserved flags even more meaningful, they are kept also when
>>> switching state in mark_page_free().
>>>
>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>
>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>
>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0 
>> v=0xe40000010007ffff t=0x24
>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y 
>> Not tainted ]----
>> Mar 21 12:00:42.829857 (XEN) CPU:    12
>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>] 
>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT: 
>> hypervisor (d0v8)
>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx: 
>> 000000000007ffff   rcx: 0000000000000028
>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi: 
>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp: 
>> ffff83047bec7b98   r8:  0000000000000000
>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10: 
>> 000000000000000c   r11: 0000000000000010
>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13: 
>> 0000000000000000   r14: ffff82e0044238a0
>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0: 
>> 0000000080050033   cr4: 0000000000372660
>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb: 
>> ffff88801f200000   gss: 0000000000000000
>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000 
>> ss: e010   cs: e008
>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f> 
>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9 
>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20 
>> 0000000000000001 ffff83046ccda000
>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000 
>> 0000000000000028 0000000000000021
>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000 
>> 00007d2000000000 ffff83047bec7c48
>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000 
>> 0000000000000100 0000000000000000
>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000 
>> ffff83047bec7c80 ffff82d0402f626c
>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0 
>> ffff82d0402f65a0 ffff83046ccda000
>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000 
>> 0000000000000000 ffff83047bec7cc0
>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0 
>> ffff82d0402bd543 ffff83046ccda000
>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08 
>> ffff82d04032c524 ffff83046ccda000
>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002 
>> ffff83047bec7d58 ffff82d040206750
>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5 
>> ffff83047bec7d48 0000000000000000
>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010 
>> ffff82d0405e9120 0000000000000001
>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728 
>> 0000000000000007 ffff83023e3b3000
>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90 
>> ffff83023e38e000 ffff83023e2efb40
>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80 
>> 0000000000000206 ffff83047bec7dc0
>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff 
>> e75aaa8d0000000c ac0d6d864e487f62
>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000 
>> ffffffff000003ff 00000002ffffffff
>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff 
>> 0000000000000000 0000000000000000
>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R 
>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F 
>> alloc_domheap_pages+0x17d/0x1e4
>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F 
>> hap_set_allocation+0x73/0x23c
>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F 
>> hap_enable+0x138/0x33c
>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F 
>> paging_enable+0x2d/0x45
>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F 
>> hvm_domain_initialise+0x185/0x428
>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F 
>> arch_domain_create+0x3e7/0x4c1
>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F 
>> domain_create+0x4cc/0x7e2
>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F 
>> do_domctl+0x1850/0x192d
>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F 
>> pv_hypercall+0x617/0x6b5
>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F 
>> lstar_enter+0x13a/0x140
>> Mar 21 12:00:43.153689 (XEN)
>> Mar 21 12:00:43.153711 (XEN)
>> Mar 21 12:00:43.153731 (XEN) ****************************************
>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
>> Mar 21 12:00:43.165703 (XEN) ****************************************
>> Mar 21 12:00:43.177633 (XEN)
>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
>>
>> The code around the BUG is:
>>
>>          /* Reference count must continuously be zero for free pages. */
>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>          {
>>              printk(XENLOG_ERR
>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>                     i, mfn_x(page_to_mfn(pg + i)),
>>                     pg[i].count_info, pg[i].v.free.order,
>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>              BUG();
>>          }
>>
>> Now that you are preserving some flags, you also want to modify the 
>> condition. I haven't checked the rest of the code, so there might be 
>> some adjustments necessary.
> 
> Actually maybe the condition should not be adjusted. I think it would be 
> wrong if a free page has the flag PGC_extra set. Any thoughts?

I agree, yet I'm inclined to say PGC_extra should have been cleared
before trying to free the page.
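
To make the failing check concrete, here is a minimal standalone sketch of
the invariant. The bit positions and the demo_* names are made up for
illustration; the real PGC_* values are architecture-specific and are not
reproduced here. It only shows why a page that still carries an extra flag
in count_info trips the quoted condition, and why clearing the flag before
freeing avoids that.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only; not Xen's real values. */
#define DEMO_PGC_need_scrub  (UINT64_C(1) << 63)
#define DEMO_PGC_extra       (UINT64_C(1) << 62)
#define DEMO_PGC_state_free  (UINT64_C(3) << 60)

/*
 * Same shape as the quoted condition: apart from need_scrub, the
 * count_info of a free page must be exactly the "free" state.
 */
static bool demo_is_clean_free_page(uint64_t count_info)
{
    return (count_info & ~DEMO_PGC_need_scrub) == DEMO_PGC_state_free;
}

/*
 * Toy model of switching a page to the free state, either keeping or
 * dropping the extra flag that was set while the page was in use.
 */
static uint64_t demo_mark_free(uint64_t count_info, bool preserve_extra)
{
    uint64_t keep = preserve_extra ? (count_info & DEMO_PGC_extra) : 0;

    return DEMO_PGC_state_free | keep;
}
```

In this model, demo_mark_free(DEMO_PGC_extra, true) yields a value that
demo_is_clean_free_page() rejects, while passing false yields one it
accepts.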

Jan



* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-21 15:53       ` Jan Beulich
@ 2024-03-21 17:22         ` Carlo Nonato
  2024-03-22  7:25           ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 17:22 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrea Bastoni, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

Hi Jan,

On Thu, Mar 21, 2024 at 4:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 16:03, Carlo Nonato wrote:
> > On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.03.2024 11:58, Carlo Nonato wrote:
> >>> --- a/docs/misc/xen-command-line.pandoc
> >>> +++ b/docs/misc/xen-command-line.pandoc
> >>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
> >>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
> >>>  record with other registers.
> >>>
> >>> +### llc-coloring
> >>> +> `= <boolean>`
> >>> +
> >>> +> Default: `false`
> >>> +
> >>> +Flag to enable or disable LLC coloring support at runtime. This option is
> >>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
> >>> +cache coloring documentation for more info.
> >>> +
> >>> +### llc-nr-ways
> >>> +> `= <integer>`
> >>> +
> >>> +> Default: `Obtained from hardware`
> >>> +
> >>> +Specify the number of ways of the Last Level Cache. This option is available
> >>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
> >>> +to find the number of supported cache colors. By default the value is
> >>> +automatically computed by probing the hardware, but in case of specific needs,
> >>> +it can be manually set. Those include failing probing and debugging/testing
> >>> +purposes so that it's possible to emulate platforms with different number of
> >>> +supported colors. If set, also "llc-size" must be set, otherwise the default
> >>> +will be used.
> >>> +
> >>> +### llc-size
> >>> +> `= <size>`
> >>> +
> >>> +> Default: `Obtained from hardware`
> >>> +
> >>> +Specify the size of the Last Level Cache. This option is available only when
> >>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
> >>> +the number of supported cache colors. By default the value is automatically
> >>> +computed by probing the hardware, but in case of specific needs, it can be
> >>> +manually set. Those include failing probing and debugging/testing purposes so
> >>> +that it's possible to emulate platforms with different number of supported
> >>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
> >>> +used.
> >>
> >> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
> >> were supplied?
> >
> > To me it looks a bit strange that specifying some attributes of the cache
> > automatically enables cache coloring. Also it would require some changes in
> > how to express the auto-probing for such attributes.
>
> Whereas to me it looks strange that, when having llc-size and llc-nr-ways
> provided, I'd need to add a 3rd option. What purpose other than enabling
> coloring could there be when specifying those parameters?

Ok, I probably misunderstood you. You mean just to assume llc-coloring=on
when both llc-size and llc-nr-ways are present and not to remove
llc-coloring completely, right? I'm ok with this.
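
A rough sketch of the inference being agreed on here, under stated
assumptions: struct demo_llc_opts and demo_llc_coloring_enabled() are
hypothetical names, and letting an explicit llc-coloring value take
precedence over the inferred one is a guess, not something settled in this
thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical holder for the three command-line options discussed above. */
struct demo_llc_opts {
    bool coloring_set;      /* was llc-coloring given explicitly? */
    bool coloring;          /* its value, if it was given */
    size_t llc_size;        /* 0 means "not given, probe the hardware" */
    unsigned int nr_ways;   /* 0 means "not given, probe the hardware" */
};

/*
 * Coloring is on if requested explicitly, or (assumption) if both
 * llc-size and llc-nr-ways were supplied without an explicit setting.
 */
static bool demo_llc_coloring_enabled(const struct demo_llc_opts *o)
{
    if ( o->coloring_set )
        return o->coloring;

    return o->llc_size != 0 && o->nr_ways != 0;
}
```

With this shape, supplying only one of llc-size or llc-nr-ways does not
enable coloring, which matches the documented "otherwise the default will
be used" behaviour for a lone option.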

> Jan

Thanks.



* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-21 15:57       ` Jan Beulich
@ 2024-03-21 17:31         ` Carlo Nonato
  2024-03-22  7:26           ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-21 17:31 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel, Andrea Bastoni

On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 16:04, Carlo Nonato wrote:
> > On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
> >> On 15.03.2024 11:58, Carlo Nonato wrote:
> >>> --- a/docs/misc/xen-command-line.pandoc
> >>> +++ b/docs/misc/xen-command-line.pandoc
> >>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
> >>>
> >>>  Specify a list of IO ports to be excluded from dom0 access.
> >>>
> >>> +### dom0-llc-colors
> >>> +> `= List of [ <integer> | <integer>-<integer> ]`
> >>> +
> >>> +> Default: `All available LLC colors`
> >>> +
> >>> +Specify dom0 LLC color configuration. This option is available only when
> >>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
> >>> +colors are used.
> >>
> >> My reservation towards this being a top-level option remains.
> >
> > How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
> > seem possible to me. How can I express a list (llc-colors) inside another list
> > (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
> > before reaching other-param?
>
> For example by using a different separator:
>
> dom0=llc-colors=0-3+12-15,other-param=...

Ok, but that would mean changing the implementation of the parsing function
and adopting this syntax in other places too, something that I would have
preferred to avoid. Anyway, I'll follow your suggestion.
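
For reference, a minimal sketch of what parsing with the alternative
separator could look like. demo_parse_colors() is a hypothetical helper,
not the parsing function from the series; it assumes '+' separates colors
or ranges and that ',' (or the end of the string) terminates the list so
that the outer dom0= parser can continue from there.

```c
#include <limits.h>
#include <stdlib.h>

/*
 * Parse a '+'-separated list of colors or color ranges, e.g. "0-3+12-15".
 * Parsing stops at ',' or at the end of the string; on success *end is set
 * to that position. Returns the number of colors stored, or -1 on
 * malformed input or if the list does not fit in colors[max].
 */
static int demo_parse_colors(const char *s, unsigned int *colors,
                             unsigned int max, const char **end)
{
    unsigned int n = 0;

    for ( ; ; )
    {
        char *p;
        unsigned long lo, hi, i, count;

        if ( *s < '0' || *s > '9' )
            return -1;
        lo = hi = strtoul(s, &p, 10);

        if ( *p == '-' )
        {
            s = p + 1;
            if ( *s < '0' || *s > '9' )
                return -1;
            hi = strtoul(s, &p, 10);
        }

        /* Reject reversed ranges, oversized values and buffer overflow. */
        if ( hi < lo || hi > UINT_MAX || hi - lo >= max - n )
            return -1;

        count = hi - lo + 1;
        for ( i = 0; i < count; i++ )
            colors[n++] = (unsigned int)(lo + i);

        s = p;
        if ( *s != '+' )
            break;
        s++;
    }

    if ( *s != ',' && *s != '\0' )
        return -1;

    *end = s;
    return (int)n;
}
```

Called on "0-3+12-15,other-param=...", this stores the eight colors 0-3 and
12-15 and leaves *end pointing at the ',' that precedes other-param.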

> >>> @@ -91,6 +164,61 @@ void cf_check domain_dump_llc_colors(const struct domain *d)
> >>>      print_colors(d->llc_colors, d->num_llc_colors);
> >>>  }
> >>>
> >>> +static int domain_set_default_colors(struct domain *d)
> >>> +{
> >>> +    unsigned int *colors = xmalloc_array(unsigned int, max_nr_colors);
> >>> +    unsigned int i;
> >>> +
> >>> +    if ( !colors )
> >>> +        return -ENOMEM;
> >>> +
> >>> +    printk(XENLOG_WARNING
> >>> +           "LLC color config not found for %pd, using all colors\n", d);
> >>> +
> >>> +    for ( i = 0; i < max_nr_colors; i++ )
> >>> +        colors[i] = i;
> >>> +
> >>> +    d->llc_colors = colors;
> >>> +    d->num_llc_colors = max_nr_colors;
> >>> +
> >>> +    return 0;
> >>> +}
> >>
> >> If this function is expected to actually come into play, wouldn't it
> >> make sense to set up such an array just once, and re-use it wherever
> >> necessary?
> >
> > Then how to distinguish when to free it in domain_destroy() and when not to do
> > it?
>
> By checking against that one special array instance.

Ok.
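
A small sketch of the "one shared array" idea, assuming hypothetical names
throughout (struct demo_domain, demo_default_colors and the demo_*
helpers); plain malloc()/free() stand in for the hypervisor allocator. The
default array is built once, domains without an explicit configuration
point at it, and the teardown path frees only arrays that are not that
special instance.

```c
#include <stdbool.h>
#include <stdlib.h>

struct demo_domain {
    const unsigned int *llc_colors;
    unsigned int num_llc_colors;
};

/* Shared default: every color, set up once and never freed per domain. */
static unsigned int *demo_default_colors;
static unsigned int demo_max_nr_colors;

static int demo_init_default_colors(unsigned int max_nr_colors)
{
    unsigned int i;
    unsigned int *colors = malloc(max_nr_colors * sizeof(*colors));

    if ( !colors )
        return -1;

    for ( i = 0; i < max_nr_colors; i++ )
        colors[i] = i;

    demo_default_colors = colors;
    demo_max_nr_colors = max_nr_colors;

    return 0;
}

/* No allocation here: the domain simply references the shared array. */
static void demo_domain_set_default_colors(struct demo_domain *d)
{
    d->llc_colors = demo_default_colors;
    d->num_llc_colors = demo_max_nr_colors;
}

/* Returns true if a private array was released, false if it was shared. */
static bool demo_domain_free_colors(struct demo_domain *d)
{
    bool freed = false;

    if ( d->llc_colors != demo_default_colors )
    {
        free((void *)d->llc_colors);
        freed = true;
    }

    d->llc_colors = NULL;
    d->num_llc_colors = 0;

    return freed;
}
```

The pointer comparison in demo_domain_free_colors() is the "checking
against that one special array instance" step; it also removes the
per-domain allocation failure path from the default case.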

> Jan

Thanks.



* Re: [PATCH v7 01/14] xen/common: add cache coloring common code
  2024-03-21 17:22         ` Carlo Nonato
@ 2024-03-22  7:25           ` Jan Beulich
  0 siblings, 0 replies; 60+ messages in thread
From: Jan Beulich @ 2024-03-22  7:25 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrea Bastoni, Andrew Cooper, George Dunlap, Julien Grall,
	Stefano Stabellini, Wei Liu, Marco Solieri, xen-devel

On 21.03.2024 18:22, Carlo Nonato wrote:
> Hi Jan,
> 
> On Thu, Mar 21, 2024 at 4:53 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 16:03, Carlo Nonato wrote:
>>> On Tue, Mar 19, 2024 at 3:58 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>> @@ -1706,6 +1706,43 @@ This option is intended for debugging purposes only.  Enable MSR_DEBUGCTL.LBR
>>>>>  in hypervisor context to be able to dump the Last Interrupt/Exception To/From
>>>>>  record with other registers.
>>>>>
>>>>> +### llc-coloring
>>>>> +> `= <boolean>`
>>>>> +
>>>>> +> Default: `false`
>>>>> +
>>>>> +Flag to enable or disable LLC coloring support at runtime. This option is
>>>>> +available only when `CONFIG_LLC_COLORING` is enabled. See the general
>>>>> +cache coloring documentation for more info.
>>>>> +
>>>>> +### llc-nr-ways
>>>>> +> `= <integer>`
>>>>> +
>>>>> +> Default: `Obtained from hardware`
>>>>> +
>>>>> +Specify the number of ways of the Last Level Cache. This option is available
>>>>> +only when `CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used
>>>>> +to find the number of supported cache colors. By default the value is
>>>>> +automatically computed by probing the hardware, but in case of specific needs,
>>>>> +it can be manually set. Those include failing probing and debugging/testing
>>>>> +purposes so that it's possible to emulate platforms with different number of
>>>>> +supported colors. If set, also "llc-size" must be set, otherwise the default
>>>>> +will be used.
>>>>> +
>>>>> +### llc-size
>>>>> +> `= <size>`
>>>>> +
>>>>> +> Default: `Obtained from hardware`
>>>>> +
>>>>> +Specify the size of the Last Level Cache. This option is available only when
>>>>> +`CONFIG_LLC_COLORING` is enabled. LLC size and number of ways are used to find
>>>>> +the number of supported cache colors. By default the value is automatically
>>>>> +computed by probing the hardware, but in case of specific needs, it can be
>>>>> +manually set. Those include failing probing and debugging/testing purposes so
>>>>> +that it's possible to emulate platforms with different number of supported
>>>>> +colors. If set, also "llc-nr-ways" must be set, otherwise the default will be
>>>>> +used.
>>>>
>>>> Wouldn't it make sense to infer "llc-coloring" when both of the latter options
>>>> were supplied?
>>>
>>> To me it looks a bit strange that specifying some attributes of the cache
>>> automatically enables cache coloring. Also it would require some changes in
>>> how to express the auto-probing for such attributes.
>>
>> Whereas to me it looks strange that, when having llc-size and llc-nr-ways
>> provided, I'd need to add a 3rd option. What purpose other than enabling
>> coloring could there be when specifying those parameters?
> 
> Ok, I probably misunderstood you. You mean just to assume llc-coloring=on
> when both llc-size and llc-nr-ways are present and not to remove
> llc-coloring completely, right?

Yes. The common thing, after all, will be to just have llc-coloring on the
command line.
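
As a side note on why the two values only make sense together: the quoted
documentation says LLC size and number of ways are used to find the number
of supported colors. A sketch of that arithmetic, assuming 4 KiB pages and
the usual page-coloring relation (size of one way divided by the page
size); demo_nr_colors() and DEMO_PAGE_SIZE are illustrative names only.

```c
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096u   /* assumption: 4 KiB pages */

/*
 * Number of colors = (LLC size / number of ways) / page size.
 * Returns 0 if either input is missing, i.e. nothing can be derived
 * from a single value.
 */
static unsigned int demo_nr_colors(size_t llc_size, unsigned int nr_ways)
{
    size_t way_size;

    if ( llc_size == 0 || nr_ways == 0 )
        return 0;

    way_size = llc_size / nr_ways;

    return (unsigned int)(way_size / DEMO_PAGE_SIZE);
}
```

For example, a 1 MiB, 16-way LLC has 64 KiB ways and therefore 16 colors
under these assumptions.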

Jan



* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-21 17:31         ` Carlo Nonato
@ 2024-03-22  7:26           ` Jan Beulich
  2024-03-27 11:39             ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-22  7:26 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Julien Grall, Stefano Stabellini,
	Wei Liu, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Marco Solieri, xen-devel, Andrea Bastoni

On 21.03.2024 18:31, Carlo Nonato wrote:
> On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 16:04, Carlo Nonato wrote:
>>> On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>>>>>
>>>>>  Specify a list of IO ports to be excluded from dom0 access.
>>>>>
>>>>> +### dom0-llc-colors
>>>>> +> `= List of [ <integer> | <integer>-<integer> ]`
>>>>> +
>>>>> +> Default: `All available LLC colors`
>>>>> +
>>>>> +Specify dom0 LLC color configuration. This option is available only when
>>>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
>>>>> +colors are used.
>>>>
>>>> My reservation towards this being a top-level option remains.
>>>
>>> How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
>>> seem possible to me. How can I express a list (llc-colors) inside another list
>>> (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
>>> before reaching other-param?
>>
>> For example by using a different separator:
>>
>> dom0=llc-colors=0-3+12-15,other-param=...
> 
> Ok, but that would mean changing the implementation of the parsing function
> and adopting this syntax in other places too, something that I would have
> preferred to avoid. Anyway, I'll follow your suggestion.

Well, this is all used by Arm only for now. You will want to make sure Arm
folks are actually okay with this alternative approach.

Jan



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-21 16:22       ` Jan Beulich
@ 2024-03-22 15:07         ` Carlo Nonato
  2024-03-25  7:19           ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-22 15:07 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Andrew Cooper, George Dunlap, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

Hi guys,

On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 17:10, Julien Grall wrote:
> > On 21/03/2024 16:07, Julien Grall wrote:
> >> On 15/03/2024 10:58, Carlo Nonato wrote:
> >>> PGC_static and PGC_extra need to be preserved when assigning a page.
> >>> Define a new macro that groups those flags and use it instead of or'ing
> >>> every time.
> >>>
> >>> To make preserved flags even more meaningful, they are kept also when
> >>> switching state in mark_page_free().
> >>>
> >>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>
> >> This patch is introducing a regression in OSStest (and possibly gitlab?):
> >>
> >> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
> >> v=0xe40000010007ffff t=0x24
> >> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
> >> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
> >> Not tainted ]----
> >> Mar 21 12:00:42.829857 (XEN) CPU:    12
> >> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
> >> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
> >> hypervisor (d0v8)
> >> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
> >> 000000000007ffff   rcx: 0000000000000028
> >> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
> >> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
> >> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
> >> ffff83047bec7b98   r8:  0000000000000000
> >> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
> >> 000000000000000c   r11: 0000000000000010
> >> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
> >> 0000000000000000   r14: ffff82e0044238a0
> >> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
> >> 0000000080050033   cr4: 0000000000372660
> >> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
> >> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
> >> ffff88801f200000   gss: 0000000000000000
> >> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
> >> ss: e010   cs: e008
> >> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
> >> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
> >> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
> >> 79 36 0f 0b 41 89 cd 48 c7 47 f0
> >> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
> >> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
> >> 0000000000000001 ffff83046ccda000
> >> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
> >> 0000000000000028 0000000000000021
> >> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
> >> 00007d2000000000 ffff83047bec7c48
> >> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
> >> 0000000000000100 0000000000000000
> >> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
> >> ffff83047bec7c80 ffff82d0402f626c
> >> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
> >> ffff82d0402f65a0 ffff83046ccda000
> >> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
> >> 0000000000000000 ffff83047bec7cc0
> >> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
> >> ffff82d0402bd543 ffff83046ccda000
> >> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
> >> ffff82d04032c524 ffff83046ccda000
> >> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
> >> ffff83047bec7d58 ffff82d040206750
> >> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
> >> ffff83047bec7d48 0000000000000000
> >> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
> >> ffff82d0405e9120 0000000000000001
> >> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
> >> 0000000000000007 ffff83023e3b3000
> >> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
> >> ffff83023e38e000 ffff83023e2efb40
> >> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
> >> 0000000000000206 ffff83047bec7dc0
> >> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
> >> e75aaa8d0000000c ac0d6d864e487f62
> >> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
> >> ffffffff000003ff 00000002ffffffff
> >> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
> >> 0000000000000000 0000000000000000
> >> Mar 21 12:00:43.093646 (XEN) Xen call trace:
> >> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
> >> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
> >> alloc_domheap_pages+0x17d/0x1e4
> >> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
> >> hap_set_allocation+0x73/0x23c
> >> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
> >> hap_enable+0x138/0x33c
> >> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
> >> paging_enable+0x2d/0x45
> >> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
> >> hvm_domain_initialise+0x185/0x428
> >> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
> >> arch_domain_create+0x3e7/0x4c1
> >> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
> >> domain_create+0x4cc/0x7e2
> >> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
> >> do_domctl+0x1850/0x192d
> >> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
> >> pv_hypercall+0x617/0x6b5
> >> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
> >> lstar_enter+0x13a/0x140
> >> Mar 21 12:00:43.153689 (XEN)
> >> Mar 21 12:00:43.153711 (XEN)
> >> Mar 21 12:00:43.153731 (XEN) ****************************************
> >> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
> >> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
> >> Mar 21 12:00:43.165703 (XEN) ****************************************
> >> Mar 21 12:00:43.177633 (XEN)
> >> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
> >>
> >> The code around the BUG is:
> >>
> >>          /* Reference count must continuously be zero for free pages. */
> >>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
> >>          {
> >>              printk(XENLOG_ERR
> >>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
> >>                     i, mfn_x(page_to_mfn(pg + i)),
> >>                     pg[i].count_info, pg[i].v.free.order,
> >>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
> >>              BUG();
> >>          }
> >>
> >> Now that you are preserving some flags, you also want to modify the
> >> condition. I haven't checked the rest of the code, so there might be
> >> some adjustments necessary.
> >
> > Actually maybe the condition should not be adjusted. I think it would be
> > wrong if a free page has the flag PGC_extra set. Any thoughts?
>
> I agree, yet I'm inclined to say PGC_extra should have been cleared
> before trying to free the page.

So what to do now? Should I drop this commit?

Thanks.



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-22 15:07         ` Carlo Nonato
@ 2024-03-25  7:19           ` Jan Beulich
  2024-03-26 16:39             ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-25  7:19 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Julien Grall, Andrew Cooper, George Dunlap, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

On 22.03.2024 16:07, Carlo Nonato wrote:
> Hi guys,
> 
> On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 17:10, Julien Grall wrote:
>>> On 21/03/2024 16:07, Julien Grall wrote:
>>>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>>>> PGC_static and PGC_extra need to be preserved when assigning a page.
>>>>> Define a new macro that groups those flags and use it instead of or'ing
>>>>> every time.
>>>>>
>>>>> To make preserved flags even more meaningful, they are kept also when
>>>>> switching state in mark_page_free().
>>>>>
>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>
>>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>>>
>>>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
>>>> v=0xe40000010007ffff t=0x24
>>>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
>>>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
>>>> Not tainted ]----
>>>> Mar 21 12:00:42.829857 (XEN) CPU:    12
>>>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
>>>> hypervisor (d0v8)
>>>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
>>>> 000000000007ffff   rcx: 0000000000000028
>>>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
>>>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
>>>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
>>>> ffff83047bec7b98   r8:  0000000000000000
>>>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
>>>> 000000000000000c   r11: 0000000000000010
>>>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
>>>> 0000000000000000   r14: ffff82e0044238a0
>>>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
>>>> 0000000080050033   cr4: 0000000000372660
>>>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
>>>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
>>>> ffff88801f200000   gss: 0000000000000000
>>>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
>>>> ss: e010   cs: e008
>>>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
>>>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
>>>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
>>>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
>>>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
>>>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
>>>> 0000000000000001 ffff83046ccda000
>>>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
>>>> 0000000000000028 0000000000000021
>>>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
>>>> 00007d2000000000 ffff83047bec7c48
>>>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
>>>> 0000000000000100 0000000000000000
>>>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
>>>> ffff83047bec7c80 ffff82d0402f626c
>>>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
>>>> ffff82d0402f65a0 ffff83046ccda000
>>>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
>>>> 0000000000000000 ffff83047bec7cc0
>>>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
>>>> ffff82d0402bd543 ffff83046ccda000
>>>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
>>>> ffff82d04032c524 ffff83046ccda000
>>>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
>>>> ffff83047bec7d58 ffff82d040206750
>>>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
>>>> ffff83047bec7d48 0000000000000000
>>>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
>>>> ffff82d0405e9120 0000000000000001
>>>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
>>>> 0000000000000007 ffff83023e3b3000
>>>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
>>>> ffff83023e38e000 ffff83023e2efb40
>>>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
>>>> 0000000000000206 ffff83047bec7dc0
>>>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
>>>> e75aaa8d0000000c ac0d6d864e487f62
>>>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
>>>> ffffffff000003ff 00000002ffffffff
>>>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
>>>> 0000000000000000 0000000000000000
>>>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
>>>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
>>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
>>>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
>>>> alloc_domheap_pages+0x17d/0x1e4
>>>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
>>>> hap_set_allocation+0x73/0x23c
>>>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
>>>> hap_enable+0x138/0x33c
>>>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
>>>> paging_enable+0x2d/0x45
>>>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
>>>> hvm_domain_initialise+0x185/0x428
>>>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
>>>> arch_domain_create+0x3e7/0x4c1
>>>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
>>>> domain_create+0x4cc/0x7e2
>>>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
>>>> do_domctl+0x1850/0x192d
>>>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
>>>> pv_hypercall+0x617/0x6b5
>>>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
>>>> lstar_enter+0x13a/0x140
>>>> Mar 21 12:00:43.153689 (XEN)
>>>> Mar 21 12:00:43.153711 (XEN)
>>>> Mar 21 12:00:43.153731 (XEN) ****************************************
>>>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
>>>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
>>>> Mar 21 12:00:43.165703 (XEN) ****************************************
>>>> Mar 21 12:00:43.177633 (XEN)
>>>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
>>>>
>>>> The code around the BUG is:
>>>>
>>>>          /* Reference count must continuously be zero for free pages. */
>>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>>>          {
>>>>              printk(XENLOG_ERR
>>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>>>                     i, mfn_x(page_to_mfn(pg + i)),
>>>>                     pg[i].count_info, pg[i].v.free.order,
>>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>>>              BUG();
>>>>          }
>>>>
>>>> Now that you are preserving some flags, you also want to modify the
>>>> condition. I haven't checked the rest of the code, so there might be
>>>> some adjustments necessary.
>>>
>>> Actually maybe the condition should not be adjusted. I think it would be
>>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
>>
>> I agree, yet I'm inclined to say PGC_extra should have been cleared
>> before trying to free the page.
> 
> So what to do now? Should I drop this commit?

No, we need to get to the root of the issue. Since osstest has hit it quite
easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
In any event, as per my earlier reply, my present guess is that your change
has merely uncovered a previously latent issue elsewhere.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 06/14] tools: add support for cache coloring configuration
  2024-03-15 10:58 ` [PATCH v7 06/14] tools: add support for cache coloring configuration Carlo Nonato
@ 2024-03-25 10:55   ` Anthony PERARD
  2024-03-25 11:44     ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Anthony PERARD @ 2024-03-25 10:55 UTC (permalink / raw)
  To: Carlo Nonato; +Cc: xen-devel, Juergen Gross, Marco Solieri

On Fri, Mar 15, 2024 at 11:58:54AM +0100, Carlo Nonato wrote:
> diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
> index 5546335973..79f206f616 100644
> --- a/tools/libs/light/libxl_create.c
> +++ b/tools/libs/light/libxl_create.c
> @@ -726,6 +726,15 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
>              /* A new domain now exists */
>              *domid = local_domid;
>  
> +            ret = xc_domain_set_llc_colors(ctx->xch, local_domid,
> +                                           b_info->llc_colors,
> +                                           b_info->num_llc_colors);
> +            if (ret < 0 && errno != EOPNOTSUPP) {

Wait, this additional check on EOPNOTSUPP, does that mean we ignore
"llc_colors" configured by the admin of the VM if the system doesn't
support it? Shouldn't we also report an error in this case? At least if
`num_llc_colors > 0`.

Thanks,

-- 
Anthony PERARD


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 06/14] tools: add support for cache coloring configuration
  2024-03-25 10:55   ` Anthony PERARD
@ 2024-03-25 11:44     ` Carlo Nonato
  0 siblings, 0 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-25 11:44 UTC (permalink / raw)
  To: Anthony PERARD; +Cc: xen-devel, Juergen Gross, Marco Solieri

Hi Anthony,

On Mon, Mar 25, 2024 at 11:55 AM Anthony PERARD
<anthony.perard@cloud.com> wrote:
>
> On Fri, Mar 15, 2024 at 11:58:54AM +0100, Carlo Nonato wrote:
> > diff --git a/tools/libs/light/libxl_create.c b/tools/libs/light/libxl_create.c
> > index 5546335973..79f206f616 100644
> > --- a/tools/libs/light/libxl_create.c
> > +++ b/tools/libs/light/libxl_create.c
> > @@ -726,6 +726,15 @@ int libxl__domain_make(libxl__gc *gc, libxl_domain_config *d_config,
> >              /* A new domain now exists */
> >              *domid = local_domid;
> >
> > +            ret = xc_domain_set_llc_colors(ctx->xch, local_domid,
> > +                                           b_info->llc_colors,
> > +                                           b_info->num_llc_colors);
> > +            if (ret < 0 && errno != EOPNOTSUPP) {
>
> Wait, this additional check on EOPNOTSUPP, does that mean we ignore
> "llc_colors" configure by the admin of the VM if the system doesn't
> support it? Shouldn't we also report an error in this case? At least if
> `num_llc_colors > 0`.

You're right. The problem was that I didn't want to log because coloring is a
very niche feature and I need to indiscriminately try to color a domain. Doing
that only when `num_llc_colors > 0` is fine though.
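
As a minimal sketch of one possible reading of that (not the final patch):
EOPNOTSUPP is only tolerated when no colors were requested. The stub below
is a hypothetical stand-in for xc_domain_set_llc_colors() that always fails
with EOPNOTSUPP, and apply_llc_colors() is an illustrative name, not
existing libxl code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for xc_domain_set_llc_colors(): it behaves like a
 * hypervisor built without LLC coloring and always fails with EOPNOTSUPP.
 */
static int set_llc_colors_stub(unsigned int domid,
                               const unsigned int *colors,
                               unsigned int num_colors)
{
    (void)domid;
    (void)colors;
    (void)num_colors;
    errno = EOPNOTSUPP;
    return -1;
}

/*
 * Returns 0 on success, or when the failure is EOPNOTSUPP and the admin
 * did not ask for any color. Returns -1 in every other failure case.
 */
static int apply_llc_colors(unsigned int domid,
                            const unsigned int *colors,
                            unsigned int num_colors)
{
    int ret = set_llc_colors_stub(domid, colors, num_colors);

    if (ret < 0 && (errno != EOPNOTSUPP || num_colors > 0)) {
        fprintf(stderr, "LLC colors setup failed (errno %d)\n", errno);
        return -1;
    }

    return 0;
}
```

With this shape a domain without an "llc_colors" setting is still created
silently on systems lacking the feature, while an explicit configuration
that cannot be honoured is reported.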

Thanks.

> Thanks,
>
> --
> Anthony PERARD


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-25  7:19           ` Jan Beulich
@ 2024-03-26 16:39             ` Carlo Nonato
  2024-03-26 17:04               ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-26 16:39 UTC (permalink / raw)
  To: Jan Beulich
  Cc: Julien Grall, Andrew Cooper, George Dunlap, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

Hi Jan,

On Mon, Mar 25, 2024 at 8:19 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 22.03.2024 16:07, Carlo Nonato wrote:
> > Hi guys,
> >
> > On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 21.03.2024 17:10, Julien Grall wrote:
> >>> On 21/03/2024 16:07, Julien Grall wrote:
> >>>> On 15/03/2024 10:58, Carlo Nonato wrote:
> >>>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
> >>>>> Define a new macro that groups those flags and use it instead of or'ing
> >>>>> every time.
> >>>>>
> >>>>> To make preserved flags even more meaningful, they are kept also when
> >>>>> switching state in mark_page_free().
> >>>>>
> >>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
> >>>>
> >>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
> >>>>
> >>>> Mar 21 12:00:29.533676 (XEN) pg[0] MFN 2211c5 c=0x2c00000000000000 o=0
> >>>> v=0xe40000010007ffff t=0x24
> >>>> Mar 21 12:00:42.829785 (XEN) Xen BUG at common/page_alloc.c:1033
> >>>> Mar 21 12:00:42.829829 (XEN) ----[ Xen-4.19-unstable  x86_64  debug=y
> >>>> Not tainted ]----
> >>>> Mar 21 12:00:42.829857 (XEN) CPU:    12
> >>>> Mar 21 12:00:42.841571 (XEN) RIP:    e008:[<ffff82d04022fe1f>]
> >>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >>>> Mar 21 12:00:42.841609 (XEN) RFLAGS: 0000000000010282   CONTEXT:
> >>>> hypervisor (d0v8)
> >>>> Mar 21 12:00:42.853654 (XEN) rax: ffff83023e3ed06c   rbx:
> >>>> 000000000007ffff   rcx: 0000000000000028
> >>>> Mar 21 12:00:42.853689 (XEN) rdx: ffff83047bec7fff   rsi:
> >>>> ffff83023e3ea3e8   rdi: ffff83023e3ea3e0
> >>>> Mar 21 12:00:42.865657 (XEN) rbp: ffff83047bec7c10   rsp:
> >>>> ffff83047bec7b98   r8:  0000000000000000
> >>>> Mar 21 12:00:42.877647 (XEN) r9:  0000000000000001   r10:
> >>>> 000000000000000c   r11: 0000000000000010
> >>>> Mar 21 12:00:42.877682 (XEN) r12: 0000000000000001   r13:
> >>>> 0000000000000000   r14: ffff82e0044238a0
> >>>> Mar 21 12:00:42.889652 (XEN) r15: 0000000000000000   cr0:
> >>>> 0000000080050033   cr4: 0000000000372660
> >>>> Mar 21 12:00:42.901651 (XEN) cr3: 000000046fe34000   cr2: 00007fb72757610b
> >>>> Mar 21 12:00:42.901685 (XEN) fsb: 00007fb726def380   gsb:
> >>>> ffff88801f200000   gss: 0000000000000000
> >>>> Mar 21 12:00:42.913646 (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000
> >>>> ss: e010   cs: e008
> >>>> Mar 21 12:00:42.913680 (XEN) Xen code around <ffff82d04022fe1f>
> >>>> (common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2):
> >>>> Mar 21 12:00:42.925645 (XEN)  d1 1c 00 e8 ad dd 02 00 <0f> 0b 48 85 c9
> >>>> 79 36 0f 0b 41 89 cd 48 c7 47 f0
> >>>> Mar 21 12:00:42.937649 (XEN) Xen stack trace from rsp=ffff83047bec7b98:
> >>>> Mar 21 12:00:42.937683 (XEN)    0000000000000024 000000007bec7c20
> >>>> 0000000000000001 ffff83046ccda000
> >>>> Mar 21 12:00:42.949653 (XEN)    ffff82e000000021 0000000000000016
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:42.949687 (XEN)    0000000000000000 0000000000000000
> >>>> 0000000000000028 0000000000000021
> >>>> Mar 21 12:00:42.961652 (XEN)    ffff83046ccda000 0000000000000000
> >>>> 00007d2000000000 ffff83047bec7c48
> >>>> Mar 21 12:00:42.961687 (XEN)    ffff82d0402302ff ffff83046ccda000
> >>>> 0000000000000100 0000000000000000
> >>>> Mar 21 12:00:42.973655 (XEN)    ffff82d0405f0080 00007d2000000000
> >>>> ffff83047bec7c80 ffff82d0402f626c
> >>>> Mar 21 12:00:42.985656 (XEN)    ffff83046ccda000 ffff83046ccda640
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:42.985690 (XEN)    ffff83046ccda220 ffff83047bec7cb0
> >>>> ffff82d0402f65a0 ffff83046ccda000
> >>>> Mar 21 12:00:42.997662 (XEN)    0000000000000000 0000000000000000
> >>>> 0000000000000000 ffff83047bec7cc0
> >>>> Mar 21 12:00:43.009660 (XEN)    ffff82d040311f8a ffff83047bec7ce0
> >>>> ffff82d0402bd543 ffff83046ccda000
> >>>> Mar 21 12:00:43.009695 (XEN)    ffff83047bec7dc8 ffff83047bec7d08
> >>>> ffff82d04032c524 ffff83046ccda000
> >>>> Mar 21 12:00:43.021653 (XEN)    ffff83047bec7dc8 0000000000000002
> >>>> ffff83047bec7d58 ffff82d040206750
> >>>> Mar 21 12:00:43.033642 (XEN)    0000000000000000 ffff82d040233fe5
> >>>> ffff83047bec7d48 0000000000000000
> >>>> Mar 21 12:00:43.033678 (XEN)    0000000000000002 00007fb72767f010
> >>>> ffff82d0405e9120 0000000000000001
> >>>> Mar 21 12:00:43.045654 (XEN)    ffff83047bec7e70 ffff82d040240728
> >>>> 0000000000000007 ffff83023e3b3000
> >>>> Mar 21 12:00:43.045690 (XEN)    0000000000000246 ffff83023e2efa90
> >>>> ffff83023e38e000 ffff83023e2efb40
> >>>> Mar 21 12:00:43.057609 (XEN)    0000000000000007 ffff83023e3afb80
> >>>> 0000000000000206 ffff83047bec7dc0
> >>>> Mar 21 12:00:43.069662 (XEN)    0000001600000001 000000000000ffff
> >>>> e75aaa8d0000000c ac0d6d864e487f62
> >>>> Mar 21 12:00:43.069697 (XEN)    000000037fa48d76 0000000200000000
> >>>> ffffffff000003ff 00000002ffffffff
> >>>> Mar 21 12:00:43.081647 (XEN)    0000000000000000 00000000000001ff
> >>>> 0000000000000000 0000000000000000
> >>>> Mar 21 12:00:43.093646 (XEN) Xen call trace:
> >>>> Mar 21 12:00:43.093677 (XEN)    [<ffff82d04022fe1f>] R
> >>>> common/page_alloc.c#alloc_heap_pages+0x37f/0x6e2
> >>>> Mar 21 12:00:43.093705 (XEN)    [<ffff82d0402302ff>] F
> >>>> alloc_domheap_pages+0x17d/0x1e4
> >>>> Mar 21 12:00:43.105652 (XEN)    [<ffff82d0402f626c>] F
> >>>> hap_set_allocation+0x73/0x23c
> >>>> Mar 21 12:00:43.105685 (XEN)    [<ffff82d0402f65a0>] F
> >>>> hap_enable+0x138/0x33c
> >>>> Mar 21 12:00:43.117646 (XEN)    [<ffff82d040311f8a>] F
> >>>> paging_enable+0x2d/0x45
> >>>> Mar 21 12:00:43.117679 (XEN)    [<ffff82d0402bd543>] F
> >>>> hvm_domain_initialise+0x185/0x428
> >>>> Mar 21 12:00:43.129652 (XEN)    [<ffff82d04032c524>] F
> >>>> arch_domain_create+0x3e7/0x4c1
> >>>> Mar 21 12:00:43.129687 (XEN)    [<ffff82d040206750>] F
> >>>> domain_create+0x4cc/0x7e2
> >>>> Mar 21 12:00:43.141665 (XEN)    [<ffff82d040240728>] F
> >>>> do_domctl+0x1850/0x192d
> >>>> Mar 21 12:00:43.141699 (XEN)    [<ffff82d04031a96a>] F
> >>>> pv_hypercall+0x617/0x6b5
> >>>> Mar 21 12:00:43.153656 (XEN)    [<ffff82d0402012ca>] F
> >>>> lstar_enter+0x13a/0x140
> >>>> Mar 21 12:00:43.153689 (XEN)
> >>>> Mar 21 12:00:43.153711 (XEN)
> >>>> Mar 21 12:00:43.153731 (XEN) ****************************************
> >>>> Mar 21 12:00:43.165647 (XEN) Panic on CPU 12:
> >>>> Mar 21 12:00:43.165678 (XEN) Xen BUG at common/page_alloc.c:1033
> >>>> Mar 21 12:00:43.165703 (XEN) ****************************************
> >>>> Mar 21 12:00:43.177633 (XEN)
> >>>> Mar 21 12:00:43.177662 (XEN) Manual reset required ('noreboot' specified)
> >>>>
> >>>> The code around the BUG is:
> >>>>
> >>>>          /* Reference count must continuously be zero for free pages. */
> >>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
> >>>>          {
> >>>>              printk(XENLOG_ERR
> >>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
> >>>>                     i, mfn_x(page_to_mfn(pg + i)),
> >>>>                     pg[i].count_info, pg[i].v.free.order,
> >>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
> >>>>              BUG();
> >>>>          }
> >>>>
> >>>> Now that you are preserving some flags, you also want to modify the
> >>>> condition. I haven't checked the rest of the code, so there might be
> >>>> some adjustments necessary.
> >>>
> >>> Actually maybe the condition should not be adjusted. I think it would be
> >>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
> >>
> >> I agree, yet I'm inclined to say PGC_extra should have been cleared
> >> before trying to free the page.
> >
> > So what to do now? Should I drop this commit?
>
> No, we need to get to the root of the issue. Since osstest has hit it quite
> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
> In any event, as per my earlier reply, my present guess is that your change
> has merely uncovered a previously latent issue elsewhere.

Ok, what about removing PGC_extra in free_heap_pages() before the
mark_page_free() call?

Thanks.

> Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-26 16:39             ` Carlo Nonato
@ 2024-03-26 17:04               ` Jan Beulich
  2024-03-26 21:55                 ` Julien Grall
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-26 17:04 UTC (permalink / raw)
  To: Carlo Nonato
  Cc: Julien Grall, Andrew Cooper, George Dunlap, Stefano Stabellini,
	Wei Liu, Roger Pau Monné,
	xen-devel

On 26.03.2024 17:39, Carlo Nonato wrote:
> On Mon, Mar 25, 2024 at 8:19 AM Jan Beulich <jbeulich@suse.com> wrote:
>> On 22.03.2024 16:07, Carlo Nonato wrote:
>>> On Thu, Mar 21, 2024 at 5:23 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>> On 21.03.2024 17:10, Julien Grall wrote:
>>>>> On 21/03/2024 16:07, Julien Grall wrote:
>>>>>> On 15/03/2024 10:58, Carlo Nonato wrote:
>>>>>>> PGC_static and PGC_extra needs to be preserved when assigning a page.
>>>>>>> Define a new macro that groups those flags and use it instead of or'ing
>>>>>>> every time.
>>>>>>>
>>>>>>> To make preserved flags even more meaningful, they are kept also when
>>>>>>> switching state in mark_page_free().
>>>>>>>
>>>>>>> Signed-off-by: Carlo Nonato <carlo.nonato@minervasys.tech>
>>>>>>
>>>>>> This patch is introducing a regression in OSStest (and possibly gitlab?):
>>>>>>
>>>>>> The code around the BUG is:
>>>>>>
>>>>>>          /* Reference count must continuously be zero for free pages. */
>>>>>>          if ( (pg[i].count_info & ~PGC_need_scrub) != PGC_state_free )
>>>>>>          {
>>>>>>              printk(XENLOG_ERR
>>>>>>                     "pg[%u] MFN %"PRI_mfn" c=%#lx o=%u v=%#lx t=%#x\n",
>>>>>>                     i, mfn_x(page_to_mfn(pg + i)),
>>>>>>                     pg[i].count_info, pg[i].v.free.order,
>>>>>>                     pg[i].u.free.val, pg[i].tlbflush_timestamp);
>>>>>>              BUG();
>>>>>>          }
>>>>>>
>>>>>> Now that you are preserving some flags, you also want to modify the
>>>>>> condition. I haven't checked the rest of the code, so there might be
>>>>>> some adjustments necessary.
>>>>>
>>>>> Actually maybe the condition should not be adjusted. I think it would be
>>>>> wrong if a free pages has the flag PGC_extra set. Any thoughts?
>>>>
>>>> I agree, yet I'm inclined to say PGC_extra should have been cleared
>>>> before trying to free the page.
>>>
>>> So what to do now? Should I drop this commit?
>>
>> No, we need to get to the root of the issue. Since osstest has hit it quite
>> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
>> In any event, as per my earlier reply, my present guess is that your change
>> has merely uncovered a previously latent issue elsewhere.
> 
> Ok, what about removing PGC_extra in free_heap_pages() before the
> mark_page_free() call?

Question is: How would you justify such a change? IOW I'm not convinced
(yet) this wants doing there.

Jan


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-26 17:04               ` Jan Beulich
@ 2024-03-26 21:55                 ` Julien Grall
  2024-03-27 11:10                   ` Carlo Nonato
  0 siblings, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-26 21:55 UTC (permalink / raw)
  To: Jan Beulich, Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Carlo & Jan,

On 26/03/2024 17:04, Jan Beulich wrote:
>>> No, we need to get to the root of the issue. Since osstest has hit it quite
>>> easily as it seems, I'm somewhat surprised you didn't hit it in your testing.
>>> In any event, as per my earlier reply, my present guess is that your change
>>> has merely uncovered a previously latent issue elsewhere.
>>
>> Ok, what about removing PGC_extra in free_heap_pages() before the
>> mark_page_free() call?
> 
> Question is: How would you justify such a change? IOW I'm not convinced
> (yet) this wants doing there.

Looking at the code, the flag is originally set in 
alloc_domheap_pages(). So I guess it would make sense to do it in 
free_domheap_pages().

For PGC_static, I don't think we can reach free_domheap_pages() with the 
flag set (the pages should live in a separate pool). So I believe there 
is nothing to do for them.
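
To illustrate why a page that reaches the free path with PGC_extra still
set trips the check quoted earlier, here is a small stand-alone model. The
bit positions are made up for the example (the real layout is
arch-specific), and looks_free() / mark_free_preserving() are hypothetical
helpers, not functions from page_alloc.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit layout, for illustration only. */
#define PGC_STATE_FREE  (UINT64_C(3) << 60)
#define PGC_NEED_SCRUB  (UINT64_C(1) << 59)
#define PGC_EXTRA       (UINT64_C(1) << 58)

/*
 * Models the allocator check: apart from PGC_need_scrub, a free page must
 * carry nothing but the "free" state in count_info.
 */
static bool looks_free(uint64_t count_info)
{
    return (count_info & ~PGC_NEED_SCRUB) == PGC_STATE_FREE;
}

/*
 * Models switching a page to the free state while keeping the flags
 * selected by `preserved`, as the patch does in mark_page_free().
 */
static uint64_t mark_free_preserving(uint64_t count_info, uint64_t preserved)
{
    return PGC_STATE_FREE | (count_info & preserved);
}
```

In this model, mark_free_preserving(PGC_EXTRA, PGC_EXTRA) yields a value
that looks_free() rejects, whereas clearing PGC_EXTRA from count_info
before the call yields one it accepts.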

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-26 21:55                 ` Julien Grall
@ 2024-03-27 11:10                   ` Carlo Nonato
  2024-03-27 13:28                     ` Julien Grall
  0 siblings, 1 reply; 60+ messages in thread
From: Carlo Nonato @ 2024-03-27 11:10 UTC (permalink / raw)
  To: Julien Grall, Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi guys,

> Question is: How would you justify such a change? IOW I'm not convinced
> (yet) this wants doing there.

You mean in this series?

> Looking at the code, the flag is originally set in
> alloc_domheap_pages(). So I guess it would make sense to do it in
> free_domheap_pages().

We don't hold the heap_lock there. Is it safe to change count_info without it?

Thanks.

> For PGC_static, I don't think we can reach free_domheap_pages() with the
> flag set (the pages should live in a separate pool). So I believe there
> is nothing to do for them.
>
> Cheers,
>
> --
> Julien Grall


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-22  7:26           ` Jan Beulich
@ 2024-03-27 11:39             ` Carlo Nonato
  2024-03-27 11:56               ` Julien Grall
  2024-03-27 11:57               ` Michal Orzel
  0 siblings, 2 replies; 60+ messages in thread
From: Carlo Nonato @ 2024-03-27 11:39 UTC (permalink / raw)
  To: Julien Grall, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Stefano Stabellini
  Cc: Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel, Andrea Bastoni

Hi guys,

On Fri, Mar 22, 2024 at 8:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>
> On 21.03.2024 18:31, Carlo Nonato wrote:
> > On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>
> >> On 21.03.2024 16:04, Carlo Nonato wrote:
> >>> On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
> >>>> On 15.03.2024 11:58, Carlo Nonato wrote:
> >>>>> --- a/docs/misc/xen-command-line.pandoc
> >>>>> +++ b/docs/misc/xen-command-line.pandoc
> >>>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
> >>>>>
> >>>>>  Specify a list of IO ports to be excluded from dom0 access.
> >>>>>
> >>>>> +### dom0-llc-colors
> >>>>> +> `= List of [ <integer> | <integer>-<integer> ]`
> >>>>> +
> >>>>> +> Default: `All available LLC colors`
> >>>>> +
> >>>>> +Specify dom0 LLC color configuration. This option is available only when
> >>>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
> >>>>> +colors are used.
> >>>>
> >>>> My reservation towards this being a top-level option remains.
> >>>
> >>> How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
> >>> seem possible to me. How can I express a list (llc-colors) inside another list
> >>> (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
> >>> before reaching other-param?
> >>
> >> For example by using a different separator:
> >>
> >> dom0=llc-colors=0-3+12-15,other-param=...
> >
> > Ok, but that would mean to change the implementation of the parsing function
> > and to adopt this syntax also in other places, something that I would've
> > preferred to avoid. Anyway I'll follow your suggestion.
>
> Well, this is all used by Arm only for now. You will want to make sure Arm
> folks are actually okay with this alternative approach.
>
> Jan

Are you Arm maintainers ok with this?

Thanks.


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-27 11:39             ` Carlo Nonato
@ 2024-03-27 11:56               ` Julien Grall
  2024-04-05  0:20                 ` Stefano Stabellini
  2024-03-27 11:57               ` Michal Orzel
  1 sibling, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-27 11:56 UTC (permalink / raw)
  To: Carlo Nonato, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Stefano Stabellini
  Cc: Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel, Andrea Bastoni

Hi,

On 27/03/2024 11:39, Carlo Nonato wrote:
> On Fri, Mar 22, 2024 at 8:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 18:31, Carlo Nonato wrote:
>>> On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 21.03.2024 16:04, Carlo Nonato wrote:
>>>>> On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>>>>>>>
>>>>>>>   Specify a list of IO ports to be excluded from dom0 access.
>>>>>>>
>>>>>>> +### dom0-llc-colors
>>>>>>> +> `= List of [ <integer> | <integer>-<integer> ]`
>>>>>>> +
>>>>>>> +> Default: `All available LLC colors`
>>>>>>> +
>>>>>>> +Specify dom0 LLC color configuration. This option is available only when
>>>>>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
>>>>>>> +colors are used.
>>>>>>
>>>>>> My reservation towards this being a top-level option remains.
>>>>>
>>>>> How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
>>>>> seem possible to me. How can I express a list (llc-colors) inside another list
>>>>> (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
>>>>> before reaching other-param?
>>>>
>>>> For example by using a different separator:
>>>>
>>>> dom0=llc-colors=0-3+12-15,other-param=...
>>>
>>> Ok, but that would mean to change the implementation of the parsing function
>>> and to adopt this syntax also in other places, something that I would've
>>> preferred to avoid. Anyway I'll follow your suggestion.
>>
>> Well, this is all used by Arm only for now. You will want to make sure Arm
>> folks are actually okay with this alternative approach.
>>
>> Jan
> 
> Are you Arm maintainers ok with this?

Unfortunately no. I find the use of + and "nested" = extremely confusing 
to read.

There might be other symbols to use (e.g. s/=/:/ s#+#/#), but I am not 
entirely sure of the value of trying to cram all the options under a single 
top-level parameter. So right now, I would prefer that we stick with the 
existing approach (i.e. introducing dom0-llc-colors).
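
For reference, the parsing problem raised earlier in the thread can be
sketched as follows. This is only a rough model: parse_color_list() is a
hypothetical helper with a configurable separator, not the parser from the
series.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Parse a list such as "0-3<sep>12-15" into colors[].
 * Parsing stops at the first character following a range that is not
 * `sep`; *end is set to that position. Returns the number of colors
 * stored, or -1 on malformed input or when `max` would be exceeded.
 */
static int parse_color_list(const char *s, char sep,
                            unsigned int *colors, unsigned int max,
                            const char **end)
{
    unsigned int n = 0;

    for ( ; ; ) {
        char *p;
        unsigned long lo, hi;

        if (*s < '0' || *s > '9')
            return -1;              /* a range must start with a digit */
        lo = strtoul(s, &p, 10);
        hi = lo;
        s = p;

        if (*s == '-') {            /* optional "-<integer>" upper bound */
            s++;
            if (*s < '0' || *s > '9')
                return -1;
            hi = strtoul(s, &p, 10);
            s = p;
        }

        if (hi < lo)
            return -1;

        for (unsigned long c = lo; c <= hi; c++) {
            if (n == max)
                return -1;          /* output array is full */
            colors[n++] = (unsigned int)c;
        }

        if (*s != sep)
            break;                  /* end of list, caller continues here */
        s++;
    }

    *end = s;
    return (int)n;
}
```

With ',' as the separator, "0-3,12-15,other-param" is rejected because the
parser cannot tell that the last comma ends the list. With '+', parsing
stops cleanly at the comma, which is the only thing the alternative syntax
buys.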

Cheers,

-- 
Julien Grall


^ permalink raw reply	[flat|nested] 60+ messages in thread

* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-27 11:39             ` Carlo Nonato
  2024-03-27 11:56               ` Julien Grall
@ 2024-03-27 11:57               ` Michal Orzel
  1 sibling, 0 replies; 60+ messages in thread
From: Michal Orzel @ 2024-03-27 11:57 UTC (permalink / raw)
  To: Carlo Nonato, Julien Grall, Bertrand Marquis, Volodymyr Babchuk,
	Stefano Stabellini
  Cc: Jan Beulich, Andrew Cooper, George Dunlap, Wei Liu,
	Marco Solieri, xen-devel, Andrea Bastoni



On 27/03/2024 12:39, Carlo Nonato wrote:
> 
> 
> Hi guys,
> 
> On Fri, Mar 22, 2024 at 8:26 AM Jan Beulich <jbeulich@suse.com> wrote:
>>
>> On 21.03.2024 18:31, Carlo Nonato wrote:
>>> On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>
>>>> On 21.03.2024 16:04, Carlo Nonato wrote:
>>>>> On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com> wrote:
>>>>>> On 15.03.2024 11:58, Carlo Nonato wrote:
>>>>>>> --- a/docs/misc/xen-command-line.pandoc
>>>>>>> +++ b/docs/misc/xen-command-line.pandoc
>>>>>>> @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
>>>>>>>
>>>>>>>  Specify a list of IO ports to be excluded from dom0 access.
>>>>>>>
>>>>>>> +### dom0-llc-colors
>>>>>>> +> `= List of [ <integer> | <integer>-<integer> ]`
>>>>>>> +
>>>>>>> +> Default: `All available LLC colors`
>>>>>>> +
>>>>>>> +Specify dom0 LLC color configuration. This option is available only when
>>>>>>> +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set, all available
>>>>>>> +colors are used.
>>>>>>
>>>>>> My reservation towards this being a top-level option remains.
>>>>>
>>>>> How can I turn this into a lower-level option? Moving it into "dom0=" doesn't
>>>>> seem possible to me. How can I express a list (llc-colors) inside another list
>>>>> (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop parsing
>>>>> before reaching other-param?
>>>>
>>>> For example by using a different separator:
>>>>
>>>> dom0=llc-colors=0-3+12-15,other-param=...
>>>
>>> Ok, but that would mean to change the implementation of the parsing function
>>> and to adopt this syntax also in other places, something that I would've
>>> preferred to avoid. Anyway I'll follow your suggestion.
>>
>> Well, this is all used by Arm only for now. You will want to make sure Arm
>> folks are actually okay with this alternative approach.
>>
>> Jan
> 
> Are you Arm maintainers ok with this?
I'm not a fan of this syntax and I find it more difficult to parse compared to the usual case, where
every option is clearly separated. That said, I won't oppose it.
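
For reference, the separator problem discussed above can be sketched as
follows. This is only an illustration, not the parser used by the series:
parse_color_list() and its arguments are hypothetical names. With '+'
between entries, parsing stops cleanly at the ',' that precedes the next
dom0= sub-option; with ',' as the separator the parser runs into
"other-param" and fails.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not Xen's parser): parse a list of colors such as
 * "0-3+12-15", where `sep` separates entries and '-' denotes a range.
 * Parsing stops at the first character after an entry that is not `sep`,
 * so a trailing ",other-param" is left for the caller via `end`.
 * Returns the number of colors stored, or -1 on malformed input or
 * when more than `max` colors are requested.
 */
static int parse_color_list(const char *s, char sep, unsigned int *colors,
                            unsigned int max, const char **end)
{
    unsigned int n = 0;

    assert(s && colors && end);

    for ( ; ; )
    {
        char *next;
        unsigned long lo, hi;

        /* Every entry must start with a digit. */
        if ( *s < '0' || *s > '9' )
            return -1;
        lo = hi = strtoul(s, &next, 10);
        s = next;

        /* Optional "-<integer>" turns the entry into a range. */
        if ( *s == '-' )
        {
            s++;
            if ( *s < '0' || *s > '9' )
                return -1;
            hi = strtoul(s, &next, 10);
            s = next;
        }

        if ( hi < lo || hi > UINT_MAX )
            return -1;

        for ( ; lo <= hi; lo++ )
        {
            if ( n >= max )
                return -1;
            colors[n++] = (unsigned int)lo;
        }

        /* Anything other than the separator ends the list. */
        if ( *s != sep )
            break;
        s++;
    }

    *end = s;
    return (int)n;
}
```

Calling it on "0-3+12-15,other-param=1" with sep '+' yields eight colors
and leaves `end` pointing at the ','. The same call with sep ',' on
"0-3,12-15,other-param=1" returns -1, which is the ambiguity raised
earlier in the thread.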

~Michal



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-27 11:10                   ` Carlo Nonato
@ 2024-03-27 13:28                     ` Julien Grall
  2024-03-27 13:38                       ` Jan Beulich
  0 siblings, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-27 13:28 UTC (permalink / raw)
  To: Carlo Nonato, Jan Beulich
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi Carlo,

On 27/03/2024 11:10, Carlo Nonato wrote:
> Hi guys,
> 
>> Question is: How would you justify such a change? IOW I'm not convinced
>> (yet) this wants doing there.
> 
> You mean in this series?
> 
>> Looking at the code, the flag is originally set in
>> alloc_domheap_pages(). So I guess it would make sense to do it in
>> free_domheap_pages().
> 
> We don't hold the heap_lock there.
Regardless of the safety question (I will answer below), count_info is
not meant to be protected by heap_lock. The lock protects the heap and
ensures we do not corrupt it when there are concurrent calls to
free_heap_pages().

 > Is it safe to change count_info without it?

count_info is meant to be accessed locklessly. The flag PGC_extra cannot
be set/cleared concurrently because you can't allocate a page that has
not yet been freed.

So it would be fine to use clear_bit(..., ...);
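
To illustrate what I mean by an atomic clear, here is a minimal
standalone sketch. It uses C11 <stdatomic.h> rather than Xen's own
clear_bit(), and struct demo_page, DEMO_PGC_EXTRA and DEMO_PGC_OTHER are
made-up stand-ins, not the real definitions:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-ins; not Xen's definitions or bit positions. */
#define DEMO_PGC_EXTRA  (1UL << 0)
#define DEMO_PGC_OTHER  (1UL << 1)

struct demo_page {
    _Atomic unsigned long count_info;
};

/*
 * Atomically clear one flag. The read-modify-write happens as a single
 * operation, so a concurrent atomic update of another bit in the same
 * word cannot be lost.
 */
static void demo_clear_extra(struct demo_page *pg)
{
    assert(pg);
    atomic_fetch_and(&pg->count_info, ~DEMO_PGC_EXTRA);
}
```

Starting from a word with both demo flags set, demo_clear_extra() leaves
only DEMO_PGC_OTHER set.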

Cheers,

-- 
Julien Grall



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-27 13:28                     ` Julien Grall
@ 2024-03-27 13:38                       ` Jan Beulich
  2024-03-27 13:48                         ` Julien Grall
  0 siblings, 1 reply; 60+ messages in thread
From: Jan Beulich @ 2024-03-27 13:38 UTC (permalink / raw)
  To: Julien Grall, Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel

On 27.03.2024 14:28, Julien Grall wrote:
> Hi Carlo,
> 
> On 27/03/2024 11:10, Carlo Nonato wrote:
>> Hi guys,
>>
>>> Question is: How would you justify such a change? IOW I'm not convinced
>>> (yet) this wants doing there.
>>
>> You mean in this series?
>>
>>> Looking at the code, the flag is originally set in
>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>> free_domheap_pages().
>>
>> We don't hold the heap_lock there.
> Regardless of the safety question (I will answer below), count_info is 
> not meant to be protected by heap_lock. The lock is protecting the heap 
> and ensure we are not corrupting them when there are concurrent call to 
> free_heap_pages().
> 
>  > Is it safe to change count_info without it?
> 
> count_info is meant to be accessed locklessly. The flag PGC_extra cannot 
> be set/clear concurrently because you can't allocate a page that has not 
> yet been freed.
> 
> So it would be fine to use clear_bit(..., ...);

Actually we hardly ever use clear_bit() on count_info. Normally we use
ordinary C operators. Atomic (and otherwise lockless) updates are useful
only if done like this everywhere.
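
A small sketch of why mixing the two styles does not help. It replays,
on a single thread, one unlucky interleaving between a plain C update
(split into its load and its store) and an atomic update of a different
bit in the same word. C11 atomics are used only to make the example
standalone, and DEMO_FLAG_A / DEMO_FLAG_B are made-up names:

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical flags sharing one word; not Xen's definitions. */
#define DEMO_FLAG_A (1UL << 0)
#define DEMO_FLAG_B (1UL << 1)

/*
 * Replays one interleaving by hand: the "plain" updater wants to clear
 * A with an ordinary "word &= ~A", which is a separate load and store.
 * Another CPU atomically sets B in between. Returns the final word.
 */
static unsigned long demo_lost_update(void)
{
    _Atomic unsigned long count_info = DEMO_FLAG_A;
    unsigned long snapshot;

    /* Plain updater: load half of "count_info &= ~A". */
    snapshot = atomic_load(&count_info);

    /* Other CPU: atomic "set B" lands in the window. */
    atomic_fetch_or(&count_info, DEMO_FLAG_B);

    /* Plain updater: store half, computed from the stale snapshot. */
    atomic_store(&count_info, snapshot & ~DEMO_FLAG_A);

    return atomic_load(&count_info);
}
```

The function returns 0: the atomic set of B is overwritten by the stale
store, even though that update itself was done atomically.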

Jan



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-27 13:38                       ` Jan Beulich
@ 2024-03-27 13:48                         ` Julien Grall
  2024-03-28 18:21                           ` Julien Grall
  0 siblings, 1 reply; 60+ messages in thread
From: Julien Grall @ 2024-03-27 13:48 UTC (permalink / raw)
  To: Jan Beulich, Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel



On 27/03/2024 13:38, Jan Beulich wrote:
> On 27.03.2024 14:28, Julien Grall wrote:
>> Hi Carlo,
>>
>> On 27/03/2024 11:10, Carlo Nonato wrote:
>>> Hi guys,
>>>
>>>> Question is: How would you justify such a change? IOW I'm not convinced
>>>> (yet) this wants doing there.
>>>
>>> You mean in this series?
>>>
>>>> Looking at the code, the flag is originally set in
>>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>>> free_domheap_pages().
>>>
>>> We don't hold the heap_lock there.
>> Regardless of the safety question (I will answer below), count_info is
>> not meant to be protected by heap_lock. The lock is protecting the heap
>> and ensure we are not corrupting them when there are concurrent call to
>> free_heap_pages().
>>
>>   > Is it safe to change count_info without it?
>>
>> count_info is meant to be accessed locklessly. The flag PGC_extra cannot
>> be set/clear concurrently because you can't allocate a page that has not
>> yet been freed.
>>
>> So it would be fine to use clear_bit(..., ...);
> 
> Actually we hardly ever use clear_bit() on count_info. Normally we use
> ordinary C operators.

I knew you would say that. I am not convinced it is safe to always use
count_info without any atomic operations. But I never got around to
checking all of them.

> Atomic (and otherwise lockless) updates are useful
> only if done like this everywhere.

You are right. But starting to use the bitops is not going to hurt
anyone (other than maybe performance, but once we convert all of them,
this will become moot). In fact, it helps us start to slowly move
towards the aim of making count_info safe.

Cheers,

-- 
Julien Grall



* Re: [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro
  2024-03-27 13:48                         ` Julien Grall
@ 2024-03-28 18:21                           ` Julien Grall
  0 siblings, 0 replies; 60+ messages in thread
From: Julien Grall @ 2024-03-28 18:21 UTC (permalink / raw)
  To: Jan Beulich, Carlo Nonato
  Cc: Andrew Cooper, George Dunlap, Stefano Stabellini, Wei Liu,
	Roger Pau Monné,
	xen-devel

Hi,

Replying to myself.

On 27/03/2024 13:48, Julien Grall wrote:
> On 27/03/2024 13:38, Jan Beulich wrote:
>> On 27.03.2024 14:28, Julien Grall wrote:
>>> Hi Carlo,
>>>
>>> On 27/03/2024 11:10, Carlo Nonato wrote:
>>>> Hi guys,
>>>>
>>>>> Question is: How would you justify such a change? IOW I'm not 
>>>>> convinced
>>>>> (yet) this wants doing there.
>>>>
>>>> You mean in this series?
>>>>
>>>>> Looking at the code, the flag is originally set in
>>>>> alloc_domheap_pages(). So I guess it would make sense to do it in
>>>>> free_domheap_pages().
>>>>
>>>> We don't hold the heap_lock there.
>>> Regardless of the safety question (I will answer below), count_info is
>>> not meant to be protected by heap_lock. The lock is protecting the heap
>>> and ensure we are not corrupting them when there are concurrent call to
>>> free_heap_pages().
>>>
>>>   > Is it safe to change count_info without it?
>>>
>>> count_info is meant to be accessed locklessly. The flag PGC_extra cannot
>>> be set/clear concurrently because you can't allocate a page that has not
>>> yet been freed.
>>>
>>> So it would be fine to use clear_bit(..., ...);
>>
>> Actually we hardly ever use clear_bit() on count_info. Normally we use
>> ordinary C operators.
> 
> I knew you would say that. I am not convince it is safe to always using 
> count_info without any atomic operations. But I never got around to 
> check all them.
> 
>> Atomic (and otherwise lockless) updates are useful
>> only if done like this everywhere.
> 
> You are right. But starting to use the bitops is not going to hurt 
> anyone (other than maybe performance, but once we convert all of them, 
> then this will become moot). In fact, it helps start to slowly move 
> towards the aim to have count_info safe.

I think I managed to convince myself that count_info |= ... is fine in
this case and no locking is necessary.

Cheers,

-- 
Julien Grall



* Re: [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support
  2024-03-27 11:56               ` Julien Grall
@ 2024-04-05  0:20                 ` Stefano Stabellini
  0 siblings, 0 replies; 60+ messages in thread
From: Stefano Stabellini @ 2024-04-05  0:20 UTC (permalink / raw)
  To: Julien Grall
  Cc: Carlo Nonato, Bertrand Marquis, Michal Orzel, Volodymyr Babchuk,
	Stefano Stabellini, Jan Beulich, Andrew Cooper, George Dunlap,
	Wei Liu, Marco Solieri, xen-devel, Andrea Bastoni


On Wed, 27 Mar 2024, Julien Grall wrote:
> Hi,
> 
> On 27/03/2024 11:39, Carlo Nonato wrote:
> > On Fri, Mar 22, 2024 at 8:26 AM Jan Beulich <jbeulich@suse.com> wrote:
> > > 
> > > On 21.03.2024 18:31, Carlo Nonato wrote:
> > > > On Thu, Mar 21, 2024 at 4:57 PM Jan Beulich <jbeulich@suse.com> wrote:
> > > > > 
> > > > > On 21.03.2024 16:04, Carlo Nonato wrote:
> > > > > > On Tue, Mar 19, 2024 at 4:30 PM Jan Beulich <jbeulich@suse.com>
> > > > > > wrote:
> > > > > > > On 15.03.2024 11:58, Carlo Nonato wrote:
> > > > > > > > --- a/docs/misc/xen-command-line.pandoc
> > > > > > > > +++ b/docs/misc/xen-command-line.pandoc
> > > > > > > > @@ -963,6 +963,15 @@ Controls for the dom0 IOMMU setup.
> > > > > > > > 
> > > > > > > >   Specify a list of IO ports to be excluded from dom0 access.
> > > > > > > > 
> > > > > > > > +### dom0-llc-colors
> > > > > > > > +> `= List of [ <integer> | <integer>-<integer> ]`
> > > > > > > > +
> > > > > > > > +> Default: `All available LLC colors`
> > > > > > > > +
> > > > > > > > +Specify dom0 LLC color configuration. This option is available
> > > > > > > > only when
> > > > > > > > +`CONFIG_LLC_COLORING` is enabled. If the parameter is not set,
> > > > > > > > all available
> > > > > > > > +colors are used.
> > > > > > > 
> > > > > > > My reservation towards this being a top-level option remains.
> > > > > > 
> > > > > > How can I turn this into a lower-level option? Moving it into
> > > > > > "dom0=" doesn't
> > > > > > seem possible to me. How can I express a list (llc-colors) inside
> > > > > > another list
> > > > > > (dom0)? dom0=llc-colors=0-3,12-15,other-param=... How can I stop
> > > > > > parsing
> > > > > > before reaching other-param?
> > > > > 
> > > > > For example by using a different separator:
> > > > > 
> > > > > dom0=llc-colors=0-3+12-15,other-param=...
> > > > 
> > > > Ok, but that would mean to change the implementation of the parsing
> > > > function
> > > > and to adopt this syntax also in other places, something that I would've
> > > > preferred to avoid. Anyway I'll follow your suggestion.
> > > 
> > > Well, this is all used by Arm only for now. You will want to make sure Arm
> > > folks are actually okay with this alternative approach.
> > > 
> > > Jan
> > 
> > Are you Arm maintainers ok with this?
> 
> Unfortunately no. I find the use of + and "nested" = extremely confusing to
> read.
> 
> There might be other symbols to use (e.g. s/=/:/ s#+#/#), but I am not
> entirely sure of the value of trying to cram all the options under a single
> top-level parameter. So right now, I would prefer if we stick with the
> existing approach (i.e. introducing dom0-llc-colors).

I also prefer the original, as suggested in this version of the patch.


end of thread, other threads:[~2024-04-05  0:20 UTC | newest]

Thread overview: 60+ messages
2024-03-15 10:58 [PATCH v7 00/14] Arm cache coloring Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 01/14] xen/common: add cache coloring common code Carlo Nonato
2024-03-15 11:39   ` Carlo Nonato
2024-03-19 14:58   ` Jan Beulich
2024-03-21 15:03     ` Carlo Nonato
2024-03-21 15:53       ` Jan Beulich
2024-03-21 17:22         ` Carlo Nonato
2024-03-22  7:25           ` Jan Beulich
2024-03-15 10:58 ` [PATCH v7 02/14] xen/arm: add initial support for LLC coloring on arm64 Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 03/14] xen/arm: permit non direct-mapped Dom0 construction Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 04/14] xen/arm: add Dom0 cache coloring support Carlo Nonato
2024-03-19 15:30   ` Jan Beulich
2024-03-21 15:04     ` Carlo Nonato
2024-03-21 15:57       ` Jan Beulich
2024-03-21 17:31         ` Carlo Nonato
2024-03-22  7:26           ` Jan Beulich
2024-03-27 11:39             ` Carlo Nonato
2024-03-27 11:56               ` Julien Grall
2024-04-05  0:20                 ` Stefano Stabellini
2024-03-27 11:57               ` Michal Orzel
2024-03-19 15:45   ` Jan Beulich
2024-03-15 10:58 ` [PATCH v7 05/14] xen: extend domctl interface for cache coloring Carlo Nonato
2024-03-19 15:37   ` Jan Beulich
2024-03-21 15:11     ` Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 06/14] tools: add support for cache coloring configuration Carlo Nonato
2024-03-25 10:55   ` Anthony PERARD
2024-03-25 11:44     ` Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 07/14] xen/arm: add support for cache coloring configuration via device-tree Carlo Nonato
2024-03-19 15:41   ` Jan Beulich
2024-03-21 15:12     ` Carlo Nonato
2024-03-15 10:58 ` [PATCH v7 08/14] xen/page_alloc: introduce preserved page flags macro Carlo Nonato
2024-03-19 15:47   ` Jan Beulich
2024-03-21 16:07   ` Julien Grall
2024-03-21 16:10     ` Julien Grall
2024-03-21 16:22       ` Jan Beulich
2024-03-22 15:07         ` Carlo Nonato
2024-03-25  7:19           ` Jan Beulich
2024-03-26 16:39             ` Carlo Nonato
2024-03-26 17:04               ` Jan Beulich
2024-03-26 21:55                 ` Julien Grall
2024-03-27 11:10                   ` Carlo Nonato
2024-03-27 13:28                     ` Julien Grall
2024-03-27 13:38                       ` Jan Beulich
2024-03-27 13:48                         ` Julien Grall
2024-03-28 18:21                           ` Julien Grall
2024-03-15 10:58 ` [PATCH v7 09/14] xen/page_alloc: introduce page flag to stop buddy merging Carlo Nonato
2024-03-19 15:49   ` Jan Beulich
2024-03-15 10:58 ` [PATCH v7 10/14] xen: add cache coloring allocator for domains Carlo Nonato
2024-03-19 16:43   ` Jan Beulich
2024-03-21 15:36     ` Carlo Nonato
2024-03-21 16:03       ` Jan Beulich
2024-03-15 10:58 ` [PATCH v7 11/14] xen/arm: use domain memory to allocate p2m page tables Carlo Nonato
2024-03-15 10:59 ` [PATCH v7 12/14] xen/arm: add Xen cache colors command line parameter Carlo Nonato
2024-03-19 15:54   ` Jan Beulich
2024-03-21 15:36     ` Carlo Nonato
2024-03-15 10:59 ` [PATCH v7 13/14] xen/arm: make consider_modules() available for xen relocation Carlo Nonato
2024-03-15 10:59 ` [PATCH v7 14/14] xen/arm: add cache coloring support for Xen Carlo Nonato
2024-03-19 15:58   ` Jan Beulich
2024-03-19 16:15     ` Jan Beulich
2024-03-19 16:03   ` Jan Beulich
