* [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot
@ 2023-11-30  9:07 ` Shaoqin Huang
  0 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Shaoqin Huang, Alexandru Elisei, David Woodhouse, Eric Auger,
	kvm, Laurent Vivier, linuxppc-dev, Nadav Amit, Nico Boehr,
	Nikos Nikoleris, Thomas Huth

Hi,

I'm posting Alexandru's patch set[1] rebased on the latest branch, with the
conflicts resolved. There are no big changes compared to the original code.

As the previous version of this series was posted one year ago, let me first
recall its intention and what it does. You can follow the link[2] and read
the cover letter there.

When the series[1] was written, EFI support for arm64[3] had not yet been
merged into kvm-unit-tests, but it has been merged by now. Rebasing the
series[1] directly onto the latest branch breaks the EFI tests. This is
mainly because patch #15 ("arm/arm64: Enable the MMU early") moves
mmu_enable() out of setup_mmu(), which leaves the EFI tests without an
enabled MMU. So I made a small change to efi_mem_init() which makes the EFI
tests also enable the MMU early, and that makes them work.

Another change worth noting is in patch #17 ("arm/arm64: Perform dcache
maintenance"). efi_mem_init() disables the MMU, builds a new set of page
tables and re-enables the MMU. If asm_mmu_disable cleaned and invalidated
the data caches for the entire memory, we would not need to care about the
dcache: after the MMU is disabled, mmu_setup_early() re-enables it and takes
care of all the cache maintenance. But the situation changes with patch #18
("arm/arm64: Rework the cache maintenance in asm_mmu_disable"), which only
cleans and invalidates the data caches for the stack memory area. So we need
to clean and invalidate the data caches manually before disabling the MMU.
I'm not confident about the current cache maintenance on the EFI setup path,
so I would appreciate your help in reviewing whether it is correct.
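
To make the intended ordering easier to review, here is a minimal sketch of
the EFI setup path as I understand it (the dcache helper and the
table-building step are illustrative names, not the exact code):

	/*
	 * Illustrative only: clean+invalidate to the point of coherency the
	 * data that will be accessed while the MMU (and with it the dcache)
	 * is off, *before* turning the MMU off, because asm_mmu_disable now
	 * only maintains the stack.
	 */
	dcache_clean_inval_range(data_start, data_end);	/* hypothetical helper */
	asm_mmu_disable();
	/* ... build the new page tables, then: */
	mmu_setup_early();	/* re-enables the MMU, does the remaining maintenance */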

I also dropped one patch ("s390: Do not use the physical allocator") from [1],
since it caused the s390 tests to fail.

This series may still contain bugs, so I would really appreciate your review
so we can improve it together.

You can get the code from:

$ git clone https://gitlab.com/shahuang/kvm-unit-tests.git \
	-b arm-arm64-rework-cache-maintenance-at-boot-v1

[1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/arm-arm64-rework-cache-maintenance-at-boot-v2-wip2
[2] https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/
[3] https://patchwork.kernel.org/project/kvm/cover/20230530160924.82158-1-nikos.nikoleris@arm.com/

Changelog:
----------
RFC->v1:
  - Gathered Reviewed-by tags.
  - Various changes to commit messages and comments to hopefully make the code
    easier to understand.
  - Patch #8 ("lib/alloc_phys: Expand documentation with usage and limitations")
    is new.
  - Folded patch "arm: page.h: Add missing libcflat.h include" into #17
    ("arm/arm64: Perform dcache maintenance at boot").
  - Reordered the series to group patches that touch approximately the same code
    together - the patches that change the physical allocator come first,
    followed by the patches that change how the secondaries are brought online.
  - Fixed several nasty bugs where the r4 register was being clobbered in the arm
    assembly.
  - Unmap the early UART address if the DTB address does not match the early
    address.
  - Added dcache maintenance when a page table is modified with the MMU disabled.
  - Moved the cache maintenance when disabling the MMU to be executed before the
    MMU is disabled.
  - Rebased onto the latest branch, which has EFI support merged.
  - Made the EFI tests also enable the MMU early.
  - Added cache maintenance on the EFI setup path, especially before the MMU
    is disabled.

RFC: https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/

Alexandru Elisei (18):
  Makefile: Define __ASSEMBLY__ for assembly files
  powerpc: Replace the physical allocator with the page allocator
  lib/alloc_phys: Initialize align_min
  lib/alloc_phys: Consolidate allocate functions into memalign_early()
  lib/alloc_phys: Remove locking
  lib/alloc_phys: Remove allocation accounting
  lib/alloc_phys: Add callback to perform cache maintenance
  lib/alloc_phys: Expand documentation with usage and limitations
  arm/arm64: Zero secondary CPUs' stack
  arm/arm64: Allocate secondaries' stack using the page allocator
  arm/arm64: assembler.h: Replace size with end address for
    dcache_by_line_op
  arm/arm64: Add C functions for doing cache maintenance
  arm/arm64: Configure secondaries' stack before enabling the MMU
  arm/arm64: Use pgd_alloc() to allocate mmu_idmap
  arm/arm64: Enable the MMU early
  arm/arm64: Map the UART when creating the translation tables
  arm/arm64: Perform dcache maintenance at boot
  arm/arm64: Rework the cache maintenance in asm_mmu_disable

 Makefile                   |   5 +-
 arm/Makefile.arm           |   4 +-
 arm/Makefile.arm64         |   4 +-
 arm/Makefile.common        |   6 +-
 arm/cstart.S               |  71 +++++++++++++++------
 arm/cstart64.S             |  76 +++++++++++++++++------
 lib/alloc_phys.c           | 122 ++++++++++++-------------------------
 lib/alloc_phys.h           |  28 ++++++---
 lib/arm/asm/assembler.h    |  15 ++---
 lib/arm/asm/cacheflush.h   |   1 +
 lib/arm/asm/mmu-api.h      |   1 +
 lib/arm/asm/mmu.h          |   6 --
 lib/arm/asm/page.h         |   2 +
 lib/arm/asm/pgtable.h      |  39 ++++++++++--
 lib/arm/asm/thread_info.h  |   3 +-
 lib/arm/cache.S            |  89 +++++++++++++++++++++++++++
 lib/arm/io.c               |  31 ++++++++++
 lib/arm/io.h               |   3 +
 lib/arm/mmu.c              |  37 ++++++++---
 lib/arm/processor.c        |   1 -
 lib/arm/setup.c            |  82 +++++++++++++++++++++----
 lib/arm/smp.c              |   5 ++
 lib/arm64/asm/assembler.h  |  11 ++--
 lib/arm64/asm/cacheflush.h |  37 +++++++++++
 lib/arm64/asm/mmu.h        |   5 --
 lib/arm64/asm/pgtable.h    |  50 +++++++++++++--
 lib/arm64/cache.S          |  85 ++++++++++++++++++++++++++
 lib/arm64/processor.c      |   1 -
 lib/devicetree.c           |   2 +-
 lib/powerpc/setup.c        |   9 ++-
 powerpc/Makefile.common    |   1 +
 powerpc/cstart64.S         |   1 -
 powerpc/spapr_hcall.c      |   5 +-
 33 files changed, 642 insertions(+), 196 deletions(-)
 create mode 100644 lib/arm/asm/cacheflush.h
 create mode 100644 lib/arm/cache.S
 create mode 100644 lib/arm64/asm/cacheflush.h
 create mode 100644 lib/arm64/cache.S

-- 
2.40.1


* [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07   ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Nikos Nikoleris, Shaoqin Huang, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

From: Alexandru Elisei <alexandru.elisei@arm.com>

There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
whose functionality relies on the __ASSEMBLY__ preprocessor constant being
correctly defined. So far, kvm-unit-tests has relied on the assembly files to
define the constant before including any header files which depend on it.

Let's make sure that nobody gets this wrong and define it as a compiler
constant when compiling assembly files. __ASSEMBLY__ is now defined for all
.S files, even those that didn't set it explicitly before.
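
For context, this is the guard pattern used by those headers; with the
constant now supplied on the command line, the C-only parts are reliably
hidden from the assembler (example header, illustrative):

	#ifndef SOME_HEADER_H
	#define SOME_HEADER_H

	#define SOME_CONSTANT	16	/* usable from both C and assembly */

	#ifndef __ASSEMBLY__
	/* C-only declarations the assembler cannot parse */
	extern void some_function(unsigned long addr);
	#endif

	#endif /* SOME_HEADER_H */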

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 Makefile           | 5 ++++-
 arm/cstart.S       | 1 -
 arm/cstart64.S     | 1 -
 powerpc/cstart64.S | 1 -
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/Makefile b/Makefile
index 602910dd..27ed14e6 100644
--- a/Makefile
+++ b/Makefile
@@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
 
 autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
 
+AFLAGS  = $(CFLAGS)
+AFLAGS += -D__ASSEMBLY__
+
 LDFLAGS += -nostdlib $(no_pie) -z noexecstack
 
 $(libcflat): $(cflatobjs)
@@ -113,7 +116,7 @@ directories:
 	@mkdir -p $(OBJDIRS)
 
 %.o: %.S
-	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
+	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<
 
 -include */.*.d */*/.*.d
 
diff --git a/arm/cstart.S b/arm/cstart.S
index 3dd71ed9..b24ecabc 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -5,7 +5,6 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
-#define __ASSEMBLY__
 #include <auxinfo.h>
 #include <asm/assembler.h>
 #include <asm/thread_info.h>
diff --git a/arm/cstart64.S b/arm/cstart64.S
index bc2be45a..a8ad6dc8 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -5,7 +5,6 @@
  *
  * This work is licensed under the terms of the GNU GPL, version 2.
  */
-#define __ASSEMBLY__
 #include <auxinfo.h>
 #include <asm/asm-offsets.h>
 #include <asm/assembler.h>
diff --git a/powerpc/cstart64.S b/powerpc/cstart64.S
index 34e39341..fa32ef24 100644
--- a/powerpc/cstart64.S
+++ b/powerpc/cstart64.S
@@ -5,7 +5,6 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
-#define __ASSEMBLY__
 #include <asm/hcall.h>
 #include <asm/ppc_asm.h>
 #include <asm/rtas.h>
-- 
2.40.1


* [kvm-unit-tests PATCH v1 02/18] powerpc: Replace the physical allocator with the page allocator
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07   ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Laurent Vivier, Thomas Huth, kvm-ppc,
	linuxppc-dev, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The spapr_hcall test makes two page sized allocations using the physical
allocator. Replace the physical allocator with the page allocator, which
has more features (like support for freeing allocations), and would
allow for further simplification of the physical allocator.
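
As a small sketch of the difference (assuming the usual page allocator API
from lib/alloc_page.h):

	void *page = alloc_page();	/* one page-aligned, PAGE_SIZE allocation */
	/* ... use the page ... */
	free_page(page);		/* freeing is not possible with the physical allocator */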

CC: Laurent Vivier <lvivier@redhat.com>
CC: Thomas Huth <thuth@redhat.com>
CC: kvm-ppc@vger.kernel.org
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/powerpc/setup.c     | 9 ++++++---
 powerpc/Makefile.common | 1 +
 powerpc/spapr_hcall.c   | 5 +++--
 3 files changed, 10 insertions(+), 5 deletions(-)

diff --git a/lib/powerpc/setup.c b/lib/powerpc/setup.c
index 1be4c030..80fd38ae 100644
--- a/lib/powerpc/setup.c
+++ b/lib/powerpc/setup.c
@@ -15,6 +15,7 @@
 #include <devicetree.h>
 #include <alloc.h>
 #include <alloc_phys.h>
+#include <alloc_page.h>
 #include <argv.h>
 #include <asm/setup.h>
 #include <asm/page.h>
@@ -111,6 +112,7 @@ static void mem_init(phys_addr_t freemem_start)
 	struct mem_region primary, mem = {
 		.start = (phys_addr_t)-1,
 	};
+	phys_addr_t base, top;
 	int nr_regs, i;
 
 	nr_regs = dt_get_memory_params(regs, NR_MEM_REGIONS);
@@ -146,9 +148,10 @@ static void mem_init(phys_addr_t freemem_start)
 	__physical_start = mem.start;	/* PHYSICAL_START */
 	__physical_end = mem.end;	/* PHYSICAL_END */
 
-	phys_alloc_init(freemem_start, primary.end - freemem_start);
-	phys_alloc_set_minimum_alignment(__icache_bytes > __dcache_bytes
-					 ? __icache_bytes : __dcache_bytes);
+	base = PAGE_ALIGN(freemem_start) >> PAGE_SHIFT;
+	top = primary.end >> PAGE_SHIFT;
+	page_alloc_init_area(0, base, top);
+	page_alloc_ops_enable();
 }
 
 void setup(const void *fdt)
diff --git a/powerpc/Makefile.common b/powerpc/Makefile.common
index f8f47490..ae70443a 100644
--- a/powerpc/Makefile.common
+++ b/powerpc/Makefile.common
@@ -34,6 +34,7 @@ include $(SRCDIR)/scripts/asm-offsets.mak
 cflatobjs += lib/util.o
 cflatobjs += lib/getchar.o
 cflatobjs += lib/alloc_phys.o
+cflatobjs += lib/alloc_page.o
 cflatobjs += lib/alloc.o
 cflatobjs += lib/devicetree.o
 cflatobjs += lib/migrate.o
diff --git a/powerpc/spapr_hcall.c b/powerpc/spapr_hcall.c
index e9b5300a..77ab4187 100644
--- a/powerpc/spapr_hcall.c
+++ b/powerpc/spapr_hcall.c
@@ -8,6 +8,7 @@
 #include <libcflat.h>
 #include <util.h>
 #include <alloc.h>
+#include <alloc_page.h>
 #include <asm/hcall.h>
 #include <asm/processor.h>
 
@@ -58,8 +59,8 @@ static void test_h_page_init(int argc, char **argv)
 	if (argc > 1)
 		report_abort("Unsupported argument: '%s'", argv[1]);
 
-	dst = memalign(PAGE_SIZE, PAGE_SIZE);
-	src = memalign(PAGE_SIZE, PAGE_SIZE);
+	dst = alloc_page();
+	src = alloc_page();
 	if (!dst || !src)
 		report_abort("Failed to alloc memory");
 
-- 
2.40.1


* [kvm-unit-tests PATCH v1 03/18] lib/alloc_phys: Initialize align_min
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Commit 11c4715fbf87 ("alloc: implement free") changed align_min from a
static variable to a field of the alloc_ops struct and carried over the
initializer value of DEFAULT_MINIMUM_ALIGNMENT.

Commit 7e3e823b78c0 ("lib/alloc.h: remove align_min from struct alloc_ops")
removed the align_min field and changed it back to a static variable, but
missed initializing it. Change it back to being initialized with the value
DEFAULT_MINIMUM_ALIGNMENT.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.c | 7 +++----
 lib/alloc_phys.h | 2 --
 2 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/lib/alloc_phys.c b/lib/alloc_phys.c
index a4d2bf23..3a78d0ac 100644
--- a/lib/alloc_phys.c
+++ b/lib/alloc_phys.c
@@ -13,8 +13,6 @@
 
 #define PHYS_ALLOC_NR_REGIONS	256
 
-#define DEFAULT_MINIMUM_ALIGNMENT	32
-
 struct phys_alloc_region {
 	phys_addr_t base;
 	phys_addr_t size;
@@ -26,12 +24,13 @@ static int nr_regions;
 static struct spinlock lock;
 static phys_addr_t base, top;
 
+#define DEFAULT_MINIMUM_ALIGNMENT	32
+static size_t align_min = DEFAULT_MINIMUM_ALIGNMENT;
+
 static void *early_memalign(size_t alignment, size_t size);
 static struct alloc_ops early_alloc_ops = {
 	.memalign = early_memalign,
 };
-static size_t align_min;
-
 struct alloc_ops *alloc_ops = &early_alloc_ops;
 
 void phys_alloc_show(void)
diff --git a/lib/alloc_phys.h b/lib/alloc_phys.h
index 611aa70d..8049c340 100644
--- a/lib/alloc_phys.h
+++ b/lib/alloc_phys.h
@@ -15,8 +15,6 @@
  */
 #include "libcflat.h"
 
-#define DEFAULT_MINIMUM_ALIGNMENT 32
-
 /*
  * phys_alloc_init creates the initial free memory region of size @size
  * at @base. The minimum alignment is set to DEFAULT_MINIMUM_ALIGNMENT.
-- 
2.40.1


* [kvm-unit-tests PATCH v1 04/18] lib/alloc_phys: Consolidate allocate functions into memalign_early()
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

phys_alloc_aligned_safe() is called only by early_memalign() and the safe
parameter is always true. In the spirit of simplifying the code, merge the
two functions together. Rename it to memalign_early(), to match the naming
scheme used by the page allocator.

Change the type of top_safe to phys_addr_t, to match the type of the top
and base variables describing the available physical memory; this is a
cosmetic change only, since libcflat.h defines phys_addr_t as an alias
for u64.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.c | 40 +++++++++++++++-------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/lib/alloc_phys.c b/lib/alloc_phys.c
index 3a78d0ac..65c860cb 100644
--- a/lib/alloc_phys.c
+++ b/lib/alloc_phys.c
@@ -27,9 +27,9 @@ static phys_addr_t base, top;
 #define DEFAULT_MINIMUM_ALIGNMENT	32
 static size_t align_min = DEFAULT_MINIMUM_ALIGNMENT;
 
-static void *early_memalign(size_t alignment, size_t size);
+static void *memalign_early(size_t alignment, size_t sz);
 static struct alloc_ops early_alloc_ops = {
-	.memalign = early_memalign,
+	.memalign = memalign_early,
 };
 struct alloc_ops *alloc_ops = &early_alloc_ops;
 
@@ -66,21 +66,24 @@ void phys_alloc_set_minimum_alignment(phys_addr_t align)
 	spin_unlock(&lock);
 }
 
-static phys_addr_t phys_alloc_aligned_safe(phys_addr_t size,
-					   phys_addr_t align, bool safe)
+static void *memalign_early(size_t alignment, size_t sz)
 {
 	static bool warned = false;
-	phys_addr_t addr, size_orig = size;
-	u64 top_safe;
+	phys_addr_t align = (phys_addr_t)alignment;
+	phys_addr_t size = (phys_addr_t)sz;
+	phys_addr_t size_orig = size;
+	phys_addr_t addr, top_safe;
+
+	assert(align && !(align & (align - 1)));
 
 	spin_lock(&lock);
 
 	top_safe = top;
 
-	if (safe && sizeof(long) == 4)
+	if (sizeof(long) == 4)
 		top_safe = MIN(top_safe, 1ULL << 32);
-
 	assert(base < top_safe);
+
 	if (align < align_min)
 		align = align_min;
 
@@ -92,10 +95,10 @@ static phys_addr_t phys_alloc_aligned_safe(phys_addr_t size,
 		       " (align=%#" PRIx64 "), "
 		       "need=%#" PRIx64 ", but free=%#" PRIx64 ". "
 		       "top=%#" PRIx64 ", top_safe=%#" PRIx64 "\n",
-		       (u64)size_orig, (u64)align, (u64)size, top_safe - base,
-		       (u64)top, top_safe);
+		       (u64)size_orig, (u64)align, (u64)size,
+		       (u64)(top_safe - base), (u64)top, (u64)top_safe);
 		spin_unlock(&lock);
-		return INVALID_PHYS_ADDR;
+		return NULL;
 	}
 
 	base += size;
@@ -112,7 +115,7 @@ static phys_addr_t phys_alloc_aligned_safe(phys_addr_t size,
 
 	spin_unlock(&lock);
 
-	return addr;
+	return phys_to_virt(addr);
 }
 
 void phys_alloc_get_unused(phys_addr_t *p_base, phys_addr_t *p_top)
@@ -128,16 +131,3 @@ void phys_alloc_get_unused(phys_addr_t *p_base, phys_addr_t *p_top)
 	base = top;
 	spin_unlock(&lock);
 }
-
-static void *early_memalign(size_t alignment, size_t size)
-{
-	phys_addr_t addr;
-
-	assert(alignment && !(alignment & (alignment - 1)));
-
-	addr = phys_alloc_aligned_safe(size, alignment, true);
-	if (addr == INVALID_PHYS_ADDR)
-		return NULL;
-
-	return phys_to_virt(addr);
-}
-- 
2.40.1


* [kvm-unit-tests PATCH v1 05/18] lib/alloc_phys: Remove locking
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

powerpc and s390 do not use the physical allocator.

arm and arm64 initialize the physical allocator in the setup code, only to
drain it immediately afterwards to initialize the page allocator.

x86 calls setup_vm() before any test that allocates memory, which similarly
drains the physical allocator to initialize the page allocator.

The setup code runs on a single core, which means that there is no need to
protect the internal data structures of the physical allocator against
concurrent accesses. Simplify the allocator by removing the locking.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.c | 14 --------------
 1 file changed, 14 deletions(-)

diff --git a/lib/alloc_phys.c b/lib/alloc_phys.c
index 65c860cb..064077f7 100644
--- a/lib/alloc_phys.c
+++ b/lib/alloc_phys.c
@@ -21,7 +21,6 @@ struct phys_alloc_region {
 static struct phys_alloc_region regions[PHYS_ALLOC_NR_REGIONS];
 static int nr_regions;
 
-static struct spinlock lock;
 static phys_addr_t base, top;
 
 #define DEFAULT_MINIMUM_ALIGNMENT	32
@@ -37,7 +36,6 @@ void phys_alloc_show(void)
 {
 	int i;
 
-	spin_lock(&lock);
 	printf("phys_alloc minimum alignment: %#" PRIx64 "\n", (u64)align_min);
 	for (i = 0; i < nr_regions; ++i)
 		printf("%016" PRIx64 "-%016" PRIx64 " [%s]\n",
@@ -46,24 +44,19 @@ void phys_alloc_show(void)
 			"USED");
 	printf("%016" PRIx64 "-%016" PRIx64 " [%s]\n",
 		(u64)base, (u64)(top - 1), "FREE");
-	spin_unlock(&lock);
 }
 
 void phys_alloc_init(phys_addr_t base_addr, phys_addr_t size)
 {
-	spin_lock(&lock);
 	base = base_addr;
 	top = base + size;
 	nr_regions = 0;
-	spin_unlock(&lock);
 }
 
 void phys_alloc_set_minimum_alignment(phys_addr_t align)
 {
 	assert(align && !(align & (align - 1)));
-	spin_lock(&lock);
 	align_min = align;
-	spin_unlock(&lock);
 }
 
 static void *memalign_early(size_t alignment, size_t sz)
@@ -76,8 +69,6 @@ static void *memalign_early(size_t alignment, size_t sz)
 
 	assert(align && !(align & (align - 1)));
 
-	spin_lock(&lock);
-
 	top_safe = top;
 
 	if (sizeof(long) == 4)
@@ -97,7 +88,6 @@ static void *memalign_early(size_t alignment, size_t sz)
 		       "top=%#" PRIx64 ", top_safe=%#" PRIx64 "\n",
 		       (u64)size_orig, (u64)align, (u64)size,
 		       (u64)(top_safe - base), (u64)top, (u64)top_safe);
-		spin_unlock(&lock);
 		return NULL;
 	}
 
@@ -113,8 +103,6 @@ static void *memalign_early(size_t alignment, size_t sz)
 		warned = true;
 	}
 
-	spin_unlock(&lock);
-
 	return phys_to_virt(addr);
 }
 
@@ -124,10 +112,8 @@ void phys_alloc_get_unused(phys_addr_t *p_base, phys_addr_t *p_top)
 	*p_top = top;
 	if (base == top)
 		return;
-	spin_lock(&lock);
 	regions[nr_regions].base = base;
 	regions[nr_regions].size = top - base;
 	++nr_regions;
 	base = top;
-	spin_unlock(&lock);
 }
-- 
2.40.1


* [kvm-unit-tests PATCH v1 06/18] lib/alloc_phys: Remove allocation accounting
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The page allocator has better allocation tracking and is used by all
architectures, while the physical allocator is never used for allocating
memory.

Simplify the physical allocator by removing allocation accounting. This
accomplishes two things:

1. It makes the allocator more useful, as the warning that was displayed
for each allocation after the 256th is removed. That can become an issue
if the allocator is used for creating the translation tables, for example.

2. It becomes trivial to add cache maintenance for the internal structures
that the physical allocator maintains, which are now only four static
variables.

phys_alloc_show() has received a slight change in the way it displays the
used and free regions: the end of the region is now non-inclusive, which
allows phys_alloc_show() to express that no memory has been used, or that no
memory is free; in both cases, the start and the end addresses are equal.
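
For illustration, an allocator that has used up all of its memory would now
be displayed like this (addresses made up):

	phys_alloc minimum alignment: 0x20
	0000000040000000-0000000048000000 [USED]
	0000000048000000-0000000048000000 [FREE]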

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.c | 63 +++++++++++++-----------------------------------
 lib/alloc_phys.h |  5 ++--
 2 files changed, 20 insertions(+), 48 deletions(-)

diff --git a/lib/alloc_phys.c b/lib/alloc_phys.c
index 064077f7..c96bcb48 100644
--- a/lib/alloc_phys.c
+++ b/lib/alloc_phys.c
@@ -11,17 +11,11 @@
 #include "asm/io.h"
 #include "alloc_phys.h"
 
-#define PHYS_ALLOC_NR_REGIONS	256
-
-struct phys_alloc_region {
-	phys_addr_t base;
-	phys_addr_t size;
-};
-
-static struct phys_alloc_region regions[PHYS_ALLOC_NR_REGIONS];
-static int nr_regions;
-
-static phys_addr_t base, top;
+/*
+ * used is the end address of the currently allocated memory, non-inclusive.
+ * used equals top means that all memory has been allocated.
+ */
+static phys_addr_t base, used, top;
 
 #define DEFAULT_MINIMUM_ALIGNMENT	32
 static size_t align_min = DEFAULT_MINIMUM_ALIGNMENT;
@@ -34,23 +28,15 @@ struct alloc_ops *alloc_ops = &early_alloc_ops;
 
 void phys_alloc_show(void)
 {
-	int i;
-
 	printf("phys_alloc minimum alignment: %#" PRIx64 "\n", (u64)align_min);
-	for (i = 0; i < nr_regions; ++i)
-		printf("%016" PRIx64 "-%016" PRIx64 " [%s]\n",
-			(u64)regions[i].base,
-			(u64)(regions[i].base + regions[i].size - 1),
-			"USED");
-	printf("%016" PRIx64 "-%016" PRIx64 " [%s]\n",
-		(u64)base, (u64)(top - 1), "FREE");
+	printf("%016" PRIx64 "-%016" PRIx64 " [USED]\n", (u64)base, (u64)used);
+	printf("%016" PRIx64 "-%016" PRIx64 " [FREE]\n", (u64)used, (u64)top);
 }
 
 void phys_alloc_init(phys_addr_t base_addr, phys_addr_t size)
 {
-	base = base_addr;
+	used = base = base_addr;
 	top = base + size;
-	nr_regions = 0;
 }
 
 void phys_alloc_set_minimum_alignment(phys_addr_t align)
@@ -61,7 +47,6 @@ void phys_alloc_set_minimum_alignment(phys_addr_t align)
 
 static void *memalign_early(size_t alignment, size_t sz)
 {
-	static bool warned = false;
 	phys_addr_t align = (phys_addr_t)alignment;
 	phys_addr_t size = (phys_addr_t)sz;
 	phys_addr_t size_orig = size;
@@ -70,7 +55,6 @@ static void *memalign_early(size_t alignment, size_t sz)
 	assert(align && !(align & (align - 1)));
 
 	top_safe = top;
-
 	if (sizeof(long) == 4)
 		top_safe = MIN(top_safe, 1ULL << 32);
 	assert(base < top_safe);
@@ -78,42 +62,29 @@ static void *memalign_early(size_t alignment, size_t sz)
 	if (align < align_min)
 		align = align_min;
 
-	addr = ALIGN(base, align);
-	size += addr - base;
+	addr = ALIGN(used, align);
+	size += addr - used;
 
-	if ((top_safe - base) < size) {
+	if (size > top_safe - used) {
 		printf("phys_alloc: requested=%#" PRIx64
 		       " (align=%#" PRIx64 "), "
 		       "need=%#" PRIx64 ", but free=%#" PRIx64 ". "
 		       "top=%#" PRIx64 ", top_safe=%#" PRIx64 "\n",
 		       (u64)size_orig, (u64)align, (u64)size,
-		       (u64)(top_safe - base), (u64)top, (u64)top_safe);
+		       (u64)(top_safe - used), (u64)top, (u64)top_safe);
 		return NULL;
 	}
 
-	base += size;
-
-	if (nr_regions < PHYS_ALLOC_NR_REGIONS) {
-		regions[nr_regions].base = addr;
-		regions[nr_regions].size = size_orig;
-		++nr_regions;
-	} else if (!warned) {
-		printf("WARNING: phys_alloc: No free log entries, "
-		       "can no longer log allocations...\n");
-		warned = true;
-	}
+	used += size;
 
 	return phys_to_virt(addr);
 }
 
 void phys_alloc_get_unused(phys_addr_t *p_base, phys_addr_t *p_top)
 {
-	*p_base = base;
+	*p_base = used;
 	*p_top = top;
-	if (base == top)
-		return;
-	regions[nr_regions].base = base;
-	regions[nr_regions].size = top - base;
-	++nr_regions;
-	base = top;
+
+	/* Empty allocator. */
+	used = top;
 }
diff --git a/lib/alloc_phys.h b/lib/alloc_phys.h
index 8049c340..4d350f01 100644
--- a/lib/alloc_phys.h
+++ b/lib/alloc_phys.h
@@ -29,8 +29,9 @@ extern void phys_alloc_set_minimum_alignment(phys_addr_t align);
 
 /*
  * phys_alloc_show outputs all currently allocated regions with the
- * following format
- *   <start_addr>-<end_addr> [<USED|FREE>]
+ * following format, where <end_addr> is non-inclusive:
+ *
+ * <start_addr>-<end_addr> [<USED|FREE>]
  */
 extern void phys_alloc_show(void);
 
-- 
2.40.1


* [kvm-unit-tests PATCH v1 07/18] lib/alloc_phys: Add callback to perform cache maintenance
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Some architectures, like arm and arm64, require explicit cache maintenance
to keep the caches in sync with memory when toggling the caches. Add a
function to perform the required maintenance on the internal structures that
the allocator maintains.
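
A minimal usage sketch for arm64 (the callback below is illustrative; the
series wires in its own helpers in a later patch):

	/* Clean+invalidate to PoC the cache line containing addr. */
	static void dcache_clean_inval_line(unsigned long addr)
	{
		asm volatile("dc civac, %0" : : "r" (addr) : "memory");
	}

	static void flush_phys_alloc_state(void)
	{
		phys_alloc_perform_cache_maintenance(dcache_clean_inval_line);
	}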

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.c | 8 ++++++++
 lib/alloc_phys.h | 8 ++++++++
 2 files changed, 16 insertions(+)

diff --git a/lib/alloc_phys.c b/lib/alloc_phys.c
index c96bcb48..5d5487f2 100644
--- a/lib/alloc_phys.c
+++ b/lib/alloc_phys.c
@@ -45,6 +45,14 @@ void phys_alloc_set_minimum_alignment(phys_addr_t align)
 	align_min = align;
 }
 
+void phys_alloc_perform_cache_maintenance(cache_maint_fn maint_fn)
+{
+	maint_fn((unsigned long)&base);
+	maint_fn((unsigned long)&used);
+	maint_fn((unsigned long)&top);
+	maint_fn((unsigned long)&align_min);
+}
+
 static void *memalign_early(size_t alignment, size_t sz)
 {
 	phys_addr_t align = (phys_addr_t)alignment;
diff --git a/lib/alloc_phys.h b/lib/alloc_phys.h
index 4d350f01..86b3d021 100644
--- a/lib/alloc_phys.h
+++ b/lib/alloc_phys.h
@@ -15,6 +15,8 @@
  */
 #include "libcflat.h"
 
+typedef void (*cache_maint_fn)(unsigned long addr);
+
 /*
  * phys_alloc_init creates the initial free memory region of size @size
  * at @base. The minimum alignment is set to DEFAULT_MINIMUM_ALIGNMENT.
@@ -27,6 +29,12 @@ extern void phys_alloc_init(phys_addr_t base, phys_addr_t size);
  */
 extern void phys_alloc_set_minimum_alignment(phys_addr_t align);
 
+/*
+ * Perform cache maintenance on the internal structures that the physical
+ * allocator maintains.
+ */
+extern void phys_alloc_perform_cache_maintenance(cache_maint_fn maint_fn);
+
 /*
  * phys_alloc_show outputs all currently allocated regions with the
  * following format, where <end_addr> is non-inclusive:
-- 
2.40.1


* [kvm-unit-tests PATCH v1 08/18] lib/alloc_phys: Expand documentation with usage and limitations
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The physical allocator has gotten simpler; document its limitations and its
current and expected usage.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/alloc_phys.h | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/lib/alloc_phys.h b/lib/alloc_phys.h
index 86b3d021..861959cf 100644
--- a/lib/alloc_phys.h
+++ b/lib/alloc_phys.h
@@ -4,10 +4,17 @@
  * phys_alloc is a very simple allocator which allows physical memory
  * to be partitioned into regions until all memory is allocated.
  *
- * Note: This is such a simple allocator that there is no way to free
- * a region. For more complicated memory management a single region
- * can be allocated, but then have its memory managed by a more
- * sophisticated allocator, e.g. a page allocator.
+ * Note: This is such a simple allocator that there is no way to free a
+ * region, and concurrent allocations are not supported. As such, it is
+ * mostly suitable for the architecture setup code, and less so for
+ * allocating memory in a test. For more complicated memory management a
+ * single region can be allocated (or the entire free memory), but then
+ * have that memory managed by a more sophisticated allocator, e.g. the
+ * page or the vmalloc allocators.
+ *
+ * Because of its simplicity, phys_alloc can easily perform cache
+ * maintenance on the state tracking variables it maintains, making it
+ * suitable for architectures which require such operations.
  *
  * Copyright (C) 2014, Red Hat Inc, Andrew Jones <drjones@redhat.com>
  *
-- 
2.40.1


* [kvm-unit-tests PATCH v1 09/18] arm/arm64: Zero secondary CPUs' stack
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Nikos Nikoleris, Shaoqin Huang, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

For the boot CPU, the entire stack is zeroed in the entry code. For the
secondaries, only struct thread_info, which lives at the bottom of the
stack, is zeroed in thread_info_init().

Be consistent and zero the entire stack for the secondaries. This should
also improve reproducibility of the test suite, as all the stacks now start
with the same contents, which is zero. And now that all the stacks are
zeroed in the entry code, there is no need to explicitly zero struct
thread_info in thread_info_init().
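
In C terms, what the entry code does for a secondary amounts to the
following (illustrative; the real code is in assembly and runs before the
stack is first used, so overwriting live stack data is not a concern):

	unsigned long base = stack_top & ~(THREAD_SIZE - 1UL);

	memset((void *)base, 0, THREAD_SIZE);

This works because THREAD_SIZE is a power of two and the stacks are
THREAD_SIZE-aligned, so masking the stack pointer yields the stack base.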

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 arm/cstart.S          |  6 ++++++
 arm/cstart64.S        | 11 +++++++----
 lib/arm/processor.c   |  1 -
 lib/arm64/processor.c |  1 -
 4 files changed, 13 insertions(+), 6 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index b24ecabc..2ecebd1d 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -151,7 +151,13 @@ secondary_entry:
 	 */
 	ldr	r1, =secondary_data
 	ldr	r0, [r1]
+	mov	r2, r0
+	lsr	r2, #THREAD_SHIFT
+	lsl	r2, #THREAD_SHIFT
+	add	r3, r2, #THREAD_SIZE
+	zero_range r2, r3, r4, r5
 	mov	sp, r0
+
 	bl	exceptions_init
 	bl	enable_vfp
 
diff --git a/arm/cstart64.S b/arm/cstart64.S
index a8ad6dc8..5ba2fb27 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -14,10 +14,6 @@
 #include <asm/thread_info.h>
 #include <asm/sysreg.h>
 
-#ifdef CONFIG_EFI
-#include "efi/crt0-efi-aarch64.S"
-#else
-
 .macro zero_range, tmp1, tmp2
 9998:	cmp	\tmp1, \tmp2
 	b.eq	9997f
@@ -26,6 +22,10 @@
 9997:
 .endm
 
+#ifdef CONFIG_EFI
+#include "efi/crt0-efi-aarch64.S"
+#else
+
 .section .init
 
 /*
@@ -162,6 +162,9 @@ secondary_entry:
 	/* set the stack */
 	adrp	x0, secondary_data
 	ldr	x0, [x0, :lo12:secondary_data]
+	and	x1, x0, #THREAD_MASK
+	add	x2, x1, #THREAD_SIZE
+	zero_range x1, x2
 	mov	sp, x0
 
 	/* finish init in C code */
diff --git a/lib/arm/processor.c b/lib/arm/processor.c
index 9d575968..ceff1c0a 100644
--- a/lib/arm/processor.c
+++ b/lib/arm/processor.c
@@ -117,7 +117,6 @@ void do_handle_exception(enum vector v, struct pt_regs *regs)
 
 void thread_info_init(struct thread_info *ti, unsigned int flags)
 {
-	memset(ti, 0, sizeof(struct thread_info));
 	ti->cpu = mpidr_to_cpu(get_mpidr());
 	ti->flags = flags;
 }
diff --git a/lib/arm64/processor.c b/lib/arm64/processor.c
index 5bcad679..a8ef8c59 100644
--- a/lib/arm64/processor.c
+++ b/lib/arm64/processor.c
@@ -233,7 +233,6 @@ void install_vector_handler(enum vector v, vector_fn fn)
 
 static void __thread_info_init(struct thread_info *ti, unsigned int flags)
 {
-	memset(ti, 0, sizeof(struct thread_info));
 	ti->cpu = mpidr_to_cpu(get_mpidr());
 	ti->flags = flags;
 }
-- 
2.40.1


* [kvm-unit-tests PATCH v1 10/18] arm/arm64: Allocate secondaries' stack using the page allocator
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The vmalloc allocator returns non-identity mapped addresses, where the
virtual address is different from the physical address. As a result, it's
impossible to access the stack of the secondary CPUs while the MMU is
disabled (if AUXINFO_MMU_OFF is set, if a test disables the MMU, or if an
exception happens on the secondary before the MMU is enabled).

It turns out that kvm-unit-tests always configures the stack size to be a
power-of-two multiple of PAGE_SIZE: on arm, THREAD_SIZE is 16K and
PAGE_SIZE is 4K; on arm64, THREAD_SIZE is 16K when PAGE_SIZE is 4K or 16K,
and 64K when PAGE_SIZE is 64K. Use memalign_pages_flags() as a drop-in
replacement for vmalloc's vm_memalign(), which is the value of
alloc_ops->memalign when the stack is allocated, as it has the following
benefits:

1. The secondary CPUs' stack can be used with the MMU off.

2. The secondary CPUs' stack is identity mapped, just like the stack of
the primary CPU, making the configuration of all the CPUs consistent.

3. start_usr(), which can take a new stack to use at EL0/in user mode, now
works if the function is called after the MMU has been disabled. This
doesn't affect the vectors-user test, as the only way to run the test with
the MMU disabled is by setting AUXINFO_MMU_OFF, in which case the vmalloc
allocator is not initialized and alloc_ops->memalign resolves to
memalign_pages().

memalign_pages_flags() has been used instead of memalign_pages() to
instruct the allocator not to zero the stack, as it's already zeroed in the
entry code.

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/asm/thread_info.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/lib/arm/asm/thread_info.h b/lib/arm/asm/thread_info.h
index eaa72582..190e082c 100644
--- a/lib/arm/asm/thread_info.h
+++ b/lib/arm/asm/thread_info.h
@@ -25,6 +25,7 @@
 #ifndef __ASSEMBLY__
 #include <asm/processor.h>
 #include <alloc.h>
+#include <alloc_page.h>
 
 #ifdef __arm__
 #include <asm/ptrace.h>
@@ -40,7 +41,7 @@
 
 static inline void *thread_stack_alloc(void)
 {
-	void *sp = memalign(THREAD_ALIGNMENT, THREAD_SIZE);
+	void *sp = memalign_pages_flags(THREAD_ALIGNMENT, THREAD_SIZE, FLAG_DONTZERO);
 	return sp + THREAD_START_SP;
 }
 
-- 
2.40.1


* [kvm-unit-tests PATCH v1 11/18] arm/arm64: assembler.h: Replace size with end address for dcache_by_line_op
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Nikos Nikoleris, Eric Auger, Shaoqin Huang, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Commit b5f659be4775 ("arm/arm64: Remove dcache_line_size global
variable") moved the dcache_by_line_op macro to assembler.h and changed
it to take the size of the region instead of the end address as
parameter. This was done to keep the file in sync with the upstream
Linux kernel implementation at the time.

But in both places where the macro is used, the code has the start and
end address of the region, and it has to compute the size to pass it to
dcache_by_line_op. Then the macro itself computes the end by adding size
to start.

Get rid of this massaging of parameters and change the macro to take the
end address as a parameter directly.

Besides slightly simplifying the code by removing two unneeded arithmetic
operations, this makes the macro compatible with the current upstream
version of Linux (which was similarly changed to take the end address in
commit 163d3f80695e ("arm64: dcache_by_line_op to take end parameter
instead of size")), which will allow us to reuse (part of) the Linux C
wrappers over the assembly macro.

The change has been tested with the same snippet of code used to test
commit 410b3bf09e76 ("arm/arm64: Perform dcache clean + invalidate after
turning MMU off").

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/cstart.S              |  1 -
 arm/cstart64.S            |  1 -
 lib/arm/asm/assembler.h   | 11 +++++------
 lib/arm64/asm/assembler.h | 11 +++++------
 4 files changed, 10 insertions(+), 14 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 2ecebd1d..98d61230 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -226,7 +226,6 @@ asm_mmu_disable:
 	ldr	r0, [r0]
 	ldr	r1, =__phys_end
 	ldr	r1, [r1]
-	sub	r1, r1, r0
 	dcache_by_line_op dccimvac, sy, r0, r1, r2, r3
 
 	mov     pc, lr
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 5ba2fb27..7fb44f42 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -264,7 +264,6 @@ asm_mmu_disable:
 	ldr	x0, [x0, :lo12:__phys_offset]
 	adrp	x1, __phys_end
 	ldr	x1, [x1, :lo12:__phys_end]
-	sub	x1, x1, x0
 	dcache_by_line_op civac, sy, x0, x1, x2, x3
 
 	ret
diff --git a/lib/arm/asm/assembler.h b/lib/arm/asm/assembler.h
index 4200252d..db5f0f55 100644
--- a/lib/arm/asm/assembler.h
+++ b/lib/arm/asm/assembler.h
@@ -25,17 +25,16 @@
 
 /*
  * Macro to perform a data cache maintenance for the interval
- * [addr, addr + size).
+ * [addr, end).
  *
  * 	op:		operation to execute
  * 	domain		domain used in the dsb instruction
  * 	addr:		starting virtual address of the region
- * 	size:		size of the region
- * 	Corrupts:	addr, size, tmp1, tmp2
+ * 	end:		the end of the region (non-inclusive)
+ * 	Corrupts:	addr, tmp1, tmp2
  */
-	.macro dcache_by_line_op op, domain, addr, size, tmp1, tmp2
+	.macro dcache_by_line_op op, domain, addr, end, tmp1, tmp2
 	dcache_line_size \tmp1, \tmp2
-	add	\size, \addr, \size
 	sub	\tmp2, \tmp1, #1
 	bic	\addr, \addr, \tmp2
 9998:
@@ -45,7 +44,7 @@
 	.err
 	.endif
 	add	\addr, \addr, \tmp1
-	cmp	\addr, \size
+	cmp	\addr, \end
 	blo	9998b
 	dsb	\domain
 	.endm
diff --git a/lib/arm64/asm/assembler.h b/lib/arm64/asm/assembler.h
index aa8c65a2..1e09d65a 100644
--- a/lib/arm64/asm/assembler.h
+++ b/lib/arm64/asm/assembler.h
@@ -28,25 +28,24 @@
 
 /*
  * Macro to perform a data cache maintenance for the interval
- * [addr, addr + size). Use the raw value for the dcache line size because
+ * [addr, end). Use the raw value for the dcache line size because
  * kvm-unit-tests has no concept of scheduling.
  *
  * 	op:		operation passed to dc instruction
  * 	domain:		domain used in dsb instruction
  * 	addr:		starting virtual address of the region
- * 	size:		size of the region
- * 	Corrupts:	addr, size, tmp1, tmp2
+ * 	end:		the end of the region (non-inclusive)
+ * 	Corrupts:	addr, tmp1, tmp2
  */
 
-	.macro dcache_by_line_op op, domain, addr, size, tmp1, tmp2
+	.macro dcache_by_line_op op, domain, addr, end, tmp1, tmp2
 	raw_dcache_line_size \tmp1, \tmp2
-	add	\size, \addr, \size
 	sub	\tmp2, \tmp1, #1
 	bic	\addr, \addr, \tmp2
 9998:
 	dc	\op, \addr
 	add	\addr, \addr, \tmp1
-	cmp	\addr, \size
+	cmp	\addr, \end
 	b.lo	9998b
 	dsb	\domain
 	.endm
-- 
2.40.1


* [kvm-unit-tests PATCH v1 12/18] arm/arm64: Add C functions for doing cache maintenance
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Shaoqin Huang, Eric Auger, Nikos Nikoleris, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Add C functions for doing cache maintenance, which will be used for
implementing the necessary maintenance when kvm-unit-tests boots.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 arm/Makefile.arm           |  4 +-
 arm/Makefile.arm64         |  4 +-
 arm/Makefile.common        |  6 +--
 lib/arm/asm/assembler.h    |  4 +-
 lib/arm/asm/cacheflush.h   |  1 +
 lib/arm/asm/mmu.h          |  6 ---
 lib/arm/cache.S            | 89 ++++++++++++++++++++++++++++++++++++++
 lib/arm64/asm/cacheflush.h | 37 ++++++++++++++++
 lib/arm64/asm/mmu.h        |  5 ---
 lib/arm64/cache.S          | 85 ++++++++++++++++++++++++++++++++++++
 10 files changed, 224 insertions(+), 17 deletions(-)
 create mode 100644 lib/arm/asm/cacheflush.h
 create mode 100644 lib/arm/cache.S
 create mode 100644 lib/arm64/asm/cacheflush.h
 create mode 100644 lib/arm64/cache.S

diff --git a/arm/Makefile.arm b/arm/Makefile.arm
index 7fd39f3a..0489aa92 100644
--- a/arm/Makefile.arm
+++ b/arm/Makefile.arm
@@ -26,7 +26,9 @@ endif
 define arch_elf_check =
 endef
 
-cstart.o = $(TEST_DIR)/cstart.o
+asmobjs  = $(TEST_DIR)/cstart.o
+asmobjs += lib/arm/cache.o
+
 cflatobjs += lib/arm/spinlock.o
 cflatobjs += lib/arm/processor.o
 cflatobjs += lib/arm/stack.o
diff --git a/arm/Makefile.arm64 b/arm/Makefile.arm64
index 960880f1..10561f34 100644
--- a/arm/Makefile.arm64
+++ b/arm/Makefile.arm64
@@ -21,8 +21,10 @@ define arch_elf_check =
 		$(error $(1) has unsupported reloc types))
 endef
 
-cstart.o = $(TEST_DIR)/cstart64.o
+asmobjs  = $(TEST_DIR)/cstart64.o
+asmobjs += lib/arm64/cache.o
 cflatobjs += lib/arm64/stack.o
+
 cflatobjs += lib/arm64/processor.o
 cflatobjs += lib/arm64/spinlock.o
 cflatobjs += lib/arm64/gic-v3-its.o lib/arm64/gic-v3-its-cmd.o
diff --git a/arm/Makefile.common b/arm/Makefile.common
index 5214c8ac..10aa13f4 100644
--- a/arm/Makefile.common
+++ b/arm/Makefile.common
@@ -71,7 +71,7 @@ FLATLIBS = $(libcflat) $(LIBFDT_archive) $(libeabi)
 
 ifeq ($(CONFIG_EFI),y)
 %.so: EFI_LDFLAGS += -defsym=EFI_SUBSYSTEM=0xa --no-undefined
-%.so: %.o $(FLATLIBS) $(SRCDIR)/arm/efi/elf_aarch64_efi.lds $(cstart.o)
+%.so: %.o $(FLATLIBS) $(SRCDIR)/arm/efi/elf_aarch64_efi.lds $(asmobjs)
 	$(CC) $(CFLAGS) -c -o $(@:.so=.aux.o) $(SRCDIR)/lib/auxinfo.c \
 		-DPROGNAME=\"$(@:.so=.efi)\" -DAUXFLAGS=$(AUXFLAGS)
 	$(LD) $(EFI_LDFLAGS) -o $@ -T $(SRCDIR)/arm/efi/elf_aarch64_efi.lds \
@@ -91,7 +91,7 @@ ifeq ($(CONFIG_EFI),y)
 		-O binary $^ $@
 else
 %.elf: LDFLAGS += $(arch_LDFLAGS)
-%.elf: %.o $(FLATLIBS) $(SRCDIR)/arm/flat.lds $(cstart.o)
+%.elf: %.o $(FLATLIBS) $(SRCDIR)/arm/flat.lds $(asmobjs)
 	$(CC) $(CFLAGS) -c -o $(@:.elf=.aux.o) $(SRCDIR)/lib/auxinfo.c \
 		-DPROGNAME=\"$(@:.elf=.flat)\" -DAUXFLAGS=$(AUXFLAGS)
 	$(LD) $(LDFLAGS) -o $@ -T $(SRCDIR)/arm/flat.lds \
@@ -113,4 +113,4 @@ arm_clean: asm_offsets_clean
 	      $(TEST_DIR)/.*.d $(TEST_DIR)/efi/.*.d lib/arm/.*.d
 
 generated-files = $(asm-offsets)
-$(tests-all:.$(exe)=.o) $(cstart.o) $(cflatobjs): $(generated-files)
+$(tests-all:.$(exe)=.o) $(asmobjs) $(cflatobjs): $(generated-files)
diff --git a/lib/arm/asm/assembler.h b/lib/arm/asm/assembler.h
index db5f0f55..e728bb21 100644
--- a/lib/arm/asm/assembler.h
+++ b/lib/arm/asm/assembler.h
@@ -41,7 +41,9 @@
 	.ifc	\op, dccimvac
 	mcr	p15, 0, \addr, c7, c14, 1
 	.else
-	.err
+	.ifc	\op, dccmvac
+	mcr	p15, 0, \addr, c7, c10, 1
+	.endif
 	.endif
 	add	\addr, \addr, \tmp1
 	cmp	\addr, \end
diff --git a/lib/arm/asm/cacheflush.h b/lib/arm/asm/cacheflush.h
new file mode 100644
index 00000000..42dc88a4
--- /dev/null
+++ b/lib/arm/asm/cacheflush.h
@@ -0,0 +1 @@
+#include "../../arm64/asm/cacheflush.h"
diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
index b24b97e5..6359ba64 100644
--- a/lib/arm/asm/mmu.h
+++ b/lib/arm/asm/mmu.h
@@ -45,12 +45,6 @@ static inline void flush_tlb_page(unsigned long vaddr)
 	isb();
 }
 
-static inline void flush_dcache_addr(unsigned long vaddr)
-{
-	/* DCCIMVAC */
-	asm volatile("mcr p15, 0, %0, c7, c14, 1" :: "r" (vaddr));
-}
-
 #include <asm/mmu-api.h>
 
 #endif /* _ASMARM_MMU_H_ */
diff --git a/lib/arm/cache.S b/lib/arm/cache.S
new file mode 100644
index 00000000..cdb6c168
--- /dev/null
+++ b/lib/arm/cache.S
@@ -0,0 +1,89 @@
+/*
+ * Based on arch/arm64/mm/cache.S
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012, 2022 ARM Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include <asm/assembler.h>
+
+/*
+ *      dcache_inval_poc(start, end)
+ *
+ *      Ensure that any D-cache lines for the interval [start, end) are
+ *      invalidated. Any partial lines at the ends of the interval are cleaned
+ *      and invalidated to PoC instead to prevent data loss.
+ *
+ *      - start   - start address of region
+ *      - end     - end address of region
+ */
+.global dcache_inval_poc
+dcache_inval_poc:
+	dmb	sy
+	dcache_line_size r2, r3
+	sub 	r3, r2, #1
+	tst	r1, r3			// end cache line aligned?
+	bic	r1, r1, r3
+	beq 	1f
+	// DCCIMVAC
+	mcr	p15, 0, r1, c7, c14, 1	// clean + invalidate end cache line
+1:	tst	r0, r3			// start cache line aligned?
+	bic	r0, r0, r3
+	beq	2f
+	mcr	p15, 0, r0, c7, c14, 1	// clean + invalidate start cache line
+	b	3f
+	// DCIMVAC
+2:	mcr	p15, 0, r0, c7, c6, 1	// invalidate current cache line
+3:	add	r0, r0, r2
+	cmp	r0, r1
+	blo	2b
+	dsb	sy
+	mov	pc, lr
+
+/*
+ *      dcache_clean_poc(start, end)
+ *
+ *      Ensure that any D-cache lines for the interval [start, end)
+ *      are cleaned to the PoC.
+ *
+ *      - start   - start address of region
+ *      - end     - end address of region
+ */
+.global dcache_clean_poc
+dcache_clean_poc:
+	dmb	sy
+	dcache_by_line_op dccmvac, sy, r0, r1, r2, r3
+	mov	pc, lr
+
+/*
+ *      dcache_clean_addr_poc(addr)
+ *
+ *      Ensure that the D-cache line for address addr is cleaned to the PoC.
+ *
+ *      - addr    - the address
+ */
+.global dcache_clean_addr_poc
+dcache_clean_addr_poc:
+	dmb	sy
+	// DCCMVAC
+	mcr 	p15, 0, r0, c7, c10, 1
+	dsb	sy
+	mov	pc, lr
+
+/*
+ *      dcache_clean_inval_addr_poc(addr)
+ *
+ *      Ensure that the D-cache line for address addr is cleaned and invalidated
+ *      to the PoC.
+ *
+ *      - addr    - the address
+ */
+.global dcache_clean_inval_addr_poc
+dcache_clean_inval_addr_poc:
+	dmb	sy
+	// DCCIMVAC
+	mcr 	p15, 0, r0, c7, c14, 1
+	dsb	sy
+	mov	pc, lr
diff --git a/lib/arm64/asm/cacheflush.h b/lib/arm64/asm/cacheflush.h
new file mode 100644
index 00000000..53a57820
--- /dev/null
+++ b/lib/arm64/asm/cacheflush.h
@@ -0,0 +1,37 @@
+#ifndef _ASMARM64_CACHEFLUSH_H_
+#define _ASMARM64_CACHEFLUSH_H_
+/*
+ * Based on arch/arm64/asm/include/cacheflush.h
+ *
+ * Copyright (C) 1999-2002 Russell King.
+ * Copyright (C) 2012,2022 ARM Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include <asm/page.h>
+
+extern void dcache_clean_addr_poc(unsigned long addr);
+/*
+ * Invalidating a specific address is dangerous, because it means invalidating
+ * everything that shares the same cache line. Do clean and invalidate instead,
+ * as the clean is harmless.
+ */
+extern void dcache_clean_inval_addr_poc(unsigned long addr);
+
+extern void dcache_inval_poc(unsigned long start, unsigned long end);
+extern void dcache_clean_poc(unsigned long start, unsigned long end);
+
+static inline void dcache_inval_page_poc(unsigned long page_addr)
+{
+	assert(PAGE_ALIGN(page_addr) == page_addr);
+	dcache_inval_poc(page_addr, page_addr + PAGE_SIZE);
+}
+
+static inline void dcache_clean_page_poc(unsigned long page_addr)
+{
+	assert(PAGE_ALIGN(page_addr) == page_addr);
+	dcache_clean_poc(page_addr, page_addr + PAGE_SIZE);
+}
+
+#endif /* _ASMARM64_CACHEFLUSH_H_ */
diff --git a/lib/arm64/asm/mmu.h b/lib/arm64/asm/mmu.h
index 5c27edb2..2ebbe925 100644
--- a/lib/arm64/asm/mmu.h
+++ b/lib/arm64/asm/mmu.h
@@ -28,11 +28,6 @@ static inline void flush_tlb_page(unsigned long vaddr)
 	isb();
 }
 
-static inline void flush_dcache_addr(unsigned long vaddr)
-{
-	asm volatile("dc civac, %0" :: "r" (vaddr));
-}
-
 #include <asm/mmu-api.h>
 
 #endif /* _ASMARM64_MMU_H_ */
diff --git a/lib/arm64/cache.S b/lib/arm64/cache.S
new file mode 100644
index 00000000..595b6d9a
--- /dev/null
+++ b/lib/arm64/cache.S
@@ -0,0 +1,85 @@
+/*
+ * Based on arch/arm64/mm/cache.S
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012, 2022 ARM Ltd.
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2.
+ */
+
+#include <asm/assembler.h>
+
+/*
+ *      dcache_inval_poc(start, end)
+ *
+ *      Ensure that any D-cache lines for the interval [start, end) are
+ *      invalidated. Any partial lines at the ends of the interval are cleaned
+ *      and invalidated to PoC instead to prevent data loss.
+ *
+ *      - start   - start address of region
+ *      - end     - end address of region
+ */
+.global dcache_inval_poc
+dcache_inval_poc:
+	dmb	sy
+	raw_dcache_line_size x2, x3
+	sub 	x3, x2, #1
+	tst	x1, x3			// end cache line aligned?
+	bic	x1, x1, x3
+	b.eq 	1f
+	dc	civac, x1		// clean + invalidate end cache line
+1:	tst	x0, x3			// start cache line aligned?
+	bic	x0, x0, x3
+	b.eq	2f
+	dc 	civac, x0		// clean + invalidate start cache line
+	b	3f
+2:	dc	ivac, x0		// invalidate current cache line
+3:	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	2b
+	dsb	sy
+	ret
+
+/*
+ *      dcache_clean_poc(start, end)
+ *
+ *      Ensure that any D-cache lines for the interval [start, end)
+ *      are cleaned to the PoC.
+ *
+ *      - start   - start address of region
+ *      - end     - end address of region
+ */
+.global dcache_clean_poc
+dcache_clean_poc:
+	dmb	sy
+	dcache_by_line_op cvac, sy, x0, x1, x2, x3
+	ret
+
+/*
+ *      dcache_clean_addr_poc(addr)
+ *
+ *      Ensure that the D-cache line for address addr is cleaned to the PoC.
+ *
+ *      - addr    - the address
+ */
+.global dcache_clean_addr_poc
+dcache_clean_addr_poc:
+	dmb	sy
+	dc	cvac, x0
+	dsb	sy
+	ret
+
+/*
+ *      dcache_clean_inval_addr_poc(addr)
+ *
+ *      Ensure that the D-cache line for address addr is cleaned and invalidated
+ *      to the PoC.
+ *
+ *      - addr    - the address
+ */
+.global dcache_clean_inval_addr_poc
+dcache_clean_inval_addr_poc:
+	dmb	sy
+	dc	civac, x0
+	dsb	sy
+	ret
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 13/18] arm/arm64: Configure secondaries' stack before enabling the MMU
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (12 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Now that the secondaries' stack is linearly mapped, we can set it before
turning the MMU on. This makes the entry code for the secondaries
consistent with the entry code for the boot CPU.

secondary_data is written by the CPU that brings the secondary online with
the MMU enabled in the common case (that is, unless the user specifically
compiled the tests to run with the MMU disabled), and it is now read by the
secondary with the MMU disabled. Because the secondary fetches data from
the PoC while its MMU is disabled, clean struct secondary_data to the PoC
to handle this.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/cstart.S   | 20 ++++++++++----------
 arm/cstart64.S | 16 ++++++++--------
 lib/arm/smp.c  |  5 +++++
 3 files changed, 23 insertions(+), 18 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 98d61230..090cf38d 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -134,16 +134,6 @@ get_mmu_off:
 
 .global secondary_entry
 secondary_entry:
-	/* enable the MMU unless requested off */
-	bl	get_mmu_off
-	cmp	r0, #0
-	bne	1f
-	mov	r1, #0
-	ldr	r0, =mmu_idmap
-	ldr	r0, [r0]
-	bl	asm_mmu_enable
-
-1:
 	/*
 	 * Set the stack, and set up vector table
 	 * and exception stacks. Exception stacks
@@ -161,6 +151,16 @@ secondary_entry:
 	bl	exceptions_init
 	bl	enable_vfp
 
+	/* enable the MMU unless requested off */
+	bl	get_mmu_off
+	cmp	r0, #0
+	bne	1f
+	mov	r1, #0
+	ldr	r0, =mmu_idmap
+	ldr	r0, [r0]
+	bl	asm_mmu_enable
+
+1:
 	/* finish init in C code */
 	bl	secondary_cinit
 
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 7fb44f42..b9784d82 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -144,6 +144,14 @@ get_mmu_off:
 
 .globl secondary_entry
 secondary_entry:
+	/* set the stack */
+	adrp	x0, secondary_data
+	ldr	x0, [x0, :lo12:secondary_data]
+	and	x1, x0, #THREAD_MASK
+	add	x2, x1, #THREAD_SIZE
+	zero_range x1, x2
+	mov	sp, x0
+
 	/* Enable FP/ASIMD */
 	mov	x0, #(3 << 20)
 	msr	cpacr_el1, x0
@@ -159,14 +167,6 @@ secondary_entry:
 	bl	asm_mmu_enable
 
 1:
-	/* set the stack */
-	adrp	x0, secondary_data
-	ldr	x0, [x0, :lo12:secondary_data]
-	and	x1, x0, #THREAD_MASK
-	add	x2, x1, #THREAD_SIZE
-	zero_range x1, x2
-	mov	sp, x0
-
 	/* finish init in C code */
 	bl	secondary_cinit
 
diff --git a/lib/arm/smp.c b/lib/arm/smp.c
index 1d470d1a..c9b247a8 100644
--- a/lib/arm/smp.c
+++ b/lib/arm/smp.c
@@ -7,6 +7,8 @@
  */
 #include <libcflat.h>
 #include <auxinfo.h>
+
+#include <asm/cacheflush.h>
 #include <asm/thread_info.h>
 #include <asm/spinlock.h>
 #include <asm/cpumask.h>
@@ -60,6 +62,9 @@ static void __smp_boot_secondary(int cpu, secondary_entry_fn entry)
 
 	secondary_data.stack = thread_stack_alloc();
 	secondary_data.entry = entry;
+	dcache_clean_poc((unsigned long)&secondary_data,
+			 (unsigned long)&secondary_data + sizeof(secondary_data));
+
 	mmu_mark_disabled(cpu);
 	ret = cpu_psci_cpu_boot(cpu);
 	assert(ret == 0);
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 14/18] arm/arm64: Use pgd_alloc() to allocate mmu_idmap
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (13 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Eric Auger, Shaoqin Huang, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Until commit 031755dbfefb ("arm: enable vmalloc"), the idmap was allocated
using pgd_alloc(). After that commit, all the page table allocator
functions were switched to using the page allocator, but pgd_alloc() was
left unchanged and became unused, with the idmap now being allocated with
alloc_page().

For arm64, the pgd table size varies based on the page size, which is
configured by the user. For arm, it will always contain 4 entries (it
translates bits 31:30 of the input address). To keep things simple and
consistent with the other functions and across both architectures, modify
pgd_alloc() to use alloc_page() instead of memalign like the rest of the
page table allocator functions and use it to allocate the idmap.

Note that when the idmap is created, alloc_ops->memalign is
memalign_pages(), which allocates memory with page granularity. This means
that the existing code also allocated a full page, so the total memory used
is not increased by using alloc_page().
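
For example, with 4k pages and 48-bit virtual addresses on arm64, the top
level table resolves 9 bits of the input address, so PTRS_PER_PGD is 512
and the pgd table occupies 512 * sizeof(pgd_t) = 4096 bytes, exactly one
page; larger granules only shrink the table relative to the page size, so
allocating a full page is always sufficient.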

Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 lib/arm/asm/pgtable.h   | 4 ++--
 lib/arm/mmu.c           | 4 ++--
 lib/arm64/asm/pgtable.h | 4 ++--
 3 files changed, 6 insertions(+), 6 deletions(-)

diff --git a/lib/arm/asm/pgtable.h b/lib/arm/asm/pgtable.h
index d7c73906..a35f4296 100644
--- a/lib/arm/asm/pgtable.h
+++ b/lib/arm/asm/pgtable.h
@@ -43,8 +43,8 @@
 #define pgd_free(pgd) free(pgd)
 static inline pgd_t *pgd_alloc(void)
 {
-	pgd_t *pgd = memalign(L1_CACHE_BYTES, PTRS_PER_PGD * sizeof(pgd_t));
-	memset(pgd, 0, PTRS_PER_PGD * sizeof(pgd_t));
+	assert(PTRS_PER_PGD * sizeof(pgd_t) <= PAGE_SIZE);
+	pgd_t *pgd = alloc_page();
 	return pgd;
 }
 
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 2f4ec815..70c5333c 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -217,7 +217,7 @@ void *setup_mmu(phys_addr_t phys_end, void *unused)
 #endif
 
 	if (!mmu_idmap)
-		mmu_idmap = alloc_page();
+		mmu_idmap = pgd_alloc();
 
 	for (r = mem_regions; r->end; ++r) {
 		if (r->flags & MR_F_IO) {
@@ -253,7 +253,7 @@ void __iomem *__ioremap(phys_addr_t phys_addr, size_t size)
 		pgtable = current_thread_info()->pgtable;
 	} else {
 		if (!mmu_idmap)
-			mmu_idmap = alloc_page();
+			mmu_idmap = pgd_alloc();
 		pgtable = mmu_idmap;
 	}
 
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index bfb8a993..06357920 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -49,8 +49,8 @@
 #define pgd_free(pgd) free(pgd)
 static inline pgd_t *pgd_alloc(void)
 {
-	pgd_t *pgd = memalign(PAGE_SIZE, PTRS_PER_PGD * sizeof(pgd_t));
-	memset(pgd, 0, PTRS_PER_PGD * sizeof(pgd_t));
+	assert(PTRS_PER_PGD * sizeof(pgd_t) <= PAGE_SIZE);
+	pgd_t *pgd = alloc_page();
 	return pgd;
 }
 
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 15/18] arm/arm64: Enable the MMU early
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (14 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Shaoqin Huang, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

Enable the MMU immediately after the physical allocator is initialized,
to make reasoning about what cache maintenance operations are needed a
lot easier.

The translation tables are created with the MMU disabled. Use memalign(),
which resolves to memalign_early(), for creating them, because the physical
allocator has everything in place to perform dcache maintenance on the data
structures it maintains.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 lib/arm/asm/mmu-api.h   |  1 +
 lib/arm/asm/pgtable.h   | 25 ++++++++++++++++++++++---
 lib/arm/mmu.c           | 28 ++++++++++++++++++++--------
 lib/arm/setup.c         |  7 +++++++
 lib/arm64/asm/pgtable.h | 32 ++++++++++++++++++++++++++++----
 5 files changed, 78 insertions(+), 15 deletions(-)

diff --git a/lib/arm/asm/mmu-api.h b/lib/arm/asm/mmu-api.h
index 6c1136d9..4cb03fcb 100644
--- a/lib/arm/asm/mmu-api.h
+++ b/lib/arm/asm/mmu-api.h
@@ -10,6 +10,7 @@ extern void mmu_mark_enabled(int cpu);
 extern void mmu_mark_disabled(int cpu);
 extern void mmu_enable(pgd_t *pgtable);
 extern void mmu_disable(void);
+extern void mmu_setup_early(phys_addr_t phys_end);
 
 extern void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
 			       phys_addr_t phys_start, phys_addr_t phys_end,
diff --git a/lib/arm/asm/pgtable.h b/lib/arm/asm/pgtable.h
index a35f4296..49c74e19 100644
--- a/lib/arm/asm/pgtable.h
+++ b/lib/arm/asm/pgtable.h
@@ -41,10 +41,16 @@
 #define pgd_offset(pgtable, addr) ((pgtable) + pgd_index(addr))
 
 #define pgd_free(pgd) free(pgd)
+static inline pgd_t *pgd_alloc_early(void)
+{
+	pgd_t *pgd = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pgd, 0, PAGE_SIZE);
+	return pgd;
+}
 static inline pgd_t *pgd_alloc(void)
 {
 	assert(PTRS_PER_PGD * sizeof(pgd_t) <= PAGE_SIZE);
-	pgd_t *pgd = alloc_page();
+	pgd_t *pgd = page_alloc_initialized() ? alloc_page() : pgd_alloc_early();
 	return pgd;
 }
 
@@ -65,10 +71,16 @@ static inline pmd_t *pgd_page_vaddr(pgd_t pgd)
 	(pgd_page_vaddr(*(pgd)) + pmd_index(addr))
 
 #define pmd_free(pmd) free_page(pmd)
+static inline pmd_t *pmd_alloc_one_early(void)
+{
+	pmd_t *pmd = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pmd, 0, PAGE_SIZE);
+	return pmd;
+}
 static inline pmd_t *pmd_alloc_one(void)
 {
 	assert(PTRS_PER_PMD * sizeof(pmd_t) == PAGE_SIZE);
-	pmd_t *pmd = alloc_page();
+	pmd_t *pmd = page_alloc_initialized() ? alloc_page() : pmd_alloc_one_early();
 	return pmd;
 }
 static inline pmd_t *pmd_alloc(pgd_t *pgd, unsigned long addr)
@@ -92,10 +104,16 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 	(pmd_page_vaddr(*(pmd)) + pte_index(addr))
 
 #define pte_free(pte) free_page(pte)
+static inline pte_t *pte_alloc_one_early(void)
+{
+	pte_t *pte = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pte, 0, PAGE_SIZE);
+	return pte;
+}
 static inline pte_t *pte_alloc_one(void)
 {
 	assert(PTRS_PER_PTE * sizeof(pte_t) == PAGE_SIZE);
-	pte_t *pte = alloc_page();
+	pte_t *pte = page_alloc_initialized() ? alloc_page() : pte_alloc_one_early();
 	return pte;
 }
 static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
@@ -104,6 +122,7 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
 		pmd_t entry;
 		pmd_val(entry) = pgtable_pa(pte_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pmd, entry);
+
 	}
 	return pte_offset(pmd, addr);
 }
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 70c5333c..d23a12e8 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -12,11 +12,10 @@
 #include <asm/setup.h>
 #include <asm/page.h>
 #include <asm/io.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
 
-#include "alloc_page.h"
 #include "vmalloc.h"
-#include <asm/pgtable-hwdef.h>
-#include <asm/pgtable.h>
 
 #include <linux/compiler.h>
 
@@ -201,7 +200,7 @@ void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
 	}
 }
 
-void *setup_mmu(phys_addr_t phys_end, void *unused)
+void mmu_setup_early(phys_addr_t phys_end)
 {
 	struct mem_region *r;
 
@@ -216,9 +215,7 @@ void *setup_mmu(phys_addr_t phys_end, void *unused)
 			"Unsupported translation granule %ld\n", PAGE_SIZE);
 #endif
 
-	if (!mmu_idmap)
-		mmu_idmap = pgd_alloc();
-
+	mmu_idmap = pgd_alloc();
 	for (r = mem_regions; r->end; ++r) {
 		if (r->flags & MR_F_IO) {
 			continue;
@@ -236,7 +233,22 @@ void *setup_mmu(phys_addr_t phys_end, void *unused)
 		}
 	}
 
-	mmu_enable(mmu_idmap);
+	/*
+	 * Open-code part of mmu_enabled(), because at this point thread_info
+	 * hasn't been initialized. mmu_mark_enabled() cannot be called here
+	 * because the cpumask operations can only be called later, after
+	 * nr_cpus is initialized in cpu_init().
+	 */
+	asm_mmu_enable((phys_addr_t)(unsigned long)mmu_idmap);
+	current_thread_info()->pgtable = mmu_idmap;
+}
+
+void *setup_mmu(phys_addr_t phys_end, void *unused1)
+{
+	assert(mmu_idmap);
+
+	mmu_mark_enabled(0);
+
 	return mmu_idmap;
 }
 
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index b6fc453e..4b9423e5 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -26,6 +26,7 @@
 #include <asm/smp.h>
 #include <asm/timer.h>
 #include <asm/psci.h>
+#include <asm/mmu.h>
 
 #include "io.h"
 
@@ -226,6 +227,9 @@ static void mem_init(phys_addr_t freemem_start)
 	phys_alloc_init(freemem_start, freemem->end - freemem_start);
 	phys_alloc_set_minimum_alignment(SMP_CACHE_BYTES);
 
+	if (!(auxinfo.flags & AUXINFO_MMU_OFF))
+		mmu_setup_early(freemem->end);
+
 	phys_alloc_get_unused(&base, &top);
 	base = PAGE_ALIGN(base);
 	top = top & PAGE_MASK;
@@ -417,6 +421,9 @@ static efi_status_t efi_mem_init(efi_bootinfo_t *efi_bootinfo)
 	phys_alloc_init(free_mem_start, free_mem_pages << EFI_PAGE_SHIFT);
 	phys_alloc_set_minimum_alignment(SMP_CACHE_BYTES);
 
+	if (!(auxinfo.flags & AUXINFO_MMU_OFF))
+		mmu_setup_early(free_mem_start + (free_mem_pages << EFI_PAGE_SHIFT));
+
 	phys_alloc_get_unused(&base, &top);
 	base = PAGE_ALIGN(base);
 	top = top & PAGE_MASK;
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index 06357920..cc6a1bb5 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -47,10 +47,16 @@
 #define pgd_offset(pgtable, addr) ((pgtable) + pgd_index(addr))
 
 #define pgd_free(pgd) free(pgd)
+static inline pgd_t *pgd_alloc_early(void)
+{
+	pgd_t *pgd = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pgd, 0, PAGE_SIZE);
+	return pgd;
+}
 static inline pgd_t *pgd_alloc(void)
 {
 	assert(PTRS_PER_PGD * sizeof(pgd_t) <= PAGE_SIZE);
-	pgd_t *pgd = alloc_page();
+	pgd_t *pgd = page_alloc_initialized() ? alloc_page() : pgd_alloc_early();
 	return pgd;
 }
 
@@ -75,10 +81,16 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 #define pmd_offset(pud, addr)						\
 	(pud_page_vaddr(*(pud)) + pmd_index(addr))
 #define pmd_free(pmd)	free_page(pmd)
+static inline pmd_t *pmd_alloc_one_early(void)
+{
+	pmd_t *pmd = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pmd, 0, PAGE_SIZE);
+	return pmd;
+}
 static inline pmd_t *pmd_alloc_one(void)
 {
 	assert(PTRS_PER_PMD * sizeof(pmd_t) == PAGE_SIZE);
-	pmd_t *pmd = alloc_page();
+	pmd_t *pmd = page_alloc_initialized() ? alloc_page() : pmd_alloc_one_early();
 	return pmd;
 }
 static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
@@ -102,10 +114,16 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
 #define pud_offset(pgd, addr)                           \
 	(pgd_page_vaddr(*(pgd)) + pud_index(addr))
 #define pud_free(pud) free_page(pud)
+static inline pud_t *pud_alloc_one_early(void)
+{
+	pud_t *pud = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pud, 0, PAGE_SIZE);
+	return pud;
+}
 static inline pud_t *pud_alloc_one(void)
 {
 	assert(PTRS_PER_PUD * sizeof(pud_t) == PAGE_SIZE);
-	pud_t *pud = alloc_page();
+	pud_t *pud = page_alloc_initialized() ? alloc_page() : pud_alloc_one_early();
 	return pud;
 }
 static inline pud_t *pud_alloc(pgd_t *pgd, unsigned long addr)
@@ -129,10 +147,16 @@ static inline pud_t *pud_alloc(pgd_t *pgd, unsigned long addr)
 	(pmd_page_vaddr(*(pmd)) + pte_index(addr))
 
 #define pte_free(pte) free_page(pte)
+static inline pte_t *pte_alloc_one_early(void)
+{
+	pte_t *pte = memalign(PAGE_SIZE, PAGE_SIZE);
+	memset(pte, 0, PAGE_SIZE);
+	return pte;
+}
 static inline pte_t *pte_alloc_one(void)
 {
 	assert(PTRS_PER_PTE * sizeof(pte_t) == PAGE_SIZE);
-	pte_t *pte = alloc_page();
+	pte_t *pte = page_alloc_initialized() ? alloc_page() : pte_alloc_one_early();
 	return pte;
 }
 static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 16/18] arm/arm64: Map the UART when creating the translation tables
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (15 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Shaoqin Huang, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The MMU is now enabled before the UART is probed, which leaves
kvm-unit-tests with a window where attempting to write to the console
results in an infinite data abort loop triggered by the exception
handlers themselves trying to use the console. Get around this by
mapping the early UART address when creating the translation tables.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 lib/arm/io.c  | 31 +++++++++++++++++++++++++++++++
 lib/arm/io.h  |  3 +++
 lib/arm/mmu.c |  3 +++
 3 files changed, 37 insertions(+)

diff --git a/lib/arm/io.c b/lib/arm/io.c
index c15e57c4..becd52a5 100644
--- a/lib/arm/io.c
+++ b/lib/arm/io.c
@@ -15,6 +15,8 @@
 #include <asm/psci.h>
 #include <asm/spinlock.h>
 #include <asm/io.h>
+#include <asm/mmu.h>
+#include <asm/thread_info.h>
 
 #include "io.h"
 
@@ -29,6 +31,29 @@ static struct spinlock uart_lock;
 #define UART_EARLY_BASE (u8 *)(unsigned long)CONFIG_UART_EARLY_BASE
 static volatile u8 *uart0_base = UART_EARLY_BASE;
 
+static void uart_unmap_early_base(void)
+{
+	pteval_t *ptevalp;
+	pgd_t *pgtable;
+
+	if (mmu_enabled()) {
+		pgtable = current_thread_info()->pgtable;
+	} else {
+		pgtable = mmu_idmap;
+	}
+
+	/*
+	 * The UART has been mapped early in the boot process and the PTE has
+	 * been allocated using the physical allocator, which means it cannot be
+	 * freed.
+	 */
+	ptevalp = follow_pte(pgtable, (uintptr_t)UART_EARLY_BASE);
+	if (ptevalp) {
+		WRITE_ONCE(*ptevalp, 0);
+		flush_tlb_page((uintptr_t)UART_EARLY_BASE);
+	}
+}
+
 static void uart0_init_fdt(void)
 {
 	/*
@@ -98,11 +123,17 @@ void io_init(void)
 		printf("WARNING: early print support may not work. "
 		       "Found uart at %p, but early base is %p.\n",
 			uart0_base, UART_EARLY_BASE);
+		uart_unmap_early_base();
 	}
 
 	chr_testdev_init();
 }
 
+void __iomem *uart_early_base(void)
+{
+	return UART_EARLY_BASE;
+}
+
 void puts(const char *s)
 {
 	spin_lock(&uart_lock);
diff --git a/lib/arm/io.h b/lib/arm/io.h
index 183479c8..74b2850a 100644
--- a/lib/arm/io.h
+++ b/lib/arm/io.h
@@ -7,6 +7,9 @@
 #ifndef _ARM_IO_H_
 #define _ARM_IO_H_
 
+#include <asm/io.h>
+
 extern void io_init(void);
+extern void __iomem *uart_early_base(void);
 
 #endif
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index d23a12e8..0aec0bf9 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -15,6 +15,7 @@
 #include <asm/pgtable.h>
 #include <asm/pgtable-hwdef.h>
 
+#include "io.h"
 #include "vmalloc.h"
 
 #include <linux/compiler.h>
@@ -233,6 +234,8 @@ void mmu_setup_early(phys_addr_t phys_end)
 		}
 	}
 
+	ioremap((phys_addr_t)(unsigned long)uart_early_base(), PAGE_SIZE);
+
 	/*
 	 * Open-code part of mmu_enabled(), because at this point thread_info
 	 * hasn't been initialized. mmu_mark_enabled() cannot be called here
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 17/18] arm/arm64: Perform dcache maintenance at boot
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (16 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm
  Cc: Alexandru Elisei, Shaoqin Huang, Eric Auger, Nikos Nikoleris, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

The arm and arm64 architectures require explicit cache maintenance to keep
the memory in sync with the caches when the MMU is turned on. That's
because when the MMU is off, reads and writes access system memory
directly, but with the MMU on, the same accesses hit the cache.

The sequence of cache maintenance operations to keep the cache contents
in sync with memory is:

1. Dcache invalidation for memory location M before the first write to that
location. This is needed to avoid a dirty cache line for address M*
being evicted after the explicit write that follows and overwriting the
value that is explicitly written. Dcache clean also works, but
invalidation is faster on bare metal. When running in a virtual machine,
dcache invalidation is promoted to dcache clean + invalidation, so it makes
little difference exactly which of the dcache operations is chosen, as long
as the result is that the cache line is not dirty.

2. A write, or multiple writes, to memory address M are performed.

3. After the last write, and before the first read with the MMU on, the
dcache line that holds the value for memory location M needs to be
invalidated. This is needed to make sure that loads fetch data from
memory, where the most recent value written in step 2 is, instead of
reading clean, but stale values from the cache**.

For robustness and simplicity, the dcache invalidation at step 3 is
performed after the last write with the MMU off.
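
Put together, and using the C helpers added earlier in the series, the
sequence for a region written with the MMU off looks roughly like this
(start and end are placeholders for the region's bounds):

	/* 1. Make sure no dirty lines can be evicted over the writes. */
	dcache_inval_poc(start, end);
	/* 2. The writes themselves go straight to memory. */
	memset((void *)start, 0, end - start);
	/* 3. Drop clean-but-stale lines before reading with the MMU on. */
	dcache_inval_poc(start, end);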

When running under KVM, the maintenance operations can be omitted when
first enabling the MMU because:

- KVM performs a dcache clean + invalidate on each page the first time a
guest accesses the page. This corresponds to the maintenance at step 1.

- KVM performs a dcache clean + invalidate on the entire guest memory when
the MMU is turned on. This corresponds to the maintenance at step 3. The
dcache clean is harmless for a guest, because all accessed cache lines are
already clean, thanks to the maintenance that KVM performs when memory is
first accessed.

The situation changes when kvm-unit-tests turns the MMU off and then turns
it back on: KVM will skip the cache maintenance when the MMU is enabled
again, and likely skip the maintenance on page access if the page is
already mapped at stage 2. In this case, explicit cache maintenance is
required even when running as a KVM guest, although this is the
responsibility of the test to perform, since the kvm-unit-tests library
doesn't turn the MMU off after enabling it.

Do what is needed to ensure correctness and perform the CMOs before
enabling the MMU on the boot path. The stack is a special case, since
kvm-unit-tests runs C code with the MMU disabled and the compiler can
arbitrarily use the stack when compiling C code. Invalidate the entire
stack in the assembly function asm_mmu_enable, as there is no other way to
make sure that the stack is invalidated after the MMU is enabled.

Note that for static variables, the dcache maintenance at step 1 is omitted
because either the variable is stored in the data section of the test
image, which is cleaned to the PoC as per the Linux boot protocol, or in
the BSS section, which is invalidated to the PoC in the assembly entry code.

*The cache line could have been dirtied by higher level software, for
example by firmware.
**Even though the address M is invalidated in step 1, higher level software
could have allocated a new, clean cache line, as a result of a direct or
speculative read.

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
---
 arm/cstart.S            | 24 +++++++++++++
 arm/cstart64.S          | 34 +++++++++++++++++++
 lib/arm/asm/page.h      |  2 ++
 lib/arm/asm/pgtable.h   | 12 +++++++
 lib/arm/mmu.c           |  4 +++
 lib/arm/setup.c         | 75 +++++++++++++++++++++++++++++++++++------
 lib/arm64/asm/pgtable.h | 16 +++++++++
 lib/devicetree.c        |  2 +-
 8 files changed, 157 insertions(+), 12 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 090cf38d..48dc87f5 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -34,12 +34,24 @@ start:
 	/* zero BSS */
 	ldr	r4, =bss
 	ldr	r5, =ebss
+	mov	r3, r4
+	dmb	sy
+	dcache_by_line_op dcimvac, sy, r4, r5, r6, r7
+	mov	r4, r3
 	zero_range r4, r5, r6, r7
+	mov	r4, r3
+	dmb	sy
+	dcache_by_line_op dcimvac, sy, r4, r5, r6, r7
 
 	/* zero stack */
 	ldr	r5, =stacktop
 	sub	r4, r5, #THREAD_SIZE
+	mov	r3, r4
+	dmb	sy
+	dcache_by_line_op dcimvac, sy, r4, r5, r6, r7
+	mov	r4, r3
 	zero_range r4, r5, r6, r7
+	/* asm_mmu_enable will perform the rest of the cache maintenance. */
 
 	/*
 	 * set stack, making room at top of stack for cpu0's
@@ -145,7 +157,12 @@ secondary_entry:
 	lsr	r2, #THREAD_SHIFT
 	lsl	r2, #THREAD_SHIFT
 	add	r3, r2, #THREAD_SIZE
+	mov	r6, r2
+	dmb	sy
+	dcache_by_line_op dcimvac, sy, r2, r3, r4, r5
+	mov	r2, r6
 	zero_range r2, r3, r4, r5
+	/* asm_mmu_enable will perform the rest of the cache maintenance. */
 	mov	sp, r0
 
 	bl	exceptions_init
@@ -204,6 +221,13 @@ asm_mmu_enable:
 	mcrr	p15, 0, r0, r1, c2
 	isb
 
+	dmb	sy
+	mov	r0, sp
+	lsr	r0, #THREAD_SHIFT
+	lsl	r0, #THREAD_SHIFT
+	add	r1, r0, #THREAD_SIZE
+	dcache_by_line_op dcimvac, sy, r0, r1, r2, r3
+
 	/* SCTLR */
 	mrc	p15, 0, r2, c1, c0, 0
 	orr	r2, #CR_C
diff --git a/arm/cstart64.S b/arm/cstart64.S
index b9784d82..d8200ea2 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -53,6 +53,7 @@ start:
 	add     x5, x5, :lo12:reloc_start
 	adrp	x6, reloc_end
 	add     x6, x6, :lo12:reloc_end
+
 1:
 	cmp	x5, x6
 	b.hs	1f
@@ -60,22 +61,44 @@ start:
 	ldr	x8, [x5, #16]			// r_addend
 	add	x8, x8, x4			// val = base + r_addend
 	str	x8, [x4, x7]			// base[r_offset] = val
+	add	x7, x4, x7
+	dmb	sy
+	/* Image is cleaned to PoC, no need for CMOs before the memory write. */
+	dc	ivac, x7
 	add	x5, x5, #24
 	b	1b
 
 1:
+	/* Complete the cache maintenance operations. */
+	dsb	sy
+
 	/* zero BSS */
 	adrp	x4, bss
 	add	x4, x4, :lo12:bss
 	adrp    x5, ebss
 	add     x5, x5, :lo12:ebss
+	/* Stash start of bss, as dcache_by_line_op corrupts it. */
+	mov	x9, x4
+	dmb	sy
+	/* Make sure there are no dirty cache lines that can be evicted. */
+	dcache_by_line_op ivac, sy, x4, x5, x6, x7
+	mov	x4, x9
 	zero_range x4, x5
+	mov	x4, x9
+	dmb	sy
+	/* Invalidate clean and potentially stale cache lines. */
+	dcache_by_line_op ivac, sy, x4, x5, x6, x7
 
 	/* zero and set up stack */
 	adrp    x5, stacktop
 	add     x5, x5, :lo12:stacktop
 	sub	x4, x5, #THREAD_SIZE
+	mov	x9, x4
+	dmb	sy
+	dcache_by_line_op ivac, sy, x4, x5, x6, x7
+	mov	x4, x9
 	zero_range x4, x5
+	/* asm_mmu_enable will perform the rest of the cache maintenance. */
 
 	/* set SCTLR_EL1 to a known value */
 	ldr	x4, =INIT_SCTLR_EL1_MMU_OFF
@@ -149,7 +172,12 @@ secondary_entry:
 	ldr	x0, [x0, :lo12:secondary_data]
 	and	x1, x0, #THREAD_MASK
 	add	x2, x1, #THREAD_SIZE
+	mov	x9, x1
+	dmb	sy
+	dcache_by_line_op ivac, sy, x1, x2, x3, x4
+	mov	x1, x9
 	zero_range x1, x2
+	/* asm_mmu_enable will perform the rest of the cache maintenance. */
 	mov	sp, x0
 
 	/* Enable FP/ASIMD */
@@ -242,6 +270,12 @@ asm_mmu_enable:
 	msr	ttbr0_el1, x0
 	isb
 
+	dmb	sy
+	mov	x9, sp
+	and	x9, x9, #THREAD_MASK
+	add	x10, x9, #THREAD_SIZE
+	dcache_by_line_op ivac, sy, x9, x10, x11, x12
+
 	/* SCTLR */
 	mrs	x1, sctlr_el1
 	orr	x1, x1, SCTLR_EL1_C
diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h
index 8eb4a883..0a46bda0 100644
--- a/lib/arm/asm/page.h
+++ b/lib/arm/asm/page.h
@@ -8,6 +8,8 @@
 
 #include <linux/const.h>
 
+#include <libcflat.h>
+
 #define PAGE_SHIFT		12
 #define PAGE_SIZE		(_AC(1,UL) << PAGE_SHIFT)
 #define PAGE_MASK		(~(PAGE_SIZE-1))
diff --git a/lib/arm/asm/pgtable.h b/lib/arm/asm/pgtable.h
index 49c74e19..4c565737 100644
--- a/lib/arm/asm/pgtable.h
+++ b/lib/arm/asm/pgtable.h
@@ -22,6 +22,8 @@
  */
 #include <linux/compiler.h>
 
+#include <asm/cacheflush.h>
+
 #define pgtable_va(x)		((void *)(unsigned long)(x))
 #define pgtable_pa(x)		((unsigned long)(x))
 
@@ -44,7 +46,9 @@
 static inline pgd_t *pgd_alloc_early(void)
 {
 	pgd_t *pgd = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pgd);
 	memset(pgd, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pgd);
 	return pgd;
 }
 static inline pgd_t *pgd_alloc(void)
@@ -74,7 +78,9 @@ static inline pmd_t *pgd_page_vaddr(pgd_t pgd)
 static inline pmd_t *pmd_alloc_one_early(void)
 {
 	pmd_t *pmd = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pmd);
 	memset(pmd, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pmd);
 	return pmd;
 }
 static inline pmd_t *pmd_alloc_one(void)
@@ -89,6 +95,8 @@ static inline pmd_t *pmd_alloc(pgd_t *pgd, unsigned long addr)
 		pgd_t entry;
 		pgd_val(entry) = pgtable_pa(pmd_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pgd, entry);
+		if (!page_alloc_initialized())
+			dcache_clean_inval_addr_poc((unsigned long)pgd);
 	}
 	return pmd_offset(pgd, addr);
 }
@@ -107,7 +115,9 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 static inline pte_t *pte_alloc_one_early(void)
 {
 	pte_t *pte = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pte);
 	memset(pte, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pte);
 	return pte;
 }
 static inline pte_t *pte_alloc_one(void)
@@ -122,6 +132,8 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
 		pmd_t entry;
 		pmd_val(entry) = pgtable_pa(pte_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pmd, entry);
+		if (!page_alloc_initialized())
+			dcache_clean_inval_addr_poc((unsigned long)pmd);
 
 	}
 	return pte_offset(pmd, addr);
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 0aec0bf9..c280c361 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -5,6 +5,7 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
+#include <asm/cacheflush.h>
 #include <asm/setup.h>
 #include <asm/thread_info.h>
 #include <asm/cpumask.h>
@@ -15,6 +16,7 @@
 #include <asm/pgtable.h>
 #include <asm/pgtable-hwdef.h>
 
+#include "alloc_phys.h"
 #include "io.h"
 #include "vmalloc.h"
 
@@ -236,6 +238,8 @@ void mmu_setup_early(phys_addr_t phys_end)
 
 	ioremap((phys_addr_t)(unsigned long)uart_early_base(), PAGE_SIZE);
 
+	phys_alloc_perform_cache_maintenance(dcache_clean_inval_addr_poc);
+
 	/*
 	 * Open-code part of mmu_enabled(), because at this point thread_info
 	 * hasn't been initialized. mmu_mark_enabled() cannot be called here
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 4b9423e5..71106682 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -19,6 +19,7 @@
 #include <vmalloc.h>
 #include <auxinfo.h>
 #include <argv.h>
+#include <asm/cacheflush.h>
 #include <asm/thread_info.h>
 #include <asm/setup.h>
 #include <asm/page.h>
@@ -197,6 +198,7 @@ static void mem_init(phys_addr_t freemem_start)
 	struct mem_region *freemem, *r, mem = {
 		.start = (phys_addr_t)-1,
 	};
+	int nr_regions = 0;
 
 	freemem = mem_region_find(freemem_start);
 	assert(freemem && !(freemem->flags & (MR_F_IO | MR_F_CODE)));
@@ -208,6 +210,7 @@ static void mem_init(phys_addr_t freemem_start)
 			if (r->end > mem.end)
 				mem.end = r->end;
 		}
+		nr_regions++;
 	}
 	assert(mem.end && !(mem.start & ~PHYS_MASK));
 	mem.end &= PHYS_MASK;
@@ -221,6 +224,9 @@ static void mem_init(phys_addr_t freemem_start)
 	/* Ensure our selected freemem range is somewhere in our full range */
 	assert(freemem_start >= mem.start && freemem->end <= mem.end);
 
+	dcache_inval_poc((unsigned long)mem_regions,
+			 (unsigned long)(mem_regions + nr_regions));
+
 	__phys_offset = mem.start;	/* PHYS_OFFSET */
 	__phys_end = mem.end;		/* PHYS_END */
 
@@ -240,32 +246,69 @@ static void mem_init(phys_addr_t freemem_start)
 	page_alloc_ops_enable();
 }
 
-void setup(const void *fdt, phys_addr_t freemem_start)
+extern const void *fdt;
+static void do_fdt_move(void *freemem, const void *fdt_addr, u32 *fdt_size)
 {
-	void *freemem;
-	const char *bootargs, *tmp;
-	u32 fdt_size;
 	int ret;
 
-	assert(sizeof(long) == 8 || freemem_start < (3ul << 30));
-	freemem = (void *)(unsigned long)freemem_start;
-
 	/* Move the FDT to the base of free memory */
-	fdt_size = fdt_totalsize(fdt);
-	ret = fdt_move(fdt, freemem, fdt_size);
+	*fdt_size = fdt_totalsize(fdt_addr);
+
+	/* Invalidate potentially dirty cache lines. */
+	dcache_inval_poc((unsigned long)freemem, (unsigned long)freemem + *fdt_size);
+	ret = fdt_move(fdt_addr, freemem, *fdt_size);
 	assert(ret == 0);
+
 	ret = dt_init(freemem);
+	/*
+	 * Invalidate the clean (the bootloader cleans the test image to PoC),
+	 * but potentially stale, cache line that holds the value of the
+	 * variable fdt, to force the CPU to fetch it from memory when the MMU
+	 * is enabled.
+	 */
+	dcache_clean_inval_addr_poc((unsigned long)&fdt);
 	assert(ret == 0);
-	freemem += fdt_size;
+}
+
+static void initrd_move(void *freemem)
+{
+	const char *tmp;
+	int ret;
 
 	/* Move the initrd to the top of the FDT */
 	ret = dt_get_initrd(&tmp, &initrd_size);
 	assert(ret == 0 || ret == -FDT_ERR_NOTFOUND);
 	if (ret == 0) {
 		initrd = freemem;
+		/* Invalidate the potentially stale value for the variable. */
+		dcache_clean_inval_addr_poc((unsigned long)&initrd);
+		/*
+		 * Invalidate potentially dirty cache lines for where the initrd
+		 * will be moved.
+		 */
+		dcache_inval_poc((unsigned long)freemem, (unsigned long)freemem + initrd_size);
 		memmove(initrd, tmp, initrd_size);
-		freemem += initrd_size;
 	}
+}
+
+void setup(const void *fdt_addr, phys_addr_t freemem_start)
+{
+	void *freemem;
+	const char *bootargs;
+	u32 fdt_size;
+	int ret;
+
+	assert(sizeof(long) == 8 || freemem_start < (3ul << 30));
+	freemem = (void *)(unsigned long)freemem_start;
+
+	do_fdt_move(freemem, fdt_addr, &fdt_size);
+	freemem += fdt_size;
+
+	initrd_move(freemem);
+	freemem += initrd_size;
+
+	/* Invalidate potentially stale cache lines for the fdt and initrd. */
+	dcache_inval_poc(freemem_start, (unsigned long)freemem);
 
 	mem_regions_add_dt_regions();
 	mem_regions_add_assumed();
@@ -335,6 +378,7 @@ static efi_status_t efi_mem_init(efi_bootinfo_t *efi_bootinfo)
 	uintptr_t data = (uintptr_t)&_data, edata = ALIGN((uintptr_t)&_edata, 4096);
 	const void *fdt = efi_bootinfo->fdt;
 	int fdt_size, ret;
+	int nr_regions = 0;
 
 	/*
 	 * Record the largest free EFI_CONVENTIONAL_MEMORY region
@@ -398,7 +442,16 @@ static efi_status_t efi_mem_init(efi_bootinfo_t *efi_bootinfo)
 				__phys_end = r.end;
 		}
 		mem_region_add(&r);
+		nr_regions++;
 	}
+
+	/*
+	 * The mem_regions will be used after mmu_disable and before mmu_enable
+	 * again, so clean the dcache to poc.
+	 */
+	dcache_clean_poc((unsigned long)mem_regions,
+			 (unsigned long)(mem_regions + nr_regions));
+
 	if (fdt) {
 		/* Move the FDT to the base of free memory */
 		fdt_size = fdt_totalsize(fdt);
diff --git a/lib/arm64/asm/pgtable.h b/lib/arm64/asm/pgtable.h
index cc6a1bb5..6f59d65a 100644
--- a/lib/arm64/asm/pgtable.h
+++ b/lib/arm64/asm/pgtable.h
@@ -15,6 +15,8 @@
  */
 #include <alloc.h>
 #include <alloc_page.h>
+
+#include <asm/cacheflush.h>
 #include <asm/setup.h>
 #include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
@@ -50,7 +52,9 @@
 static inline pgd_t *pgd_alloc_early(void)
 {
 	pgd_t *pgd = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pgd);
 	memset(pgd, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pgd);
 	return pgd;
 }
 static inline pgd_t *pgd_alloc(void)
@@ -84,7 +88,9 @@ static inline pte_t *pmd_page_vaddr(pmd_t pmd)
 static inline pmd_t *pmd_alloc_one_early(void)
 {
 	pmd_t *pmd = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pmd);
 	memset(pmd, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pmd);
 	return pmd;
 }
 static inline pmd_t *pmd_alloc_one(void)
@@ -99,6 +105,8 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
 		pud_t entry;
 		pud_val(entry) = pgtable_pa(pmd_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pud, entry);
+		if (!page_alloc_initialized())
+			dcache_clean_inval_addr_poc((unsigned long)pud);
 	}
 	return pmd_offset(pud, addr);
 }
@@ -117,7 +125,9 @@ static inline pmd_t *pmd_alloc(pud_t *pud, unsigned long addr)
 static inline pud_t *pud_alloc_one_early(void)
 {
 	pud_t *pud = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pud);
 	memset(pud, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pud);
 	return pud;
 }
 static inline pud_t *pud_alloc_one(void)
@@ -132,6 +142,8 @@ static inline pud_t *pud_alloc(pgd_t *pgd, unsigned long addr)
 		pgd_t entry;
 		pgd_val(entry) = pgtable_pa(pud_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pgd, entry);
+		if (!page_alloc_initialized())
+			dcache_clean_inval_addr_poc((unsigned long)pgd);
 	}
 	return pud_offset(pgd, addr);
 }
@@ -150,7 +162,9 @@ static inline pud_t *pud_alloc(pgd_t *pgd, unsigned long addr)
 static inline pte_t *pte_alloc_one_early(void)
 {
 	pte_t *pte = memalign(PAGE_SIZE, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pte);
 	memset(pte, 0, PAGE_SIZE);
+	dcache_inval_page_poc((unsigned long)pte);
 	return pte;
 }
 static inline pte_t *pte_alloc_one(void)
@@ -165,6 +179,8 @@ static inline pte_t *pte_alloc(pmd_t *pmd, unsigned long addr)
 		pmd_t entry;
 		pmd_val(entry) = pgtable_pa(pte_alloc_one()) | PMD_TYPE_TABLE;
 		WRITE_ONCE(*pmd, entry);
+		if (!page_alloc_initialized())
+			dcache_clean_inval_addr_poc((unsigned long)pmd);
 	}
 	return pte_offset(pmd, addr);
 }
diff --git a/lib/devicetree.c b/lib/devicetree.c
index 3ff9d164..1cc44cf1 100644
--- a/lib/devicetree.c
+++ b/lib/devicetree.c
@@ -7,7 +7,7 @@
 #include "libfdt/libfdt.h"
 #include "devicetree.h"
 
-static const void *fdt;
+const void *fdt;
 
 const void *dt_fdt(void)
 {
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [kvm-unit-tests PATCH v1 18/18] arm/arm64: Rework the cache maintenance in asm_mmu_disable
  2023-11-30  9:07 ` Shaoqin Huang
                   ` (17 preceding siblings ...)
  (?)
@ 2023-11-30  9:07 ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-11-30  9:07 UTC (permalink / raw)
  To: Andrew Jones, kvmarm; +Cc: Alexandru Elisei, Eric Auger, kvm

From: Alexandru Elisei <alexandru.elisei@arm.com>

asm_mmu_disable is overly ambitious and provably incorrect:

1. It tries to clean and invalidate the data caches for the **entire**
memory, which is highly unnecessary, as it's very unlikely that a test
will write to the entire memory, and even more unlikely that a test will
modify the text section of the test image.

2. There is no corresponding dcache invalidate command for the entire
memory in asm_mmu_enable, leaving it up to the test that disabled the
MMU to do the cache maintenance in an asymmetrical fashion: only for
re-enabling the MMU, but not for disabling it.

3. It's missing the DMB SY memory barrier to ensure that the dcache
maintenance is performed after the last store executed in program order
before calling asm_mmu_disable.

Fix all of the issues in one go, by doing the cache maintenance only for
the stack, as that is out of the control of the C code, and add the missing
memory barrier. A test that disables the MMU is now expected to do whatever
cache maintenance is necessary to execute correctly after the MMU is
disabled.

The code used to test that mmu_disable works correctly is similar to the
code used to test commit 410b3bf09e76 ("arm/arm64: Perform dcache clean
+ invalidate after turning MMU off"), with extra cache maintenance
added:

+#include <alloc_page.h>
+#include <asm/cacheflush.h>
+#include <asm/mmu.h>
 int main(int argc, char **argv)
 {
+       int *x = alloc_page();
+       bool pass = true;
+       int i;
+
+       for (i = 0; i < 1000000; i++) {
+               *x = 0x42;
+               dcache_clean_addr_poc((unsigned long)x);
+               mmu_disable();
+               if (*x != 0x42) {
+                       mmu_enable(current_thread_info()->pgtable);
+                       pass = false;
+                       break;
+               }
+               *x = 0x50;
+               /* Needed for the invalidation only. */
+               dcache_clean_inval_addr_poc((unsigned long)x);
+               mmu_enable(current_thread_info()->pgtable);
+               if (*x != 0x50) {
+                       pass = false;
+                       break;
+               }
+       }
+       report(pass, "MMU disable cache maintenance");

Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
---
 arm/cstart.S   | 19 +++++++++++++------
 arm/cstart64.S | 19 ++++++++++++-------
 2 files changed, 25 insertions(+), 13 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 48dc87f5..3950e45e 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -240,18 +240,25 @@ asm_mmu_enable:
 
 .globl asm_mmu_disable
 asm_mmu_disable:
+	/*
+	 * A test can change the memory attributes for a memory location to
+	 * Device or Inner Non-cacheable, which makes the barrier required to
+	 * avoid reordering of previous memory accesses with respect to the
+	 * cache maintenance.
+	 */
+	dmb	sy
+	mov	r0, sp
+	lsr	r0, #THREAD_SHIFT
+	lsl	r0, #THREAD_SHIFT
+	add	r1, r0, #THREAD_SIZE
+	dcache_by_line_op dccmvac, sy, r0, r1, r2, r3
+
 	/* SCTLR */
 	mrc	p15, 0, r0, c1, c0, 0
 	bic	r0, #CR_M
 	mcr	p15, 0, r0, c1, c0, 0
 	isb
 
-	ldr	r0, =__phys_offset
-	ldr	r0, [r0]
-	ldr	r1, =__phys_end
-	ldr	r1, [r1]
-	dcache_by_line_op dccimvac, sy, r0, r1, r2, r3
-
 	mov     pc, lr
 
 /*
diff --git a/arm/cstart64.S b/arm/cstart64.S
index d8200ea2..af56186c 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -288,18 +288,23 @@ asm_mmu_enable:
 
 .globl asm_mmu_disable
 asm_mmu_disable:
+	/*
+	 * A test can change the memory attributes for a memory location to
+	 * Device or Inner Non-cacheable, which makes the barrier required to
+	 * avoid reordering of previous memory accesses with respect to the
+	 * cache maintenance.
+	 */
+	dmb	sy
+	mov	x9, sp
+	and	x9, x9, #THREAD_MASK
+	add	x10, x9, #THREAD_SIZE
+	dcache_by_line_op cvac, sy, x9, x10, x11, x12
+
 	mrs	x0, sctlr_el1
 	bic	x0, x0, SCTLR_EL1_M
 	msr	sctlr_el1, x0
 	isb
 
-	/* Clean + invalidate the entire memory */
-	adrp	x0, __phys_offset
-	ldr	x0, [x0, :lo12:__phys_offset]
-	adrp	x1, __phys_end
-	ldr	x1, [x1, :lo12:__phys_end]
-	dcache_by_line_op civac, sy, x0, x1, x2, x3
-
 	ret
 
 /*
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot
  2023-11-30  9:07 ` Shaoqin Huang
@ 2023-11-30 10:35   ` Alexandru Elisei
  -1 siblings, 0 replies; 38+ messages in thread
From: Alexandru Elisei @ 2023-11-30 10:35 UTC (permalink / raw)
  To: Shaoqin Huang
  Cc: Andrew Jones, kvmarm, David Woodhouse, Eric Auger, kvm,
	Laurent Vivier, linuxppc-dev, Nadav Amit, Nico Boehr,
	Nikos Nikoleris, Thomas Huth

Hi,

Thank you so much for reviving this, much appreciated.

I wanted to let you know that I definitely plan to review the series as
soon as possible; unfortunately, I don't believe I will be able to do that
for at least 2 weeks.

Thanks,
Alex

On Thu, Nov 30, 2023 at 04:07:02AM -0500, Shaoqin Huang wrote:
> Hi,
> 
> I'm posting Alexandru's patch set[1] rebased on the latest branch with the
> conflicts being resolved. No big changes compare to its original code.
> 
> As this version 1 of this series was posted one years ago, I would first let you
> recall it, what's the intention of this series and what this series do. You can
> view it by click the link[2] and view the cover-letter.
> 
> Since when writing the series[1], the efi support for arm64[3] hasn't been
> merged into the kvm-unit-tests, but now the efi support for arm64 has been
> merged. Directly rebase the series[1] onto the latest branch will break the efi
> tests. This is mainly because the Patch #15 ("arm/arm64: Enable the MMU early")
> moves the mmu_enable() out of the setup_mmu(), which causes the efi test will
> not enable the mmu. So I do a small change in the efi_mem_init() which makes the
> efi test also enable the MMU early, and make it works.
> 
> And another change should be noticed is in the Patch #17 ("arm/arm64: Perform
> dcache maintenance"). In the efi_mem_init(), it will disable the mmu, and build
> a new pagetable and re-enable the mmu, if the asm_mmu_disable clean and
> invalidate the data caches for entire memory, we don't need to care the dcache
> and after mmu disabled, we use the mmu_setup_early() to re-enable the mmu, which
> takes care all the cache maintenance. But the situation changes since the Patch
> #18 ("arm/arm64: Rework the cache maintenance in asm_mmu_disable") only clean
> and invalidate the data caches for the stack memory area. So we need to clean
> and invalidate the data caches manually before disable the mmu, I'm not
> confident about current cache maintenance at the efi setup patch, so I ask for
> your help to review it if it's right or not.
> 
> And I also drop one patch ("s390: Do not use the physical allocator") from[1]
> since this cause s390 test to fail.
> 
> This series may include bug, so I really appreciate your review to improve this
> series together.
> 
> You can get the code from:
> 
> $ git clone https://gitlab.com/shahuang/kvm-unit-tests.git \
> 	-b arm-arm64-rework-cache-maintenance-at-boot-v1
> 
> [1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/arm-arm64-rework-cache-maintenance-at-boot-v2-wip2
> [2] https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/
> [3] https://patchwork.kernel.org/project/kvm/cover/20230530160924.82158-1-nikos.nikoleris@arm.com/
> 
> Changelog:
> ----------
> RFC->v1:
>   - Gathered Reviewed-by tags.
>   - Various changes to commit messages and comments to hopefully make the code
>     easier to understand.
>   - Patches #8 ("lib/alloc_phys: Expand documentation with usage and limitations")
>     are new.
>   - Folded patch "arm: page.h: Add missing libcflat.h include" into #17
>     ("arm/arm64: Perform dcache maintenance at boot").
>   - Reordered the series to group patches that touch aproximately the same code
>     together - the patches that change the physical allocator are now first,
>     followed come the patches that change how the secondaries are brought online.
>   - Fixed several nasty bugs where the r4 register was being clobbered in the arm
>     assembly.
>   - Unmap the early UART address if the DTB address does not match the early
>     address.
>   - Added dcache maintenance when a page table is modified with the MMU disabled.
>   - Moved the cache maintenance when disabling the MMU to be executed before the
>     MMU is disabled.
>   - Rebase it on lasted branch which efi support has been merged.
>   - Make the efi test also enable MMU early.
>   - Add cache maintenance on efi setup path especially before mmu_disable.
> 
> RFC: https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/
> 
> Alexandru Elisei (18):
>   Makefile: Define __ASSEMBLY__ for assembly files
>   powerpc: Replace the physical allocator with the page allocator
>   lib/alloc_phys: Initialize align_min
>   lib/alloc_phys: Consolidate allocate functions into memalign_early()
>   lib/alloc_phys: Remove locking
>   lib/alloc_phys: Remove allocation accounting
>   lib/alloc_phys: Add callback to perform cache maintenance
>   lib/alloc_phys: Expand documentation with usage and limitations
>   arm/arm64: Zero secondary CPUs' stack
>   arm/arm64: Allocate secondaries' stack using the page allocator
>   arm/arm64: assembler.h: Replace size with end address for
>     dcache_by_line_op
>   arm/arm64: Add C functions for doing cache maintenance
>   arm/arm64: Configure secondaries' stack before enabling the MMU
>   arm/arm64: Use pgd_alloc() to allocate mmu_idmap
>   arm/arm64: Enable the MMU early
>   arm/arm64: Map the UART when creating the translation tables
>   arm/arm64: Perform dcache maintenance at boot
>   arm/arm64: Rework the cache maintenance in asm_mmu_disable
> 
>  Makefile                   |   5 +-
>  arm/Makefile.arm           |   4 +-
>  arm/Makefile.arm64         |   4 +-
>  arm/Makefile.common        |   6 +-
>  arm/cstart.S               |  71 +++++++++++++++------
>  arm/cstart64.S             |  76 +++++++++++++++++------
>  lib/alloc_phys.c           | 122 ++++++++++++-------------------------
>  lib/alloc_phys.h           |  28 ++++++---
>  lib/arm/asm/assembler.h    |  15 ++---
>  lib/arm/asm/cacheflush.h   |   1 +
>  lib/arm/asm/mmu-api.h      |   1 +
>  lib/arm/asm/mmu.h          |   6 --
>  lib/arm/asm/page.h         |   2 +
>  lib/arm/asm/pgtable.h      |  39 ++++++++++--
>  lib/arm/asm/thread_info.h  |   3 +-
>  lib/arm/cache.S            |  89 +++++++++++++++++++++++++++
>  lib/arm/io.c               |  31 ++++++++++
>  lib/arm/io.h               |   3 +
>  lib/arm/mmu.c              |  37 ++++++++---
>  lib/arm/processor.c        |   1 -
>  lib/arm/setup.c            |  82 +++++++++++++++++++++----
>  lib/arm/smp.c              |   5 ++
>  lib/arm64/asm/assembler.h  |  11 ++--
>  lib/arm64/asm/cacheflush.h |  37 +++++++++++
>  lib/arm64/asm/mmu.h        |   5 --
>  lib/arm64/asm/pgtable.h    |  50 +++++++++++++--
>  lib/arm64/cache.S          |  85 ++++++++++++++++++++++++++
>  lib/arm64/processor.c      |   1 -
>  lib/devicetree.c           |   2 +-
>  lib/powerpc/setup.c        |   9 ++-
>  powerpc/Makefile.common    |   1 +
>  powerpc/cstart64.S         |   1 -
>  powerpc/spapr_hcall.c      |   5 +-
>  33 files changed, 642 insertions(+), 196 deletions(-)
>  create mode 100644 lib/arm/asm/cacheflush.h
>  create mode 100644 lib/arm/cache.S
>  create mode 100644 lib/arm64/asm/cacheflush.h
>  create mode 100644 lib/arm64/cache.S
> 
> -- 
> 2.40.1
> 

* Re: [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot
@ 2023-11-30 10:35   ` Alexandru Elisei
  0 siblings, 0 replies; 38+ messages in thread
From: Alexandru Elisei @ 2023-11-30 10:35 UTC (permalink / raw)
  To: Shaoqin Huang
  Cc: Laurent Vivier, Thomas Huth, Nico Boehr, kvm, Andrew Jones,
	Nikos Nikoleris, Eric Auger, Nadav Amit, kvmarm, linuxppc-dev,
	David Woodhouse

Hi,

Thank you so much for reviving this, much appreciated.

I wanted to let you know that I definitely plan to review the series as
soon as possible; unfortunately, I won't be able to get to it
for at least 2 weeks.

Thanks,
Alex

On Thu, Nov 30, 2023 at 04:07:02AM -0500, Shaoqin Huang wrote:
> [...]

* Re: [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot
  2023-11-30 10:35   ` Alexandru Elisei
@ 2023-12-01  8:40     ` Shaoqin Huang
  -1 siblings, 0 replies; 38+ messages in thread
From: Shaoqin Huang @ 2023-12-01  8:40 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Andrew Jones, kvmarm, David Woodhouse, Eric Auger, kvm,
	Laurent Vivier, linuxppc-dev, Nadav Amit, Nico Boehr,
	Nikos Nikoleris, Thomas Huth

Hi Alexandru,

Just take your time. I also appreciate your work. :)

Thanks,
Shaoqin

On 11/30/23 18:35, Alexandru Elisei wrote:
> Hi,
> 
> Thank you so much for reviving this, much appreciated.
> 
> I wanted to let you know that I definitely plan to review the series as
> soon as possible; unfortunately, I won't be able to get to it
> for at least 2 weeks.
> 
> Thanks,
> Alex
> 
> On Thu, Nov 30, 2023 at 04:07:02AM -0500, Shaoqin Huang wrote:
>> [...]

-- 
Shaoqin


* Re: [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2023-11-30  9:07   ` Shaoqin Huang
@ 2024-01-15 12:44     ` Andrew Jones
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Jones @ 2024-01-15 12:44 UTC (permalink / raw)
  To: Shaoqin Huang
  Cc: kvmarm, Alexandru Elisei, Nikos Nikoleris, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

On Thu, Nov 30, 2023 at 04:07:03AM -0500, Shaoqin Huang wrote:
> From: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> whose functionality relies on the __ASSEMBLY__ preprocessor constant being
> correctly defined. So far, kvm-unit-tests has relied on
> the assembly files to define the constant before including any header
> files which depend on it.
> 
> Let's make sure that nobody gets this wrong and define it as a compiler
> constant when compiling assembly files. __ASSEMBLY__ is now defined for all
> .S files, even those that didn't set it explicitly before.
> 
> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
> ---
>  Makefile           | 5 ++++-
>  arm/cstart.S       | 1 -
>  arm/cstart64.S     | 1 -
>  powerpc/cstart64.S | 1 -
>  4 files changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/Makefile b/Makefile
> index 602910dd..27ed14e6 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
>  
>  autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
>  
> +AFLAGS  = $(CFLAGS)
> +AFLAGS += -D__ASSEMBLY__
> +
>  LDFLAGS += -nostdlib $(no_pie) -z noexecstack
>  
>  $(libcflat): $(cflatobjs)
> @@ -113,7 +116,7 @@ directories:
>  	@mkdir -p $(OBJDIRS)
>  
>  %.o: %.S
> -	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
> +	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<

I think we can drop the two hunks above from this patch and just rely on
the compiler to add __ASSEMBLY__ for us when compiling assembly files.

Thanks,
drew

>  
>  -include */.*.d */*/.*.d
>  
> diff --git a/arm/cstart.S b/arm/cstart.S
> index 3dd71ed9..b24ecabc 100644
> --- a/arm/cstart.S
> +++ b/arm/cstart.S
> @@ -5,7 +5,6 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> -#define __ASSEMBLY__
>  #include <auxinfo.h>
>  #include <asm/assembler.h>
>  #include <asm/thread_info.h>
> diff --git a/arm/cstart64.S b/arm/cstart64.S
> index bc2be45a..a8ad6dc8 100644
> --- a/arm/cstart64.S
> +++ b/arm/cstart64.S
> @@ -5,7 +5,6 @@
>   *
>   * This work is licensed under the terms of the GNU GPL, version 2.
>   */
> -#define __ASSEMBLY__
>  #include <auxinfo.h>
>  #include <asm/asm-offsets.h>
>  #include <asm/assembler.h>
> diff --git a/powerpc/cstart64.S b/powerpc/cstart64.S
> index 34e39341..fa32ef24 100644
> --- a/powerpc/cstart64.S
> +++ b/powerpc/cstart64.S
> @@ -5,7 +5,6 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> -#define __ASSEMBLY__
>  #include <asm/hcall.h>
>  #include <asm/ppc_asm.h>
>  #include <asm/rtas.h>
> -- 
> 2.40.1
> 

* Re: [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2024-01-15 12:44     ` Andrew Jones
@ 2024-02-15 16:05       ` Alexandru Elisei
  -1 siblings, 0 replies; 38+ messages in thread
From: Alexandru Elisei @ 2024-02-15 16:05 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Shaoqin Huang, kvmarm, Nikos Nikoleris, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

Hi Drew,

On Mon, Jan 15, 2024 at 01:44:17PM +0100, Andrew Jones wrote:
> On Thu, Nov 30, 2023 at 04:07:03AM -0500, Shaoqin Huang wrote:
> > From: Alexandru Elisei <alexandru.elisei@arm.com>
> > 
> > There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> > whose functionality relies on the __ASSEMBLY__ preprocessor constant being
> > correctly defined. So far, kvm-unit-tests has relied on
> > the assembly files to define the constant before including any header
> > files which depend on it.
> > 
> > Let's make sure that nobody gets this wrong and define it as a compiler
> > constant when compiling assembly files. __ASSEMBLY__ is now defined for all
> > .S files, even those that didn't set it explicitly before.
> > 
> > Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
> > ---
> >  Makefile           | 5 ++++-
> >  arm/cstart.S       | 1 -
> >  arm/cstart64.S     | 1 -
> >  powerpc/cstart64.S | 1 -
> >  4 files changed, 4 insertions(+), 4 deletions(-)
> > 
> > diff --git a/Makefile b/Makefile
> > index 602910dd..27ed14e6 100644
> > --- a/Makefile
> > +++ b/Makefile
> > @@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
> >  
> >  autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
> >  
> > +AFLAGS  = $(CFLAGS)
> > +AFLAGS += -D__ASSEMBLY__
> > +
> >  LDFLAGS += -nostdlib $(no_pie) -z noexecstack
> >  
> >  $(libcflat): $(cflatobjs)
> > @@ -113,7 +116,7 @@ directories:
> >  	@mkdir -p $(OBJDIRS)
> >  
> >  %.o: %.S
> > -	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
> > +	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<
> 
> I think we can drop the two hunks above from this patch and just rely on
> the compiler to add __ASSEMBLY__ for us when compiling assembly files.

I think the precompiler adds __ASSEMBLER__, not __ASSEMBLY__ [1]. Am I
missing something?

[1] https://gcc.gnu.org/onlinedocs/cpp/macros/predefined-macros.html#c.__ASSEMBLER__
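
A quick way to double-check what the preprocessor predefines for assembly
input (a sketch; the output below is trimmed to the relevant line):

$ echo | gcc -x assembler-with-cpp -dM -E - | grep ASSEMBLER
#define __ASSEMBLER__ 1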

Thanks,
Alex

> 
> Thanks,
> drew
> 
> >  
> >  -include */.*.d */*/.*.d
> >  
> > diff --git a/arm/cstart.S b/arm/cstart.S
> > index 3dd71ed9..b24ecabc 100644
> > --- a/arm/cstart.S
> > +++ b/arm/cstart.S
> > @@ -5,7 +5,6 @@
> >   *
> >   * This work is licensed under the terms of the GNU LGPL, version 2.
> >   */
> > -#define __ASSEMBLY__
> >  #include <auxinfo.h>
> >  #include <asm/assembler.h>
> >  #include <asm/thread_info.h>
> > diff --git a/arm/cstart64.S b/arm/cstart64.S
> > index bc2be45a..a8ad6dc8 100644
> > --- a/arm/cstart64.S
> > +++ b/arm/cstart64.S
> > @@ -5,7 +5,6 @@
> >   *
> >   * This work is licensed under the terms of the GNU GPL, version 2.
> >   */
> > -#define __ASSEMBLY__
> >  #include <auxinfo.h>
> >  #include <asm/asm-offsets.h>
> >  #include <asm/assembler.h>
> > diff --git a/powerpc/cstart64.S b/powerpc/cstart64.S
> > index 34e39341..fa32ef24 100644
> > --- a/powerpc/cstart64.S
> > +++ b/powerpc/cstart64.S
> > @@ -5,7 +5,6 @@
> >   *
> >   * This work is licensed under the terms of the GNU LGPL, version 2.
> >   */
> > -#define __ASSEMBLY__
> >  #include <asm/hcall.h>
> >  #include <asm/ppc_asm.h>
> >  #include <asm/rtas.h>
> > -- 
> > 2.40.1
> > 

* Re: [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2024-02-15 16:05       ` Alexandru Elisei
@ 2024-02-15 16:32         ` Andrew Jones
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Jones @ 2024-02-15 16:32 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Shaoqin Huang, kvmarm, Nikos Nikoleris, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

On Thu, Feb 15, 2024 at 04:05:56PM +0000, Alexandru Elisei wrote:
> Hi Drew,
> 
> On Mon, Jan 15, 2024 at 01:44:17PM +0100, Andrew Jones wrote:
> > On Thu, Nov 30, 2023 at 04:07:03AM -0500, Shaoqin Huang wrote:
> > > From: Alexandru Elisei <alexandru.elisei@arm.com>
> > > 
> > > There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> > > whose functionality relies on the __ASSEMBLY__ preprocessor constant being
> > > correctly defined. So far, kvm-unit-tests has relied on
> > > the assembly files to define the constant before including any header
> > > files which depend on it.
> > > 
> > > Let's make sure that nobody gets this wrong and define it as a compiler
> > > constant when compiling assembly files. __ASSEMBLY__ is now defined for all
> > > .S files, even those that didn't set it explicitly before.
> > > 
> > > Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> > > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
> > > ---
> > >  Makefile           | 5 ++++-
> > >  arm/cstart.S       | 1 -
> > >  arm/cstart64.S     | 1 -
> > >  powerpc/cstart64.S | 1 -
> > >  4 files changed, 4 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/Makefile b/Makefile
> > > index 602910dd..27ed14e6 100644
> > > --- a/Makefile
> > > +++ b/Makefile
> > > @@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
> > >  
> > >  autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
> > >  
> > > +AFLAGS  = $(CFLAGS)
> > > +AFLAGS += -D__ASSEMBLY__
> > > +
> > >  LDFLAGS += -nostdlib $(no_pie) -z noexecstack
> > >  
> > >  $(libcflat): $(cflatobjs)
> > > @@ -113,7 +116,7 @@ directories:
> > >  	@mkdir -p $(OBJDIRS)
> > >  
> > >  %.o: %.S
> > > -	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
> > > +	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<
> > 
> > I think we can drop the two hunks above from this patch and just rely on
> > the compiler to add __ASSEMBLY__ for us when compiling assembly files.
> 
> I think the precompiler adds __ASSEMBLER__, not __ASSEMBLY__ [1]. Am I
> missing something?
> 
> [1] https://gcc.gnu.org/onlinedocs/cpp/macros/predefined-macros.html#c.__ASSEMBLER__

You're right. I'm not opposed to changing all the __ASSEMBLY__ references
to __ASSEMBLER__. I'll try to do that at some point unless you beat me to
it.
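
If we ever do the rename and want to avoid touching every header at once,
a small compatibility shim would also work (a sketch, not code that exists
in the tree; the header name is made up):

	/* hypothetical lib/asm-compat.h, included before the other headers */
	#if defined(__ASSEMBLER__) && !defined(__ASSEMBLY__)
	#define __ASSEMBLY__
	#endif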

Thanks,
drew

* Re: [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2024-02-15 16:32         ` Andrew Jones
@ 2024-02-15 17:16           ` Alexandru Elisei
  -1 siblings, 0 replies; 38+ messages in thread
From: Alexandru Elisei @ 2024-02-15 17:16 UTC (permalink / raw)
  To: Andrew Jones
  Cc: Shaoqin Huang, kvmarm, Nikos Nikoleris, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

Hi Drew,

On Thu, Feb 15, 2024 at 05:32:22PM +0100, Andrew Jones wrote:
> On Thu, Feb 15, 2024 at 04:05:56PM +0000, Alexandru Elisei wrote:
> > Hi Drew,
> > 
> > On Mon, Jan 15, 2024 at 01:44:17PM +0100, Andrew Jones wrote:
> > > On Thu, Nov 30, 2023 at 04:07:03AM -0500, Shaoqin Huang wrote:
> > > > From: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > 
> > > > There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> > > > whose functionality relies on the __ASSEMBLY__ preprocessor constant being
> > > > correctly defined. So far, kvm-unit-tests has relied on
> > > > the assembly files to define the constant before including any header
> > > > files which depend on it.
> > > > 
> > > > Let's make sure that nobody gets this wrong and define it as a compiler
> > > > constant when compiling assembly files. __ASSEMBLY__ is now defined for all
> > > > .S files, even those that didn't set it explicitly before.
> > > > 
> > > > Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> > > > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> > > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
> > > > ---
> > > >  Makefile           | 5 ++++-
> > > >  arm/cstart.S       | 1 -
> > > >  arm/cstart64.S     | 1 -
> > > >  powerpc/cstart64.S | 1 -
> > > >  4 files changed, 4 insertions(+), 4 deletions(-)
> > > > 
> > > > diff --git a/Makefile b/Makefile
> > > > index 602910dd..27ed14e6 100644
> > > > --- a/Makefile
> > > > +++ b/Makefile
> > > > @@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
> > > >  
> > > >  autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
> > > >  
> > > > +AFLAGS  = $(CFLAGS)
> > > > +AFLAGS += -D__ASSEMBLY__
> > > > +
> > > >  LDFLAGS += -nostdlib $(no_pie) -z noexecstack
> > > >  
> > > >  $(libcflat): $(cflatobjs)
> > > > @@ -113,7 +116,7 @@ directories:
> > > >  	@mkdir -p $(OBJDIRS)
> > > >  
> > > >  %.o: %.S
> > > > -	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
> > > > +	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<
> > > 
> > > I think we can drop the two hunks above from this patch and just rely on
> > > the compiler to add __ASSEMBLY__ for us when compiling assembly files.
> > 
> > I think the precompiler adds __ASSEMBLER__, not __ASSEMBLY__ [1]. Am I
> > missing something?
> > 
> > [1] https://gcc.gnu.org/onlinedocs/cpp/macros/predefined-macros.html#c.__ASSEMBLER__
> 
> You're right. I'm not opposed to changing all the __ASSEMBLY__ references
> to __ASSEMBLER__. I'll try to do that at some point unless you beat me to
> it.

Actually, I quite prefer the Linux style of using __ASSEMBLY__ instead of
__ASSEMBLER__, because it makes reusing Linux files easier. That, and the
habit formed by staring at Linux assembly files.

Thanks,
Alex

* Re: [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files
  2024-02-15 17:16           ` Alexandru Elisei
@ 2024-02-15 19:13             ` Andrew Jones
  -1 siblings, 0 replies; 38+ messages in thread
From: Andrew Jones @ 2024-02-15 19:13 UTC (permalink / raw)
  To: Alexandru Elisei
  Cc: Shaoqin Huang, kvmarm, Nikos Nikoleris, Eric Auger,
	Laurent Vivier, Thomas Huth, Nico Boehr, David Woodhouse,
	Nadav Amit, kvm, linuxppc-dev

On Thu, Feb 15, 2024 at 05:16:01PM +0000, Alexandru Elisei wrote:
> Hi Drew,
> 
> On Thu, Feb 15, 2024 at 05:32:22PM +0100, Andrew Jones wrote:
> > On Thu, Feb 15, 2024 at 04:05:56PM +0000, Alexandru Elisei wrote:
> > > Hi Drew,
> > > 
> > > On Mon, Jan 15, 2024 at 01:44:17PM +0100, Andrew Jones wrote:
> > > > On Thu, Nov 30, 2023 at 04:07:03AM -0500, Shaoqin Huang wrote:
> > > > > From: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > > 
> > > > > There are 25 header files today (found with grep -r "#ifndef __ASSEMBLY__")
> > > > > whose functionality relies on the __ASSEMBLY__ preprocessor constant being
> > > > > correctly defined. So far, kvm-unit-tests has relied on
> > > > > the assembly files to define the constant before including any header
> > > > > files which depend on it.
> > > > > 
> > > > > Let's make sure that nobody gets this wrong and define it as a compiler
> > > > > constant when compiling assembly files. __ASSEMBLY__ is now defined for all
> > > > > .S files, even those that didn't set it explicitly before.
> > > > > 
> > > > > Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> > > > > Reviewed-by: Andrew Jones <andrew.jones@linux.dev>
> > > > > Signed-off-by: Alexandru Elisei <alexandru.elisei@arm.com>
> > > > > Signed-off-by: Shaoqin Huang <shahuang@redhat.com>
> > > > > ---
> > > > >  Makefile           | 5 ++++-
> > > > >  arm/cstart.S       | 1 -
> > > > >  arm/cstart64.S     | 1 -
> > > > >  powerpc/cstart64.S | 1 -
> > > > >  4 files changed, 4 insertions(+), 4 deletions(-)
> > > > > 
> > > > > diff --git a/Makefile b/Makefile
> > > > > index 602910dd..27ed14e6 100644
> > > > > --- a/Makefile
> > > > > +++ b/Makefile
> > > > > @@ -92,6 +92,9 @@ CFLAGS += -Woverride-init -Wmissing-prototypes -Wstrict-prototypes
> > > > >  
> > > > >  autodepend-flags = -MMD -MP -MF $(dir $*).$(notdir $*).d
> > > > >  
> > > > > +AFLAGS  = $(CFLAGS)
> > > > > +AFLAGS += -D__ASSEMBLY__
> > > > > +
> > > > >  LDFLAGS += -nostdlib $(no_pie) -z noexecstack
> > > > >  
> > > > >  $(libcflat): $(cflatobjs)
> > > > > @@ -113,7 +116,7 @@ directories:
> > > > >  	@mkdir -p $(OBJDIRS)
> > > > >  
> > > > >  %.o: %.S
> > > > > -	$(CC) $(CFLAGS) -c -nostdlib -o $@ $<
> > > > > +	$(CC) $(AFLAGS) -c -nostdlib -o $@ $<
> > > > 
> > > > I think we can drop the two hunks above from this patch and just rely on
> > > > the compiler to add __ASSEMBLY__ for us when compiling assembly files.
> > > 
> > > I think the precompiler adds __ASSEMBLER__, not __ASSEMBLY__ [1]. Am I
> > > missing something?
> > > 
> > > [1] https://gcc.gnu.org/onlinedocs/cpp/macros/predefined-macros.html#c.__ASSEMBLER__
> > 
> > You're right. I'm not opposed to changing all the __ASSEMBLY__ references
> > to __ASSEMBLER__. I'll try to do that at some point unless you beat me to
> > it.
> 
> Actually, I quite prefer the Linux style of using __ASSEMBLY__ instead of
> __ASSEMBLER__, because it makes reusing Linux files easier. That, and the
> habit formed by staring at Linux assembly files.

Those are good arguments, and it also saves churn. OK, let's keep this
patch and __ASSEMBLY__.

Thanks,
drew

* Re: [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot
  2023-11-30  9:07 ` Shaoqin Huang
@ 2024-02-16 15:47   ` Alexandru Elisei
  -1 siblings, 0 replies; 38+ messages in thread
From: Alexandru Elisei @ 2024-02-16 15:47 UTC (permalink / raw)
  To: Shaoqin Huang
  Cc: Andrew Jones, kvmarm, David Woodhouse, Eric Auger, kvm,
	Laurent Vivier, linuxppc-dev, Nadav Amit, Nico Boehr,
	Nikos Nikoleris, Thomas Huth

Hi,

On Thu, Nov 30, 2023 at 04:07:02AM -0500, Shaoqin Huang wrote:
> Hi,
> 
> I'm posting Alexandru's patch set[1] rebased on the latest branch with the
> conflicts resolved. There are no big changes compared to the original code.
> 
> As version 1 of this series was posted one year ago, let me first recall
> what the intention of this series is and what it does. You can find the
> details in the cover letter at link[2].
> 
> When the series[1] was written, the EFI support for arm64[3] had not yet
> been merged into kvm-unit-tests, but it has been merged since then. Directly
> rebasing the series[1] onto the latest branch would break the EFI tests.
> This is mainly because Patch #15 ("arm/arm64: Enable the MMU early") moves
> mmu_enable() out of setup_mmu(), which causes the EFI tests to never enable
> the MMU. So I made a small change in efi_mem_init() that makes the EFI tests
> also enable the MMU early, which makes them work.
> 
> Another change worth noting is in Patch #17 ("arm/arm64: Perform dcache
> maintenance"). efi_mem_init() disables the MMU, builds a new page table, and
> re-enables the MMU. If asm_mmu_disable cleaned and invalidated the data
> caches for the entire memory, we would not need to care about the dcache:
> after the MMU is disabled, mmu_setup_early() re-enables it and takes care
> of all the cache maintenance. But the situation changes with Patch #18
> ("arm/arm64: Rework the cache maintenance in asm_mmu_disable"), which only
> cleans and invalidates the data caches for the stack memory area. So we need
> to clean and invalidate the data caches manually before disabling the MMU.
> I'm not confident about the current cache maintenance on the EFI setup path,
> so I'm asking for your help to review whether it's right.
> 
> I also dropped one patch ("s390: Do not use the physical allocator") from
> [1], since it causes s390 tests to fail.

This is unfortunate. What tests do you see failing?

I wrote the s390x patch so I could justify dropping the locking in the
physical allocator. And I wanted to drop the locking so I wouldn't have to
do maintenance operations on the spinlock.

Because of how kvm-unit-tests implements spinlocks for arm/arm64, they
don't protect against concurrent accesses when the MMU is turned off. And
because arm and arm64 use the physical allocator during the test boot
sequence, not during a test, using a spin lock is also useless since there
will be no concurrent accesses (the boot phase is done on a single CPU).

But since replacing the physical allocator causes test failures for s390x,
it looks like the physical allocator will still be needed for tests, and
that requires keeping the spinlock.

I guess the best approach would be to teach the physical allocator to do
cache maintenance on the spinlock. We might as well, since the UART needs
it too, and I don't think this series addresses that.

Thanks,
Alex
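
To make that suggestion concrete, here is a minimal sketch of such a hook.
The names, the spinlock layout, and the registration function are all
hypothetical, and the actual callback added in patch #7 may look different:

/* Hypothetical cache maintenance hook for the physical allocator's
 * spinlock. With the MMU off, a CPU accesses memory non-cacheably, so
 * the lock word has to be cleaned and invalidated to the point of
 * coherency before it can be shared with CPUs running cached. */
#include <stddef.h>

struct spinlock {
	unsigned int v;
};

static void (*phys_cache_maint)(const void *addr, size_t size);

/* registered by the architecture, e.g. a dc civac loop on arm64 */
void phys_alloc_set_cache_maint(void (*fn)(const void *addr, size_t size))
{
	phys_cache_maint = fn;
}

static void phys_lock_maint(struct spinlock *lock)
{
	if (phys_cache_maint)
		phys_cache_maint(lock, sizeof(*lock));
}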

> 
> This series may include bugs, so I would really appreciate your review to
> improve it together.
> 
> You can get the code from:
> 
> $ git clone https://gitlab.com/shahuang/kvm-unit-tests.git \
> 	-b arm-arm64-rework-cache-maintenance-at-boot-v1
> 
> [1] https://gitlab.arm.com/linux-arm/kvm-unit-tests-ae/-/tree/arm-arm64-rework-cache-maintenance-at-boot-v2-wip2
> [2] https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/
> [3] https://patchwork.kernel.org/project/kvm/cover/20230530160924.82158-1-nikos.nikoleris@arm.com/
> 
> Changelog:
> ----------
> RFC->v1:
>   - Gathered Reviewed-by tags.
>   - Various changes to commit messages and comments to hopefully make the code
>     easier to understand.
>   - Patch #8 ("lib/alloc_phys: Expand documentation with usage and limitations")
>     is new.
>   - Folded patch "arm: page.h: Add missing libcflat.h include" into #17
>     ("arm/arm64: Perform dcache maintenance at boot").
>   - Reordered the series to group patches that touch approximately the same
>     code together - the patches that change the physical allocator are now
>     first, followed by the patches that change how the secondaries are
>     brought online.
>   - Fixed several nasty bugs where the r4 register was being clobbered in the arm
>     assembly.
>   - Unmap the early UART address if the DTB address does not match the early
>     address.
>   - Added dcache maintenance when a page table is modified with the MMU disabled.
>   - Moved the cache maintenance when disabling the MMU to be executed before the
>     MMU is disabled.
>   - Rebase it on the latest branch, which has the EFI support merged.
>   - Make the EFI tests also enable the MMU early.
>   - Add cache maintenance on the EFI setup path, especially before
>     mmu_disable.
> 
> RFC: https://lore.kernel.org/all/20220809091558.14379-1-alexandru.elisei@arm.com/
> 
> Alexandru Elisei (18):
>   Makefile: Define __ASSEMBLY__ for assembly files
>   powerpc: Replace the physical allocator with the page allocator
>   lib/alloc_phys: Initialize align_min
>   lib/alloc_phys: Consolidate allocate functions into memalign_early()
>   lib/alloc_phys: Remove locking
>   lib/alloc_phys: Remove allocation accounting
>   lib/alloc_phys: Add callback to perform cache maintenance
>   lib/alloc_phys: Expand documentation with usage and limitations
>   arm/arm64: Zero secondary CPUs' stack
>   arm/arm64: Allocate secondaries' stack using the page allocator
>   arm/arm64: assembler.h: Replace size with end address for
>     dcache_by_line_op
>   arm/arm64: Add C functions for doing cache maintenance
>   arm/arm64: Configure secondaries' stack before enabling the MMU
>   arm/arm64: Use pgd_alloc() to allocate mmu_idmap
>   arm/arm64: Enable the MMU early
>   arm/arm64: Map the UART when creating the translation tables
>   arm/arm64: Perform dcache maintenance at boot
>   arm/arm64: Rework the cache maintenance in asm_mmu_disable
> 
>  Makefile                   |   5 +-
>  arm/Makefile.arm           |   4 +-
>  arm/Makefile.arm64         |   4 +-
>  arm/Makefile.common        |   6 +-
>  arm/cstart.S               |  71 +++++++++++++++------
>  arm/cstart64.S             |  76 +++++++++++++++++------
>  lib/alloc_phys.c           | 122 ++++++++++++-------------------------
>  lib/alloc_phys.h           |  28 ++++++---
>  lib/arm/asm/assembler.h    |  15 ++---
>  lib/arm/asm/cacheflush.h   |   1 +
>  lib/arm/asm/mmu-api.h      |   1 +
>  lib/arm/asm/mmu.h          |   6 --
>  lib/arm/asm/page.h         |   2 +
>  lib/arm/asm/pgtable.h      |  39 ++++++++++--
>  lib/arm/asm/thread_info.h  |   3 +-
>  lib/arm/cache.S            |  89 +++++++++++++++++++++++++++
>  lib/arm/io.c               |  31 ++++++++++
>  lib/arm/io.h               |   3 +
>  lib/arm/mmu.c              |  37 ++++++++---
>  lib/arm/processor.c        |   1 -
>  lib/arm/setup.c            |  82 +++++++++++++++++++++----
>  lib/arm/smp.c              |   5 ++
>  lib/arm64/asm/assembler.h  |  11 ++--
>  lib/arm64/asm/cacheflush.h |  37 +++++++++++
>  lib/arm64/asm/mmu.h        |   5 --
>  lib/arm64/asm/pgtable.h    |  50 +++++++++++++--
>  lib/arm64/cache.S          |  85 ++++++++++++++++++++++++++
>  lib/arm64/processor.c      |   1 -
>  lib/devicetree.c           |   2 +-
>  lib/powerpc/setup.c        |   9 ++-
>  powerpc/Makefile.common    |   1 +
>  powerpc/cstart64.S         |   1 -
>  powerpc/spapr_hcall.c      |   5 +-
>  33 files changed, 642 insertions(+), 196 deletions(-)
>  create mode 100644 lib/arm/asm/cacheflush.h
>  create mode 100644 lib/arm/cache.S
>  create mode 100644 lib/arm64/asm/cacheflush.h
>  create mode 100644 lib/arm64/cache.S
> 
> -- 
> 2.40.1
> 

^ permalink raw reply	[flat|nested] 38+ messages in thread
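
For context on the clean+invalidate that the cover letter wants before
asm_mmu_disable: by-VA maintenance to the point of coherency on arm64
conventionally looks like the sketch below. This is a generic illustration
using GCC inline assembly, not the helpers the series adds in
lib/arm64/cache.S:

/* Clean+invalidate the data cache for [start, end) to the PoC, so the
 * contents are visible to non-cacheable accesses once the MMU is off. */
static inline void dcache_clean_inval_poc(unsigned long start, unsigned long end)
{
	unsigned long ctr, line;

	asm volatile("mrs %0, ctr_el0" : "=r"(ctr));
	/* CTR_EL0.DminLine (bits [19:16]): log2 of words per D-cache line */
	line = 4UL << ((ctr >> 16) & 0xf);
	for (start &= ~(line - 1); start < end; start += line)
		asm volatile("dc civac, %0" :: "r"(start) : "memory");
	asm volatile("dsb sy" ::: "memory");	/* complete before MMU off */
}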

end of thread, other threads:[~2024-02-16 15:48 UTC | newest]

Thread overview: 38+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-11-30  9:07 [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot Shaoqin Huang
2023-11-30  9:07 ` Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 01/18] Makefile: Define __ASSEMBLY__ for assembly files Shaoqin Huang
2023-11-30  9:07   ` Shaoqin Huang
2024-01-15 12:44   ` Andrew Jones
2024-01-15 12:44     ` Andrew Jones
2024-02-15 16:05     ` Alexandru Elisei
2024-02-15 16:05       ` Alexandru Elisei
2024-02-15 16:32       ` Andrew Jones
2024-02-15 16:32         ` Andrew Jones
2024-02-15 17:16         ` Alexandru Elisei
2024-02-15 17:16           ` Alexandru Elisei
2024-02-15 19:13           ` Andrew Jones
2024-02-15 19:13             ` Andrew Jones
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 02/18] powerpc: Replace the physical allocator with the page allocator Shaoqin Huang
2023-11-30  9:07   ` Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 03/18] lib/alloc_phys: Initialize align_min Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 04/18] lib/alloc_phys: Consolidate allocate functions into memalign_early() Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 05/18] lib/alloc_phys: Remove locking Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 06/18] lib/alloc_phys: Remove allocation accounting Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 07/18] lib/alloc_phys: Add callback to perform cache maintenance Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 08/18] lib/alloc_phys: Expand documentation with usage and limitations Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 09/18] arm/arm64: Zero secondary CPUs' stack Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 10/18] arm/arm64: Allocate secondaries' stack using the page allocator Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 11/18] arm/arm64: assembler.h: Replace size with end address for dcache_by_line_op Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 12/18] arm/arm64: Add C functions for doing cache maintenance Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 13/18] arm/arm64: Configure secondaries' stack before enabling the MMU Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 14/18] arm/arm64: Use pgd_alloc() to allocate mmu_idmap Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 15/18] arm/arm64: Enable the MMU early Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 16/18] arm/arm64: Map the UART when creating the translation tables Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 17/18] arm/arm64: Perform dcache maintenance at boot Shaoqin Huang
2023-11-30  9:07 ` [kvm-unit-tests PATCH v1 18/18] arm/arm64: Rework the cache maintenance in asm_mmu_disable Shaoqin Huang
2023-11-30 10:35 ` [kvm-unit-tests PATCH v1 00/18] arm/arm64: Rework cache maintenance at boot Alexandru Elisei
2023-11-30 10:35   ` Alexandru Elisei
2023-12-01  8:40   ` Shaoqin Huang
2023-12-01  8:40     ` Shaoqin Huang
2024-02-16 15:47 ` Alexandru Elisei
2024-02-16 15:47   ` Alexandru Elisei
