kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi
@ 2021-04-29 16:41 Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 1/8] arm/arm64: Reorganize cstart assembler Andrew Jones
                   ` (7 more replies)
  0 siblings, 8 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

v3:
 - Picked up Alex's r-b on patch 1 and 5
 - Removed useless cast in pci_host_bridge_get_paddr [Andre]
 - Added XN bits when mapping I/O pages [Alex]
 - Tweaks to patch 6 suggested by Alex and myself
 - Improved patch 8 commit message and changed psci-invoke's input
   and output parameter types

v2:
 - Addressed all comments from Nikos and Alex
 - The biggest changes are
   * dropping the weird persistent map stuff that I never liked by taking
     Alex's suggestion to just create the idmap early
   * adding mem_region_add() to clean up memory region adding code, also
     improved the assignment of region fields
 - Also, while we found that we still have a memory map assumption
   (3G-4G reserved for virtual memory allocation), I only make that
   assumption clear. I've left removing it for an additional patch
   for another day.
 - Added psci_invoke_none() to use prior to the PSCI method being set
 - Added r-b's for each patch given, unless the commit changed too much
 - Didn't take Alex's suggestion to use x5 for stacktop when calling
   setup from start. I prefer explicitly loading it again.


This series is a collection of patches derived from [1] that pave the
way for new targets, e.g. target-efi[2]. These patches mostly address
the elimination of memory map assumptions and they shouldn't have any
functional changes. The last two patches are a couple of patches not
related to the memory map, but they also prepare for bare metal targets.
I tossed them in since I don't think they should be too controversial.
This patch series is also available here [3].

[1] https://github.com/rhdrjones/kvm-unit-tests/commits/target-efi
[2] https://www.youtube.com/watch?v=kvaufVrL0J0
[3] https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/efiprep

Thanks,
drew


Andrew Jones (8):
  arm/arm64: Reorganize cstart assembler
  arm/arm64: Move setup_vm into setup
  pci-testdev: ioremap regions
  arm/arm64: mmu: Stop mapping an assumed IO region
  arm/arm64: mmu: Remove memory layout assumptions
  arm/arm64: setup: Consolidate memory layout assumptions
  chr-testdev: Silently fail init
  arm/arm64: psci: Don't assume method is hvc

 arm/cstart.S                |  92 ++++++++++++-------
 arm/cstart64.S              |  45 ++++++---
 arm/flat.lds                |  23 +++++
 arm/selftest.c              |  34 ++-----
 lib/arm/asm/io.h            |   6 ++
 lib/arm/asm/mmu.h           |   3 +
 lib/arm/asm/page.h          |   2 +
 lib/arm/asm/pgtable-hwdef.h |   1 +
 lib/arm/asm/psci.h          |  10 +-
 lib/arm/asm/setup.h         |   7 +-
 lib/arm/mmu.c               |  53 +++++++----
 lib/arm/psci.c              |  35 +++++--
 lib/arm/setup.c             | 177 +++++++++++++++++++++---------------
 lib/arm64/asm/io.h          |   6 ++
 lib/arm64/asm/mmu.h         |   1 +
 lib/arm64/asm/page.h        |   2 +
 lib/chr-testdev.c           |   5 +-
 lib/pci-host-generic.c      |   5 +-
 lib/pci-host-generic.h      |   4 +-
 lib/pci-testdev.c           |   4 +
 20 files changed, 335 insertions(+), 180 deletions(-)

-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 1/8] arm/arm64: Reorganize cstart assembler
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 2/8] arm/arm64: Move setup_vm into setup Andrew Jones
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

Move secondary_entry helper functions out of .init and into .text,
since secondary_entry isn't run at at "init" time. Actually, anything
that is used after init time should be in .text, as we may not include
.init in some build configurations.

Reviewed-by Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arm/cstart.S   | 66 +++++++++++++++++++++++++++++---------------------
 arm/cstart64.S | 18 ++++++++------
 2 files changed, 49 insertions(+), 35 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index d88a98362940..b2c0ba061cd5 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -96,32 +96,7 @@ start:
 	bl	exit
 	b	halt
 
-
-.macro set_mode_stack mode, stack
-	add	\stack, #S_FRAME_SIZE
-	msr	cpsr_c, #(\mode | PSR_I_BIT | PSR_F_BIT)
-	isb
-	mov	sp, \stack
-.endm
-
-exceptions_init:
-	mrc	p15, 0, r2, c1, c0, 0	@ read SCTLR
-	bic	r2, #CR_V		@ SCTLR.V := 0
-	mcr	p15, 0, r2, c1, c0, 0	@ write SCTLR
-	ldr	r2, =vector_table
-	mcr	p15, 0, r2, c12, c0, 0	@ write VBAR
-
-	mrs	r2, cpsr
-
-	/* first frame reserved for svc mode */
-	set_mode_stack	UND_MODE, r0
-	set_mode_stack	ABT_MODE, r0
-	set_mode_stack	IRQ_MODE, r0
-	set_mode_stack	FIQ_MODE, r0
-
-	msr	cpsr_cxsf, r2		@ back to svc mode
-	isb
-	mov	pc, lr
+.text
 
 enable_vfp:
 	/* Enable full access to CP10 and CP11: */
@@ -133,8 +108,6 @@ enable_vfp:
 	vmsr	fpexc, r0
 	mov	pc, lr
 
-.text
-
 .global get_mmu_off
 get_mmu_off:
 	ldr	r0, =auxinfo
@@ -235,6 +208,43 @@ asm_mmu_disable:
 
 	mov     pc, lr
 
+/*
+ * Vectors
+ */
+
+.macro set_mode_stack mode, stack
+	add	\stack, #S_FRAME_SIZE
+	msr	cpsr_c, #(\mode | PSR_I_BIT | PSR_F_BIT)
+	isb
+	mov	sp, \stack
+.endm
+
+/*
+ * exceptions_init
+ *
+ * Input r0 is the stack top, which is the exception stacks base
+ */
+exceptions_init:
+	mrc	p15, 0, r2, c1, c0, 0	@ read SCTLR
+	bic	r2, #CR_V		@ SCTLR.V := 0
+	mcr	p15, 0, r2, c1, c0, 0	@ write SCTLR
+	ldr	r2, =vector_table
+	mcr	p15, 0, r2, c12, c0, 0	@ write VBAR
+
+	mrs	r2, cpsr
+
+	/*
+	 * The first frame is reserved for svc mode
+	 */
+	set_mode_stack	UND_MODE, r0
+	set_mode_stack	ABT_MODE, r0
+	set_mode_stack	IRQ_MODE, r0
+	set_mode_stack	FIQ_MODE, r0
+
+	msr	cpsr_cxsf, r2		@ back to svc mode
+	isb
+	mov	pc, lr
+
 /*
  * Vector stubs
  * Simplified version of the Linux kernel implementation
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 0a85338bcdae..7963e1fea979 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -109,13 +109,6 @@ start:
 	bl	exit
 	b	halt
 
-exceptions_init:
-	adrp	x4, vector_table
-	add	x4, x4, :lo12:vector_table
-	msr	vbar_el1, x4
-	isb
-	ret
-
 .text
 
 .globl get_mmu_off
@@ -251,6 +244,17 @@ asm_mmu_disable:
 
 /*
  * Vectors
+ */
+
+exceptions_init:
+	adrp	x4, vector_table
+	add	x4, x4, :lo12:vector_table
+	msr	vbar_el1, x4
+	isb
+	ret
+
+/*
+ * Vector stubs
  * Adapted from arch/arm64/kernel/entry.S
  */
 .macro vector_stub, name, vec
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 2/8] arm/arm64: Move setup_vm into setup
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 1/8] arm/arm64: Reorganize cstart assembler Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 3/8] pci-testdev: ioremap regions Andrew Jones
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

Consolidate our setup calls to reduce the amount we need to do from
init::start. Also remove a couple of pointless comments from setup().

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arm/cstart.S    | 6 ------
 arm/cstart64.S  | 5 -----
 lib/arm/setup.c | 7 +++++--
 3 files changed, 5 insertions(+), 13 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index b2c0ba061cd5..bf3c78157e6a 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -81,12 +81,7 @@ start:
 	/* complete setup */
 	pop	{r0-r1}
 	bl	setup
-	bl	get_mmu_off
-	cmp	r0, #0
-	bne	1f
-	bl	setup_vm
 
-1:
 	/* run the test */
 	ldr	r0, =__argc
 	ldr	r0, [r0]
@@ -108,7 +103,6 @@ enable_vfp:
 	vmsr	fpexc, r0
 	mov	pc, lr
 
-.global get_mmu_off
 get_mmu_off:
 	ldr	r0, =auxinfo
 	ldr	r0, [r0, #4]
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 7963e1fea979..27251fe8b5cd 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -93,11 +93,7 @@ start:
 
 	/* complete setup */
 	bl	setup				// x0 is the addr of the dtb
-	bl	get_mmu_off
-	cbnz	x0, 1f
-	bl	setup_vm
 
-1:
 	/* run the test */
 	adrp	x0, __argc
 	ldr	w0, [x0, :lo12:__argc]
@@ -111,7 +107,6 @@ start:
 
 .text
 
-.globl get_mmu_off
 get_mmu_off:
 	adrp	x0, auxinfo
 	ldr	x0, [x0, :lo12:auxinfo + 8]
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 751ba980000a..9c16f6004e9f 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -16,6 +16,8 @@
 #include <alloc.h>
 #include <alloc_phys.h>
 #include <alloc_page.h>
+#include <vmalloc.h>
+#include <auxinfo.h>
 #include <argv.h>
 #include <asm/thread_info.h>
 #include <asm/setup.h>
@@ -233,7 +235,6 @@ void setup(const void *fdt)
 		freemem += initrd_size;
 	}
 
-	/* call init functions */
 	mem_init(PAGE_ALIGN((unsigned long)freemem));
 	cpu_init();
 
@@ -243,7 +244,6 @@ void setup(const void *fdt)
 	/* mem_init must be called before io_init */
 	io_init();
 
-	/* finish setup */
 	timer_save_state();
 
 	ret = dt_get_bootargs(&bootargs);
@@ -256,4 +256,7 @@ void setup(const void *fdt)
 		memcpy(env, initrd, initrd_size);
 		setup_env(env, initrd_size);
 	}
+
+	if (!(auxinfo.flags & AUXINFO_MMU_OFF))
+		setup_vm();
 }
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 3/8] pci-testdev: ioremap regions
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 1/8] arm/arm64: Reorganize cstart assembler Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 2/8] arm/arm64: Move setup_vm into setup Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region Andrew Jones
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

Don't assume the physical addresses used with PCI have already been
identity mapped.

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 lib/pci-host-generic.c | 5 ++---
 lib/pci-host-generic.h | 4 ++--
 lib/pci-testdev.c      | 4 ++++
 3 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/lib/pci-host-generic.c b/lib/pci-host-generic.c
index 818150dc0a66..c5e3db9b96f4 100644
--- a/lib/pci-host-generic.c
+++ b/lib/pci-host-generic.c
@@ -122,7 +122,7 @@ static struct pci_host_bridge *pci_dt_probe(void)
 		      sizeof(host->addr_space[0]) * nr_addr_spaces);
 	assert(host != NULL);
 
-	host->start		= base.addr;
+	host->start		= ioremap(base.addr, base.size);
 	host->size		= base.size;
 	host->bus		= bus;
 	host->bus_max		= bus_max;
@@ -279,8 +279,7 @@ phys_addr_t pci_host_bridge_get_paddr(u64 pci_addr)
 
 static void __iomem *pci_get_dev_conf(struct pci_host_bridge *host, int devfn)
 {
-	return (void __iomem *)(unsigned long)
-		host->start + (devfn << PCI_ECAM_DEVFN_SHIFT);
+	return host->start + (devfn << PCI_ECAM_DEVFN_SHIFT);
 }
 
 u8 pci_config_readb(pcidevaddr_t dev, u8 off)
diff --git a/lib/pci-host-generic.h b/lib/pci-host-generic.h
index fd30e7c74ed8..0ffe6380ec8f 100644
--- a/lib/pci-host-generic.h
+++ b/lib/pci-host-generic.h
@@ -18,8 +18,8 @@ struct pci_addr_space {
 };
 
 struct pci_host_bridge {
-	phys_addr_t		start;
-	phys_addr_t		size;
+	void __iomem		*start;
+	size_t			size;
 	int			bus;
 	int			bus_max;
 	int			nr_addr_spaces;
diff --git a/lib/pci-testdev.c b/lib/pci-testdev.c
index 039bb44781c1..4f2e5663b2d6 100644
--- a/lib/pci-testdev.c
+++ b/lib/pci-testdev.c
@@ -185,7 +185,11 @@ int pci_testdev(void)
 	mem = ioremap(addr, PAGE_SIZE);
 
 	addr = pci_bar_get_addr(&pci_dev, 1);
+#if defined(__i386__) || defined(__x86_64__)
 	io = (void *)(unsigned long)addr;
+#else
+	io = ioremap(addr, PAGE_SIZE);
+#endif
 
 	nr_tests += pci_testdev_all(mem, &pci_testdev_mem_ops);
 	nr_tests += pci_testdev_all(io, &pci_testdev_io_ops);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
                   ` (2 preceding siblings ...)
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 3/8] pci-testdev: ioremap regions Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-05-10 15:45   ` Alexandru Elisei
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 5/8] arm/arm64: mmu: Remove memory layout assumptions Andrew Jones
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

By providing a proper ioremap function, we can just rely on devices
calling it for each region they need (as they already do) instead of
mapping a big assumed I/O range. We don't require the MMU to be
enabled at the time of the ioremap. In that case, we add the mapping
to the identity map anyway. This allows us to call setup_vm after
io_init. Why don't we just call setup_vm before io_init, I hear you
ask? Well, that's because tests like sieve want to start with the MMU
off, later call setup_vm, and all the while have working I/O. Some
unit tests are just really demanding...

While at it, ensure we map the I/O regions with XN (execute never),
as suggested by Alexandru Elisei.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 lib/arm/asm/io.h            |  6 ++++++
 lib/arm/asm/mmu.h           |  3 +++
 lib/arm/asm/page.h          |  2 ++
 lib/arm/asm/pgtable-hwdef.h |  1 +
 lib/arm/mmu.c               | 37 +++++++++++++++++++++++++++----------
 lib/arm64/asm/io.h          |  6 ++++++
 lib/arm64/asm/mmu.h         |  1 +
 lib/arm64/asm/page.h        |  2 ++
 8 files changed, 48 insertions(+), 10 deletions(-)

diff --git a/lib/arm/asm/io.h b/lib/arm/asm/io.h
index ba3b0b2412ad..e4caa6ff5d1e 100644
--- a/lib/arm/asm/io.h
+++ b/lib/arm/asm/io.h
@@ -77,6 +77,12 @@ static inline void __raw_writel(u32 val, volatile void __iomem *addr)
 		     : "r" (val));
 }
 
+#define ioremap ioremap
+static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
+{
+	return __ioremap(phys_addr, size);
+}
+
 #define virt_to_phys virt_to_phys
 static inline phys_addr_t virt_to_phys(const volatile void *x)
 {
diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
index 122874b8aebe..94e70f0a84bf 100644
--- a/lib/arm/asm/mmu.h
+++ b/lib/arm/asm/mmu.h
@@ -8,10 +8,13 @@
 #include <asm/barrier.h>
 
 #define PTE_USER		L_PTE_USER
+#define PTE_UXN			L_PTE_XN
+#define PTE_PXN			L_PTE_PXN
 #define PTE_RDONLY		PTE_AP2
 #define PTE_SHARED		L_PTE_SHARED
 #define PTE_AF			PTE_EXT_AF
 #define PTE_WBWA		L_PTE_MT_WRITEALLOC
+#define PTE_UNCACHED		L_PTE_MT_UNCACHED
 
 /* See B3.18.7 TLB maintenance operations */
 
diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h
index 1fb5cd26ac66..8eb4a883808e 100644
--- a/lib/arm/asm/page.h
+++ b/lib/arm/asm/page.h
@@ -47,5 +47,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
 extern phys_addr_t __virt_to_phys(unsigned long addr);
 extern unsigned long __phys_to_virt(phys_addr_t addr);
 
+extern void *__ioremap(phys_addr_t phys_addr, size_t size);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASMARM_PAGE_H_ */
diff --git a/lib/arm/asm/pgtable-hwdef.h b/lib/arm/asm/pgtable-hwdef.h
index fe1d8540ea3f..90fd306c7cc0 100644
--- a/lib/arm/asm/pgtable-hwdef.h
+++ b/lib/arm/asm/pgtable-hwdef.h
@@ -34,6 +34,7 @@
 #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
 #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
 #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
+#define L_PTE_PXN		(_AT(pteval_t, 1) << 53)	/* PXN */
 #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
 
 /*
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 15eef007f256..791b1f88f946 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -11,6 +11,7 @@
 #include <asm/mmu.h>
 #include <asm/setup.h>
 #include <asm/page.h>
+#include <asm/io.h>
 
 #include "alloc_page.h"
 #include "vmalloc.h"
@@ -157,9 +158,8 @@ void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
 void *setup_mmu(phys_addr_t phys_end)
 {
 	uintptr_t code_end = (uintptr_t)&etext;
-	struct mem_region *r;
 
-	/* 0G-1G = I/O, 1G-3G = identity, 3G-4G = vmalloc */
+	/* 3G-4G region is reserved for vmalloc, cap phys_end at 3G */
 	if (phys_end > (3ul << 30))
 		phys_end = 3ul << 30;
 
@@ -170,14 +170,8 @@ void *setup_mmu(phys_addr_t phys_end)
 			"Unsupported translation granule %ld\n", PAGE_SIZE);
 #endif
 
-	mmu_idmap = alloc_page();
-
-	for (r = mem_regions; r->end; ++r) {
-		if (!(r->flags & MR_F_IO))
-			continue;
-		mmu_set_range_sect(mmu_idmap, r->start, r->start, r->end,
-				   __pgprot(PMD_SECT_UNCACHED | PMD_SECT_USER));
-	}
+	if (!mmu_idmap)
+		mmu_idmap = alloc_page();
 
 	/* armv8 requires code shared between EL1 and EL0 to be read-only */
 	mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET,
@@ -192,6 +186,29 @@ void *setup_mmu(phys_addr_t phys_end)
 	return mmu_idmap;
 }
 
+void __iomem *__ioremap(phys_addr_t phys_addr, size_t size)
+{
+	phys_addr_t paddr_aligned = phys_addr & PAGE_MASK;
+	phys_addr_t paddr_end = PAGE_ALIGN(phys_addr + size);
+	pgprot_t prot = __pgprot(PTE_UNCACHED | PTE_USER | PTE_UXN | PTE_PXN);
+	pgd_t *pgtable;
+
+	assert(sizeof(long) == 8 || !(phys_addr >> 32));
+
+	if (mmu_enabled()) {
+		pgtable = current_thread_info()->pgtable;
+	} else {
+		if (!mmu_idmap)
+			mmu_idmap = alloc_page();
+		pgtable = mmu_idmap;
+	}
+
+	mmu_set_range_ptes(pgtable, paddr_aligned, paddr_aligned,
+			   paddr_end, prot);
+
+	return (void __iomem *)(unsigned long)phys_addr;
+}
+
 phys_addr_t __virt_to_phys(unsigned long addr)
 {
 	if (mmu_enabled()) {
diff --git a/lib/arm64/asm/io.h b/lib/arm64/asm/io.h
index e0a03b250d5b..be19f471c0fa 100644
--- a/lib/arm64/asm/io.h
+++ b/lib/arm64/asm/io.h
@@ -71,6 +71,12 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
 	return val;
 }
 
+#define ioremap ioremap
+static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
+{
+	return __ioremap(phys_addr, size);
+}
+
 #define virt_to_phys virt_to_phys
 static inline phys_addr_t virt_to_phys(const volatile void *x)
 {
diff --git a/lib/arm64/asm/mmu.h b/lib/arm64/asm/mmu.h
index 72d75eafc882..72371b2d9fe3 100644
--- a/lib/arm64/asm/mmu.h
+++ b/lib/arm64/asm/mmu.h
@@ -8,6 +8,7 @@
 #include <asm/barrier.h>
 
 #define PMD_SECT_UNCACHED	PMD_ATTRINDX(MT_DEVICE_nGnRE)
+#define PTE_UNCACHED		PTE_ATTRINDX(MT_DEVICE_nGnRE)
 #define PTE_WBWA		PTE_ATTRINDX(MT_NORMAL)
 
 static inline void flush_tlb_all(void)
diff --git a/lib/arm64/asm/page.h b/lib/arm64/asm/page.h
index ae4484b22114..d0fac6ea563d 100644
--- a/lib/arm64/asm/page.h
+++ b/lib/arm64/asm/page.h
@@ -72,5 +72,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
 extern phys_addr_t __virt_to_phys(unsigned long addr);
 extern unsigned long __phys_to_virt(phys_addr_t addr);
 
+extern void *__ioremap(phys_addr_t phys_addr, size_t size);
+
 #endif /* !__ASSEMBLY__ */
 #endif /* _ASMARM64_PAGE_H_ */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 5/8] arm/arm64: mmu: Remove memory layout assumptions
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
                   ` (3 preceding siblings ...)
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate " Andrew Jones
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

Rather than making too many assumptions about the memory layout
in mmu code, just set up the page tables per the memory regions
(which means putting all the memory layout assumptions in setup).
To ensure we get the right default flags set we need to split the
primary region into two regions for code and data.

We still only expect the primary regions to be present, but the
next patch will remove that assumption too.

(Unfortunately we still have an assumption in setup_mmu. We assume
 the range 3G-4G is available for the virtual memory allocator. We'll
 need to remove that assumption as well with another patch in order
 to support arbitrary memory maps.)

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 lib/arm/asm/setup.h |  1 +
 lib/arm/mmu.c       | 26 +++++++++++++++-----------
 lib/arm/setup.c     | 29 +++++++++++++++++++++--------
 3 files changed, 37 insertions(+), 19 deletions(-)

diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
index c8afb2493f8d..210c14f818fb 100644
--- a/lib/arm/asm/setup.h
+++ b/lib/arm/asm/setup.h
@@ -15,6 +15,7 @@ extern int nr_cpus;
 
 #define MR_F_PRIMARY		(1U << 0)
 #define MR_F_IO			(1U << 1)
+#define MR_F_CODE		(1U << 2)
 #define MR_F_UNKNOWN		(1U << 31)
 
 struct mem_region {
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 791b1f88f946..7d658a3fe89c 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -20,8 +20,6 @@
 
 #include <linux/compiler.h>
 
-extern unsigned long etext;
-
 pgd_t *mmu_idmap;
 
 /* CPU 0 starts with disabled MMU */
@@ -157,7 +155,7 @@ void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
 
 void *setup_mmu(phys_addr_t phys_end)
 {
-	uintptr_t code_end = (uintptr_t)&etext;
+	struct mem_region *r;
 
 	/* 3G-4G region is reserved for vmalloc, cap phys_end at 3G */
 	if (phys_end > (3ul << 30))
@@ -173,14 +171,20 @@ void *setup_mmu(phys_addr_t phys_end)
 	if (!mmu_idmap)
 		mmu_idmap = alloc_page();
 
-	/* armv8 requires code shared between EL1 and EL0 to be read-only */
-	mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET,
-		PHYS_OFFSET, code_end,
-		__pgprot(PTE_WBWA | PTE_RDONLY | PTE_USER));
-
-	mmu_set_range_ptes(mmu_idmap, code_end,
-		code_end, phys_end,
-		__pgprot(PTE_WBWA | PTE_USER));
+	for (r = mem_regions; r->end; ++r) {
+		if (r->flags & MR_F_IO) {
+			continue;
+		} else if (r->flags & MR_F_CODE) {
+			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected code region");
+			/* armv8 requires code shared between EL1 and EL0 to be read-only */
+			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
+					   __pgprot(PTE_WBWA | PTE_USER | PTE_RDONLY));
+		} else {
+			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected data region");
+			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
+					   __pgprot(PTE_WBWA | PTE_USER));
+		}
+	}
 
 	mmu_enable(mmu_idmap);
 	return mmu_idmap;
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 9c16f6004e9f..7db308b70744 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -31,6 +31,7 @@
 #define NR_INITIAL_MEM_REGIONS 16
 
 extern unsigned long stacktop;
+extern unsigned long etext;
 
 struct timer_state __timer_state;
 
@@ -88,10 +89,12 @@ unsigned int mem_region_get_flags(phys_addr_t paddr)
 
 static void mem_init(phys_addr_t freemem_start)
 {
+	phys_addr_t code_end = (phys_addr_t)(unsigned long)&etext;
 	struct dt_pbus_reg regs[NR_INITIAL_MEM_REGIONS];
-	struct mem_region primary, mem = {
+	struct mem_region mem = {
 		.start = (phys_addr_t)-1,
 	};
+	struct mem_region *primary = NULL;
 	phys_addr_t base, top;
 	int nr_regs, nr_io = 0, i;
 
@@ -110,8 +113,6 @@ static void mem_init(phys_addr_t freemem_start)
 	nr_regs = dt_get_memory_params(regs, NR_INITIAL_MEM_REGIONS - nr_io);
 	assert(nr_regs > 0);
 
-	primary = (struct mem_region){ 0 };
-
 	for (i = 0; i < nr_regs; ++i) {
 		struct mem_region *r = &mem_regions[nr_io + i];
 
@@ -123,7 +124,7 @@ static void mem_init(phys_addr_t freemem_start)
 		 */
 		if (freemem_start >= r->start && freemem_start < r->end) {
 			r->flags |= MR_F_PRIMARY;
-			primary = *r;
+			primary = r;
 		}
 
 		/*
@@ -135,13 +136,25 @@ static void mem_init(phys_addr_t freemem_start)
 		if (r->end > mem.end)
 			mem.end = r->end;
 	}
-	assert(primary.end != 0);
+	assert(primary);
 	assert(!(mem.start & ~PHYS_MASK) && !((mem.end - 1) & ~PHYS_MASK));
 
-	__phys_offset = primary.start;	/* PHYS_OFFSET */
-	__phys_end = primary.end;	/* PHYS_END */
+	__phys_offset = primary->start;	/* PHYS_OFFSET */
+	__phys_end = primary->end;	/* PHYS_END */
+
+	/* Split the primary region into two regions; code and data */
+	mem_regions[nr_io + i] = (struct mem_region){
+		.start = code_end,
+		.end = primary->end,
+		.flags = MR_F_PRIMARY,
+	};
+	*primary = (struct mem_region){
+		.start = primary->start,
+		.end = code_end,
+		.flags = MR_F_PRIMARY | MR_F_CODE,
+	};
 
-	phys_alloc_init(freemem_start, primary.end - freemem_start);
+	phys_alloc_init(freemem_start, __phys_end - freemem_start);
 	phys_alloc_set_minimum_alignment(SMP_CACHE_BYTES);
 
 	phys_alloc_get_unused(&base, &top);
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate memory layout assumptions
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
                   ` (4 preceding siblings ...)
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 5/8] arm/arm64: mmu: Remove memory layout assumptions Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-05-11 15:11   ` Alexandru Elisei
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 7/8] chr-testdev: Silently fail init Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc Andrew Jones
  7 siblings, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

Keep as much memory layout assumptions as possible in init::start
and a single setup function. This prepares us for calling setup()
from different start functions which have been linked with different
linker scripts. To do this, stacktop is only referenced from
init::start, making freemem_start a parameter to setup(). We also
split mem_init() into three parts, one that populates the mem regions
per the DT, one that populates the mem regions per assumptions,
and one that does the mem init. The concept of a primary region
is dropped, but we add a sanity check for the absence of memory
holes, because we don't know how to deal with them yet.

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arm/cstart.S        |   4 +-
 arm/cstart64.S      |   2 +
 arm/flat.lds        |  23 ++++++
 lib/arm/asm/setup.h |   8 +-
 lib/arm/mmu.c       |   2 -
 lib/arm/setup.c     | 175 ++++++++++++++++++++++++--------------------
 6 files changed, 128 insertions(+), 86 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index bf3c78157e6a..446966de350d 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -80,7 +80,9 @@ start:
 
 	/* complete setup */
 	pop	{r0-r1}
-	bl	setup
+	mov	r3, #0
+	ldr	r2, =stacktop		@ r2,r3 is the base of free memory
+	bl	setup			@ r0 is the addr of the dtb
 
 	/* run the test */
 	ldr	r0, =__argc
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 27251fe8b5cd..42ba3a3ca249 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -92,6 +92,8 @@ start:
 	bl	exceptions_init
 
 	/* complete setup */
+	adrp	x1, stacktop
+	add	x1, x1, :lo12:stacktop		// x1 is the base of free memory
 	bl	setup				// x0 is the addr of the dtb
 
 	/* run the test */
diff --git a/arm/flat.lds b/arm/flat.lds
index 6ed377c0eaa0..6fb459efb815 100644
--- a/arm/flat.lds
+++ b/arm/flat.lds
@@ -1,3 +1,26 @@
+/*
+ * init::start will pass stacktop to setup() as the base of free memory.
+ * setup() will then move the FDT and initrd to that base before calling
+ * mem_init(). With those movements and this linker script, we'll end up
+ * having the following memory layout:
+ *
+ *    +----------------------+   <-- top of physical memory
+ *    |                      |
+ *    ~                      ~
+ *    |                      |
+ *    +----------------------+   <-- top of initrd
+ *    |                      |
+ *    +----------------------+   <-- top of FDT
+ *    |                      |
+ *    +----------------------+   <-- top of cpu0's stack
+ *    |                      |
+ *    +----------------------+   <-- top of text/data/bss sections
+ *    |                      |
+ *    |                      |
+ *    +----------------------+   <-- load address
+ *    |                      |
+ *    +----------------------+   <-- physical address 0x0
+ */
 
 SECTIONS
 {
diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
index 210c14f818fb..f0e70b119fb0 100644
--- a/lib/arm/asm/setup.h
+++ b/lib/arm/asm/setup.h
@@ -13,9 +13,8 @@
 extern u64 cpus[NR_CPUS];	/* per-cpu IDs (MPIDRs) */
 extern int nr_cpus;
 
-#define MR_F_PRIMARY		(1U << 0)
-#define MR_F_IO			(1U << 1)
-#define MR_F_CODE		(1U << 2)
+#define MR_F_IO			(1U << 0)
+#define MR_F_CODE		(1U << 1)
 #define MR_F_UNKNOWN		(1U << 31)
 
 struct mem_region {
@@ -26,6 +25,7 @@ struct mem_region {
 extern struct mem_region *mem_regions;
 extern phys_addr_t __phys_offset, __phys_end;
 
+extern struct mem_region *mem_region_find(phys_addr_t paddr);
 extern unsigned int mem_region_get_flags(phys_addr_t paddr);
 
 #define PHYS_OFFSET		(__phys_offset)
@@ -35,6 +35,6 @@ extern unsigned int mem_region_get_flags(phys_addr_t paddr);
 #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
 #define SMP_CACHE_BYTES		L1_CACHE_BYTES
 
-void setup(const void *fdt);
+void setup(const void *fdt, phys_addr_t freemem_start);
 
 #endif /* _ASMARM_SETUP_H_ */
diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
index 7d658a3fe89c..7628f797a543 100644
--- a/lib/arm/mmu.c
+++ b/lib/arm/mmu.c
@@ -175,12 +175,10 @@ void *setup_mmu(phys_addr_t phys_end)
 		if (r->flags & MR_F_IO) {
 			continue;
 		} else if (r->flags & MR_F_CODE) {
-			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected code region");
 			/* armv8 requires code shared between EL1 and EL0 to be read-only */
 			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
 					   __pgprot(PTE_WBWA | PTE_USER | PTE_RDONLY));
 		} else {
-			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected data region");
 			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
 					   __pgprot(PTE_WBWA | PTE_USER));
 		}
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 7db308b70744..86f054304baf 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -28,9 +28,10 @@
 
 #include "io.h"
 
-#define NR_INITIAL_MEM_REGIONS 16
+#define MAX_DT_MEM_REGIONS	16
+#define NR_EXTRA_MEM_REGIONS	16
+#define NR_INITIAL_MEM_REGIONS	(MAX_DT_MEM_REGIONS + NR_EXTRA_MEM_REGIONS)
 
-extern unsigned long stacktop;
 extern unsigned long etext;
 
 struct timer_state __timer_state;
@@ -75,28 +76,68 @@ static void cpu_init(void)
 	set_cpu_online(0, true);
 }
 
-unsigned int mem_region_get_flags(phys_addr_t paddr)
+static void mem_region_add(struct mem_region *r)
+{
+	struct mem_region *r_next = mem_regions;
+	int i = 0;
+
+	for (; r_next->end; ++r_next, ++i)
+		;
+	assert(i < NR_INITIAL_MEM_REGIONS);
+
+	*r_next = *r;
+}
+
+static void mem_regions_add_dt_regions(void)
+{
+	struct dt_pbus_reg regs[MAX_DT_MEM_REGIONS];
+	int nr_regs, i;
+
+	nr_regs = dt_get_memory_params(regs, MAX_DT_MEM_REGIONS);
+	assert(nr_regs > 0);
+
+	for (i = 0; i < nr_regs; ++i) {
+		mem_region_add(&(struct mem_region){
+			.start = regs[i].addr,
+			.end = regs[i].addr + regs[i].size,
+		});
+	}
+}
+
+struct mem_region *mem_region_find(phys_addr_t paddr)
 {
 	struct mem_region *r;
 
-	for (r = mem_regions; r->end; ++r) {
+	for (r = mem_regions; r->end; ++r)
 		if (paddr >= r->start && paddr < r->end)
-			return r->flags;
-	}
+			return r;
+	return NULL;
+}
 
-	return MR_F_UNKNOWN;
+unsigned int mem_region_get_flags(phys_addr_t paddr)
+{
+	struct mem_region *r = mem_region_find(paddr);
+	return r ? r->flags : MR_F_UNKNOWN;
 }
 
-static void mem_init(phys_addr_t freemem_start)
+static void mem_regions_add_assumed(void)
 {
 	phys_addr_t code_end = (phys_addr_t)(unsigned long)&etext;
-	struct dt_pbus_reg regs[NR_INITIAL_MEM_REGIONS];
-	struct mem_region mem = {
-		.start = (phys_addr_t)-1,
+	struct mem_region *r;
+
+	r = mem_region_find(code_end - 1);
+	assert(r);
+
+	/* Split the region with the code into two regions; code and data */
+	mem_region_add(&(struct mem_region){
+		.start = code_end,
+		.end = r->end,
+	});
+	*r = (struct mem_region){
+		.start = r->start,
+		.end = code_end,
+		.flags = MR_F_CODE,
 	};
-	struct mem_region *primary = NULL;
-	phys_addr_t base, top;
-	int nr_regs, nr_io = 0, i;
 
 	/*
 	 * mach-virt I/O regions:
@@ -104,57 +145,47 @@ static void mem_init(phys_addr_t freemem_start)
 	 *   - 512M at 256G (arm64, arm uses highmem=off)
 	 *   - 512G at 512G (arm64, arm uses highmem=off)
 	 */
-	mem_regions[nr_io++] = (struct mem_region){ 0, (1ul << 30), MR_F_IO };
+	mem_region_add(&(struct mem_region){ 0, (1ul << 30), MR_F_IO });
 #ifdef __aarch64__
-	mem_regions[nr_io++] = (struct mem_region){ (1ul << 38), (1ul << 38) | (1ul << 29), MR_F_IO };
-	mem_regions[nr_io++] = (struct mem_region){ (1ul << 39), (1ul << 40), MR_F_IO };
+	mem_region_add(&(struct mem_region){ (1ul << 38), (1ul << 38) | (1ul << 29), MR_F_IO });
+	mem_region_add(&(struct mem_region){ (1ul << 39), (1ul << 40), MR_F_IO });
 #endif
+}
 
-	nr_regs = dt_get_memory_params(regs, NR_INITIAL_MEM_REGIONS - nr_io);
-	assert(nr_regs > 0);
-
-	for (i = 0; i < nr_regs; ++i) {
-		struct mem_region *r = &mem_regions[nr_io + i];
+static void mem_init(phys_addr_t freemem_start)
+{
+	phys_addr_t base, top;
+	struct mem_region *freemem, *r, mem = {
+		.start = (phys_addr_t)-1,
+	};
 
-		r->start = regs[i].addr;
-		r->end = regs[i].addr + regs[i].size;
+	freemem = mem_region_find(freemem_start);
+	assert(freemem && !(freemem->flags & (MR_F_IO | MR_F_CODE)));
 
-		/*
-		 * pick the region we're in for our primary region
-		 */
-		if (freemem_start >= r->start && freemem_start < r->end) {
-			r->flags |= MR_F_PRIMARY;
-			primary = r;
+	for (r = mem_regions; r->end; ++r) {
+		if (!(r->flags & MR_F_IO)) {
+			if (r->start < mem.start)
+				mem.start = r->start;
+			if (r->end > mem.end)
+				mem.end = r->end;
 		}
-
-		/*
-		 * set the lowest and highest addresses found,
-		 * ignoring potential gaps
-		 */
-		if (r->start < mem.start)
-			mem.start = r->start;
-		if (r->end > mem.end)
-			mem.end = r->end;
 	}
-	assert(primary);
-	assert(!(mem.start & ~PHYS_MASK) && !((mem.end - 1) & ~PHYS_MASK));
+	assert(mem.end && !(mem.start & ~PHYS_MASK));
+	mem.end &= PHYS_MASK;
 
-	__phys_offset = primary->start;	/* PHYS_OFFSET */
-	__phys_end = primary->end;	/* PHYS_END */
+	/* Check for holes */
+	r = mem_region_find(mem.start);
+	while (r && r->end != mem.end)
+		r = mem_region_find(r->end);
+	assert(r);
 
-	/* Split the primary region into two regions; code and data */
-	mem_regions[nr_io + i] = (struct mem_region){
-		.start = code_end,
-		.end = primary->end,
-		.flags = MR_F_PRIMARY,
-	};
-	*primary = (struct mem_region){
-		.start = primary->start,
-		.end = code_end,
-		.flags = MR_F_PRIMARY | MR_F_CODE,
-	};
+	/* Ensure our selected freemem range is somewhere in our full range */
+	assert(freemem_start >= mem.start && freemem->end <= mem.end);
 
-	phys_alloc_init(freemem_start, __phys_end - freemem_start);
+	__phys_offset = mem.start;	/* PHYS_OFFSET */
+	__phys_end = mem.end;		/* PHYS_END */
+
+	phys_alloc_init(freemem_start, freemem->end - freemem_start);
 	phys_alloc_set_minimum_alignment(SMP_CACHE_BYTES);
 
 	phys_alloc_get_unused(&base, &top);
@@ -204,35 +235,17 @@ static void timer_save_state(void)
 	__timer_state.vtimer.irq_flags = fdt32_to_cpu(data[8]);
 }
 
-void setup(const void *fdt)
+void setup(const void *fdt, phys_addr_t freemem_start)
 {
-	void *freemem = &stacktop;
+	void *freemem;
 	const char *bootargs, *tmp;
 	u32 fdt_size;
 	int ret;
 
-	/*
-	 * Before calling mem_init we need to move the fdt and initrd
-	 * to safe locations. We move them to construct the memory
-	 * map illustrated below:
-	 *
-	 *    +----------------------+   <-- top of physical memory
-	 *    |                      |
-	 *    ~                      ~
-	 *    |                      |
-	 *    +----------------------+   <-- top of initrd
-	 *    |                      |
-	 *    +----------------------+   <-- top of FDT
-	 *    |                      |
-	 *    +----------------------+   <-- top of cpu0's stack
-	 *    |                      |
-	 *    +----------------------+   <-- top of text/data/bss sections,
-	 *    |                      |       see arm/flat.lds
-	 *    |                      |
-	 *    +----------------------+   <-- load address
-	 *    |                      |
-	 *    +----------------------+
-	 */
+	assert(sizeof(long) == 8 || freemem_start < (3ul << 30));
+	freemem = (void *)(unsigned long)freemem_start;
+
+	/* Move the FDT to the base of free memory */
 	fdt_size = fdt_totalsize(fdt);
 	ret = fdt_move(fdt, freemem, fdt_size);
 	assert(ret == 0);
@@ -240,6 +253,7 @@ void setup(const void *fdt)
 	assert(ret == 0);
 	freemem += fdt_size;
 
+	/* Move the initrd to the top of the FDT */
 	ret = dt_get_initrd(&tmp, &initrd_size);
 	assert(ret == 0 || ret == -FDT_ERR_NOTFOUND);
 	if (ret == 0) {
@@ -248,7 +262,10 @@ void setup(const void *fdt)
 		freemem += initrd_size;
 	}
 
+	mem_regions_add_dt_regions();
+	mem_regions_add_assumed();
 	mem_init(PAGE_ALIGN((unsigned long)freemem));
+
 	cpu_init();
 
 	/* cpu_init must be called before thread_info_init */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 7/8] chr-testdev: Silently fail init
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
                   ` (5 preceding siblings ...)
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate " Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc Andrew Jones
  7 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

If there's no virtio-console / chr-testdev configured, then the user
probably didn't want them. Just silently fail rather than stating
the obvious.

Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 lib/chr-testdev.c | 5 +----
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/lib/chr-testdev.c b/lib/chr-testdev.c
index 6890f63c8b29..b3c641a833fe 100644
--- a/lib/chr-testdev.c
+++ b/lib/chr-testdev.c
@@ -54,11 +54,8 @@ void chr_testdev_init(void)
 	int ret;
 
 	vcon = virtio_bind(VIRTIO_ID_CONSOLE);
-	if (vcon == NULL) {
-		printf("%s: %s: can't find a virtio-console\n",
-				__func__, TESTDEV_NAME);
+	if (vcon == NULL)
 		return;
-	}
 
 	ret = vcon->config->find_vqs(vcon, 2, vqs, NULL, io_names);
 	if (ret < 0) {
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc
  2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
                   ` (6 preceding siblings ...)
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 7/8] chr-testdev: Silently fail init Andrew Jones
@ 2021-04-29 16:41 ` Andrew Jones
  2021-05-12 16:14   ` Alexandru Elisei
  2021-05-13 10:18   ` [PATCH kvm-unit-tests v4] " Andrew Jones
  7 siblings, 2 replies; 22+ messages in thread
From: Andrew Jones @ 2021-04-29 16:41 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

The method can be smc in addition to hvc, and it will be when running
on bare metal. Additionally, we move the invocations to assembly so
we don't have to rely on compiler assumptions. We also fix the
prototype of psci_invoke. It should return long, not int, and
function_id should be an unsigned int, not an unsigned long.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arm/cstart.S       | 22 ++++++++++++++++++++++
 arm/cstart64.S     | 22 ++++++++++++++++++++++
 arm/selftest.c     | 34 +++++++---------------------------
 lib/arm/asm/psci.h | 10 ++++++++--
 lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
 lib/arm/setup.c    |  2 ++
 6 files changed, 88 insertions(+), 37 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 446966de350d..2401d92cdadc 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -95,6 +95,28 @@ start:
 
 .text
 
+/*
+ * psci_invoke_hvc / psci_invoke_smc
+ *
+ * Inputs:
+ *   r0 -- function_id
+ *   r1 -- arg0
+ *   r2 -- arg1
+ *   r3 -- arg2
+ *
+ * Outputs:
+ *   r0 -- return code
+ */
+.globl psci_invoke_hvc
+psci_invoke_hvc:
+	hvc	#0
+	mov	pc, lr
+
+.globl psci_invoke_smc
+psci_invoke_smc:
+	smc	#0
+	mov	pc, lr
+
 enable_vfp:
 	/* Enable full access to CP10 and CP11: */
 	mov	r0, #(3 << 22 | 3 << 20)
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 42ba3a3ca249..e4ab7d06251e 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -109,6 +109,28 @@ start:
 
 .text
 
+/*
+ * psci_invoke_hvc / psci_invoke_smc
+ *
+ * Inputs:
+ *   w0 -- function_id
+ *   x1 -- arg0
+ *   x2 -- arg1
+ *   x3 -- arg2
+ *
+ * Outputs:
+ *   x0 -- return code
+ */
+.globl psci_invoke_hvc
+psci_invoke_hvc:
+	hvc	#0
+	ret
+
+.globl psci_invoke_smc
+psci_invoke_smc:
+	smc	#0
+	ret
+
 get_mmu_off:
 	adrp	x0, auxinfo
 	ldr	x0, [x0, :lo12:auxinfo + 8]
diff --git a/arm/selftest.c b/arm/selftest.c
index 4495b161cdd5..9f459ed3d571 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
 	exit(report_summary());
 }
 
-static bool psci_check(void)
+static void psci_print(void)
 {
-	const struct fdt_property *method;
-	int node, len, ver;
-
-	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
-	if (node < 0) {
-		printf("PSCI v0.2 compatibility required\n");
-		return false;
-	}
-
-	method = fdt_get_property(dt_fdt(), node, "method", &len);
-	if (method == NULL) {
-		printf("bad psci device tree node\n");
-		return false;
-	}
-
-	if (len < 4 || strcmp(method->data, "hvc") != 0) {
-		printf("psci method must be hvc\n");
-		return false;
-	}
-
-	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
-	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
-				       PSCI_VERSION_MINOR(ver));
-
-	return true;
+	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
+	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
+					  PSCI_VERSION_MINOR(ver));
+	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
+				       "hvc" : "smc");
 }
 
 static void cpu_report(void *data __unused)
@@ -465,7 +445,7 @@ int main(int argc, char **argv)
 
 	} else if (strcmp(argv[1], "smp") == 0) {
 
-		report(psci_check(), "PSCI version");
+		psci_print();
 		on_cpus(cpu_report, NULL);
 		while (!cpumask_full(&ready))
 			cpu_relax();
diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
index 7b956bf5987d..f87fca0422cc 100644
--- a/lib/arm/asm/psci.h
+++ b/lib/arm/asm/psci.h
@@ -3,8 +3,14 @@
 #include <libcflat.h>
 #include <linux/psci.h>
 
-extern int psci_invoke(unsigned long function_id, unsigned long arg0,
-		       unsigned long arg1, unsigned long arg2);
+typedef long (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
+			       unsigned long arg1, unsigned long arg2);
+extern psci_invoke_fn psci_invoke;
+extern long psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
+			    unsigned long arg1, unsigned long arg2);
+extern long psci_invoke_smc(unsigned int function_id, unsigned long arg0,
+			    unsigned long arg1, unsigned long arg2);
+extern void psci_set_conduit(void);
 extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
 extern void psci_system_reset(void);
 extern int cpu_psci_cpu_boot(unsigned int cpu);
diff --git a/lib/arm/psci.c b/lib/arm/psci.c
index 936c83948b6a..3053b3041c28 100644
--- a/lib/arm/psci.c
+++ b/lib/arm/psci.c
@@ -6,22 +6,21 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
+#include <devicetree.h>
 #include <asm/psci.h>
 #include <asm/setup.h>
 #include <asm/page.h>
 #include <asm/smp.h>
 
-__attribute__((noinline))
-int psci_invoke(unsigned long function_id, unsigned long arg0,
-		unsigned long arg1, unsigned long arg2)
+static long psci_invoke_none(unsigned int function_id, unsigned long arg0,
+			     unsigned long arg1, unsigned long arg2)
 {
-	asm volatile(
-		"hvc #0"
-	: "+r" (function_id)
-	: "r" (arg0), "r" (arg1), "r" (arg2));
-	return function_id;
+	printf("No PSCI method configured! Can't invoke...\n");
+	return PSCI_RET_NOT_PRESENT;
 }
 
+psci_invoke_fn psci_invoke = psci_invoke_none;
+
 int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
 {
 #ifdef __arm__
@@ -56,3 +55,23 @@ void psci_system_off(void)
 	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
 }
+
+void psci_set_conduit(void)
+{
+	const void *fdt = dt_fdt();
+	const struct fdt_property *method;
+	int node, len;
+
+	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
+	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
+
+	method = fdt_get_property(fdt, node, "method", &len);
+	assert(method != NULL && len == 4);
+
+	if (strcmp(method->data, "hvc") == 0)
+		psci_invoke = psci_invoke_hvc;
+	else if (strcmp(method->data, "smc") == 0)
+		psci_invoke = psci_invoke_smc;
+	else
+		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
+}
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 86f054304baf..bcdf0d78c2e2 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -25,6 +25,7 @@
 #include <asm/processor.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
+#include <asm/psci.h>
 
 #include "io.h"
 
@@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 	mem_regions_add_assumed();
 	mem_init(PAGE_ALIGN((unsigned long)freemem));
 
+	psci_set_conduit();
 	cpu_init();
 
 	/* cpu_init must be called before thread_info_init */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region Andrew Jones
@ 2021-05-10 15:45   ` Alexandru Elisei
  2021-05-13 15:48     ` Alexandru Elisei
  0 siblings, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-10 15:45 UTC (permalink / raw)
  To: Andrew Jones, kvm; +Cc: nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 4/29/21 5:41 PM, Andrew Jones wrote:
> By providing a proper ioremap function, we can just rely on devices
> calling it for each region they need (as they already do) instead of
> mapping a big assumed I/O range. We don't require the MMU to be
> enabled at the time of the ioremap. In that case, we add the mapping
> to the identity map anyway. This allows us to call setup_vm after
> io_init. Why don't we just call setup_vm before io_init, I hear you
> ask? Well, that's because tests like sieve want to start with the MMU
> off, later call setup_vm, and all the while have working I/O. Some
> unit tests are just really demanding...
>
> While at it, ensure we map the I/O regions with XN (execute never),
> as suggested by Alexandru Elisei.

I got to thinking why this wasn't an issue before. Under KVM, device memory is not
usually mapped at stage 2, so any speculated reads wouldn't have reached memory at
all. The only way I imagine that happening if the user was running kvm-unit-tests
with a passthrough PCI device, which I don't think happens too often.

But we cannot rely on devices not being mapped at stage 2 when running under EFI
(we're mapping them ourselves with ioremap), so I believe this is a good fix.

Had another look at the patch, looks good to me:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

Thanks,

Alex

>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  lib/arm/asm/io.h            |  6 ++++++
>  lib/arm/asm/mmu.h           |  3 +++
>  lib/arm/asm/page.h          |  2 ++
>  lib/arm/asm/pgtable-hwdef.h |  1 +
>  lib/arm/mmu.c               | 37 +++++++++++++++++++++++++++----------
>  lib/arm64/asm/io.h          |  6 ++++++
>  lib/arm64/asm/mmu.h         |  1 +
>  lib/arm64/asm/page.h        |  2 ++
>  8 files changed, 48 insertions(+), 10 deletions(-)
>
> diff --git a/lib/arm/asm/io.h b/lib/arm/asm/io.h
> index ba3b0b2412ad..e4caa6ff5d1e 100644
> --- a/lib/arm/asm/io.h
> +++ b/lib/arm/asm/io.h
> @@ -77,6 +77,12 @@ static inline void __raw_writel(u32 val, volatile void __iomem *addr)
>  		     : "r" (val));
>  }
>  
> +#define ioremap ioremap
> +static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
> +{
> +	return __ioremap(phys_addr, size);
> +}
> +
>  #define virt_to_phys virt_to_phys
>  static inline phys_addr_t virt_to_phys(const volatile void *x)
>  {
> diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
> index 122874b8aebe..94e70f0a84bf 100644
> --- a/lib/arm/asm/mmu.h
> +++ b/lib/arm/asm/mmu.h
> @@ -8,10 +8,13 @@
>  #include <asm/barrier.h>
>  
>  #define PTE_USER		L_PTE_USER
> +#define PTE_UXN			L_PTE_XN
> +#define PTE_PXN			L_PTE_PXN
>  #define PTE_RDONLY		PTE_AP2
>  #define PTE_SHARED		L_PTE_SHARED
>  #define PTE_AF			PTE_EXT_AF
>  #define PTE_WBWA		L_PTE_MT_WRITEALLOC
> +#define PTE_UNCACHED		L_PTE_MT_UNCACHED
>  
>  /* See B3.18.7 TLB maintenance operations */
>  
> diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h
> index 1fb5cd26ac66..8eb4a883808e 100644
> --- a/lib/arm/asm/page.h
> +++ b/lib/arm/asm/page.h
> @@ -47,5 +47,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
>  extern phys_addr_t __virt_to_phys(unsigned long addr);
>  extern unsigned long __phys_to_virt(phys_addr_t addr);
>  
> +extern void *__ioremap(phys_addr_t phys_addr, size_t size);
> +
>  #endif /* !__ASSEMBLY__ */
>  #endif /* _ASMARM_PAGE_H_ */
> diff --git a/lib/arm/asm/pgtable-hwdef.h b/lib/arm/asm/pgtable-hwdef.h
> index fe1d8540ea3f..90fd306c7cc0 100644
> --- a/lib/arm/asm/pgtable-hwdef.h
> +++ b/lib/arm/asm/pgtable-hwdef.h
> @@ -34,6 +34,7 @@
>  #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
>  #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
>  #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
> +#define L_PTE_PXN		(_AT(pteval_t, 1) << 53)	/* PXN */
>  #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
>  
>  /*
> diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
> index 15eef007f256..791b1f88f946 100644
> --- a/lib/arm/mmu.c
> +++ b/lib/arm/mmu.c
> @@ -11,6 +11,7 @@
>  #include <asm/mmu.h>
>  #include <asm/setup.h>
>  #include <asm/page.h>
> +#include <asm/io.h>
>  
>  #include "alloc_page.h"
>  #include "vmalloc.h"
> @@ -157,9 +158,8 @@ void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
>  void *setup_mmu(phys_addr_t phys_end)
>  {
>  	uintptr_t code_end = (uintptr_t)&etext;
> -	struct mem_region *r;
>  
> -	/* 0G-1G = I/O, 1G-3G = identity, 3G-4G = vmalloc */
> +	/* 3G-4G region is reserved for vmalloc, cap phys_end at 3G */
>  	if (phys_end > (3ul << 30))
>  		phys_end = 3ul << 30;
>  
> @@ -170,14 +170,8 @@ void *setup_mmu(phys_addr_t phys_end)
>  			"Unsupported translation granule %ld\n", PAGE_SIZE);
>  #endif
>  
> -	mmu_idmap = alloc_page();
> -
> -	for (r = mem_regions; r->end; ++r) {
> -		if (!(r->flags & MR_F_IO))
> -			continue;
> -		mmu_set_range_sect(mmu_idmap, r->start, r->start, r->end,
> -				   __pgprot(PMD_SECT_UNCACHED | PMD_SECT_USER));
> -	}
> +	if (!mmu_idmap)
> +		mmu_idmap = alloc_page();
>  
>  	/* armv8 requires code shared between EL1 and EL0 to be read-only */
>  	mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET,
> @@ -192,6 +186,29 @@ void *setup_mmu(phys_addr_t phys_end)
>  	return mmu_idmap;
>  }
>  
> +void __iomem *__ioremap(phys_addr_t phys_addr, size_t size)
> +{
> +	phys_addr_t paddr_aligned = phys_addr & PAGE_MASK;
> +	phys_addr_t paddr_end = PAGE_ALIGN(phys_addr + size);
> +	pgprot_t prot = __pgprot(PTE_UNCACHED | PTE_USER | PTE_UXN | PTE_PXN);
> +	pgd_t *pgtable;
> +
> +	assert(sizeof(long) == 8 || !(phys_addr >> 32));
> +
> +	if (mmu_enabled()) {
> +		pgtable = current_thread_info()->pgtable;
> +	} else {
> +		if (!mmu_idmap)
> +			mmu_idmap = alloc_page();
> +		pgtable = mmu_idmap;
> +	}
> +
> +	mmu_set_range_ptes(pgtable, paddr_aligned, paddr_aligned,
> +			   paddr_end, prot);
> +
> +	return (void __iomem *)(unsigned long)phys_addr;
> +}
> +
>  phys_addr_t __virt_to_phys(unsigned long addr)
>  {
>  	if (mmu_enabled()) {
> diff --git a/lib/arm64/asm/io.h b/lib/arm64/asm/io.h
> index e0a03b250d5b..be19f471c0fa 100644
> --- a/lib/arm64/asm/io.h
> +++ b/lib/arm64/asm/io.h
> @@ -71,6 +71,12 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
>  	return val;
>  }
>  
> +#define ioremap ioremap
> +static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
> +{
> +	return __ioremap(phys_addr, size);
> +}
> +
>  #define virt_to_phys virt_to_phys
>  static inline phys_addr_t virt_to_phys(const volatile void *x)
>  {
> diff --git a/lib/arm64/asm/mmu.h b/lib/arm64/asm/mmu.h
> index 72d75eafc882..72371b2d9fe3 100644
> --- a/lib/arm64/asm/mmu.h
> +++ b/lib/arm64/asm/mmu.h
> @@ -8,6 +8,7 @@
>  #include <asm/barrier.h>
>  
>  #define PMD_SECT_UNCACHED	PMD_ATTRINDX(MT_DEVICE_nGnRE)
> +#define PTE_UNCACHED		PTE_ATTRINDX(MT_DEVICE_nGnRE)
>  #define PTE_WBWA		PTE_ATTRINDX(MT_NORMAL)
>  
>  static inline void flush_tlb_all(void)
> diff --git a/lib/arm64/asm/page.h b/lib/arm64/asm/page.h
> index ae4484b22114..d0fac6ea563d 100644
> --- a/lib/arm64/asm/page.h
> +++ b/lib/arm64/asm/page.h
> @@ -72,5 +72,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
>  extern phys_addr_t __virt_to_phys(unsigned long addr);
>  extern unsigned long __phys_to_virt(phys_addr_t addr);
>  
> +extern void *__ioremap(phys_addr_t phys_addr, size_t size);
> +
>  #endif /* !__ASSEMBLY__ */
>  #endif /* _ASMARM64_PAGE_H_ */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate memory layout assumptions
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate " Andrew Jones
@ 2021-05-11 15:11   ` Alexandru Elisei
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-11 15:11 UTC (permalink / raw)
  To: Andrew Jones, kvm; +Cc: nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 4/29/21 5:41 PM, Andrew Jones wrote:
> Keep as much memory layout assumptions as possible in init::start
> and a single setup function. This prepares us for calling setup()
> from different start functions which have been linked with different
> linker scripts. To do this, stacktop is only referenced from
> init::start, making freemem_start a parameter to setup(). We also
> split mem_init() into three parts, one that populates the mem regions
> per the DT, one that populates the mem regions per assumptions,
> and one that does the mem init. The concept of a primary region
> is dropped, but we add a sanity check for the absence of memory
> holes, because we don't know how to deal with them yet.

Looks good to me:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

Thanks,

Alex

>
> Reviewed-by: Nikos Nikoleris <nikos.nikoleris@arm.com>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  arm/cstart.S        |   4 +-
>  arm/cstart64.S      |   2 +
>  arm/flat.lds        |  23 ++++++
>  lib/arm/asm/setup.h |   8 +-
>  lib/arm/mmu.c       |   2 -
>  lib/arm/setup.c     | 175 ++++++++++++++++++++++++--------------------
>  6 files changed, 128 insertions(+), 86 deletions(-)
>
> diff --git a/arm/cstart.S b/arm/cstart.S
> index bf3c78157e6a..446966de350d 100644
> --- a/arm/cstart.S
> +++ b/arm/cstart.S
> @@ -80,7 +80,9 @@ start:
>  
>  	/* complete setup */
>  	pop	{r0-r1}
> -	bl	setup
> +	mov	r3, #0
> +	ldr	r2, =stacktop		@ r2,r3 is the base of free memory
> +	bl	setup			@ r0 is the addr of the dtb
>  
>  	/* run the test */
>  	ldr	r0, =__argc
> diff --git a/arm/cstart64.S b/arm/cstart64.S
> index 27251fe8b5cd..42ba3a3ca249 100644
> --- a/arm/cstart64.S
> +++ b/arm/cstart64.S
> @@ -92,6 +92,8 @@ start:
>  	bl	exceptions_init
>  
>  	/* complete setup */
> +	adrp	x1, stacktop
> +	add	x1, x1, :lo12:stacktop		// x1 is the base of free memory
>  	bl	setup				// x0 is the addr of the dtb
>  
>  	/* run the test */
> diff --git a/arm/flat.lds b/arm/flat.lds
> index 6ed377c0eaa0..6fb459efb815 100644
> --- a/arm/flat.lds
> +++ b/arm/flat.lds
> @@ -1,3 +1,26 @@
> +/*
> + * init::start will pass stacktop to setup() as the base of free memory.
> + * setup() will then move the FDT and initrd to that base before calling
> + * mem_init(). With those movements and this linker script, we'll end up
> + * having the following memory layout:
> + *
> + *    +----------------------+   <-- top of physical memory
> + *    |                      |
> + *    ~                      ~
> + *    |                      |
> + *    +----------------------+   <-- top of initrd
> + *    |                      |
> + *    +----------------------+   <-- top of FDT
> + *    |                      |
> + *    +----------------------+   <-- top of cpu0's stack
> + *    |                      |
> + *    +----------------------+   <-- top of text/data/bss sections
> + *    |                      |
> + *    |                      |
> + *    +----------------------+   <-- load address
> + *    |                      |
> + *    +----------------------+   <-- physical address 0x0
> + */
>  
>  SECTIONS
>  {
> diff --git a/lib/arm/asm/setup.h b/lib/arm/asm/setup.h
> index 210c14f818fb..f0e70b119fb0 100644
> --- a/lib/arm/asm/setup.h
> +++ b/lib/arm/asm/setup.h
> @@ -13,9 +13,8 @@
>  extern u64 cpus[NR_CPUS];	/* per-cpu IDs (MPIDRs) */
>  extern int nr_cpus;
>  
> -#define MR_F_PRIMARY		(1U << 0)
> -#define MR_F_IO			(1U << 1)
> -#define MR_F_CODE		(1U << 2)
> +#define MR_F_IO			(1U << 0)
> +#define MR_F_CODE		(1U << 1)
>  #define MR_F_UNKNOWN		(1U << 31)
>  
>  struct mem_region {
> @@ -26,6 +25,7 @@ struct mem_region {
>  extern struct mem_region *mem_regions;
>  extern phys_addr_t __phys_offset, __phys_end;
>  
> +extern struct mem_region *mem_region_find(phys_addr_t paddr);
>  extern unsigned int mem_region_get_flags(phys_addr_t paddr);
>  
>  #define PHYS_OFFSET		(__phys_offset)
> @@ -35,6 +35,6 @@ extern unsigned int mem_region_get_flags(phys_addr_t paddr);
>  #define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
>  #define SMP_CACHE_BYTES		L1_CACHE_BYTES
>  
> -void setup(const void *fdt);
> +void setup(const void *fdt, phys_addr_t freemem_start);
>  
>  #endif /* _ASMARM_SETUP_H_ */
> diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
> index 7d658a3fe89c..7628f797a543 100644
> --- a/lib/arm/mmu.c
> +++ b/lib/arm/mmu.c
> @@ -175,12 +175,10 @@ void *setup_mmu(phys_addr_t phys_end)
>  		if (r->flags & MR_F_IO) {
>  			continue;
>  		} else if (r->flags & MR_F_CODE) {
> -			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected code region");
>  			/* armv8 requires code shared between EL1 and EL0 to be read-only */
>  			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
>  					   __pgprot(PTE_WBWA | PTE_USER | PTE_RDONLY));
>  		} else {
> -			assert_msg(r->flags & MR_F_PRIMARY, "Unexpected data region");
>  			mmu_set_range_ptes(mmu_idmap, r->start, r->start, r->end,
>  					   __pgprot(PTE_WBWA | PTE_USER));
>  		}
> diff --git a/lib/arm/setup.c b/lib/arm/setup.c
> index 7db308b70744..86f054304baf 100644
> --- a/lib/arm/setup.c
> +++ b/lib/arm/setup.c
> @@ -28,9 +28,10 @@
>  
>  #include "io.h"
>  
> -#define NR_INITIAL_MEM_REGIONS 16
> +#define MAX_DT_MEM_REGIONS	16
> +#define NR_EXTRA_MEM_REGIONS	16
> +#define NR_INITIAL_MEM_REGIONS	(MAX_DT_MEM_REGIONS + NR_EXTRA_MEM_REGIONS)
>  
> -extern unsigned long stacktop;
>  extern unsigned long etext;
>  
>  struct timer_state __timer_state;
> @@ -75,28 +76,68 @@ static void cpu_init(void)
>  	set_cpu_online(0, true);
>  }
>  
> -unsigned int mem_region_get_flags(phys_addr_t paddr)
> +static void mem_region_add(struct mem_region *r)
> +{
> +	struct mem_region *r_next = mem_regions;
> +	int i = 0;
> +
> +	for (; r_next->end; ++r_next, ++i)
> +		;
> +	assert(i < NR_INITIAL_MEM_REGIONS);
> +
> +	*r_next = *r;
> +}
> +
> +static void mem_regions_add_dt_regions(void)
> +{
> +	struct dt_pbus_reg regs[MAX_DT_MEM_REGIONS];
> +	int nr_regs, i;
> +
> +	nr_regs = dt_get_memory_params(regs, MAX_DT_MEM_REGIONS);
> +	assert(nr_regs > 0);
> +
> +	for (i = 0; i < nr_regs; ++i) {
> +		mem_region_add(&(struct mem_region){
> +			.start = regs[i].addr,
> +			.end = regs[i].addr + regs[i].size,
> +		});
> +	}
> +}
> +
> +struct mem_region *mem_region_find(phys_addr_t paddr)
>  {
>  	struct mem_region *r;
>  
> -	for (r = mem_regions; r->end; ++r) {
> +	for (r = mem_regions; r->end; ++r)
>  		if (paddr >= r->start && paddr < r->end)
> -			return r->flags;
> -	}
> +			return r;
> +	return NULL;
> +}
>  
> -	return MR_F_UNKNOWN;
> +unsigned int mem_region_get_flags(phys_addr_t paddr)
> +{
> +	struct mem_region *r = mem_region_find(paddr);
> +	return r ? r->flags : MR_F_UNKNOWN;
>  }
>  
> -static void mem_init(phys_addr_t freemem_start)
> +static void mem_regions_add_assumed(void)
>  {
>  	phys_addr_t code_end = (phys_addr_t)(unsigned long)&etext;
> -	struct dt_pbus_reg regs[NR_INITIAL_MEM_REGIONS];
> -	struct mem_region mem = {
> -		.start = (phys_addr_t)-1,
> +	struct mem_region *r;
> +
> +	r = mem_region_find(code_end - 1);
> +	assert(r);
> +
> +	/* Split the region with the code into two regions; code and data */
> +	mem_region_add(&(struct mem_region){
> +		.start = code_end,
> +		.end = r->end,
> +	});
> +	*r = (struct mem_region){
> +		.start = r->start,
> +		.end = code_end,
> +		.flags = MR_F_CODE,
>  	};
> -	struct mem_region *primary = NULL;
> -	phys_addr_t base, top;
> -	int nr_regs, nr_io = 0, i;
>  
>  	/*
>  	 * mach-virt I/O regions:
> @@ -104,57 +145,47 @@ static void mem_init(phys_addr_t freemem_start)
>  	 *   - 512M at 256G (arm64, arm uses highmem=off)
>  	 *   - 512G at 512G (arm64, arm uses highmem=off)
>  	 */
> -	mem_regions[nr_io++] = (struct mem_region){ 0, (1ul << 30), MR_F_IO };
> +	mem_region_add(&(struct mem_region){ 0, (1ul << 30), MR_F_IO });
>  #ifdef __aarch64__
> -	mem_regions[nr_io++] = (struct mem_region){ (1ul << 38), (1ul << 38) | (1ul << 29), MR_F_IO };
> -	mem_regions[nr_io++] = (struct mem_region){ (1ul << 39), (1ul << 40), MR_F_IO };
> +	mem_region_add(&(struct mem_region){ (1ul << 38), (1ul << 38) | (1ul << 29), MR_F_IO });
> +	mem_region_add(&(struct mem_region){ (1ul << 39), (1ul << 40), MR_F_IO });
>  #endif
> +}
>  
> -	nr_regs = dt_get_memory_params(regs, NR_INITIAL_MEM_REGIONS - nr_io);
> -	assert(nr_regs > 0);
> -
> -	for (i = 0; i < nr_regs; ++i) {
> -		struct mem_region *r = &mem_regions[nr_io + i];
> +static void mem_init(phys_addr_t freemem_start)
> +{
> +	phys_addr_t base, top;
> +	struct mem_region *freemem, *r, mem = {
> +		.start = (phys_addr_t)-1,
> +	};
>  
> -		r->start = regs[i].addr;
> -		r->end = regs[i].addr + regs[i].size;
> +	freemem = mem_region_find(freemem_start);
> +	assert(freemem && !(freemem->flags & (MR_F_IO | MR_F_CODE)));
>  
> -		/*
> -		 * pick the region we're in for our primary region
> -		 */
> -		if (freemem_start >= r->start && freemem_start < r->end) {
> -			r->flags |= MR_F_PRIMARY;
> -			primary = r;
> +	for (r = mem_regions; r->end; ++r) {
> +		if (!(r->flags & MR_F_IO)) {
> +			if (r->start < mem.start)
> +				mem.start = r->start;
> +			if (r->end > mem.end)
> +				mem.end = r->end;
>  		}
> -
> -		/*
> -		 * set the lowest and highest addresses found,
> -		 * ignoring potential gaps
> -		 */
> -		if (r->start < mem.start)
> -			mem.start = r->start;
> -		if (r->end > mem.end)
> -			mem.end = r->end;
>  	}
> -	assert(primary);
> -	assert(!(mem.start & ~PHYS_MASK) && !((mem.end - 1) & ~PHYS_MASK));
> +	assert(mem.end && !(mem.start & ~PHYS_MASK));
> +	mem.end &= PHYS_MASK;
>  
> -	__phys_offset = primary->start;	/* PHYS_OFFSET */
> -	__phys_end = primary->end;	/* PHYS_END */
> +	/* Check for holes */
> +	r = mem_region_find(mem.start);
> +	while (r && r->end != mem.end)
> +		r = mem_region_find(r->end);
> +	assert(r);
>  
> -	/* Split the primary region into two regions; code and data */
> -	mem_regions[nr_io + i] = (struct mem_region){
> -		.start = code_end,
> -		.end = primary->end,
> -		.flags = MR_F_PRIMARY,
> -	};
> -	*primary = (struct mem_region){
> -		.start = primary->start,
> -		.end = code_end,
> -		.flags = MR_F_PRIMARY | MR_F_CODE,
> -	};
> +	/* Ensure our selected freemem range is somewhere in our full range */
> +	assert(freemem_start >= mem.start && freemem->end <= mem.end);
>  
> -	phys_alloc_init(freemem_start, __phys_end - freemem_start);
> +	__phys_offset = mem.start;	/* PHYS_OFFSET */
> +	__phys_end = mem.end;		/* PHYS_END */
> +
> +	phys_alloc_init(freemem_start, freemem->end - freemem_start);
>  	phys_alloc_set_minimum_alignment(SMP_CACHE_BYTES);
>  
>  	phys_alloc_get_unused(&base, &top);
> @@ -204,35 +235,17 @@ static void timer_save_state(void)
>  	__timer_state.vtimer.irq_flags = fdt32_to_cpu(data[8]);
>  }
>  
> -void setup(const void *fdt)
> +void setup(const void *fdt, phys_addr_t freemem_start)
>  {
> -	void *freemem = &stacktop;
> +	void *freemem;
>  	const char *bootargs, *tmp;
>  	u32 fdt_size;
>  	int ret;
>  
> -	/*
> -	 * Before calling mem_init we need to move the fdt and initrd
> -	 * to safe locations. We move them to construct the memory
> -	 * map illustrated below:
> -	 *
> -	 *    +----------------------+   <-- top of physical memory
> -	 *    |                      |
> -	 *    ~                      ~
> -	 *    |                      |
> -	 *    +----------------------+   <-- top of initrd
> -	 *    |                      |
> -	 *    +----------------------+   <-- top of FDT
> -	 *    |                      |
> -	 *    +----------------------+   <-- top of cpu0's stack
> -	 *    |                      |
> -	 *    +----------------------+   <-- top of text/data/bss sections,
> -	 *    |                      |       see arm/flat.lds
> -	 *    |                      |
> -	 *    +----------------------+   <-- load address
> -	 *    |                      |
> -	 *    +----------------------+
> -	 */
> +	assert(sizeof(long) == 8 || freemem_start < (3ul << 30));
> +	freemem = (void *)(unsigned long)freemem_start;
> +
> +	/* Move the FDT to the base of free memory */
>  	fdt_size = fdt_totalsize(fdt);
>  	ret = fdt_move(fdt, freemem, fdt_size);
>  	assert(ret == 0);
> @@ -240,6 +253,7 @@ void setup(const void *fdt)
>  	assert(ret == 0);
>  	freemem += fdt_size;
>  
> +	/* Move the initrd to the top of the FDT */
>  	ret = dt_get_initrd(&tmp, &initrd_size);
>  	assert(ret == 0 || ret == -FDT_ERR_NOTFOUND);
>  	if (ret == 0) {
> @@ -248,7 +262,10 @@ void setup(const void *fdt)
>  		freemem += initrd_size;
>  	}
>  
> +	mem_regions_add_dt_regions();
> +	mem_regions_add_assumed();
>  	mem_init(PAGE_ALIGN((unsigned long)freemem));
> +
>  	cpu_init();
>  
>  	/* cpu_init must be called before thread_info_init */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc Andrew Jones
@ 2021-05-12 16:14   ` Alexandru Elisei
  2021-05-13  7:08     ` Andrew Jones
  2021-05-13 10:18   ` [PATCH kvm-unit-tests v4] " Andrew Jones
  1 sibling, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-12 16:14 UTC (permalink / raw)
  To: Andrew Jones, kvm; +Cc: nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 4/29/21 5:41 PM, Andrew Jones wrote:
> The method can be smc in addition to hvc, and it will be when running
> on bare metal. Additionally, we move the invocations to assembly so
> we don't have to rely on compiler assumptions. We also fix the
> prototype of psci_invoke. It should return long, not int, and
> function_id should be an unsigned int, not an unsigned long.

Sorry to harp on this again, but to be honest, it's still not clear to me why the
psci_invoke_{hvc,smc} functions return a long int.

If we only expect the PSCI functions to return error codes, then the PSCI spec
says these are 32-bit signed integers. If we want to support PSCI functions
returning other values, like PSCI_STAT_{RESIDENCY,COUNT}, then the invoke
functions should return an unsigned value.

The only case we're supporting is the error return for the SMC calling convention
(which says that error codes are 32/64bit signed integers). Since the commit
specifically mentions this change, I think some explanation why this width was
chosen would be appropriate. Unless it's painfully obvious and I'm just not seeing
it, which is a possibility.

Other than that, the patch looks good. I'll some tests to make sure everything is
in order.

Thanks,

Alex

>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  arm/cstart.S       | 22 ++++++++++++++++++++++
>  arm/cstart64.S     | 22 ++++++++++++++++++++++
>  arm/selftest.c     | 34 +++++++---------------------------
>  lib/arm/asm/psci.h | 10 ++++++++--
>  lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
>  lib/arm/setup.c    |  2 ++
>  6 files changed, 88 insertions(+), 37 deletions(-)
>
> diff --git a/arm/cstart.S b/arm/cstart.S
> index 446966de350d..2401d92cdadc 100644
> --- a/arm/cstart.S
> +++ b/arm/cstart.S
> @@ -95,6 +95,28 @@ start:
>  
>  .text
>  
> +/*
> + * psci_invoke_hvc / psci_invoke_smc
> + *
> + * Inputs:
> + *   r0 -- function_id
> + *   r1 -- arg0
> + *   r2 -- arg1
> + *   r3 -- arg2
> + *
> + * Outputs:
> + *   r0 -- return code
> + */
> +.globl psci_invoke_hvc
> +psci_invoke_hvc:
> +	hvc	#0
> +	mov	pc, lr
> +
> +.globl psci_invoke_smc
> +psci_invoke_smc:
> +	smc	#0
> +	mov	pc, lr
> +
>  enable_vfp:
>  	/* Enable full access to CP10 and CP11: */
>  	mov	r0, #(3 << 22 | 3 << 20)
> diff --git a/arm/cstart64.S b/arm/cstart64.S
> index 42ba3a3ca249..e4ab7d06251e 100644
> --- a/arm/cstart64.S
> +++ b/arm/cstart64.S
> @@ -109,6 +109,28 @@ start:
>  
>  .text
>  
> +/*
> + * psci_invoke_hvc / psci_invoke_smc
> + *
> + * Inputs:
> + *   w0 -- function_id
> + *   x1 -- arg0
> + *   x2 -- arg1
> + *   x3 -- arg2
> + *
> + * Outputs:
> + *   x0 -- return code
> + */
> +.globl psci_invoke_hvc
> +psci_invoke_hvc:
> +	hvc	#0
> +	ret
> +
> +.globl psci_invoke_smc
> +psci_invoke_smc:
> +	smc	#0
> +	ret
> +
>  get_mmu_off:
>  	adrp	x0, auxinfo
>  	ldr	x0, [x0, :lo12:auxinfo + 8]
> diff --git a/arm/selftest.c b/arm/selftest.c
> index 4495b161cdd5..9f459ed3d571 100644
> --- a/arm/selftest.c
> +++ b/arm/selftest.c
> @@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
>  	exit(report_summary());
>  }
>  
> -static bool psci_check(void)
> +static void psci_print(void)
>  {
> -	const struct fdt_property *method;
> -	int node, len, ver;
> -
> -	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
> -	if (node < 0) {
> -		printf("PSCI v0.2 compatibility required\n");
> -		return false;
> -	}
> -
> -	method = fdt_get_property(dt_fdt(), node, "method", &len);
> -	if (method == NULL) {
> -		printf("bad psci device tree node\n");
> -		return false;
> -	}
> -
> -	if (len < 4 || strcmp(method->data, "hvc") != 0) {
> -		printf("psci method must be hvc\n");
> -		return false;
> -	}
> -
> -	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> -	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
> -				       PSCI_VERSION_MINOR(ver));
> -
> -	return true;
> +	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> +	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
> +					  PSCI_VERSION_MINOR(ver));
> +	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
> +				       "hvc" : "smc");
>  }
>  
>  static void cpu_report(void *data __unused)
> @@ -465,7 +445,7 @@ int main(int argc, char **argv)
>  
>  	} else if (strcmp(argv[1], "smp") == 0) {
>  
> -		report(psci_check(), "PSCI version");
> +		psci_print();
>  		on_cpus(cpu_report, NULL);
>  		while (!cpumask_full(&ready))
>  			cpu_relax();
> diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
> index 7b956bf5987d..f87fca0422cc 100644
> --- a/lib/arm/asm/psci.h
> +++ b/lib/arm/asm/psci.h
> @@ -3,8 +3,14 @@
>  #include <libcflat.h>
>  #include <linux/psci.h>
>  
> -extern int psci_invoke(unsigned long function_id, unsigned long arg0,
> -		       unsigned long arg1, unsigned long arg2);
> +typedef long (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
> +			       unsigned long arg1, unsigned long arg2);
> +extern psci_invoke_fn psci_invoke;
> +extern long psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
> +			    unsigned long arg1, unsigned long arg2);
> +extern long psci_invoke_smc(unsigned int function_id, unsigned long arg0,
> +			    unsigned long arg1, unsigned long arg2);
> +extern void psci_set_conduit(void);
>  extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
>  extern void psci_system_reset(void);
>  extern int cpu_psci_cpu_boot(unsigned int cpu);
> diff --git a/lib/arm/psci.c b/lib/arm/psci.c
> index 936c83948b6a..3053b3041c28 100644
> --- a/lib/arm/psci.c
> +++ b/lib/arm/psci.c
> @@ -6,22 +6,21 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> +#include <devicetree.h>
>  #include <asm/psci.h>
>  #include <asm/setup.h>
>  #include <asm/page.h>
>  #include <asm/smp.h>
>  
> -__attribute__((noinline))
> -int psci_invoke(unsigned long function_id, unsigned long arg0,
> -		unsigned long arg1, unsigned long arg2)
> +static long psci_invoke_none(unsigned int function_id, unsigned long arg0,
> +			     unsigned long arg1, unsigned long arg2)
>  {
> -	asm volatile(
> -		"hvc #0"
> -	: "+r" (function_id)
> -	: "r" (arg0), "r" (arg1), "r" (arg2));
> -	return function_id;
> +	printf("No PSCI method configured! Can't invoke...\n");
> +	return PSCI_RET_NOT_PRESENT;
>  }
>  
> +psci_invoke_fn psci_invoke = psci_invoke_none;
> +
>  int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
>  {
>  #ifdef __arm__
> @@ -56,3 +55,23 @@ void psci_system_off(void)
>  	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
>  	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
>  }
> +
> +void psci_set_conduit(void)
> +{
> +	const void *fdt = dt_fdt();
> +	const struct fdt_property *method;
> +	int node, len;
> +
> +	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
> +	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
> +
> +	method = fdt_get_property(fdt, node, "method", &len);
> +	assert(method != NULL && len == 4);
> +
> +	if (strcmp(method->data, "hvc") == 0)
> +		psci_invoke = psci_invoke_hvc;
> +	else if (strcmp(method->data, "smc") == 0)
> +		psci_invoke = psci_invoke_smc;
> +	else
> +		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
> +}
> diff --git a/lib/arm/setup.c b/lib/arm/setup.c
> index 86f054304baf..bcdf0d78c2e2 100644
> --- a/lib/arm/setup.c
> +++ b/lib/arm/setup.c
> @@ -25,6 +25,7 @@
>  #include <asm/processor.h>
>  #include <asm/smp.h>
>  #include <asm/timer.h>
> +#include <asm/psci.h>
>  
>  #include "io.h"
>  
> @@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
>  	mem_regions_add_assumed();
>  	mem_init(PAGE_ALIGN((unsigned long)freemem));
>  
> +	psci_set_conduit();
>  	cpu_init();
>  
>  	/* cpu_init must be called before thread_info_init */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc
  2021-05-12 16:14   ` Alexandru Elisei
@ 2021-05-13  7:08     ` Andrew Jones
  2021-05-13  9:08       ` Alexandru Elisei
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-05-13  7:08 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

On Wed, May 12, 2021 at 05:14:24PM +0100, Alexandru Elisei wrote:
> Hi Drew,
> 
> On 4/29/21 5:41 PM, Andrew Jones wrote:
> > The method can be smc in addition to hvc, and it will be when running
> > on bare metal. Additionally, we move the invocations to assembly so
> > we don't have to rely on compiler assumptions. We also fix the
> > prototype of psci_invoke. It should return long, not int, and
> > function_id should be an unsigned int, not an unsigned long.
> 
> Sorry to harp on this again, but to be honest, it's still not clear to me why the
> psci_invoke_{hvc,smc} functions return a long int.
> 
> If we only expect the PSCI functions to return error codes, then the PSCI spec
> says these are 32-bit signed integers. If we want to support PSCI functions
> returning other values, like PSCI_STAT_{RESIDENCY,COUNT}, then the invoke
> functions should return an unsigned value.
> 
> The only case we're supporting is the error return for the SMC calling convention
> (which says that error codes are 32/64bit signed integers).

psci_invoke_{hvc,smc} should implement the SMC calling convention, since
they're just wrapping the smc/hvc call. PSCI calls that build on that,
e.g. psci_cpu_on, can define their own return type and then translate
the signed long returned by SMC into, e.g. 32-bit signed integers. Indeed
that's what psci_cpu_on does.

I would write something like that in the commit message or rename
psci_invoke to smc_invoke.

Thanks,
drew

> Since the commit
> specifically mentions this change, I think some explanation why this width was
> chosen would be appropriate. Unless it's painfully obvious and I'm just not seeing
> it, which is a possibility.
> 
> Other than that, the patch looks good. I'll some tests to make sure everything is
> in order.
> 
> Thanks,
> 
> Alex
> 
> >
> > Signed-off-by: Andrew Jones <drjones@redhat.com>
> > ---
> >  arm/cstart.S       | 22 ++++++++++++++++++++++
> >  arm/cstart64.S     | 22 ++++++++++++++++++++++
> >  arm/selftest.c     | 34 +++++++---------------------------
> >  lib/arm/asm/psci.h | 10 ++++++++--
> >  lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
> >  lib/arm/setup.c    |  2 ++
> >  6 files changed, 88 insertions(+), 37 deletions(-)
> >
> > diff --git a/arm/cstart.S b/arm/cstart.S
> > index 446966de350d..2401d92cdadc 100644
> > --- a/arm/cstart.S
> > +++ b/arm/cstart.S
> > @@ -95,6 +95,28 @@ start:
> >  
> >  .text
> >  
> > +/*
> > + * psci_invoke_hvc / psci_invoke_smc
> > + *
> > + * Inputs:
> > + *   r0 -- function_id
> > + *   r1 -- arg0
> > + *   r2 -- arg1
> > + *   r3 -- arg2
> > + *
> > + * Outputs:
> > + *   r0 -- return code
> > + */
> > +.globl psci_invoke_hvc
> > +psci_invoke_hvc:
> > +	hvc	#0
> > +	mov	pc, lr
> > +
> > +.globl psci_invoke_smc
> > +psci_invoke_smc:
> > +	smc	#0
> > +	mov	pc, lr
> > +
> >  enable_vfp:
> >  	/* Enable full access to CP10 and CP11: */
> >  	mov	r0, #(3 << 22 | 3 << 20)
> > diff --git a/arm/cstart64.S b/arm/cstart64.S
> > index 42ba3a3ca249..e4ab7d06251e 100644
> > --- a/arm/cstart64.S
> > +++ b/arm/cstart64.S
> > @@ -109,6 +109,28 @@ start:
> >  
> >  .text
> >  
> > +/*
> > + * psci_invoke_hvc / psci_invoke_smc
> > + *
> > + * Inputs:
> > + *   w0 -- function_id
> > + *   x1 -- arg0
> > + *   x2 -- arg1
> > + *   x3 -- arg2
> > + *
> > + * Outputs:
> > + *   x0 -- return code
> > + */
> > +.globl psci_invoke_hvc
> > +psci_invoke_hvc:
> > +	hvc	#0
> > +	ret
> > +
> > +.globl psci_invoke_smc
> > +psci_invoke_smc:
> > +	smc	#0
> > +	ret
> > +
> >  get_mmu_off:
> >  	adrp	x0, auxinfo
> >  	ldr	x0, [x0, :lo12:auxinfo + 8]
> > diff --git a/arm/selftest.c b/arm/selftest.c
> > index 4495b161cdd5..9f459ed3d571 100644
> > --- a/arm/selftest.c
> > +++ b/arm/selftest.c
> > @@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
> >  	exit(report_summary());
> >  }
> >  
> > -static bool psci_check(void)
> > +static void psci_print(void)
> >  {
> > -	const struct fdt_property *method;
> > -	int node, len, ver;
> > -
> > -	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
> > -	if (node < 0) {
> > -		printf("PSCI v0.2 compatibility required\n");
> > -		return false;
> > -	}
> > -
> > -	method = fdt_get_property(dt_fdt(), node, "method", &len);
> > -	if (method == NULL) {
> > -		printf("bad psci device tree node\n");
> > -		return false;
> > -	}
> > -
> > -	if (len < 4 || strcmp(method->data, "hvc") != 0) {
> > -		printf("psci method must be hvc\n");
> > -		return false;
> > -	}
> > -
> > -	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> > -	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
> > -				       PSCI_VERSION_MINOR(ver));
> > -
> > -	return true;
> > +	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> > +	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
> > +					  PSCI_VERSION_MINOR(ver));
> > +	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
> > +				       "hvc" : "smc");
> >  }
> >  
> >  static void cpu_report(void *data __unused)
> > @@ -465,7 +445,7 @@ int main(int argc, char **argv)
> >  
> >  	} else if (strcmp(argv[1], "smp") == 0) {
> >  
> > -		report(psci_check(), "PSCI version");
> > +		psci_print();
> >  		on_cpus(cpu_report, NULL);
> >  		while (!cpumask_full(&ready))
> >  			cpu_relax();
> > diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
> > index 7b956bf5987d..f87fca0422cc 100644
> > --- a/lib/arm/asm/psci.h
> > +++ b/lib/arm/asm/psci.h
> > @@ -3,8 +3,14 @@
> >  #include <libcflat.h>
> >  #include <linux/psci.h>
> >  
> > -extern int psci_invoke(unsigned long function_id, unsigned long arg0,
> > -		       unsigned long arg1, unsigned long arg2);
> > +typedef long (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
> > +			       unsigned long arg1, unsigned long arg2);
> > +extern psci_invoke_fn psci_invoke;
> > +extern long psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
> > +			    unsigned long arg1, unsigned long arg2);
> > +extern long psci_invoke_smc(unsigned int function_id, unsigned long arg0,
> > +			    unsigned long arg1, unsigned long arg2);
> > +extern void psci_set_conduit(void);
> >  extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
> >  extern void psci_system_reset(void);
> >  extern int cpu_psci_cpu_boot(unsigned int cpu);
> > diff --git a/lib/arm/psci.c b/lib/arm/psci.c
> > index 936c83948b6a..3053b3041c28 100644
> > --- a/lib/arm/psci.c
> > +++ b/lib/arm/psci.c
> > @@ -6,22 +6,21 @@
> >   *
> >   * This work is licensed under the terms of the GNU LGPL, version 2.
> >   */
> > +#include <devicetree.h>
> >  #include <asm/psci.h>
> >  #include <asm/setup.h>
> >  #include <asm/page.h>
> >  #include <asm/smp.h>
> >  
> > -__attribute__((noinline))
> > -int psci_invoke(unsigned long function_id, unsigned long arg0,
> > -		unsigned long arg1, unsigned long arg2)
> > +static long psci_invoke_none(unsigned int function_id, unsigned long arg0,
> > +			     unsigned long arg1, unsigned long arg2)
> >  {
> > -	asm volatile(
> > -		"hvc #0"
> > -	: "+r" (function_id)
> > -	: "r" (arg0), "r" (arg1), "r" (arg2));
> > -	return function_id;
> > +	printf("No PSCI method configured! Can't invoke...\n");
> > +	return PSCI_RET_NOT_PRESENT;
> >  }
> >  
> > +psci_invoke_fn psci_invoke = psci_invoke_none;
> > +
> >  int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
> >  {
> >  #ifdef __arm__
> > @@ -56,3 +55,23 @@ void psci_system_off(void)
> >  	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
> >  	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
> >  }
> > +
> > +void psci_set_conduit(void)
> > +{
> > +	const void *fdt = dt_fdt();
> > +	const struct fdt_property *method;
> > +	int node, len;
> > +
> > +	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
> > +	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
> > +
> > +	method = fdt_get_property(fdt, node, "method", &len);
> > +	assert(method != NULL && len == 4);
> > +
> > +	if (strcmp(method->data, "hvc") == 0)
> > +		psci_invoke = psci_invoke_hvc;
> > +	else if (strcmp(method->data, "smc") == 0)
> > +		psci_invoke = psci_invoke_smc;
> > +	else
> > +		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
> > +}
> > diff --git a/lib/arm/setup.c b/lib/arm/setup.c
> > index 86f054304baf..bcdf0d78c2e2 100644
> > --- a/lib/arm/setup.c
> > +++ b/lib/arm/setup.c
> > @@ -25,6 +25,7 @@
> >  #include <asm/processor.h>
> >  #include <asm/smp.h>
> >  #include <asm/timer.h>
> > +#include <asm/psci.h>
> >  
> >  #include "io.h"
> >  
> > @@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
> >  	mem_regions_add_assumed();
> >  	mem_init(PAGE_ALIGN((unsigned long)freemem));
> >  
> > +	psci_set_conduit();
> >  	cpu_init();
> >  
> >  	/* cpu_init must be called before thread_info_init */
> 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc
  2021-05-13  7:08     ` Andrew Jones
@ 2021-05-13  9:08       ` Alexandru Elisei
  2021-05-13 10:06         ` Andrew Jones
  0 siblings, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-13  9:08 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 5/13/21 8:08 AM, Andrew Jones wrote:
> On Wed, May 12, 2021 at 05:14:24PM +0100, Alexandru Elisei wrote:
>> Hi Drew,
>>
>> On 4/29/21 5:41 PM, Andrew Jones wrote:
>>> The method can be smc in addition to hvc, and it will be when running
>>> on bare metal. Additionally, we move the invocations to assembly so
>>> we don't have to rely on compiler assumptions. We also fix the
>>> prototype of psci_invoke. It should return long, not int, and
>>> function_id should be an unsigned int, not an unsigned long.
>> Sorry to harp on this again, but to be honest, it's still not clear to me why the
>> psci_invoke_{hvc,smc} functions return a long int.
>>
>> If we only expect the PSCI functions to return error codes, then the PSCI spec
>> says these are 32-bit signed integers. If we want to support PSCI functions
>> returning other values, like PSCI_STAT_{RESIDENCY,COUNT}, then the invoke
>> functions should return an unsigned value.
>>
>> The only case we're supporting is the error return for the SMC calling convention
>> (which says that error codes are 32/64bit signed integers).
> psci_invoke_{hvc,smc} should implement the SMC calling convention, since
> they're just wrapping the smc/hvc call. PSCI calls that build on that,
> e.g. psci_cpu_on, can define their own return type and then translate
> the signed long returned by SMC into, e.g. 32-bit signed integers. Indeed
> that's what psci_cpu_on does.
>
> I would write something like that in the commit message or rename
> psci_invoke to smc_invoke.

I agree that psci_invoke_* use the SMC calling convention, but we're not
implementing *all* the features of the SMC calling convention, because SMCCC can
return more than one result in registers r0-r3. In my opinion, I think the easiest
solution and the most consistent with both specifications would be to keep the
current names and change the return value either to an int, and put a comment
saying that we only support PSCI functions that return an error, either to a long
unsigned int, meaning that we support *all* PSCI functions as defined in ARM DEN
0022D.

What do you think? Does that make sense?

Thanks,

Alex

>  
>
> Thanks,
> drew
>
>> Since the commit
>> specifically mentions this change, I think some explanation why this width was
>> chosen would be appropriate. Unless it's painfully obvious and I'm just not seeing
>> it, which is a possibility.
>>
>> Other than that, the patch looks good. I'll some tests to make sure everything is
>> in order.
>>
>> Thanks,
>>
>> Alex
>>
>>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>>> ---
>>>  arm/cstart.S       | 22 ++++++++++++++++++++++
>>>  arm/cstart64.S     | 22 ++++++++++++++++++++++
>>>  arm/selftest.c     | 34 +++++++---------------------------
>>>  lib/arm/asm/psci.h | 10 ++++++++--
>>>  lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
>>>  lib/arm/setup.c    |  2 ++
>>>  6 files changed, 88 insertions(+), 37 deletions(-)
>>>
>>> diff --git a/arm/cstart.S b/arm/cstart.S
>>> index 446966de350d..2401d92cdadc 100644
>>> --- a/arm/cstart.S
>>> +++ b/arm/cstart.S
>>> @@ -95,6 +95,28 @@ start:
>>>  
>>>  .text
>>>  
>>> +/*
>>> + * psci_invoke_hvc / psci_invoke_smc
>>> + *
>>> + * Inputs:
>>> + *   r0 -- function_id
>>> + *   r1 -- arg0
>>> + *   r2 -- arg1
>>> + *   r3 -- arg2
>>> + *
>>> + * Outputs:
>>> + *   r0 -- return code
>>> + */
>>> +.globl psci_invoke_hvc
>>> +psci_invoke_hvc:
>>> +	hvc	#0
>>> +	mov	pc, lr
>>> +
>>> +.globl psci_invoke_smc
>>> +psci_invoke_smc:
>>> +	smc	#0
>>> +	mov	pc, lr
>>> +
>>>  enable_vfp:
>>>  	/* Enable full access to CP10 and CP11: */
>>>  	mov	r0, #(3 << 22 | 3 << 20)
>>> diff --git a/arm/cstart64.S b/arm/cstart64.S
>>> index 42ba3a3ca249..e4ab7d06251e 100644
>>> --- a/arm/cstart64.S
>>> +++ b/arm/cstart64.S
>>> @@ -109,6 +109,28 @@ start:
>>>  
>>>  .text
>>>  
>>> +/*
>>> + * psci_invoke_hvc / psci_invoke_smc
>>> + *
>>> + * Inputs:
>>> + *   w0 -- function_id
>>> + *   x1 -- arg0
>>> + *   x2 -- arg1
>>> + *   x3 -- arg2
>>> + *
>>> + * Outputs:
>>> + *   x0 -- return code
>>> + */
>>> +.globl psci_invoke_hvc
>>> +psci_invoke_hvc:
>>> +	hvc	#0
>>> +	ret
>>> +
>>> +.globl psci_invoke_smc
>>> +psci_invoke_smc:
>>> +	smc	#0
>>> +	ret
>>> +
>>>  get_mmu_off:
>>>  	adrp	x0, auxinfo
>>>  	ldr	x0, [x0, :lo12:auxinfo + 8]
>>> diff --git a/arm/selftest.c b/arm/selftest.c
>>> index 4495b161cdd5..9f459ed3d571 100644
>>> --- a/arm/selftest.c
>>> +++ b/arm/selftest.c
>>> @@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
>>>  	exit(report_summary());
>>>  }
>>>  
>>> -static bool psci_check(void)
>>> +static void psci_print(void)
>>>  {
>>> -	const struct fdt_property *method;
>>> -	int node, len, ver;
>>> -
>>> -	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
>>> -	if (node < 0) {
>>> -		printf("PSCI v0.2 compatibility required\n");
>>> -		return false;
>>> -	}
>>> -
>>> -	method = fdt_get_property(dt_fdt(), node, "method", &len);
>>> -	if (method == NULL) {
>>> -		printf("bad psci device tree node\n");
>>> -		return false;
>>> -	}
>>> -
>>> -	if (len < 4 || strcmp(method->data, "hvc") != 0) {
>>> -		printf("psci method must be hvc\n");
>>> -		return false;
>>> -	}
>>> -
>>> -	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
>>> -	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
>>> -				       PSCI_VERSION_MINOR(ver));
>>> -
>>> -	return true;
>>> +	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
>>> +	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
>>> +					  PSCI_VERSION_MINOR(ver));
>>> +	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
>>> +				       "hvc" : "smc");
>>>  }
>>>  
>>>  static void cpu_report(void *data __unused)
>>> @@ -465,7 +445,7 @@ int main(int argc, char **argv)
>>>  
>>>  	} else if (strcmp(argv[1], "smp") == 0) {
>>>  
>>> -		report(psci_check(), "PSCI version");
>>> +		psci_print();
>>>  		on_cpus(cpu_report, NULL);
>>>  		while (!cpumask_full(&ready))
>>>  			cpu_relax();
>>> diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
>>> index 7b956bf5987d..f87fca0422cc 100644
>>> --- a/lib/arm/asm/psci.h
>>> +++ b/lib/arm/asm/psci.h
>>> @@ -3,8 +3,14 @@
>>>  #include <libcflat.h>
>>>  #include <linux/psci.h>
>>>  
>>> -extern int psci_invoke(unsigned long function_id, unsigned long arg0,
>>> -		       unsigned long arg1, unsigned long arg2);
>>> +typedef long (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
>>> +			       unsigned long arg1, unsigned long arg2);
>>> +extern psci_invoke_fn psci_invoke;
>>> +extern long psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
>>> +			    unsigned long arg1, unsigned long arg2);
>>> +extern long psci_invoke_smc(unsigned int function_id, unsigned long arg0,
>>> +			    unsigned long arg1, unsigned long arg2);
>>> +extern void psci_set_conduit(void);
>>>  extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
>>>  extern void psci_system_reset(void);
>>>  extern int cpu_psci_cpu_boot(unsigned int cpu);
>>> diff --git a/lib/arm/psci.c b/lib/arm/psci.c
>>> index 936c83948b6a..3053b3041c28 100644
>>> --- a/lib/arm/psci.c
>>> +++ b/lib/arm/psci.c
>>> @@ -6,22 +6,21 @@
>>>   *
>>>   * This work is licensed under the terms of the GNU LGPL, version 2.
>>>   */
>>> +#include <devicetree.h>
>>>  #include <asm/psci.h>
>>>  #include <asm/setup.h>
>>>  #include <asm/page.h>
>>>  #include <asm/smp.h>
>>>  
>>> -__attribute__((noinline))
>>> -int psci_invoke(unsigned long function_id, unsigned long arg0,
>>> -		unsigned long arg1, unsigned long arg2)
>>> +static long psci_invoke_none(unsigned int function_id, unsigned long arg0,
>>> +			     unsigned long arg1, unsigned long arg2)
>>>  {
>>> -	asm volatile(
>>> -		"hvc #0"
>>> -	: "+r" (function_id)
>>> -	: "r" (arg0), "r" (arg1), "r" (arg2));
>>> -	return function_id;
>>> +	printf("No PSCI method configured! Can't invoke...\n");
>>> +	return PSCI_RET_NOT_PRESENT;
>>>  }
>>>  
>>> +psci_invoke_fn psci_invoke = psci_invoke_none;
>>> +
>>>  int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
>>>  {
>>>  #ifdef __arm__
>>> @@ -56,3 +55,23 @@ void psci_system_off(void)
>>>  	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
>>>  	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
>>>  }
>>> +
>>> +void psci_set_conduit(void)
>>> +{
>>> +	const void *fdt = dt_fdt();
>>> +	const struct fdt_property *method;
>>> +	int node, len;
>>> +
>>> +	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
>>> +	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
>>> +
>>> +	method = fdt_get_property(fdt, node, "method", &len);
>>> +	assert(method != NULL && len == 4);
>>> +
>>> +	if (strcmp(method->data, "hvc") == 0)
>>> +		psci_invoke = psci_invoke_hvc;
>>> +	else if (strcmp(method->data, "smc") == 0)
>>> +		psci_invoke = psci_invoke_smc;
>>> +	else
>>> +		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
>>> +}
>>> diff --git a/lib/arm/setup.c b/lib/arm/setup.c
>>> index 86f054304baf..bcdf0d78c2e2 100644
>>> --- a/lib/arm/setup.c
>>> +++ b/lib/arm/setup.c
>>> @@ -25,6 +25,7 @@
>>>  #include <asm/processor.h>
>>>  #include <asm/smp.h>
>>>  #include <asm/timer.h>
>>> +#include <asm/psci.h>
>>>  
>>>  #include "io.h"
>>>  
>>> @@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
>>>  	mem_regions_add_assumed();
>>>  	mem_init(PAGE_ALIGN((unsigned long)freemem));
>>>  
>>> +	psci_set_conduit();
>>>  	cpu_init();
>>>  
>>>  	/* cpu_init must be called before thread_info_init */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc
  2021-05-13  9:08       ` Alexandru Elisei
@ 2021-05-13 10:06         ` Andrew Jones
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-05-13 10:06 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

On Thu, May 13, 2021 at 10:08:09AM +0100, Alexandru Elisei wrote:
> Hi Drew,
> 
> On 5/13/21 8:08 AM, Andrew Jones wrote:
> > On Wed, May 12, 2021 at 05:14:24PM +0100, Alexandru Elisei wrote:
> >> Hi Drew,
> >>
> >> On 4/29/21 5:41 PM, Andrew Jones wrote:
> >>> The method can be smc in addition to hvc, and it will be when running
> >>> on bare metal. Additionally, we move the invocations to assembly so
> >>> we don't have to rely on compiler assumptions. We also fix the
> >>> prototype of psci_invoke. It should return long, not int, and
> >>> function_id should be an unsigned int, not an unsigned long.
> >> Sorry to harp on this again, but to be honest, it's still not clear to me why the
> >> psci_invoke_{hvc,smc} functions return a long int.
> >>
> >> If we only expect the PSCI functions to return error codes, then the PSCI spec
> >> says these are 32-bit signed integers. If we want to support PSCI functions
> >> returning other values, like PSCI_STAT_{RESIDENCY,COUNT}, then the invoke
> >> functions should return an unsigned value.
> >>
> >> The only case we're supporting is the error return for the SMC calling convention
> >> (which says that error codes are 32/64bit signed integers).
> > psci_invoke_{hvc,smc} should implement the SMC calling convention, since
> > they're just wrapping the smc/hvc call. PSCI calls that build on that,
> > e.g. psci_cpu_on, can define their own return type and then translate
> > the signed long returned by SMC into, e.g. 32-bit signed integers. Indeed
> > that's what psci_cpu_on does.
> >
> > I would write something like that in the commit message or rename
> > psci_invoke to smc_invoke.
> 
> I agree that psci_invoke_* use the SMC calling convention, but we're not
> implementing *all* the features of the SMC calling convention, because SMCCC can
> return more than one result in registers r0-r3. In my opinion, I think the easiest
> solution and the most consistent with both specifications would be to keep the
> current names and change the return value either to an int, and put a comment
> saying that we only support PSCI functions that return an error, either to a long
> unsigned int, meaning that we support *all* PSCI functions as defined in ARM DEN
> 0022D.
> 
> What do you think? Does that make sense?
>

OK, I'll change it back to an int return for psci_invoke and remove the
sentences in the commit message about it. I won't write any new comments
though, because I'd actually rather reduce the common PSCI code, than
to try and ensure we support it completely. We should do as little PSCI
in lib/arm* as possible. The psci unit test (arm/psci.c) is the one that
should worry the most about the specifications, and it should actually
define its own interfaces in order to inspect all bits, including the ones
that aren't supposed to be used, when it does its tests.

Thanks,
drew


^ permalink raw reply	[flat|nested] 22+ messages in thread

* [PATCH kvm-unit-tests v4] arm/arm64: psci: Don't assume method is hvc
  2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc Andrew Jones
  2021-05-12 16:14   ` Alexandru Elisei
@ 2021-05-13 10:18   ` Andrew Jones
  2021-05-13 15:53     ` Alexandru Elisei
  1 sibling, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-05-13 10:18 UTC (permalink / raw)
  To: kvm; +Cc: alexandru.elisei, nikos.nikoleris, andre.przywara, eric.auger

The method can be smc in addition to hvc, and it will be when running
on bare metal. Additionally, we move the invocations to assembly so
we don't have to rely on compiler assumptions. We also fix the
prototype of psci_invoke. function_id should be an unsigned int, not
an unsigned long.

Signed-off-by: Andrew Jones <drjones@redhat.com>
---
 arm/cstart.S       | 22 ++++++++++++++++++++++
 arm/cstart64.S     | 22 ++++++++++++++++++++++
 arm/selftest.c     | 34 +++++++---------------------------
 lib/arm/asm/psci.h | 10 ++++++++--
 lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
 lib/arm/setup.c    |  2 ++
 6 files changed, 88 insertions(+), 37 deletions(-)

diff --git a/arm/cstart.S b/arm/cstart.S
index 446966de350d..2401d92cdadc 100644
--- a/arm/cstart.S
+++ b/arm/cstart.S
@@ -95,6 +95,28 @@ start:
 
 .text
 
+/*
+ * psci_invoke_hvc / psci_invoke_smc
+ *
+ * Inputs:
+ *   r0 -- function_id
+ *   r1 -- arg0
+ *   r2 -- arg1
+ *   r3 -- arg2
+ *
+ * Outputs:
+ *   r0 -- return code
+ */
+.globl psci_invoke_hvc
+psci_invoke_hvc:
+	hvc	#0
+	mov	pc, lr
+
+.globl psci_invoke_smc
+psci_invoke_smc:
+	smc	#0
+	mov	pc, lr
+
 enable_vfp:
 	/* Enable full access to CP10 and CP11: */
 	mov	r0, #(3 << 22 | 3 << 20)
diff --git a/arm/cstart64.S b/arm/cstart64.S
index 42ba3a3ca249..e4ab7d06251e 100644
--- a/arm/cstart64.S
+++ b/arm/cstart64.S
@@ -109,6 +109,28 @@ start:
 
 .text
 
+/*
+ * psci_invoke_hvc / psci_invoke_smc
+ *
+ * Inputs:
+ *   w0 -- function_id
+ *   x1 -- arg0
+ *   x2 -- arg1
+ *   x3 -- arg2
+ *
+ * Outputs:
+ *   x0 -- return code
+ */
+.globl psci_invoke_hvc
+psci_invoke_hvc:
+	hvc	#0
+	ret
+
+.globl psci_invoke_smc
+psci_invoke_smc:
+	smc	#0
+	ret
+
 get_mmu_off:
 	adrp	x0, auxinfo
 	ldr	x0, [x0, :lo12:auxinfo + 8]
diff --git a/arm/selftest.c b/arm/selftest.c
index 4495b161cdd5..9f459ed3d571 100644
--- a/arm/selftest.c
+++ b/arm/selftest.c
@@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
 	exit(report_summary());
 }
 
-static bool psci_check(void)
+static void psci_print(void)
 {
-	const struct fdt_property *method;
-	int node, len, ver;
-
-	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
-	if (node < 0) {
-		printf("PSCI v0.2 compatibility required\n");
-		return false;
-	}
-
-	method = fdt_get_property(dt_fdt(), node, "method", &len);
-	if (method == NULL) {
-		printf("bad psci device tree node\n");
-		return false;
-	}
-
-	if (len < 4 || strcmp(method->data, "hvc") != 0) {
-		printf("psci method must be hvc\n");
-		return false;
-	}
-
-	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
-	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
-				       PSCI_VERSION_MINOR(ver));
-
-	return true;
+	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
+	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
+					  PSCI_VERSION_MINOR(ver));
+	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
+				       "hvc" : "smc");
 }
 
 static void cpu_report(void *data __unused)
@@ -465,7 +445,7 @@ int main(int argc, char **argv)
 
 	} else if (strcmp(argv[1], "smp") == 0) {
 
-		report(psci_check(), "PSCI version");
+		psci_print();
 		on_cpus(cpu_report, NULL);
 		while (!cpumask_full(&ready))
 			cpu_relax();
diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
index 7b956bf5987d..cf03449ba665 100644
--- a/lib/arm/asm/psci.h
+++ b/lib/arm/asm/psci.h
@@ -3,8 +3,14 @@
 #include <libcflat.h>
 #include <linux/psci.h>
 
-extern int psci_invoke(unsigned long function_id, unsigned long arg0,
-		       unsigned long arg1, unsigned long arg2);
+typedef int (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
+			      unsigned long arg1, unsigned long arg2);
+extern psci_invoke_fn psci_invoke;
+extern int psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
+			   unsigned long arg1, unsigned long arg2);
+extern int psci_invoke_smc(unsigned int function_id, unsigned long arg0,
+			   unsigned long arg1, unsigned long arg2);
+extern void psci_set_conduit(void);
 extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
 extern void psci_system_reset(void);
 extern int cpu_psci_cpu_boot(unsigned int cpu);
diff --git a/lib/arm/psci.c b/lib/arm/psci.c
index 936c83948b6a..9c031a122e9b 100644
--- a/lib/arm/psci.c
+++ b/lib/arm/psci.c
@@ -6,22 +6,21 @@
  *
  * This work is licensed under the terms of the GNU LGPL, version 2.
  */
+#include <devicetree.h>
 #include <asm/psci.h>
 #include <asm/setup.h>
 #include <asm/page.h>
 #include <asm/smp.h>
 
-__attribute__((noinline))
-int psci_invoke(unsigned long function_id, unsigned long arg0,
-		unsigned long arg1, unsigned long arg2)
+static int psci_invoke_none(unsigned int function_id, unsigned long arg0,
+			    unsigned long arg1, unsigned long arg2)
 {
-	asm volatile(
-		"hvc #0"
-	: "+r" (function_id)
-	: "r" (arg0), "r" (arg1), "r" (arg2));
-	return function_id;
+	printf("No PSCI method configured! Can't invoke...\n");
+	return PSCI_RET_NOT_PRESENT;
 }
 
+psci_invoke_fn psci_invoke = psci_invoke_none;
+
 int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
 {
 #ifdef __arm__
@@ -56,3 +55,23 @@ void psci_system_off(void)
 	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
 	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
 }
+
+void psci_set_conduit(void)
+{
+	const void *fdt = dt_fdt();
+	const struct fdt_property *method;
+	int node, len;
+
+	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
+	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
+
+	method = fdt_get_property(fdt, node, "method", &len);
+	assert(method != NULL && len == 4);
+
+	if (strcmp(method->data, "hvc") == 0)
+		psci_invoke = psci_invoke_hvc;
+	else if (strcmp(method->data, "smc") == 0)
+		psci_invoke = psci_invoke_smc;
+	else
+		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
+}
diff --git a/lib/arm/setup.c b/lib/arm/setup.c
index 86f054304baf..bcdf0d78c2e2 100644
--- a/lib/arm/setup.c
+++ b/lib/arm/setup.c
@@ -25,6 +25,7 @@
 #include <asm/processor.h>
 #include <asm/smp.h>
 #include <asm/timer.h>
+#include <asm/psci.h>
 
 #include "io.h"
 
@@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
 	mem_regions_add_assumed();
 	mem_init(PAGE_ALIGN((unsigned long)freemem));
 
+	psci_set_conduit();
 	cpu_init();
 
 	/* cpu_init must be called before thread_info_init */
-- 
2.30.2


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-05-10 15:45   ` Alexandru Elisei
@ 2021-05-13 15:48     ` Alexandru Elisei
  2021-05-13 17:18       ` Andrew Jones
  0 siblings, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-13 15:48 UTC (permalink / raw)
  To: Andrew Jones, kvm; +Cc: nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 5/10/21 4:45 PM, Alexandru Elisei wrote:
> Hi Drew,
>
> On 4/29/21 5:41 PM, Andrew Jones wrote:
>> By providing a proper ioremap function, we can just rely on devices
>> calling it for each region they need (as they already do) instead of
>> mapping a big assumed I/O range. We don't require the MMU to be
>> enabled at the time of the ioremap. In that case, we add the mapping
>> to the identity map anyway. This allows us to call setup_vm after
>> io_init. Why don't we just call setup_vm before io_init, I hear you
>> ask? Well, that's because tests like sieve want to start with the MMU
>> off, later call setup_vm, and all the while have working I/O. Some
>> unit tests are just really demanding...
>>
>> While at it, ensure we map the I/O regions with XN (execute never),
>> as suggested by Alexandru Elisei.
> I got to thinking why this wasn't an issue before. Under KVM, device memory is not
> usually mapped at stage 2, so any speculated reads wouldn't have reached memory at
> all. The only way I imagine that happening if the user was running kvm-unit-tests
> with a passthrough PCI device, which I don't think happens too often.
>
> But we cannot rely on devices not being mapped at stage 2 when running under EFI
> (we're mapping them ourselves with ioremap), so I believe this is a good fix.
>
> Had another look at the patch, looks good to me:

While testing the series I discovered that this patch introduces a bug when
running under kvmtool.

Here's the splat:

$ ./configure --vmm=kvmtool --earlycon=uart,mmio,0x1000000 --page-size=4K && make
clean && make -j6 && ./vm run -c2 -m128 -f arm/micro-bench.flat
[..]
  # lkvm run --firmware arm/micro-bench.flat -m 128 -c 2 --name guest-6986
  Info: Placing fdt at 0x80200000 - 0x80210000
chr_testdev_init: chr-testdev: can't find a virtio-console
Timer Frequency 24000000 Hz (Output in microseconds)

name                                    total ns                         avg
ns            
--------------------------------------------------------------------------------------------
hvc                                 168674516.0                        
2573.0             
Load address: 80000000
PC: 80000128 PC offset: 128
Unhandled exception ec=0x25 (DABT_EL1)
Vector: 4 (el1h_sync)
ESR_EL1:         96000006, ec=0x25 (DABT_EL1)
FAR_EL1: 000000000a000008 (valid)
Exception frame registers:
pc : [<0000000080000128>] lr : [<000000008000cac8>] pstate: 800003c5
sp : 000000008003ff90
x29: 0000000000000000 x28: 0000000000000000
x27: 00000011ada4d0c2 x26: 0000000000000000
x25: 0000000080015978 x24: 0000000080015a90
x23: 0000048c27394fff x22: 20c49ba5e353f7cf
x21: 28f5c28f5c28f5c3 x20: 0000000080016af0
x19: 000000e8d4a51000 x18: 0000000080040000
x17: 0000000000000000 x16: 0000000080008210
x15: 000000008003fe5c x14: 0000000000000260
x13: 00000000000002a4 x12: 0000000080040000
x11: 0000000000000001 x10: 0000000000000060
x9 : 0000000000000000 x8 : 0000000000000039
x7 : 0000000000000040 x6 : 0000000080013983
x5 : 000000008003f74e x4 : 000000008003f69c
x3 : 0000000000000000 x2 : 0000000000000000
x1 : 0000000000000000 x0 : 000000000a000008

The issue is caused by the mmio_read_user_exec() function from arm/micro-bench.c.
kvmtool doesn't have a chr-testdev device, and because probing fails, the address
0x0a000008 isn't ioremap'ed. The 0-1G memory region is not automatically mapped
anymore, and the access triggers a data abort at stage 1.

I fixed the splat with this change:

diff --git a/arm/micro-bench.c b/arm/micro-bench.c
index 95c418c10eb4..ad9e44d71d8d 100644
--- a/arm/micro-bench.c
+++ b/arm/micro-bench.c
@@ -281,7 +281,7 @@ static void mmio_read_user_exec(void)
         * updated in the future if any relevant changes in QEMU
         * test-dev are made.
         */
-       void *userspace_emulated_addr = (void*)0x0a000008;
+       void *userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
 
        readl(userspace_emulated_addr);
 }

kvmtool ignores the MMIO exit reason if no device owns the IPA, that's why it also
works on kvmtool.

The micro-bench test with the diff passes under qemu and kvmtool, tested with 4K,
16K and 64K pages on an odroid-c4.

Thanks,

Alex

>
> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
>
> Thanks,
>
> Alex
>
>> Signed-off-by: Andrew Jones <drjones@redhat.com>
>> ---
>>  lib/arm/asm/io.h            |  6 ++++++
>>  lib/arm/asm/mmu.h           |  3 +++
>>  lib/arm/asm/page.h          |  2 ++
>>  lib/arm/asm/pgtable-hwdef.h |  1 +
>>  lib/arm/mmu.c               | 37 +++++++++++++++++++++++++++----------
>>  lib/arm64/asm/io.h          |  6 ++++++
>>  lib/arm64/asm/mmu.h         |  1 +
>>  lib/arm64/asm/page.h        |  2 ++
>>  8 files changed, 48 insertions(+), 10 deletions(-)
>>
>> diff --git a/lib/arm/asm/io.h b/lib/arm/asm/io.h
>> index ba3b0b2412ad..e4caa6ff5d1e 100644
>> --- a/lib/arm/asm/io.h
>> +++ b/lib/arm/asm/io.h
>> @@ -77,6 +77,12 @@ static inline void __raw_writel(u32 val, volatile void __iomem *addr)
>>  		     : "r" (val));
>>  }
>>  
>> +#define ioremap ioremap
>> +static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
>> +{
>> +	return __ioremap(phys_addr, size);
>> +}
>> +
>>  #define virt_to_phys virt_to_phys
>>  static inline phys_addr_t virt_to_phys(const volatile void *x)
>>  {
>> diff --git a/lib/arm/asm/mmu.h b/lib/arm/asm/mmu.h
>> index 122874b8aebe..94e70f0a84bf 100644
>> --- a/lib/arm/asm/mmu.h
>> +++ b/lib/arm/asm/mmu.h
>> @@ -8,10 +8,13 @@
>>  #include <asm/barrier.h>
>>  
>>  #define PTE_USER		L_PTE_USER
>> +#define PTE_UXN			L_PTE_XN
>> +#define PTE_PXN			L_PTE_PXN
>>  #define PTE_RDONLY		PTE_AP2
>>  #define PTE_SHARED		L_PTE_SHARED
>>  #define PTE_AF			PTE_EXT_AF
>>  #define PTE_WBWA		L_PTE_MT_WRITEALLOC
>> +#define PTE_UNCACHED		L_PTE_MT_UNCACHED
>>  
>>  /* See B3.18.7 TLB maintenance operations */
>>  
>> diff --git a/lib/arm/asm/page.h b/lib/arm/asm/page.h
>> index 1fb5cd26ac66..8eb4a883808e 100644
>> --- a/lib/arm/asm/page.h
>> +++ b/lib/arm/asm/page.h
>> @@ -47,5 +47,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
>>  extern phys_addr_t __virt_to_phys(unsigned long addr);
>>  extern unsigned long __phys_to_virt(phys_addr_t addr);
>>  
>> +extern void *__ioremap(phys_addr_t phys_addr, size_t size);
>> +
>>  #endif /* !__ASSEMBLY__ */
>>  #endif /* _ASMARM_PAGE_H_ */
>> diff --git a/lib/arm/asm/pgtable-hwdef.h b/lib/arm/asm/pgtable-hwdef.h
>> index fe1d8540ea3f..90fd306c7cc0 100644
>> --- a/lib/arm/asm/pgtable-hwdef.h
>> +++ b/lib/arm/asm/pgtable-hwdef.h
>> @@ -34,6 +34,7 @@
>>  #define L_PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
>>  #define L_PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
>>  #define L_PTE_YOUNG		(_AT(pteval_t, 1) << 10)	/* AF */
>> +#define L_PTE_PXN		(_AT(pteval_t, 1) << 53)	/* PXN */
>>  #define L_PTE_XN		(_AT(pteval_t, 1) << 54)	/* XN */
>>  
>>  /*
>> diff --git a/lib/arm/mmu.c b/lib/arm/mmu.c
>> index 15eef007f256..791b1f88f946 100644
>> --- a/lib/arm/mmu.c
>> +++ b/lib/arm/mmu.c
>> @@ -11,6 +11,7 @@
>>  #include <asm/mmu.h>
>>  #include <asm/setup.h>
>>  #include <asm/page.h>
>> +#include <asm/io.h>
>>  
>>  #include "alloc_page.h"
>>  #include "vmalloc.h"
>> @@ -157,9 +158,8 @@ void mmu_set_range_sect(pgd_t *pgtable, uintptr_t virt_offset,
>>  void *setup_mmu(phys_addr_t phys_end)
>>  {
>>  	uintptr_t code_end = (uintptr_t)&etext;
>> -	struct mem_region *r;
>>  
>> -	/* 0G-1G = I/O, 1G-3G = identity, 3G-4G = vmalloc */
>> +	/* 3G-4G region is reserved for vmalloc, cap phys_end at 3G */
>>  	if (phys_end > (3ul << 30))
>>  		phys_end = 3ul << 30;
>>  
>> @@ -170,14 +170,8 @@ void *setup_mmu(phys_addr_t phys_end)
>>  			"Unsupported translation granule %ld\n", PAGE_SIZE);
>>  #endif
>>  
>> -	mmu_idmap = alloc_page();
>> -
>> -	for (r = mem_regions; r->end; ++r) {
>> -		if (!(r->flags & MR_F_IO))
>> -			continue;
>> -		mmu_set_range_sect(mmu_idmap, r->start, r->start, r->end,
>> -				   __pgprot(PMD_SECT_UNCACHED | PMD_SECT_USER));
>> -	}
>> +	if (!mmu_idmap)
>> +		mmu_idmap = alloc_page();
>>  
>>  	/* armv8 requires code shared between EL1 and EL0 to be read-only */
>>  	mmu_set_range_ptes(mmu_idmap, PHYS_OFFSET,
>> @@ -192,6 +186,29 @@ void *setup_mmu(phys_addr_t phys_end)
>>  	return mmu_idmap;
>>  }
>>  
>> +void __iomem *__ioremap(phys_addr_t phys_addr, size_t size)
>> +{
>> +	phys_addr_t paddr_aligned = phys_addr & PAGE_MASK;
>> +	phys_addr_t paddr_end = PAGE_ALIGN(phys_addr + size);
>> +	pgprot_t prot = __pgprot(PTE_UNCACHED | PTE_USER | PTE_UXN | PTE_PXN);
>> +	pgd_t *pgtable;
>> +
>> +	assert(sizeof(long) == 8 || !(phys_addr >> 32));
>> +
>> +	if (mmu_enabled()) {
>> +		pgtable = current_thread_info()->pgtable;
>> +	} else {
>> +		if (!mmu_idmap)
>> +			mmu_idmap = alloc_page();
>> +		pgtable = mmu_idmap;
>> +	}
>> +
>> +	mmu_set_range_ptes(pgtable, paddr_aligned, paddr_aligned,
>> +			   paddr_end, prot);
>> +
>> +	return (void __iomem *)(unsigned long)phys_addr;
>> +}
>> +
>>  phys_addr_t __virt_to_phys(unsigned long addr)
>>  {
>>  	if (mmu_enabled()) {
>> diff --git a/lib/arm64/asm/io.h b/lib/arm64/asm/io.h
>> index e0a03b250d5b..be19f471c0fa 100644
>> --- a/lib/arm64/asm/io.h
>> +++ b/lib/arm64/asm/io.h
>> @@ -71,6 +71,12 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
>>  	return val;
>>  }
>>  
>> +#define ioremap ioremap
>> +static inline void __iomem *ioremap(phys_addr_t phys_addr, size_t size)
>> +{
>> +	return __ioremap(phys_addr, size);
>> +}
>> +
>>  #define virt_to_phys virt_to_phys
>>  static inline phys_addr_t virt_to_phys(const volatile void *x)
>>  {
>> diff --git a/lib/arm64/asm/mmu.h b/lib/arm64/asm/mmu.h
>> index 72d75eafc882..72371b2d9fe3 100644
>> --- a/lib/arm64/asm/mmu.h
>> +++ b/lib/arm64/asm/mmu.h
>> @@ -8,6 +8,7 @@
>>  #include <asm/barrier.h>
>>  
>>  #define PMD_SECT_UNCACHED	PMD_ATTRINDX(MT_DEVICE_nGnRE)
>> +#define PTE_UNCACHED		PTE_ATTRINDX(MT_DEVICE_nGnRE)
>>  #define PTE_WBWA		PTE_ATTRINDX(MT_NORMAL)
>>  
>>  static inline void flush_tlb_all(void)
>> diff --git a/lib/arm64/asm/page.h b/lib/arm64/asm/page.h
>> index ae4484b22114..d0fac6ea563d 100644
>> --- a/lib/arm64/asm/page.h
>> +++ b/lib/arm64/asm/page.h
>> @@ -72,5 +72,7 @@ typedef struct { pteval_t pgprot; } pgprot_t;
>>  extern phys_addr_t __virt_to_phys(unsigned long addr);
>>  extern unsigned long __phys_to_virt(phys_addr_t addr);
>>  
>> +extern void *__ioremap(phys_addr_t phys_addr, size_t size);
>> +
>>  #endif /* !__ASSEMBLY__ */
>>  #endif /* _ASMARM64_PAGE_H_ */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v4] arm/arm64: psci: Don't assume method is hvc
  2021-05-13 10:18   ` [PATCH kvm-unit-tests v4] " Andrew Jones
@ 2021-05-13 15:53     ` Alexandru Elisei
  0 siblings, 0 replies; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-13 15:53 UTC (permalink / raw)
  To: Andrew Jones, kvm; +Cc: nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 5/13/21 11:18 AM, Andrew Jones wrote:
> The method can be smc in addition to hvc, and it will be when running
> on bare metal. Additionally, we move the invocations to assembly so
> we don't have to rely on compiler assumptions. We also fix the
> prototype of psci_invoke. function_id should be an unsigned int, not
> an unsigned long.

Looks good to me:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

Thanks,

Alex

>
> Signed-off-by: Andrew Jones <drjones@redhat.com>
> ---
>  arm/cstart.S       | 22 ++++++++++++++++++++++
>  arm/cstart64.S     | 22 ++++++++++++++++++++++
>  arm/selftest.c     | 34 +++++++---------------------------
>  lib/arm/asm/psci.h | 10 ++++++++--
>  lib/arm/psci.c     | 35 +++++++++++++++++++++++++++--------
>  lib/arm/setup.c    |  2 ++
>  6 files changed, 88 insertions(+), 37 deletions(-)
>
> diff --git a/arm/cstart.S b/arm/cstart.S
> index 446966de350d..2401d92cdadc 100644
> --- a/arm/cstart.S
> +++ b/arm/cstart.S
> @@ -95,6 +95,28 @@ start:
>  
>  .text
>  
> +/*
> + * psci_invoke_hvc / psci_invoke_smc
> + *
> + * Inputs:
> + *   r0 -- function_id
> + *   r1 -- arg0
> + *   r2 -- arg1
> + *   r3 -- arg2
> + *
> + * Outputs:
> + *   r0 -- return code
> + */
> +.globl psci_invoke_hvc
> +psci_invoke_hvc:
> +	hvc	#0
> +	mov	pc, lr
> +
> +.globl psci_invoke_smc
> +psci_invoke_smc:
> +	smc	#0
> +	mov	pc, lr
> +
>  enable_vfp:
>  	/* Enable full access to CP10 and CP11: */
>  	mov	r0, #(3 << 22 | 3 << 20)
> diff --git a/arm/cstart64.S b/arm/cstart64.S
> index 42ba3a3ca249..e4ab7d06251e 100644
> --- a/arm/cstart64.S
> +++ b/arm/cstart64.S
> @@ -109,6 +109,28 @@ start:
>  
>  .text
>  
> +/*
> + * psci_invoke_hvc / psci_invoke_smc
> + *
> + * Inputs:
> + *   w0 -- function_id
> + *   x1 -- arg0
> + *   x2 -- arg1
> + *   x3 -- arg2
> + *
> + * Outputs:
> + *   x0 -- return code
> + */
> +.globl psci_invoke_hvc
> +psci_invoke_hvc:
> +	hvc	#0
> +	ret
> +
> +.globl psci_invoke_smc
> +psci_invoke_smc:
> +	smc	#0
> +	ret
> +
>  get_mmu_off:
>  	adrp	x0, auxinfo
>  	ldr	x0, [x0, :lo12:auxinfo + 8]
> diff --git a/arm/selftest.c b/arm/selftest.c
> index 4495b161cdd5..9f459ed3d571 100644
> --- a/arm/selftest.c
> +++ b/arm/selftest.c
> @@ -400,33 +400,13 @@ static void check_vectors(void *arg __unused)
>  	exit(report_summary());
>  }
>  
> -static bool psci_check(void)
> +static void psci_print(void)
>  {
> -	const struct fdt_property *method;
> -	int node, len, ver;
> -
> -	node = fdt_node_offset_by_compatible(dt_fdt(), -1, "arm,psci-0.2");
> -	if (node < 0) {
> -		printf("PSCI v0.2 compatibility required\n");
> -		return false;
> -	}
> -
> -	method = fdt_get_property(dt_fdt(), node, "method", &len);
> -	if (method == NULL) {
> -		printf("bad psci device tree node\n");
> -		return false;
> -	}
> -
> -	if (len < 4 || strcmp(method->data, "hvc") != 0) {
> -		printf("psci method must be hvc\n");
> -		return false;
> -	}
> -
> -	ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> -	printf("PSCI version %d.%d\n", PSCI_VERSION_MAJOR(ver),
> -				       PSCI_VERSION_MINOR(ver));
> -
> -	return true;
> +	int ver = psci_invoke(PSCI_0_2_FN_PSCI_VERSION, 0, 0, 0);
> +	report_info("PSCI version: %d.%d", PSCI_VERSION_MAJOR(ver),
> +					  PSCI_VERSION_MINOR(ver));
> +	report_info("PSCI method: %s", psci_invoke == psci_invoke_hvc ?
> +				       "hvc" : "smc");
>  }
>  
>  static void cpu_report(void *data __unused)
> @@ -465,7 +445,7 @@ int main(int argc, char **argv)
>  
>  	} else if (strcmp(argv[1], "smp") == 0) {
>  
> -		report(psci_check(), "PSCI version");
> +		psci_print();
>  		on_cpus(cpu_report, NULL);
>  		while (!cpumask_full(&ready))
>  			cpu_relax();
> diff --git a/lib/arm/asm/psci.h b/lib/arm/asm/psci.h
> index 7b956bf5987d..cf03449ba665 100644
> --- a/lib/arm/asm/psci.h
> +++ b/lib/arm/asm/psci.h
> @@ -3,8 +3,14 @@
>  #include <libcflat.h>
>  #include <linux/psci.h>
>  
> -extern int psci_invoke(unsigned long function_id, unsigned long arg0,
> -		       unsigned long arg1, unsigned long arg2);
> +typedef int (*psci_invoke_fn)(unsigned int function_id, unsigned long arg0,
> +			      unsigned long arg1, unsigned long arg2);
> +extern psci_invoke_fn psci_invoke;
> +extern int psci_invoke_hvc(unsigned int function_id, unsigned long arg0,
> +			   unsigned long arg1, unsigned long arg2);
> +extern int psci_invoke_smc(unsigned int function_id, unsigned long arg0,
> +			   unsigned long arg1, unsigned long arg2);
> +extern void psci_set_conduit(void);
>  extern int psci_cpu_on(unsigned long cpuid, unsigned long entry_point);
>  extern void psci_system_reset(void);
>  extern int cpu_psci_cpu_boot(unsigned int cpu);
> diff --git a/lib/arm/psci.c b/lib/arm/psci.c
> index 936c83948b6a..9c031a122e9b 100644
> --- a/lib/arm/psci.c
> +++ b/lib/arm/psci.c
> @@ -6,22 +6,21 @@
>   *
>   * This work is licensed under the terms of the GNU LGPL, version 2.
>   */
> +#include <devicetree.h>
>  #include <asm/psci.h>
>  #include <asm/setup.h>
>  #include <asm/page.h>
>  #include <asm/smp.h>
>  
> -__attribute__((noinline))
> -int psci_invoke(unsigned long function_id, unsigned long arg0,
> -		unsigned long arg1, unsigned long arg2)
> +static int psci_invoke_none(unsigned int function_id, unsigned long arg0,
> +			    unsigned long arg1, unsigned long arg2)
>  {
> -	asm volatile(
> -		"hvc #0"
> -	: "+r" (function_id)
> -	: "r" (arg0), "r" (arg1), "r" (arg2));
> -	return function_id;
> +	printf("No PSCI method configured! Can't invoke...\n");
> +	return PSCI_RET_NOT_PRESENT;
>  }
>  
> +psci_invoke_fn psci_invoke = psci_invoke_none;
> +
>  int psci_cpu_on(unsigned long cpuid, unsigned long entry_point)
>  {
>  #ifdef __arm__
> @@ -56,3 +55,23 @@ void psci_system_off(void)
>  	int err = psci_invoke(PSCI_0_2_FN_SYSTEM_OFF, 0, 0, 0);
>  	printf("CPU%d unable to do system off (error = %d)\n", smp_processor_id(), err);
>  }
> +
> +void psci_set_conduit(void)
> +{
> +	const void *fdt = dt_fdt();
> +	const struct fdt_property *method;
> +	int node, len;
> +
> +	node = fdt_node_offset_by_compatible(fdt, -1, "arm,psci-0.2");
> +	assert_msg(node >= 0, "PSCI v0.2 compatibility required");
> +
> +	method = fdt_get_property(fdt, node, "method", &len);
> +	assert(method != NULL && len == 4);
> +
> +	if (strcmp(method->data, "hvc") == 0)
> +		psci_invoke = psci_invoke_hvc;
> +	else if (strcmp(method->data, "smc") == 0)
> +		psci_invoke = psci_invoke_smc;
> +	else
> +		assert_msg(false, "Unknown PSCI conduit: %s", method->data);
> +}
> diff --git a/lib/arm/setup.c b/lib/arm/setup.c
> index 86f054304baf..bcdf0d78c2e2 100644
> --- a/lib/arm/setup.c
> +++ b/lib/arm/setup.c
> @@ -25,6 +25,7 @@
>  #include <asm/processor.h>
>  #include <asm/smp.h>
>  #include <asm/timer.h>
> +#include <asm/psci.h>
>  
>  #include "io.h"
>  
> @@ -266,6 +267,7 @@ void setup(const void *fdt, phys_addr_t freemem_start)
>  	mem_regions_add_assumed();
>  	mem_init(PAGE_ALIGN((unsigned long)freemem));
>  
> +	psci_set_conduit();
>  	cpu_init();
>  
>  	/* cpu_init must be called before thread_info_init */

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-05-13 15:48     ` Alexandru Elisei
@ 2021-05-13 17:18       ` Andrew Jones
  2021-05-13 17:43         ` Andrew Jones
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-05-13 17:18 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

On Thu, May 13, 2021 at 04:48:16PM +0100, Alexandru Elisei wrote:
> Hi Drew,
> 
> On 5/10/21 4:45 PM, Alexandru Elisei wrote:
> > Hi Drew,
> >
> > On 4/29/21 5:41 PM, Andrew Jones wrote:
> >> By providing a proper ioremap function, we can just rely on devices
> >> calling it for each region they need (as they already do) instead of
> >> mapping a big assumed I/O range. We don't require the MMU to be
> >> enabled at the time of the ioremap. In that case, we add the mapping
> >> to the identity map anyway. This allows us to call setup_vm after
> >> io_init. Why don't we just call setup_vm before io_init, I hear you
> >> ask? Well, that's because tests like sieve want to start with the MMU
> >> off, later call setup_vm, and all the while have working I/O. Some
> >> unit tests are just really demanding...
> >>
> >> While at it, ensure we map the I/O regions with XN (execute never),
> >> as suggested by Alexandru Elisei.
> > I got to thinking why this wasn't an issue before. Under KVM, device memory is not
> > usually mapped at stage 2, so any speculated reads wouldn't have reached memory at
> > all. The only way I imagine that happening if the user was running kvm-unit-tests
> > with a passthrough PCI device, which I don't think happens too often.
> >
> > But we cannot rely on devices not being mapped at stage 2 when running under EFI
> > (we're mapping them ourselves with ioremap), so I believe this is a good fix.
> >
> > Had another look at the patch, looks good to me:
> 
> While testing the series I discovered that this patch introduces a bug when
> running under kvmtool.
> 
> Here's the splat:
> 
> $ ./configure --vmm=kvmtool --earlycon=uart,mmio,0x1000000 --page-size=4K && make
> clean && make -j6 && ./vm run -c2 -m128 -f arm/micro-bench.flat
> [..]
>   # lkvm run --firmware arm/micro-bench.flat -m 128 -c 2 --name guest-6986
>   Info: Placing fdt at 0x80200000 - 0x80210000
> chr_testdev_init: chr-testdev: can't find a virtio-console
> Timer Frequency 24000000 Hz (Output in microseconds)
> 
> name                                    total ns                         avg
> ns            
> --------------------------------------------------------------------------------------------
> hvc                                 168674516.0                        
> 2573.0             
> Load address: 80000000
> PC: 80000128 PC offset: 128
> Unhandled exception ec=0x25 (DABT_EL1)
> Vector: 4 (el1h_sync)
> ESR_EL1:         96000006, ec=0x25 (DABT_EL1)
> FAR_EL1: 000000000a000008 (valid)
> Exception frame registers:
> pc : [<0000000080000128>] lr : [<000000008000cac8>] pstate: 800003c5
> sp : 000000008003ff90
> x29: 0000000000000000 x28: 0000000000000000
> x27: 00000011ada4d0c2 x26: 0000000000000000
> x25: 0000000080015978 x24: 0000000080015a90
> x23: 0000048c27394fff x22: 20c49ba5e353f7cf
> x21: 28f5c28f5c28f5c3 x20: 0000000080016af0
> x19: 000000e8d4a51000 x18: 0000000080040000
> x17: 0000000000000000 x16: 0000000080008210
> x15: 000000008003fe5c x14: 0000000000000260
> x13: 00000000000002a4 x12: 0000000080040000
> x11: 0000000000000001 x10: 0000000000000060
> x9 : 0000000000000000 x8 : 0000000000000039
> x7 : 0000000000000040 x6 : 0000000080013983
> x5 : 000000008003f74e x4 : 000000008003f69c
> x3 : 0000000000000000 x2 : 0000000000000000
> x1 : 0000000000000000 x0 : 000000000a000008
> 
> The issue is caused by the mmio_read_user_exec() function from arm/micro-bench.c.
> kvmtool doesn't have a chr-testdev device, and because probing fails, the address
> 0x0a000008 isn't ioremap'ed. The 0-1G memory region is not automatically mapped
> anymore, and the access triggers a data abort at stage 1.
> 
> I fixed the splat with this change:
> 
> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> index 95c418c10eb4..ad9e44d71d8d 100644
> --- a/arm/micro-bench.c
> +++ b/arm/micro-bench.c
> @@ -281,7 +281,7 @@ static void mmio_read_user_exec(void)
>          * updated in the future if any relevant changes in QEMU
>          * test-dev are made.
>          */
> -       void *userspace_emulated_addr = (void*)0x0a000008;
> +       void *userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
>  
>         readl(userspace_emulated_addr);
>  }
> 
> kvmtool ignores the MMIO exit reason if no device owns the IPA, that's why it also
> works on kvmtool.
> 
> The micro-bench test with the diff passes under qemu and kvmtool, tested with 4K,
> 16K and 64K pages on an odroid-c4.
>

Thanks Alex,

I think a better fix is this untested one below, though. If you can test
it out and confirm it also resolves the issue, then I'll add this patch
to the series.

Thanks,
drew


diff --git a/arm/micro-bench.c b/arm/micro-bench.c
index 95c418c10eb4..deafd5695c33 100644
--- a/arm/micro-bench.c
+++ b/arm/micro-bench.c
@@ -273,16 +273,22 @@ static void hvc_exec(void)
        asm volatile("mov w0, #0x4b000000; hvc #0" ::: "w0");
 }
 
-static void mmio_read_user_exec(void)
+/*
+ * FIXME: Read device-id in virtio mmio here in order to
+ * force an exit to userspace. This address needs to be
+ * updated in the future if any relevant changes in QEMU
+ * test-dev are made.
+ */
+static void *userspace_emulated_addr;
+
+static bool mmio_read_user_prep(void)
 {
-       /*
-        * FIXME: Read device-id in virtio mmio here in order to
-        * force an exit to userspace. This address needs to be
-        * updated in the future if any relevant changes in QEMU
-        * test-dev are made.
-        */
-       void *userspace_emulated_addr = (void*)0x0a000008;
+       userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
+       return true;
+}
 
+static void mmio_read_user_exec(void)
+{
        readl(userspace_emulated_addr);
 }
 
@@ -309,14 +315,14 @@ struct exit_test {
 };
 
 static struct exit_test tests[] = {
-       {"hvc",                 NULL,           hvc_exec,               NULL,           65536,          true},
-       {"mmio_read_user",      NULL,           mmio_read_user_exec,    NULL,           65536,          true},
-       {"mmio_read_vgic",      NULL,           mmio_read_vgic_exec,    NULL,           65536,          true},
-       {"eoi",                 NULL,           eoi_exec,               NULL,           65536,          true},
-       {"ipi",                 ipi_prep,       ipi_exec,               NULL,           65536,          true},
-       {"ipi_hw",              ipi_hw_prep,    ipi_exec,               NULL,           65536,          true},
-       {"lpi",                 lpi_prep,       lpi_exec,               NULL,           65536,          true},
-       {"timer_10ms",          timer_prep,     timer_exec,             timer_post,     256,            true},
+       {"hvc",                 NULL,                   hvc_exec,               NULL,           65536,          true},
+       {"mmio_read_user",      mmio_read_user_prep,    mmio_read_user_exec,    NULL,           65536,          true},
+       {"mmio_read_vgic",      NULL,                   mmio_read_vgic_exec,    NULL,           65536,          true},
+       {"eoi",                 NULL,                   eoi_exec,               NULL,           65536,          true},
+       {"ipi",                 ipi_prep,               ipi_exec,               NULL,           65536,          true},
+       {"ipi_hw",              ipi_hw_prep,            ipi_exec,               NULL,           65536,          true},
+       {"lpi",                 lpi_prep,               lpi_exec,               NULL,           65536,          true},
+       {"timer_10ms",          timer_prep,             timer_exec,             timer_post,     256,            true},
 };
 
 struct ns_time {
 


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-05-13 17:18       ` Andrew Jones
@ 2021-05-13 17:43         ` Andrew Jones
  2021-05-17 10:38           ` Alexandru Elisei
  0 siblings, 1 reply; 22+ messages in thread
From: Andrew Jones @ 2021-05-13 17:43 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

On Thu, May 13, 2021 at 07:18:44PM +0200, Andrew Jones wrote:
> On Thu, May 13, 2021 at 04:48:16PM +0100, Alexandru Elisei wrote:
> > Hi Drew,
> > 
> > On 5/10/21 4:45 PM, Alexandru Elisei wrote:
> > > Hi Drew,
> > >
> > > On 4/29/21 5:41 PM, Andrew Jones wrote:
> > >> By providing a proper ioremap function, we can just rely on devices
> > >> calling it for each region they need (as they already do) instead of
> > >> mapping a big assumed I/O range. We don't require the MMU to be
> > >> enabled at the time of the ioremap. In that case, we add the mapping
> > >> to the identity map anyway. This allows us to call setup_vm after
> > >> io_init. Why don't we just call setup_vm before io_init, I hear you
> > >> ask? Well, that's because tests like sieve want to start with the MMU
> > >> off, later call setup_vm, and all the while have working I/O. Some
> > >> unit tests are just really demanding...
> > >>
> > >> While at it, ensure we map the I/O regions with XN (execute never),
> > >> as suggested by Alexandru Elisei.
> > > I got to thinking why this wasn't an issue before. Under KVM, device memory is not
> > > usually mapped at stage 2, so any speculated reads wouldn't have reached memory at
> > > all. The only way I imagine that happening if the user was running kvm-unit-tests
> > > with a passthrough PCI device, which I don't think happens too often.
> > >
> > > But we cannot rely on devices not being mapped at stage 2 when running under EFI
> > > (we're mapping them ourselves with ioremap), so I believe this is a good fix.
> > >
> > > Had another look at the patch, looks good to me:
> > 
> > While testing the series I discovered that this patch introduces a bug when
> > running under kvmtool.
> > 
> > Here's the splat:
> > 
> > $ ./configure --vmm=kvmtool --earlycon=uart,mmio,0x1000000 --page-size=4K && make
> > clean && make -j6 && ./vm run -c2 -m128 -f arm/micro-bench.flat
> > [..]
> >   # lkvm run --firmware arm/micro-bench.flat -m 128 -c 2 --name guest-6986
> >   Info: Placing fdt at 0x80200000 - 0x80210000
> > chr_testdev_init: chr-testdev: can't find a virtio-console
> > Timer Frequency 24000000 Hz (Output in microseconds)
> > 
> > name                                    total ns                         avg
> > ns            
> > --------------------------------------------------------------------------------------------
> > hvc                                 168674516.0                        
> > 2573.0             
> > Load address: 80000000
> > PC: 80000128 PC offset: 128
> > Unhandled exception ec=0x25 (DABT_EL1)
> > Vector: 4 (el1h_sync)
> > ESR_EL1:         96000006, ec=0x25 (DABT_EL1)
> > FAR_EL1: 000000000a000008 (valid)
> > Exception frame registers:
> > pc : [<0000000080000128>] lr : [<000000008000cac8>] pstate: 800003c5
> > sp : 000000008003ff90
> > x29: 0000000000000000 x28: 0000000000000000
> > x27: 00000011ada4d0c2 x26: 0000000000000000
> > x25: 0000000080015978 x24: 0000000080015a90
> > x23: 0000048c27394fff x22: 20c49ba5e353f7cf
> > x21: 28f5c28f5c28f5c3 x20: 0000000080016af0
> > x19: 000000e8d4a51000 x18: 0000000080040000
> > x17: 0000000000000000 x16: 0000000080008210
> > x15: 000000008003fe5c x14: 0000000000000260
> > x13: 00000000000002a4 x12: 0000000080040000
> > x11: 0000000000000001 x10: 0000000000000060
> > x9 : 0000000000000000 x8 : 0000000000000039
> > x7 : 0000000000000040 x6 : 0000000080013983
> > x5 : 000000008003f74e x4 : 000000008003f69c
> > x3 : 0000000000000000 x2 : 0000000000000000
> > x1 : 0000000000000000 x0 : 000000000a000008
> > 
> > The issue is caused by the mmio_read_user_exec() function from arm/micro-bench.c.
> > kvmtool doesn't have a chr-testdev device, and because probing fails, the address
> > 0x0a000008 isn't ioremap'ed. The 0-1G memory region is not automatically mapped
> > anymore, and the access triggers a data abort at stage 1.
> > 
> > I fixed the splat with this change:
> > 
> > diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> > index 95c418c10eb4..ad9e44d71d8d 100644
> > --- a/arm/micro-bench.c
> > +++ b/arm/micro-bench.c
> > @@ -281,7 +281,7 @@ static void mmio_read_user_exec(void)
> >          * updated in the future if any relevant changes in QEMU
> >          * test-dev are made.
> >          */
> > -       void *userspace_emulated_addr = (void*)0x0a000008;
> > +       void *userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
> >  
> >         readl(userspace_emulated_addr);
> >  }
> > 
> > kvmtool ignores the MMIO exit reason if no device owns the IPA, that's why it also
> > works on kvmtool.
> > 
> > The micro-bench test with the diff passes under qemu and kvmtool, tested with 4K,
> > 16K and 64K pages on an odroid-c4.
> >
> 
> Thanks Alex,
> 
> I think a better fix is this untested one below, though. If you can test
> it out and confirm it also resolves the issue, then I'll add this patch
> to the series.
> 
> Thanks,
> drew
> 
> 
> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> index 95c418c10eb4..deafd5695c33 100644
> --- a/arm/micro-bench.c
> +++ b/arm/micro-bench.c
> @@ -273,16 +273,22 @@ static void hvc_exec(void)
>         asm volatile("mov w0, #0x4b000000; hvc #0" ::: "w0");
>  }
>  
> -static void mmio_read_user_exec(void)
> +/*
> + * FIXME: Read device-id in virtio mmio here in order to
> + * force an exit to userspace. This address needs to be
> + * updated in the future if any relevant changes in QEMU
> + * test-dev are made.
> + */
> +static void *userspace_emulated_addr;
> +
> +static bool mmio_read_user_prep(void)
>  {
> -       /*
> -        * FIXME: Read device-id in virtio mmio here in order to
> -        * force an exit to userspace. This address needs to be
> -        * updated in the future if any relevant changes in QEMU
> -        * test-dev are made.
> -        */
> -       void *userspace_emulated_addr = (void*)0x0a000008;
> +       userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
> +       return true;
> +}
>  
> +static void mmio_read_user_exec(void)
> +{
>         readl(userspace_emulated_addr);
>  }
>  
> @@ -309,14 +315,14 @@ struct exit_test {
>  };
>  
>  static struct exit_test tests[] = {
> -       {"hvc",                 NULL,           hvc_exec,               NULL,           65536,          true},
> -       {"mmio_read_user",      NULL,           mmio_read_user_exec,    NULL,           65536,          true},
> -       {"mmio_read_vgic",      NULL,           mmio_read_vgic_exec,    NULL,           65536,          true},
> -       {"eoi",                 NULL,           eoi_exec,               NULL,           65536,          true},
> -       {"ipi",                 ipi_prep,       ipi_exec,               NULL,           65536,          true},
> -       {"ipi_hw",              ipi_hw_prep,    ipi_exec,               NULL,           65536,          true},
> -       {"lpi",                 lpi_prep,       lpi_exec,               NULL,           65536,          true},
> -       {"timer_10ms",          timer_prep,     timer_exec,             timer_post,     256,            true},
> +       {"hvc",                 NULL,                   hvc_exec,               NULL,           65536,          true},
> +       {"mmio_read_user",      mmio_read_user_prep,    mmio_read_user_exec,    NULL,           65536,          true},
> +       {"mmio_read_vgic",      NULL,                   mmio_read_vgic_exec,    NULL,           65536,          true},
> +       {"eoi",                 NULL,                   eoi_exec,               NULL,           65536,          true},
> +       {"ipi",                 ipi_prep,               ipi_exec,               NULL,           65536,          true},
> +       {"ipi_hw",              ipi_hw_prep,            ipi_exec,               NULL,           65536,          true},
> +       {"lpi",                 lpi_prep,               lpi_exec,               NULL,           65536,          true},
> +       {"timer_10ms",          timer_prep,             timer_exec,             timer_post,     256,            true},
>  };
>  
>  struct ns_time {
>

I still haven't tested it (beyond compiling), but I've tweaked this a bit.
You can see it here

https://gitlab.com/rhdrjones/kvm-unit-tests/-/commit/71938030d160e021db3388037d0d407df17e8e5e

The whole v4 of this series is here

https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/efiprep

Thanks,
drew


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-05-13 17:43         ` Andrew Jones
@ 2021-05-17 10:38           ` Alexandru Elisei
  2021-05-17 14:40             ` Andrew Jones
  0 siblings, 1 reply; 22+ messages in thread
From: Alexandru Elisei @ 2021-05-17 10:38 UTC (permalink / raw)
  To: Andrew Jones; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

Hi Drew,

On 5/13/21 6:43 PM, Andrew Jones wrote:
> On Thu, May 13, 2021 at 07:18:44PM +0200, Andrew Jones wrote:
>> [..]
>> Thanks Alex,
>>
>> I think a better fix is this untested one below, though. If you can test
>> it out and confirm it also resolves the issue, then I'll add this patch
>> to the series.
>>
>> Thanks,
>> drew
>>
>>
>> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
>> index 95c418c10eb4..deafd5695c33 100644
>> --- a/arm/micro-bench.c
>> +++ b/arm/micro-bench.c
>> @@ -273,16 +273,22 @@ static void hvc_exec(void)
>>         asm volatile("mov w0, #0x4b000000; hvc #0" ::: "w0");
>>  }
>>  
>> -static void mmio_read_user_exec(void)
>> +/*
>> + * FIXME: Read device-id in virtio mmio here in order to
>> + * force an exit to userspace. This address needs to be
>> + * updated in the future if any relevant changes in QEMU
>> + * test-dev are made.
>> + */
>> +static void *userspace_emulated_addr;
>> +
>> +static bool mmio_read_user_prep(void)
>>  {
>> -       /*
>> -        * FIXME: Read device-id in virtio mmio here in order to
>> -        * force an exit to userspace. This address needs to be
>> -        * updated in the future if any relevant changes in QEMU
>> -        * test-dev are made.
>> -        */
>> -       void *userspace_emulated_addr = (void*)0x0a000008;
>> +       userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
>> +       return true;
>> +}
>>  
>> +static void mmio_read_user_exec(void)
>> +{
>>         readl(userspace_emulated_addr);
>>  }
>>  
>> @@ -309,14 +315,14 @@ struct exit_test {
>>  };
>>  
>>  static struct exit_test tests[] = {
>> -       {"hvc",                 NULL,           hvc_exec,               NULL,           65536,          true},
>> -       {"mmio_read_user",      NULL,           mmio_read_user_exec,    NULL,           65536,          true},
>> -       {"mmio_read_vgic",      NULL,           mmio_read_vgic_exec,    NULL,           65536,          true},
>> -       {"eoi",                 NULL,           eoi_exec,               NULL,           65536,          true},
>> -       {"ipi",                 ipi_prep,       ipi_exec,               NULL,           65536,          true},
>> -       {"ipi_hw",              ipi_hw_prep,    ipi_exec,               NULL,           65536,          true},
>> -       {"lpi",                 lpi_prep,       lpi_exec,               NULL,           65536,          true},
>> -       {"timer_10ms",          timer_prep,     timer_exec,             timer_post,     256,            true},
>> +       {"hvc",                 NULL,                   hvc_exec,               NULL,           65536,          true},
>> +       {"mmio_read_user",      mmio_read_user_prep,    mmio_read_user_exec,    NULL,           65536,          true},
>> +       {"mmio_read_vgic",      NULL,                   mmio_read_vgic_exec,    NULL,           65536,          true},
>> +       {"eoi",                 NULL,                   eoi_exec,               NULL,           65536,          true},
>> +       {"ipi",                 ipi_prep,               ipi_exec,               NULL,           65536,          true},
>> +       {"ipi_hw",              ipi_hw_prep,            ipi_exec,               NULL,           65536,          true},
>> +       {"lpi",                 lpi_prep,               lpi_exec,               NULL,           65536,          true},
>> +       {"timer_10ms",          timer_prep,             timer_exec,             timer_post,     256,            true},
>>  };
>>  
>>  struct ns_time {
>>
> I still haven't tested it (beyond compiling), but I've tweaked this a bit.
> You can see it here
>
> https://gitlab.com/rhdrjones/kvm-unit-tests/-/commit/71938030d160e021db3388037d0d407df17e8e5e
>
> The whole v4 of this series is here
>
> https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/efiprep

Had a look at the patch, looks good; in my suggestion I wrongly thought that readl
reads a long (64 bits), not an uint32_t value:

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>

I also ran some tests on the v4 series from your repo.

Qemu TCG on x86 machine:
    - arm compiled with arm-linux-gnu-gcc and arm-none-eabi-gcc
    - arm64, 4k and 64k pages.

Odroid-c4:
    - arm, both compilers, under kvmtool
    - arm64, 4k, 16k and 64k pages under qemu KVM and kvmtool

Rockpro64:
    - arm, both compilers, under kvmtool
    - arm64, 4k and 64k pages, under qemu KVM and kvmtool.

The ITS migration tests I had to run manually on the rockpro64 (Odroid has a
gicv2) because it looks like the run script wasn't detecting the prompt to start
migration. I'm guessing something on my side, because I had issues with the
migration tests before. Nonetheless, those tests ran just fine manually under qemu
and kvmtool, so everything looks correct to me:

Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>

Thanks,

Alex


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region
  2021-05-17 10:38           ` Alexandru Elisei
@ 2021-05-17 14:40             ` Andrew Jones
  0 siblings, 0 replies; 22+ messages in thread
From: Andrew Jones @ 2021-05-17 14:40 UTC (permalink / raw)
  To: Alexandru Elisei; +Cc: kvm, nikos.nikoleris, andre.przywara, eric.auger

On Mon, May 17, 2021 at 11:38:46AM +0100, Alexandru Elisei wrote:
> Hi Drew,
> 
> On 5/13/21 6:43 PM, Andrew Jones wrote:
> > On Thu, May 13, 2021 at 07:18:44PM +0200, Andrew Jones wrote:
> >> [..]
> >> Thanks Alex,
> >>
> >> I think a better fix is this untested one below, though. If you can test
> >> it out and confirm it also resolves the issue, then I'll add this patch
> >> to the series.
> >>
> >> Thanks,
> >> drew
> >>
> >>
> >> diff --git a/arm/micro-bench.c b/arm/micro-bench.c
> >> index 95c418c10eb4..deafd5695c33 100644
> >> --- a/arm/micro-bench.c
> >> +++ b/arm/micro-bench.c
> >> @@ -273,16 +273,22 @@ static void hvc_exec(void)
> >>         asm volatile("mov w0, #0x4b000000; hvc #0" ::: "w0");
> >>  }
> >>  
> >> -static void mmio_read_user_exec(void)
> >> +/*
> >> + * FIXME: Read device-id in virtio mmio here in order to
> >> + * force an exit to userspace. This address needs to be
> >> + * updated in the future if any relevant changes in QEMU
> >> + * test-dev are made.
> >> + */
> >> +static void *userspace_emulated_addr;
> >> +
> >> +static bool mmio_read_user_prep(void)
> >>  {
> >> -       /*
> >> -        * FIXME: Read device-id in virtio mmio here in order to
> >> -        * force an exit to userspace. This address needs to be
> >> -        * updated in the future if any relevant changes in QEMU
> >> -        * test-dev are made.
> >> -        */
> >> -       void *userspace_emulated_addr = (void*)0x0a000008;
> >> +       userspace_emulated_addr = (void*)ioremap(0x0a000008, 8);
> >> +       return true;
> >> +}
> >>  
> >> +static void mmio_read_user_exec(void)
> >> +{
> >>         readl(userspace_emulated_addr);
> >>  }
> >>  
> >> @@ -309,14 +315,14 @@ struct exit_test {
> >>  };
> >>  
> >>  static struct exit_test tests[] = {
> >> -       {"hvc",                 NULL,           hvc_exec,               NULL,           65536,          true},
> >> -       {"mmio_read_user",      NULL,           mmio_read_user_exec,    NULL,           65536,          true},
> >> -       {"mmio_read_vgic",      NULL,           mmio_read_vgic_exec,    NULL,           65536,          true},
> >> -       {"eoi",                 NULL,           eoi_exec,               NULL,           65536,          true},
> >> -       {"ipi",                 ipi_prep,       ipi_exec,               NULL,           65536,          true},
> >> -       {"ipi_hw",              ipi_hw_prep,    ipi_exec,               NULL,           65536,          true},
> >> -       {"lpi",                 lpi_prep,       lpi_exec,               NULL,           65536,          true},
> >> -       {"timer_10ms",          timer_prep,     timer_exec,             timer_post,     256,            true},
> >> +       {"hvc",                 NULL,                   hvc_exec,               NULL,           65536,          true},
> >> +       {"mmio_read_user",      mmio_read_user_prep,    mmio_read_user_exec,    NULL,           65536,          true},
> >> +       {"mmio_read_vgic",      NULL,                   mmio_read_vgic_exec,    NULL,           65536,          true},
> >> +       {"eoi",                 NULL,                   eoi_exec,               NULL,           65536,          true},
> >> +       {"ipi",                 ipi_prep,               ipi_exec,               NULL,           65536,          true},
> >> +       {"ipi_hw",              ipi_hw_prep,            ipi_exec,               NULL,           65536,          true},
> >> +       {"lpi",                 lpi_prep,               lpi_exec,               NULL,           65536,          true},
> >> +       {"timer_10ms",          timer_prep,             timer_exec,             timer_post,     256,            true},
> >>  };
> >>  
> >>  struct ns_time {
> >>
> > I still haven't tested it (beyond compiling), but I've tweaked this a bit.
> > You can see it here
> >
> > https://gitlab.com/rhdrjones/kvm-unit-tests/-/commit/71938030d160e021db3388037d0d407df17e8e5e
> >
> > The whole v4 of this series is here
> >
> > https://gitlab.com/rhdrjones/kvm-unit-tests/-/commits/efiprep
> 
> Had a look at the patch, looks good; in my suggestion I wrongly thought that readl
> reads a long (64 bits), not an uint32_t value:
> 
> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
> 
> I also ran some tests on the v4 series from your repo.
> 
> Qemu TCG on x86 machine:
>     - arm compiled with arm-linux-gnu-gcc and arm-none-eabi-gcc
>     - arm64, 4k and 64k pages.
> 
> Odroid-c4:
>     - arm, both compilers, under kvmtool
>     - arm64, 4k, 16k and 64k pages under qemu KVM and kvmtool
> 
> Rockpro64:
>     - arm, both compilers, under kvmtool
>     - arm64, 4k and 64k pages, under qemu KVM and kvmtool.
> 
> The ITS migration tests I had to run manually on the rockpro64 (Odroid has a
> gicv2) because it looks like the run script wasn't detecting the prompt to start
> migration. I'm guessing something on my side, because I had issues with the
> migration tests before. Nonetheless, those tests ran just fine manually under qemu
> and kvmtool, so everything looks correct to me:
> 
> Tested-by: Alexandru Elisei <alexandru.elisei@arm.com>
>

Thanks Alex! I've added your tags, applied to arm/queue and sent the pull
request.

Thanks,
drew


^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-05-17 14:42 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-04-29 16:41 [PATCH kvm-unit-tests v3 0/8] arm/arm64: Prepare for target-efi Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 1/8] arm/arm64: Reorganize cstart assembler Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 2/8] arm/arm64: Move setup_vm into setup Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 3/8] pci-testdev: ioremap regions Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 4/8] arm/arm64: mmu: Stop mapping an assumed IO region Andrew Jones
2021-05-10 15:45   ` Alexandru Elisei
2021-05-13 15:48     ` Alexandru Elisei
2021-05-13 17:18       ` Andrew Jones
2021-05-13 17:43         ` Andrew Jones
2021-05-17 10:38           ` Alexandru Elisei
2021-05-17 14:40             ` Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 5/8] arm/arm64: mmu: Remove memory layout assumptions Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 6/8] arm/arm64: setup: Consolidate " Andrew Jones
2021-05-11 15:11   ` Alexandru Elisei
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 7/8] chr-testdev: Silently fail init Andrew Jones
2021-04-29 16:41 ` [PATCH kvm-unit-tests v3 8/8] arm/arm64: psci: Don't assume method is hvc Andrew Jones
2021-05-12 16:14   ` Alexandru Elisei
2021-05-13  7:08     ` Andrew Jones
2021-05-13  9:08       ` Alexandru Elisei
2021-05-13 10:06         ` Andrew Jones
2021-05-13 10:18   ` [PATCH kvm-unit-tests v4] " Andrew Jones
2021-05-13 15:53     ` Alexandru Elisei

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).