* [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages)
@ 2021-09-30 10:35 Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 01/13] arm64/mm: Dynamically initialize protection_map[] Anshuman Khandual
                   ` (12 more replies)
  0 siblings, 13 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

This series enables 52 bits PA support for 4K and 16K page configs via the
existing CONFIG_ARM64_PA_BITS_52, utilizing the new arch feature FEAT_LPA2
which is available from ARMv8.7. The IDMAP needs changes to accommodate two
new levels of page tables in certain scenarios like (4K|39VA|52PA), but the
same problem also exists for (16K|36VA|48PA) and needs fixing there too. I
have sent a fix for the 16K case [1] and will later enable it for FEAT_LPA2
as well.

This series applies on v5.15-rc3.

Testing:

Build and boot tested (individual patches) on all existing and new
FEAT_LPA2 enabled config combinations.

Pending:

- Enable IDMAP for FEAT_LPA2
- Enable 52 bit VA range on 4K/16K page sizes
- Evaluate KVM and SMMU impacts from FEAT_LPA2

[1] https://lore.kernel.org/all/1632807225-20189-1-git-send-email-anshuman.khandual@arm.com/

Changes in RFC V3:

- protection_map[] gets reinitialized in platform code to avoid a build failure, per Catalin
- Added __cpu_secondary_check52bitpa()
- Added FEAT_LPA2 support during KVM stage-2 translation
- Updated description for ARM64_PA_BITS_52 per Catalin
- Added tags from last version

Changes in RFC V2:

https://lore.kernel.org/all/1627281445-12445-1-git-send-email-anshuman.khandual@arm.com/

- Changed FEAT_LPA2 presence qualifying criteria wrt ID_AA64MMFR0_TGRAN_LPA2
- Changed FEAT_LPA2 specific encoding which drops the additional tmp variable
- Fixed [phys|pte]_to_[pte|phys]() helpers when FEAT_LPA2 is implemented

Changes in RFC V1:

https://lore.kernel.org/lkml/1626229291-6569-1-git-send-email-anshuman.khandual@arm.com/


Anshuman Khandual (13):
  arm64/mm: Dynamically initialize protection_map[]
  arm64/mm: Consolidate TCR_EL1 fields
  arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
  arm64/mm: Add FEAT_LPA2 specific VTCR_EL2.DS field
  arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
  arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
  arm64/mm: Add FEAT_LPA2 specific encoding
  arm64/mm: Detect and enable FEAT_LPA2
  arm64/mm: Add __cpu_secondary_check52bitpa()
  arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
  arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
  arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES
  KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K

 arch/arm64/Kconfig                      | 17 +++++++----
 arch/arm64/include/asm/assembler.h      | 42 +++++++++++++++++++++------
 arch/arm64/include/asm/kernel-pgtable.h |  4 +--
 arch/arm64/include/asm/kvm_arm.h        |  1 +
 arch/arm64/include/asm/kvm_pgtable.h    | 10 ++++++-
 arch/arm64/include/asm/memory.h         |  1 +
 arch/arm64/include/asm/pgtable-hwdef.h  | 28 ++++++++++++++----
 arch/arm64/include/asm/pgtable-prot.h   | 34 +++++++++++-----------
 arch/arm64/include/asm/pgtable.h        | 18 ++++++++++--
 arch/arm64/include/asm/smp.h            |  1 +
 arch/arm64/include/asm/sysreg.h         |  9 +++---
 arch/arm64/kernel/head.S                | 51 +++++++++++++++++++++++++++++++++
 arch/arm64/kernel/smp.c                 |  2 ++
 arch/arm64/kvm/hyp/pgtable.c            | 25 ++++++++++++++--
 arch/arm64/kvm/reset.c                  | 14 ++++++---
 arch/arm64/mm/init.c                    | 22 ++++++++++++++
 arch/arm64/mm/mmu.c                     |  3 ++
 arch/arm64/mm/pgd.c                     |  2 +-
 arch/arm64/mm/proc.S                    | 11 ++++++-
 arch/arm64/mm/ptdump.c                  | 26 +++++++++++++++--
 20 files changed, 265 insertions(+), 56 deletions(-)

-- 
2.7.4


* [RFC V3 01/13] arm64/mm: Dynamically initialize protection_map[]
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 02/13] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
                   ` (11 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Going forward, some protection_map[] elements (i.e. __PXXX and __SXXX)
would contain a runtime variable when FEAT_LPA2 is enabled, which breaks
their current static initialization at build time. This change avoids the
problem by assigning a dummy protection value, i.e. __pgprot(0), to all
__PXXX and __SXXX elements so that the build succeeds, and then updating
the protection_map[] array from the platform's mem_init(). __pgprot(0)
does not cause any problem because vm_get_page_prot(), which consumes
protection_map[], should never be called before platform mem_init().
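
As a reference point, here is a simplified sketch (not part of this patch)
of the generic consumer in mm/mmap.c, which shows why populating the array
from mem_init() is early enough, since nothing indexes protection_map[]
before the first mmap():

	/* mm/mmap.c (simplified): all lookups funnel through here */
	pgprot_t vm_get_page_prot(unsigned long vm_flags)
	{
		return protection_map[vm_flags &
			(VM_READ | VM_WRITE | VM_EXEC | VM_SHARED)];
	}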

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/pgtable-prot.h | 34 +++++++++++++++++-----------------
 arch/arm64/mm/init.c                  | 22 ++++++++++++++++++++++
 2 files changed, 39 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-prot.h b/arch/arm64/include/asm/pgtable-prot.h
index 7032f04..539503a 100644
--- a/arch/arm64/include/asm/pgtable-prot.h
+++ b/arch/arm64/include/asm/pgtable-prot.h
@@ -88,23 +88,23 @@ extern bool arm64_use_ng_mappings;
 #define PAGE_READONLY_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_RDONLY | PTE_NG | PTE_PXN)
 #define PAGE_EXECONLY		__pgprot(_PAGE_DEFAULT | PTE_RDONLY | PTE_NG | PTE_PXN)
 
-#define __P000  PAGE_NONE
-#define __P001  PAGE_READONLY
-#define __P010  PAGE_READONLY
-#define __P011  PAGE_READONLY
-#define __P100  PAGE_EXECONLY
-#define __P101  PAGE_READONLY_EXEC
-#define __P110  PAGE_READONLY_EXEC
-#define __P111  PAGE_READONLY_EXEC
-
-#define __S000  PAGE_NONE
-#define __S001  PAGE_READONLY
-#define __S010  PAGE_SHARED
-#define __S011  PAGE_SHARED
-#define __S100  PAGE_EXECONLY
-#define __S101  PAGE_READONLY_EXEC
-#define __S110  PAGE_SHARED_EXEC
-#define __S111  PAGE_SHARED_EXEC
+#define __P000  __pgprot(0)
+#define __P001  __pgprot(0)
+#define __P010  __pgprot(0)
+#define __P011  __pgprot(0)
+#define __P100  __pgprot(0)
+#define __P101  __pgprot(0)
+#define __P110  __pgprot(0)
+#define __P111  __pgprot(0)
+
+#define __S000  __pgprot(0)
+#define __S001  __pgprot(0)
+#define __S010  __pgprot(0)
+#define __S011  __pgprot(0)
+#define __S100  __pgprot(0)
+#define __S101  __pgprot(0)
+#define __S110  __pgprot(0)
+#define __S111  __pgprot(0)
 
 #endif /* __ASSEMBLY__ */
 
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 37a8175..27f7c6c 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -403,6 +403,27 @@ void __init bootmem_init(void)
 	memblock_dump_all();
 }
 
+static void init_protection_map(void)
+{
+	protection_map[0] = PAGE_NONE;
+	protection_map[1] = PAGE_READONLY;
+	protection_map[2] = PAGE_READONLY;
+	protection_map[3] = PAGE_READONLY;
+	protection_map[4] = PAGE_EXECONLY;
+	protection_map[5] = PAGE_READONLY_EXEC;
+	protection_map[6] = PAGE_READONLY_EXEC;
+	protection_map[7] = PAGE_READONLY_EXEC;
+
+	protection_map[8] = PAGE_NONE;
+	protection_map[9] = PAGE_READONLY;
+	protection_map[10] = PAGE_SHARED;
+	protection_map[11] = PAGE_SHARED;
+	protection_map[12] = PAGE_EXECONLY;
+	protection_map[13] = PAGE_READONLY_EXEC;
+	protection_map[14] = PAGE_SHARED_EXEC;
+	protection_map[15] = PAGE_SHARED_EXEC;
+}
+
 /*
  * mem_init() marks the free areas in the mem_map and tells us how much memory
  * is free.  This is done after various parts of the system have claimed their
@@ -444,6 +465,7 @@ void __init mem_init(void)
 		 */
 		sysctl_overcommit_memory = OVERCOMMIT_ALWAYS;
 	}
+	init_protection_map();
 }
 
 void free_initmem(void)
-- 
2.7.4


* [RFC V3 02/13] arm64/mm: Consolidate TCR_EL1 fields
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 01/13] arm64/mm: Dynamically initialize protection_map[] Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 03/13] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
                   ` (10 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

This renames and moves the SYS_TCR_EL1_TCMA1 and SYS_TCR_EL1_TCMA0
definitions into pgtable-hwdef.h, thus consolidating all TCR fields in a
single header. No functional change.

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
 arch/arm64/include/asm/sysreg.h        | 4 ----
 arch/arm64/mm/proc.S                   | 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 40085e5..66671ff 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -273,6 +273,8 @@
 #define TCR_NFD1		(UL(1) << 54)
 #define TCR_E0PD0		(UL(1) << 55)
 #define TCR_E0PD1		(UL(1) << 56)
+#define TCR_TCMA0		(UL(1) << 57)
+#define TCR_TCMA1		(UL(1) << 58)
 
 /*
  * TTBR.
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index b268082..4630eac 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1076,10 +1076,6 @@
 #define CPACR_EL1_ZEN_EL0EN	(BIT(17)) /* enable EL0 access, if EL1EN set */
 #define CPACR_EL1_ZEN		(CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
 
-/* TCR EL1 Bit Definitions */
-#define SYS_TCR_EL1_TCMA1	(BIT(58))
-#define SYS_TCR_EL1_TCMA0	(BIT(57))
-
 /* GCR_EL1 Definitions */
 #define SYS_GCR_EL1_RRND	(BIT(16))
 #define SYS_GCR_EL1_EXCL_MASK	0xffffUL
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index d35c90d..50bbed9 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -46,7 +46,7 @@
 #endif
 
 #ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
+#define TCR_MTE_FLAGS TCR_TCMA1 | TCR_TBI1 | TCR_TBID1
 #else
 /*
  * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
-- 
2.7.4


* [RFC V3 03/13] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 01/13] arm64/mm: Dynamically initialize protection_map[] Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 02/13] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 04/13] arm64/mm: Add FEAT_LPA2 specific VTCR_EL2.DS field Anshuman Khandual
                   ` (9 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

As per the ARM ARM (0487G.A), the TCR_EL1.DS field controls whether 52 bit
input and output addresses are supported on 4K and 16K page size
configurations when FEAT_LPA2 is implemented. This adds the TCR_DS field
definition, which will be used when FEAT_LPA2 gets enabled.
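
Purely as an illustrative C sketch (the series itself programs TCR_EL1
from assembly in __cpu_setup() later on), enabling the bit would look
roughly like:

	u64 tcr = read_sysreg(tcr_el1);

	tcr |= TCR_DS;			/* bit[59]: 52 bit IA/OA on 4K/16K */
	write_sysreg(tcr, tcr_el1);
	isb();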

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 66671ff..1eb5574 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -275,6 +275,7 @@
 #define TCR_E0PD1		(UL(1) << 56)
 #define TCR_TCMA0		(UL(1) << 57)
 #define TCR_TCMA1		(UL(1) << 58)
+#define TCR_DS			(UL(1) << 59)
 
 /*
  * TTBR.
-- 
2.7.4


* [RFC V3 04/13] arm64/mm: Add FEAT_LPA2 specific VTCR_EL2.DS field
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (2 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 03/13] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 05/13] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
                   ` (8 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

As per the ARM ARM (0487G.A), the VTCR_EL2.DS field controls whether 52 bit
IPA and output physical addresses are supported on 4K and 16K page size
configurations when FEAT_LPA2 is implemented. This adds the VTCR_EL2_DS
field definition, which will be used when FEAT_LPA2 gets enabled at
Stage-2.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 327120c..e5c4236 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -105,6 +105,7 @@
 			 TCR_EL2_ORGN0_MASK | TCR_EL2_IRGN0_MASK | TCR_EL2_T0SZ_MASK)
 
 /* VTCR_EL2 Registers bits */
+#define VTCR_EL2_DS		(1UL << 32)
 #define VTCR_EL2_RES1		(1U << 31)
 #define VTCR_EL2_HD		(1 << 22)
 #define VTCR_EL2_HA		(1 << 21)
-- 
2.7.4


* [RFC V3 05/13] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (3 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 04/13] arm64/mm: Add FEAT_LPA2 specific VTCR_EL2.DS field Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 06/13] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
                   ` (7 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

PAGE_SIZE support is tested against the possible minimum and maximum values
of its respective ID_AA64MMFR0.TGRAN field, depending on whether the field
is signed or unsigned. A FEAT_LPA2 implementation additionally needs to be
validated for 4K and 16K page sizes via feature specific
ID_AA64MMFR0.TGRAN values. Hence this adds the FEAT_LPA2 specific
ID_AA64MMFR0.TGRAN[2] values per the ARM ARM (0487G.A).
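
A hedged C sketch of how these values are meant to be consumed; the series
itself performs the equivalent check in assembly during early boot:

	u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
	unsigned int tgran = cpuid_feature_extract_unsigned_field(mmfr0,
					ID_AA64MMFR0_TGRAN_SHIFT);
	/* TGRAN at or above the LPA2 mark, 0x1 (4K) or 0x2 (16K) */
	bool lpa2 = tgran >= ID_AA64MMFR0_TGRAN_LPA2;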

Acked-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 4630eac..334d91f 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -850,12 +850,14 @@
 #define ID_AA64MMFR0_ASID_16		0x2
 
 #define ID_AA64MMFR0_TGRAN4_NI			0xf
+#define ID_AA64MMFR0_TGRAN4_LPA2		0x1
 #define ID_AA64MMFR0_TGRAN4_SUPPORTED_MIN	0x0
 #define ID_AA64MMFR0_TGRAN4_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_TGRAN64_NI			0xf
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED_MIN	0x0
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED_MAX	0x7
 #define ID_AA64MMFR0_TGRAN16_NI			0x0
+#define ID_AA64MMFR0_TGRAN16_LPA2		0x2
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED_MIN	0x1
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED_MAX	0xf
 
@@ -872,6 +874,7 @@
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_DEFAULT	0x0
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE	0x1
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MIN	0x2
+#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2	0x3
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MAX	0x7
 
 #ifdef CONFIG_ARM64_PA_BITS_52
@@ -1042,11 +1045,13 @@
 
 #if defined(CONFIG_ARM64_4K_PAGES)
 #define ID_AA64MMFR0_TGRAN_SHIFT		ID_AA64MMFR0_TGRAN4_SHIFT
+#define ID_AA64MMFR0_TGRAN_LPA2			ID_AA64MMFR0_TGRAN4_LPA2
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_TGRAN4_SUPPORTED_MIN
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_TGRAN4_SUPPORTED_MAX
 #define ID_AA64MMFR0_TGRAN_2_SHIFT		ID_AA64MMFR0_TGRAN4_2_SHIFT
 #elif defined(CONFIG_ARM64_16K_PAGES)
 #define ID_AA64MMFR0_TGRAN_SHIFT		ID_AA64MMFR0_TGRAN16_SHIFT
+#define ID_AA64MMFR0_TGRAN_LPA2			ID_AA64MMFR0_TGRAN16_LPA2
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_TGRAN16_SUPPORTED_MIN
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	ID_AA64MMFR0_TGRAN16_SUPPORTED_MAX
 #define ID_AA64MMFR0_TGRAN_2_SHIFT		ID_AA64MMFR0_TGRAN16_2_SHIFT
-- 
2.7.4


* [RFC V3 06/13] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (4 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 05/13] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
                   ` (6 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Going forward, CONFIG_ARM64_PA_BITS_52 could be enabled on a system via two
different architecture features, i.e. FEAT_LPA for CONFIG_ARM64_64K_PAGES
and FEAT_LPA2 for CONFIG_ARM64_[4K|16K]_PAGES. But CONFIG_ARM64_PA_BITS_52
is currently available exclusively on the 64K page size config, and needs
to be freed up for the other page size configs to use when FEAT_LPA2 gets
enabled.

To decouple CONFIG_ARM64_PA_BITS_52 from CONFIG_ARM64_64K_PAGES, and also
to reduce #ifdefs while navigating the various page size configs, this adds
two internal config options, CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]. While
here, it also converts the existing 64K page size based FEAT_LPA
implementation to use CONFIG_ARM64_PA_BITS_52_LPA. The TTBR representation
remains the same for both FEAT_LPA and FEAT_LPA2. No functional change for
the 64K page size config.
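
For reference, the FEAT_LPA (64K pages) encoding now guarded by
CONFIG_ARM64_PA_BITS_52_LPA folds PA[51:48] into PTE[15:12]; in C terms,
matching the macros touched by this patch:

	pte  = (phys | (phys >> 36)) & PTE_ADDR_MASK;		/* fold */
	phys = (pte & PTE_ADDR_LOW) |
	       ((pte & PTE_ADDR_HIGH) << 36);			/* unfold */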

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig                     |  7 +++++++
 arch/arm64/include/asm/assembler.h     | 12 ++++++------
 arch/arm64/include/asm/pgtable-hwdef.h |  7 ++++---
 arch/arm64/include/asm/pgtable.h       |  6 +++---
 arch/arm64/mm/pgd.c                    |  2 +-
 5 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5c7ae4c..f58ef62 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -931,6 +931,12 @@ config ARM64_VA_BITS
 	default 48 if ARM64_VA_BITS_48
 	default 52 if ARM64_VA_BITS_52
 
+config ARM64_PA_BITS_52_LPA
+	bool
+
+config ARM64_PA_BITS_52_LPA2
+	bool
+
 choice
 	prompt "Physical address space size"
 	default ARM64_PA_BITS_48
@@ -945,6 +951,7 @@ config ARM64_PA_BITS_52
 	bool "52-bit (ARMv8.2)"
 	depends on ARM64_64K_PAGES
 	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
+	select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
 	help
 	  Enable support for a 52-bit physical address space, introduced as
 	  part of the ARMv8.2-LPA extension.
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index e5b5d3a..3fbe04a 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -621,26 +621,26 @@ alternative_endif
 	.endm
 
 	.macro	phys_to_pte, pte, phys
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	/*
 	 * We assume \phys is 64K aligned and this is guaranteed by only
 	 * supporting this configuration with 64K pages.
 	 */
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 	.endm
 
 	.macro	pte_to_phys, phys, pte
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
 	bfxil	\phys, \pte, #16, #32
 	lsl	\phys, \phys, #16
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 	.endm
 
 /*
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 1eb5574..f375bcf 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -155,13 +155,14 @@
 #define PTE_PXN			(_AT(pteval_t, 1) << 53)	/* Privileged XN */
 #define PTE_UXN			(_AT(pteval_t, 1) << 54)	/* User XN */
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
-#ifdef CONFIG_ARM64_PA_BITS_52
 #define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
 #define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index dfa76af..09b081e 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -66,14 +66,14 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
  * Macros to convert between a physical address and its placement in a
  * page table entry, taking care of 52-bit addresses.
  */
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 #define __pte_to_phys(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
 #define pte_pfn(pte)		(__pte_to_phys(pte) >> PAGE_SHIFT)
 #define pfn_pte(pfn,prot)	\
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 4a64089..090dfbe 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -40,7 +40,7 @@ void __init pgtable_cache_init(void)
 	if (PGD_SIZE == PAGE_SIZE)
 		return;
 
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	/*
 	 * With 52-bit physical addresses, the architecture requires the
 	 * top-level table to be aligned to at least 64 bytes.
-- 
2.7.4


* [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (5 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 06/13] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-10-12 10:41   ` Suzuki K Poulose
  2021-09-30 10:35 ` [RFC V3 08/13] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
                   ` (5 subsequent siblings)
  12 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

FEAT_LPA2 requires a different PTE representation format for both the 4K
and 16K page size configs. This adds the FEAT_LPA2 specific PTE encodings
as per the ARM ARM (0487G.A), updating the [pte|phys]_to_[phys|pte]()
helpers. The updated helpers are used when FEAT_LPA2 gets enabled via
CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and
the phys_to_ttbr() helper remain the same for FEAT_LPA2 as for FEAT_LPA.
While here, this also generalizes the existing FEAT_LPA 'pte_to_phys'
sequence to use PAGE_SHIFT instead of hard coded 64K shift values.
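
The following standalone userspace sketch (assuming 16K pages, so
PAGE_SHIFT is 14) demonstrates the fold/unfold round trip that the macros
below implement, with PA[51:50] carried in PTE[9:8]:

	#include <assert.h>
	#include <stdint.h>

	#define PAGE_SHIFT	14
	#define PTE_ADDR_LOW	((((uint64_t)1 << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
	#define PTE_ADDR_HIGH	((uint64_t)0x3 << 8)
	#define PTE_ADDR_MASK	(PTE_ADDR_LOW | PTE_ADDR_HIGH)

	static uint64_t phys_to_pte(uint64_t phys)
	{
		/* PA[51:50] >> 42 lands in bits [9:8] */
		return (phys | (phys >> 42)) & PTE_ADDR_MASK;
	}

	static uint64_t pte_to_phys(uint64_t pte)
	{
		return (pte & PTE_ADDR_LOW) | ((pte & PTE_ADDR_HIGH) << 42);
	}

	int main(void)
	{
		/* a 52 bit, 16K aligned physical address */
		uint64_t pa = ((uint64_t)0x3 << 50) | ((uint64_t)0x123 << PAGE_SHIFT);

		assert(pte_to_phys(phys_to_pte(pa)) == pa);
		return 0;
	}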

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/assembler.h     | 14 +++++++++++---
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/include/asm/pgtable.h       |  4 ++++
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 3fbe04a..c1543067 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -628,6 +628,10 @@ alternative_endif
 	 */
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	orr	\pte, \phys, \phys, lsr #42
+	and	\pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
+	and	\pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -635,9 +639,13 @@ alternative_endif
 
 	.macro	pte_to_phys, phys, pte
 #ifdef CONFIG_ARM64_PA_BITS_52_LPA
-	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
-	bfxil	\phys, \pte, #16, #32
-	lsl	\phys, \phys, #16
+	ubfiz	\phys, \pte, #(48 - PAGE_SHIFT - 12), #16
+	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
+	lsl	\phys, \phys, #PAGE_SHIFT
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ubfiz	\phys, \pte, #(52 - PAGE_SHIFT - 10), #10
+	bfxil	\phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
+	lsl	\phys, \phys, #PAGE_SHIFT
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index f375bcf..c815a85 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -159,6 +159,10 @@
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
 #define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+#define PTE_ADDR_HIGH		(_AT(pteval_t, 0x3) << 8)
+#define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 09b081e..9038d05 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -70,6 +70,10 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define __pte_to_phys(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define __pte_to_phys(pte)	\
+	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
+#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
-- 
2.7.4


* [RFC V3 08/13] arm64/mm: Detect and enable FEAT_LPA2
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (6 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 09/13] arm64/mm: Add __cpu_secondary_check52bitpa() Anshuman Khandual
                   ` (4 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Detect a FEAT_LPA2 implementation early during boot, when requested via
CONFIG_ARM64_PA_BITS_52_LPA2, and remember the result in a new variable
arm64_lpa2_enabled. This variable is then used to set TCR_EL1.DS, enabling
the 52 bits PA range, or to fall back to the default 48 bits PA range when
FEAT_LPA2 was requested but is found not to be implemented.
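
In pseudo-C, the intended flow across head.S and proc.S is roughly the
following (illustrative only, the actual code below is assembly):

	/* primary_entry: MMU still off, hence the dmb + dc ivac after the store */
	tgran = (read_sysreg(id_aa64mmfr0_el1) >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;
	if (tgran >= ID_AA64MMFR0_TGRAN_LPA2)
		arm64_lpa2_enabled = 1;

	/* __cpu_setup(): */
	if (arm64_lpa2_enabled)
		tcr |= TCR_DS;		/* else stay with the 48 bits PA range */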

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/memory.h |  1 +
 arch/arm64/kernel/head.S        | 15 +++++++++++++++
 arch/arm64/mm/mmu.c             |  3 +++
 arch/arm64/mm/proc.S            |  9 +++++++++
 4 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index f1745a8..41bf258 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -178,6 +178,7 @@
 #include <asm/bug.h>
 
 extern u64			vabits_actual;
+extern u64			arm64_lpa2_enabled;
 
 extern s64			memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index a8b6716..ab21aac 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
 	adrp	x23, __PHYS_OFFSET
 	and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
 	bl	set_cpu_boot_mode_flag
+
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	mrs     x10, ID_AA64MMFR0_EL1
+	ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
+	cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
+	b.lt	1f
+
+	mov	x10, #1
+	adr_l	x11, arm64_lpa2_enabled
+	str	x10, [x11]
+	dmb	sy
+	dc	ivac, x11
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 	bl	__create_page_tables
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index cfd9deb..b2a4d98 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -48,6 +48,9 @@ u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
 u64 __section(".mmuoff.data.write") vabits_actual;
 EXPORT_SYMBOL(vabits_actual);
 
+u64 __section(".mmuoff.data.write") arm64_lpa2_enabled;
+EXPORT_SYMBOL(arm64_lpa2_enabled);
+
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 50bbed9..a1578e7 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -423,6 +423,15 @@ SYM_FUNC_START(__cpu_setup)
 			TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
 			TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	ldr_l   x10, arm64_lpa2_enabled
+	cmp	x10, #1
+	b.ne	1f
+	mov_q	x10, TCR_DS
+	orr	tcr, tcr, x10
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 #ifdef CONFIG_ARM64_MTE
 	/*
 	 * Update MAIR_EL1, GCR_EL1 and TFSR*_EL1 if MTE is supported
-- 
2.7.4


* [RFC V3 09/13] arm64/mm: Add __cpu_secondary_check52bitpa()
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (7 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 08/13] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 10/13] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
                   ` (3 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

FEAT_LPA2 enabled systems need to test the ID_AA64MMFR0.TGRAN value to
ensure that the 52 bit PA range is supported across all secondary CPUs,
for both 4K and 16K configs. This adds a new __cpu_secondary_check52bitpa()
check along with a corresponding reason code CPU_STUCK_REASON_52_BIT_PA,
which identifies the exact reason in case the system gets stuck after
detecting a secondary CPU that does not support the 52 bit PA range.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/smp.h |  1 +
 arch/arm64/kernel/head.S     | 21 +++++++++++++++++++++
 arch/arm64/kernel/smp.c      |  2 ++
 3 files changed, 24 insertions(+)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index fc55f5a..e5ff305 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -22,6 +22,7 @@
 
 #define CPU_STUCK_REASON_52_BIT_VA	(UL(1) << CPU_STUCK_REASON_SHIFT)
 #define CPU_STUCK_REASON_NO_GRAN	(UL(2) << CPU_STUCK_REASON_SHIFT)
+#define CPU_STUCK_REASON_52_BIT_PA	(UL(3) << CPU_STUCK_REASON_SHIFT)
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ab21aac..0b48e4c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -667,6 +667,7 @@ SYM_FUNC_START_LOCAL(secondary_startup)
 	 */
 	bl	switch_to_vhe
 	bl	__cpu_secondary_check52bitva
+	bl	__cpu_secondary_check52bitpa
 	bl	__cpu_setup			// initialise processor
 	adrp	x1, swapper_pg_dir
 	bl	__enable_mmu
@@ -770,6 +771,26 @@ SYM_FUNC_START(__cpu_secondary_check52bitva)
 2:	ret
 SYM_FUNC_END(__cpu_secondary_check52bitva)
 
+SYM_FUNC_START(__cpu_secondary_check52bitpa)
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	ldr_l	x0, arm64_lpa2_enabled
+	cmp     x0, #1
+	b.ne	2f
+
+	mrs     x0, ID_AA64MMFR0_EL1
+	ubfx    x0, x0, #ID_AA64MMFR0_TGRAN_SHIFT, 4
+	cmp     x0, #ID_AA64MMFR0_TGRAN_LPA2
+	b.ge	2f
+
+	update_early_cpu_boot_status \
+		CPU_STUCK_IN_KERNEL | CPU_STUCK_REASON_52_BIT_PA, x0, x1
+1:	wfe
+	wfi
+	b	1b
+#endif
+2:	ret
+SYM_FUNC_END(__cpu_secondary_check52bitpa)
+
 SYM_FUNC_START_LOCAL(__no_granule_support)
 	/* Indicate that this CPU can't boot and is stuck in the kernel */
 	update_early_cpu_boot_status \
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 6f6ff072..a8d08d1 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -164,6 +164,8 @@ int __cpu_up(unsigned int cpu, struct task_struct *idle)
 		if (status & CPU_STUCK_REASON_NO_GRAN) {
 			pr_crit("CPU%u: does not support %luK granule\n",
 				cpu, PAGE_SIZE / SZ_1K);
 		}
+		if (status & CPU_STUCK_REASON_52_BIT_PA)
+			pr_crit("CPU%u: does not support 52-bit PAs\n", cpu);
 		cpus_stuck_in_kernel++;
 		break;
-- 
2.7.4


* [RFC V3 10/13] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (8 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 09/13] arm64/mm: Add __cpu_secondary_check52bitpa() Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 11/13] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
                   ` (2 subsequent siblings)
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

PTE[9:8], which holds the shareability attribute bits SH[1:0], could
collide with PA[51:50] when CONFIG_ARM64_PA_BITS_52 is enabled but
FEAT_LPA2 is not detected during boot. Dropping the PTE_SHARED and
PMD_SECT_S attributes completely in this scenario would create
non-shareable page table entries, which would cause a regression.

Instead, just define PTE_SHARED and PMD_SECT_S after accounting for the
runtime 'arm64_lpa2_enabled', thus maintaining the required shareability
attributes for both kernel and user space page table entries. This also
updates the ptdump handling of page table entry shared attributes to
accommodate the FEAT_LPA2 scenarios.
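
The underlying descriptor bit assignment from the ARM ARM (0487G.A), for
reference:

	/*
	 * PTE[9:8] on 4K/16K granules:
	 *
	 *	TCR_EL1.DS == 0: SH[1:0]   - shareability attribute
	 *	TCR_EL1.DS == 1: PA[51:50] - high output address bits;
	 *		shareability is then taken from TCR_EL1.SH0/SH1
	 */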

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/kernel-pgtable.h |  4 ++--
 arch/arm64/include/asm/pgtable-hwdef.h  | 12 ++++++++++--
 arch/arm64/kernel/head.S                | 15 +++++++++++++++
 arch/arm64/mm/ptdump.c                  | 26 ++++++++++++++++++++++++--
 4 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 96dc0f7..bdd38046 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -103,8 +103,8 @@
 /*
  * Initial memory map attributes.
  */
-#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF)
+#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF)
 
 #if ARM64_KERNEL_USES_PMD_MAPS
 #define SWAPPER_MM_MMUFLAGS	(PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index c815a85..8a3b75e 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -116,13 +116,21 @@
 #define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
 #define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+#define PTE_SHARED		(arm64_lpa2_enabled ? 0 : PTE_SHARED_STATIC)
+#define PMD_SECT_S		(arm64_lpa2_enabled ? 0 : PMD_SECT_S_STATIC)
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA2 */
+#define PTE_SHARED		PTE_SHARED_STATIC
+#define PMD_SECT_S		PMD_SECT_S_STATIC
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 /*
  * Section
  */
 #define PMD_SECT_VALID		(_AT(pmdval_t, 1) << 0)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
 #define PMD_SECT_RDONLY		(_AT(pmdval_t, 1) << 7)		/* AP[2] */
-#define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_S_STATIC	(_AT(pmdval_t, 3) << 8)
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_NG		(_AT(pmdval_t, 1) << 11)
 #define PMD_SECT_CONT		(_AT(pmdval_t, 1) << 52)
@@ -146,7 +154,7 @@
 #define PTE_TABLE_BIT		(_AT(pteval_t, 1) << 1)
 #define PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
 #define PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
-#define PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_SHARED_STATIC	(_AT(pteval_t, 3) << 8)         /* SH[1:0], inner shareable */
 #define PTE_AF			(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_NG			(_AT(pteval_t, 1) << 11)	/* nG */
 #define PTE_GP			(_AT(pteval_t, 1) << 50)	/* BTI guarded */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 0b48e4c..f62f360 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -302,6 +302,21 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
 
 	mov	x7, SWAPPER_MM_MMUFLAGS
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	ldr_l   x2, arm64_lpa2_enabled
+	cmp     x2, #1
+	b.eq    1f
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
+	/*
+	 * FEAT_LPA2 has not been detected during boot.
+	 * Hence SWAPPER_MM_MMUFLAGS needs to have the
+	 * regular shareability attributes in PTE[9:8].
+	 * The same also applies when FEAT_LPA2 has
+	 * not been requested in the first place.
+	 */
+	orr     x7, x7, PTE_SHARED_STATIC
+1:
 	/*
 	 * Create the identity mapping.
 	 */
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c40353..be171cf 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -115,8 +115,8 @@ static const struct prot_bits pte_bits[] = {
 		.set	= "NX",
 		.clear	= "x ",
 	}, {
-		.mask	= PTE_SHARED,
-		.val	= PTE_SHARED,
+		.mask	= PTE_SHARED_STATIC,
+		.val	= PTE_SHARED_STATIC,
 		.set	= "SHD",
 		.clear	= "   ",
 	}, {
@@ -211,6 +211,28 @@ static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
 	for (i = 0; i < num; i++, bits++) {
 		const char *s;
 
+		if (IS_ENABLED(CONFIG_ARM64_PA_BITS_52_LPA2) &&
+		   (bits->mask == PTE_SHARED_STATIC)) {
+			/*
+			 * If FEAT_LPA2 has been detected and enabled,
+			 * sharing attributes for page table entries
+			 * are inherited from TCR_EL1.SH1, as init_mm
+			 * based mappings are enabled via TTBR1_EL1.
+			 */
+			if (arm64_lpa2_enabled) {
+				if ((read_sysreg(tcr_el1) & TCR_SH1_INNER) == TCR_SH1_INNER)
+					pt_dump_seq_printf(st->seq, " SHD ");
+				else
+					pt_dump_seq_printf(st->seq, " ");
+				continue;
+			}
+			/*
+			 * In case FEAT_LPA2 has not been detected and
+			 * enabled, the sharing attributes are found in
+			 * the regular PTE positions, so this just falls
+			 * through to the regular PTE attribute handling.
+			 */
+		}
 		if ((st->current_prot & bits->mask) == bits->val)
 			s = bits->set;
 		else
-- 
2.7.4


* [RFC V3 11/13] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (9 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 10/13] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 12/13] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K Anshuman Khandual
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Kernels built with CONFIG_ARM64_PA_BITS_52 need to fall back to the 48 bits
PA range encodings when FEAT_LPA2 is not implemented, i.e. when TCR_EL1.DS
cannot be set. Hence modify the applicable PTE and TTBR encoding helpers to
accommodate this scenario via 'arm64_lpa2_enabled'.
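
Expressed as a plain C helper, the runtime selection which the updated
macros implement is roughly this (a sketch, pte_addr() itself being a
hypothetical name):

	static inline u64 pte_addr(u64 pte)
	{
		if (arm64_lpa2_enabled)		/* 52 bits PA codec */
			return (pte & PTE_ADDR_LOW) |
			       ((pte & PTE_ADDR_HIGH) << 42);
		return pte & PTE_ADDR_MASK_48;	/* 48 bits PA fallback */
	}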

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/assembler.h     | 16 ++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
 arch/arm64/include/asm/pgtable.h       | 12 ++++++++++--
 3 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index c1543067..e4f67ab 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -629,9 +629,17 @@ alternative_endif
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ldr_l   \pte, arm64_lpa2_enabled
+	cmp     \pte, #1
+	b.ne    .Lskip_lpa2\@
+
 	orr	\pte, \phys, \phys, lsr #42
 	and	\pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
 	and	\pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
+	b	.Ldone_lpa2\@
+.Lskip_lpa2\@:
+	mov	\pte, \phys
+.Ldone_lpa2\@:
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -643,9 +651,17 @@ alternative_endif
 	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
 	lsl	\phys, \phys, #PAGE_SHIFT
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ldr_l   \phys, arm64_lpa2_enabled
+	cmp     \phys, #1
+	b.ne    .Lskip_lpa2\@
+
 	ubfiz	\phys, \pte, #(52 - PAGE_SHIFT - 10), #10
 	bfxil	\phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
 	lsl	\phys, \phys, #PAGE_SHIFT
+	b	.Ldone_lpa2\@
+.Lskip_lpa2\@:
+	and	\phys, \pte, #PTE_ADDR_MASK_48
+.Ldone_lpa2\@:
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 8a3b75e..b98b764 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -176,6 +176,8 @@
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
+#define PTE_ADDR_MASK_48	(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
  */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 9038d05..5365661 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -71,9 +71,17 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
-#define __pte_to_phys(pte)	\
+#define __pte_to_phys_52(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
-#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+#define __phys_to_pte_val_52(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+
+#define __pte_to_phys_48(pte)		(pte_val(pte) & PTE_ADDR_MASK_48)
+#define __phys_to_pte_val_48(phys)	(phys)
+
+#define __pte_to_phys(pte)	\
+	(arm64_lpa2_enabled ? __pte_to_phys_52(pte) : __pte_to_phys_48(pte))
+#define __phys_to_pte_val(phys)	\
+	(arm64_lpa2_enabled ? __phys_to_pte_val_52(phys) : __phys_to_pte_val_48(phys))
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
-- 
2.7.4


* [RFC V3 12/13] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (10 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 11/13] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-09-30 10:35 ` [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K Anshuman Khandual
  12 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

All the required FEAT_LPA2 components for the 52 bit PA range are now in
place. Just allow CONFIG_ARM64_PA_BITS_52 to be selected on 4K and 16K
pages, which in turn selects CONFIG_ARM64_PA_BITS_52_LPA2, activating the
52 bit PA range via FEAT_LPA2.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f58ef62..926a802 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -948,17 +948,17 @@ config ARM64_PA_BITS_48
 	bool "48-bit"
 
 config ARM64_PA_BITS_52
-	bool "52-bit (ARMv8.2)"
-	depends on ARM64_64K_PAGES
+	bool "52-bit"
 	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
 	select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
+	select ARM64_PA_BITS_52_LPA2 if (ARM64_4K_PAGES  || ARM64_16K_PAGES)
 	help
 	  Enable support for a 52-bit physical address space, introduced as
-	  part of the ARMv8.2-LPA extension.
+	  part of the ARMv8.2-LPA or ARMv8.7-LPA2 extension.
 
 	  With this enabled, the kernel will also continue to work on CPUs that
-	  do not support ARMv8.2-LPA, but with some added memory overhead (and
-	  minor performance overhead).
+	  do not support ARMv8.2-LPA or ARMv8.7-LPA2, but with some added memory
+	  overhead (and minor performance overhead).
 
 endchoice
 
-- 
2.7.4


* [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K
  2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
                   ` (11 preceding siblings ...)
  2021-09-30 10:35 ` [RFC V3 12/13] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
@ 2021-09-30 10:35 ` Anshuman Khandual
  2021-10-11 10:16   ` Marc Zyngier
  12 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2021-09-30 10:35 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel
  Cc: suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Stage-2 FEAT_LPA2 support is independent and also orthogonal to FEAT_LPA2
support either in Stage-1 or in the host kernel. Stage-2 IPA range support
is evaluated from the platform via ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 and
gets enabled regardless of Stage-1 translation.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/kvm_pgtable.h | 10 +++++++++-
 arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++--
 arch/arm64/kvm/reset.c               | 14 ++++++++++----
 3 files changed, 42 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
index 0277838..78a9d12 100644
--- a/arch/arm64/include/asm/kvm_pgtable.h
+++ b/arch/arm64/include/asm/kvm_pgtable.h
@@ -29,18 +29,26 @@ typedef u64 kvm_pte_t;
 
 #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
 #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
+#define KVM_PTE_ADDR_51_50		GENMASK(9, 8)
 
 static inline bool kvm_pte_valid(kvm_pte_t pte)
 {
 	return pte & KVM_PTE_VALID;
 }
 
+void set_kvm_lpa2_enabled(void);
+bool get_kvm_lpa2_enabled(void);
+
 static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
 {
 	u64 pa = pte & KVM_PTE_ADDR_MASK;
 
-	if (PAGE_SHIFT == 16)
+	if (PAGE_SHIFT == 16) {
 		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
+	} else {
+		if (get_kvm_lpa2_enabled())
+			pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;
+	}
 
 	return pa;
 }
diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
index f8ceebe..58141bf 100644
--- a/arch/arm64/kvm/hyp/pgtable.c
+++ b/arch/arm64/kvm/hyp/pgtable.c
@@ -49,6 +49,18 @@
 #define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
 #define KVM_MAX_OWNER_ID		1
 
+static bool kvm_lpa2_enabled;
+
+bool get_kvm_lpa2_enabled(void)
+{
+	return kvm_lpa2_enabled;
+}
+
+void set_kvm_lpa2_enabled(void)
+{
+	kvm_lpa2_enabled = true;
+}
+
 struct kvm_pgtable_walk_data {
 	struct kvm_pgtable		*pgt;
 	struct kvm_pgtable_walker	*walker;
@@ -126,8 +138,12 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
 {
 	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
 
-	if (PAGE_SHIFT == 16)
+	if (PAGE_SHIFT == 16) {
 		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
+	} else {
+		if (get_kvm_lpa2_enabled())
+			pte |= FIELD_PREP(KVM_PTE_ADDR_51_50, pa >> 50);
+	}
 
 	return pte;
 }
@@ -540,6 +556,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
 	 */
 	vtcr |= VTCR_EL2_HA;
 
+	if (get_kvm_lpa2_enabled())
+		vtcr |= VTCR_EL2_DS;
+
 	/* Set the vmid bits */
 	vtcr |= (get_vmid_bits(mmfr1) == 16) ?
 		VTCR_EL2_VS_16BIT :
@@ -577,7 +596,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
 	if (prot & KVM_PGTABLE_PROT_W)
 		attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
 
-	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+	if (!get_kvm_lpa2_enabled())
+		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
+
 	attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
 	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
 	*ptep = attr;
diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
index 5ce36b0..97ec387 100644
--- a/arch/arm64/kvm/reset.c
+++ b/arch/arm64/kvm/reset.c
@@ -315,26 +315,32 @@ u32 get_kvm_ipa_limit(void)
 
 int kvm_set_ipa_limit(void)
 {
-	unsigned int parange;
+	unsigned int parange, tgran;
 	u64 mmfr0;
 
 	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
 	parange = cpuid_feature_extract_unsigned_field(mmfr0,
 				ID_AA64MMFR0_PARANGE_SHIFT);
+	tgran = cpuid_feature_extract_unsigned_field(mmfr0,
+				ID_AA64MMFR0_TGRAN_2_SHIFT);
 	/*
 	 * IPA size beyond 48 bits could not be supported
 	 * on either 4K or 16K page size. Hence let's cap
 	 * it to 48 bits, in case it's reported as larger
 	 * on the system.
 	 */
-	if (PAGE_SIZE != SZ_64K)
-		parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
+	if (PAGE_SIZE != SZ_64K) {
+		if (tgran == ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2)
+			set_kvm_lpa2_enabled();
+		else
+			parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
+	}
 
 	/*
 	 * Check with ARMv8.5-GTG that our PAGE_SIZE is supported at
 	 * Stage-2. If not, things will stop very quickly.
 	 */
-	switch (cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_TGRAN_2_SHIFT)) {
+	switch (tgran) {
 	case ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE:
 		kvm_err("PAGE_SIZE not supported at Stage-2, giving up\n");
 		return -EINVAL;
-- 
2.7.4


* Re: [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K
  2021-09-30 10:35 ` [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K Anshuman Khandual
@ 2021-10-11 10:16   ` Marc Zyngier
  2021-10-12  4:24     ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2021-10-11 10:16 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, suzuki.poulose, mark.rutland,
	will, catalin.marinas, james.morse, steven.price

On Thu, 30 Sep 2021 11:35:16 +0100,
Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> 
> Stage-2 FEAT_LPA2 support is independent and also orthogonal to FEAT_LPA2
> support either in Stage-1 or in the host kernel. Stage-2 IPA range support
> is evaluated from the platform via ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 and
> gets enabled regardless of Stage-1 translation.
> 
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/include/asm/kvm_pgtable.h | 10 +++++++++-
>  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++--
>  arch/arm64/kvm/reset.c               | 14 ++++++++++----
>  3 files changed, 42 insertions(+), 7 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> index 0277838..78a9d12 100644
> --- a/arch/arm64/include/asm/kvm_pgtable.h
> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> @@ -29,18 +29,26 @@ typedef u64 kvm_pte_t;
>  
>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
>  #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
> +#define KVM_PTE_ADDR_51_50		GENMASK(9, 8)
>  
>  static inline bool kvm_pte_valid(kvm_pte_t pte)
>  {
>  	return pte & KVM_PTE_VALID;
>  }
>  
> +void set_kvm_lpa2_enabled(void);
> +bool get_kvm_lpa2_enabled(void);
> +
>  static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
>  {
>  	u64 pa = pte & KVM_PTE_ADDR_MASK;
>  
> -	if (PAGE_SHIFT == 16)
> +	if (PAGE_SHIFT == 16) {
>  		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> +	} else {
> +		if (get_kvm_lpa2_enabled())

Having to do a function call just for this test seems bad, specially
for something that is used so often on the fault path.

Why can't this be made a normal capability that indicates LPA support
for the current page size?

> +			pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;

Where are bits 48 and 49?

> +	}
>  
>  	return pa;
>  }
> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
> index f8ceebe..58141bf 100644
> --- a/arch/arm64/kvm/hyp/pgtable.c
> +++ b/arch/arm64/kvm/hyp/pgtable.c
> @@ -49,6 +49,18 @@
>  #define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
>  #define KVM_MAX_OWNER_ID		1
>  
> +static bool kvm_lpa2_enabled;
> +
> +bool get_kvm_lpa2_enabled(void)
> +{
> +	return kvm_lpa2_enabled;
> +}
> +
> +void set_kvm_lpa2_enabled(void)
> +{
> +	kvm_lpa2_enabled = true;
> +}
> +
>  struct kvm_pgtable_walk_data {
>  	struct kvm_pgtable		*pgt;
>  	struct kvm_pgtable_walker	*walker;
> @@ -126,8 +138,12 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
>  {
>  	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
>  
> -	if (PAGE_SHIFT == 16)
> +	if (PAGE_SHIFT == 16) {
>  		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
> +	} else {
> +		if (get_kvm_lpa2_enabled())
> +			pte |= FIELD_PREP(KVM_PTE_ADDR_51_50, pa >> 50);
> +	}
>  
>  	return pte;
>  }
> @@ -540,6 +556,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
>  	 */
>  	vtcr |= VTCR_EL2_HA;
>  
> +	if (get_kvm_lpa2_enabled())
> +		vtcr |= VTCR_EL2_DS;
> +
>  	/* Set the vmid bits */
>  	vtcr |= (get_vmid_bits(mmfr1) == 16) ?
>  		VTCR_EL2_VS_16BIT :
> @@ -577,7 +596,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
>  	if (prot & KVM_PGTABLE_PROT_W)
>  		attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
>  
> -	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> +	if (!get_kvm_lpa2_enabled())
> +		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
> +
>  	attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
>  	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
>  	*ptep = attr;
> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
> index 5ce36b0..97ec387 100644
> --- a/arch/arm64/kvm/reset.c
> +++ b/arch/arm64/kvm/reset.c
> @@ -315,26 +315,32 @@ u32 get_kvm_ipa_limit(void)
>  
>  int kvm_set_ipa_limit(void)
>  {
> -	unsigned int parange;
> +	unsigned int parange, tgran;
>  	u64 mmfr0;
>  
>  	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>  	parange = cpuid_feature_extract_unsigned_field(mmfr0,
>  				ID_AA64MMFR0_PARANGE_SHIFT);
> +	tgran = cpuid_feature_extract_unsigned_field(mmfr0,
> +				ID_AA64MMFR0_TGRAN_2_SHIFT);
>  	/*
>  	 * IPA size beyond 48 bits could not be supported
>  	 * on either 4K or 16K page size. Hence let's cap
>  	 * it to 48 bits, in case it's reported as larger
>  	 * on the system.

Shouldn't you fix this comment?

>  	 */
> -	if (PAGE_SIZE != SZ_64K)
> -		parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
> +	if (PAGE_SIZE != SZ_64K) {
> +		if (tgran == ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2)
> +			set_kvm_lpa2_enabled();
> +		else
> +			parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
> +	}
>  
>  	/*
>  	 * Check with ARMv8.5-GTG that our PAGE_SIZE is supported at
>  	 * Stage-2. If not, things will stop very quickly.
>  	 */
> -	switch (cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_TGRAN_2_SHIFT)) {
> +	switch (tgran) {
>  	case ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE:
>  		kvm_err("PAGE_SIZE not supported at Stage-2, giving up\n");
>  		return -EINVAL;

Another thing I don't see is how you manage TLB invalidation by level
now that we gain a level 0 with 4K pages, breaking the current
assumptions encoded in __tlbi_level().

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K
  2021-10-11 10:16   ` Marc Zyngier
@ 2021-10-12  4:24     ` Anshuman Khandual
  2021-10-12  8:30       ` Marc Zyngier
  0 siblings, 1 reply; 20+ messages in thread
From: Anshuman Khandual @ 2021-10-12  4:24 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, linux-kernel, suzuki.poulose, mark.rutland,
	will, catalin.marinas, james.morse, steven.price

Hello Marc,

On 10/11/21 3:46 PM, Marc Zyngier wrote:
> On Thu, 30 Sep 2021 11:35:16 +0100,
> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>>
>> Stage-2 FEAT_LPA2 support is independent of, and orthogonal to, FEAT_LPA2
>> support in either Stage-1 or the host kernel. Stage-2 IPA range support
>> is evaluated from the platform via ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 and
>> gets enabled regardless of Stage-1 translation.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/kvm_pgtable.h | 10 +++++++++-
>>  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++--
>>  arch/arm64/kvm/reset.c               | 14 ++++++++++----
>>  3 files changed, 42 insertions(+), 7 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
>> index 0277838..78a9d12 100644
>> --- a/arch/arm64/include/asm/kvm_pgtable.h
>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
>> @@ -29,18 +29,26 @@ typedef u64 kvm_pte_t;
>>  
>>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
>>  #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
>> +#define KVM_PTE_ADDR_51_50		GENMASK(9, 8)
>>  
>>  static inline bool kvm_pte_valid(kvm_pte_t pte)
>>  {
>>  	return pte & KVM_PTE_VALID;
>>  }
>>  
>> +void set_kvm_lpa2_enabled(void);
>> +bool get_kvm_lpa2_enabled(void);
>> +
>>  static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
>>  {
>>  	u64 pa = pte & KVM_PTE_ADDR_MASK;
>>  
>> -	if (PAGE_SHIFT == 16)
>> +	if (PAGE_SHIFT == 16) {
>>  		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
>> +	} else {
>> +		if (get_kvm_lpa2_enabled())
> 
> Having to do a function call just for this test seems bad, especially
> for something that is used so often on the fault path.
> 
> Why can't this be made a normal capability that indicates LPA support
> for the current page size?

Although I could look into making this a normal capability check, would
not a static-key-based implementation be preferable if the function call
based construct here is too expensive?

Originally, I avoided the capability method for stage-2 because it would
have been difficult for stage-1, where FEAT_LPA2 detection is required
much earlier during boot, before the cpu capability framework comes up.
Hence I just followed a simple variable method for both stage-1 and
stage-2, keeping them the same.

> 
>> +			pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;
> 
> Where are bits 48 and 49?

Unlike with the existing FEAT_LPA feature, bits 48 and 49 stay in place
as part of the PA itself. Only bits 50 and 51 move into bits 8 and 9
when creating a PTE.
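
That is, sketching the 4K descriptor layout as I read the ARM ARM
(illustration only, not the patch code):

/*
 * FEAT_LPA2, 4K pages:
 *
 *   PTE[49:12] holds PA[49:12]  (bits 48 and 49 stay in place)
 *   PTE[9:8]   holds PA[51:50]  (only these two bits are relocated)
 */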

> 
>> +	}
>>  
>>  	return pa;
>>  }
>> diff --git a/arch/arm64/kvm/hyp/pgtable.c b/arch/arm64/kvm/hyp/pgtable.c
>> index f8ceebe..58141bf 100644
>> --- a/arch/arm64/kvm/hyp/pgtable.c
>> +++ b/arch/arm64/kvm/hyp/pgtable.c
>> @@ -49,6 +49,18 @@
>>  #define KVM_INVALID_PTE_OWNER_MASK	GENMASK(9, 2)
>>  #define KVM_MAX_OWNER_ID		1
>>  
>> +static bool kvm_lpa2_enabled;
>> +
>> +bool get_kvm_lpa2_enabled(void)
>> +{
>> +	return kvm_lpa2_enabled;
>> +}
>> +
>> +void set_kvm_lpa2_enabled(void)
>> +{
>> +	kvm_lpa2_enabled = true;
>> +}
>> +
>>  struct kvm_pgtable_walk_data {
>>  	struct kvm_pgtable		*pgt;
>>  	struct kvm_pgtable_walker	*walker;
>> @@ -126,8 +138,12 @@ static kvm_pte_t kvm_phys_to_pte(u64 pa)
>>  {
>>  	kvm_pte_t pte = pa & KVM_PTE_ADDR_MASK;
>>  
>> -	if (PAGE_SHIFT == 16)
>> +	if (PAGE_SHIFT == 16) {
>>  		pte |= FIELD_PREP(KVM_PTE_ADDR_51_48, pa >> 48);
>> +	} else {
>> +		if (get_kvm_lpa2_enabled())
>> +			pte |= FIELD_PREP(KVM_PTE_ADDR_51_50, pa >> 50);
>> +	}
>>  
>>  	return pte;
>>  }
>> @@ -540,6 +556,9 @@ u64 kvm_get_vtcr(u64 mmfr0, u64 mmfr1, u32 phys_shift)
>>  	 */
>>  	vtcr |= VTCR_EL2_HA;
>>  
>> +	if (get_kvm_lpa2_enabled())
>> +		vtcr |= VTCR_EL2_DS;
>> +
>>  	/* Set the vmid bits */
>>  	vtcr |= (get_vmid_bits(mmfr1) == 16) ?
>>  		VTCR_EL2_VS_16BIT :
>> @@ -577,7 +596,9 @@ static int stage2_set_prot_attr(struct kvm_pgtable *pgt, enum kvm_pgtable_prot p
>>  	if (prot & KVM_PGTABLE_PROT_W)
>>  		attr |= KVM_PTE_LEAF_ATTR_LO_S2_S2AP_W;
>>  
>> -	attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
>> +	if (!get_kvm_lpa2_enabled())
>> +		attr |= FIELD_PREP(KVM_PTE_LEAF_ATTR_LO_S2_SH, sh);
>> +
>>  	attr |= KVM_PTE_LEAF_ATTR_LO_S2_AF;
>>  	attr |= prot & KVM_PTE_LEAF_ATTR_HI_SW;
>>  	*ptep = attr;
>> diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c
>> index 5ce36b0..97ec387 100644
>> --- a/arch/arm64/kvm/reset.c
>> +++ b/arch/arm64/kvm/reset.c
>> @@ -315,26 +315,32 @@ u32 get_kvm_ipa_limit(void)
>>  
>>  int kvm_set_ipa_limit(void)
>>  {
>> -	unsigned int parange;
>> +	unsigned int parange, tgran;
>>  	u64 mmfr0;
>>  
>>  	mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);
>>  	parange = cpuid_feature_extract_unsigned_field(mmfr0,
>>  				ID_AA64MMFR0_PARANGE_SHIFT);
>> +	tgran = cpuid_feature_extract_unsigned_field(mmfr0,
>> +				ID_AA64MMFR0_TGRAN_2_SHIFT);
>>  	/*
>>  	 * IPA size beyond 48 bits could not be supported
>>  	 * on either 4K or 16K page size. Hence let's cap
>>  	 * it to 48 bits, in case it's reported as larger
>>  	 * on the system.
> 
> Shouldn't you fix this comment?

Ahh! Sure, I will fix the comment.

> 
>>  	 */
>> -	if (PAGE_SIZE != SZ_64K)
>> -		parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
>> +	if (PAGE_SIZE != SZ_64K) {
>> +		if (tgran == ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2)
>> +			set_kvm_lpa2_enabled();
>> +		else
>> +			parange = min(parange, (unsigned int)ID_AA64MMFR0_PARANGE_48);
>> +	}
>>  
>>  	/*
>>  	 * Check with ARMv8.5-GTG that our PAGE_SIZE is supported at
>>  	 * Stage-2. If not, things will stop very quickly.
>>  	 */
>> -	switch (cpuid_feature_extract_unsigned_field(mmfr0, ID_AA64MMFR0_TGRAN_2_SHIFT)) {
>> +	switch (tgran) {
>>  	case ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE:
>>  		kvm_err("PAGE_SIZE not supported at Stage-2, giving up\n");
>>  		return -EINVAL;
> 
> Another thing I don't see is how you manage TLB invalidation by level
> now that we gain a level 0 at 4kB, breaking the current assumptions
> encoded in __tlbi_level().

Right, I guess something like this (not build tested) will be required,
as level 0 for 4K and level 1 for 16K only make sense when FEAT_LPA2 is
implemented; otherwise it falls back to the default behaviour, i.e. no
table level hint is provided (TTL[3:2] is 0b00). Is there any other
concern which I might be missing here?

--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -104,8 +104,7 @@ static inline unsigned long get_trans_granule(void)
 #define __tlbi_level(op, addr, level) do {                             \
        u64 arg = addr;                                                 \
                                                                        \
-       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&               \
-           level) {                                                    \
+       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL)) {               \
                u64 ttl = level & 3;                                    \
                ttl |= get_trans_granule() << 2;                        \
                arg &= ~TLBI_TTL_MASK;                                  \

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K
  2021-10-12  4:24     ` Anshuman Khandual
@ 2021-10-12  8:30       ` Marc Zyngier
  2021-10-13  3:28         ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Marc Zyngier @ 2021-10-12  8:30 UTC (permalink / raw)
  To: Anshuman Khandual
  Cc: linux-arm-kernel, linux-kernel, suzuki.poulose, mark.rutland,
	will, catalin.marinas, james.morse, steven.price

On Tue, 12 Oct 2021 05:24:15 +0100,
Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> 
> Hello Marc,
> 
> On 10/11/21 3:46 PM, Marc Zyngier wrote:
> > On Thu, 30 Sep 2021 11:35:16 +0100,
> > Anshuman Khandual <anshuman.khandual@arm.com> wrote:
> >>
> >> Stage-2 FEAT_LPA2 support is independent of, and orthogonal to, FEAT_LPA2
> >> support in either Stage-1 or the host kernel. Stage-2 IPA range support
> >> is evaluated from the platform via ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 and
> >> gets enabled regardless of Stage-1 translation.
> >>
> >> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> >> ---
> >>  arch/arm64/include/asm/kvm_pgtable.h | 10 +++++++++-
> >>  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++--
> >>  arch/arm64/kvm/reset.c               | 14 ++++++++++----
> >>  3 files changed, 42 insertions(+), 7 deletions(-)
> >>
> >> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
> >> index 0277838..78a9d12 100644
> >> --- a/arch/arm64/include/asm/kvm_pgtable.h
> >> +++ b/arch/arm64/include/asm/kvm_pgtable.h
> >> @@ -29,18 +29,26 @@ typedef u64 kvm_pte_t;
> >>  
> >>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
> >>  #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
> >> +#define KVM_PTE_ADDR_51_50		GENMASK(9, 8)
> >>  
> >>  static inline bool kvm_pte_valid(kvm_pte_t pte)
> >>  {
> >>  	return pte & KVM_PTE_VALID;
> >>  }
> >>  
> >> +void set_kvm_lpa2_enabled(void);
> >> +bool get_kvm_lpa2_enabled(void);
> >> +
> >>  static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
> >>  {
> >>  	u64 pa = pte & KVM_PTE_ADDR_MASK;
> >>  
> >> -	if (PAGE_SHIFT == 16)
> >> +	if (PAGE_SHIFT == 16) {
> >>  		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
> >> +	} else {
> >> +		if (get_kvm_lpa2_enabled())
> > 
> > Having to do a function call just for this test seems bad, especially
> > for something that is used so often on the fault path.
> > 
> > Why can't this be made a normal capability that indicates LPA support
> > for the current page size?
> 
> Although I could look into making this a normal capability check, would
> not a static-key-based implementation be preferable if the function call
> based construct here is too expensive?

A capability *is* a static key. Especially if you make it final.

> Originally, I avoided the capability method for stage-2 because it would
> have been difficult for stage-1, where FEAT_LPA2 detection is required
> much earlier during boot, before the cpu capability framework comes up.
> Hence I just followed a simple variable method for both stage-1 and
> stage-2, keeping them the same.

I think you'll have to find a way to make it work with a capability
for S1 too. Capabilities can be used even when not final, and you may
have to do something similar.

> > 
> >> +			pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;
> > 
> > Where are bits 48 and 49?
> 
> Unlike with the existing FEAT_LPA feature, bits 48 and 49 stay in place
> as part of the PA itself. Only bits 50 and 51 move into bits 8 and 9
> when creating a PTE.

So why are you actively dropping these bits? Hint: look at
KVM_PTE_ADDR_MASK and the way it is used to extract the initial value
of 'pa'.

[...]

> > Another thing I don't see is how you manage TLB invalidation by level
> > now that we gain a level 0 with 4K pages, breaking the current
> > assumptions encoded in __tlbi_level().
> 
> Right, I guess something like this (not build tested) will be required,
> as level 0 for 4K and level 1 for 16K only make sense when FEAT_LPA2 is
> implemented; otherwise it falls back to the default behaviour, i.e. no
> table level hint is provided (TTL[3:2] is 0b00). Is there any other
> concern which I might be missing here?
> 
> --- a/arch/arm64/include/asm/tlbflush.h
> +++ b/arch/arm64/include/asm/tlbflush.h
> @@ -104,8 +104,7 @@ static inline unsigned long get_trans_granule(void)
>  #define __tlbi_level(op, addr, level) do {                             \
>         u64 arg = addr;                                                 \
>                                                                         \
> -       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&               \
> -           level) {                                                    \
> +       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL)) {               \
>                 u64 ttl = level & 3;                                    \
>                 ttl |= get_trans_granule() << 2;                        \
>                 arg &= ~TLBI_TTL_MASK;                                  \
> 

That's a start, but 0 has always meant 'at any level' until now. You
will have to audit all the call sites and work out whether passing 0 is
still valid when they don't track the actual level.

	M.

-- 
Without deviation from the norm, progress is not possible.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-09-30 10:35 ` [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
@ 2021-10-12 10:41   ` Suzuki K Poulose
  2021-10-13  2:55     ` Anshuman Khandual
  0 siblings, 1 reply; 20+ messages in thread
From: Suzuki K Poulose @ 2021-10-12 10:41 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel
  Cc: mark.rutland, will, catalin.marinas, maz, james.morse, steven.price

On 30/09/2021 11:35, Anshuman Khandual wrote:
> FEAT_LPA2 requires different PTE representation formats for both the 4K
> and 16K page size configs. This adds new FEAT_LPA2 specific PTE encodings
> as per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The
> updated helpers are used when FEAT_LPA2 gets enabled via
> CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and
> the phys_to_ttbr() helper remain the same for FEAT_LPA2 as for FEAT_LPA.
> It also updates the 'phys_to_pte' helper to accept a temporary variable
> and changes the impacted call sites.
> 
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/include/asm/assembler.h     | 14 +++++++++++---
>   arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>   arch/arm64/include/asm/pgtable.h       |  4 ++++
>   3 files changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index 3fbe04a..c1543067 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -628,6 +628,10 @@ alternative_endif
>   	 */
>   	orr	\pte, \phys, \phys, lsr #36
>   	and	\pte, \pte, #PTE_ADDR_MASK
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +	orr	\pte, \phys, \phys, lsr #42
> +	and	\pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
> +	and	\pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
>   #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>   	mov	\pte, \phys
>   #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
> @@ -635,9 +639,13 @@ alternative_endif
>   
>   	.macro	pte_to_phys, phys, pte
>   #ifdef CONFIG_ARM64_PA_BITS_52_LPA
> -	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
> -	bfxil	\phys, \pte, #16, #32
> -	lsl	\phys, \phys, #16
> +	ubfiz	\phys, \pte, #(48 - PAGE_SHIFT - 12), #16
> +	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)

nit: This looks like an unrelated change and is better suited for the 
previous patch.


Suzuki

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-10-12 10:41   ` Suzuki K Poulose
@ 2021-10-13  2:55     ` Anshuman Khandual
  0 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-10-13  2:55 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel, linux-kernel
  Cc: mark.rutland, will, catalin.marinas, maz, james.morse, steven.price



On 10/12/21 4:11 PM, Suzuki K Poulose wrote:
> On 30/09/2021 11:35, Anshuman Khandual wrote:
>> FEAT_LPA2 requires different PTE representation formats for both the 4K
>> and 16K page size configs. This adds new FEAT_LPA2 specific PTE encodings
>> as per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The
>> updated helpers are used when FEAT_LPA2 gets enabled via
>> CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and
>> the phys_to_ttbr() helper remain the same for FEAT_LPA2 as for FEAT_LPA.
>> It also updates the 'phys_to_pte' helper to accept a temporary variable
>> and changes the impacted call sites.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   arch/arm64/include/asm/assembler.h     | 14 +++++++++++---
>>   arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>>   arch/arm64/include/asm/pgtable.h       |  4 ++++
>>   3 files changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> index 3fbe04a..c1543067 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -628,6 +628,10 @@ alternative_endif
>>        */
>>       orr    \pte, \phys, \phys, lsr #36
>>       and    \pte, \pte, #PTE_ADDR_MASK
>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>> +    orr    \pte, \phys, \phys, lsr #42
>> +    and    \pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
>> +    and    \pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
>>   #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>>       mov    \pte, \phys
>>   #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
>> @@ -635,9 +639,13 @@ alternative_endif
>>         .macro    pte_to_phys, phys, pte
>>   #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>> -    ubfiz    \phys, \pte, #(48 - 16 - 12), #16
>> -    bfxil    \phys, \pte, #16, #32
>> -    lsl    \phys, \phys, #16
>> +    ubfiz    \phys, \pte, #(48 - PAGE_SHIFT - 12), #16
>> +    bfxil    \phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
> 
> nit: This looks like an unrelated change and is better suited for the previous patch.

I changed the existing FEAT_LPA encodings here to use PAGE_SHIFT just
to match the new ones being added for FEAT_LPA2, but it is reasonable
to fold them into the previous patch instead.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K
  2021-10-12  8:30       ` Marc Zyngier
@ 2021-10-13  3:28         ` Anshuman Khandual
  0 siblings, 0 replies; 20+ messages in thread
From: Anshuman Khandual @ 2021-10-13  3:28 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: linux-arm-kernel, linux-kernel, suzuki.poulose, mark.rutland,
	will, catalin.marinas, james.morse, steven.price



On 10/12/21 2:00 PM, Marc Zyngier wrote:
> On Tue, 12 Oct 2021 05:24:15 +0100,
> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>>
>> Hello Marc,
>>
>> On 10/11/21 3:46 PM, Marc Zyngier wrote:
>>> On Thu, 30 Sep 2021 11:35:16 +0100,
>>> Anshuman Khandual <anshuman.khandual@arm.com> wrote:
>>>>
>>>> Stage-2 FEAT_LPA2 support is independent of, and orthogonal to, FEAT_LPA2
>>>> support in either Stage-1 or the host kernel. Stage-2 IPA range support
>>>> is evaluated from the platform via ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2 and
>>>> gets enabled regardless of Stage-1 translation.
>>>>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>>  arch/arm64/include/asm/kvm_pgtable.h | 10 +++++++++-
>>>>  arch/arm64/kvm/hyp/pgtable.c         | 25 +++++++++++++++++++++++--
>>>>  arch/arm64/kvm/reset.c               | 14 ++++++++++----
>>>>  3 files changed, 42 insertions(+), 7 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/kvm_pgtable.h b/arch/arm64/include/asm/kvm_pgtable.h
>>>> index 0277838..78a9d12 100644
>>>> --- a/arch/arm64/include/asm/kvm_pgtable.h
>>>> +++ b/arch/arm64/include/asm/kvm_pgtable.h
>>>> @@ -29,18 +29,26 @@ typedef u64 kvm_pte_t;
>>>>  
>>>>  #define KVM_PTE_ADDR_MASK		GENMASK(47, PAGE_SHIFT)
>>>>  #define KVM_PTE_ADDR_51_48		GENMASK(15, 12)
>>>> +#define KVM_PTE_ADDR_51_50		GENMASK(9, 8)
>>>>  
>>>>  static inline bool kvm_pte_valid(kvm_pte_t pte)
>>>>  {
>>>>  	return pte & KVM_PTE_VALID;
>>>>  }
>>>>  
>>>> +void set_kvm_lpa2_enabled(void);
>>>> +bool get_kvm_lpa2_enabled(void);
>>>> +
>>>>  static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
>>>>  {
>>>>  	u64 pa = pte & KVM_PTE_ADDR_MASK;
>>>>  
>>>> -	if (PAGE_SHIFT == 16)
>>>> +	if (PAGE_SHIFT == 16) {
>>>>  		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
>>>> +	} else {
>>>> +		if (get_kvm_lpa2_enabled())
>>>
>>> Having to do a function call just for this test seems bad, especially
>>> for something that is used so often on the fault path.
>>>
>>> Why can't this be made a normal capability that indicates LPA support
>>> for the current page size?
>>
>> Although I could look into making this a normal capability check, would
>> not a static-key-based implementation be preferable if the function call
>> based construct here is too expensive?
> 
> A capability *is* a static key. Especially if you make it final.

Sure.

> 
>> Originally, I avoided the capability method for stage-2 because it would
>> have been difficult for stage-1, where FEAT_LPA2 detection is required
>> much earlier during boot, before the cpu capability framework comes up.
>> Hence I just followed a simple variable method for both stage-1 and
>> stage-2, keeping them the same.
> 
> I think you'll have to find a way to make it work with a capability
> for S1 too. Capabilities can be used even when not final, and you may
> have to do something similar.

Sure, I will explore that.
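
Probably something like the below would be my starting point (just a
sketch; the ARM64_HAS_LPA2 capability name and the match logic are
invented here):

static bool has_lpa2(const struct arm64_cpu_capabilities *entry, int scope)
{
	u64 mmfr0 = read_sanitised_ftr_reg(SYS_ID_AA64MMFR0_EL1);

	/* TGRAN advertises LPA2 support for the current page size */
	return cpuid_feature_extract_unsigned_field(mmfr0,
			ID_AA64MMFR0_TGRAN_SHIFT) == ID_AA64MMFR0_TGRAN_LPA2;
}

and, in the arm64_features[] table:

	{
		.desc = "52-bit Physical Address (FEAT_LPA2)",
		.capability = ARM64_HAS_LPA2,
		.type = ARM64_CPUCAP_BOOT_CPU_FEATURE,
		.matches = has_lpa2,
	},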

> 
>>>
>>>> +			pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;
>>>
>>> Where are bits 48 and 49?
>>
>> Unlike with the existing FEAT_LPA feature, bits 48 and 49 stay in place
>> as part of the PA itself. Only bits 50 and 51 move into bits 8 and 9
>> when creating a PTE.
> 
> So why are you actively dropping these bits? Hint: look at
> KVM_PTE_ADDR_MASK and the way it is used to extract the initial value
> of 'pa'.

Right, this will need another address mask, i.e. KVM_PTE_ADDR_MASK_50,
which will extract the PA field in both kvm_pte_to_phys() and
kvm_phys_to_pte().
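
Something along these lines, as an untested sketch (KVM_PTE_ADDR_MASK_50
being the placeholder name):

#define KVM_PTE_ADDR_MASK_50	GENMASK(49, PAGE_SHIFT)

static inline u64 kvm_pte_to_phys(kvm_pte_t pte)
{
	u64 pa = pte & KVM_PTE_ADDR_MASK;

	if (PAGE_SHIFT == 16) {
		pa |= FIELD_GET(KVM_PTE_ADDR_51_48, pte) << 48;
	} else if (get_kvm_lpa2_enabled()) {
		/* keep bits 49:48 instead of masking them out */
		pa = pte & KVM_PTE_ADDR_MASK_50;
		pa |= FIELD_GET(KVM_PTE_ADDR_51_50, pte) << 50;
	}

	return pa;
}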

> 
> [...]
> 
>>> Another thing I don't see is how you manage TLB invalidation by level
>>> now that we gain a level 0 with 4K pages, breaking the current
>>> assumptions encoded in __tlbi_level().
>>
>> Right, I guess something like this (not build tested) will be required,
>> as level 0 for 4K and level 1 for 16K only make sense when FEAT_LPA2 is
>> implemented; otherwise it falls back to the default behaviour, i.e. no
>> table level hint is provided (TTL[3:2] is 0b00). Is there any other
>> concern which I might be missing here?
>>
>> --- a/arch/arm64/include/asm/tlbflush.h
>> +++ b/arch/arm64/include/asm/tlbflush.h
>> @@ -104,8 +104,7 @@ static inline unsigned long get_trans_granule(void)
>>  #define __tlbi_level(op, addr, level) do {                             \
>>         u64 arg = addr;                                                 \
>>                                                                         \
>> -       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&               \
>> -           level) {                                                    \
>> +       if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL)) {               \
>>                 u64 ttl = level & 3;                                    \
>>                 ttl |= get_trans_granule() << 2;                        \
>>                 arg &= ~TLBI_TTL_MASK;                                  \
>>
> 
> That's a start, but 0 has always meant 'at any level' until now. You
> will have to audit all the call sites and work out whether passing 0 is
> still valid when they don't track the actual level.

Hmm, sure, I will audit the call sites. But I am wondering: if the
caller is not sure about the level, should it not just use
__tlbi(op, arg) instead? It seems __tlbi_level(op, addr, level) should
only accept a level of 0 when the level has actually been determined to
be 0. The real impact will depend on whether FEAT_LPA2 is implemented;
otherwise it falls back to __tlbi(op, arg). Basically, should we convert
the existing sites which call __tlbi_level() with level 0 (without
determining it) to use __tlbi() directly?
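
Or alternatively, make the 'unknown level' convention explicit with a
sentinel, e.g. (untested sketch; TLBI_TTL_UNKNOWN is a name invented
here):

#define TLBI_TTL_UNKNOWN	INT_MAX

#define __tlbi_level(op, addr, level) do {				\
	u64 arg = addr;							\
									\
	if (cpus_have_const_cap(ARM64_HAS_ARMv8_4_TTL) &&		\
	    (level) != TLBI_TTL_UNKNOWN) {				\
		u64 ttl = (level) & 3;					\
									\
		ttl |= get_trans_granule() << 2;			\
		arg &= ~TLBI_TTL_MASK;					\
		arg |= FIELD_PREP(TLBI_TTL_MASK, ttl);			\
	}								\
									\
	__tlbi(op, arg);						\
} while (0)

Call sites which genuinely do not know the level would then pass
TLBI_TTL_UNKNOWN explicitly, rather than overloading 0.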

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2021-10-13  3:28 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-30 10:35 [RFC V3 00/13] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 01/13] arm64/mm: Dynamically initialize protection_map[] Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 02/13] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 03/13] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 04/13] arm64/mm: Add FEAT_LPA2 specific VTCR_EL2.DS field Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 05/13] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 06/13] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 07/13] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
2021-10-12 10:41   ` Suzuki K Poulose
2021-10-13  2:55     ` Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 08/13] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 09/13] arm64/mm: Add __cpu_secondary_check52bitpa() Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 10/13] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 11/13] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 12/13] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
2021-09-30 10:35 ` [RFC V3 13/13] KVM: arm64: Enable FEAT_LPA2 based 52 bits IPA size on 4K and 16K Anshuman Khandual
2021-10-11 10:16   ` Marc Zyngier
2021-10-12  4:24     ` Anshuman Khandual
2021-10-12  8:30       ` Marc Zyngier
2021-10-13  3:28         ` Anshuman Khandual

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).