* [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages)
@ 2021-07-14  2:21 ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

This series enables 52-bit PA support for the 4K and 16K page size configs
via the existing CONFIG_ARM64_PA_BITS_52, utilizing the new arch feature
FEAT_LPA2 which is available from ARM v8.7. The IDMAP needs changes to
accommodate two additional levels of page tables in certain scenarios such
as (4K|39VA|52PA), but the same problem already exists for (16K|36VA|48PA)
and needs fixing. I am currently working on the IDMAP fix for 16K and will
later enable it for FEAT_LPA2 as well.
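
To see why two extra levels are needed, here is a back-of-the-envelope
sketch (illustrative only, not part of the series) of the number of
translation levels required for a given VA width. Each level resolves
(page_shift - 3) VA bits, because a table page holds PAGE_SIZE / 8
entries:

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))
#define LEVELS(va_bits, page_shift) \
	DIV_ROUND_UP((va_bits) - (page_shift), (page_shift) - 3)

/* 4K  pages: LEVELS(39, 12) = 3, LEVELS(52, 12) = 5 => two extra levels */
/* 16K pages: LEVELS(36, 14) = 2, LEVELS(48, 14) = 4 => two extra levels */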

This series applies on v5.14-rc1.

Testing:

Build and boot tested (individual patches) on all existing and new
FEAT_LPA2-enabled config combinations.

Pending:

- Enable the IDMAP for FEAT_LPA2
- Enable the 52-bit VA range on 4K/16K page sizes
- Evaluate KVM and SMMU impacts from FEAT_LPA2

Anshuman Khandual (10):
  mm/mmap: Dynamically initialize protection_map[]
  arm64/mm: Consolidate TCR_EL1 fields
  arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
  arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
  arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
  arm64/mm: Add FEAT_LPA2 specific encoding
  arm64/mm: Detect and enable FEAT_LPA2
  arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
  arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
  arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES

 arch/arm64/Kconfig                      |  9 +++++-
 arch/arm64/include/asm/assembler.h      | 48 ++++++++++++++++++++++------
 arch/arm64/include/asm/kernel-pgtable.h |  4 +--
 arch/arm64/include/asm/memory.h         |  1 +
 arch/arm64/include/asm/pgtable-hwdef.h  | 28 ++++++++++++++---
 arch/arm64/include/asm/pgtable.h        | 18 +++++++++--
 arch/arm64/include/asm/sysreg.h         |  9 +++---
 arch/arm64/kernel/head.S                | 55 ++++++++++++++++++++++++++-------
 arch/arm64/mm/mmu.c                     |  3 ++
 arch/arm64/mm/pgd.c                     |  2 +-
 arch/arm64/mm/proc.S                    | 11 ++++++-
 arch/arm64/mm/ptdump.c                  | 26 ++++++++++++++--
 mm/mmap.c                               | 26 +++++++++++++---
 13 files changed, 195 insertions(+), 45 deletions(-)

-- 
2.7.4


* [RFC 01/10] mm/mmap: Dynamically initialize protection_map[]
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

The protection_map[] elements (__PXXX and __SXXX) might contain runtime
variables on certain platforms such as arm64, which prevents a successful
build because of the current static initialization. So defer the
initialization until mmap_init() via a new helper init_protection_map().
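
As a minimal standalone illustration (hypothetical names, not kernel
code) of why a brace initializer stops building once a protection value
depends on a boot-time variable:

unsigned long boot_time_flag;		/* e.g. set from early boot code */

#define PROT_BASE	(boot_time_flag ? (3UL << 8) : 0UL)

/* unsigned long map_static[1] = { PROT_BASE };
 *	-> error: initializer element is not a compile-time constant */

unsigned long map_runtime[1];

static void map_init(void)
{
	map_runtime[0] = PROT_BASE;	/* fine when evaluated at run time */
}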

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 mm/mmap.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index ca54d36..a95b078 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -100,10 +100,7 @@ static void unmap_region(struct mm_struct *mm,
  *								w: (no) no
  *								x: (yes) yes
  */
-pgprot_t protection_map[16] __ro_after_init = {
-	__P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
-	__S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
-};
+pgprot_t protection_map[16] __ro_after_init;
 
 #ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
 static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
@@ -3708,6 +3705,26 @@ void mm_drop_all_locks(struct mm_struct *mm)
 	mutex_unlock(&mm_all_locks_mutex);
 }
 
+static void init_protection_map(void)
+{
+	protection_map[0] = __P000;
+	protection_map[1] = __P001;
+	protection_map[2] = __P010;
+	protection_map[3] = __P011;
+	protection_map[4] = __P100;
+	protection_map[5] = __P101;
+	protection_map[6] = __P110;
+	protection_map[7] = __P111;
+	protection_map[8] = __S000;
+	protection_map[9] = __S001;
+	protection_map[10] = __S010;
+	protection_map[11] = __S011;
+	protection_map[12] = __S100;
+	protection_map[13] = __S101;
+	protection_map[14] = __S110;
+	protection_map[15] = __S111;
+}
+
 /*
  * initialise the percpu counter for VM
  */
@@ -3715,6 +3732,7 @@ void __init mmap_init(void)
 {
 	int ret;
 
+	init_protection_map();
 	ret = percpu_counter_init(&vm_committed_as, 0, GFP_KERNEL);
 	VM_BUG_ON(ret);
 }
-- 
2.7.4


* [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

This renames and moves the SYS_TCR_EL1_TCMA1 and SYS_TCR_EL1_TCMA0
definitions into pgtable-hwdef.h, thus consolidating all TCR_EL1 fields in
a single header. This does not cause any functional change.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 2 ++
 arch/arm64/include/asm/sysreg.h        | 4 ----
 arch/arm64/mm/proc.S                   | 2 +-
 3 files changed, 3 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 40085e5..66671ff 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -273,6 +273,8 @@
 #define TCR_NFD1		(UL(1) << 54)
 #define TCR_E0PD0		(UL(1) << 55)
 #define TCR_E0PD1		(UL(1) << 56)
+#define TCR_TCMA0		(UL(1) << 57)
+#define TCR_TCMA1		(UL(1) << 58)
 
 /*
  * TTBR.
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 7b9c3ac..5cbfaf6 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -1059,10 +1059,6 @@
 #define CPACR_EL1_ZEN_EL0EN	(BIT(17)) /* enable EL0 access, if EL1EN set */
 #define CPACR_EL1_ZEN		(CPACR_EL1_ZEN_EL1EN | CPACR_EL1_ZEN_EL0EN)
 
-/* TCR EL1 Bit Definitions */
-#define SYS_TCR_EL1_TCMA1	(BIT(58))
-#define SYS_TCR_EL1_TCMA0	(BIT(57))
-
 /* GCR_EL1 Definitions */
 #define SYS_GCR_EL1_RRND	(BIT(16))
 #define SYS_GCR_EL1_EXCL_MASK	0xffffUL
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 35936c5..1ae0c2b 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -46,7 +46,7 @@
 #endif
 
 #ifdef CONFIG_KASAN_HW_TAGS
-#define TCR_MTE_FLAGS SYS_TCR_EL1_TCMA1 | TCR_TBI1 | TCR_TBID1
+#define TCR_MTE_FLAGS TCR_TCMA1 | TCR_TBI1 | TCR_TBID1
 #else
 /*
  * The mte_zero_clear_page_tags() implementation uses DC GZVA, which relies on
-- 
2.7.4


* [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

As per the ARM ARM (0487G.A), the TCR_EL1.DS field controls whether 52-bit
input and output addresses are supported on the 4K and 16K page size
configurations, when FEAT_LPA2 is known to have been implemented. This adds
the TCR_DS field definition, which will be used when FEAT_LPA2 gets enabled.

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 66671ff..1eb5574 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -275,6 +275,7 @@
 #define TCR_E0PD1		(UL(1) << 56)
 #define TCR_TCMA0		(UL(1) << 57)
 #define TCR_TCMA1		(UL(1) << 58)
+#define TCR_DS			(UL(1) << 59)
 
 /*
  * TTBR.
-- 
2.7.4


* [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2]
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

PAGE_SIZE support is tested against the possible minimum and maximum values
of its respective ID_AA64MMFR0.TGRAN field, depending on whether the field
is signed or unsigned. A FEAT_LPA2 implementation additionally needs to be
validated for 4K and 16K page sizes via feature specific ID_AA64MMFR0.TGRAN
values. Hence add the FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] values as
per the ARM ARM (0487G.A).
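
For reference, a user-space sketch of the signed-field handling (assumes
the values added here; TGRAN4 lives in ID_AA64MMFR0_EL1[31:28] and is a
signed field, so 0xf must be read as -1, i.e. not implemented):

#include <stdint.h>

static int tgran4_supports_lpa2(uint64_t mmfr0)
{
	/* sign-extend bits [31:28] */
	int64_t tgran4 = (int64_t)(mmfr0 << (64 - 28 - 4)) >> 60;

	/* -1 = NI, 0 = supported, 1 = ID_AA64MMFR0_TGRAN4_LPA2 */
	return tgran4 >= 1;
}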

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/sysreg.h | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 5cbfaf6..deecde0 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -849,16 +849,19 @@
 
 #define ID_AA64MMFR0_TGRAN4_NI		0xf
 #define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
+#define ID_AA64MMFR0_TGRAN4_LPA2	0x1
 #define ID_AA64MMFR0_TGRAN64_NI		0xf
 #define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
 #define ID_AA64MMFR0_TGRAN16_NI		0x0
 #define ID_AA64MMFR0_TGRAN16_SUPPORTED	0x1
+#define ID_AA64MMFR0_TGRAN16_LPA2	0x2
 #define ID_AA64MMFR0_PARANGE_48		0x5
 #define ID_AA64MMFR0_PARANGE_52		0x6
 
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_DEFAULT	0x0
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_NONE	0x1
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MIN	0x2
+#define ID_AA64MMFR0_TGRAN_2_SUPPORTED_LPA2	0x3
 #define ID_AA64MMFR0_TGRAN_2_SUPPORTED_MAX	0x7
 
 #ifdef CONFIG_ARM64_PA_BITS_52
@@ -1030,10 +1033,12 @@
 #define ID_AA64MMFR0_TGRAN_SHIFT		ID_AA64MMFR0_TGRAN4_SHIFT
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_TGRAN4_SUPPORTED
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	0x7
+#define ID_AA64MMFR0_TGRAN_LPA2			ID_AA64MMFR0_TGRAN4_LPA2
 #elif defined(CONFIG_ARM64_16K_PAGES)
 #define ID_AA64MMFR0_TGRAN_SHIFT		ID_AA64MMFR0_TGRAN16_SHIFT
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_TGRAN16_SUPPORTED
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	0xF
+#define ID_AA64MMFR0_TGRAN_LPA2			ID_AA64MMFR0_TGRAN16_LPA2
 #elif defined(CONFIG_ARM64_64K_PAGES)
 #define ID_AA64MMFR0_TGRAN_SHIFT		ID_AA64MMFR0_TGRAN64_SHIFT
 #define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	ID_AA64MMFR0_TGRAN64_SUPPORTED
-- 
2.7.4


* [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Going forward, CONFIG_ARM64_PA_BITS_52 could be enabled on a system via two
different architecture features, i.e. FEAT_LPA for CONFIG_ARM64_64K_PAGES
and FEAT_LPA2 for CONFIG_ARM64_[4K|16K]_PAGES. But CONFIG_ARM64_PA_BITS_52
is currently available exclusively on the 64K page size config, and needs
to be freed up for the other page size configs to use once FEAT_LPA2 gets
enabled.

To decouple CONFIG_ARM64_PA_BITS_52 from CONFIG_ARM64_64K_PAGES, and also
to reduce #ifdefs while navigating the various page size configs, this adds
two internal config options, CONFIG_ARM64_PA_BITS_52_[LPA|LPA2]. While
here, it also converts the existing 64K page size based FEAT_LPA
implementation to use CONFIG_ARM64_PA_BITS_52_LPA. The TTBR representation
remains the same for both FEAT_LPA and FEAT_LPA2. No functional change for
the 64K page size config.
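
The FEAT_LPA packing that this patch moves under the new config symbol
can be checked in isolation; a standalone round-trip sketch (assumes 64K
pages, with PA[51:48] stored in PTE[15:12]):

#include <stdint.h>
#include <assert.h>

#define PTE_ADDR_LOW	(((1ULL << (48 - 16)) - 1) << 16)
#define PTE_ADDR_HIGH	(0xfULL << 12)
#define PTE_ADDR_MASK	(PTE_ADDR_LOW | PTE_ADDR_HIGH)

static uint64_t phys_to_pte_val(uint64_t phys)
{
	return (phys | (phys >> 36)) & PTE_ADDR_MASK;
}

static uint64_t pte_to_phys(uint64_t pte)
{
	return (pte & PTE_ADDR_LOW) | ((pte & PTE_ADDR_HIGH) << 36);
}

int main(void)
{
	uint64_t pa = 0x000f123456780000ULL;	/* 52-bit PA, 64K aligned */

	assert(pte_to_phys(phys_to_pte_val(pa)) == pa);
	return 0;
}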

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig                     |  7 +++++++
 arch/arm64/include/asm/assembler.h     | 12 ++++++------
 arch/arm64/include/asm/pgtable-hwdef.h |  7 ++++---
 arch/arm64/include/asm/pgtable.h       |  6 +++---
 arch/arm64/mm/pgd.c                    |  2 +-
 5 files changed, 21 insertions(+), 13 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index e07e7de..658a6fd 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -934,6 +934,12 @@ config ARM64_VA_BITS
 	default 48 if ARM64_VA_BITS_48
 	default 52 if ARM64_VA_BITS_52
 
+config ARM64_PA_BITS_52_LPA
+	bool
+
+config ARM64_PA_BITS_52_LPA2
+	bool
+
 choice
 	prompt "Physical address space size"
 	default ARM64_PA_BITS_48
@@ -948,6 +954,7 @@ config ARM64_PA_BITS_52
 	bool "52-bit (ARMv8.2)"
 	depends on ARM64_64K_PAGES
 	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
+	select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
 	help
 	  Enable support for a 52-bit physical address space, introduced as
 	  part of the ARMv8.2-LPA extension.
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 89faca0..fedc202 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -607,26 +607,26 @@ alternative_endif
 	.endm
 
 	.macro	phys_to_pte, pte, phys
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	/*
 	 * We assume \phys is 64K aligned and this is guaranteed by only
 	 * supporting this configuration with 64K pages.
 	 */
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 	.endm
 
 	.macro	pte_to_phys, phys, pte
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
 	bfxil	\phys, \pte, #16, #32
 	lsl	\phys, \phys, #16
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 	.endm
 
 /*
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 1eb5574..f375bcf 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -155,13 +155,14 @@
 #define PTE_PXN			(_AT(pteval_t, 1) << 53)	/* Privileged XN */
 #define PTE_UXN			(_AT(pteval_t, 1) << 54)	/* User XN */
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
-#ifdef CONFIG_ARM64_PA_BITS_52
 #define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
 #define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f09bf5c..3c57fb2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -66,14 +66,14 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
  * Macros to convert between a physical address and its placement in a
  * page table entry, taking care of 52-bit addresses.
  */
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 #define __pte_to_phys(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
-#else
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
-#endif
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
 #define pte_pfn(pte)		(__pte_to_phys(pte) >> PAGE_SHIFT)
 #define pfn_pte(pfn,prot)	\
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
index 4a64089..090dfbe 100644
--- a/arch/arm64/mm/pgd.c
+++ b/arch/arm64/mm/pgd.c
@@ -40,7 +40,7 @@ void __init pgtable_cache_init(void)
 	if (PGD_SIZE == PAGE_SIZE)
 		return;
 
-#ifdef CONFIG_ARM64_PA_BITS_52
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	/*
 	 * With 52-bit physical addresses, the architecture requires the
 	 * top-level table to be aligned to at least 64 bytes.
-- 
2.7.4


* [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

FEAT_LPA2 requires a different PTE representation format for both the 4K
and 16K page size configs. This adds the FEAT_LPA2 specific new PTE
encodings as per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte]().
The updated helpers will be used when FEAT_LPA2 gets enabled via
CONFIG_ARM64_PA_BITS_52 on 4K and 16K page sizes. The TTBR encoding and the
phys_to_ttbr() helper, however, remain the same for FEAT_LPA2 as for
FEAT_LPA. This also updates the 'phys_to_pte' helper to accept a temporary
register and changes the impacted call sites.
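
The new encoding can be sanity-checked the same way as the FEAT_LPA one; a
standalone round-trip sketch (assumes the 4K page case, with PA[51:50]
stored in PTE[9:8], i.e. shifted down by 42):

#include <stdint.h>
#include <assert.h>

#define PTE_ADDR_LOW	(((1ULL << (50 - 12)) - 1) << 12)
#define PTE_ADDR_HIGH	(0x3ULL << 8)
#define PTE_ADDR_MASK	(PTE_ADDR_LOW | PTE_ADDR_HIGH)

static uint64_t phys_to_pte_val(uint64_t phys)
{
	return (phys | (phys >> 42)) & PTE_ADDR_MASK;
}

static uint64_t pte_to_phys(uint64_t pte)
{
	return (pte & PTE_ADDR_LOW) | ((pte & PTE_ADDR_HIGH) << 42);
}

int main(void)
{
	uint64_t pa = 0x000ffedcba987000ULL;	/* 52-bit PA, 4K aligned */

	assert(pte_to_phys(phys_to_pte_val(pa)) == pa);
	return 0;
}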

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
 arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
 arch/arm64/include/asm/pgtable.h       |  4 ++++
 arch/arm64/kernel/head.S               | 25 +++++++++++++------------
 4 files changed, 40 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index fedc202..0492543 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -606,7 +606,7 @@ alternative_endif
 #endif
 	.endm
 
-	.macro	phys_to_pte, pte, phys
+	.macro	phys_to_pte, pte, phys, tmp
 #ifdef CONFIG_ARM64_PA_BITS_52_LPA
 	/*
 	 * We assume \phys is 64K aligned and this is guaranteed by only
@@ -614,6 +614,17 @@ alternative_endif
 	 */
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	orr	\pte, \phys, \phys, lsr #42
+
+	/*
+	 * The 'tmp' is being used here to just prepare
+	 * and hold PTE_ADDR_MASK which cannot be passed
+	 * to the subsequent 'and' instruction.
+	 */
+	mov	\tmp, #PTE_ADDR_LOW
+	orr	\tmp, \tmp, #PTE_ADDR_HIGH
+	and	\pte, \pte, \tmp
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -621,9 +632,13 @@ alternative_endif
 
 	.macro	pte_to_phys, phys, pte
 #ifdef CONFIG_ARM64_PA_BITS_52_LPA
-	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
-	bfxil	\phys, \pte, #16, #32
-	lsl	\phys, \phys, #16
+	ubfiz	\phys, \pte, #(48 - PAGE_SHIFT - 12), #16
+	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
+	lsl	\phys, \phys, #PAGE_SHIFT
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ubfiz	\phys, \pte, #(52 - PAGE_SHIFT - 10), #10
+	bfxil	\phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
+	lsl	\phys, \phys, #PAGE_SHIFT
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index f375bcf..c815a85 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -159,6 +159,10 @@
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
 #define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+#define PTE_ADDR_HIGH		(_AT(pteval_t, 0x3) << 8)
+#define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 3c57fb2..5e7e402 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -70,6 +70,10 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 #define __pte_to_phys(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
+#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+#define __pte_to_phys(pte)	\
+	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
+#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index c5c994a..6444147 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -134,9 +134,9 @@ SYM_CODE_END(preserve_boot_args)
  * Corrupts:	ptrs, tmp1, tmp2
  * Returns:	tbl -> next level table page address
  */
-	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
+	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2, tmp3
 	add	\tmp1, \tbl, #PAGE_SIZE
-	phys_to_pte \tmp2, \tmp1
+	phys_to_pte \tmp2, \tmp1, \tmp3
 	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
 	lsr	\tmp1, \virt, #\shift
 	sub	\ptrs, \ptrs, #1
@@ -161,8 +161,8 @@ SYM_CODE_END(preserve_boot_args)
  * Corrupts:	index, tmp1
  * Returns:	rtbl
  */
-	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
-.Lpe\@:	phys_to_pte \tmp1, \rtbl
+	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1, tmp2
+.Lpe\@:	phys_to_pte \tmp1, \rtbl, \tmp2
 	orr	\tmp1, \tmp1, \flags	// tmp1 = table entry
 	str	\tmp1, [\tbl, \index, lsl #3]
 	add	\rtbl, \rtbl, \inc	// rtbl = pa next level
@@ -224,31 +224,32 @@ SYM_CODE_END(preserve_boot_args)
  * Preserves:	vstart, vend, flags
  * Corrupts:	tbl, rtbl, istart, iend, tmp, count, sv
  */
-	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
+	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, \
+								tmp, tmp1, count, sv
 	add \rtbl, \tbl, #PAGE_SIZE
 	mov \sv, \rtbl
 	mov \count, #0
 	compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
-	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
 	mov \tbl, \sv
 	mov \sv, \rtbl
 
 #if SWAPPER_PGTABLE_LEVELS > 3
 	compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
-	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
 	mov \tbl, \sv
 	mov \sv, \rtbl
 #endif
 
 #if SWAPPER_PGTABLE_LEVELS > 2
 	compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
-	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
+	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
 	mov \tbl, \sv
 #endif
 
 	compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
 	bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
-	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
+	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp, \tmp1
 	.endm
 
 /*
@@ -343,7 +344,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
 #endif
 
 	mov	x4, EXTRA_PTRS
-	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
+	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6, x20
 #else
 	/*
 	 * If VA_BITS == 48, we don't have to configure an additional
@@ -356,7 +357,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
 	ldr_l	x4, idmap_ptrs_per_pgd
 	adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)
 
-	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
+	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
 
 	/*
 	 * Map the kernel image (starting with PHYS_OFFSET).
@@ -370,7 +371,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
 	sub	x6, x6, x3			// _end - _text
 	add	x6, x6, x5			// runtime __va(_end)
 
-	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
+	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
 
 	/*
 	 * Since the page tables have been populated with non-cacheable
-- 
2.7.4


* [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Detect a FEAT_LPA2 implementation early enough during boot when requested
via CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
enabling the 52-bit PA range, or to fall back to the default 48-bit PA
range if the FEAT_LPA2 feature was requested but found not to be
implemented.
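
For readability, a rough C rendering of what the early assembly check
does (illustrative only; the real check must run from the pre-MMU boot
path in head.S, which is also why the flag store needs the explicit
cache clean that the assembly performs):

u64 arm64_lpa2_enabled;

static void __init detect_lpa2(void)
{
	u64 mmfr0 = read_sysreg(id_aa64mmfr0_el1);
	u64 tgran = (mmfr0 >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;

	if (tgran == ID_AA64MMFR0_TGRAN_LPA2)
		arm64_lpa2_enabled = 1;
}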

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/memory.h |  1 +
 arch/arm64/kernel/head.S        | 15 +++++++++++++++
 arch/arm64/mm/mmu.c             |  3 +++
 arch/arm64/mm/proc.S            |  9 +++++++++
 4 files changed, 28 insertions(+)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 824a365..d0ca002 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -178,6 +178,7 @@
 #include <asm/bug.h>
 
 extern u64			vabits_actual;
+extern u64			arm64_lpa2_enabled;
 
 extern s64			memstart_addr;
 /* PHYS_OFFSET - the physical address of the start of memory. */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 6444147..9cf79ea 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
 	adrp	x23, __PHYS_OFFSET
 	and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
 	bl	set_cpu_boot_mode_flag
+
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	mrs     x10, ID_AA64MMFR0_EL1
+	ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
+	cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
+	b.ne	1f
+
+	mov	x10, #1
+	adr_l	x11, arm64_lpa2_enabled
+	str	x10, [x11]
+	dmb	sy
+	dc	ivac, x11
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 	bl	__create_page_tables
 	/*
 	 * The following calls CPU setup code, see arch/arm64/mm/proc.S for
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index d745865..00b7595 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -48,6 +48,9 @@ u64 idmap_ptrs_per_pgd = PTRS_PER_PGD;
 u64 __section(".mmuoff.data.write") vabits_actual;
 EXPORT_SYMBOL(vabits_actual);
 
+u64 __section(".mmuoff.data.write") arm64_lpa2_enabled;
+EXPORT_SYMBOL(arm64_lpa2_enabled);
+
 u64 kimage_voffset __ro_after_init;
 EXPORT_SYMBOL(kimage_voffset);
 
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 1ae0c2b..672880c 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -423,6 +423,15 @@ SYM_FUNC_START(__cpu_setup)
 			TCR_TG_FLAGS | TCR_KASLR_FLAGS | TCR_ASID16 | \
 			TCR_TBI0 | TCR_A1 | TCR_KASAN_SW_FLAGS
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	ldr_l   x10, arm64_lpa2_enabled
+	cmp	x10, #1
+	b.ne	1f
+	mov_q	x10, TCR_DS
+	orr	tcr, tcr, x10
+1:
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 #ifdef CONFIG_ARM64_MTE
 	/*
 	 * Update MAIR_EL1, GCR_EL1 and TFSR*_EL1 if MTE is supported
-- 
2.7.4


* [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

PTE[9:8], which holds the shareability attribute bits SH[1:0], would
collide with PA[51:50] when CONFIG_ARM64_PA_BITS_52 is enabled but
FEAT_LPA2 is not detected during boot. Dropping the PTE_SHARED and
PMD_SECT_S attributes completely in this scenario would create
non-shareable page table entries and cause a regression.

Instead, define PTE_SHARED and PMD_SECT_S in terms of the runtime
variable 'arm64_lpa2_enabled', thus maintaining the required
shareability attributes for both kernel and user space page table
entries. Also update the ptdump handling of shared page table entry
attributes to accommodate the FEAT_LPA2 scenarios.
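
As a rough illustration (not part of the patch itself), the runtime
selection boils down to the following C sketch, where
'arm64_lpa2_enabled' is the boot-time flag introduced earlier in the
series:

	/*
	 * Minimal sketch: with FEAT_LPA2 active, PTE[9:8] carry
	 * PA[51:50], so the shareability field must stay clear;
	 * otherwise keep the static inner shareable encoding
	 * SH[1:0] = 0b11.
	 */
	#define PTE_SHARED_STATIC	(_AT(pteval_t, 3) << 8)
	#define PTE_SHARED \
		(arm64_lpa2_enabled ? 0 : PTE_SHARED_STATIC)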

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/kernel-pgtable.h |  4 ++--
 arch/arm64/include/asm/pgtable-hwdef.h  | 12 ++++++++++--
 arch/arm64/kernel/head.S                | 15 +++++++++++++++
 arch/arm64/mm/ptdump.c                  | 26 ++++++++++++++++++++++++--
 4 files changed, 51 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/include/asm/kernel-pgtable.h b/arch/arm64/include/asm/kernel-pgtable.h
index 3512184..db682b5 100644
--- a/arch/arm64/include/asm/kernel-pgtable.h
+++ b/arch/arm64/include/asm/kernel-pgtable.h
@@ -103,8 +103,8 @@
 /*
  * Initial memory map attributes.
  */
-#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF | PTE_SHARED)
-#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_S)
+#define SWAPPER_PTE_FLAGS	(PTE_TYPE_PAGE | PTE_AF)
+#define SWAPPER_PMD_FLAGS	(PMD_TYPE_SECT | PMD_SECT_AF)
 
 #if ARM64_KERNEL_USES_PMD_MAPS
 #define SWAPPER_MM_MMUFLAGS	(PMD_ATTRINDX(MT_NORMAL) | SWAPPER_PMD_FLAGS)
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index c815a85..8a3b75e 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -116,13 +116,21 @@
 #define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
 #define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+#define PTE_SHARED		(arm64_lpa2_enabled ? 0 : PTE_SHARED_STATIC)
+#define PMD_SECT_S		(arm64_lpa2_enabled ? 0 : PMD_SECT_S_STATIC)
+#else  /* !CONFIG_ARM64_PA_BITS_52_LPA2 */
+#define PTE_SHARED		PTE_SHARED_STATIC
+#define PMD_SECT_S		PMD_SECT_S_STATIC
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
 /*
  * Section
  */
 #define PMD_SECT_VALID		(_AT(pmdval_t, 1) << 0)
 #define PMD_SECT_USER		(_AT(pmdval_t, 1) << 6)		/* AP[1] */
 #define PMD_SECT_RDONLY		(_AT(pmdval_t, 1) << 7)		/* AP[2] */
-#define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_S_STATIC	(_AT(pmdval_t, 3) << 8)
 #define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
 #define PMD_SECT_NG		(_AT(pmdval_t, 1) << 11)
 #define PMD_SECT_CONT		(_AT(pmdval_t, 1) << 52)
@@ -146,7 +154,7 @@
 #define PTE_TABLE_BIT		(_AT(pteval_t, 1) << 1)
 #define PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
 #define PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
-#define PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_SHARED_STATIC	(_AT(pteval_t, 3) << 8)         /* SH[1:0], inner shareable */
 #define PTE_AF			(_AT(pteval_t, 1) << 10)	/* Access Flag */
 #define PTE_NG			(_AT(pteval_t, 1) << 11)	/* nG */
 #define PTE_GP			(_AT(pteval_t, 1) << 50)	/* BTI guarded */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 9cf79ea..5732da0 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -302,6 +302,21 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
 
 	mov	x7, SWAPPER_MM_MMUFLAGS
 
+#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
+	ldr_l   x2, arm64_lpa2_enabled
+	cmp     x2, #1
+	b.eq    1f
+#endif /* CONFIG_ARM64_PA_BITS_52_LPA2 */
+
+	/*
+	 * FEAT_LPA2 has not been detected during boot.
+	 * Hence SWAPPER_MM_MMUFLAGS needs to carry the
+	 * regular shareability attributes in PTE[9:8].
+	 * The same applies when FEAT_LPA2 has not been
+	 * requested in the first place.
+	 */
+	orr     x7, x7, PTE_SHARED_STATIC
+1:
 	/*
 	 * Create the identity mapping.
 	 */
diff --git a/arch/arm64/mm/ptdump.c b/arch/arm64/mm/ptdump.c
index 1c40353..be171cf 100644
--- a/arch/arm64/mm/ptdump.c
+++ b/arch/arm64/mm/ptdump.c
@@ -115,8 +115,8 @@ static const struct prot_bits pte_bits[] = {
 		.set	= "NX",
 		.clear	= "x ",
 	}, {
-		.mask	= PTE_SHARED,
-		.val	= PTE_SHARED,
+		.mask	= PTE_SHARED_STATIC,
+		.val	= PTE_SHARED_STATIC,
 		.set	= "SHD",
 		.clear	= "   ",
 	}, {
@@ -211,6 +211,28 @@ static void dump_prot(struct pg_state *st, const struct prot_bits *bits,
 	for (i = 0; i < num; i++, bits++) {
 		const char *s;
 
+		if (IS_ENABLED(CONFIG_ARM64_PA_BITS_52_LPA2) &&
+		   (bits->mask == PTE_SHARED_STATIC)) {
+			/*
+			 * If FEAT_LPA2 has been detected and enabled,
+			 * shareability attributes for page table
+			 * entries are inherited from TCR_EL1.SH1, as
+			 * init_mm based mappings go via TTBR1_EL1.
+			 */
+			if (arm64_lpa2_enabled) {
+				if ((read_sysreg(tcr_el1) & TCR_SH1_INNER) == TCR_SH1_INNER)
+					pt_dump_seq_printf(st->seq, " SHD ");
+				else
+					pt_dump_seq_printf(st->seq, " ");
+				continue;
+			}
+			/*
+			 * If FEAT_LPA2 has not been detected and
+			 * enabled, the shareability attributes are
+			 * in the regular PTE positions; fall
+			 * through to regular PTE attribute handling.
+			 */
+		}
 		if ((st->current_prot & bits->mask) == bits->val)
 			s = bits->set;
 		else
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

Kernels built with CONFIG_ARM64_PA_BITS_52 need to fall back to 48-bit
PA range encodings when FEAT_LPA2 is not implemented, i.e. when
TCR_EL1.DS cannot be set. Hence modify the applicable PTE encoding
helpers to accommodate that scenario via 'arm64_lpa2_enabled'.
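
In C terms the fallback amounts to the sketch below (illustrative only,
not part of the patch; 'arm64_lpa2_enabled' is the boot-time flag from
earlier in the series):

	/*
	 * Runtime selection: with FEAT_LPA2, PA[51:50] live in
	 * PTE[9:8]; without it, the output address is simply
	 * PTE[47:PAGE_SHIFT].
	 */
	static inline phys_addr_t pte_to_phys_sketch(pteval_t pte)
	{
		if (arm64_lpa2_enabled)
			return (pte & PTE_ADDR_LOW) |
			       ((pte & PTE_ADDR_HIGH) << 42);
		return pte & PTE_ADDR_MASK_48;
	}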

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/include/asm/assembler.h     | 17 +++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h |  2 ++
 arch/arm64/include/asm/pgtable.h       | 12 ++++++++++--
 3 files changed, 29 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 0492543..844e9a0 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -615,6 +615,10 @@ alternative_endif
 	orr	\pte, \phys, \phys, lsr #36
 	and	\pte, \pte, #PTE_ADDR_MASK
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ldr_l   \tmp, arm64_lpa2_enabled
+	cmp     \tmp, #1
+	b.ne    .Lskip_lpa2\@
+
 	orr	\pte, \phys, \phys, lsr #42
 
 	/*
@@ -625,6 +629,11 @@ alternative_endif
 	mov	\tmp, #PTE_ADDR_LOW
 	orr	\tmp, \tmp, #PTE_ADDR_HIGH
 	and	\pte, \pte, \tmp
+	b	.Lpte_done\@
+
+.Lskip_lpa2\@:
+	mov	\pte, \phys
+.Lpte_done\@:
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	mov	\pte, \phys
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
@@ -636,9 +645,17 @@ alternative_endif
 	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
 	lsl	\phys, \phys, #PAGE_SHIFT
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
+	ldr_l   \phys, arm64_lpa2_enabled
+	cmp     \phys, #1
+	b.ne    .Lskip_lpa2\@
+
 	ubfiz	\phys, \pte, #(52 - PAGE_SHIFT - 10), #10
 	bfxil	\phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
 	lsl	\phys, \phys, #PAGE_SHIFT
+	b	.Lphys_done\@
+.Lskip_lpa2\@:
+	and	\phys, \pte, #PTE_ADDR_MASK_48
+.Lphys_done\@:
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 	and	\phys, \pte, #PTE_ADDR_MASK
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
index 8a3b75e..b98b764 100644
--- a/arch/arm64/include/asm/pgtable-hwdef.h
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -176,6 +176,8 @@
 #define PTE_ADDR_MASK		PTE_ADDR_LOW
 #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
 
+#define PTE_ADDR_MASK_48	(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+
 /*
  * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
  */
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 5e7e402..97b3cd2 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -71,9 +71,17 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
 #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
 #elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
-#define __pte_to_phys(pte)	\
+#define __pte_to_phys_52(pte)	\
 	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
-#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+#define __phys_to_pte_val_52(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
+
+#define __pte_to_phys_48(pte)		(pte_val(pte) & PTE_ADDR_MASK_48)
+#define __phys_to_pte_val_48(phys)	(phys)
+
+#define __pte_to_phys(pte)	\
+	(arm64_lpa2_enabled ? __pte_to_phys_52(pte) : __pte_to_phys_48(pte))
+#define __phys_to_pte_val(phys)	\
+	(arm64_lpa2_enabled ? __phys_to_pte_val_52(phys) : __phys_to_pte_val_48(phys))
 #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
 #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
 #define __phys_to_pte_val(phys)	(phys)
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES
  2021-07-14  2:21 ` Anshuman Khandual
@ 2021-07-14  2:21   ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-14  2:21 UTC (permalink / raw)
  To: linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse, steven.price, Anshuman Khandual

All the FEAT_LPA2 components required for the 52-bit PA range are now in
place. Allow CONFIG_ARM64_PA_BITS_52 to be selected on 4K and 16K pages,
which in turn selects CONFIG_ARM64_PA_BITS_52_LPA2 and activates the
52-bit PA range via FEAT_LPA2.
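
For reference, a sketch of how code can then distinguish the two 52-bit
schemes via the resulting Kconfig symbols (the comments state the PTE
layouts used by this series):

	#if defined(CONFIG_ARM64_PA_BITS_52_LPA)
	/* 64K pages: PA[51:48] held in PTE[15:12] (FEAT_LPA) */
	#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
	/* 4K/16K pages: PA[51:50] held in PTE[9:8] (FEAT_LPA2) */
	#endif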

Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
---
 arch/arm64/Kconfig | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 658a6fd..bc7e5c6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -952,9 +952,9 @@ config ARM64_PA_BITS_48
 
 config ARM64_PA_BITS_52
 	bool "52-bit (ARMv8.2)"
-	depends on ARM64_64K_PAGES
 	depends on ARM64_PAN || !ARM64_SW_TTBR0_PAN
 	select ARM64_PA_BITS_52_LPA if ARM64_64K_PAGES
+	select ARM64_PA_BITS_52_LPA2 if (ARM64_4K_PAGES || ARM64_16K_PAGES)
 	help
 	  Enable support for a 52-bit physical address space, introduced as
 	  part of the ARMv8.2-LPA extension.
-- 
2.7.4


^ permalink raw reply related	[flat|nested] 38+ messages in thread

* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
  2021-07-14  2:21   ` Anshuman Khandual
@ 2021-07-14  8:21     ` Suzuki K Poulose
  -1 siblings, 0 replies; 38+ messages in thread
From: Suzuki K Poulose @ 2021-07-14  8:21 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
	steven.price

On 14/07/2021 03:21, Anshuman Khandual wrote:
> Detect FEAT_LPA2 implementation early enough during boot when requested via
> CONFIG_ARM64_PA_BITS_52_LPA2 and remember in a variable arm64_lpa2_enabled.
> This variable could then be used to turn on TCR_EL1.TCR_DS effecting the 52
> bits PA range or fall back to default 48 bits PA range if FEAT_LPA2 feature
> was requested but found not to be implemented.
> 
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>   arch/arm64/include/asm/memory.h |  1 +
>   arch/arm64/kernel/head.S        | 15 +++++++++++++++
>   arch/arm64/mm/mmu.c             |  3 +++
>   arch/arm64/mm/proc.S            |  9 +++++++++
>   4 files changed, 28 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 824a365..d0ca002 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -178,6 +178,7 @@
>   #include <asm/bug.h>
>   
>   extern u64			vabits_actual;
> +extern u64			arm64_lpa2_enabled;
>   
>   extern s64			memstart_addr;
>   /* PHYS_OFFSET - the physical address of the start of memory. */
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 6444147..9cf79ea 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>   	adrp	x23, __PHYS_OFFSET
>   	and	x23, x23, MIN_KIMG_ALIGN - 1	// KASLR offset, defaults to 0
>   	bl	set_cpu_boot_mode_flag
> +
> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
> +	mrs     x10, ID_AA64MMFR0_EL1
> +	ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
> +	cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
> +	b.ne	1f

For the sake of forward compatibility, this should be "b.lt"

Suzuki

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-07-14  2:21   ` Anshuman Khandual
@ 2021-07-14 15:38     ` Steven Price
  -1 siblings, 0 replies; 38+ messages in thread
From: Steven Price @ 2021-07-14 15:38 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse

On 14/07/2021 03:21, Anshuman Khandual wrote:
> FEAT_LPA2 requires different PTE representation formats for both 4K and 16K
> page size config. This adds FEAT_LPA2 specific new PTE encodings as per ARM
> ARM (0487G.A) which updates [pte|phys]_to_[phys|pte](). The updated helpers
> would be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52 on 4K
> and 16K page size. Although TTBR encoding and phys_to_ttbr() helper remains
> the same as FEAT_LPA for FEAT_LPA2 as well. It updates 'phys_to_pte' helper
> to accept a temporary variable and changes impacted call sites.
> 
> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
>  arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>  arch/arm64/include/asm/pgtable.h       |  4 ++++
>  arch/arm64/kernel/head.S               | 25 +++++++++++++------------
>  4 files changed, 40 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index fedc202..0492543 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -606,7 +606,7 @@ alternative_endif
>  #endif
>  	.endm
>  
> -	.macro	phys_to_pte, pte, phys
> +	.macro	phys_to_pte, pte, phys, tmp
>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>  	/*
>  	 * We assume \phys is 64K aligned and this is guaranteed by only
> @@ -614,6 +614,17 @@ alternative_endif
>  	 */
>  	orr	\pte, \phys, \phys, lsr #36
>  	and	\pte, \pte, #PTE_ADDR_MASK
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +	orr	\pte, \phys, \phys, lsr #42
> +
> +	/*
> +	 * The 'tmp' is being used here to just prepare
> +	 * and hold PTE_ADDR_MASK which cannot be passed
> +	 * to the subsequent 'and' instruction.
> +	 */
> +	mov	\tmp, #PTE_ADDR_LOW
> +	orr	\tmp, \tmp, #PTE_ADDR_HIGH
> +	and	\pte, \pte, \tmp

Rather than adding an extra temporary register (and the fallout of
various other macros needing an extra register), this can be done with
two AND instructions:

	/*
	 * PTE_ADDR_MASK cannot be encoded as an immediate, so
	 * mask off all but two bits, followed by masking off the
	 * extra two bits.
	 */
	and	\pte, \pte, #PTE_ADDR_MASK | (3 << 10)
	and	\pte, \pte, #~(3 << 10)
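
A quick plain-C sanity check of the identity behind this (a sketch; the
mask shown is the 4K-page LPA2 PTE_ADDR_MASK, and it only verifies the
arithmetic, not whether the combined mask is encodable as an AArch64
logical immediate):

	#include <assert.h>
	#include <stdint.h>

	int main(void)
	{
		/* bits [49:12] | [9:8], disjoint from bits [11:10] */
		uint64_t mask  = (((1ULL << 38) - 1) << 12) | (0x3ULL << 8);
		uint64_t extra = 0x3ULL << 10;

		/* (x & (mask | extra)) & ~extra == (x & mask) */
		for (uint64_t x = 0; x < (1ULL << 24); x += 4099)
			assert(((x & (mask | extra)) & ~extra) == (x & mask));
		return 0;
	}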

Steve

>  #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>  	mov	\pte, \phys
>  #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
> @@ -621,9 +632,13 @@ alternative_endif
>  
>  	.macro	pte_to_phys, phys, pte
>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
> -	ubfiz	\phys, \pte, #(48 - 16 - 12), #16
> -	bfxil	\phys, \pte, #16, #32
> -	lsl	\phys, \phys, #16
> +	ubfiz	\phys, \pte, #(48 - PAGE_SHIFT - 12), #16
> +	bfxil	\phys, \pte, #PAGE_SHIFT, #(48 - PAGE_SHIFT)
> +	lsl	\phys, \phys, #PAGE_SHIFT
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +	ubfiz	\phys, \pte, #(52 - PAGE_SHIFT - 10), #10
> +	bfxil	\phys, \pte, #PAGE_SHIFT, #(50 - PAGE_SHIFT)
> +	lsl	\phys, \phys, #PAGE_SHIFT
>  #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>  	and	\phys, \pte, #PTE_ADDR_MASK
>  #endif /* CONFIG_ARM64_PA_BITS_52_LPA */
> diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
> index f375bcf..c815a85 100644
> --- a/arch/arm64/include/asm/pgtable-hwdef.h
> +++ b/arch/arm64/include/asm/pgtable-hwdef.h
> @@ -159,6 +159,10 @@
>  #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
>  #define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
>  #define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (50 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
> +#define PTE_ADDR_HIGH		(_AT(pteval_t, 0x3) << 8)
> +#define PTE_ADDR_MASK		(PTE_ADDR_LOW | PTE_ADDR_HIGH)
>  #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>  #define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
>  #define PTE_ADDR_MASK		PTE_ADDR_LOW
> diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
> index 3c57fb2..5e7e402 100644
> --- a/arch/arm64/include/asm/pgtable.h
> +++ b/arch/arm64/include/asm/pgtable.h
> @@ -70,6 +70,10 @@ extern unsigned long empty_zero_page[PAGE_SIZE / sizeof(unsigned long)];
>  #define __pte_to_phys(pte)	\
>  	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36))
>  #define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 36)) & PTE_ADDR_MASK)
> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
> +#define __pte_to_phys(pte)	\
> +	((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 42))
> +#define __phys_to_pte_val(phys)	(((phys) | ((phys) >> 42)) & PTE_ADDR_MASK)
>  #else  /* !CONFIG_ARM64_PA_BITS_52_LPA */
>  #define __pte_to_phys(pte)	(pte_val(pte) & PTE_ADDR_MASK)
>  #define __phys_to_pte_val(phys)	(phys)
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index c5c994a..6444147 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -134,9 +134,9 @@ SYM_CODE_END(preserve_boot_args)
>   * Corrupts:	ptrs, tmp1, tmp2
>   * Returns:	tbl -> next level table page address
>   */
> -	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2
> +	.macro	create_table_entry, tbl, virt, shift, ptrs, tmp1, tmp2, tmp3
>  	add	\tmp1, \tbl, #PAGE_SIZE
> -	phys_to_pte \tmp2, \tmp1
> +	phys_to_pte \tmp2, \tmp1, \tmp3
>  	orr	\tmp2, \tmp2, #PMD_TYPE_TABLE	// address of next table and entry type
>  	lsr	\tmp1, \virt, #\shift
>  	sub	\ptrs, \ptrs, #1
> @@ -161,8 +161,8 @@ SYM_CODE_END(preserve_boot_args)
>   * Corrupts:	index, tmp1
>   * Returns:	rtbl
>   */
> -	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1
> -.Lpe\@:	phys_to_pte \tmp1, \rtbl
> +	.macro populate_entries, tbl, rtbl, index, eindex, flags, inc, tmp1, tmp2
> +.Lpe\@:	phys_to_pte \tmp1, \rtbl, \tmp2
>  	orr	\tmp1, \tmp1, \flags	// tmp1 = table entry
>  	str	\tmp1, [\tbl, \index, lsl #3]
>  	add	\rtbl, \rtbl, \inc	// rtbl = pa next level
> @@ -224,31 +224,32 @@ SYM_CODE_END(preserve_boot_args)
>   * Preserves:	vstart, vend, flags
>   * Corrupts:	tbl, rtbl, istart, iend, tmp, count, sv
>   */
> -	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, tmp, count, sv
> +	.macro map_memory, tbl, rtbl, vstart, vend, flags, phys, pgds, istart, iend, \
> +								tmp, tmp1, count, sv
>  	add \rtbl, \tbl, #PAGE_SIZE
>  	mov \sv, \rtbl
>  	mov \count, #0
>  	compute_indices \vstart, \vend, #PGDIR_SHIFT, \pgds, \istart, \iend, \count
> -	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> +	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
>  	mov \tbl, \sv
>  	mov \sv, \rtbl
>  
>  #if SWAPPER_PGTABLE_LEVELS > 3
>  	compute_indices \vstart, \vend, #PUD_SHIFT, #PTRS_PER_PUD, \istart, \iend, \count
> -	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> +	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
>  	mov \tbl, \sv
>  	mov \sv, \rtbl
>  #endif
>  
>  #if SWAPPER_PGTABLE_LEVELS > 2
>  	compute_indices \vstart, \vend, #SWAPPER_TABLE_SHIFT, #PTRS_PER_PMD, \istart, \iend, \count
> -	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp
> +	populate_entries \tbl, \rtbl, \istart, \iend, #PMD_TYPE_TABLE, #PAGE_SIZE, \tmp, \tmp1
>  	mov \tbl, \sv
>  #endif
>  
>  	compute_indices \vstart, \vend, #SWAPPER_BLOCK_SHIFT, #PTRS_PER_PTE, \istart, \iend, \count
>  	bic \count, \phys, #SWAPPER_BLOCK_SIZE - 1
> -	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp
> +	populate_entries \tbl, \count, \istart, \iend, \flags, #SWAPPER_BLOCK_SIZE, \tmp, \tmp1
>  	.endm
>  
>  /*
> @@ -343,7 +344,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
>  #endif
>  
>  	mov	x4, EXTRA_PTRS
> -	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6
> +	create_table_entry x0, x3, EXTRA_SHIFT, x4, x5, x6, x20
>  #else
>  	/*
>  	 * If VA_BITS == 48, we don't have to configure an additional
> @@ -356,7 +357,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
>  	ldr_l	x4, idmap_ptrs_per_pgd
>  	adr_l	x6, __idmap_text_end		// __pa(__idmap_text_end)
>  
> -	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x13, x14
> +	map_memory x0, x1, x3, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
>  
>  	/*
>  	 * Map the kernel image (starting with PHYS_OFFSET).
> @@ -370,7 +371,7 @@ SYM_FUNC_START_LOCAL(__create_page_tables)
>  	sub	x6, x6, x3			// _end - _text
>  	add	x6, x6, x5			// runtime __va(_end)
>  
> -	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x13, x14
> +	map_memory x0, x1, x5, x6, x7, x3, x4, x10, x11, x12, x20, x13, x14
>  
>  	/*
>  	 * Since the page tables have been populated with non-cacheable
> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
  2021-07-14  8:21     ` Suzuki K Poulose
@ 2021-07-16  7:06       ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-16  7:06 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
	steven.price


On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
> On 14/07/2021 03:21, Anshuman Khandual wrote:
>> Detect FEAT_LPA2 implementation early enough during boot when requested via
>> CONFIG_ARM64_PA_BITS_52_LPA2 and remember in a variable arm64_lpa2_enabled.
>> This variable could then be used to turn on TCR_EL1.TCR_DS effecting the 52
>> bits PA range or fall back to default 48 bits PA range if FEAT_LPA2 feature
>> was requested but found not to be implemented.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>   arch/arm64/include/asm/memory.h |  1 +
>>   arch/arm64/kernel/head.S        | 15 +++++++++++++++
>>   arch/arm64/mm/mmu.c             |  3 +++
>>   arch/arm64/mm/proc.S            |  9 +++++++++
>>   4 files changed, 28 insertions(+)
>>
>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>> index 824a365..d0ca002 100644
>> --- a/arch/arm64/include/asm/memory.h
>> +++ b/arch/arm64/include/asm/memory.h
>> @@ -178,6 +178,7 @@
>>   #include <asm/bug.h>
>>     extern u64            vabits_actual;
>> +extern u64            arm64_lpa2_enabled;
>>     extern s64            memstart_addr;
>>   /* PHYS_OFFSET - the physical address of the start of memory. */
>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>> index 6444147..9cf79ea 100644
>> --- a/arch/arm64/kernel/head.S
>> +++ b/arch/arm64/kernel/head.S
>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>>       adrp    x23, __PHYS_OFFSET
>>       and    x23, x23, MIN_KIMG_ALIGN - 1    // KASLR offset, defaults to 0
>>       bl    set_cpu_boot_mode_flag
>> +
>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>> +    mrs     x10, ID_AA64MMFR0_EL1
>> +    ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>> +    cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
>> +    b.ne    1f
> 
> For the sake of forward compatibility, this should be "b.lt"
Right, I guess we could assume that the feature will remain present for
the current ID_AA64MMFR0_TGRAN_LPA2 value and any larger values in the
future. But should this not also be capped at
ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, as the upper limit is different for
4K and 16K page sizes?
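
Something like the following C sketch of the check (names as used in
this thread; ID_AA64MMFR0_TGRAN_SUPPORTED_MAX is assumed to be defined
per page size, mirroring the existing 48-bit checks):

	/*
	 * Accept any TGRAN value within the supported window rather
	 * than an exact match, so larger future values still enable
	 * the feature while the per-page-size upper limit is honoured.
	 */
	static inline bool tgran_has_lpa2(unsigned int tgran)
	{
		return tgran >= ID_AA64MMFR0_TGRAN_LPA2 &&
		       tgran <= ID_AA64MMFR0_TGRAN_SUPPORTED_MAX;
	}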

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-07-14 15:38     ` Steven Price
@ 2021-07-16  7:20       ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-16  7:20 UTC (permalink / raw)
  To: Steven Price, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse



On 7/14/21 9:08 PM, Steven Price wrote:
> On 14/07/2021 03:21, Anshuman Khandual wrote:
>> FEAT_LPA2 requires different PTE representation formats for both 4K and 16K
>> page size config. This adds FEAT_LPA2 specific new PTE encodings as per ARM
>> ARM (0487G.A) which updates [pte|phys]_to_[phys|pte](). The updated helpers
>> would be used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52 on 4K
>> and 16K page size. Although TTBR encoding and phys_to_ttbr() helper remains
>> the same as FEAT_LPA for FEAT_LPA2 as well. It updates 'phys_to_pte' helper
>> to accept a temporary variable and changes impacted call sites.
>>
>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>> ---
>>  arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
>>  arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>>  arch/arm64/include/asm/pgtable.h       |  4 ++++
>>  arch/arm64/kernel/head.S               | 25 +++++++++++++------------
>>  4 files changed, 40 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>> index fedc202..0492543 100644
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -606,7 +606,7 @@ alternative_endif
>>  #endif
>>  	.endm
>>  
>> -	.macro	phys_to_pte, pte, phys
>> +	.macro	phys_to_pte, pte, phys, tmp
>>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>  	/*
>>  	 * We assume \phys is 64K aligned and this is guaranteed by only
>> @@ -614,6 +614,17 @@ alternative_endif
>>  	 */
>>  	orr	\pte, \phys, \phys, lsr #36
>>  	and	\pte, \pte, #PTE_ADDR_MASK
>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>> +	orr	\pte, \phys, \phys, lsr #42
>> +
>> +	/*
>> +	 * The 'tmp' is being used here to just prepare
>> +	 * and hold PTE_ADDR_MASK which cannot be passed
>> +	 * to the subsequent 'and' instruction.
>> +	 */
>> +	mov	\tmp, #PTE_ADDR_LOW
>> +	orr	\tmp, \tmp, #PTE_ADDR_HIGH
>> +	and	\pte, \pte, \tmp
> Rather than adding an extra temporary register (and the fallout of
> various other macros needing an extra register), this can be done with
> two AND instructions:

I would really like to get rid of the 'tmp' variable here as
well, but could not figure out a way to accomplish it.

> 
> 	/* PTE_ADDR_MASK cannot be encoded as an immediate, so
>          * mask off all but two bits, followed by masking the
>          * extra two bits
>          */
> 	and	\pte, \pte, #PTE_ADDR_MASK | (3 << 10)
> 	and	\pte, \pte, #~(3 << 10)

I made this change as suggested:

--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -626,9 +626,8 @@ alternative_endif
         * and hold PTE_ADDR_MASK which cannot be passed
         * to the subsequent 'and' instruction.
         */
-       mov     \tmp, #PTE_ADDR_LOW
-       orr     \tmp, \tmp, #PTE_ADDR_HIGH
-       and     \pte, \pte, \tmp
+       and     \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
+       and     \pte, \pte, #~(0x3 << 10)
 
 .Lskip_lpa2\@:
        mov     \pte, \phys


but it still fails to build (tested with 16K pages):

arch/arm64/kernel/head.S: Assembler messages:
arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
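
The failure is down to the AArch64 logical immediate encoding rules: an AND
immediate that does not repeat at a smaller element size must form a single
(possibly rotated) run of ones, and the 16K mask ORed with (0x3 << 10) leaves
two separate runs, bits [49:14] and [11:8]. A minimal user-space sketch (not
part of the posted series; it assumes the PTE_ADDR_* values implied by the
quoted diffs) illustrates which variants are encodable:

#include <stdint.h>
#include <stdio.h>

#define GENMASK(h, l)	(((~0ULL) >> (63 - (h))) & (~0ULL << (l)))

/* A non-repeating 64-bit immediate encodes iff it is one circular run */
static int single_circular_run(uint64_t v)
{
	uint64_t rot = (v << 1) | (v >> 63);

	/* exactly one circular run of ones <=> two 0/1 boundaries */
	return v && ~v && __builtin_popcountll(v ^ rot) == 2;
}

int main(void)
{
	/* assumed LPA2 PTE_ADDR_MASK per granule, from the diffs above */
	uint64_t mask_4k  = GENMASK(49, 12) | GENMASK(9, 8);
	uint64_t mask_16k = GENMASK(49, 14) | GENMASK(9, 8);

	/* 4K: mask | (3 << 10) covers bits [49:8], one run -> prints 1 */
	printf("4K : %d\n", single_circular_run(mask_4k | (0x3ULL << 10)));

	/* 16K: bits [49:14] and [11:8] stay split, two runs -> prints 0 */
	printf("16K: %d\n", single_circular_run(mask_16k | (0x3ULL << 10)));
	return 0;
}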

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
  2021-07-16  7:06       ` Anshuman Khandual
@ 2021-07-16  8:08         ` Suzuki K Poulose
  -1 siblings, 0 replies; 38+ messages in thread
From: Suzuki K Poulose @ 2021-07-16  8:08 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
	steven.price

On 16/07/2021 08:06, Anshuman Khandual wrote:
> 
> On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>> Detect the FEAT_LPA2 implementation early during boot when requested via
>>> CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
>>> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
>>> enabling the 52 bit PA range, or to fall back to the default 48 bit PA
>>> range if FEAT_LPA2 was requested but found not to be implemented.
>>>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>    arch/arm64/include/asm/memory.h |  1 +
>>>    arch/arm64/kernel/head.S        | 15 +++++++++++++++
>>>    arch/arm64/mm/mmu.c             |  3 +++
>>>    arch/arm64/mm/proc.S            |  9 +++++++++
>>>    4 files changed, 28 insertions(+)
>>>
>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>> index 824a365..d0ca002 100644
>>> --- a/arch/arm64/include/asm/memory.h
>>> +++ b/arch/arm64/include/asm/memory.h
>>> @@ -178,6 +178,7 @@
>>>    #include <asm/bug.h>
>>>      extern u64            vabits_actual;
>>> +extern u64            arm64_lpa2_enabled;
>>>      extern s64            memstart_addr;
>>>    /* PHYS_OFFSET - the physical address of the start of memory. */
>>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>>> index 6444147..9cf79ea 100644
>>> --- a/arch/arm64/kernel/head.S
>>> +++ b/arch/arm64/kernel/head.S
>>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>>>        adrp    x23, __PHYS_OFFSET
>>>        and    x23, x23, MIN_KIMG_ALIGN - 1    // KASLR offset, defaults to 0
>>>        bl    set_cpu_boot_mode_flag
>>> +
>>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>>> +    mrs     x10, ID_AA64MMFR0_EL1
>>> +    ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>>> +    cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
>>> +    b.ne    1f
>>
>> For the sake of forward compatibility, this should be "b.lt"
> Right, I guess we can assume that the feature will be present for the
> current ID_AA64MMFR0_TGRAN_LPA2 value and onwards in the future. But
> shouldn't this also be capped at ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, since
> the upper limit differs for 4K and 16K page sizes?

Absolutely.
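
Rendered in C for illustration, the check being converged on would look
roughly like this (a sketch only, not the posted assembly; the constant
values below are placeholders, and the real shift and limits differ between
the 4K and 16K granules):

#include <stdbool.h>
#include <stdint.h>

/* placeholder values; the real definitions are per configured granule */
#define ID_AA64MMFR0_TGRAN_SHIFT		28
#define ID_AA64MMFR0_TGRAN_LPA2			1
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	7

static bool lpa2_implemented(uint64_t mmfr0)
{
	uint64_t tgran = (mmfr0 >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;

	/*
	 * Accept the LPA2 value and any future, larger value (the
	 * "b.lt" fallback path), but cap at the architected maximum
	 * for the configured page size.
	 */
	return tgran >= ID_AA64MMFR0_TGRAN_LPA2 &&
	       tgran <= ID_AA64MMFR0_TGRAN_SUPPORTED_MAX;
}

int main(void)
{
	/* e.g. field value 1 (LPA2) passes, 0xf (unsupported) does not */
	return !(lpa2_implemented(1ULL << ID_AA64MMFR0_TGRAN_SHIFT) &&
		 !lpa2_implemented(0xfULL << ID_AA64MMFR0_TGRAN_SHIFT));
}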

Cheers
Suzuki

> 


^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-07-16  7:20       ` Anshuman Khandual
@ 2021-07-16 10:02         ` Steven Price
  -1 siblings, 0 replies; 38+ messages in thread
From: Steven Price @ 2021-07-16 10:02 UTC (permalink / raw)
  To: Anshuman Khandual, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse

On 16/07/2021 08:20, Anshuman Khandual wrote:
> 
> 
> On 7/14/21 9:08 PM, Steven Price wrote:
>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>> FEAT_LPA2 requires a different PTE representation format for both the 4K and
>>> 16K page size configs. This adds the new FEAT_LPA2 specific PTE encodings as
>>> per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The updated
>>> helpers are used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52 on
>>> 4K and 16K page sizes. The TTBR encoding and the phys_to_ttbr() helper remain
>>> the same for FEAT_LPA2 as for FEAT_LPA. The 'phys_to_pte' helper now accepts
>>> a temporary register, and impacted call sites are changed accordingly.
>>>
>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>> ---
>>>  arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
>>>  arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>>>  arch/arm64/include/asm/pgtable.h       |  4 ++++
>>>  arch/arm64/kernel/head.S               | 25 +++++++++++++------------
>>>  4 files changed, 40 insertions(+), 16 deletions(-)
>>>
>>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>>> index fedc202..0492543 100644
>>> --- a/arch/arm64/include/asm/assembler.h
>>> +++ b/arch/arm64/include/asm/assembler.h
>>> @@ -606,7 +606,7 @@ alternative_endif
>>>  #endif
>>>  	.endm
>>>  
>>> -	.macro	phys_to_pte, pte, phys
>>> +	.macro	phys_to_pte, pte, phys, tmp
>>>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>>  	/*
>>>  	 * We assume \phys is 64K aligned and this is guaranteed by only
>>> @@ -614,6 +614,17 @@ alternative_endif
>>>  	 */
>>>  	orr	\pte, \phys, \phys, lsr #36
>>>  	and	\pte, \pte, #PTE_ADDR_MASK
>>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>>> +	orr	\pte, \phys, \phys, lsr #42
>>> +
>>> +	/*
>>> +	 * The 'tmp' is being used here to just prepare
>>> +	 * and hold PTE_ADDR_MASK which cannot be passed
>>> +	 * to the subsequent 'and' instruction.
>>> +	 */
>>> +	mov	\tmp, #PTE_ADDR_LOW
>>> +	orr	\tmp, \tmp, #PTE_ADDR_HIGH
>>> +	and	\pte, \pte, \tmp
>> Rather than adding an extra temporary register (and the fallout of
>> various other macros needing an extra register), this can be done with
>> two AND instructions:
> 
> I would really like to get rid of the 'tmp' variable here as well,
> but I could not figure out a way to do it.
> 
>>
>> 	/* PTE_ADDR_MASK cannot be encoded as an immediate, so
>>          * mask off all but two bits, followed by masking the
>>          * extra two bits
>>          */
>> 	and	\pte, \pte, #PTE_ADDR_MASK | (3 << 10)
>> 	and	\pte, \pte, #~(3 << 10)
> 
> I made this change as suggested:
> 
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -626,9 +626,8 @@ alternative_endif
>          * and hold PTE_ADDR_MASK which cannot be passed
>          * to the subsequent 'and' instruction.
>          */
> -       mov     \tmp, #PTE_ADDR_LOW
> -       orr     \tmp, \tmp, #PTE_ADDR_HIGH
> -       and     \pte, \pte, \tmp
> +       and     \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
> +       and     \pte, \pte, #~(0x3 << 10)
>  
>  .Lskip_lpa2\@:
>         mov     \pte, \phys
> 
> 
> but it still fails to build (tested with 16K pages):
> 
> arch/arm64/kernel/head.S: Assembler messages:
> arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
> 

Ah, I'd only tested this for 4k. 16k would require a different set of masks.

So the bits we need to cover are those from just below PAGE_SHIFT to the
top of PTE_ADDR_HIGH (bit 10). So we can compute the mask for both 4k
and 16k with GENMASK(PAGE_SHIFT-1, 10):

	and	\pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
	and	\pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)

This compiles (for both 4k and 16k) and the assembly looks correct, but
I've not done any other testing.
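
As a sanity check, a small user-space sketch (again assuming the LPA2
PTE_ADDR_MASK and PAGE_SHIFT values implied by the diffs, not the posted
code itself) confirms that the two ANDs compose to a plain
'pte & PTE_ADDR_MASK' for both granules, because the bits that
GENMASK(PAGE_SHIFT - 1, 10) adds to the first immediate are disjoint from
the mask itself:

#include <assert.h>
#include <stdint.h>

#define GENMASK(h, l)	(((~0ULL) >> (63 - (h))) & (~0ULL << (l)))

int main(void)
{
	/* assumed per-granule { PTE_ADDR_MASK, PAGE_SHIFT } pairs */
	struct { uint64_t mask; int shift; } cfg[] = {
		{ GENMASK(49, 12) | GENMASK(9, 8), 12 },	/* 4K  */
		{ GENMASK(49, 14) | GENMASK(9, 8), 14 },	/* 16K */
	};

	for (int i = 0; i < 2; i++) {
		uint64_t gap = GENMASK(cfg[i].shift - 1, 10);

		/* the filler bits do not overlap the real mask ... */
		assert((cfg[i].mask & gap) == 0);
		/* ... so the second AND strips exactly those bits */
		assert(((cfg[i].mask | gap) & ~gap) == cfg[i].mask);
	}
	return 0;
}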

Steve

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding
  2021-07-16 10:02         ` Steven Price
@ 2021-07-16 14:37           ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-16 14:37 UTC (permalink / raw)
  To: Steven Price, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, suzuki.poulose, mark.rutland, will, catalin.marinas, maz,
	james.morse

On 7/16/21 3:32 PM, Steven Price wrote:
> On 16/07/2021 08:20, Anshuman Khandual wrote:
>>
>>
>> On 7/14/21 9:08 PM, Steven Price wrote:
>>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>>> FEAT_LPA2 requires a different PTE representation format for both the 4K and
>>>> 16K page size configs. This adds the new FEAT_LPA2 specific PTE encodings as
>>>> per the ARM ARM (0487G.A), updating [pte|phys]_to_[phys|pte](). The updated
>>>> helpers are used when FEAT_LPA2 gets enabled via CONFIG_ARM64_PA_BITS_52 on
>>>> 4K and 16K page sizes. The TTBR encoding and the phys_to_ttbr() helper remain
>>>> the same for FEAT_LPA2 as for FEAT_LPA. The 'phys_to_pte' helper now accepts
>>>> a temporary register, and impacted call sites are changed accordingly.
>>>>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>>  arch/arm64/include/asm/assembler.h     | 23 +++++++++++++++++++----
>>>>  arch/arm64/include/asm/pgtable-hwdef.h |  4 ++++
>>>>  arch/arm64/include/asm/pgtable.h       |  4 ++++
>>>>  arch/arm64/kernel/head.S               | 25 +++++++++++++------------
>>>>  4 files changed, 40 insertions(+), 16 deletions(-)
>>>>
>>>> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
>>>> index fedc202..0492543 100644
>>>> --- a/arch/arm64/include/asm/assembler.h
>>>> +++ b/arch/arm64/include/asm/assembler.h
>>>> @@ -606,7 +606,7 @@ alternative_endif
>>>>  #endif
>>>>  	.endm
>>>>  
>>>> -	.macro	phys_to_pte, pte, phys
>>>> +	.macro	phys_to_pte, pte, phys, tmp
>>>>  #ifdef CONFIG_ARM64_PA_BITS_52_LPA
>>>>  	/*
>>>>  	 * We assume \phys is 64K aligned and this is guaranteed by only
>>>> @@ -614,6 +614,17 @@ alternative_endif
>>>>  	 */
>>>>  	orr	\pte, \phys, \phys, lsr #36
>>>>  	and	\pte, \pte, #PTE_ADDR_MASK
>>>> +#elif defined(CONFIG_ARM64_PA_BITS_52_LPA2)
>>>> +	orr	\pte, \phys, \phys, lsr #42
>>>> +
>>>> +	/*
>>>> +	 * The 'tmp' is being used here to just prepare
>>>> +	 * and hold PTE_ADDR_MASK which cannot be passed
>>>> +	 * to the subsequent 'and' instruction.
>>>> +	 */
>>>> +	mov	\tmp, #PTE_ADDR_LOW
>>>> +	orr	\tmp, \tmp, #PTE_ADDR_HIGH
>>>> +	and	\pte, \pte, \tmp
>>> Rather than adding an extra temporary register (and the fallout of
>>> various other macros needing an extra register), this can be done with
>>> two AND instructions:
>>
>> I would really like to get rid of the 'tmp' variable here as well,
>> but I could not figure out a way to do it.
>>
>>>
>>> 	/* PTE_ADDR_MASK cannot be encoded as an immediate, so
>>>          * mask off all but two bits, followed by masking the
>>>          * extra two bits
>>>          */
>>> 	and	\pte, \pte, #PTE_ADDR_MASK | (3 << 10)
>>> 	and	\pte, \pte, #~(3 << 10)
>>
>> I made this change as suggested:
>>
>> --- a/arch/arm64/include/asm/assembler.h
>> +++ b/arch/arm64/include/asm/assembler.h
>> @@ -626,9 +626,8 @@ alternative_endif
>>          * and hold PTE_ADDR_MASK which cannot be passed
>>          * to the subsequent 'and' instruction.
>>          */
>> -       mov     \tmp, #PTE_ADDR_LOW
>> -       orr     \tmp, \tmp, #PTE_ADDR_HIGH
>> -       and     \pte, \pte, \tmp
>> +       and     \pte, \pte, #PTE_ADDR_MASK | (0x3 << 10)
>> +       and     \pte, \pte, #~(0x3 << 10)
>>  
>>  .Lskip_lpa2\@:
>>         mov     \pte, \phys
>>
>>
>> but it still fails to build (tested with 16K pages):
>>
>> arch/arm64/kernel/head.S: Assembler messages:
>> arch/arm64/kernel/head.S:377: Error: immediate out of range at operand 3 -- `and x6,x6,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:390: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>> arch/arm64/kernel/head.S:404: Error: immediate out of range at operand 3 -- `and x12,x12,#((((1<<(50-14))-1)<<14)|(0x3<<8))|(0x3<<10)'
>>
> 
> Ah, I'd only tested this for 4k. 16k would require a different set of masks.
> 
> So the bits we need to cover are those from just below PAGE_SHIFT to the
> top of PTE_ADDR_HIGH (bit 10). So we can compute the mask for both 4k

Okay.

> and 16k with GENMASK(PAGE_SHIFT-1, 10):
> 
> 	and	\pte, \pte, #PTE_ADDR_MASK | GENMASK(PAGE_SHIFT - 1, 10)
> 	and	\pte, \pte, #~GENMASK(PAGE_SHIFT - 1, 10)
> 
> This compiles (for both 4k and 16k) and the assembly looks correct, but
> I've not done any other testing.

Yeah, it works. Will do the change.

^ permalink raw reply	[flat|nested] 38+ messages in thread

* Re: [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2
  2021-07-16  8:08         ` Suzuki K Poulose
@ 2021-07-19  4:47           ` Anshuman Khandual
  -1 siblings, 0 replies; 38+ messages in thread
From: Anshuman Khandual @ 2021-07-19  4:47 UTC (permalink / raw)
  To: Suzuki K Poulose, linux-arm-kernel, linux-kernel, linux-mm
  Cc: akpm, mark.rutland, will, catalin.marinas, maz, james.morse,
	steven.price



On 7/16/21 1:38 PM, Suzuki K Poulose wrote:
> On 16/07/2021 08:06, Anshuman Khandual wrote:
>>
>> On 7/14/21 1:51 PM, Suzuki K Poulose wrote:
>>> On 14/07/2021 03:21, Anshuman Khandual wrote:
>>>> Detect the FEAT_LPA2 implementation early during boot when requested via
>>>> CONFIG_ARM64_PA_BITS_52_LPA2 and remember it in the variable
>>>> arm64_lpa2_enabled. This variable can then be used to turn on TCR_EL1.DS,
>>>> enabling the 52 bit PA range, or to fall back to the default 48 bit PA
>>>> range if FEAT_LPA2 was requested but found not to be implemented.
>>>>
>>>> Signed-off-by: Anshuman Khandual <anshuman.khandual@arm.com>
>>>> ---
>>>>    arch/arm64/include/asm/memory.h |  1 +
>>>>    arch/arm64/kernel/head.S        | 15 +++++++++++++++
>>>>    arch/arm64/mm/mmu.c             |  3 +++
>>>>    arch/arm64/mm/proc.S            |  9 +++++++++
>>>>    4 files changed, 28 insertions(+)
>>>>
>>>> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
>>>> index 824a365..d0ca002 100644
>>>> --- a/arch/arm64/include/asm/memory.h
>>>> +++ b/arch/arm64/include/asm/memory.h
>>>> @@ -178,6 +178,7 @@
>>>>    #include <asm/bug.h>
>>>>      extern u64            vabits_actual;
>>>> +extern u64            arm64_lpa2_enabled;
>>>>      extern s64            memstart_addr;
>>>>    /* PHYS_OFFSET - the physical address of the start of memory. */
>>>> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
>>>> index 6444147..9cf79ea 100644
>>>> --- a/arch/arm64/kernel/head.S
>>>> +++ b/arch/arm64/kernel/head.S
>>>> @@ -94,6 +94,21 @@ SYM_CODE_START(primary_entry)
>>>>        adrp    x23, __PHYS_OFFSET
>>>>        and    x23, x23, MIN_KIMG_ALIGN - 1    // KASLR offset, defaults to 0
>>>>        bl    set_cpu_boot_mode_flag
>>>> +
>>>> +#ifdef CONFIG_ARM64_PA_BITS_52_LPA2
>>>> +    mrs     x10, ID_AA64MMFR0_EL1
>>>> +    ubfx    x10, x10, #ID_AA64MMFR0_TGRAN_SHIFT, 4
>>>> +    cmp     x10, #ID_AA64MMFR0_TGRAN_LPA2
>>>> +    b.ne    1f
>>>
>>> For the sake of forward compatibility, this should be "b.lt"
>> Right, I guess we can assume that the feature will be present for the
>> current ID_AA64MMFR0_TGRAN_LPA2 value and onwards in the future. But
>> shouldn't this also be capped at ID_AA64MMFR0_TGRAN_SUPPORTED_MAX, since
>> the upper limit differs for 4K and 16K page sizes?
> 
> Absolutely.

The ID_AA64MMFR0_TGRAN_SUPPORTED_MAX check is not required there, as
__enable_mmu() already performs the necessary boundary check for the
configured page size.
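
For reference, that boundary check amounts to the range test below (a C
rendering for illustration only; the actual code is assembly, and the
shift and MIN/MAX values here are per-granule placeholders). Any TGRAN
field value outside this window, including "not supported", already fails
the boot-time check, which is what makes the extra cap redundant:

#include <stdbool.h>
#include <stdint.h>

/* placeholder per-granule limits, e.g. 16K-like values */
#define ID_AA64MMFR0_TGRAN_SHIFT		20
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MIN	1
#define ID_AA64MMFR0_TGRAN_SUPPORTED_MAX	2

/* sketch of the granule range check __enable_mmu() already performs */
static bool granule_supported(uint64_t mmfr0)
{
	uint64_t tgran = (mmfr0 >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;

	return tgran >= ID_AA64MMFR0_TGRAN_SUPPORTED_MIN &&
	       tgran <= ID_AA64MMFR0_TGRAN_SUPPORTED_MAX;
}

int main(void)
{
	/* 0xf ("not supported") falls outside the window -> rejected */
	return granule_supported(0xfULL << ID_AA64MMFR0_TGRAN_SHIFT);
}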

^ permalink raw reply	[flat|nested] 38+ messages in thread

end of thread, other threads:[~2021-07-19  4:49 UTC | newest]

Thread overview: 38+ messages
2021-07-14  2:21 [RFC 00/10] arm64/mm: Enable FEAT_LPA2 (52 bits PA support on 4K|16K pages) Anshuman Khandual
2021-07-14  2:21 ` [RFC 01/10] mm/mmap: Dynamically initialize protection_map[] Anshuman Khandual
2021-07-14  2:21 ` [RFC 02/10] arm64/mm: Consolidate TCR_EL1 fields Anshuman Khandual
2021-07-14  2:21 ` [RFC 03/10] arm64/mm: Add FEAT_LPA2 specific TCR_EL1.DS field Anshuman Khandual
2021-07-14  2:21 ` [RFC 04/10] arm64/mm: Add FEAT_LPA2 specific ID_AA64MMFR0.TGRAN[2] Anshuman Khandual
2021-07-14  2:21 ` [RFC 05/10] arm64/mm: Add CONFIG_ARM64_PA_BITS_52_[LPA|LPA2] Anshuman Khandual
2021-07-14  2:21 ` [RFC 06/10] arm64/mm: Add FEAT_LPA2 specific encoding Anshuman Khandual
2021-07-14 15:38   ` Steven Price
2021-07-16  7:20     ` Anshuman Khandual
2021-07-16 10:02       ` Steven Price
2021-07-16 14:37         ` Anshuman Khandual
2021-07-14  2:21 ` [RFC 07/10] arm64/mm: Detect and enable FEAT_LPA2 Anshuman Khandual
2021-07-14  8:21   ` Suzuki K Poulose
2021-07-16  7:06     ` Anshuman Khandual
2021-07-16  8:08       ` Suzuki K Poulose
2021-07-19  4:47         ` Anshuman Khandual
2021-07-14  2:21 ` [RFC 08/10] arm64/mm: Add FEAT_LPA2 specific PTE_SHARED and PMD_SECT_S Anshuman Khandual
2021-07-14  2:21 ` [RFC 09/10] arm64/mm: Add FEAT_LPA2 specific fallback (48 bits PA) when not implemented Anshuman Khandual
2021-07-14  2:21 ` [RFC 10/10] arm64/mm: Enable CONFIG_ARM64_PA_BITS_52 on CONFIG_ARM64_[4K|16K]_PAGES Anshuman Khandual
