* [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions
@ 2020-09-10  5:33 Bhupesh Sharma
  2020-09-10  5:33 ` [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available) Bhupesh Sharma
                   ` (3 more replies)
  0 siblings, 4 replies; 13+ messages in thread
From: Bhupesh Sharma @ 2020-09-10  5:33 UTC (permalink / raw)
  To: kexec; +Cc: John Donnelly, bhsharma, bhupesh.linux, Kazuhito Hagio

Changes since v4:
----------------
- v4 can be seen here:
  https://www.spinics.net/lists/kexec/msg23850.html
- Removed the patch (sent as [PATCH 4/4] in v3) which marked the
  '--mem-usage' option as unsupported for the arm64 architecture, as we
  now have a mechanism to read the 'vabits_actual' value from the
  'id_aa64mmfr2_el1' arm64 system register. As per discussions with the
  arm64 and gcc/binutils maintainers, it turns out there is no standard
  ABI between the kernel and user-space to export this value early
  enough to be used for the page_offset calculation in the --mem-usage
  case. So, the next best option is to have user-space read the system
  register to determine the underlying hardware's support for larger
  (52-bit) addressing.

  This allows us to keep supporting '--mem-usage' option on arm64 even
  on newer kernels (with flipped VA space).

Changes since v3:
----------------
- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022534.html
- Added a new patch (via [PATCH 4/4]) which marks the '--mem-usage'
  option as unsupported for the arm64 architecture. With newer arm64
  kernels supporting 48-bit/52-bit VA address spaces within a single
  binary, the address of kernel symbols like _stext, which could
  earlier be used to determine the VA_BITS value, can no longer be used
  to determine whether VA_BITS is set to 48 or 52 in kernel space.
  Hence, for now, it makes sense to mark the '--mem-usage' option as
  unsupported for the arm64 architecture until we have more clarity
  from the arm64 kernel maintainers on how to handle this in future
  kernel/makedumpfile versions.

Changes since v2:
----------------
- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022456.html
- I missed some comments from Kazu on the LVA v1 patch when I sent
  out v2, so I am addressing them now in v3.
- Also added a patch that adds a tree-wide feature to read
  'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).

Changes since v1:
----------------
- v1 was sent as two separate patches:
  http://lists.infradead.org/pipermail/kexec/2019-February/022424.html
  (ARMv8.2-LPA)
  http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
  (ARMv8.2-LVA)
- v2 combined the two in a single patchset and also addresses Kazu's
  review comments.

This patchset adds support for ARMv8.2 extensions in makedumpfile code.
I cover the following cases with this patchset:
- Both old (<5.4) and new kernels (>= 5.4) work well.
- All VA and PA bit combinations currently supported via the kernel
  CONFIG options work well, including:
  - 48-bit kernel VA + 52-bit PA (LPA)
  - 52-bit kernel VA (LVA) + 52-bit PA (LPA)

This has been tested for the following use-cases:
1. Analysing page information via the '--mem-usage' option.
2. Creating a dumpfile using /proc/vmcore.
3. Creating a dumpfile using /proc/kcore.
4. Post-processing a vmcore.

I have tested this patchset on the following platforms, with kernels
which support/do-not-support ARMv8.2 features:
1. CPUs which don't support ARMv8.2 features, e.g. qualcomm-amberwing,
   ampere-osprey.
2. Prototype models which support ARMv8.2 extensions (e.g. ARMv8 FVP
   simulation model).

Also a preparation patch has been added in this patchset which adds a
common feature for archs (except arm64, for which similar support is
added via subsequent patch) to retrieve 'MAX_PHYSMEM_BITS' from
vmcoreinfo (if available).

This patchset ensures backward compatibility for kernel versions in
which 'TCR_EL1.T1SZ' and 'MAX_PHYSMEM_BITS' are not available in
vmcoreinfo.

In the newer kernels (>= 5.4.0) the following patches export these
variables in the vmcoreinfo:
 - 1d50e5d0c505 ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo")
 - bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo")

Cc: John Donnelly <john.p.donnelly@oracle.com>
Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: kexec@lists.infradead.org

Bhupesh Sharma (3):
  tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
  makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
  makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA
    support)

 arch/arm.c     |   8 +-
 arch/arm64.c   | 520 ++++++++++++++++++++++++++++++++++++++-----------
 arch/ia64.c    |   7 +-
 arch/ppc.c     |   8 +-
 arch/ppc64.c   |  49 +++--
 arch/s390x.c   |  29 +--
 arch/sparc64.c |   9 +-
 arch/x86.c     |  34 ++--
 arch/x86_64.c  |  27 +--
 common.h       |  10 +
 makedumpfile.c |   4 +-
 makedumpfile.h |   6 +-
 12 files changed, 529 insertions(+), 182 deletions(-)

-- 
2.26.2


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
  2020-09-10  5:33 [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions Bhupesh Sharma
@ 2020-09-10  5:33 ` Bhupesh Sharma
  2020-09-18  5:43   ` HAGIO KAZUHITO(萩尾 一仁)
  2020-09-10  5:33 ` [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support) Bhupesh Sharma
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 13+ messages in thread
From: Bhupesh Sharma @ 2020-09-10  5:33 UTC (permalink / raw)
  To: kexec; +Cc: John Donnelly, bhsharma, bhupesh.linux, Kazuhito Hagio

This patch adds a common feature for archs (except arm64, for which
similar support is added via subsequent patch) to retrieve
'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).

I recently posted a kernel patch (see [0]) which appends
'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than
in arch-specific code, so that user-space code can also benefit from
this addition to vmcoreinfo and use it as a standard way of
determining the 'SECTIONS_SHIFT' value in the 'makedumpfile' utility.

This patch ensures backward compatibility for kernel versions in which
'MAX_PHYSMEM_BITS' is not available in vmcoreinfo.

[0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html

Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: John Donnelly <john.p.donnelly@oracle.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
---
 arch/arm.c     |  8 +++++++-
 arch/ia64.c    |  7 ++++++-
 arch/ppc.c     |  8 +++++++-
 arch/ppc64.c   | 49 ++++++++++++++++++++++++++++---------------------
 arch/s390x.c   | 29 ++++++++++++++++++-----------
 arch/sparc64.c |  9 +++++++--
 arch/x86.c     | 34 ++++++++++++++++++++--------------
 arch/x86_64.c  | 27 ++++++++++++++++-----------
 8 files changed, 109 insertions(+), 62 deletions(-)

diff --git a/arch/arm.c b/arch/arm.c
index af7442ac70bf..33536fc4dfc9 100644
--- a/arch/arm.c
+++ b/arch/arm.c
@@ -81,7 +81,13 @@ int
 get_machdep_info_arm(void)
 {
 	info->page_offset = SYMBOL(_stext) & 0xffff0000UL;
-	info->max_physmem_bits = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits = _MAX_PHYSMEM_BITS;
+
 	info->kernel_start = SYMBOL(_stext);
 	info->section_size_bits = _SECTION_SIZE_BITS;
 
diff --git a/arch/ia64.c b/arch/ia64.c
index 6c33cc7c8288..fb44dda47172 100644
--- a/arch/ia64.c
+++ b/arch/ia64.c
@@ -85,7 +85,12 @@ get_machdep_info_ia64(void)
 	}
 
 	info->section_size_bits = _SECTION_SIZE_BITS;
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
 
 	return TRUE;
 }
diff --git a/arch/ppc.c b/arch/ppc.c
index 37c6a3b60cd3..ed9447427a30 100644
--- a/arch/ppc.c
+++ b/arch/ppc.c
@@ -31,7 +31,13 @@ get_machdep_info_ppc(void)
 	unsigned long vmlist, vmap_area_list, vmalloc_start;
 
 	info->section_size_bits = _SECTION_SIZE_BITS;
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
 	info->page_offset = __PAGE_OFFSET;
 
 	if (SYMBOL(_stext) != NOT_FOUND_SYMBOL)
diff --git a/arch/ppc64.c b/arch/ppc64.c
index 9d8f2525f608..a3984eebdced 100644
--- a/arch/ppc64.c
+++ b/arch/ppc64.c
@@ -466,30 +466,37 @@ int
 set_ppc64_max_physmem_bits(void)
 {
 	long array_len = ARRAY_LENGTH(mem_section);
-	/*
-	 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
-	 * newer kernels 3.7 onwards uses 46 bits.
-	 */
-
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
-		return TRUE;
-
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
-		return TRUE;
 
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
 		return TRUE;
+	} else {
+		/*
+		 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
+		 * newer kernels 3.7 onwards uses 46 bits.
+		 */
 
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
-		return TRUE;
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+	}
 
 	return FALSE;
 }
diff --git a/arch/s390x.c b/arch/s390x.c
index bf9d58e54fb7..4d17a783e5bd 100644
--- a/arch/s390x.c
+++ b/arch/s390x.c
@@ -63,20 +63,27 @@ int
 set_s390x_max_physmem_bits(void)
 {
 	long array_len = ARRAY_LENGTH(mem_section);
-	/*
-	 * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
-	 * newer kernels uses 46 bits.
-	 */
 
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
 		return TRUE;
+	} else {
+		/*
+		 * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
+		 * newer kernels uses 46 bits.
+		 */
 
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_3;
-	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
-		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
-		return TRUE;
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_3;
+		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
+				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
+			return TRUE;
+	}
 
 	return FALSE;
 }
diff --git a/arch/sparc64.c b/arch/sparc64.c
index 1cfaa854ce6d..b93a05bdfe59 100644
--- a/arch/sparc64.c
+++ b/arch/sparc64.c
@@ -25,10 +25,15 @@ int get_versiondep_info_sparc64(void)
 {
 	info->section_size_bits = _SECTION_SIZE_BITS;
 
-	if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
 		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L4;
-	else {
+	else
 		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L3;
+
+	if (info->kernel_version < KERNEL_VERSION(3, 8, 13)) {
 		info->flag_vmemmap = TRUE;
 		info->vmemmap_start = VMEMMAP_BASE_SPARC64;
 		info->vmemmap_end = VMEMMAP_BASE_SPARC64 +
diff --git a/arch/x86.c b/arch/x86.c
index 3fdae93084b8..f1b43d4c8179 100644
--- a/arch/x86.c
+++ b/arch/x86.c
@@ -72,21 +72,27 @@ get_machdep_info_x86(void)
 {
 	unsigned long vmlist, vmap_area_list, vmalloc_start;
 
-	/* PAE */
-	if ((vt.mem_flags & MEMORY_X86_PAE)
-	    || ((SYMBOL(pkmap_count) != NOT_FOUND_SYMBOL)
-	      && (SYMBOL(pkmap_count_next) != NOT_FOUND_SYMBOL)
-	      && ((SYMBOL(pkmap_count_next)-SYMBOL(pkmap_count))/sizeof(int))
-	      == 512)) {
-		DEBUG_MSG("\n");
-		DEBUG_MSG("PAE          : ON\n");
-		vt.mem_flags |= MEMORY_X86_PAE;
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_PAE;
-	} else {
-		DEBUG_MSG("\n");
-		DEBUG_MSG("PAE          : OFF\n");
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else {
+		/* PAE */
+		if ((vt.mem_flags & MEMORY_X86_PAE)
+				|| ((SYMBOL(pkmap_count) != NOT_FOUND_SYMBOL)
+					&& (SYMBOL(pkmap_count_next) != NOT_FOUND_SYMBOL)
+					&& ((SYMBOL(pkmap_count_next)-SYMBOL(pkmap_count))/sizeof(int))
+					== 512)) {
+			DEBUG_MSG("\n");
+			DEBUG_MSG("PAE          : ON\n");
+			vt.mem_flags |= MEMORY_X86_PAE;
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_PAE;
+		} else {
+			DEBUG_MSG("\n");
+			DEBUG_MSG("PAE          : OFF\n");
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+		}
 	}
+
 	info->page_offset = __PAGE_OFFSET;
 
 	if (SYMBOL(_stext) == NOT_FOUND_SYMBOL) {
diff --git a/arch/x86_64.c b/arch/x86_64.c
index b5e295452964..10eed83df655 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -268,17 +268,22 @@ get_machdep_info_x86_64(void)
 int
 get_versiondep_info_x86_64(void)
 {
-	/*
-	 * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
-	 */
-	if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
-	else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;
-	else if(check_5level_paging())
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_5LEVEL;
-	else
-		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_31;
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	} else {
+		/*
+		 * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
+		 */
+		if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
+		else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;
+		else if(check_5level_paging())
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_5LEVEL;
+		else
+			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_31;
+	}
 
 	if (!get_page_offset_x86_64())
 		return FALSE;
-- 
2.26.2



* [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
  2020-09-10  5:33 [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions Bhupesh Sharma
  2020-09-10  5:33 ` [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available) Bhupesh Sharma
@ 2020-09-10  5:33 ` Bhupesh Sharma
  2020-09-23  5:28   ` HAGIO KAZUHITO(萩尾 一仁)
  2020-09-10  5:33 ` [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support) Bhupesh Sharma
  2020-09-24  5:23 ` [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions HAGIO KAZUHITO(萩尾 一仁)
  3 siblings, 1 reply; 13+ messages in thread
From: Bhupesh Sharma @ 2020-09-10  5:33 UTC (permalink / raw)
  To: kexec; +Cc: John Donnelly, bhsharma, bhupesh.linux, Kazuhito Hagio

The ARMv8.2-LPA architecture extension (if available on the underlying
hardware) can support 52-bit physical addresses, while the kernel
virtual addresses remain 48-bit.

Make sure that we read the 52-bit PA address capability from the
'MAX_PHYSMEM_BITS' variable (if available in vmcoreinfo) and
accordingly change the pte_to_phy() mask values and the page-table
walk.

Also make sure that it works well on existing 48-bit PA platforms and
in environments which use newer kernels with 52-bit PA support but
hardware which is not ARMv8.2-LPA compliant.

Kernel commit 1d50e5d0c505 ("crash_core, vmcoreinfo: Append
'MAX_PHYSMEM_BITS' to vmcoreinfo") already supports adding
'MAX_PHYSMEM_BITS' variable to vmcoreinfo.

This patch is in accordance with the ARMv8 Architecture Reference
Manual.

Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: John Donnelly <john.p.donnelly@oracle.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
---
 arch/arm64.c | 291 ++++++++++++++++++++++++++++++++++++---------------
 1 file changed, 204 insertions(+), 87 deletions(-)

diff --git a/arch/arm64.c b/arch/arm64.c
index 54d60b440850..709e0a506916 100644
--- a/arch/arm64.c
+++ b/arch/arm64.c
@@ -39,72 +39,185 @@ typedef struct {
 	unsigned long pte;
 } pte_t;
 
+#define __pte(x)	((pte_t) { (x) } )
+#define __pmd(x)	((pmd_t) { (x) } )
+#define __pud(x)	((pud_t) { (x) } )
+#define __pgd(x)	((pgd_t) { (x) } )
+
+static int lpa_52_bit_support_available;
 static int pgtable_level;
 static int va_bits;
 static unsigned long kimage_voffset;
 
-#define SZ_4K			(4 * 1024)
-#define SZ_16K			(16 * 1024)
-#define SZ_64K			(64 * 1024)
-#define SZ_128M			(128 * 1024 * 1024)
+#define SZ_4K			4096
+#define SZ_16K			16384
+#define SZ_64K			65536
 
-#define PAGE_OFFSET_36 ((0xffffffffffffffffUL) << 36)
-#define PAGE_OFFSET_39 ((0xffffffffffffffffUL) << 39)
-#define PAGE_OFFSET_42 ((0xffffffffffffffffUL) << 42)
-#define PAGE_OFFSET_47 ((0xffffffffffffffffUL) << 47)
-#define PAGE_OFFSET_48 ((0xffffffffffffffffUL) << 48)
+#define PAGE_OFFSET_36		((0xffffffffffffffffUL) << 36)
+#define PAGE_OFFSET_39		((0xffffffffffffffffUL) << 39)
+#define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
+#define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
+#define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
+#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
 
 #define pgd_val(x)		((x).pgd)
 #define pud_val(x)		(pgd_val((x).pgd))
 #define pmd_val(x)		(pud_val((x).pud))
 #define pte_val(x)		((x).pte)
 
-#define PAGE_MASK		(~(PAGESIZE() - 1))
-#define PGDIR_SHIFT		((PAGESHIFT() - 3) * pgtable_level + 3)
-#define PTRS_PER_PGD		(1 << (va_bits - PGDIR_SHIFT))
-#define PUD_SHIFT		get_pud_shift_arm64()
-#define PUD_SIZE		(1UL << PUD_SHIFT)
-#define PUD_MASK		(~(PUD_SIZE - 1))
-#define PTRS_PER_PTE		(1 << (PAGESHIFT() - 3))
-#define PTRS_PER_PUD		PTRS_PER_PTE
-#define PMD_SHIFT		((PAGESHIFT() - 3) * 2 + 3)
-#define PMD_SIZE		(1UL << PMD_SHIFT)
-#define PMD_MASK		(~(PMD_SIZE - 1))
+/* See 'include/uapi/linux/const.h' for definitions below */
+#define __AC(X,Y)	(X##Y)
+#define _AC(X,Y)	__AC(X,Y)
+#define _AT(T,X)	((T)(X))
+
+/* See 'include/asm/pgtable-types.h' for definitions below */
+typedef unsigned long pteval_t;
+typedef unsigned long pmdval_t;
+typedef unsigned long pudval_t;
+typedef unsigned long pgdval_t;
+
+#define PAGE_SHIFT	PAGESHIFT()
+
+/* See 'arch/arm64/include/asm/pgtable-hwdef.h' for definitions below */
+
+#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n)	((PAGE_SHIFT - 3) * (4 - (n)) + 3)
+
+#define PTRS_PER_PTE		(1 << (PAGE_SHIFT - 3))
+
+/*
+ * PMD_SHIFT determines the size a level 2 page table entry can map.
+ */
+#define PMD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
+#define PMD_SIZE		(_AC(1, UL) << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
 #define PTRS_PER_PMD		PTRS_PER_PTE
 
-#define PAGE_PRESENT		(1 << 0)
+/*
+ * PUD_SHIFT determines the size a level 1 page table entry can map.
+ */
+#define PUD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(1)
+#define PUD_SIZE		(_AC(1, UL) << PUD_SHIFT)
+#define PUD_MASK		(~(PUD_SIZE-1))
+#define PTRS_PER_PUD		PTRS_PER_PTE
+
+/*
+ * PGDIR_SHIFT determines the size a top-level page table entry can map
+ * (depending on the configuration, this level can be 0, 1 or 2).
+ */
+#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - (pgtable_level))
+#define PGDIR_SIZE		(_AC(1, UL) << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+#define PTRS_PER_PGD		(1 << ((va_bits) - PGDIR_SHIFT))
+
+/*
+ * Section address mask and size definitions.
+ */
 #define SECTIONS_SIZE_BITS	30
-/* Highest possible physical address supported */
-#define PHYS_MASK_SHIFT		48
-#define PHYS_MASK		((1UL << PHYS_MASK_SHIFT) - 1)
+
 /*
- * Remove the highest order bits that are not a part of the
- * physical address in a section
+ * Hardware page table definitions.
+ *
+ * Level 1 descriptor (PUD).
  */
 #define PMD_SECTION_MASK	((1UL << PHYS_MASK_SHIFT) - 1)
+#define PUD_TYPE_TABLE		(_AT(pudval_t, 3) << 0)
+#define PUD_TABLE_BIT		(_AT(pudval_t, 1) << 1)
+#define PUD_TYPE_MASK		(_AT(pudval_t, 3) << 0)
+#define PUD_TYPE_SECT		(_AT(pudval_t, 1) << 0)
 
-#define PMD_TYPE_MASK		3
-#define PMD_TYPE_SECT		1
-#define PMD_TYPE_TABLE		3
+/*
+ * Level 2 descriptor (PMD).
+ */
+#define PMD_TYPE_MASK		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_FAULT		(_AT(pmdval_t, 0) << 0)
+#define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
+#define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
+
+/*
+ * Level 3 descriptor (PTE).
+ */
+#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
+#define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
+
+static inline unsigned long
+get_pte_addr_mask_arm64(void)
+{
+	if (lpa_52_bit_support_available)
+		return (PTE_ADDR_LOW | PTE_ADDR_HIGH);
+	else
+		return PTE_ADDR_LOW;
+}
+
+#define PTE_ADDR_MASK		get_pte_addr_mask_arm64()
 
-#define PUD_TYPE_MASK		3
-#define PUD_TYPE_SECT		1
-#define PUD_TYPE_TABLE		3
+#define PAGE_MASK		(~(PAGESIZE() - 1))
+#define PAGE_PRESENT		(1 << 0)
 
+/* Helper API to convert between a physical address and its placement
+ * in a page table entry, taking care of 52-bit addresses.
+ */
+static inline unsigned long
+__pte_to_phys(pte_t pte)
+{
+	if (lpa_52_bit_support_available)
+		return ((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36));
+	else
+		return (pte_val(pte) & PTE_ADDR_MASK);
+}
+
+/* Find an entry in a page-table-directory */
 #define pgd_index(vaddr) 		(((vaddr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
-#define pgd_offset(pgdir, vaddr)	((pgd_t *)(pgdir) + pgd_index(vaddr))
 
-#define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
-#define pmd_page_paddr(pmd)		(pmd_val(pmd) & PHYS_MASK & (int32_t)PAGE_MASK)
-#define pte_offset(dir, vaddr) 		((pte_t*)pmd_page_paddr((*dir)) + pte_index(vaddr))
+static inline pte_t
+pgd_pte(pgd_t pgd)
+{
+	return __pte(pgd_val(pgd));
+}
 
-#define pmd_index(vaddr)		(((vaddr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
-#define pud_page_paddr(pud)		(pud_val(pud) & PHYS_MASK & (int32_t)PAGE_MASK)
-#define pmd_offset_pgtbl_lvl_2(pud, vaddr) ((pmd_t *)pud)
-#define pmd_offset_pgtbl_lvl_3(pud, vaddr) ((pmd_t *)pud_page_paddr((*pud)) + pmd_index(vaddr))
+#define __pgd_to_phys(pgd)		__pte_to_phys(pgd_pte(pgd))
+#define pgd_offset(pgd, vaddr)		((pgd_t *)(pgd) + pgd_index(vaddr))
+
+static inline pte_t pud_pte(pud_t pud)
+{
+	return __pte(pud_val(pud));
+}
 
+static inline unsigned long
+pgd_page_paddr(pgd_t pgd)
+{
+	return __pgd_to_phys(pgd);
+}
+
+/* Find an entry in the first-level page table. */
 #define pud_index(vaddr)		(((vaddr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
-#define pgd_page_paddr(pgd)		(pgd_val(pgd) & PHYS_MASK & (int32_t)PAGE_MASK)
+#define __pud_to_phys(pud)		__pte_to_phys(pud_pte(pud))
+
+static inline unsigned long
+pud_page_paddr(pud_t pud)
+{
+	return __pud_to_phys(pud);
+}
+
+/* Find an entry in the second-level page table. */
+#define pmd_index(vaddr)		(((vaddr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+
+static inline pte_t pmd_pte(pmd_t pmd)
+{
+	return __pte(pmd_val(pmd));
+}
+
+#define __pmd_to_phys(pmd)		__pte_to_phys(pmd_pte(pmd))
+
+static inline unsigned long
+pmd_page_paddr(pmd_t pmd)
+{
+	return __pmd_to_phys(pmd);
+}
+
+/* Find an entry in the third-level page table. */
+#define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
+#define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
 
 static unsigned long long
 __pa(unsigned long vaddr)
@@ -116,32 +229,22 @@ __pa(unsigned long vaddr)
 		return (vaddr - kimage_voffset);
 }
 
-static int
-get_pud_shift_arm64(void)
+static pud_t *
+pud_offset(pgd_t *pgda, pgd_t *pgdv, unsigned long vaddr)
 {
-	if (pgtable_level == 4)
-		return ((PAGESHIFT() - 3) * 3 + 3);
+	if (pgtable_level > 3)
+		return (pud_t *)(pgd_page_paddr(*pgdv) + pud_index(vaddr) * sizeof(pud_t));
 	else
-		return PGDIR_SHIFT;
+		return (pud_t *)(pgda);
 }
 
 static pmd_t *
 pmd_offset(pud_t *puda, pud_t *pudv, unsigned long vaddr)
 {
-	if (pgtable_level == 2) {
-		return pmd_offset_pgtbl_lvl_2(puda, vaddr);
-	} else {
-		return pmd_offset_pgtbl_lvl_3(pudv, vaddr);
-	}
-}
-
-static pud_t *
-pud_offset(pgd_t *pgda, pgd_t *pgdv, unsigned long vaddr)
-{
-	if (pgtable_level == 4)
-		return ((pud_t *)pgd_page_paddr((*pgdv)) + pud_index(vaddr));
+	if (pgtable_level > 2)
+		return (pmd_t *)(pud_page_paddr(*pudv) + pmd_index(vaddr) * sizeof(pmd_t));
 	else
-		return (pud_t *)(pgda);
+		return (pmd_t*)(puda);
 }
 
 static int calculate_plat_config(void)
@@ -246,6 +349,14 @@ get_stext_symbol(void)
 int
 get_machdep_info_arm64(void)
 {
+	/* Determine if the PA address range is 52-bits: ARMv8.2-LPA */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+		if (info->max_physmem_bits == 52)
+			lpa_52_bit_support_available = 1;
+	} else
+		info->max_physmem_bits = 48;
+
 	/* Check if va_bits is still not initialized. If still 0, call
 	 * get_versiondep_info() to initialize the same.
 	 */
@@ -258,12 +369,11 @@ get_machdep_info_arm64(void)
 	}
 
 	kimage_voffset = NUMBER(kimage_voffset);
-	info->max_physmem_bits = PHYS_MASK_SHIFT;
 	info->section_size_bits = SECTIONS_SIZE_BITS;
 
 	DEBUG_MSG("kimage_voffset   : %lx\n", kimage_voffset);
-	DEBUG_MSG("max_physmem_bits : %lx\n", info->max_physmem_bits);
-	DEBUG_MSG("section_size_bits: %lx\n", info->section_size_bits);
+	DEBUG_MSG("max_physmem_bits : %ld\n", info->max_physmem_bits);
+	DEBUG_MSG("section_size_bits: %ld\n", info->section_size_bits);
 
 	return TRUE;
 }
@@ -321,6 +431,19 @@ get_versiondep_info_arm64(void)
 	return TRUE;
 }
 
+/* 1GB section for Page Table level = 4 and Page Size = 4KB */
+static int
+is_pud_sect(pud_t pud)
+{
+	return ((pud_val(pud) & PUD_TYPE_MASK) == PUD_TYPE_SECT);
+}
+
+static int
+is_pmd_sect(pmd_t pmd)
+{
+	return ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT);
+}
+
 /*
  * vaddr_to_paddr_arm64() - translate arbitrary virtual address to physical
  * @vaddr: virtual address to translate
@@ -358,10 +481,9 @@ vaddr_to_paddr_arm64(unsigned long vaddr)
 		return NOT_PADDR;
 	}
 
-	if ((pud_val(pudv) & PUD_TYPE_MASK) == PUD_TYPE_SECT) {
-		/* 1GB section for Page Table level = 4 and Page Size = 4KB */
-		paddr = (pud_val(pudv) & (PUD_MASK & PMD_SECTION_MASK))
-					+ (vaddr & (PUD_SIZE - 1));
+	if (is_pud_sect(pudv)) {
+		paddr = (pud_page_paddr(pudv) & PUD_MASK) +
+				(vaddr & (PUD_SIZE - 1));
 		return paddr;
 	}
 
@@ -371,29 +493,24 @@ vaddr_to_paddr_arm64(unsigned long vaddr)
 		return NOT_PADDR;
 	}
 
-	switch (pmd_val(pmdv) & PMD_TYPE_MASK) {
-	case PMD_TYPE_TABLE:
-		ptea = pte_offset(&pmdv, vaddr);
-		/* 64k page */
-		if (!readmem(PADDR, (unsigned long long)ptea, &ptev, sizeof(ptev))) {
-			ERRMSG("Can't read pte\n");
-			return NOT_PADDR;
-		}
+	if (is_pmd_sect(pmdv)) {
+		paddr = (pmd_page_paddr(pmdv) & PMD_MASK) +
+				(vaddr & (PMD_SIZE - 1));
+		return paddr;
+	}
 
-		if (!(pte_val(ptev) & PAGE_PRESENT)) {
-			ERRMSG("Can't get a valid pte.\n");
-			return NOT_PADDR;
-		} else {
+	ptea = (pte_t *)pte_offset(&pmdv, vaddr);
+	if (!readmem(PADDR, (unsigned long long)ptea, &ptev, sizeof(ptev))) {
+		ERRMSG("Can't read pte\n");
+		return NOT_PADDR;
+	}
 
-			paddr = (PAGEBASE(pte_val(ptev)) & PHYS_MASK)
-					+ (vaddr & (PAGESIZE() - 1));
-		}
-		break;
-	case PMD_TYPE_SECT:
-		/* 512MB section for Page Table level = 3 and Page Size = 64KB*/
-		paddr = (pmd_val(pmdv) & (PMD_MASK & PMD_SECTION_MASK))
-					+ (vaddr & (PMD_SIZE - 1));
-		break;
+	if (!(pte_val(ptev) & PAGE_PRESENT)) {
+		ERRMSG("Can't get a valid pte.\n");
+		return NOT_PADDR;
+	} else {
+		paddr = __pte_to_phys(ptev) +
+				(vaddr & (PAGESIZE() - 1));
 	}
 
 	return paddr;
-- 
2.26.2



* [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)
  2020-09-10  5:33 [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions Bhupesh Sharma
  2020-09-10  5:33 ` [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available) Bhupesh Sharma
  2020-09-10  5:33 ` [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support) Bhupesh Sharma
@ 2020-09-10  5:33 ` Bhupesh Sharma
  2020-09-24  5:05   ` HAGIO KAZUHITO(萩尾 一仁)
  2020-09-24  5:23 ` [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions HAGIO KAZUHITO(萩尾 一仁)
  3 siblings, 1 reply; 13+ messages in thread
From: Bhupesh Sharma @ 2020-09-10  5:33 UTC (permalink / raw)
  To: kexec; +Cc: John Donnelly, bhsharma, bhupesh.linux, Kazuhito Hagio

With the ARMv8.2-LVA architecture extension, arm64 hardware which
supports this extension can support up to 52-bit virtual addresses.
It is especially useful for having a 52-bit user-space virtual address
space while the kernel can still retain 48-bit/52-bit virtual
addressing.

Since, at the moment, we enable support for this extension in the
kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), there is no clear
mechanism in user-space to determine this CONFIG flag value and use it
to determine the kernel-space VA address range values.

'makedumpfile' can instead use the 'TCR_EL1.T1SZ' value from vmcoreinfo,
which indicates the size offset of the memory region addressed by
TTBR1_EL1 (and hence can be used for determining the
vabits_actual value).

Using the vmcoreinfo variable exported by kernel commit
bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo"),
user-space can use the following computation to determine whether
an address lies in the linear map range (for newer kernels, >= 5.4):

  #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))

Note that for the --mem-usage case we need to calculate the
vabits_actual value before the vmcoreinfo read functionality is
ready, so we instead read the architecture register ID_AA64MMFR2_EL1
directly to see whether the underlying hardware supports 52-bit
addressing, and set vabits_actual accordingly:

   read_id_aa64mmfr2_el1();
   if (hardware supports 52-bit addressing)
	vabits_actual = 52;
   else
	vabits_actual = va_bits value calculated via _stext symbol;

Also make sure that the page_offset, is_linear_addr(addr) and __pa()
calculations work both for older (< 5.4) and newer kernels (>= 5.4).

I have tested several combinations with both kernel categories
[e.g. with different VA (39, 42, 48 and 52-bit) and PA
(48 and 52-bit) combinations] on at least 3 different boards.

Unfortunately, this means that we need to call 'populate_kernel_version()'
before 'get_page_offset_arm64()', as 'info->kernel_version' would
otherwise remain uninitialized before its first use.

This patch is in accordance with the ARMv8 Architecture Reference Manual.

Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
Cc: John Donnelly <john.p.donnelly@oracle.com>
Cc: kexec@lists.infradead.org
Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
---
 arch/arm64.c   | 233 ++++++++++++++++++++++++++++++++++++++++++-------
 common.h       |  10 +++
 makedumpfile.c |   4 +-
 makedumpfile.h |   6 +-
 4 files changed, 218 insertions(+), 35 deletions(-)

diff --git a/arch/arm64.c b/arch/arm64.c
index 709e0a506916..ccaa8641ca66 100644
--- a/arch/arm64.c
+++ b/arch/arm64.c
@@ -19,10 +19,23 @@
 
 #ifdef __aarch64__
 
+#include <asm/hwcap.h>
+#include <sys/auxv.h>
 #include "../elf_info.h"
 #include "../makedumpfile.h"
 #include "../print_info.h"
 
+/* ID_AA64MMFR2_EL1 related helpers: */
+#define ID_AA64MMFR2_LVA_SHIFT	16
+#define ID_AA64MMFR2_LVA_MASK	(0xf << ID_AA64MMFR2_LVA_SHIFT)
+
+/* CPU feature ID registers */
+#define get_cpu_ftr(id) ({							\
+		unsigned long __val;						\
+		asm volatile("mrs %0, " __stringify(id) : "=r" (__val));	\
+		__val;								\
+})
+
 typedef struct {
 	unsigned long pgd;
 } pgd_t;
@@ -47,6 +60,7 @@ typedef struct {
 static int lpa_52_bit_support_available;
 static int pgtable_level;
 static int va_bits;
+static int vabits_actual;
 static unsigned long kimage_voffset;
 
 #define SZ_4K			4096
@@ -58,7 +72,6 @@ static unsigned long kimage_voffset;
 #define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
 #define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
 #define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
-#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
 
 #define pgd_val(x)		((x).pgd)
 #define pud_val(x)		(pgd_val((x).pgd))
@@ -219,13 +232,25 @@ pmd_page_paddr(pmd_t pmd)
 #define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
 #define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
 
+/*
+ * The linear kernel range starts at the bottom of the virtual address
+ * space. Testing the top bit for the start of the region is a
+ * sufficient check and avoids having to worry about the tag.
+ */
+#define is_linear_addr(addr)	((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
+	(!!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))) : \
+	(!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))))
+
 static unsigned long long
 __pa(unsigned long vaddr)
 {
 	if (kimage_voffset == NOT_FOUND_NUMBER ||
-			(vaddr >= PAGE_OFFSET))
-		return (vaddr - PAGE_OFFSET + info->phys_base);
-	else
+			is_linear_addr(vaddr)) {
+		if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
+			return ((vaddr & ~PAGE_OFFSET) + info->phys_base);
+		else
+			return (vaddr + info->phys_base - PAGE_OFFSET);
+	} else
 		return (vaddr - kimage_voffset);
 }
 
@@ -254,6 +279,7 @@ static int calculate_plat_config(void)
 			(PAGESIZE() == SZ_64K && va_bits == 42)) {
 		pgtable_level = 2;
 	} else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
+			(PAGESIZE() == SZ_64K && va_bits == 52) ||
 			(PAGESIZE() == SZ_4K && va_bits == 39) ||
 			(PAGESIZE() == SZ_16K && va_bits == 47)) {
 		pgtable_level = 3;
@@ -288,8 +314,14 @@ get_phys_base_arm64(void)
 		return TRUE;
 	}
 
+	/* Ignore the 1st PT_LOAD */
 	if (get_num_pt_loads() && PAGE_OFFSET) {
-		for (i = 0;
+		/* Note that the following loop starts with i = 1.
+		 * This is required to make sure that the following logic
+		 * works both for old and newer kernels (with flipped
+		 * VA space, i.e. >= 5.4.0)
+		 */
+		for (i = 1;
 		    get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
 		    i++) {
 			if (virt_start != NOT_KV_ADDR
@@ -346,6 +378,139 @@ get_stext_symbol(void)
 	return(found ? kallsym : FALSE);
 }
 
+static int
+get_va_bits_from_stext_arm64(void)
+{
+	ulong _stext;
+
+	_stext = get_stext_symbol();
+	if (!_stext) {
+		ERRMSG("Can't get the symbol of _stext.\n");
+		return FALSE;
+	}
+
+	/* Derive va_bits as per arch/arm64/Kconfig. Note that this is a
+	 * best-case approximation at the moment, as there can be
+	 * inconsistencies in this calculation (e.g. in the
+	 * 52-bit kernel VA case, the 48th bit is set in
+	 * the _stext symbol).
+	 *
+	 * So, we need to rely on the vabits_actual symbol in the
+	 * vmcoreinfo, or read it via a system register, for an accurate
+	 * value of the virtual addressing supported by the underlying kernel.
+	 */
+	if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
+		va_bits = 48;
+	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
+		va_bits = 47;
+	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
+		va_bits = 42;
+	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
+		va_bits = 39;
+	} else if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
+		va_bits = 36;
+	} else {
+		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
+		return FALSE;
+	}
+
+	DEBUG_MSG("va_bits       : %d (approximation via _stext)\n", va_bits);
+
+	return TRUE;
+}
+
+/* Note that the ID_AA64MMFR2_EL1 architecture register
+ * can be read only when we give an .arch hint to the
+ * gcc/binutils, so we use the gcc construct
+ * '__attribute__ ((target ("arch=armv8.2-a")))' here,
+ * which is an .arch directive (see AArch64-Target-selection-directives
+ * documentation from ARM for details). This is required only for
+ * this function, to make sure it compiles well with gcc/binutils.
+ */
+__attribute__ ((target ("arch=armv8.2-a")))
+static unsigned long
+read_id_aa64mmfr2_el1(void)
+{
+	return get_cpu_ftr(ID_AA64MMFR2_EL1);
+}
+
+static int
+get_vabits_actual_from_id_aa64mmfr2_el1(void)
+{
+	int l_vabits_actual;
+	unsigned long val;
+
+	/* Check if ID_AA64MMFR2_EL1 CPU-ID register indicates
+	 * ARMv8.2/LVA support:
+	 * VARange, bits [19:16]
+	 *   From ARMv8.2:
+	 *   Indicates support for a larger virtual address.
+	 *   Defined values are:
+	 *     0b0000 VMSAv8-64 supports 48-bit VAs.
+	 *     0b0001 VMSAv8-64 supports 52-bit VAs when using the 64KB
+	 *            page size. The other translation granules support
+	 *            48-bit VAs.
+	 *
+	 * See ARMv8 ARM for more details.
+	 */
+	if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
+		ERRMSG("arm64 CPUID registers unavailable.\n");
+		return ERROR;
+	}
+
+	val = read_id_aa64mmfr2_el1();
+	val = (val & ID_AA64MMFR2_LVA_MASK) >> ID_AA64MMFR2_LVA_SHIFT;
+
+	if ((val == 0x1) && (PAGESIZE() == SZ_64K))
+		l_vabits_actual = 52;
+	else
+		l_vabits_actual = 48;
+
+	return l_vabits_actual;
+}
+
+static void
+get_page_offset_arm64(void)
+{
+	/* Check if 'vabits_actual' is initialized yet.
+	 * If not, our best bet is to read ID_AA64MMFR2_EL1 CPU-ID
+	 * register.
+	 */
+	if (!vabits_actual) {
+		vabits_actual = get_vabits_actual_from_id_aa64mmfr2_el1();
+		if ((vabits_actual == ERROR) || (vabits_actual != 52)) {
+			/* If we cannot read ID_AA64MMFR2_EL1 arch
+			 * register or if this register does not indicate
+			 * support for a larger virtual address, our last
+			 * option is to use the VA_BITS to calculate the
+			 * PAGE_OFFSET value, i.e. vabits_actual = VA_BITS.
+			 */
+			vabits_actual = va_bits;
+			DEBUG_MSG("vabits_actual : %d (approximation via va_bits)\n",
+					vabits_actual);
+		} else
+			DEBUG_MSG("vabits_actual : %d (via id_aa64mmfr2_el1)\n",
+					vabits_actual);
+	}
+
+	if (!populate_kernel_version()) {
+		ERRMSG("Cannot get information about current kernel\n");
+		return;
+	}
+
+	/* See arch/arm64/include/asm/memory.h for more details of
+	 * the PAGE_OFFSET calculation.
+	 */
+	if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
+		info->page_offset = ((0xffffffffffffffffUL) -
+				((1UL) << (vabits_actual - 1)) + 1);
+	else
+		info->page_offset = (-(1UL << vabits_actual));
+
+	DEBUG_MSG("page_offset   : %lx (via vabits_actual)\n",
+			info->page_offset);
+}
+
 int
 get_machdep_info_arm64(void)
 {
@@ -360,8 +525,33 @@ get_machdep_info_arm64(void)
 	/* Check if va_bits is still not initialized. If still 0, call
 	 * get_versiondep_info() to initialize the same.
 	 */
+	if (NUMBER(VA_BITS) != NOT_FOUND_NUMBER) {
+		va_bits = NUMBER(VA_BITS);
+		DEBUG_MSG("va_bits       : %d (vmcoreinfo)\n",
+				va_bits);
+	}
+
+	/* Check if va_bits is still not initialized. If still 0, call
+	 * get_versiondep_info() to initialize the same from _stext
+	 * symbol.
+	 */
 	if (!va_bits)
-		get_versiondep_info_arm64();
+		if (get_va_bits_from_stext_arm64() == FALSE)
+			return FALSE;
+
+	/* See TCR_EL1, Translation Control Register (EL1) register
+	 * description in the ARMv8 Architecture Reference Manual.
+	 * Basically, we can use the TCR_EL1.T1SZ
+	 * value to determine the virtual addressing range supported
+	 * in the kernel-space (i.e. vabits_actual).
+	 */
+	if (NUMBER(TCR_EL1_T1SZ) != NOT_FOUND_NUMBER) {
+		vabits_actual = 64 - NUMBER(TCR_EL1_T1SZ);
+		DEBUG_MSG("vabits_actual : %d (vmcoreinfo)\n",
+				vabits_actual);
+	}
+
+	get_page_offset_arm64();
 
 	if (!calculate_plat_config()) {
 		ERRMSG("Can't determine platform config values\n");
@@ -399,34 +589,11 @@ get_xen_info_arm64(void)
 int
 get_versiondep_info_arm64(void)
 {
-	ulong _stext;
-
-	_stext = get_stext_symbol();
-	if (!_stext) {
-		ERRMSG("Can't get the symbol of _stext.\n");
-		return FALSE;
-	}
-
-	/* Derive va_bits as per arch/arm64/Kconfig */
-	if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
-		va_bits = 36;
-	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
-		va_bits = 39;
-	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
-		va_bits = 42;
-	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
-		va_bits = 47;
-	} else if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
-		va_bits = 48;
-	} else {
-		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
-		return FALSE;
-	}
-
-	info->page_offset = (0xffffffffffffffffUL) << (va_bits - 1);
+	if (!va_bits)
+		if (get_va_bits_from_stext_arm64() == FALSE)
+			return FALSE;
 
-	DEBUG_MSG("va_bits      : %d\n", va_bits);
-	DEBUG_MSG("page_offset  : %lx\n", info->page_offset);
+	get_page_offset_arm64();
 
 	return TRUE;
 }
diff --git a/common.h b/common.h
index 6e2f657a79c7..1901df195e9d 100644
--- a/common.h
+++ b/common.h
@@ -50,5 +50,15 @@
 #define NOT_PADDR	(ULONGLONG_MAX)
 #define BADADDR  	((ulong)(-1))
 
+/* Indirect stringification.  Doing two levels allows the parameter to be a
+ * macro itself.  For example, compile with -DFOO=bar, __stringify(FOO)
+ * converts to "bar".
+ *
+ * Copied from linux source: 'include/linux/stringify.h'
+ */
+
+#define __stringify_1(x...)	#x
+#define __stringify(x...)	__stringify_1(x)
+
 #endif  /* COMMON_H */
 
diff --git a/makedumpfile.c b/makedumpfile.c
index 4c4251ea8719..5ab82fd3cf14 100644
--- a/makedumpfile.c
+++ b/makedumpfile.c
@@ -1133,7 +1133,7 @@ fallback_to_current_page_size(void)
 	return TRUE;
 }
 
-static int populate_kernel_version(void)
+int populate_kernel_version(void)
 {
 	struct utsname utsname;
 
@@ -2323,6 +2323,7 @@ write_vmcoreinfo_data(void)
 	WRITE_NUMBER("HUGETLB_PAGE_DTOR", HUGETLB_PAGE_DTOR);
 #ifdef __aarch64__
 	WRITE_NUMBER("VA_BITS", VA_BITS);
+	WRITE_NUMBER_UNSIGNED("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
 	WRITE_NUMBER_UNSIGNED("PHYS_OFFSET", PHYS_OFFSET);
 	WRITE_NUMBER_UNSIGNED("kimage_voffset", kimage_voffset);
 #endif
@@ -2729,6 +2730,7 @@ read_vmcoreinfo(void)
 	READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
 #ifdef __aarch64__
 	READ_NUMBER("VA_BITS", VA_BITS);
+	READ_NUMBER_UNSIGNED("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
 	READ_NUMBER_UNSIGNED("PHYS_OFFSET", PHYS_OFFSET);
 	READ_NUMBER_UNSIGNED("kimage_voffset", kimage_voffset);
 #endif
diff --git a/makedumpfile.h b/makedumpfile.h
index 03fb4ce06872..dc65f002bad6 100644
--- a/makedumpfile.h
+++ b/makedumpfile.h
@@ -974,7 +974,9 @@ unsigned long long vaddr_to_paddr_arm64(unsigned long vaddr);
 int get_versiondep_info_arm64(void);
 int get_xen_basic_info_arm64(void);
 int get_xen_info_arm64(void);
-#define paddr_to_vaddr_arm64(X) (((X) - info->phys_base) | PAGE_OFFSET)
+#define paddr_to_vaddr_arm64(X) ((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
+				 ((X) - (info->phys_base - PAGE_OFFSET)) :		\
+				 (((X) - info->phys_base) | PAGE_OFFSET))
 
 #define find_vmemmap()		stub_false()
 #define vaddr_to_paddr(X)	vaddr_to_paddr_arm64(X)
@@ -1938,6 +1940,7 @@ struct number_table {
 	long	KERNEL_IMAGE_SIZE;
 #ifdef __aarch64__
 	long 	VA_BITS;
+	unsigned long	TCR_EL1_T1SZ;
 	unsigned long	PHYS_OFFSET;
 	unsigned long	kimage_voffset;
 #endif
@@ -2389,5 +2392,6 @@ ulong htol(char *s, int flags);
 int hexadecimal(char *s, int count);
 int decimal(char *s, int count);
 int file_exists(char *file);
+int populate_kernel_version(void);
 
 #endif /* MAKEDUMPFILE_H */
-- 
2.26.2




* RE: [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
  2020-09-10  5:33 ` [PATCH v5 1/3] tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available) Bhupesh Sharma
@ 2020-09-18  5:43   ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 0 replies; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-09-18  5:43 UTC (permalink / raw)
  To: Bhupesh Sharma, kexec; +Cc: John Donnelly, bhupesh.linux

Hi Bhupesh,

-----Original Message-----
> This patch adds a common feature for archs (except arm64, for which
> similar support is added via subsequent patch) to retrieve
> 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
> 
> I recently posted a kernel patch (see [0]) which appends
> 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than
> in arch-specific code, so that user-space code can also benefit from
> this addition to the vmcoreinfo and use it as a standard way of
> determining 'SECTIONS_SHIFT' value in 'makedumpfile' utility.
> 
> This patch ensures backward compatibility for kernel versions in which
> 'MAX_PHYSMEM_BITS' is not available in vmcoreinfo.
> 
> [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023960.html

This archive page appears to be gone.  I modified the commit message.
Please see comments below.

> 
> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> Cc: John Donnelly <john.p.donnelly@oracle.com>
> Cc: kexec@lists.infradead.org
> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> ---
>  arch/arm.c     |  8 +++++++-
>  arch/ia64.c    |  7 ++++++-
>  arch/ppc.c     |  8 +++++++-
>  arch/ppc64.c   | 49 ++++++++++++++++++++++++++++---------------------
>  arch/s390x.c   | 29 ++++++++++++++++++-----------
>  arch/sparc64.c |  9 +++++++--
>  arch/x86.c     | 34 ++++++++++++++++++++--------------
>  arch/x86_64.c  | 27 ++++++++++++++++-----------
>  8 files changed, 109 insertions(+), 62 deletions(-)
> 
> diff --git a/arch/arm.c b/arch/arm.c
> index af7442ac70bf..33536fc4dfc9 100644
> --- a/arch/arm.c
> +++ b/arch/arm.c
> @@ -81,7 +81,13 @@ int
>  get_machdep_info_arm(void)
>  {
>  	info->page_offset = SYMBOL(_stext) & 0xffff0000UL;
> -	info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> +
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	else
> +		info->max_physmem_bits = _MAX_PHYSMEM_BITS;
> +
>  	info->kernel_start = SYMBOL(_stext);
>  	info->section_size_bits = _SECTION_SIZE_BITS;
> 
> diff --git a/arch/ia64.c b/arch/ia64.c
> index 6c33cc7c8288..fb44dda47172 100644
> --- a/arch/ia64.c
> +++ b/arch/ia64.c
> @@ -85,7 +85,12 @@ get_machdep_info_ia64(void)
>  	}
> 
>  	info->section_size_bits = _SECTION_SIZE_BITS;
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> +
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	else
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> 
>  	return TRUE;
>  }
> diff --git a/arch/ppc.c b/arch/ppc.c
> index 37c6a3b60cd3..ed9447427a30 100644
> --- a/arch/ppc.c
> +++ b/arch/ppc.c
> @@ -31,7 +31,13 @@ get_machdep_info_ppc(void)
>  	unsigned long vmlist, vmap_area_list, vmalloc_start;
> 
>  	info->section_size_bits = _SECTION_SIZE_BITS;
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> +
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	else
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> +
>  	info->page_offset = __PAGE_OFFSET;
> 
>  	if (SYMBOL(_stext) != NOT_FOUND_SYMBOL)
> diff --git a/arch/ppc64.c b/arch/ppc64.c
> index 9d8f2525f608..a3984eebdced 100644
> --- a/arch/ppc64.c
> +++ b/arch/ppc64.c
> @@ -466,30 +466,37 @@ int
>  set_ppc64_max_physmem_bits(void)
>  {
>  	long array_len = ARRAY_LENGTH(mem_section);
> -	/*
> -	 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> -	 * newer kernels 3.7 onwards uses 46 bits.
> -	 */
> -
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> -		return TRUE;
> -
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> -		return TRUE;
> 
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
>  		return TRUE;
> +	} else {
> +		/*
> +		 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
> +		 * newer kernels 3.7 onwards uses 46 bits.
> +		 */
> 
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> -		return TRUE;
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_7;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_19;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_4_20;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +	}
> 
>  	return FALSE;
>  }

That else block is not needed?  I will replace this hunk with:

--- a/arch/ppc64.c
+++ b/arch/ppc64.c
@@ -466,6 +466,13 @@ int
 set_ppc64_max_physmem_bits(void)
 {
        long array_len = ARRAY_LENGTH(mem_section);
+
+       /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+       if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+               info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+               return TRUE;
+       }
+
        /*
         * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
         * newer kernels 3.7 onwards uses 46 bits.


> diff --git a/arch/s390x.c b/arch/s390x.c
> index bf9d58e54fb7..4d17a783e5bd 100644
> --- a/arch/s390x.c
> +++ b/arch/s390x.c
> @@ -63,20 +63,27 @@ int
>  set_s390x_max_physmem_bits(void)
>  {
>  	long array_len = ARRAY_LENGTH(mem_section);
> -	/*
> -	 * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
> -	 * newer kernels uses 46 bits.
> -	 */
> 
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
>  		return TRUE;
> +	} else {
> +		/*
> +		 * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
> +		 * newer kernels uses 46 bits.
> +		 */
> 
> -	info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_3;
> -	if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> -		|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> -		return TRUE;
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG ;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +
> +		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_3_3;
> +		if ((array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT_EXTREME()))
> +				|| (array_len == (NR_MEM_SECTIONS() / _SECTIONS_PER_ROOT())))
> +			return TRUE;
> +	}
> 
>  	return FALSE;
>  }

Ditto.

--- a/arch/s390x.c
+++ b/arch/s390x.c
@@ -63,6 +63,13 @@ int
 set_s390x_max_physmem_bits(void)
 {
        long array_len = ARRAY_LENGTH(mem_section);
+
+       /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+       if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+               info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+               return TRUE;
+       }
+
        /*
         * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
         * newer kernels uses 46 bits.


> diff --git a/arch/sparc64.c b/arch/sparc64.c
> index 1cfaa854ce6d..b93a05bdfe59 100644
> --- a/arch/sparc64.c
> +++ b/arch/sparc64.c
> @@ -25,10 +25,15 @@ int get_versiondep_info_sparc64(void)
>  {
>  	info->section_size_bits = _SECTION_SIZE_BITS;
> 
> -	if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	else if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
>  		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L4;
> -	else {
> +	else
>  		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L3;
> +
> +	if (info->kernel_version < KERNEL_VERSION(3, 8, 13)) {
>  		info->flag_vmemmap = TRUE;
>  		info->vmemmap_start = VMEMMAP_BASE_SPARC64;
>  		info->vmemmap_end = VMEMMAP_BASE_SPARC64 +
> diff --git a/arch/x86.c b/arch/x86.c
> index 3fdae93084b8..f1b43d4c8179 100644
> --- a/arch/x86.c
> +++ b/arch/x86.c
> @@ -72,21 +72,27 @@ get_machdep_info_x86(void)
>  {
>  	unsigned long vmlist, vmap_area_list, vmalloc_start;
> 
> -	/* PAE */
> -	if ((vt.mem_flags & MEMORY_X86_PAE)
> -	    || ((SYMBOL(pkmap_count) != NOT_FOUND_SYMBOL)
> -	      && (SYMBOL(pkmap_count_next) != NOT_FOUND_SYMBOL)
> -	      && ((SYMBOL(pkmap_count_next)-SYMBOL(pkmap_count))/sizeof(int))
> -	      == 512)) {
> -		DEBUG_MSG("\n");
> -		DEBUG_MSG("PAE          : ON\n");
> -		vt.mem_flags |= MEMORY_X86_PAE;
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_PAE;
> -	} else {
> -		DEBUG_MSG("\n");
> -		DEBUG_MSG("PAE          : OFF\n");
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	else {
> +		/* PAE */
> +		if ((vt.mem_flags & MEMORY_X86_PAE)
> +				|| ((SYMBOL(pkmap_count) != NOT_FOUND_SYMBOL)
> +					&& (SYMBOL(pkmap_count_next) != NOT_FOUND_SYMBOL)
> +					&& ((SYMBOL(pkmap_count_next)-SYMBOL(pkmap_count))/sizeof(int))
> +					== 512)) {
> +			DEBUG_MSG("\n");
> +			DEBUG_MSG("PAE          : ON\n");
> +			vt.mem_flags |= MEMORY_X86_PAE;
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_PAE;
> +		} else {
> +			DEBUG_MSG("\n");
> +			DEBUG_MSG("PAE          : OFF\n");
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
> +		}

If we can get MAX_PHYSMEM_BITS from vmcoreinfo,
1. No "PAE : ON " or "PAE : OFF" debug message
2. No vt.mem_flags |= MEMORY_X86_PAE if pkmap check is true.

The latter will not occur with MAX_PHYSMEM_BITS in vmcoreinfo,
but for simplicity, I'd like to replace the hunk with the following.

--- a/arch/x86.c
+++ b/arch/x86.c
@@ -87,6 +87,11 @@ get_machdep_info_x86(void)
                DEBUG_MSG("PAE          : OFF\n");
                info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
        }
+
+       /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+       if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+               info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+
        info->page_offset = __PAGE_OFFSET;
 
        if (SYMBOL(_stext) == NOT_FOUND_SYMBOL) {



>  	}
> +
>  	info->page_offset = __PAGE_OFFSET;
> 
>  	if (SYMBOL(_stext) == NOT_FOUND_SYMBOL) {
> diff --git a/arch/x86_64.c b/arch/x86_64.c
> index b5e295452964..10eed83df655 100644
> --- a/arch/x86_64.c
> +++ b/arch/x86_64.c
> @@ -268,17 +268,22 @@ get_machdep_info_x86_64(void)
>  int
>  get_versiondep_info_x86_64(void)
>  {
> -	/*
> -	 * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> -	 */
> -	if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
> -	else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;
> -	else if(check_5level_paging())
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_5LEVEL;
> -	else
> -		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_31;
> +	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +	} else {
> +		/*
> +		 * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
> +		 */
> +		if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
> +		else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;
> +		else if(check_5level_paging())
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_5LEVEL;
> +		else
> +			info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_31;
> +	}

Ditto.

--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -268,10 +268,10 @@ get_machdep_info_x86_64(void)
 int
 get_versiondep_info_x86_64(void)
 {
-       /*
-        * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
-        */
-       if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
+       /* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+       if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+               info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+       else if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
                info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
        else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
                info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;

> 
>  	if (!get_page_offset_x86_64())
>  		return FALSE;
> --
> 2.26.2



So I modified the above hunks, could you check the following?
And I'm going to merge this patch separately from the 2/3 and 3/3 patch,
because those patches don't depend on this patch.


From 93d32474eb18f955bb57cdc2b63dd2cb447a33fc Mon Sep 17 00:00:00 2001
From: Bhupesh Sharma <bhsharma@redhat.com>
Date: Thu, 10 Sep 2020 11:03:03 +0530
Subject: [PATCH] tree-wide: Retrieve MAX_PHYSMEM_BITS from vmcoreinfo

Add a common feature for architectures (except arm64, for which
similar support is added via a subsequent patch) to retrieve
MAX_PHYSMEM_BITS from vmcoreinfo, which was added by kernel commit
1d50e5d0c505 ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS'
to vmcoreinfo").  This makes makedumpfile adaptable for future
MAX_PHYSMEM_BITS changes.

Also ensure backward compatibility for kernel versions in which
MAX_PHYSMEM_BITS is not available in vmcoreinfo.

Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
Signed-off-by: Kazuhito Hagio <k-hagio-ab@nec.com>
---
 arch/arm.c     | 8 +++++++-
 arch/ia64.c    | 7 ++++++-
 arch/ppc.c     | 8 +++++++-
 arch/ppc64.c   | 7 +++++++
 arch/s390x.c   | 7 +++++++
 arch/sparc64.c | 9 +++++++--
 arch/x86.c     | 5 +++++
 arch/x86_64.c  | 8 ++++----
 8 files changed, 50 insertions(+), 9 deletions(-)

diff --git a/arch/arm.c b/arch/arm.c
index af7442ac70bf..33536fc4dfc9 100644
--- a/arch/arm.c
+++ b/arch/arm.c
@@ -81,7 +81,13 @@ int
 get_machdep_info_arm(void)
 {
 	info->page_offset = SYMBOL(_stext) & 0xffff0000UL;
-	info->max_physmem_bits = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits = _MAX_PHYSMEM_BITS;
+
 	info->kernel_start = SYMBOL(_stext);
 	info->section_size_bits = _SECTION_SIZE_BITS;
 
diff --git a/arch/ia64.c b/arch/ia64.c
index 6c33cc7c8288..fb44dda47172 100644
--- a/arch/ia64.c
+++ b/arch/ia64.c
@@ -85,7 +85,12 @@ get_machdep_info_ia64(void)
 	}
 
 	info->section_size_bits = _SECTION_SIZE_BITS;
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
 
 	return TRUE;
 }
diff --git a/arch/ppc.c b/arch/ppc.c
index 37c6a3b60cd3..ed9447427a30 100644
--- a/arch/ppc.c
+++ b/arch/ppc.c
@@ -31,7 +31,13 @@ get_machdep_info_ppc(void)
 	unsigned long vmlist, vmap_area_list, vmalloc_start;
 
 	info->section_size_bits = _SECTION_SIZE_BITS;
-	info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else
+		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
+
 	info->page_offset = __PAGE_OFFSET;
 
 	if (SYMBOL(_stext) != NOT_FOUND_SYMBOL)
diff --git a/arch/ppc64.c b/arch/ppc64.c
index 9d8f2525f608..5e70acb51aba 100644
--- a/arch/ppc64.c
+++ b/arch/ppc64.c
@@ -466,6 +466,13 @@ int
 set_ppc64_max_physmem_bits(void)
 {
 	long array_len = ARRAY_LENGTH(mem_section);
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+		return TRUE;
+	}
+
 	/*
 	 * The older ppc64 kernels uses _MAX_PHYSMEM_BITS as 42 and the
 	 * newer kernels 3.7 onwards uses 46 bits.
diff --git a/arch/s390x.c b/arch/s390x.c
index bf9d58e54fb7..c4fed6f3bbd0 100644
--- a/arch/s390x.c
+++ b/arch/s390x.c
@@ -63,6 +63,13 @@ int
 set_s390x_max_physmem_bits(void)
 {
 	long array_len = ARRAY_LENGTH(mem_section);
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+		return TRUE;
+	}
+
 	/*
 	 * The older s390x kernels uses _MAX_PHYSMEM_BITS as 42 and the
 	 * newer kernels uses 46 bits.
diff --git a/arch/sparc64.c b/arch/sparc64.c
index 1cfaa854ce6d..b93a05bdfe59 100644
--- a/arch/sparc64.c
+++ b/arch/sparc64.c
@@ -25,10 +25,15 @@ int get_versiondep_info_sparc64(void)
 {
 	info->section_size_bits = _SECTION_SIZE_BITS;
 
-	if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else if (info->kernel_version >= KERNEL_VERSION(3, 8, 13))
 		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L4;
-	else {
+	else
 		info->max_physmem_bits = _MAX_PHYSMEM_BITS_L3;
+
+	if (info->kernel_version < KERNEL_VERSION(3, 8, 13)) {
 		info->flag_vmemmap = TRUE;
 		info->vmemmap_start = VMEMMAP_BASE_SPARC64;
 		info->vmemmap_end = VMEMMAP_BASE_SPARC64 +
diff --git a/arch/x86.c b/arch/x86.c
index 3fdae93084b8..7899329c821f 100644
--- a/arch/x86.c
+++ b/arch/x86.c
@@ -87,6 +87,11 @@ get_machdep_info_x86(void)
 		DEBUG_MSG("PAE          : OFF\n");
 		info->max_physmem_bits  = _MAX_PHYSMEM_BITS;
 	}
+
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+
 	info->page_offset = __PAGE_OFFSET;
 
 	if (SYMBOL(_stext) == NOT_FOUND_SYMBOL) {
diff --git a/arch/x86_64.c b/arch/x86_64.c
index b5e295452964..58b5c0b7af40 100644
--- a/arch/x86_64.c
+++ b/arch/x86_64.c
@@ -268,10 +268,10 @@ get_machdep_info_x86_64(void)
 int
 get_versiondep_info_x86_64(void)
 {
-	/*
-	 * On linux-2.6.26, MAX_PHYSMEM_BITS is changed to 44 from 40.
-	 */
-	if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
+	/* Check if we can get MAX_PHYSMEM_BITS from vmcoreinfo */
+	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER)
+		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
+	else if (info->kernel_version < KERNEL_VERSION(2, 6, 26))
 		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_ORIG;
 	else if (info->kernel_version < KERNEL_VERSION(2, 6, 31))
 		info->max_physmem_bits  = _MAX_PHYSMEM_BITS_2_6_26;
-- 
1.8.3.1


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* RE: [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
  2020-09-10  5:33 ` [PATCH v5 2/3] makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support) Bhupesh Sharma
@ 2020-09-23  5:28   ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 0 replies; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-09-23  5:28 UTC (permalink / raw)
  To: Bhupesh Sharma, kexec; +Cc: John Donnelly, bhupesh.linux, Kazuhito Hagio

-----Original Message-----
> ARMv8.2-LPA architecture extension (if available on underlying hardware)
> can support 52-bit physical addresses, while the kernel virtual
> addresses remain 48-bit.
> 
> Make sure that we read the 52-bit PA address capability from the
> 'MAX_PHYSMEM_BITS' variable (if available in vmcoreinfo), change
> the pte_to_phys() mask values accordingly, and adjust the
> page-table walk to match.
> 
> Also make sure that it works well on the existing 48-bit PA address
> platforms and on environments which use newer kernels with 52-bit
> PA support but hardware which is not ARMv8.2-LPA compliant.
> 
> Kernel commit 1d50e5d0c505 ("crash_core, vmcoreinfo: Append
> 'MAX_PHYSMEM_BITS' to vmcoreinfo") already supports adding
> 'MAX_PHYSMEM_BITS' variable to vmcoreinfo.
> 
> This patch is in accordance with ARMv8 Architecture Reference Manual
> 
> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> Cc: John Donnelly <john.p.donnelly@oracle.com>
> Cc: kexec@lists.infradead.org
> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>

PMD_SECTION_MASK is not needed any more, but I can fix it when applying.
Otherwise, the patch looks good to me.

Thanks,
Kazu


> ---
>  arch/arm64.c | 291 ++++++++++++++++++++++++++++++++++++---------------
>  1 file changed, 204 insertions(+), 87 deletions(-)
> 
> diff --git a/arch/arm64.c b/arch/arm64.c
> index 54d60b440850..709e0a506916 100644
> --- a/arch/arm64.c
> +++ b/arch/arm64.c
> @@ -39,72 +39,185 @@ typedef struct {
>  	unsigned long pte;
>  } pte_t;
> 
> +#define __pte(x)	((pte_t) { (x) } )
> +#define __pmd(x)	((pmd_t) { (x) } )
> +#define __pud(x)	((pud_t) { (x) } )
> +#define __pgd(x)	((pgd_t) { (x) } )
> +
> +static int lpa_52_bit_support_available;
>  static int pgtable_level;
>  static int va_bits;
>  static unsigned long kimage_voffset;
> 
> -#define SZ_4K			(4 * 1024)
> -#define SZ_16K			(16 * 1024)
> -#define SZ_64K			(64 * 1024)
> -#define SZ_128M			(128 * 1024 * 1024)
> +#define SZ_4K			4096
> +#define SZ_16K			16384
> +#define SZ_64K			65536
> 
> -#define PAGE_OFFSET_36 ((0xffffffffffffffffUL) << 36)
> -#define PAGE_OFFSET_39 ((0xffffffffffffffffUL) << 39)
> -#define PAGE_OFFSET_42 ((0xffffffffffffffffUL) << 42)
> -#define PAGE_OFFSET_47 ((0xffffffffffffffffUL) << 47)
> -#define PAGE_OFFSET_48 ((0xffffffffffffffffUL) << 48)
> +#define PAGE_OFFSET_36		((0xffffffffffffffffUL) << 36)
> +#define PAGE_OFFSET_39		((0xffffffffffffffffUL) << 39)
> +#define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
> +#define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
> +#define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
> +#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
> 
>  #define pgd_val(x)		((x).pgd)
>  #define pud_val(x)		(pgd_val((x).pgd))
>  #define pmd_val(x)		(pud_val((x).pud))
>  #define pte_val(x)		((x).pte)
> 
> -#define PAGE_MASK		(~(PAGESIZE() - 1))
> -#define PGDIR_SHIFT		((PAGESHIFT() - 3) * pgtable_level + 3)
> -#define PTRS_PER_PGD		(1 << (va_bits - PGDIR_SHIFT))
> -#define PUD_SHIFT		get_pud_shift_arm64()
> -#define PUD_SIZE		(1UL << PUD_SHIFT)
> -#define PUD_MASK		(~(PUD_SIZE - 1))
> -#define PTRS_PER_PTE		(1 << (PAGESHIFT() - 3))
> -#define PTRS_PER_PUD		PTRS_PER_PTE
> -#define PMD_SHIFT		((PAGESHIFT() - 3) * 2 + 3)
> -#define PMD_SIZE		(1UL << PMD_SHIFT)
> -#define PMD_MASK		(~(PMD_SIZE - 1))
> +/* See 'include/uapi/linux/const.h' for definitions below */
> +#define __AC(X,Y)	(X##Y)
> +#define _AC(X,Y)	__AC(X,Y)
> +#define _AT(T,X)	((T)(X))
> +
> +/* See 'include/asm/pgtable-types.h' for definitions below */
> +typedef unsigned long pteval_t;
> +typedef unsigned long pmdval_t;
> +typedef unsigned long pudval_t;
> +typedef unsigned long pgdval_t;
> +
> +#define PAGE_SHIFT	PAGESHIFT()
> +
> +/* See 'arch/arm64/include/asm/pgtable-hwdef.h' for definitions below */
> +
> +#define ARM64_HW_PGTABLE_LEVEL_SHIFT(n)	((PAGE_SHIFT - 3) * (4 - (n)) + 3)
> +
> +#define PTRS_PER_PTE		(1 << (PAGE_SHIFT - 3))
> +
> +/*
> + * PMD_SHIFT determines the size a level 2 page table entry can map.
> + */
> +#define PMD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(2)
> +#define PMD_SIZE		(_AC(1, UL) << PMD_SHIFT)
> +#define PMD_MASK		(~(PMD_SIZE-1))
>  #define PTRS_PER_PMD		PTRS_PER_PTE
> 
> -#define PAGE_PRESENT		(1 << 0)
> +/*
> + * PUD_SHIFT determines the size a level 1 page table entry can map.
> + */
> +#define PUD_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(1)
> +#define PUD_SIZE		(_AC(1, UL) << PUD_SHIFT)
> +#define PUD_MASK		(~(PUD_SIZE-1))
> +#define PTRS_PER_PUD		PTRS_PER_PTE
> +
> +/*
> + * PGDIR_SHIFT determines the size a top-level page table entry can map
> + * (depending on the configuration, this level can be 0, 1 or 2).
> + */
> +#define PGDIR_SHIFT		ARM64_HW_PGTABLE_LEVEL_SHIFT(4 - (pgtable_level))
> +#define PGDIR_SIZE		(_AC(1, UL) << PGDIR_SHIFT)
> +#define PGDIR_MASK		(~(PGDIR_SIZE-1))
> +#define PTRS_PER_PGD		(1 << ((va_bits) - PGDIR_SHIFT))
> +
> +/*
> + * Section address mask and size definitions.
> + */
>  #define SECTIONS_SIZE_BITS	30
> -/* Highest possible physical address supported */
> -#define PHYS_MASK_SHIFT		48
> -#define PHYS_MASK		((1UL << PHYS_MASK_SHIFT) - 1)
> +
>  /*
> - * Remove the highest order bits that are not a part of the
> - * physical address in a section
> + * Hardware page table definitions.
> + *
> + * Level 1 descriptor (PUD).
>   */
>  #define PMD_SECTION_MASK	((1UL << PHYS_MASK_SHIFT) - 1)
> +#define PUD_TYPE_TABLE		(_AT(pudval_t, 3) << 0)
> +#define PUD_TABLE_BIT		(_AT(pudval_t, 1) << 1)
> +#define PUD_TYPE_MASK		(_AT(pudval_t, 3) << 0)
> +#define PUD_TYPE_SECT		(_AT(pudval_t, 1) << 0)
> 
> -#define PMD_TYPE_MASK		3
> -#define PMD_TYPE_SECT		1
> -#define PMD_TYPE_TABLE		3
> +/*
> + * Level 2 descriptor (PMD).
> + */
> +#define PMD_TYPE_MASK		(_AT(pmdval_t, 3) << 0)
> +#define PMD_TYPE_FAULT		(_AT(pmdval_t, 0) << 0)
> +#define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
> +#define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
> +#define PMD_TABLE_BIT		(_AT(pmdval_t, 1) << 1)
> +
> +/*
> + * Level 3 descriptor (PTE).
> + */
> +#define PTE_ADDR_LOW		(((_AT(pteval_t, 1) << (48 - PAGE_SHIFT)) - 1) << PAGE_SHIFT)
> +#define PTE_ADDR_HIGH		(_AT(pteval_t, 0xf) << 12)
> +
> +static inline unsigned long
> +get_pte_addr_mask_arm64(void)
> +{
> +	if (lpa_52_bit_support_available)
> +		return (PTE_ADDR_LOW | PTE_ADDR_HIGH);
> +	else
> +		return PTE_ADDR_LOW;
> +}
> +
> +#define PTE_ADDR_MASK		get_pte_addr_mask_arm64()
> 
> -#define PUD_TYPE_MASK		3
> -#define PUD_TYPE_SECT		1
> -#define PUD_TYPE_TABLE		3
> +#define PAGE_MASK		(~(PAGESIZE() - 1))
> +#define PAGE_PRESENT		(1 << 0)
> 
> +/* Helper API to convert between a physical address and its placement
> + * in a page table entry, taking care of 52-bit addresses.
> + */
> +static inline unsigned long
> +__pte_to_phys(pte_t pte)
> +{
> +	if (lpa_52_bit_support_available)
> +		return ((pte_val(pte) & PTE_ADDR_LOW) | ((pte_val(pte) & PTE_ADDR_HIGH) << 36));
> +	else
> +		return (pte_val(pte) & PTE_ADDR_MASK);
> +}
> +
> +/* Find an entry in a page-table-directory */
>  #define pgd_index(vaddr) 		(((vaddr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
> -#define pgd_offset(pgdir, vaddr)	((pgd_t *)(pgdir) + pgd_index(vaddr))
> 
> -#define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> -#define pmd_page_paddr(pmd)		(pmd_val(pmd) & PHYS_MASK & (int32_t)PAGE_MASK)
> -#define pte_offset(dir, vaddr) 		((pte_t*)pmd_page_paddr((*dir)) + pte_index(vaddr))
> +static inline pte_t
> +pgd_pte(pgd_t pgd)
> +{
> +	return __pte(pgd_val(pgd));
> +}
> 
> -#define pmd_index(vaddr)		(((vaddr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
> -#define pud_page_paddr(pud)		(pud_val(pud) & PHYS_MASK & (int32_t)PAGE_MASK)
> -#define pmd_offset_pgtbl_lvl_2(pud, vaddr) ((pmd_t *)pud)
> -#define pmd_offset_pgtbl_lvl_3(pud, vaddr) ((pmd_t *)pud_page_paddr((*pud)) + pmd_index(vaddr))
> +#define __pgd_to_phys(pgd)		__pte_to_phys(pgd_pte(pgd))
> +#define pgd_offset(pgd, vaddr)		((pgd_t *)(pgd) + pgd_index(vaddr))
> +
> +static inline pte_t pud_pte(pud_t pud)
> +{
> +	return __pte(pud_val(pud));
> +}
> 
> +static inline unsigned long
> +pgd_page_paddr(pgd_t pgd)
> +{
> +	return __pgd_to_phys(pgd);
> +}
> +
> +/* Find an entry in the first-level page table. */
>  #define pud_index(vaddr)		(((vaddr) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))
> -#define pgd_page_paddr(pgd)		(pgd_val(pgd) & PHYS_MASK & (int32_t)PAGE_MASK)
> +#define __pud_to_phys(pud)		__pte_to_phys(pud_pte(pud))
> +
> +static inline unsigned long
> +pud_page_paddr(pud_t pud)
> +{
> +	return __pud_to_phys(pud);
> +}
> +
> +/* Find an entry in the second-level page table. */
> +#define pmd_index(vaddr)		(((vaddr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
> +
> +static inline pte_t pmd_pte(pmd_t pmd)
> +{
> +	return __pte(pmd_val(pmd));
> +}
> +
> +#define __pmd_to_phys(pmd)		__pte_to_phys(pmd_pte(pmd))
> +
> +static inline unsigned long
> +pmd_page_paddr(pmd_t pmd)
> +{
> +	return __pmd_to_phys(pmd);
> +}
> +
> +/* Find an entry in the third-level page table. */
> +#define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> +#define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> 
>  static unsigned long long
>  __pa(unsigned long vaddr)
> @@ -116,32 +229,22 @@ __pa(unsigned long vaddr)
>  		return (vaddr - kimage_voffset);
>  }
> 
> -static int
> -get_pud_shift_arm64(void)
> +static pud_t *
> +pud_offset(pgd_t *pgda, pgd_t *pgdv, unsigned long vaddr)
>  {
> -	if (pgtable_level == 4)
> -		return ((PAGESHIFT() - 3) * 3 + 3);
> +	if (pgtable_level > 3)
> +		return (pud_t *)(pgd_page_paddr(*pgdv) + pud_index(vaddr) * sizeof(pud_t));
>  	else
> -		return PGDIR_SHIFT;
> +		return (pud_t *)(pgda);
>  }
> 
>  static pmd_t *
>  pmd_offset(pud_t *puda, pud_t *pudv, unsigned long vaddr)
>  {
> -	if (pgtable_level == 2) {
> -		return pmd_offset_pgtbl_lvl_2(puda, vaddr);
> -	} else {
> -		return pmd_offset_pgtbl_lvl_3(pudv, vaddr);
> -	}
> -}
> -
> -static pud_t *
> -pud_offset(pgd_t *pgda, pgd_t *pgdv, unsigned long vaddr)
> -{
> -	if (pgtable_level == 4)
> -		return ((pud_t *)pgd_page_paddr((*pgdv)) + pud_index(vaddr));
> +	if (pgtable_level > 2)
> +		return (pmd_t *)(pud_page_paddr(*pudv) + pmd_index(vaddr) * sizeof(pmd_t));
>  	else
> -		return (pud_t *)(pgda);
> +		return (pmd_t*)(puda);
>  }
> 
>  static int calculate_plat_config(void)
> @@ -246,6 +349,14 @@ get_stext_symbol(void)
>  int
>  get_machdep_info_arm64(void)
>  {
> +	/* Determine if the PA address range is 52-bits: ARMv8.2-LPA */
> +	if (NUMBER(MAX_PHYSMEM_BITS) != NOT_FOUND_NUMBER) {
> +		info->max_physmem_bits = NUMBER(MAX_PHYSMEM_BITS);
> +		if (info->max_physmem_bits == 52)
> +			lpa_52_bit_support_available = 1;
> +	} else
> +		info->max_physmem_bits = 48;
> +
>  	/* Check if va_bits is still not initialized. If still 0, call
>  	 * get_versiondep_info() to initialize the same.
>  	 */
> @@ -258,12 +369,11 @@ get_machdep_info_arm64(void)
>  	}
> 
>  	kimage_voffset = NUMBER(kimage_voffset);
> -	info->max_physmem_bits = PHYS_MASK_SHIFT;
>  	info->section_size_bits = SECTIONS_SIZE_BITS;
> 
>  	DEBUG_MSG("kimage_voffset   : %lx\n", kimage_voffset);
> -	DEBUG_MSG("max_physmem_bits : %lx\n", info->max_physmem_bits);
> -	DEBUG_MSG("section_size_bits: %lx\n", info->section_size_bits);
> +	DEBUG_MSG("max_physmem_bits : %ld\n", info->max_physmem_bits);
> +	DEBUG_MSG("section_size_bits: %ld\n", info->section_size_bits);
> 
>  	return TRUE;
>  }
> @@ -321,6 +431,19 @@ get_versiondep_info_arm64(void)
>  	return TRUE;
>  }
> 
> +/* 1GB section for Page Table level = 4 and Page Size = 4KB */
> +static int
> +is_pud_sect(pud_t pud)
> +{
> +	return ((pud_val(pud) & PUD_TYPE_MASK) == PUD_TYPE_SECT);
> +}
> +
> +static int
> +is_pmd_sect(pmd_t pmd)
> +{
> +	return ((pmd_val(pmd) & PMD_TYPE_MASK) == PMD_TYPE_SECT);
> +}
> +
>  /*
>   * vaddr_to_paddr_arm64() - translate arbitrary virtual address to physical
>   * @vaddr: virtual address to translate
> @@ -358,10 +481,9 @@ vaddr_to_paddr_arm64(unsigned long vaddr)
>  		return NOT_PADDR;
>  	}
> 
> -	if ((pud_val(pudv) & PUD_TYPE_MASK) == PUD_TYPE_SECT) {
> -		/* 1GB section for Page Table level = 4 and Page Size = 4KB */
> -		paddr = (pud_val(pudv) & (PUD_MASK & PMD_SECTION_MASK))
> -					+ (vaddr & (PUD_SIZE - 1));
> +	if (is_pud_sect(pudv)) {
> +		paddr = (pud_page_paddr(pudv) & PUD_MASK) +
> +				(vaddr & (PUD_SIZE - 1));
>  		return paddr;
>  	}
> 
> @@ -371,29 +493,24 @@ vaddr_to_paddr_arm64(unsigned long vaddr)
>  		return NOT_PADDR;
>  	}
> 
> -	switch (pmd_val(pmdv) & PMD_TYPE_MASK) {
> -	case PMD_TYPE_TABLE:
> -		ptea = pte_offset(&pmdv, vaddr);
> -		/* 64k page */
> -		if (!readmem(PADDR, (unsigned long long)ptea, &ptev, sizeof(ptev))) {
> -			ERRMSG("Can't read pte\n");
> -			return NOT_PADDR;
> -		}
> +	if (is_pmd_sect(pmdv)) {
> +		paddr = (pmd_page_paddr(pmdv) & PMD_MASK) +
> +				(vaddr & (PMD_SIZE - 1));
> +		return paddr;
> +	}
> 
> -		if (!(pte_val(ptev) & PAGE_PRESENT)) {
> -			ERRMSG("Can't get a valid pte.\n");
> -			return NOT_PADDR;
> -		} else {
> +	ptea = (pte_t *)pte_offset(&pmdv, vaddr);
> +	if (!readmem(PADDR, (unsigned long long)ptea, &ptev, sizeof(ptev))) {
> +		ERRMSG("Can't read pte\n");
> +		return NOT_PADDR;
> +	}
> 
> -			paddr = (PAGEBASE(pte_val(ptev)) & PHYS_MASK)
> -					+ (vaddr & (PAGESIZE() - 1));
> -		}
> -		break;
> -	case PMD_TYPE_SECT:
> -		/* 512MB section for Page Table level = 3 and Page Size = 64KB*/
> -		paddr = (pmd_val(pmdv) & (PMD_MASK & PMD_SECTION_MASK))
> -					+ (vaddr & (PMD_SIZE - 1));
> -		break;
> +	if (!(pte_val(ptev) & PAGE_PRESENT)) {
> +		ERRMSG("Can't get a valid pte.\n");
> +		return NOT_PADDR;
> +	} else {
> +		paddr = __pte_to_phys(ptev) +
> +				(vaddr & (PAGESIZE() - 1));
>  	}
> 
>  	return paddr;
> --
> 2.26.2




* RE: [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)
  2020-09-10  5:33 ` [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support) Bhupesh Sharma
@ 2020-09-24  5:05   ` HAGIO KAZUHITO(萩尾 一仁)
  2021-01-13  9:17     ` Pingfan Liu
  0 siblings, 1 reply; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-09-24  5:05 UTC (permalink / raw)
  To: Bhupesh Sharma, kexec; +Cc: John Donnelly, bhupesh.linux

Hi Bhupesh,

Thank you for the updated patch.

-----Original Message-----
> With the ARMv8.2-LVA architecture extension available, arm64 hardware
> which supports this extension can support up to 52-bit virtual
> addresses. It is especially useful for having a 52-bit user-space
> virtual address space while the kernel can still retain 48-bit/52-bit
> virtual addressing.
> 
> Since at the moment we enable support for this extension in the
> kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), there is
> no clear mechanism in user-space to determine this CONFIG
> flag value and use it to determine the kernel-space VA address range
> values.
> 
> 'makedumpfile' can instead use the 'TCR_EL1.T1SZ' value from
> vmcoreinfo, which indicates the size offset of the memory region
> addressed by TTBR1_EL1 (and hence can be used for determining the
> vabits_actual value).
> 
> Using the vmcoreinfo variable exported by kernel commit
>  bbdbc11804ff ("arm64/crash_core: Export  TCR_EL1.T1SZ in vmcoreinfo"),
> the user-space can use the following computation to determine whether
> an address lies in the linear map range (for newer kernels >= 5.4):
> 
>   #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))
> 
> Note that for the --mem-usage case, though, we need to calculate the
> vabits_actual value before the vmcoreinfo read functionality is ready,

For this, can't we read the TCR_EL1.T1SZ from vmcoreinfo in /proc/kcore's
ELF note?  I think we can use the common functions used for vmcore with it.

I'll write a patch to do so if it sounds good.

> so we can instead read the architecture register ID_AA64MMFR2_EL1
> directly to see if the underlying hardware supports 52-bit addressing
> and accordingly set vabits_actual as:
> 
>    read_id_aa64mmfr2_el1();
>    if (hardware supports 52-bit addressing)
> 	vabits_actual = 52;
>    else
> 	vabits_actual = va_bits value calculated via _stext symbol;
> 
> Also make sure that the page_offset, is_linear_addr(addr) and __pa()
> calculations work both for older (< 5.4) and newer kernels (>= 5.4).
> 
> I have tested several combinations with both kernel categories
> [for e.g. with different VA (39, 42, 48 and 52-bit) and PA combinations
> (48 and 52-bit)] on at-least 3 different boards.
> 
> Unfortunately, this means that we need to call 'populate_kernel_version()'
> before 'get_page_offset_arm64()', as 'info->kernel_version' would
> otherwise remain uninitialized at its first use.

The populate_kernel_version() uses uname(), so this means there will
be cases where makedumpfile doesn't work with vmcores that were
captured on a kernel other than the running one.  This is a rather big
limitation, especially for backward-compatibility testing, and it would
be better to avoid changing behavior based on the environment rather
than on the data.

Is there no room to avoid it?

Just an idea, but can we use the OSRELEASE vmcoreinfo in ELF note first
to determine the kernel version?  It's from init_uts_ns.name.release,
why can't we use it?

Thanks,
Kazu

> 
> This patch is in accordance with ARMv8 Architecture Reference Manual
> 
> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> Cc: John Donnelly <john.p.donnelly@oracle.com>
> Cc: kexec@lists.infradead.org
> Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com>
> ---
>  arch/arm64.c   | 233 ++++++++++++++++++++++++++++++++++++++++++-------
>  common.h       |  10 +++
>  makedumpfile.c |   4 +-
>  makedumpfile.h |   6 +-
>  4 files changed, 218 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm64.c b/arch/arm64.c
> index 709e0a506916..ccaa8641ca66 100644
> --- a/arch/arm64.c
> +++ b/arch/arm64.c
> @@ -19,10 +19,23 @@
> 
>  #ifdef __aarch64__
> 
> +#include <asm/hwcap.h>
> +#include <sys/auxv.h>
>  #include "../elf_info.h"
>  #include "../makedumpfile.h"
>  #include "../print_info.h"
> 
> +/* ID_AA64MMFR2_EL1 related helpers: */
> +#define ID_AA64MMFR2_LVA_SHIFT	16
> +#define ID_AA64MMFR2_LVA_MASK	(0xf << ID_AA64MMFR2_LVA_SHIFT)
> +
> +/* CPU feature ID registers */
> +#define get_cpu_ftr(id) ({							\
> +		unsigned long __val;						\
> +		asm volatile("mrs %0, " __stringify(id) : "=r" (__val));	\
> +		__val;								\
> +})
> +
>  typedef struct {
>  	unsigned long pgd;
>  } pgd_t;
> @@ -47,6 +60,7 @@ typedef struct {
>  static int lpa_52_bit_support_available;
>  static int pgtable_level;
>  static int va_bits;
> +static int vabits_actual;
>  static unsigned long kimage_voffset;
> 
>  #define SZ_4K			4096
> @@ -58,7 +72,6 @@ static unsigned long kimage_voffset;
>  #define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
>  #define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
>  #define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
> -#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
> 
>  #define pgd_val(x)		((x).pgd)
>  #define pud_val(x)		(pgd_val((x).pgd))
> @@ -219,13 +232,25 @@ pmd_page_paddr(pmd_t pmd)
>  #define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
>  #define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> 
> +/*
> + * The linear kernel range starts at the bottom of the virtual address
> + * space. Testing the top bit for the start of the region is a
> + * sufficient check and avoids having to worry about the tag.
> + */
> +#define is_linear_addr(addr)	((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
> +	(!!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))) : \
> +	(!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))))
> +
>  static unsigned long long
>  __pa(unsigned long vaddr)
>  {
>  	if (kimage_voffset == NOT_FOUND_NUMBER ||
> -			(vaddr >= PAGE_OFFSET))
> -		return (vaddr - PAGE_OFFSET + info->phys_base);
> -	else
> +			is_linear_addr(vaddr)) {
> +		if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> +			return ((vaddr & ~PAGE_OFFSET) + info->phys_base);
> +		else
> +			return (vaddr + info->phys_base - PAGE_OFFSET);
> +	} else
>  		return (vaddr - kimage_voffset);
>  }
> 
> @@ -254,6 +279,7 @@ static int calculate_plat_config(void)
>  			(PAGESIZE() == SZ_64K && va_bits == 42)) {
>  		pgtable_level = 2;
>  	} else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> +			(PAGESIZE() == SZ_64K && va_bits == 52) ||
>  			(PAGESIZE() == SZ_4K && va_bits == 39) ||
>  			(PAGESIZE() == SZ_16K && va_bits == 47)) {
>  		pgtable_level = 3;
> @@ -288,8 +314,14 @@ get_phys_base_arm64(void)
>  		return TRUE;
>  	}
> 
> +	/* Ignore the 1st PT_LOAD */
>  	if (get_num_pt_loads() && PAGE_OFFSET) {
> -		for (i = 0;
> +		/* Note that the following loop starts with i = 1.
> +		 * This is required to make sure that the following logic
> +		 * works both for old and newer kernels (with flipped
> +		 * VA space, i.e. >= 5.4.0)
> +		 */
> +		for (i = 1;
>  		    get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
>  		    i++) {
>  			if (virt_start != NOT_KV_ADDR
> @@ -346,6 +378,139 @@ get_stext_symbol(void)
>  	return(found ? kallsym : FALSE);
>  }
> 
> +static int
> +get_va_bits_from_stext_arm64(void)
> +{
> +	ulong _stext;
> +
> +	_stext = get_stext_symbol();
> +	if (!_stext) {
> +		ERRMSG("Can't get the symbol of _stext.\n");
> +		return FALSE;
> +	}
> +
> +	/* Derive va_bits as per arch/arm64/Kconfig. Note that this is a
> +	 * best-case approximation at the moment, as there can be
> +	 * inconsistencies in this calculation (e.g., for the
> +	 * 52-bit kernel VA case, the 48th bit is set in
> +	 * the _stext symbol).
> +	 *
> +	 * So, we need to rely on the vabits_actual symbol in the
> +	 * vmcoreinfo or read via the system register for an accurate value
> +	 * of the virtual addressing supported by the underlying kernel.
> +	 */
> +	if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
> +		va_bits = 48;
> +	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
> +		va_bits = 47;
> +	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
> +		va_bits = 42;
> +	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
> +		va_bits = 39;
> +	} else if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
> +		va_bits = 36;
> +	} else {
> +		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
> +		return FALSE;
> +	}
> +
> +	DEBUG_MSG("va_bits       : %d (approximation via _stext)\n", va_bits);
> +
> +	return TRUE;
> +}
> +
> +/* Note that the ID_AA64MMFR2_EL1 architecture register
> + * can be read only when we give an .arch hint to
> + * gcc/binutils,
> + * so we use the gcc construct '__attribute__ ((target ("arch=armv8.2-a")))'
> + * here which is an .arch directive (see AArch64-Target-selection-directives
> + * documentation from ARM for details). This is required only for
> + * this function to make sure it compiles well with gcc/binutils.
> + */
> +__attribute__ ((target ("arch=armv8.2-a")))
> +static unsigned long
> +read_id_aa64mmfr2_el1(void)
> +{
> +	return get_cpu_ftr(ID_AA64MMFR2_EL1);
> +}
> +
> +static int
> +get_vabits_actual_from_id_aa64mmfr2_el1(void)
> +{
> +	int l_vabits_actual;
> +	unsigned long val;
> +
> +	/* Check if ID_AA64MMFR2_EL1 CPU-ID register indicates
> +	 * ARMv8.2/LVA support:
> +	 * VARange, bits [19:16]
> +	 *   From ARMv8.2:
> +	 *   Indicates support for a larger virtual address.
> +	 *   Defined values are:
> +	 *     0b0000 VMSAv8-64 supports 48-bit VAs.
> +	 *     0b0001 VMSAv8-64 supports 52-bit VAs when using the 64KB
> +	 *            page size. The other translation granules support
> +	 *            48-bit VAs.
> +	 *
> +	 * See ARMv8 ARM for more details.
> +	 */
> +	if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
> +		ERRMSG("arm64 CPUID registers unavailable.\n");
> +		return ERROR;
> +	}
> +
> +	val = read_id_aa64mmfr2_el1();
> +	val = (val & ID_AA64MMFR2_LVA_MASK) >> ID_AA64MMFR2_LVA_SHIFT;
> +
> +	if ((val == 0x1) && (PAGESIZE() == SZ_64K))
> +		l_vabits_actual = 52;
> +	else
> +		l_vabits_actual = 48;
> +
> +	return l_vabits_actual;
> +}
> +
> +static void
> +get_page_offset_arm64(void)
> +{
> +	/* Check if 'vabits_actual' is initialized yet.
> +	 * If not, our best bet is to read ID_AA64MMFR2_EL1 CPU-ID
> +	 * register.
> +	 */
> +	if (!vabits_actual) {
> +		vabits_actual = get_vabits_actual_from_id_aa64mmfr2_el1();
> +		if ((vabits_actual == ERROR) || (vabits_actual != 52)) {
> +			/* If we cannot read ID_AA64MMFR2_EL1 arch
> +			 * register or if this register does not indicate
> +			 * support for a larger virtual address, our last
> +			 * option is to use the VA_BITS to calculate the
> +			 * PAGE_OFFSET value, i.e. vabits_actual = VA_BITS.
> +			 */
> +			vabits_actual = va_bits;
> +			DEBUG_MSG("vabits_actual : %d (approximation via va_bits)\n",
> +					vabits_actual);
> +		} else
> +			DEBUG_MSG("vabits_actual : %d (via id_aa64mmfr2_el1)\n",
> +					vabits_actual);
> +	}
> +
> +	if (!populate_kernel_version()) {
> +		ERRMSG("Cannot get information about current kernel\n");
> +		return;
> +	}
> +
> +	/* See arch/arm64/include/asm/memory.h for more details of
> +	 * the PAGE_OFFSET calculation.
> +	 */
> +	if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> +		info->page_offset = ((0xffffffffffffffffUL) -
> +				((1UL) << (vabits_actual - 1)) + 1);
> +	else
> +		info->page_offset = (-(1UL << vabits_actual));
> +
> +	DEBUG_MSG("page_offset   : %lx (via vabits_actual)\n",
> +			info->page_offset);
> +}
> +
>  int
>  get_machdep_info_arm64(void)
>  {
> @@ -360,8 +525,33 @@ get_machdep_info_arm64(void)
>  	/* Check if va_bits is still not initialized. If still 0, call
>  	 * get_versiondep_info() to initialize the same.
>  	 */
> +	if (NUMBER(VA_BITS) != NOT_FOUND_NUMBER) {
> +		va_bits = NUMBER(VA_BITS);
> +		DEBUG_MSG("va_bits       : %d (vmcoreinfo)\n",
> +				va_bits);
> +	}
> +
> +	/* Check if va_bits is still not initialized. If still 0, call
> +	 * get_versiondep_info() to initialize the same from _stext
> +	 * symbol.
> +	 */
>  	if (!va_bits)
> -		get_versiondep_info_arm64();
> +		if (get_va_bits_from_stext_arm64() == FALSE)
> +			return FALSE;
> +
> +	/* See TCR_EL1, Translation Control Register (EL1) register
> +	 * description in the ARMv8 Architecture Reference Manual.
> +	 * Basically, we can use the TCR_EL1.T1SZ
> +	 * value to determine the virtual addressing range supported
> +	 * in the kernel-space (i.e. vabits_actual).
> +	 */
> +	if (NUMBER(TCR_EL1_T1SZ) != NOT_FOUND_NUMBER) {
> +		vabits_actual = 64 - NUMBER(TCR_EL1_T1SZ);
> +		DEBUG_MSG("vabits_actual : %d (vmcoreinfo)\n",
> +				vabits_actual);
> +	}
> +
> +	get_page_offset_arm64();
> 
>  	if (!calculate_plat_config()) {
>  		ERRMSG("Can't determine platform config values\n");
> @@ -399,34 +589,11 @@ get_xen_info_arm64(void)
>  int
>  get_versiondep_info_arm64(void)
>  {
> -	ulong _stext;
> -
> -	_stext = get_stext_symbol();
> -	if (!_stext) {
> -		ERRMSG("Can't get the symbol of _stext.\n");
> -		return FALSE;
> -	}
> -
> -	/* Derive va_bits as per arch/arm64/Kconfig */
> -	if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
> -		va_bits = 36;
> -	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
> -		va_bits = 39;
> -	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
> -		va_bits = 42;
> -	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
> -		va_bits = 47;
> -	} else if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
> -		va_bits = 48;
> -	} else {
> -		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
> -		return FALSE;
> -	}
> -
> -	info->page_offset = (0xffffffffffffffffUL) << (va_bits - 1);
> +	if (!va_bits)
> +		if (get_va_bits_from_stext_arm64() == FALSE)
> +			return FALSE;
> 
> -	DEBUG_MSG("va_bits      : %d\n", va_bits);
> -	DEBUG_MSG("page_offset  : %lx\n", info->page_offset);
> +	get_page_offset_arm64();
> 
>  	return TRUE;
>  }
> diff --git a/common.h b/common.h
> index 6e2f657a79c7..1901df195e9d 100644
> --- a/common.h
> +++ b/common.h
> @@ -50,5 +50,15 @@
>  #define NOT_PADDR	(ULONGLONG_MAX)
>  #define BADADDR  	((ulong)(-1))
> 
> +/* Indirect stringification.  Doing two levels allows the parameter to be a
> + * macro itself.  For example, compile with -DFOO=bar, __stringify(FOO)
> + * converts to "bar".
> + *
> + * Copied from linux source: 'include/linux/stringify.h'
> + */
> +
> +#define __stringify_1(x...)	#x
> +#define __stringify(x...)	__stringify_1(x)
> +
>  #endif  /* COMMON_H */
> 
> diff --git a/makedumpfile.c b/makedumpfile.c
> index 4c4251ea8719..5ab82fd3cf14 100644
> --- a/makedumpfile.c
> +++ b/makedumpfile.c
> @@ -1133,7 +1133,7 @@ fallback_to_current_page_size(void)
>  	return TRUE;
>  }
> 
> -static int populate_kernel_version(void)
> +int populate_kernel_version(void)
>  {
>  	struct utsname utsname;
> 
> @@ -2323,6 +2323,7 @@ write_vmcoreinfo_data(void)
>  	WRITE_NUMBER("HUGETLB_PAGE_DTOR", HUGETLB_PAGE_DTOR);
>  #ifdef __aarch64__
>  	WRITE_NUMBER("VA_BITS", VA_BITS);
> +	WRITE_NUMBER_UNSIGNED("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
>  	WRITE_NUMBER_UNSIGNED("PHYS_OFFSET", PHYS_OFFSET);
>  	WRITE_NUMBER_UNSIGNED("kimage_voffset", kimage_voffset);
>  #endif
> @@ -2729,6 +2730,7 @@ read_vmcoreinfo(void)
>  	READ_NUMBER("KERNEL_IMAGE_SIZE", KERNEL_IMAGE_SIZE);
>  #ifdef __aarch64__
>  	READ_NUMBER("VA_BITS", VA_BITS);
> +	READ_NUMBER_UNSIGNED("TCR_EL1_T1SZ", TCR_EL1_T1SZ);
>  	READ_NUMBER_UNSIGNED("PHYS_OFFSET", PHYS_OFFSET);
>  	READ_NUMBER_UNSIGNED("kimage_voffset", kimage_voffset);
>  #endif
> diff --git a/makedumpfile.h b/makedumpfile.h
> index 03fb4ce06872..dc65f002bad6 100644
> --- a/makedumpfile.h
> +++ b/makedumpfile.h
> @@ -974,7 +974,9 @@ unsigned long long vaddr_to_paddr_arm64(unsigned long vaddr);
>  int get_versiondep_info_arm64(void);
>  int get_xen_basic_info_arm64(void);
>  int get_xen_info_arm64(void);
> -#define paddr_to_vaddr_arm64(X) (((X) - info->phys_base) | PAGE_OFFSET)
> +#define paddr_to_vaddr_arm64(X) ((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
> +				 ((X) - (info->phys_base - PAGE_OFFSET)) :		\
> +				 (((X) - info->phys_base) | PAGE_OFFSET))
> 
>  #define find_vmemmap()		stub_false()
>  #define vaddr_to_paddr(X)	vaddr_to_paddr_arm64(X)
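[Editor's note: the version-dependent macro above can be unpacked into two plain helpers for clarity. Helper names and the constants used to exercise them are invented for the example; this is a sketch of the two branches, not makedumpfile internals:

```c
#include <assert.h>

/* Phys->virt translation for the arm64 linear map, mirroring the two
 * branches of the quoted paddr_to_vaddr_arm64(). */
static unsigned long p2v_pre_5_4(unsigned long paddr, unsigned long phys_base,
				 unsigned long page_offset)
{
	/* kernels < 5.4: additive translation into the linear map */
	return paddr - (phys_base - page_offset);
}

static unsigned long p2v_5_4_plus(unsigned long paddr, unsigned long phys_base,
				  unsigned long page_offset)
{
	/* kernels >= 5.4 (flipped VA space): OR in the PAGE_OFFSET bits */
	return (paddr - phys_base) | page_offset;
}
```
]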
> @@ -1938,6 +1940,7 @@ struct number_table {
>  	long	KERNEL_IMAGE_SIZE;
>  #ifdef __aarch64__
>  	long 	VA_BITS;
> +	unsigned long	TCR_EL1_T1SZ;
>  	unsigned long	PHYS_OFFSET;
>  	unsigned long	kimage_voffset;
>  #endif
> @@ -2389,5 +2392,6 @@ ulong htol(char *s, int flags);
>  int hexadecimal(char *s, int count);
>  int decimal(char *s, int count);
>  int file_exists(char *file);
> +int populate_kernel_version(void);
> 
>  #endif /* MAKEDUMPFILE_H */
> --
> 2.26.2


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions
  2020-09-10  5:33 [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions Bhupesh Sharma
                   ` (2 preceding siblings ...)
  2020-09-10  5:33 ` [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support) Bhupesh Sharma
@ 2020-09-24  5:23 ` HAGIO KAZUHITO(萩尾 一仁)
  2020-11-11  0:34   ` HAGIO KAZUHITO(萩尾 一仁)
  3 siblings, 1 reply; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-09-24  5:23 UTC (permalink / raw)
  To: Bhupesh Sharma, kexec; +Cc: John Donnelly, bhupesh.linux

Hi Bhupesh,

sorry for my scattered comments.

As for 1/3 and 2/3, I think we can merge them separately from 3/3, right?
So if you can ack the updated 1/3 I sent, I will merge them first.
Could you check it?

As for 3/3, you introduced a couple of new approaches, so I'd like to
discuss a little more whether there is a better way.

Thanks,
Kazu

-----Original Message-----
> Changes since v4:
> ----------------
> - v4 can be seen here:
>   https://www.spinics.net/lists/kexec/msg23850.html
> - Removed the patch (via [PATCH 4/4] in v3) which marked '--mem-usage'
>   option as unsupported for arm64 architecture, as we now have a mechanism
>   to read the 'vabits_actual' value from 'id_aa64mmfr2_el1' arm64 system
>   architecture register. As per discussions with arm64 and gcc/binutils
>   maintainers it turns out there is no standard ABI available between
>   the kernel and user-space to export this value early enough to be used
>   for page_offset calculation in the --mem-usage case. So, the next best
>   option is to have the user-space read the system register to determine
>   underlying hardware support for larger (52-bit) addressing support.
> 
>   This allows us to keep supporting '--mem-usage' option on arm64 even
>   on newer kernels (with flipped VA space).
> 
> Changes since v3:
> ----------------
> - v3 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022534.html
> - Added a new patch (via [PATCH 4/4]) which marks '--mem-usage' option as
>   unsupported for arm64 architecture. With the newer arm64 kernels
>   supporting 48-bit/52-bit VA address spaces and keeping a single
>   binary for supporting the same, the address of
>   kernel symbols like _stext, which could earlier be used to determine
>   the VA_BITS value, can no longer be used to determine whether VA_BITS is set to 48
>   or 52 in the kernel space. Hence for now, it makes sense to mark
>   '--mem-usage' option as unsupported for arm64 architecture until
>   we have more clarity from arm64 kernel maintainers on how to manage
>   the same in future kernel/makedumpfile versions.
> 
> Changes since v2:
> ----------------
> - v2 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022456.html
> - I missed some comments from Kazu sent on the LVA v1 patch when I sent
>   out the v2. So, addressing them now in v3.
> - Also added a patch that adds a tree-wide feature to read
>   'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
> 
> Changes since v1:
> ----------------
> - v1 was sent as two separate patches:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022424.html
>   (ARMv8.2-LPA)
>   http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
>   (ARMv8.2-LVA)
> - v2 combined the two in a single patchset and also addresses Kazu's
>   review comments.
> 
> This patchset adds support for ARMv8.2 extensions in makedumpfile code.
> I cover the following cases with this patchset:
> - Both old (<5.4) and new kernels (>= 5.4) work well.
> - All VA and PA bit combinations currently supported via the kernel
>   CONFIG options work well, including:
>  - 48-bit kernel VA + 52-bit PA (LPA)
>  - 52-bit kernel VA (LVA) + 52-bit PA (LPA)
> 
> This has been tested for the following use-cases:
> 1. Analysing page information via '--mem-usage' option.
> 2. Creating a dumpfile using /proc/vmcore,
> 3. Creating a dumpfile using /proc/kcore, and
> 4. Post-processing a vmcore.
> 
> I have tested this patchset on the following platforms, with kernels
> which support/do-not-support ARMv8.2 features:
> 1. CPUs which don't support ARMv8.2 features, e.g. qualcomm-amberwing,
>    ampere-osprey.
> 2. Prototype models which support ARMv8.2 extensions (e.g. ARMv8 FVP
>    simulation model).
> 
> Also a preparation patch has been added in this patchset which adds a
> common feature for archs (except arm64, for which similar support is
> added via subsequent patch) to retrieve 'MAX_PHYSMEM_BITS' from
> vmcoreinfo (if available).
> 
> This patchset ensures backward compatibility for kernel versions in
> which 'TCR_EL1.T1SZ' and 'MAX_PHYSMEM_BITS' are not available in
> vmcoreinfo.
> 
> In the newer kernels (>= 5.4.0) the following patches export these
> variables in the vmcoreinfo:
>  - 1d50e5d0c505 ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo")
>  - bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo")
> 
> Cc: John Donnelly <john.p.donnelly@oracle.com>
> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> Cc: kexec@lists.infradead.org
> 
> Bhupesh Sharma (3):
>   tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
>   makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
>   makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA
>     support)
> 
>  arch/arm.c     |   8 +-
>  arch/arm64.c   | 520 ++++++++++++++++++++++++++++++++++++++-----------
>  arch/ia64.c    |   7 +-
>  arch/ppc.c     |   8 +-
>  arch/ppc64.c   |  49 +++--
>  arch/s390x.c   |  29 +--
>  arch/sparc64.c |   9 +-
>  arch/x86.c     |  34 ++--
>  arch/x86_64.c  |  27 +--
>  common.h       |  10 +
>  makedumpfile.c |   4 +-
>  makedumpfile.h |   6 +-
>  12 files changed, 529 insertions(+), 182 deletions(-)
> 
> --
> 2.26.2




* RE: [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions
  2020-09-24  5:23 ` [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions HAGIO KAZUHITO(萩尾 一仁)
@ 2020-11-11  0:34   ` HAGIO KAZUHITO(萩尾 一仁)
  2020-11-12  6:46     ` Bhupesh Sharma
  0 siblings, 1 reply; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-11-11  0:34 UTC (permalink / raw)
  To: Bhupesh Sharma, kexec; +Cc: John Donnelly, bhupesh.linux

Hi Bhupesh,

-----Original Message-----
> Hi Bhupesh,
> 
> sorry for my scattered comments.
> 
> As for 1/3 and 2/3, I think we can merge them separately from 3/3, right?
> So if you can ack the updated 1/3 I sent, I will merge them first.
> Could you check it?

I'm hoping to release the next version of makedumpfile next week.

So I'm going to merge the modified 1/3 patch and your 2/3 patch separately
from 3/3.  I think they can use "MAX_PHYSMEM_BITS" and reduce distribution-specific
patches.  If you have any concerns about it, please let me know.

Thanks,
Kazu

> 
> As for 3/3, you introduced a couple of new approaches, so I'd like to
> discuss a little more whether there is a better way.
> 
> Thanks,
> Kazu
> 
> -----Original Message-----
> > Changes since v4:
> > ----------------
> > - v4 can be seen here:
> >   https://www.spinics.net/lists/kexec/msg23850.html
> > - Removed the patch (via [PATCH 4/4] in v3) which marked '--mem-usage'
> >   option as unsupported for arm64 architecture, as we now have a mechanism
> >   to read the 'vabits_actual' value from 'id_aa64mmfr2_el1' arm64 system
> >   architecture register. As per discussions with arm64 and gcc/binutils
> >   maintainers it turns out there is no standard ABI available between
> >   the kernel and user-space to export this value early enough to be used
> >   for page_offset calculation in the --mem-usage case. So, the next best
> >   option is to have the user-space read the system register to determine
> >   underlying hardware support for larger (52-bit) addressing support.
> >
> >   This allows us to keep supporting '--mem-usage' option on arm64 even
> >   on newer kernels (with flipped VA space).
> >
> > Changes since v3:
> > ----------------
> > - v3 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022534.html
> > - Added a new patch (via [PATCH 4/4]) which marks '--mem-usage' option as
> >   unsupported for arm64 architecture. With the newer arm64 kernels
> >   supporting 48-bit/52-bit VA address spaces and keeping a single
> >   binary for supporting the same, the address of
> >   kernel symbols like _stext, which could earlier be used to determine
> >   the VA_BITS value, can no longer be used to determine whether VA_BITS is set to 48
> >   or 52 in the kernel space. Hence for now, it makes sense to mark
> >   '--mem-usage' option as unsupported for arm64 architecture until
> >   we have more clarity from arm64 kernel maintainers on how to manage
> >   the same in future kernel/makedumpfile versions.
> >
> > Changes since v2:
> > ----------------
> > - v2 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022456.html
> > - I missed some comments from Kazu sent on the LVA v1 patch when I sent
> >   out the v2. So, addressing them now in v3.
> > - Also added a patch that adds a tree-wide feature to read
> >   'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
> >
> > Changes since v1:
> > ----------------
> > - v1 was sent as two separate patches:
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022424.html
> >   (ARMv8.2-LPA)
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> >   (ARMv8.2-LVA)
> > - v2 combined the two in a single patchset and also addresses Kazu's
> >   review comments.
> >
> > This patchset adds support for ARMv8.2 extensions in makedumpfile code.
> > I cover the following cases with this patchset:
> > - Both old (<5.4) and new kernels (>= 5.4) work well.
> > - All VA and PA bit combinations currently supported via the kernel
> >   CONFIG options work well, including:
> >  - 48-bit kernel VA + 52-bit PA (LPA)
> >  - 52-bit kernel VA (LVA) + 52-bit PA (LPA)
> >
> > This has been tested for the following use-cases:
> > 1. Analysing page information via '--mem-usage' option.
> > 2. Creating a dumpfile using /proc/vmcore,
> > 3. Creating a dumpfile using /proc/kcore, and
> > 4. Post-processing a vmcore.
> >
> > I have tested this patchset on the following platforms, with kernels
> > which support/do-not-support ARMv8.2 features:
> > 1. CPUs which don't support ARMv8.2 features, e.g. qualcomm-amberwing,
> >    ampere-osprey.
> > 2. Prototype models which support ARMv8.2 extensions (e.g. ARMv8 FVP
> >    simulation model).
> >
> > Also a preparation patch has been added in this patchset which adds a
> > common feature for archs (except arm64, for which similar support is
> > added via subsequent patch) to retrieve 'MAX_PHYSMEM_BITS' from
> > vmcoreinfo (if available).
> >
> > This patchset ensures backward compatibility for kernel versions in
> > which 'TCR_EL1.T1SZ' and 'MAX_PHYSMEM_BITS' are not available in
> > vmcoreinfo.
> >
> > In the newer kernels (>= 5.4.0) the following patches export these
> > variables in the vmcoreinfo:
> >  - 1d50e5d0c505 ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo")
> >  - bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo")
> >
> > Cc: John Donnelly <john.p.donnelly@oracle.com>
> > Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> > Cc: kexec@lists.infradead.org
> >
> > Bhupesh Sharma (3):
> >   tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
> >   makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
> >   makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA
> >     support)
> >
> >  arch/arm.c     |   8 +-
> >  arch/arm64.c   | 520 ++++++++++++++++++++++++++++++++++++++-----------
> >  arch/ia64.c    |   7 +-
> >  arch/ppc.c     |   8 +-
> >  arch/ppc64.c   |  49 +++--
> >  arch/s390x.c   |  29 +--
> >  arch/sparc64.c |   9 +-
> >  arch/x86.c     |  34 ++--
> >  arch/x86_64.c  |  27 +--
> >  common.h       |  10 +
> >  makedumpfile.c |   4 +-
> >  makedumpfile.h |   6 +-
> >  12 files changed, 529 insertions(+), 182 deletions(-)
> >
> > --
> > 2.26.2
> 
> 



* Re: [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions
  2020-11-11  0:34   ` HAGIO KAZUHITO(萩尾 一仁)
@ 2020-11-12  6:46     ` Bhupesh Sharma
  2020-11-13  6:03       ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 1 reply; 13+ messages in thread
From: Bhupesh Sharma @ 2020-11-12  6:46 UTC (permalink / raw)
  To: HAGIO KAZUHITO(萩尾 一仁)
  Cc: John Donnelly, bhupesh.linux, kexec

Hi Kazu,

On Wed, Nov 11, 2020 at 6:05 AM HAGIO KAZUHITO(萩尾 一仁)
<k-hagio-ab@nec.com> wrote:
>
> Hi Bhupesh,
>
> -----Original Message-----
> > Hi Bhupesh,
> >
> > sorry for my scattered comments.
> >
> > As for 1/3 and 2/3, I think we can merge them separately from 3/3, right?
> > So if you can ack the updated 1/3 I sent, I will merge them first.
> > Could you check it?
>
> I'm hoping to release the next version of makedumpfile next week.
>
> So I'm going to merge the modified 1/3 patch and your 2/3 patch separately
> from 3/3.  I think they can use "MAX_PHYSMEM_BITS" and reduce distribution-specific
> patches.  If you have any concerns about it, please let me know.

This looks fine to me. Please feel free to merge the patchset in this way.

Regards,
Bhupesh

> > Thanks,
> > Kazu
> >
> > -----Original Message-----
> > > Changes since v4:
> > > ----------------
> > > - v4 can be seen here:
> > >   https://www.spinics.net/lists/kexec/msg23850.html
> > > - Removed the patch (via [PATCH 4/4] in v3) which marked '--mem-usage'
> > >   option as unsupported for arm64 architecture, as we now have a mechanism
> > >   to read the 'vabits_actual' value from 'id_aa64mmfr2_el1' arm64 system
> > >   architecture register. As per discussions with arm64 and gcc/binutils
> > >   maintainers it turns out there is no standard ABI available between
> > >   the kernel and user-space to export this value early enough to be used
> > >   for page_offset calculation in the --mem-usage case. So, the next best
> > >   option is to have the user-space read the system register to determine
> > >   underlying hardware support for larger (52-bit) addressing support.
> > >
> > >   This allows us to keep supporting '--mem-usage' option on arm64 even
> > >   on newer kernels (with flipped VA space).
> > >
> > > Changes since v3:
> > > ----------------
> > > - v3 can be seen here:
> > >   http://lists.infradead.org/pipermail/kexec/2019-March/022534.html
> > > - Added a new patch (via [PATCH 4/4]) which marks '--mem-usage' option as
> > >   unsupported for arm64 architecture. With the newer arm64 kernels
> > >   supporting 48-bit/52-bit VA address spaces and keeping a single
> > >   binary for supporting the same, the address of
> > >   kernel symbols like _stext, which could earlier be used to determine
> > >   the VA_BITS value, can no longer be used to determine whether VA_BITS is set to 48
> > >   or 52 in the kernel space. Hence for now, it makes sense to mark
> > >   '--mem-usage' option as unsupported for arm64 architecture until
> > >   we have more clarity from arm64 kernel maintainers on how to manage
> > >   the same in future kernel/makedumpfile versions.
> > >
> > > Changes since v2:
> > > ----------------
> > > - v2 can be seen here:
> > >   http://lists.infradead.org/pipermail/kexec/2019-February/022456.html
> > > - I missed some comments from Kazu sent on the LVA v1 patch when I sent
> > >   out the v2. So, addressing them now in v3.
> > > - Also added a patch that adds a tree-wide feature to read
> > >   'MAX_PHYSMEM_BITS' from vmcoreinfo (if available).
> > >
> > > Changes since v1:
> > > ----------------
> > > - v1 was sent as two separate patches:
> > >   http://lists.infradead.org/pipermail/kexec/2019-February/022424.html
> > >   (ARMv8.2-LPA)
> > >   http://lists.infradead.org/pipermail/kexec/2019-February/022425.html
> > >   (ARMv8.2-LVA)
> > > - v2 combined the two in a single patchset and also addresses Kazu's
> > >   review comments.
> > >
> > > This patchset adds support for ARMv8.2 extensions in makedumpfile code.
> > > I cover the following cases with this patchset:
> > > - Both old (<5.4) and new kernels (>= 5.4) work well.
> > > - All VA and PA bit combinations currently supported via the kernel
> > >   CONFIG options work well, including:
> > >  - 48-bit kernel VA + 52-bit PA (LPA)
> > >  - 52-bit kernel VA (LVA) + 52-bit PA (LPA)
> > >
> > > This has been tested for the following use-cases:
> > > 1. Analysing page information via '--mem-usage' option.
> > > 2. Creating a dumpfile using /proc/vmcore,
> > > 3. Creating a dumpfile using /proc/kcore, and
> > > 4. Post-processing a vmcore.
> > >
> > > I have tested this patchset on the following platforms, with kernels
> > > which support/do-not-support ARMv8.2 features:
> > > 1. CPUs which don't support ARMv8.2 features, e.g. qualcomm-amberwing,
> > >    ampere-osprey.
> > > 2. Prototype models which support ARMv8.2 extensions (e.g. ARMv8 FVP
> > >    simulation model).
> > >
> > > Also a preparation patch has been added in this patchset which adds a
> > > common feature for archs (except arm64, for which similar support is
> > > added via subsequent patch) to retrieve 'MAX_PHYSMEM_BITS' from
> > > vmcoreinfo (if available).
> > >
> > > This patchset ensures backward compatibility for kernel versions in
> > > which 'TCR_EL1.T1SZ' and 'MAX_PHYSMEM_BITS' are not available in
> > > vmcoreinfo.
> > >
> > > In the newer kernels (>= 5.4.0) the following patches export these
> > > variables in the vmcoreinfo:
> > >  - 1d50e5d0c505 ("crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo")
> > >  - bbdbc11804ff ("arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo")
> > >
> > > Cc: John Donnelly <john.p.donnelly@oracle.com>
> > > Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com>
> > > Cc: kexec@lists.infradead.org
> > >
> > > Bhupesh Sharma (3):
> > >   tree-wide: Retrieve 'MAX_PHYSMEM_BITS' from vmcoreinfo (if available)
> > >   makedumpfile/arm64: Add support for ARMv8.2-LPA (52-bit PA support)
> > >   makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA
> > >     support)
> > >
> > >  arch/arm.c     |   8 +-
> > >  arch/arm64.c   | 520 ++++++++++++++++++++++++++++++++++++++-----------
> > >  arch/ia64.c    |   7 +-
> > >  arch/ppc.c     |   8 +-
> > >  arch/ppc64.c   |  49 +++--
> > >  arch/s390x.c   |  29 +--
> > >  arch/sparc64.c |   9 +-
> > >  arch/x86.c     |  34 ++--
> > >  arch/x86_64.c  |  27 +--
> > >  common.h       |  10 +
> > >  makedumpfile.c |   4 +-
> > >  makedumpfile.h |   6 +-
> > >  12 files changed, 529 insertions(+), 182 deletions(-)
> > >
> > > --
> > > 2.26.2
> >
> >
>




* RE: [PATCH v5 0/3] makedumpfile/arm64: Add support for ARMv8.2 extensions
  2020-11-12  6:46     ` Bhupesh Sharma
@ 2020-11-13  6:03       ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 0 replies; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2020-11-13  6:03 UTC (permalink / raw)
  To: Bhupesh Sharma; +Cc: John Donnelly, bhupesh.linux, kexec

Hi Bhupesh,

-----Original Message-----
> > I'm hoping to release the next version of makedumpfile next week.
> >
> > So I'm going to merge the modified 1/3 patch and your 2/3 patch separately
> > from 3/3.  I think they can use "MAX_PHYSMEM_BITS" and reduce distribution-specific
> > patches.  If you have any concerns about it, please let me know.
> 
> This looks fine to me. Please feel free to merge the patchset in this way.

Thanks for the reply, merged the two patches.

Kazu



* Re: [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)
  2020-09-24  5:05   ` HAGIO KAZUHITO(萩尾 一仁)
@ 2021-01-13  9:17     ` Pingfan Liu
  2021-01-14  0:43       ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 1 reply; 13+ messages in thread
From: Pingfan Liu @ 2021-01-13  9:17 UTC (permalink / raw)
  To: HAGIO KAZUHITO(萩尾 一仁),
	kexec, Bhupesh Sharma

On Thu, Sep 24, 2020 at 05:05:45AM +0000, HAGIO KAZUHITO(萩尾 一仁) wrote:
> Hi Bhupesh,
> 
> Thank you for the updated patch.
> 
> -----Original Message-----
> > With ARMv8.2-LVA architecture extension availability, arm64 hardware
> > which supports this extension can support up to 52-bit virtual
> > addresses. This is especially useful for having a 52-bit user-space virtual
> > address space while the kernel can still retain 48-bit/52-bit virtual
> > addressing.
> > 
> > Since at the moment we enable the support of this extension in the
> > kernel via a CONFIG flag (CONFIG_ARM64_VA_BITS_52), there is
> > no clear mechanism in user-space to determine this CONFIG
> > flag value and use it to determine the kernel-space VA address range
> > values.
> > 
> > 'makedumpfile' can instead use the 'TCR_EL1.T1SZ' value from vmcoreinfo
> > which indicates the size offset of the memory region addressed by
> > TTBR1_EL1 (and hence can be used for determining the
> > vabits_actual value).
> > 
> > Using the vmcoreinfo variable exported by kernel commit
> >  bbdbc11804ff ("arm64/crash_core: Export  TCR_EL1.T1SZ in vmcoreinfo"),
> > the user-space can use the following computation for determining whether
> >  an address lies in the linear map range (for newer kernels >= 5.4):
> > 
> >   #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))
> > 
> > Note that for the --mem-usage case, though, we need to calculate the
> > vabits_actual value before the vmcoreinfo read functionality is ready,
> 
> For this, can't we read the TCR_EL1.T1SZ from vmcoreinfo in /proc/kcore's
> ELF note?  I think we can use the common functions used for vmcore with it.
> 
> I'll write a patch to do so if it sounds good.
> 
> > so we can instead read the architecture register ID_AA64MMFR2_EL1
> > directly to see if the underlying hardware supports 52-bit addressing
> > and accordingly set vabits_actual as:
> > 
> >    read_id_aa64mmfr2_el1();
> >    if (hardware supports 52-bit addressing)
> > 	vabits_actual = 52;
> >    else
> > 	vabits_actual = va_bits value calculated via _stext symbol;
> > 
> > Also make sure that the page_offset, is_linear_addr(addr) and __pa()
> > calculations work both for older (< 5.4) and newer kernels (>= 5.4).
> > 
> > I have tested several combinations with both kernel categories
> > [e.g. with different VA (39, 42, 48 and 52-bit) and PA combinations
> > (48 and 52-bit)] on at least 3 different boards.
> > 
> > Unfortunately, this means that we need to call 'populate_kernel_version()'
> > earlier than 'get_page_offset_arm64()', as 'info->kernel_version' remains
> > uninitialized before its first use otherwise.
> 
> The populate_kernel_version() uses uname(), so this means there will
> be cases where makedumpfile doesn't work with vmcores that were
> captured on kernels other than the running one.  This is a rather big limitation,
> especially for backward-compatibility testing, and it would be better to
> avoid changing behavior depending on the environment rather than on the data.
> 
> Is there no room to avoid it?

I have a new idea about it, which avoids any version judgment. Please see the comment inline.
> 
> Just an idea, but can we use the OSRELEASE vmcoreinfo entry in the ELF note
> first to determine the kernel version?  It comes from init_uts_ns.name.release,
> so why can't we use it?
> 
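[Editor's note: the OSRELEASE idea above could look roughly like this; the parser is a hypothetical sketch, not an implementation from the thread. It derives the version from the vmcoreinfo release string instead of uname(), so the decision follows the dump's data rather than the running kernel:

```c
#include <assert.h>
#include <stdio.h>

/* KERNEL_VERSION mirrors the usual kernel macro; the parser is a
 * hypothetical sketch of deriving the version from an OSRELEASE
 * string such as "5.4.0-80.el8.aarch64". */
#define KERNEL_VERSION(a, b, c)	(((a) << 16) + ((b) << 8) + (c))

static int version_from_osrelease(const char *release)
{
	int maj = 0, min = 0, patch = 0;

	if (sscanf(release, "%d.%d.%d", &maj, &min, &patch) != 3)
		return -1;
	return KERNEL_VERSION(maj, min, patch);
}
```
]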
> Thanks,
> Kazu
> 
> > 
> > This patch is in accordance with the ARMv8 Architecture Reference Manual.
> > 
> > Cc: Kazuhito Hagio <k-hagio at ab.jp.nec.com>
> > Cc: John Donnelly <john.p.donnelly at oracle.com>
> > Cc: kexec at lists.infradead.org
> > Signed-off-by: Bhupesh Sharma <bhsharma at redhat.com>
> > ---
> >  arch/arm64.c   | 233 ++++++++++++++++++++++++++++++++++++++++++-------
> >  common.h       |  10 +++
> >  makedumpfile.c |   4 +-
> >  makedumpfile.h |   6 +-
> >  4 files changed, 218 insertions(+), 35 deletions(-)
> > 
> > diff --git a/arch/arm64.c b/arch/arm64.c
> > index 709e0a506916..ccaa8641ca66 100644
> > --- a/arch/arm64.c
> > +++ b/arch/arm64.c
> > @@ -19,10 +19,23 @@
> > 
> >  #ifdef __aarch64__
> > 
> > +#include <asm/hwcap.h>
> > +#include <sys/auxv.h>
> >  #include "../elf_info.h"
> >  #include "../makedumpfile.h"
> >  #include "../print_info.h"
> > 
> > +/* ID_AA64MMFR2_EL1 related helpers: */
> > +#define ID_AA64MMFR2_LVA_SHIFT	16
> > +#define ID_AA64MMFR2_LVA_MASK	(0xf << ID_AA64MMFR2_LVA_SHIFT)
> > +
> > +/* CPU feature ID registers */
> > +#define get_cpu_ftr(id) ({							\
> > +		unsigned long __val;						\
> > +		asm volatile("mrs %0, " __stringify(id) : "=r" (__val));	\
> > +		__val;								\
> > +})
> > +
> >  typedef struct {
> >  	unsigned long pgd;
> >  } pgd_t;
> > @@ -47,6 +60,7 @@ typedef struct {
> >  static int lpa_52_bit_support_available;
> >  static int pgtable_level;
> >  static int va_bits;
> > +static int vabits_actual;
> >  static unsigned long kimage_voffset;
> > 
> >  #define SZ_4K			4096
> > @@ -58,7 +72,6 @@ static unsigned long kimage_voffset;
> >  #define PAGE_OFFSET_42		((0xffffffffffffffffUL) << 42)
> >  #define PAGE_OFFSET_47		((0xffffffffffffffffUL) << 47)
> >  #define PAGE_OFFSET_48		((0xffffffffffffffffUL) << 48)
> > -#define PAGE_OFFSET_52		((0xffffffffffffffffUL) << 52)
> > 
> >  #define pgd_val(x)		((x).pgd)
> >  #define pud_val(x)		(pgd_val((x).pgd))
> > @@ -219,13 +232,25 @@ pmd_page_paddr(pmd_t pmd)
> >  #define pte_index(vaddr) 		(((vaddr) >> PAGESHIFT()) & (PTRS_PER_PTE - 1))
> >  #define pte_offset(dir, vaddr) 		(pmd_page_paddr((*dir)) + pte_index(vaddr) * sizeof(pte_t))
> > 
> > +/*
> > + * The linear kernel range starts at the bottom of the virtual address
> > + * space. Testing the top bit for the start of the region is a
> > + * sufficient check and avoids having to worry about the tag.
> > + */
> > +#define is_linear_addr(addr)	((info->kernel_version < KERNEL_VERSION(5, 4, 0)) ?	\
> > +	(!!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))) : \
> > +	(!((unsigned long)(addr) & (1UL << (vabits_actual - 1)))))
> > +
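[Editor's note: the flip that the quoted is_linear_addr() macro encodes can be illustrated with two plain helpers; the helper names and the addresses used below are example values for a 48-bit configuration:

```c
#include <assert.h>

/* On kernels >= 5.4 the linear map moved to the bottom of the kernel
 * VA range, so bit (vabits_actual - 1) is clear for linear addresses;
 * on older kernels that bit is set. */
static int is_linear_flipped(unsigned long addr, int vabits_actual)
{
	return !(addr & (1UL << (vabits_actual - 1)));
}

static int is_linear_legacy(unsigned long addr, int vabits_actual)
{
	return !!(addr & (1UL << (vabits_actual - 1)));
}
```
]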
> >  static unsigned long long
> >  __pa(unsigned long vaddr)
> >  {
> >  	if (kimage_voffset == NOT_FOUND_NUMBER ||
> > -			(vaddr >= PAGE_OFFSET))
> > -		return (vaddr - PAGE_OFFSET + info->phys_base);
> > -	else
> > +			is_linear_addr(vaddr)) {
> > +		if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> > +			return ((vaddr & ~PAGE_OFFSET) + info->phys_base);
> > +		else
> > +			return (vaddr + info->phys_base - PAGE_OFFSET);
> > +	} else
> >  		return (vaddr - kimage_voffset);
> >  }
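[Editor's note: mirroring the quoted __pa() linear-map branches as standalone functions; names and the values in the checks are invented for illustration:

```c
#include <assert.h>

/* Linear-map virt->phys translation as in the quoted __pa():
 * kernels < 5.4 mask PAGE_OFFSET out of the vaddr, while
 * kernels >= 5.4 subtract it. */
static unsigned long pa_pre_5_4(unsigned long vaddr, unsigned long phys_base,
				unsigned long page_offset)
{
	return (vaddr & ~page_offset) + phys_base;
}

static unsigned long pa_5_4_plus(unsigned long vaddr, unsigned long phys_base,
				 unsigned long page_offset)
{
	return vaddr + phys_base - page_offset;
}
```
]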
> > 
> > @@ -254,6 +279,7 @@ static int calculate_plat_config(void)
> >  			(PAGESIZE() == SZ_64K && va_bits == 42)) {
> >  		pgtable_level = 2;
> >  	} else if ((PAGESIZE() == SZ_64K && va_bits == 48) ||
> > +			(PAGESIZE() == SZ_64K && va_bits == 52) ||
> >  			(PAGESIZE() == SZ_4K && va_bits == 39) ||
> >  			(PAGESIZE() == SZ_16K && va_bits == 47)) {
> >  		pgtable_level = 3;
> > @@ -288,8 +314,14 @@ get_phys_base_arm64(void)
> >  		return TRUE;
> >  	}
> > 
> > +	/* Ignore the 1st PT_LOAD */
> >  	if (get_num_pt_loads() && PAGE_OFFSET) {
> > -		for (i = 0;
> > +		/* Note that the following loop starts with i = 1.
> > +		 * This is required to make sure that the following logic
> > +		 * works both for old and newer kernels (with flipped
> > +		 * VA space, i.e. >= 5.4.0)
> > +		 */
> > +		for (i = 1;
> >  		    get_pt_load(i, &phys_start, NULL, &virt_start, NULL);
> >  		    i++) {
> >  			if (virt_start != NOT_KV_ADDR
> > @@ -346,6 +378,139 @@ get_stext_symbol(void)
> >  	return(found ? kallsym : FALSE);
> >  }
> > 
> > +static int
> > +get_va_bits_from_stext_arm64(void)
> > +{
> > +	ulong _stext;
> > +
> > +	_stext = get_stext_symbol();
> > +	if (!_stext) {
> > +		ERRMSG("Can't get the symbol of _stext.\n");
> > +		return FALSE;
> > +	}
> > +
> > +	/* Derive va_bits as per arch/arm64/Kconfig. Note that this is a
> > +	 * best-case approximation at the moment, as there can be
> > +	 * inconsistencies in this calculation (e.g., in the
> > +	 * 52-bit kernel VA case, the 48th bit is set in
> > +	 * the _stext symbol).
> > +	 *
> > +	 * So, we need to rely on the vabits_actual symbol in the
> > +	 * vmcoreinfo or read the system register for an accurate value
> > +	 * of the virtual addressing supported by the underlying kernel.
> > +	 */
> > +	if ((_stext & PAGE_OFFSET_48) == PAGE_OFFSET_48) {
> > +		va_bits = 48;
> > +	} else if ((_stext & PAGE_OFFSET_47) == PAGE_OFFSET_47) {
> > +		va_bits = 47;
> > +	} else if ((_stext & PAGE_OFFSET_42) == PAGE_OFFSET_42) {
> > +		va_bits = 42;
> > +	} else if ((_stext & PAGE_OFFSET_39) == PAGE_OFFSET_39) {
> > +		va_bits = 39;
> > +	} else if ((_stext & PAGE_OFFSET_36) == PAGE_OFFSET_36) {
> > +		va_bits = 36;
> > +	} else {
> > +		ERRMSG("Cannot find a proper _stext for calculating VA_BITS\n");
> > +		return FALSE;
> > +	}
> > +
> > +	DEBUG_MSG("va_bits       : %d (approximation via _stext)\n", va_bits);
> > +
> > +	return TRUE;
> > +}
> > +
> > +/* Note that the ID_AA64MMFR2_EL1 architecture register can be
> > + * read only when we give an .arch hint to gcc/binutils, so we use
> > + * the gcc construct '__attribute__ ((target ("arch=armv8.2-a")))'
> > + * here, which acts as an .arch directive (see the
> > + * AArch64-Target-selection-directives documentation from Arm for
> > + * details). This is required only for this function, to make sure
> > + * it compiles well with gcc/binutils.
> > + */
> > +__attribute__ ((target ("arch=armv8.2-a")))
> > +static unsigned long
> > +read_id_aa64mmfr2_el1(void)
> > +{
> > +	return get_cpu_ftr(ID_AA64MMFR2_EL1);
> > +}
> > +
> > +static int
> > +get_vabits_actual_from_id_aa64mmfr2_el1(void)
> > +{
> > +	int l_vabits_actual;
> > +	unsigned long val;
> > +
> > +	/* Check if ID_AA64MMFR2_EL1 CPU-ID register indicates
> > +	 * ARMv8.2/LVA support:
> > +	 * VARange, bits [19:16]
> > +	 *   From ARMv8.2:
> > +	 *   Indicates support for a larger virtual address.
> > +	 *   Defined values are:
> > +	 *     0b0000 VMSAv8-64 supports 48-bit VAs.
> > +	 *     0b0001 VMSAv8-64 supports 52-bit VAs when using the 64KB
> > +	 *            page size. The other translation granules support
> > +	 *            48-bit VAs.
> > +	 *
> > +	 * See ARMv8 ARM for more details.
> > +	 */
> > +	if (!(getauxval(AT_HWCAP) & HWCAP_CPUID)) {
> > +		ERRMSG("arm64 CPUID registers unavailable.\n");
> > +		return ERROR;
> > +	}
> > +
> > +	val = read_id_aa64mmfr2_el1();
> > +	val = (val & ID_AA64MMFR2_LVA_MASK) >> ID_AA64MMFR2_LVA_SHIFT;
> > +
> > +	if ((val == 0x1) && (PAGESIZE() == SZ_64K))
> > +		l_vabits_actual = 52;
> > +	else
> > +		l_vabits_actual = 48;
> > +
> > +	return l_vabits_actual;
> > +}
> > +
> > +static void
> > +get_page_offset_arm64(void)
> > +{
> > +	/* Check if 'vabits_actual' is initialized yet.
> > +	 * If not, our best bet is to read ID_AA64MMFR2_EL1 CPU-ID
> > +	 * register.
> > +	 */
> > +	if (!vabits_actual) {
> > +		vabits_actual = get_vabits_actual_from_id_aa64mmfr2_el1();
> > +		if ((vabits_actual == ERROR) || (vabits_actual != 52)) {
> > +			/* If we cannot read ID_AA64MMFR2_EL1 arch
> > +			 * register or if this register does not indicate
> > +			 * support for a larger virtual address, our last
> > +			 * option is to use the VA_BITS to calculate the
> > +			 * PAGE_OFFSET value, i.e. vabits_actual = VA_BITS.
> > +			 */
> > +			vabits_actual = va_bits;
> > +			DEBUG_MSG("vabits_actual : %d (approximation via va_bits)\n",
> > +					vabits_actual);
> > +		} else
> > +			DEBUG_MSG("vabits_actual : %d (via id_aa64mmfr2_el1)\n",
> > +					vabits_actual);
> > +	}
> > +
> > +	if (!populate_kernel_version()) {
> > +		ERRMSG("Cannot get information about current kernel\n");
> > +		return;
> > +	}
> > +
> > +	/* See arch/arm64/include/asm/memory.h for more details of
> > +	 * the PAGE_OFFSET calculation.
> > +	 */
> > +	if (info->kernel_version < KERNEL_VERSION(5, 4, 0))
> > +		info->page_offset = ((0xffffffffffffffffUL) -
> > +				((1UL) << (vabits_actual - 1)) + 1);
> > +	else
> > +		info->page_offset = (-(1UL << vabits_actual));
> > +

Considering the following related commit order

    b6d00d47e81a arm64: mm: Introduce 52-bit Kernel VAs                        (2)
    ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE
    c8b6d2ccf9b1 arm64: mm: Separate out vmemmap
    c812026c54cf arm64: mm: Logic to make offset_ttbr1 conditional
    5383cc6efed1 arm64: mm: Introduce vabits_actual
    90ec95cda91a arm64: mm: Introduce VA_BITS_MIN
    99426e5e8c9f arm64: dump: De-constify VA_START and KASAN_SHADOW_START
    6bd1d0be0e97 arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
    14c127c957c1 arm64: mm: Flip kernel VA space                               (1)

And
    #define _PAGE_END(va)		(-(UL(1) << ((va) - 1)))
    #define PAGE_OFFSET (((0xffffffffffffffffUL) - ((1UL) << (vabits_actual - 1)) + 1))  //old
    #define PAGE_OFFSET (-(1UL << vabits_actual))                                        //new

Before (1), SYMBOL(_text) < PAGE_OFFSET; afterward, SYMBOL(_text) > PAGE_END, where PAGE_END equals the "old PAGE_OFFSET".

So the comparison of kernel versions can be replaced by
    if SYMBOL(_text) > PAGE_END
	info->page_offset = new PAGE_OFFSET
    else
	info->page_offset = old PAGE_OFFSET


Any comment?

Thanks,
	Pingfan


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


* RE: [PATCH v5 3/3] makedumpfile/arm64: Add support for ARMv8.2-LVA (52-bit kernel VA support)
  2021-01-13  9:17     ` Pingfan Liu
@ 2021-01-14  0:43       ` HAGIO KAZUHITO(萩尾 一仁)
  0 siblings, 0 replies; 13+ messages in thread
From: HAGIO KAZUHITO(萩尾 一仁) @ 2021-01-14  0:43 UTC (permalink / raw)
  To: Pingfan Liu, kexec, Bhupesh Sharma

-----Original Message-----
> Considering the following related commit order
> 
>     b6d00d47e81a arm64: mm: Introduce 52-bit Kernel VAs                        (2)
>     ce3aaed87344 arm64: mm: Modify calculation of VMEMMAP_SIZE
>     c8b6d2ccf9b1 arm64: mm: Separate out vmemmap
>     c812026c54cf arm64: mm: Logic to make offset_ttbr1 conditional
>     5383cc6efed1 arm64: mm: Introduce vabits_actual
>     90ec95cda91a arm64: mm: Introduce VA_BITS_MIN
>     99426e5e8c9f arm64: dump: De-constify VA_START and KASAN_SHADOW_START
>     6bd1d0be0e97 arm64: kasan: Switch to using KASAN_SHADOW_OFFSET
>     14c127c957c1 arm64: mm: Flip kernel VA space                               (1)
> 
> And
>     #define _PAGE_END(va)		(-(UL(1) << ((va) - 1)))
>     #define PAGE_OFFSET (((0xffffffffffffffffUL) - ((1UL) << (vabits_actual - 1)) + 1))  //old
>     #define PAGE_OFFSET (-(1UL << vabits_actual))                                        //new
> 
> Before (1), SYMBOL(_text) < PAGE_OFFSET; afterward, SYMBOL(_text) > PAGE_END, where PAGE_END equals the "old PAGE_OFFSET".
> 
> So the comparison of kernel versions can be replaced by
>     if SYMBOL(_text) > PAGE_END
> 	info->page_offset = new PAGE_OFFSET
>     else
> 	info->page_offset = old PAGE_OFFSET

Oh, if we use PAGE_END(VA_BITS_MIN) here, which was actually changed in 5.11
from PAGE_END(vabits_actual), that sounds good to me.  Excellent!

I've been splitting and rewriting this patch of Bhupesh's to remove some parts
I mentioned, to make review easier, and to add my own ideas [1], though it's
still halfway done.  I'd like to try taking Pingfan's idea in.

[1] https://github.com/k-hagio/makedumpfile/commits/arm64.kh.test2


BTW, we have one more challenge: for 5.4+ kernels without NUMBER(TCR_EL1_T1SZ),
I'm thinking about using SYMBOL(mem_section) to get vabits_actual, because it
should be an address in the kernel linear space.

+       if (NUMBER(TCR_EL1_T1SZ) != NOT_FOUND_NUMBER) {
+               vabits_actual = 64 - NUMBER(TCR_EL1_T1SZ);
+               DEBUG_MSG("vabits_actual : %d (vmcoreinfo)\n", vabits_actual);
+       } else if ((info->kernel_version >= KERNEL_VERSION(5, 4, 0)) &&
+                   (va_bits == 52) && (SYMBOL(mem_section) != NOT_FOUND_SYMBOL)) {
+               /*
+                * Linux 5.4 through 5.10 have the following linear space:
+                *  48-bit: 0xffff000000000000 - 0xffff7fffffffffff
+                *  52-bit: 0xfff0000000000000 - 0xfff7ffffffffffff
+                */
+               if (SYMBOL(mem_section) & (1UL << (52 - 1)))
+                       vabits_actual = 48;
+               else
+                       vabits_actual = 52;
+       } else {
+               vabits_actual = va_bits;
+               DEBUG_MSG("vabits_actual : %d (same as va_bits)\n", vabits_actual);
+       }

This might not work with 5.11, but should work through 5.10.

Any comments?

Thanks,
Kazu
