* [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
@ 2016-11-22  4:32 ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-22  4:32 UTC (permalink / raw)
  To: kexec, geoff; +Cc: Pratyush Anand, james.morse, linux-arm-kernel

It takes more than 2 minutes to verify the SHA in purgatory when the vmlinuz
image is around 13MB and the initramfs is around 30MB. It still takes more
than 20 seconds even with -O2 optimization enabled. However, if the dcache is
enabled during purgatory execution, SHA verification takes just a second.

Therefore, these patches add support for enabling the dcache during
purgatory execution. There is no change in kexec behaviour by default:
the dcache is enabled only when --enable-dcache is passed to kexec.
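
For scale, taking the numbers above at face value (roughly 13MB + 30MB =
43MB of data), this works out to about 0.36 MB/s without optimization,
about 2 MB/s with -O2, and around 43 MB/s with the dcache on. These are
only back-of-the-envelope estimates derived from the figures quoted above.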

Pratyush Anand (2):
  arm64: Add enable/disable d-cache support for purgatory
  arm64: Pass RAM boundary and enable-dcache flag to purgatory

 kexec/arch/arm64/include/arch/options.h |   6 +-
 kexec/arch/arm64/include/types.h        |  16 ++
 kexec/arch/arm64/kexec-arm64.c          |  25 ++-
 purgatory/arch/arm64/Makefile           |   2 +
 purgatory/arch/arm64/cache-asm.S        | 186 ++++++++++++++++++
 purgatory/arch/arm64/cache.c            | 330 ++++++++++++++++++++++++++++++++
 purgatory/arch/arm64/cache.h            |  79 ++++++++
 purgatory/arch/arm64/purgatory-arm64.c  |  11 ++
 8 files changed, 653 insertions(+), 2 deletions(-)
 create mode 100644 kexec/arch/arm64/include/types.h
 create mode 100644 purgatory/arch/arm64/cache-asm.S
 create mode 100644 purgatory/arch/arm64/cache.c
 create mode 100644 purgatory/arch/arm64/cache.h

-- 
2.7.4

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-11-22  4:32 ` Pratyush Anand
@ 2016-11-22  4:32   ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-22  4:32 UTC (permalink / raw)
  To: kexec, geoff; +Cc: Pratyush Anand, james.morse, linux-arm-kernel

This patch adds support to enable/disable d-cache, which can be used for
faster purgatory sha256 verification.

We support only 4K and 64K page sizes. This code will not work if the
hardware supports neither of these page sizes. Therefore, the D-cache is
disabled by default and is enabled only when "enable-dcache" is passed to
kexec.
Since this is an identity-mapped system, VA_BITS will be the same as the
maximum supported PA bits. If VA_BITS <= 42 for 64K or <= 39 for 4K, there
will be only one level of page table, holding block descriptor entries.
Otherwise, for 4K mapping, TTBR points to a level 0 table, which has only
table entries pointing to a level 1 table; level 1 has only block entries,
each mapping a 1GB block. For 64K mapping, TTBR points to a level 1 table,
which has only table entries pointing to a level 2 table; level 2 has only
block entries, each mapping a 512MB block. If the UART base address and the
RAM addresses are not at least 1GB (4K) or 512MB (64K) apart, the mapping
result could be unpredictable. In that case one more level of granularity
would be needed, but keep it like this until someone needs that.
We cannot allocate dynamic memory in purgatory. Therefore the page table
allocation size is fixed at (3 * MAX_PAGE_SIZE). (page_table) points to the
first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
points to the table at the next level (having block entries). If the index
for the RAM area and the UART area in the first-level table is not the
same, another next-level table will be needed, located at (page_table +
2 * MAX_PAGE_SIZE).
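
To make the single-level case concrete, here is a minimal standalone sketch
(illustration only, not part of this patch; the 1GB section size and the
function name are assumptions based on the 4K-granule description above):

    #include <stdint.h>

    #define SECTION_SHIFT 30  /* 4K granule: one level-1 block covers 1GB */

    /* Fill [start, end) with identity-mapped block descriptors. */
    static void map_blocks(uint64_t *table, uint64_t start, uint64_t end,
                           uint64_t flags)
    {
            uint64_t idx = start >> SECTION_SHIFT;
            uint64_t last = (end - 1) >> SECTION_SHIFT;

            for (; idx <= last; idx++)
                    /* identity map: output address is idx << SECTION_SHIFT */
                    table[idx] = (idx << SECTION_SHIFT) | flags;
    }

With VA_BITS <= 39 the table indexed here is the one TTBR points at, which
corresponds to the pgtable_level == 1 path in cache.c below.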

Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 purgatory/arch/arm64/Makefile    |   2 +
 purgatory/arch/arm64/cache-asm.S | 186 ++++++++++++++++++++++
 purgatory/arch/arm64/cache.c     | 330 +++++++++++++++++++++++++++++++++++++++
 purgatory/arch/arm64/cache.h     |  79 ++++++++++
 4 files changed, 597 insertions(+)
 create mode 100644 purgatory/arch/arm64/cache-asm.S
 create mode 100644 purgatory/arch/arm64/cache.c
 create mode 100644 purgatory/arch/arm64/cache.h

diff --git a/purgatory/arch/arm64/Makefile b/purgatory/arch/arm64/Makefile
index 636abeab17b2..0f80f8165d90 100644
--- a/purgatory/arch/arm64/Makefile
+++ b/purgatory/arch/arm64/Makefile
@@ -11,6 +11,8 @@ arm64_PURGATORY_EXTRA_CFLAGS = \
 
 arm64_PURGATORY_SRCS += \
 	purgatory/arch/arm64/entry.S \
+	purgatory/arch/arm64/cache-asm.S \
+	purgatory/arch/arm64/cache.c \
 	purgatory/arch/arm64/purgatory-arm64.c
 
 dist += \
diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
new file mode 100644
index 000000000000..bef97ef48888
--- /dev/null
+++ b/purgatory/arch/arm64/cache-asm.S
@@ -0,0 +1,186 @@
+/*
+ * Some of the routines have been copied from Linux Kernel, therefore
+ * copying the license as well.
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include "cache.h"
+
+/*
+ * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ *	inval_cache_range(start, end)
+ *	- x0 - start	- start address of region
+ *	- x1 - end	- end address of region
+ */
+.globl inval_cache_range
+inval_cache_range:
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	tst	x1, x3				// end cache line aligned?
+	bic	x1, x1, x3
+	b.eq	1f
+	dc	civac, x1			// clean & invalidate D / U line
+1:	tst	x0, x3				// start cache line aligned?
+	bic	x0, x0, x3
+	b.eq	2f
+	dc	civac, x0			// clean & invalidate D / U line
+	b	3f
+2:	dc	ivac, x0			// invalidate D / U line
+3:	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	2b
+	dsb	sy
+	ret
+/*
+ *	flush_dcache_range(start, end)
+ *	- x0 - start	- start address of region
+ *	- x1 - end	- end address of region
+ *
+ */
+.globl flush_dcache_range
+flush_dcache_range:
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+1:	dc	civac, x0			// clean & invalidate D line / unified line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+
+/*
+ *	invalidate_tlbs_el1()
+ */
+.globl invalidate_tlbs_el1
+invalidate_tlbs_el1:
+	dsb	nshst
+	tlbi	vmalle1
+	dsb	nsh
+	isb
+	ret
+
+/*
+ *	invalidate_tlbs_el2()
+ */
+.globl invalidate_tlbs_el2
+invalidate_tlbs_el2:
+	dsb	nshst
+	tlbi	alle2
+	dsb	nsh
+	isb
+	ret
+
+/*
+ * 	get_mm_feature_reg0_val - Get information about supported MM
+ * 	features
+ */
+.globl get_mm_feature_reg0_val
+get_mm_feature_reg0_val:
+	mrs	x0, ID_AA64MMFR0_EL1
+	ret
+
+/*
+ * 	get_current_el - Get information about current exception level
+ */
+.globl get_current_el
+get_current_el:
+	mrs 	x0, CurrentEL
+	lsr	x0, x0, #2
+	ret
+
+/*
+ * 	invalidate_icache - Invalidate I-cache
+ */
+.globl invalidate_icache
+invalidate_icache:
+	ic	iallu
+	dsb	nsh
+	isb
+	ret
+
+/*
+ * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
+ * 	x0 - page_table - Page Table Base
+ * 	x1 - tcr_flags - TCR Flags to be set
+ */
+.globl set_mair_tcr_ttbr_sctlr_el1
+set_mair_tcr_ttbr_sctlr_el1:
+	ldr	x2, =MEMORY_ATTRIBUTES
+	msr	mair_el1, x2
+	msr	tcr_el1, x1
+	msr	ttbr0_el1, x0
+	isb
+	mrs	x0, sctlr_el1
+	ldr	x3, =SCTLR_ELx_FLAGS
+	orr	x0, x0, x3
+	msr	sctlr_el1, x0
+	isb
+	ret
+
+/*
+ * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
+ * 	x0 - page_table - Page Table Base
+ * 	x1 - tcr_flags - TCR Flags to be set
+ */
+.globl set_mair_tcr_ttbr_sctlr_el2
+set_mair_tcr_ttbr_sctlr_el2:
+	ldr	x2, =MEMORY_ATTRIBUTES
+	msr	mair_el2, x2
+	msr	tcr_el2, x1
+	msr	ttbr0_el2, x0
+	isb
+	mrs	x0, sctlr_el2
+	ldr	x3, =SCTLR_ELx_FLAGS
+	orr	x0, x0, x3
+	msr	sctlr_el2, x0
+	isb
+	ret
+
+/*
+ * reset_sctlr_el1 - disables cache and mmu
+ */
+.globl reset_sctlr_el1
+reset_sctlr_el1:
+	mrs	x0, sctlr_el1
+	bic	x0, x0, #SCTLR_ELx_C
+	bic	x0, x0, #SCTLR_ELx_M
+	msr	sctlr_el1, x0
+	isb
+	ret
+
+/*
+ * reset_sctlr_el2 - disables cache and mmu
+ */
+.globl reset_sctlr_el2
+reset_sctlr_el2:
+	mrs	x0, sctlr_el2
+	bic	x0, x0, #SCTLR_ELx_C
+	bic	x0, x0, #SCTLR_ELx_M
+	msr	sctlr_el2, x0
+	isb
+	ret
diff --git a/purgatory/arch/arm64/cache.c b/purgatory/arch/arm64/cache.c
new file mode 100644
index 000000000000..3c7e058ccf11
--- /dev/null
+++ b/purgatory/arch/arm64/cache.c
@@ -0,0 +1,330 @@
+/*
+ * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program. If not, see <http://www.gnu.org/licenses/>.
+ */
+
+/* We are supporting only 4K and 64K page sizes. This code will not work if
+ * a hardware is not supporting at least one of these page sizes.
+ * Therefore, D-cache is disabled by default and enabled only when
+ * "enable-dcache" is passed to the kexec().
+ * Since this is an identity mapped system, so VA_BITS will be same as max
+ * PA bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only
+ * one level of page table will be there with block descriptor entries.
+ * Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will
+ * have only table entries pointing to a level 1 lookup. Level 1 will have
+ * only block entries which will map 1GB block. For 64K mapping, TTBR points
+ * to level 1 lookups, which will have only table entries pointing to a
+ * level 2 lookup. Level 2 will have only block entries which will map
+ * 512MB block. If UART base address and RAM addresses are not at least 1GB
+ * and 512MB apart for 4K and 64K respectively, then mapping result could
+ * be unpredictable. In that case we need to support one more level of
+ * granularity, but until someone needs that keep it like this only.
+ * We can not allocate dynamic memory in purgatory. Therefore we keep page
+ * table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points
+ * to first level (having only table entries) and (page_table +
+ * MAX_PAGE_SIZE) points to table at next level (having block entries). If
+ * index for RAM area and UART area in first table is not same, then we
+ * will need another next level table which will be located at (page_table
+ * + 2 * MAX_PAGE_SIZE).
+ */
+
+#include <stdint.h>
+#include <string.h>
+#include <purgatory.h>
+#include "cache.h"
+
+static uint64_t page_shift;
+static uint64_t pgtable_level;
+static uint64_t va_bits;
+
+static uint64_t page_table[PAGE_TABLE_SIZE / sizeof(uint64_t)] __attribute__ ((aligned (MAX_PAGE_SIZE))) = { };
+static uint64_t page_table_used;
+
+#define PAGE_SIZE	(1 << page_shift)
+/*
+ *	is_4k_page_supported - return true if 4k page is supported else
+ *	false
+ */
+static int is_4k_page_supported(void)
+{
+	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN4_MASK) ==
+			ID_AA64MMFR0_TGRAN4_SUPPORTED);
+}
+
+/*
+ *	is_64k_page_supported - return true if 64k page is supported else
+ *	false
+ */
+static int is_64k_page_supported(void)
+{
+	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN64_MASK) ==
+			ID_AA64MMFR0_TGRAN64_SUPPORTED);
+}
+
+/*
+ *	get_ips_bits - return supported IPS bits
+ */
+static uint64_t get_ips_bits(void)
+{
+	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
+			ID_AA64MMFR0_PARANGE_SHIFT);
+}
+
+/*
+ *	get_va_bits - return supported VA bits (For identity mapping VA = PA)
+ */
+static uint64_t get_va_bits(void)
+{
+	uint64_t ips = get_ips_bits();
+
+	switch(ips) {
+	case ID_AA64MMFR0_PARANGE_48:
+		return 48;
+	case ID_AA64MMFR0_PARANGE_44:
+		return 44;
+	case ID_AA64MMFR0_PARANGE_42:
+		return 42;
+	case ID_AA64MMFR0_PARANGE_40:
+		return 40;
+	case ID_AA64MMFR0_PARANGE_36:
+		return 36;
+	default:
+		return 32;
+	}
+}
+
+/*
+ *	get_section_shift - get block shift for supported page size
+ */
+static uint64_t get_section_shift(void)
+{
+	if (page_shift == 16)
+		return 29;
+	else if(page_shift == 12)
+		return 30;
+	else
+		return 0;
+}
+
+/*
+ *	get_section_mask - get section mask for supported page size
+ */
+static uint64_t get_section_mask(void)
+{
+	if (page_shift == 16)
+		return 0x1FFF;
+	else if(page_shift == 12)
+		return 0x1FF;
+	else
+		return 0;
+}
+
+/*
+ *	get_pgdir_shift - get pgdir shift for supported page size
+ */
+static uint64_t get_pgdir_shift(void)
+{
+	if (page_shift == 16)
+		return 42;
+	else if(page_shift == 12)
+		return 39;
+	else
+		return 0;
+}
+
+/*
+ *	init_page_table - Initializes page table locations
+ */
+
+static void init_page_table(void)
+{
+	/*
+	 * Invalidate the page tables to avoid potential dirty cache lines
+	 * being evicted.
+	 */
+
+	inval_cache_range((uint64_t)page_table,
+			(uint64_t)page_table + PAGE_TABLE_SIZE);
+	memset(page_table, 0, PAGE_TABLE_SIZE);
+}
+/*
+ *	create_identity_mapping(start, end, flags)
+ *	start		- start address
+ *	end		- end address
+ *	flags 		- MMU Flags for Normal or Device type memory
+ */
+static void create_identity_mapping(uint64_t start, uint64_t end,
+					uint64_t flags)
+{
+	uint32_t sec_shift, pgdir_shift, sec_mask;
+	uint64_t desc, s1, e1, s2, e2;
+	uint64_t *table2;
+
+	s1 = start;
+	e1 = end - 1;
+
+	sec_shift = get_section_shift();
+	if (pgtable_level == 1) {
+		s1 >>= sec_shift;
+		e1 >>= sec_shift;
+		do {
+			desc = s1 << sec_shift;
+			desc |= flags;
+			page_table[s1] = desc;
+			s1++;
+		} while (s1 <= e1);
+	} else {
+		pgdir_shift = get_pgdir_shift();
+		sec_mask = get_section_mask();
+		s1 >>= pgdir_shift;
+		e1 >>= pgdir_shift;
+		do {
+			/*
+			 * If there is no table entry then write a new
+			 * entry else, use old entry
+			 */
+			if (!page_table[s1]) {
+				table2 = &page_table[(++page_table_used *
+						MAX_PAGE_SIZE) /
+						sizeof(uint64_t)];
+				desc = (uint64_t)table2 | PMD_TYPE_TABLE;
+				page_table[s1] = desc;
+			} else {
+				table2 = (uint64_t *)(page_table[s1] &
+						~PMD_TYPE_MASK);
+			}
+			s1++;
+			s2 = start >> sec_shift;
+			s2 &= sec_mask;
+			e2 = (end - 1) >> sec_shift;
+			e2 &= sec_mask;
+			do {
+				desc = s2 << sec_shift;
+				desc |= flags;
+				table2[s2] = desc;
+				s2++;
+			} while (s2 <= e2);
+		} while (s1 <= e1);
+	}
+}
+
+/*
+ *	enable_mmu_dcache: Enable mmu and D-cache in sctlr_el1
+ */
+static void enable_mmu_dcache(void)
+{
+	uint64_t tcr_flags = TCR_FLAGS | TCR_T0SZ(va_bits);
+
+	switch(page_shift) {
+	case 16:
+		tcr_flags |= TCR_TG0_64K;
+		break;
+	case 12:
+		tcr_flags |= TCR_TG0_4K;
+		break;
+	default:
+		printf("page shift not supported\n");
+		return;
+	}
+	/*
+	 * Since the page tables have been populated with non-cacheable
+	 * accesses (MMU disabled), invalidate the page tables to remove
+	 * any speculatively loaded cache lines.
+	 */
+	inval_cache_range((uint64_t)page_table,
+				(uint64_t)page_table + PAGE_TABLE_SIZE);
+
+	switch(get_current_el()) {
+	case 2:
+		invalidate_tlbs_el2();
+		tcr_flags |= (get_ips_bits() << TCR_PS_EL2_SHIFT);
+		set_mair_tcr_ttbr_sctlr_el2((uint64_t)page_table, tcr_flags);
+		break;
+	case 1:
+		invalidate_tlbs_el1();
+		tcr_flags |= (get_ips_bits() << TCR_IPS_EL1_SHIFT);
+		set_mair_tcr_ttbr_sctlr_el1((uint64_t)page_table, tcr_flags);
+		break;
+	default:
+		return;
+	}
+	invalidate_icache();
+}
+
+/*
+ *	enable_dcache: Enable D-cache and set appropriate attributes
+ *	ram_start - Start address of RAM
+ *	ram_end - End address of RAM
+ *	uart_base - Base address of uart
+ */
+int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base)
+{
+	va_bits = get_va_bits();
+
+	page_table_used = 0;
+	if (is_64k_page_supported()) {
+		page_shift = 16;
+		if (va_bits <= 42)
+			pgtable_level = 1;
+		else
+			pgtable_level = 2;
+	} else if (is_4k_page_supported()) {
+		page_shift = 12;
+		if (va_bits <= 39)
+			pgtable_level = 1;
+		else
+			pgtable_level = 2;
+	} else {
+		printf("Valid Page Granule not supported by hardware\n");
+		return -1;
+	}
+	init_page_table();
+	create_identity_mapping(ram_start, ram_end, MM_MMUFLAGS_NORMAL);
+	printf("Normal identity mapping created from %lx to %lx\n",
+			ram_start, ram_end);
+	if (uart_base) {
+		create_identity_mapping((uint64_t)uart_base,
+					(uint64_t)uart_base + PAGE_SIZE,
+					MM_MMUFLAGS_DEVICE);
+		printf("Device identity mapping created from %lx to %lx\n",
+				(uint64_t)uart_base,
+				(uint64_t)uart_base + PAGE_SIZE);
+	}
+	enable_mmu_dcache();
+	printf("Cache Enabled\n");
+
+	return 0;
+}
+
+/*
+ *	disable_dcache: Disable D-cache and flush RAM locations
+ *	ram_start - Start address of RAM
+ *	ram_end - End address of RAM
+ */
+void disable_dcache(uint64_t ram_start, uint64_t ram_end)
+{
+	switch(get_current_el()) {
+	case 2:
+		reset_sctlr_el2();
+		break;
+	case 1:
+		reset_sctlr_el1();
+		break;
+	default:
+		return;
+	}
+	invalidate_icache();
+	flush_dcache_range(ram_start, ram_end);
+	printf("Cache Disabled\n");
+}
diff --git a/purgatory/arch/arm64/cache.h b/purgatory/arch/arm64/cache.h
new file mode 100644
index 000000000000..c988020566e3
--- /dev/null
+++ b/purgatory/arch/arm64/cache.h
@@ -0,0 +1,79 @@
+#ifndef	__CACHE_H__
+#define __CACHE_H__
+
+#define MT_DEVICE_NGNRNE	0
+#define MT_DEVICE_NGNRE		1
+#define MT_DEVICE_GRE		2
+#define MT_NORMAL_NC		3
+#define MT_NORMAL		4
+
+#ifndef __ASSEMBLER__
+
+#define MAX_PAGE_SIZE		0x10000
+#define PAGE_TABLE_SIZE		(3 * MAX_PAGE_SIZE)
+#define ID_AA64MMFR0_TGRAN64_SHIFT	24
+#define ID_AA64MMFR0_TGRAN4_SHIFT	28
+#define ID_AA64MMFR0_TGRAN64_MASK	(0xFUL << ID_AA64MMFR0_TGRAN64_SHIFT)
+#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
+#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
+#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
+#define ID_AA64MMFR0_PARANGE_SHIFT	0
+#define ID_AA64MMFR0_PARANGE_MASK	(0xFUL << ID_AA64MMFR0_PARANGE_SHIFT)
+#define ID_AA64MMFR0_PARANGE_48		0x5
+#define ID_AA64MMFR0_PARANGE_44		0x4
+#define ID_AA64MMFR0_PARANGE_42		0x3
+#define ID_AA64MMFR0_PARANGE_40		0x2
+#define ID_AA64MMFR0_PARANGE_36		0x1
+#define ID_AA64MMFR0_PARANGE_32		0x0
+
+#define TCR_TG0_64K 		(1UL << 14)
+#define TCR_TG0_4K 		(0UL << 14)
+#define TCR_SHARED_NONE		(0UL << 12)
+#define TCR_ORGN_WBWA		(1UL << 10)
+#define TCR_IRGN_WBWA		(1UL << 8)
+#define TCR_IPS_EL1_SHIFT	32
+#define TCR_PS_EL2_SHIFT	16
+#define TCR_T0SZ(x)		((unsigned long)(64 - (x)) << 0)
+#define TCR_FLAGS (TCR_SHARED_NONE | TCR_ORGN_WBWA | TCR_IRGN_WBWA)
+
+#define PMD_TYPE_SECT		(1UL << 0)
+#define PMD_TYPE_TABLE		(3UL << 0)
+#define PMD_TYPE_MASK		0x3
+#define PMD_SECT_AF		(1UL << 10)
+#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
+#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
+#define PMD_SECT_PXN		(1UL << 53)
+#define PMD_SECT_UXN		(1UL << 54)
+#define PMD_FLAGS_DEVICE	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_PXN | PMD_SECT_UXN)
+#define MM_MMUFLAGS_NORMAL	PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL
+#define MM_MMUFLAGS_DEVICE	PMD_ATTRINDX(MT_DEVICE_NGNRE) | PMD_FLAGS_DEVICE
+
+void disable_dcache(uint64_t ram_start, uint64_t ram_end);
+int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base);
+uint64_t get_mm_feature_reg0_val(void);
+void inval_cache_range(uint64_t start, uint64_t end);
+void flush_dcache_range(uint64_t start, uint64_t end);
+uint64_t get_current_el(void);
+void set_mair_tcr_ttbr_sctlr_el1(uint64_t page_table, uint64_t tcr_flags);
+void set_mair_tcr_ttbr_sctlr_el2(uint64_t page_table, uint64_t tcr_flags);
+void invalidate_tlbs_el1(void);
+void invalidate_tlbs_el2(void);
+void invalidate_icache(void);
+void reset_sctlr_el1(void);
+void reset_sctlr_el2(void);
+#else
+#define MEMORY_ATTRIBUTES	((0x00 << (MT_DEVICE_NGNRNE*8)) | \
+				(0x04 << (MT_DEVICE_NGNRE*8)) | \
+				(0x0C << (MT_DEVICE_GRE*8)) | \
+				(0x44 << (MT_NORMAL_NC*8)) | \
+				(0xFF << (MT_NORMAL*8)))
+
+/* Common SCTLR_ELx flags. */
+#define SCTLR_ELx_I		(1 << 12)
+#define SCTLR_ELx_C		(1 << 2)
+#define SCTLR_ELx_M		(1 << 0)
+
+#define SCTLR_ELx_FLAGS (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
+
+#endif
+#endif
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread
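
As a side note on the dcache_line_size macro in cache-asm.S above, a rough C
equivalent (illustration only; it assumes GCC-style inline assembly, which
the purgatory code itself does not use here) would be:

    #include <stdint.h>

    /* CTR_EL0.DminLine (bits [19:16]) is log2 of the smallest D-cache
     * line size, counted in 4-byte words. */
    static inline uint64_t dcache_line_size(void)
    {
            uint64_t ctr;

            asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
            return 4UL << ((ctr >> 16) & 0xf);  /* bytes per line */
    }

inval_cache_range() and flush_dcache_range() walk the given range in steps
of this size.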

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-22  4:32 ` Pratyush Anand
@ 2016-11-22  4:32   ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-22  4:32 UTC (permalink / raw)
  To: kexec, geoff; +Cc: Pratyush Anand, james.morse, linux-arm-kernel

When "enable-dcache" is passed to the kexec() command line, kexec-tools
passes this information to purgatory, which in turn enables cache during
sha-256 verification.

RAM boundary which includes all the sections is needed for creating
identity page mapping and to enable d-cache for those areas. Therefore
these informations are passed to purgatory as well.
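
For illustration, loading a kernel with the new option could look like this
(the paths and kernel command line are placeholders, not taken from this
patch):

    kexec -l /boot/vmlinuz --initrd=/boot/initramfs.img \
          --append="console=ttyAMA0 root=/dev/vda1" --enable-dcache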

Signed-off-by: Pratyush Anand <panand@redhat.com>
---
 kexec/arch/arm64/include/arch/options.h |  6 +++++-
 kexec/arch/arm64/include/types.h        | 16 ++++++++++++++++
 kexec/arch/arm64/kexec-arm64.c          | 25 ++++++++++++++++++++++++-
 purgatory/arch/arm64/purgatory-arm64.c  | 11 +++++++++++
 4 files changed, 56 insertions(+), 2 deletions(-)
 create mode 100644 kexec/arch/arm64/include/types.h

diff --git a/kexec/arch/arm64/include/arch/options.h b/kexec/arch/arm64/include/arch/options.h
index a17d933e396b..3e76ff04d6c1 100644
--- a/kexec/arch/arm64/include/arch/options.h
+++ b/kexec/arch/arm64/include/arch/options.h
@@ -5,13 +5,15 @@
 #define OPT_DTB			((OPT_MAX)+1)
 #define OPT_INITRD		((OPT_MAX)+2)
 #define OPT_REUSE_CMDLINE	((OPT_MAX)+3)
-#define OPT_ARCH_MAX		((OPT_MAX)+4)
+#define OPT_ENABLE_DCACHE	((OPT_MAX)+4)
+#define OPT_ARCH_MAX		((OPT_MAX)+5)
 
 #define KEXEC_ARCH_OPTIONS \
 	KEXEC_OPTIONS \
 	{ "append",        1, NULL, OPT_APPEND }, \
 	{ "command-line",  1, NULL, OPT_APPEND }, \
 	{ "dtb",           1, NULL, OPT_DTB }, \
+	{ "enable-dcache", 0, NULL, OPT_ENABLE_DCACHE }, \
 	{ "initrd",        1, NULL, OPT_INITRD }, \
 	{ "ramdisk",       1, NULL, OPT_INITRD }, \
 	{ "reuse-cmdline", 0, NULL, OPT_REUSE_CMDLINE }, \
@@ -24,6 +26,7 @@ static const char arm64_opts_usage[] __attribute__ ((unused)) =
 "     --append=STRING       Set the kernel command line to STRING.\n"
 "     --command-line=STRING Set the kernel command line to STRING.\n"
 "     --dtb=FILE            Use FILE as the device tree blob.\n"
+"     --enable-dcache       Enable D-Cache in Purgatory for faster SHA verification.\n"
 "     --initrd=FILE         Use FILE as the kernel initial ramdisk.\n"
 "     --ramdisk=FILE        Use FILE as the kernel initial ramdisk.\n"
 "     --reuse-cmdline       Use kernel command line from running system.\n";
@@ -32,6 +35,7 @@ struct arm64_opts {
 	const char *command_line;
 	const char *dtb;
 	const char *initrd;
+	uint8_t enable_dcache;
 };
 
 extern struct arm64_opts arm64_opts;
diff --git a/kexec/arch/arm64/include/types.h b/kexec/arch/arm64/include/types.h
new file mode 100644
index 000000000000..08f833a6d585
--- /dev/null
+++ b/kexec/arch/arm64/include/types.h
@@ -0,0 +1,16 @@
+#ifndef _TYPES_H_
+#define _TYPES_H_
+
+#define min(x,y) ({ \
+	typeof(x) _x = (x);	\
+	typeof(y) _y = (y);	\
+	(void) (&_x == &_y);	\
+	_x < _y ? _x : _y; })
+
+#define max(x,y) ({ \
+	typeof(x) _x = (x);	\
+	typeof(y) _y = (y);	\
+	(void) (&_x == &_y);	\
+	_x > _y ? _x : _y; })
+
+#endif /* _TYPES_H_ */
diff --git a/kexec/arch/arm64/kexec-arm64.c b/kexec/arch/arm64/kexec-arm64.c
index 288548f49304..b54d1b5304f6 100644
--- a/kexec/arch/arm64/kexec-arm64.c
+++ b/kexec/arch/arm64/kexec-arm64.c
@@ -23,6 +23,7 @@
 #include "fs2dt.h"
 #include "kexec-syscall.h"
 #include "arch/options.h"
+#include "types.h"
 
 /* Global varables the core kexec routines expect. */
 
@@ -130,6 +131,9 @@ int arch_process_options(int argc, char **argv)
 		case OPT_PANIC:
 			die("load-panic (-p) not supported");
 			break;
+		case OPT_ENABLE_DCACHE:
+			arm64_opts.enable_dcache = 1;
+			break;
 		default:
 			break; /* Ignore core and unknown options. */
 		}
@@ -323,10 +327,13 @@ unsigned long arm64_locate_kernel_segment(struct kexec_info *info)
 int arm64_load_other_segments(struct kexec_info *info,
 	unsigned long image_base)
 {
-	int result;
+	int result, i;
 	unsigned long dtb_base;
 	unsigned long hole_min;
 	unsigned long hole_max;
+	unsigned long arm64_ram_start = -1;
+	unsigned long arm64_ram_end = 0;
+	uint8_t purgatory_enable_dcache;
 	char *initrd_buf = NULL;
 	struct dtb dtb;
 	char command_line[COMMAND_LINE_SIZE] = "";
@@ -337,6 +344,8 @@ int arm64_load_other_segments(struct kexec_info *info,
 		command_line[sizeof(command_line) - 1] = 0;
 	}
 
+	purgatory_enable_dcache = arm64_opts.enable_dcache;
+
 	if (arm64_opts.dtb) {
 		dtb.name = "dtb_user";
 		dtb.buf = slurp_file(arm64_opts.dtb, &dtb.size);
@@ -419,8 +428,22 @@ int arm64_load_other_segments(struct kexec_info *info,
 	elf_rel_set_symbol(&info->rhdr, "arm64_kernel_entry", &image_base,
 		sizeof(image_base));
 
+	elf_rel_set_symbol(&info->rhdr, "arm64_enable_dcache",
+		&purgatory_enable_dcache, sizeof(purgatory_enable_dcache));
+
 	elf_rel_set_symbol(&info->rhdr, "arm64_dtb_addr", &dtb_base,
 		sizeof(dtb_base));
+	for (i = 0; i < info->nr_segments; i++) {
+		arm64_ram_start = min(arm64_ram_start,
+				(unsigned long)info->segment[i].mem);
+		arm64_ram_end = max(arm64_ram_end,
+				((unsigned long)info->segment[i].mem +
+				 info->segment[i].memsz));
+	}
+	elf_rel_set_symbol(&info->rhdr, "arm64_ram_start",
+			&arm64_ram_start, sizeof(arm64_ram_start));
+	elf_rel_set_symbol(&info->rhdr, "arm64_ram_end",
+			&arm64_ram_end, sizeof(arm64_ram_end));
 
 	return 0;
 }
diff --git a/purgatory/arch/arm64/purgatory-arm64.c b/purgatory/arch/arm64/purgatory-arm64.c
index fe50fcf8ebc3..6d61dcbce9ac 100644
--- a/purgatory/arch/arm64/purgatory-arm64.c
+++ b/purgatory/arch/arm64/purgatory-arm64.c
@@ -4,6 +4,13 @@
 
 #include <stdint.h>
 #include <purgatory.h>
+#include "cache.h"
+
+/* Symbols set by kexec. */
+
+uint8_t arm64_enable_dcache __attribute__ ((section ("data")));
+uint64_t arm64_ram_start __attribute__ ((section ("data")));
+uint64_t arm64_ram_end __attribute__ ((section ("data")));
 
 void putchar(int ch)
 {
@@ -12,8 +19,12 @@ void putchar(int ch)
 
 void post_verification_setup_arch(void)
 {
+	if (arm64_enable_dcache)
+		disable_dcache(arm64_ram_start, arm64_ram_end);
 }
 
 void setup_arch(void)
 {
+	if (arm64_enable_dcache)
+		enable_dcache(arm64_ram_start, arm64_ram_end, 0);
 }
-- 
2.7.4

^ permalink raw reply related	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
  2016-11-22  4:32 ` Pratyush Anand
@ 2016-11-22 18:56   ` Geoff Levand
  -1 siblings, 0 replies; 48+ messages in thread
From: Geoff Levand @ 2016-11-22 18:56 UTC (permalink / raw)
  To: Pratyush Anand, kexec; +Cc: james.morse, linux-arm-kernel

Hi Pratyush,

On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> It takes more than 2 minutes to verify the SHA in purgatory when the vmlinuz
> image is around 13MB and the initramfs is around 30MB. It still takes more
> than 20 seconds even with -O2 optimization enabled. However, if the dcache is
> enabled during purgatory execution, SHA verification takes just a second.

As I had mentioned in another thread, I think -O2 optimization is
sufficient considering the complexity of the code needed to enable
the dcache.  Integrity checking is only needed for crash dump
support.  If the crash reboot takes an extra 20 seconds, does it
matter?

For the reboot of a stable system, where the new kernel is loaded and
then immediately kexec'ed into, integrity checking is not needed.
For that arm64 support needs to be added to kexec-lite.

-Geoff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
@ 2016-11-22 18:56   ` Geoff Levand
  0 siblings, 0 replies; 48+ messages in thread
From: Geoff Levand @ 2016-11-22 18:56 UTC (permalink / raw)
  To: Pratyush Anand, kexec; +Cc: james.morse, linux-arm-kernel

Hi Pratyush,

On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
> is around 13MB and initramfs is around 30MB. It takes more than 20 second
> even when we have -O2 optimization enabled. However, if dcache is enabled
> during purgatory execution then, it takes just a second in SHA verification.

As I mentioned in another thread, I think -O2 optimization is
sufficient considering the complexity of the code needed to enable
the dcache.  Integrity checking is only needed for crash dump
support.  If the crash reboot takes an extra 20 seconds, does it
matter?

For the reboot of a stable system, where the new kernel is loaded and
then immediately kexec'ed into, integrity checking is not needed.  For
that case, arm64 support needs to be added to kexec-lite.

-Geoff

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-22  4:32   ` Pratyush Anand
@ 2016-11-22 18:57     ` Geoff Levand
  -1 siblings, 0 replies; 48+ messages in thread
From: Geoff Levand @ 2016-11-22 18:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> When "enable-dcache" is passed to the kexec() command line, kexec-tools
> passes this information to purgatory, which in turn enables cache during
> sha-256 verification.

What's the point of this enable-dcache option?  Why not just
always enable the cache if we can?

-Geoff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-22 18:57     ` Geoff Levand
  0 siblings, 0 replies; 48+ messages in thread
From: Geoff Levand @ 2016-11-22 18:57 UTC (permalink / raw)
  To: Pratyush Anand, kexec; +Cc: james.morse, linux-arm-kernel

Hi Pratyush,

On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> When "enable-dcache" is passed to the kexec() command line, kexec-tools
> passes this information to purgatory, which in turn enables cache during
> sha-256 verification.

What's the point of this enable-dcache option?  Why not just
always enable the cache if we can?

-Geoff


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
  2016-11-22 18:56   ` Geoff Levand
@ 2016-11-23  1:39     ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  1:39 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On Wednesday 23 November 2016 12:26 AM, Geoff Levand wrote:
> Hi Pratyush,
>
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz
>> image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA
>> verification.
>
> As I had mentioned in another thread, I think -O2 optimization is
> sufficient considering the complexity of the code needed to enable
> the dcache.  Integrity checking is only needed for crash dump
> support.  If the crash reboot takes an extra 20 seconds does it
> matter?
>

Even these 20 seconds are annoying in the kdump case, and the delay 
will only grow as the initramfs gets bigger.

Moreover, the d-cache is still disabled by default, so even if the code 
is complex and might show some instability, it will not affect existing 
systems. So, IMHO we should have this upstream.


> For the re-boot of a stable system where the new kernel is loaded
> then immediately kexec'ed into integrity checking is not needed.
> For that arm64 support needs to be added to kexec-lite.

I think kexec-lite is a separate project, and --lite in kexec-tools was 
never acked. So this will be helpful even for `kexec -l`.

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
@ 2016-11-23  1:39     ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  1:39 UTC (permalink / raw)
  To: Geoff Levand, kexec; +Cc: james.morse, linux-arm-kernel

Hi Geoff,

On Wednesday 23 November 2016 12:26 AM, Geoff Levand wrote:
> Hi Pratyush,
>
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz
>> image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA
>> verification.
>
> As I had mentioned in another thread, I think -O2 optimization is
> sufficient considering the complexity of the code needed to enable
> the dcache.  Integrity checking is only needed for crash dump
> support.  If the crash reboot takes an extra 20 seconds does it
> matter?
>

Even these 20 seconds are annoying in the kdump case, and the delay 
will only grow as the initramfs gets bigger.

Moreover, the d-cache is still disabled by default, so even if the code 
is complex and might show some instability, it will not affect existing 
systems. So, IMHO we should have this upstream.


> For the re-boot of a stable system where the new kernel is loaded
> then immediately kexec'ed into integrity checking is not needed.
> For that arm64 support needs to be added to kexec-lite.

I think kexec-lite is a separate project, and --lite in kexec-tools was 
never acked. So this will be helpful even for `kexec -l`.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-22 18:57     ` Geoff Levand
@ 2016-11-23  1:46       ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  1:46 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Geoff,

On Wednesday 23 November 2016 12:27 AM, Geoff Levand wrote:
> Hi Pratyush,
>
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> When "enable-dcache" is passed to the kexec() command line, kexec-tools
>> passes this information to purgatory, which in turn enables cache during
>> sha-256 verification.
>
> What's the point of this enable-dcache option?  Why not just
> always enable the cache if we can?

As I have written in the changelog of patch 1/2:

"We are supporting only 4K and 64K page sizes. This code will not work if a
hardware is not supporting at least one of these page sizes.  Therefore,
D-cache is disabled by default and enabled only when "enable-dcache" is
passed to the kexec()."


Although it is very unlikely that hardware will support only the 16K 
page size, it is possible. Therefore, it's better to keep it disabled 
by default.

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-23  1:46       ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  1:46 UTC (permalink / raw)
  To: Geoff Levand, kexec; +Cc: james.morse, linux-arm-kernel

Hi Geoff,

On Wednesday 23 November 2016 12:27 AM, Geoff Levand wrote:
> Hi Pratyush,
>
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> When "enable-dcache" is passed to the kexec() command line, kexec-tools
>> passes this information to purgatory, which in turn enables cache during
>> sha-256 verification.
>
> What's the point of this enable-dcache option?  Why not just
> always enable the cache if we can?

As I have written in the changelog of patch 1/2:

"We are supporting only 4K and 64K page sizes. This code will not work if a
hardware is not supporting at least one of these page sizes.  Therefore,
D-cache is disabled by default and enabled only when "enable-dcache" is
passed to the kexec()."


Although it is very unlikely that hardware will support only the 16K 
page size, it is possible. Therefore, it's better to keep it disabled 
by default.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-23  1:46       ` Pratyush Anand
@ 2016-11-23  2:03         ` Dave Young
  -1 siblings, 0 replies; 48+ messages in thread
From: Dave Young @ 2016-11-23  2:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 11/23/16 at 07:16am, Pratyush Anand wrote:
> Hi Geoff,
> 
> On Wednesday 23 November 2016 12:27 AM, Geoff Levand wrote:
> > Hi Pratyush,
> > 
> > On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> > > When "enable-dcache" is passed to the kexec() command line, kexec-tools
> > > passes this information to purgatory, which in turn enables cache during
> > > sha-256 verification.
> > 
> > What's the point of this enable-dcache option?  Why not just
> > always enable the cache if we can?
> 
> As I have written in changelog of patch 1/2
> 
> "We are supporting only 4K and 64K page sizes. This code will not work if a
> hardware is not supporting at least one of these page sizes.  Therefore,
> D-cache is disabled by default and enabled only when "enable-dcache" is
> passed to the kexec()."
> 
> 
> Although this is very unlikely that a hardware will support only 16K page
> sizes, however it is possible. Therefore, its better to keep it disabled by
> default.

If it is *unlikely*, it would be better to make it the default and add
a --disable-dcache option instead.

> 
> ~Pratyush
> 
> _______________________________________________
> kexec mailing list
> kexec at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-23  2:03         ` Dave Young
  0 siblings, 0 replies; 48+ messages in thread
From: Dave Young @ 2016-11-23  2:03 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: Geoff Levand, james.morse, kexec, linux-arm-kernel

On 11/23/16 at 07:16am, Pratyush Anand wrote:
> Hi Geoff,
> 
> On Wednesday 23 November 2016 12:27 AM, Geoff Levand wrote:
> > Hi Pratyush,
> > 
> > On 11/21/2016 08:32 PM, Pratyush Anand wrote:
> > > When "enable-dcache" is passed to the kexec() command line, kexec-tools
> > > passes this information to purgatory, which in turn enables cache during
> > > sha-256 verification.
> > 
> > What's the point of this enable-dcache option?  Why not just
> > always enable the cache if we can?
> 
> As I have written in changelog of patch 1/2
> 
> "We are supporting only 4K and 64K page sizes. This code will not work if a
> hardware is not supporting at least one of these page sizes.  Therefore,
> D-cache is disabled by default and enabled only when "enable-dcache" is
> passed to the kexec()."
> 
> 
> Although this is very unlikely that a hardware will support only 16K page
> sizes, however it is possible. Therefore, its better to keep it disabled by
> default.

If it is *unlikely*, it would be better to make it the default and add
a --disable-dcache option instead.

> 
> ~Pratyush
> 
> _______________________________________________
> kexec mailing list
> kexec@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/kexec

Thanks
Dave

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-23  2:03         ` Dave Young
@ 2016-11-23  2:11           ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  2:11 UTC (permalink / raw)
  To: linux-arm-kernel



On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
>> Although this is very unlikely that a hardware will support only 16K page
>> > sizes, however it is possible. Therefore, its better to keep it disabled by
>> > default.
> If it is *unlikely* it could be better to make it as default and add a
> --disable-dcache instead.
>

I think I can do that.

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-23  2:11           ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  2:11 UTC (permalink / raw)
  To: Dave Young; +Cc: Geoff Levand, james.morse, kexec, linux-arm-kernel



On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
>> Although this is very unlikely that a hardware will support only 16K page
>> > sizes, however it is possible. Therefore, its better to keep it disabled by
>> > default.
> If it is *unlikely* it could be better to make it as default and add a
> --disable-dcache instead.
>

I think I can do that.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-23  2:11           ` Pratyush Anand
@ 2016-11-23  8:08             ` Simon Horman
  -1 siblings, 0 replies; 48+ messages in thread
From: Simon Horman @ 2016-11-23  8:08 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Nov 23, 2016 at 07:41:52AM +0530, Pratyush Anand wrote:
> 
> 
> On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
> >>Although this is very unlikely that a hardware will support only 16K page
> >>> sizes, however it is possible. Therefore, its better to keep it disabled by
> >>> default.
> >If it is *unlikely* it could be better to make it as default and add a
> >--disable-dcache instead.
> >
> 
> I think, I can do that.

Can this be detected at run-time?

It sounds like it will be painful if the default doesn't work on some
setups.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-23  8:08             ` Simon Horman
  0 siblings, 0 replies; 48+ messages in thread
From: Simon Horman @ 2016-11-23  8:08 UTC (permalink / raw)
  To: Pratyush Anand
  Cc: Geoff Levand, james.morse, Dave Young, kexec, linux-arm-kernel

On Wed, Nov 23, 2016 at 07:41:52AM +0530, Pratyush Anand wrote:
> 
> 
> On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
> >>Although this is very unlikely that a hardware will support only 16K page
> >>> sizes, however it is possible. Therefore, its better to keep it disabled by
> >>> default.
> >If it is *unlikely* it could be better to make it as default and add a
> >--disable-dcache instead.
> >
> 
> I think, I can do that.

Can this be detected at run-time?

It sounds like it will be painful if the default doesn't work on some
setups.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
  2016-11-23  8:08             ` Simon Horman
@ 2016-11-23  8:17               ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  8:17 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Simon,

Thanks for your review comment.

On Wed, Nov 23, 2016 at 1:38 PM, Simon Horman <horms@verge.net.au> wrote:
> On Wed, Nov 23, 2016 at 07:41:52AM +0530, Pratyush Anand wrote:
>>
>>
>> On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
>> >>Although this is very unlikely that a hardware will support only 16K page
>> >>> sizes, however it is possible. Therefore, its better to keep it disabled by
>> >>> default.
>> >If it is *unlikely* it could be better to make it as default and add a
>> >--disable-dcache instead.
>> >
>>
>> I think, I can do that.
>
> Can this be detected at run-time?



That's doable. OK, so if everyone agrees, I can send a V2 where neither
--enable-dcache nor --disable-dcache is used. It will enable the dcache
if the 4K or 64K page size is supported and will do nothing otherwise.
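
Something like this in purgatory's setup_arch() (a rough, untested
sketch of the idea only, reusing the helpers from this series):

	void setup_arch(void)
	{
		/*
		 * enable_dcache() probes ID_AA64MMFR0_EL1 itself and returns
		 * -1 (leaving the cache off) when neither the 4K nor the 64K
		 * granule is supported.
		 */
		enable_dcache(arm64_ram_start, arm64_ram_end, 0);
	}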

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory
@ 2016-11-23  8:17               ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-11-23  8:17 UTC (permalink / raw)
  To: Simon Horman
  Cc: Geoff Levand, James Morse, Dave Young, Kexec Mailing List,
	linux-arm-kernel

Hi Simon,

Thanks for your review comment.

On Wed, Nov 23, 2016 at 1:38 PM, Simon Horman <horms@verge.net.au> wrote:
> On Wed, Nov 23, 2016 at 07:41:52AM +0530, Pratyush Anand wrote:
>>
>>
>> On Wednesday 23 November 2016 07:33 AM, Dave Young wrote:
>> >>Although this is very unlikely that a hardware will support only 16K page
>> >>> sizes, however it is possible. Therefore, its better to keep it disabled by
>> >>> default.
>> >If it is *unlikely* it could be better to make it as default and add a
>> >--disable-dcache instead.
>> >
>>
>> I think, I can do that.
>
> Can this be detected at run-time?



That's doable. OK, so if everyone agrees, I can send a V2 where neither
--enable-dcache nor --disable-dcache is used. It will enable the dcache
if the 4K or 64K page size is supported and will do nothing otherwise.
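
Something like this in purgatory's setup_arch() (a rough, untested
sketch of the idea only, reusing the helpers from this series):

	void setup_arch(void)
	{
		/*
		 * enable_dcache() probes ID_AA64MMFR0_EL1 itself and returns
		 * -1 (leaving the cache off) when neither the 4K nor the 64K
		 * granule is supported.
		 */
		enable_dcache(arm64_ram_start, arm64_ram_end, 0);
	}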

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-11-22  4:32   ` Pratyush Anand
@ 2016-11-25 18:30     ` James Morse
  -1 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-11-25 18:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

(CC: Mark, mismatched memory attributes in paragraph 3?)

On 22/11/16 04:32, Pratyush Anand wrote:
> This patch adds support to enable/disable d-cache, which can be used for
> faster purgatory sha256 verification.

(I'm not clear why we want the sha256, but that is being discussed elsewhere on
 the thread)


> We are supporting only 4K and 64K page sizes. This code will not work if a
> hardware is not supporting at least one of these page sizes.  Therefore,
> D-cache is disabled by default and enabled only when "enable-dcache" is
> passed to the kexec().

I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
a lot simpler to only support one page size, which should be 4K as that is what
UEFI requires. (If there are CPUs that only support one size, I bet it's 4K!)

I would go as far as to generate the page tables at 'kexec -l' time, and only if
'/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
support 4K pages). This would keep the purgatory code as simple as possible.
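
A load-time check could be as simple as this (sketch only, the helper
name is made up, not existing kexec-tools code):

	#include <unistd.h>

	/* True when the first kernel booted via UEFI, which implies the
	 * CPU supports 4K pages. */
	static int booted_via_uefi(void)
	{
		return access("/sys/firmware/efi", F_OK) == 0;
	}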

I don't think the performance difference between 4K and 64K page sizes will be
measurable; is purgatory really performance-sensitive code?


> Since this is an identity mapped system, so VA_BITS will be same as max PA
> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
> level of page table will be there with block descriptor entries.
> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
> only table entries pointing to a level 1 lookup. Level 1 will have only
> block entries which will map 1GB block. For 64K mapping, TTBR points to
> level 1 lookups, which will have only table entries pointing to a level 2
> lookup. Level 2 will have only block entries which will map 512MB block. If

This is more complexity to pick a VA size. Why not always use the maximum 48-bit
VA? The cost is negligible compared to having simpler (easier to review!)
purgatory code.

By always using 1GB blocks you may be creating aliases with mismatched attributes:
* If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
  in the same 1GB of address space. This could be a reserved region with some
  other memory attributes.
* With kdump, we may have failed to park the other CPUs if they are executing
  with interrupts masked and haven't yet handled the smp_send_stop() IPI.
* One of these other CPUs could be reading/writing in this area as it doesn't
  belong to the kdump reserved area, just happens to be in the same 1GB.

I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
sure this is well into the "don't do that" territory.


It would be much better to force the memory areas to be a multiple of 2MB and
2MB aligned, which will allow you to use 2M section mappings for memory (but
not the UART). This way we only map regions we had reserved and know are memory.
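
For example (sketch only; SZ_2M and the helper name are made up, not
existing kexec-tools code):

	#include <stdint.h>

	#define SZ_2M	(2UL << 20)

	/* True when both ends of the region sit on a 2MB boundary, so a
	 * 2M section mapping cannot spill outside the reservation. */
	static int region_is_2m_aligned(uint64_t start, uint64_t end)
	{
		return !((start | end) & (SZ_2M - 1));
	}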


> UART base address and RAM addresses are not at least 1GB and 512MB apart
> for 4K and 64K respectively, then mapping result could be unpredictable. In
> that case we need to support one more level of granularity, but until
> someone needs that keep it like this only.
>
> We can not allocate dynamic memory in purgatory. Therefore we keep page
> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
> points to table at next level (having block entries).  If index for RAM
> area and UART area in first table is not same, then we will need another
> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).


> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
> new file mode 100644
> index 000000000000..bef97ef48888
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache-asm.S
> @@ -0,0 +1,186 @@
> +/*
> + * Some of the routines have been copied from Linux Kernel, therefore
> + * copying the license as well.
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "cache.h"
> +
> +/*
> + * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
> + */
> +	.macro	dcache_line_size, reg, tmp
> +	mrs	\tmp, ctr_el0			// read CTR
> +	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
> +	mov	\reg, #4			// bytes per word
> +	lsl	\reg, \reg, \tmp		// actual cache line size
> +	.endm
> +
> +/*
> + *	inval_cache_range(start, end)
> + *	- x0 - start	- start address of region
> + *	- x1 - end	- end address of region
> + */
> +.globl inval_cache_range
> +inval_cache_range:
> +	dcache_line_size x2, x3
> +	sub	x3, x2, #1
> +	tst	x1, x3				// end cache line aligned?
> +	bic	x1, x1, x3
> +	b.eq	1f
> +	dc	civac, x1			// clean & invalidate D / U line
> +1:	tst	x0, x3				// start cache line aligned?
> +	bic	x0, x0, x3
> +	b.eq	2f
> +	dc	civac, x0			// clean & invalidate D / U line
> +	b	3f
> +2:	dc	ivac, x0			// invalidate D / U line
> +3:	add	x0, x0, x2
> +	cmp	x0, x1
> +	b.lo	2b
> +	dsb	sy
> +	ret
> +/*
> + *	flush_dcache_range(start, end)
> + *	- x0 - start	- start address of region
> + *	- x1 - end	- end address of region
> + *
> + */
> +.globl flush_dcache_range
> +flush_dcache_range:
> +	dcache_line_size x2, x3
> +	sub	x3, x2, #1
> +	bic	x0, x0, x3
> +1:	dc	civac, x0			// clean & invalidate D line / unified line
> +	add	x0, x0, x2
> +	cmp	x0, x1
> +	b.lo	1b
> +	dsb	sy
> +	ret
> +
> +/*
> + *	invalidate_tlbs_el1()
> + */
> +.globl invalidate_tlbs_el1
> +invalidate_tlbs_el1:
> +	dsb	nshst
> +	tlbi	vmalle1
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + *	invalidate_tlbs_el2()
> + */
> +.globl invalidate_tlbs_el2
> +invalidate_tlbs_el2:
> +	dsb	nshst
> +	tlbi	alle2
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + * 	get_mm_feature_reg0_val - Get information about supported MM
> + * 	features
> + */
> +.globl get_mm_feature_reg0_val
> +get_mm_feature_reg0_val:
> +	mrs	x0, ID_AA64MMFR0_EL1
> +	ret
> +
> +/*
> + * 	get_current_el - Get information about current exception level
> + */
> +.globl get_current_el
> +get_current_el:
> +	mrs 	x0, CurrentEL
> +	lsr	x0, x0, #2
> +	ret
> +
> +/*
> + * 	invalidate_icache - Invalidate I-cache
> + */
> +.globl invalidate_icache
> +invalidate_icache:
> +	ic	iallu
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
> + * 	x0 - page_table - Page Table Base
> + * 	x1 - tcr_flags - TCR Flags to be set
> + */
> +.globl set_mair_tcr_ttbr_sctlr_el1
> +set_mair_tcr_ttbr_sctlr_el1:
> +	ldr	x2, =MEMORY_ATTRIBUTES
> +	msr	mair_el1, x2
> +	msr	tcr_el1, x1
> +	msr	ttbr0_el1, x0
> +	isb
> +	mrs	x0, sctlr_el1
> +	ldr	x3, =SCTLR_ELx_FLAGS
> +	orr	x0, x0, x3
> +	msr	sctlr_el1, x0
> +	isb
> +	ret
> +
> +/*
> + * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
> + * 	x0 - page_table - Page Table Base
> + * 	x1 - tcr_flags - TCR Flags to be set
> + */
> +.globl set_mair_tcr_ttbr_sctlr_el2
> +set_mair_tcr_ttbr_sctlr_el2:
> +	ldr	x2, =MEMORY_ATTRIBUTES
> +	msr	mair_el2, x2
> +	msr	tcr_el2, x1
> +	msr	ttbr0_el2, x0
> +	isb
> +	mrs	x0, sctlr_el2
> +	ldr	x3, =SCTLR_ELx_FLAGS
> +	orr	x0, x0, x3
> +	msr	sctlr_el2, x0
> +	isb
> +	ret
> +
> +/*
> + * reset_sctlr_el1 - disables cache and mmu
> + */
> +.globl reset_sctlr_el1
> +reset_sctlr_el1:
> +	mrs	x0, sctlr_el1
> +	bic	x0, x0, #SCTLR_ELx_C
> +	bic	x0, x0, #SCTLR_ELx_M
> +	msr	sctlr_el1, x0
> +	isb
> +	ret
> +
> +/*
> + * reset_sctlr_el2 - disables cache and mmu
> + */
> +.globl reset_sctlr_el2
> +reset_sctlr_el2:
> +	mrs	x0, sctlr_el2
> +	bic	x0, x0, #SCTLR_ELx_C
> +	bic	x0, x0, #SCTLR_ELx_M
> +	msr	sctlr_el2, x0
> +	isb
> +	ret
> diff --git a/purgatory/arch/arm64/cache.c b/purgatory/arch/arm64/cache.c
> new file mode 100644
> index 000000000000..3c7e058ccf11
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache.c
> @@ -0,0 +1,330 @@
> +/*
> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +/* We are supporting only 4K and 64K page sizes. This code will not work if
> + * a hardware is not supporting at least one of these page sizes.
> + * Therefore, D-cache is disabled by default and enabled only when
> + * "enable-dcache" is passed to the kexec().
> + * Since this is an identity mapped system, so VA_BITS will be same as max
> + * PA bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only
> + * one level of page table will be there with block descriptor entries.
> + * Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will
> + * have only table entries pointing to a level 1 lookup. Level 1 will have
> + * only block entries which will map 1GB block.For 64K mapping, TTBR points
> + * to level 1 lookups, which will have only table entries pointing to a
> + * level 2 lookup. Level 2 will have only block entries which will map
> + * 512MB block. If UART base address and RAM addresses are not at least 1GB
> + * and 512MB apart for 4K and 64K respectively, then mapping result could
> + * be unpredictable. In that case we need to support one more level of
> + * granularity, but until someone needs that keep it like this only.
> + * We can not allocate dynamic memory in purgatory. Therefore we keep page
> + * table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points
> + * to first level (having only table entries) and (page_table +
> + * MAX_PAGE_SIZE) points to table at next level (having block entries). If
> + * index for RAM area and UART area in first table is not same, then we
> + * will need another next level table which will be located at (page_table
> + * + 2 * MAX_PAGE_SIZE).
> + */
> +
> +#include <stdint.h>
> +#include <string.h>
> +#include <purgatory.h>
> +#include "cache.h"
> +
> +static uint64_t page_shift;
> +static uint64_t pgtable_level;
> +static uint64_t va_bits;
> +
> +static uint64_t page_table[PAGE_TABLE_SIZE / sizeof(uint64_t)] __attribute__ ((aligned (MAX_PAGE_SIZE))) = { };
> +static uint64_t page_table_used;
> +
> +#define PAGE_SIZE	(1 << page_shift)
> +/*
> + *	is_4k_page_supported - return true if 4k page is supported else
> + *	false
> + */
> +static int is_4k_page_supported(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN4_MASK) ==
> +			ID_AA64MMFR0_TGRAN4_SUPPORTED);
> +}
> +
> +/*
> + *	is_64k_page_supported - return true if 64k page is supported else
> + *	false
> + */
> +static int is_64k_page_supported(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN64_MASK) ==
> +			ID_AA64MMFR0_TGRAN64_SUPPORTED);
> +}
> +
> +/*
> + *	get_ips_bits - return supported IPS bits
> + */
> +static uint64_t get_ips_bits(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
> +			ID_AA64MMFR0_PARANGE_SHIFT);
> +}
> +
> +/*
> + *	get_va_bits - return supported VA bits (For identity mapping VA = PA)
> + */
> +static uint64_t get_va_bits(void)
> +{
> +	uint64_t ips = get_ips_bits();
> +
> +	switch(ips) {
> +	case ID_AA64MMFR0_PARANGE_48:
> +		return 48;
> +	case ID_AA64MMFR0_PARANGE_44:
> +		return 44;
> +	case ID_AA64MMFR0_PARANGE_42:
> +		return 42;
> +	case ID_AA64MMFR0_PARANGE_40:
> +		return 40;
> +	case ID_AA64MMFR0_PARANGE_36:
> +		return 36;
> +	default:
> +		return 32;
> +	}
> +}
> +
> +/*
> + *	get_section_shift - get block shift for supported page size
> + */
> +static uint64_t get_section_shift(void)
> +{
> +	if (page_shift == 16)
> +		return 29;
> +	else if(page_shift == 12)
> +		return 30;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	get_section_mask - get section mask for supported page size
> + */
> +static uint64_t get_section_mask(void)
> +{
> +	if (page_shift == 16)
> +		return 0x1FFF;
> +	else if(page_shift == 12)
> +		return 0x1FF;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	get_pgdir_shift - get pgdir shift for supported page size
> + */
> +static uint64_t get_pgdir_shift(void)
> +{
> +	if (page_shift == 16)
> +		return 42;
> +	else if(page_shift == 12)
> +		return 39;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	init_page_table - Initializes page table locations
> + */
> +
> +static void init_page_table(void)
> +{
> +	/*
> +	 * Invalidate the page tables to avoid potential dirty cache lines
> +	 * being evicted.
> +	 */

How do these lines get dirty? arm64_relocate_new_kernel() invalidated these
pages to PoC before it copied the data. If they were speculatively fetched (I
don't know the rules of when/how that happens) they may be wrong, but will be
clean and not written back. If they are changed in purgatory, they get
invalidated again from enable_mmu_dcache(). I don't think this is needed.


> +	inval_cache_range((uint64_t)page_table,
> +			(uint64_t)page_table + PAGE_TABLE_SIZE);
> +	memset(page_table, 0, PAGE_TABLE_SIZE);
> +}
> +/*
> + *	create_identity_mapping(start, end, flags)
> + *	start		- start address
> + *	end		- end address
> + *	flags 		- MMU Flags for Normal or Device type memory
> + */
> +static void create_identity_mapping(uint64_t start, uint64_t end,
> +					uint64_t flags)
> +{
> +	uint32_t sec_shift, pgdir_shift, sec_mask;
> +	uint64_t desc, s1, e1, s2, e2;
> +	uint64_t *table2;
> +
> +	s1 = start;
> +	e1 = end - 1;
> +
> +	sec_shift = get_section_shift();
> +	if (pgtable_level == 1) {
> +		s1 >>= sec_shift;
> +		e1 >>= sec_shift;
> +		do {
> +			desc = s1 << sec_shift;
> +			desc |= flags;
> +			page_table[s1] = desc;
> +			s1++;
> +		} while (s1 <= e1);
> +	} else {
> +		pgdir_shift = get_pgdir_shift();
> +		sec_mask = get_section_mask();
> +		s1 >>= pgdir_shift;
> +		e1 >>= pgdir_shift;
> +		do {
> +			/*
> +			 * If there is no table entry then write a new
> +			 * entry else, use old entry
> +			 */
> +			if (!page_table[s1]) {
> +				table2 = &page_table[(++page_table_used *
> +						MAX_PAGE_SIZE) /
> +						sizeof(uint64_t)];
> +				desc = (uint64_t)table2 | PMD_TYPE_TABLE;
> +				page_table[s1] = desc;
> +			} else {
> +				table2 = (uint64_t *)(page_table[s1] &
> +						~PMD_TYPE_MASK);
> +			}
> +			s1++;
> +			s2 = start >> sec_shift;
> +			s2 &= sec_mask;
> +			e2 = (end - 1) >> sec_shift;
> +			e2 &= sec_mask;
> +			do {
> +				desc = s2 << sec_shift;
> +				desc |= flags;
> +				table2[s2] = desc;
> +				s2++;
> +			} while (s2 <= e2);
> +		} while (s1 <= e1);
> +	}
> +}

(I will need to come back to this ... it looks pretty complicated. If you mimic
Linux's p?d/pte macros it will be more familiar and easier to read.)


> +
> +/*
> + *	enable_mmu_dcache: Enable mmu and D-cache in sctlr_el1
> + */
> +static void enable_mmu_dcache(void)
> +{
> +	uint64_t tcr_flags = TCR_FLAGS | TCR_T0SZ(va_bits);
> +
> +	switch(page_shift) {
> +	case 16:
> +		tcr_flags |= TCR_TG0_64K;
> +		break;
> +	case 12:
> +		tcr_flags |= TCR_TG0_4K;
> +		break;
> +	default:
> +		printf("page shift not supported\n");
> +		return;
> +	}
> +	/*
> +	 * Since the page tables have been populated with non-cacheable
> +	 * accesses (MMU disabled), invalidate the page tables to remove
> +	 * any speculatively loaded cache lines.
> +	 */
> +	inval_cache_range((uint64_t)page_table,
> +				(uint64_t)page_table + PAGE_TABLE_SIZE);
> +
> +	switch(get_current_el()) {
> +	case 2:
> +		invalidate_tlbs_el2();
> +		tcr_flags |= (get_ips_bits() << TCR_PS_EL2_SHIFT);
> +		set_mair_tcr_ttbr_sctlr_el2((uint64_t)page_table, tcr_flags);
> +		break;
> +	case 1:
> +		invalidate_tlbs_el1();
> +		tcr_flags |= (get_ips_bits() << TCR_IPS_EL1_SHIFT);
> +		set_mair_tcr_ttbr_sctlr_el1((uint64_t)page_table, tcr_flags);
> +		break;
> +	default:
> +		return;
> +	}

> +	invalidate_icache();

What is this protecting against? We have executed instructions between here and
setting the I+M bits in set_mair_tcr_ttbr_sctlr_el1(). (so it may be too late)

arm64_relocate_new_kernel() already did 'ic iallu' before it branched into the
purgatory code. No executable code has been changed or moved since then, so I
don't think this is necessary.


> +}
> +
> +/*
> + *	enable_dcache: Enable D-cache and set appropriate attributes
> + *	ram_start - Start address of RAM
> + *	ram_end - End address of RAM
> + *	uart_base - Base address of uart
> + */
> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base)
> +{
> +	va_bits = get_va_bits();
> +
> +	page_table_used = 0;
> +	if (is_64k_page_supported()) {
> +		page_shift = 16;
> +		if (va_bits <= 42)
> +			pgtable_level = 1;
> +		else
> +			pgtable_level = 2;
> +	} else if (is_4k_page_supported()) {
> +		page_shift = 12;
> +		if (va_bits <= 39)
> +			pgtable_level = 1;
> +		else
> +			pgtable_level = 2;
> +	} else {
> +		printf("Valid Page Granule not supported by hardware\n");
> +		return -1;
> +	}
> +	init_page_table();
> +	create_identity_mapping(ram_start, ram_end, MM_MMUFLAGS_NORMAL);
> +	printf("Normal identity mapping created from %lx to %lx\n",
> +			ram_start, ram_end);
> +	if (uart_base) {
> +		create_identity_mapping((uint64_t)uart_base,
> +					(uint64_t)uart_base + PAGE_SIZE,
> +					MM_MMUFLAGS_DEVICE);
> +		printf("Device identity mapping created from %lx to %lx\n",
> +				(uint64_t)uart_base,
> +				(uint64_t)uart_base + PAGE_SIZE);
> +	}
> +	enable_mmu_dcache();
> +	printf("Cache Enabled\n");
> +
> +	return 0;
> +}
> +
> +/*
> + *	disable_dcache: Disable D-cache and flush RAM locations
> + *	ram_start - Start address of RAM
> + *	ram_end - End address of RAM
> + */
> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
> +{
> +	switch(get_current_el()) {
> +	case 2:
> +		reset_sctlr_el2();
> +		break;
> +	case 1:
> +		reset_sctlr_el1();

You have C code running between disabling the MMU and cleaning the cache. The
compiler is allowed to move data on and off the stack in here, but after
disabling the MMU it will see whatever was on the stack before we turned the MMU
on. Any data written at the beginning of this function is left in the caches.

I'm afraid this sort of stuff needs to be done in assembly!


> +		break;
> +	default:
> +		return;
> +	}
> +	invalidate_icache();
> +	flush_dcache_range(ram_start, ram_end);
> +	printf("Cache Disabled\n");
> +}
> diff --git a/purgatory/arch/arm64/cache.h b/purgatory/arch/arm64/cache.h
> new file mode 100644
> index 000000000000..c988020566e3
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache.h
> @@ -0,0 +1,79 @@
> +#ifndef	__CACHE_H__
> +#define __CACHE_H__
> +
> +#define MT_DEVICE_NGNRNE	0
> +#define MT_DEVICE_NGNRE		1
> +#define MT_DEVICE_GRE		2
> +#define MT_NORMAL_NC		3
> +#define MT_NORMAL		4

You only use two of these. I guess this is so the MAIR value matches the kernel?


> +
> +#ifndef __ASSEMBLER__
> +
> +#define MAX_PAGE_SIZE		0x10000
> +#define PAGE_TABLE_SIZE		(3 * MAX_PAGE_SIZE)
> +#define ID_AA64MMFR0_TGRAN64_SHIFT	24
> +#define ID_AA64MMFR0_TGRAN4_SHIFT	28
> +#define ID_AA64MMFR0_TGRAN64_MASK	(0xFUL << ID_AA64MMFR0_TGRAN64_SHIFT)
> +#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
> +#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
> +#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
> +#define ID_AA64MMFR0_PARANGE_SHIFT	0
> +#define ID_AA64MMFR0_PARANGE_MASK	(0xFUL << ID_AA64MMFR0_PARANGE_SHIFT)
> +#define ID_AA64MMFR0_PARANGE_48		0x5
> +#define ID_AA64MMFR0_PARANGE_44		0x4
> +#define ID_AA64MMFR0_PARANGE_42		0x3
> +#define ID_AA64MMFR0_PARANGE_40		0x2
> +#define ID_AA64MMFR0_PARANGE_36		0x1
> +#define ID_AA64MMFR0_PARANGE_32		0x0
> +
> +#define TCR_TG0_64K 		(1UL << 14)
> +#define TCR_TG0_4K 		(0UL << 14)
> +#define TCR_SHARED_NONE		(0UL << 12)
> +#define TCR_ORGN_WBWA		(1UL << 10)
> +#define TCR_IRGN_WBWA		(1UL << 8)
> +#define TCR_IPS_EL1_SHIFT	32
> +#define TCR_PS_EL2_SHIFT	16
> +#define TCR_T0SZ(x)		((unsigned long)(64 - (x)) << 0)
> +#define TCR_FLAGS (TCR_SHARED_NONE | TCR_ORGN_WBWA | TCR_IRGN_WBWA)
> +
> +#define PMD_TYPE_SECT		(1UL << 0)
> +#define PMD_TYPE_TABLE		(3UL << 0)
> +#define PMD_TYPE_MASK		0x3
> +#define PMD_SECT_AF		(1UL << 10)
> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
> +#define PMD_SECT_PXN		(1UL << 53)
> +#define PMD_SECT_UXN		(1UL << 54)
> +#define PMD_FLAGS_DEVICE	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_PXN | PMD_SECT_UXN)
> +#define MM_MMUFLAGS_NORMAL	PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL
> +#define MM_MMUFLAGS_DEVICE	PMD_ATTRINDX(MT_DEVICE_NGNRE) | PMD_FLAGS_DEVICE
> +
> +void disable_dcache(uint64_t ram_start, uint64_t ram_end);
> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base);
> +uint64_t get_mm_feature_reg0_val(void);
> +void inval_cache_range(uint64_t start, uint64_t end);
> +void flush_dcache_range(uint64_t start, uint64_t end);
> +uint64_t get_current_el(void);
> +void set_mair_tcr_ttbr_sctlr_el1(uint64_t page_table, uint64_t tcr_flags);
> +void set_mair_tcr_ttbr_sctlr_el2(uint64_t page_table, uint64_t tcr_flags);
> +void invalidate_tlbs_el1(void);
> +void invalidate_tlbs_el2(void);
> +void invalidate_icache(void);
> +void reset_sctlr_el1(void);
> +void reset_sctlr_el2(void);
> +#else
> +#define MEMORY_ATTRIBUTES	((0x00 << (MT_DEVICE_NGNRNE*8)) | \
> +				(0x04 << (MT_DEVICE_NGNRE*8)) | \
> +				(0x0C << (MT_DEVICE_GRE*8)) | \
> +				(0x44 << (MT_NORMAL_NC*8)) | \
> +				(0xFF << (MT_NORMAL*8)))

Again, you only use two of these.


> +/* Common SCTLR_ELx flags. */
> +#define SCTLR_ELx_I		(1 << 12)
> +#define SCTLR_ELx_C		(1 << 2)
> +#define SCTLR_ELx_M		(1 << 0)
> +
> +#define SCTLR_ELx_FLAGS (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
> +
> +#endif
> +#endif
> 


Thanks!

James

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-11-25 18:30     ` James Morse
  0 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-11-25 18:30 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel

Hi Pratyush,

(CC: Mark, mismatched memory attributes in paragraph 3?)

On 22/11/16 04:32, Pratyush Anand wrote:
> This patch adds support to enable/disable d-cache, which can be used for
> faster purgatory sha256 verification.

(I'm not clear why we want the sha256, but that is being discussed elsewhere on
 the thread)


> We are supporting only 4K and 64K page sizes. This code will not work if a
> hardware is not supporting at least one of these page sizes.  Therefore,
> D-cache is disabled by default and enabled only when "enable-dcache" is
> passed to the kexec().

I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
a lot simpler to only support one page size, which should be 4K as that is what
UEFI requires. (If there are CPUs that only support one size, I bet it's 4K!)

I would go as far as to generate the page tables at 'kexec -l' time, and only if
'/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
support 4K pages). This would keep the purgatory code as simple as possible.
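
A load-time check could be as simple as this (sketch only, the helper
name is made up, not existing kexec-tools code):

	#include <unistd.h>

	/* True when the first kernel booted via UEFI, which implies the
	 * CPU supports 4K pages. */
	static int booted_via_uefi(void)
	{
		return access("/sys/firmware/efi", F_OK) == 0;
	}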

I don't think the performance difference between 4K and 64K page sizes will be
measurable; is purgatory really performance-sensitive code?


> Since this is an identity mapped system, so VA_BITS will be same as max PA
> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
> level of page table will be there with block descriptor entries.
> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
> only table entries pointing to a level 1 lookup. Level 1 will have only
> block entries which will map 1GB block. For 64K mapping, TTBR points to
> level 1 lookups, which will have only table entries pointing to a level 2
> lookup. Level 2 will have only block entries which will map 512MB block. If

This is more complexity to pick a VA size. Why not always use the maximum 48-bit
VA? The cost is negligible compared to having simpler (easier to review!)
purgatory code.

By always using 1GB blocks you may be creating aliases with mismatched attributes:
* If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
  in the same 1GB of address space. This could be a reserved region with some
  other memory attributes.
* With kdump, we may have failed to park the other CPUs if they are executing
  with interrupts masked and haven't yet handled the smp_send_stop() IPI.
* One of these other CPUs could be reading/writing in this area as it doesn't
  belong to the kdump reserved area, just happens to be in the same 1GB.

I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
sure this is well into the "don't do that" territory.


It would be much better to force the memory areas to be a multiple of 2MB and
2MB aligned, which will allow you to use 2M section mappings for memory (but
not the UART). This way we only map regions we had reserved and know are memory.
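
For example (sketch only; SZ_2M and the helper name are made up, not
existing kexec-tools code):

	#include <stdint.h>

	#define SZ_2M	(2UL << 20)

	/* True when both ends of the region sit on a 2MB boundary, so a
	 * 2M section mapping cannot spill outside the reservation. */
	static int region_is_2m_aligned(uint64_t start, uint64_t end)
	{
		return !((start | end) & (SZ_2M - 1));
	}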


> UART base address and RAM addresses are not at least 1GB and 512MB apart
> for 4K and 64K respectively, then mapping result could be unpredictable. In
> that case we need to support one more level of granularity, but until
> someone needs that keep it like this only.
>
> We can not allocate dynamic memory in purgatory. Therefore we keep page
> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
> points to table at next level (having block entries).  If index for RAM
> area and UART area in first table is not same, then we will need another
> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).


> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
> new file mode 100644
> index 000000000000..bef97ef48888
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache-asm.S
> @@ -0,0 +1,186 @@
> +/*
> + * Some of the routines have been copied from Linux Kernel, therefore
> + * copying the license as well.
> + *
> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include "cache.h"
> +
> +/*
> + * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
> + */
> +	.macro	dcache_line_size, reg, tmp
> +	mrs	\tmp, ctr_el0			// read CTR
> +	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
> +	mov	\reg, #4			// bytes per word
> +	lsl	\reg, \reg, \tmp		// actual cache line size
> +	.endm
> +
> +/*
> + *	inval_cache_range(start, end)
> + *	- x0 - start	- start address of region
> + *	- x1 - end	- end address of region
> + */
> +.globl inval_cache_range
> +inval_cache_range:
> +	dcache_line_size x2, x3
> +	sub	x3, x2, #1
> +	tst	x1, x3				// end cache line aligned?
> +	bic	x1, x1, x3
> +	b.eq	1f
> +	dc	civac, x1			// clean & invalidate D / U line
> +1:	tst	x0, x3				// start cache line aligned?
> +	bic	x0, x0, x3
> +	b.eq	2f
> +	dc	civac, x0			// clean & invalidate D / U line
> +	b	3f
> +2:	dc	ivac, x0			// invalidate D / U line
> +3:	add	x0, x0, x2
> +	cmp	x0, x1
> +	b.lo	2b
> +	dsb	sy
> +	ret
> +/*
> + *	flush_dcache_range(start, end)
> + *	- x0 - start	- start address of region
> + *	- x1 - end	- end address of region
> + *
> + */
> +.globl flush_dcache_range
> +flush_dcache_range:
> +	dcache_line_size x2, x3
> +	sub	x3, x2, #1
> +	bic	x0, x0, x3
> +1:	dc	civac, x0			// clean & invalidate D line / unified line
> +	add	x0, x0, x2
> +	cmp	x0, x1
> +	b.lo	1b
> +	dsb	sy
> +	ret
> +
> +/*
> + *	invalidate_tlbs_el1()
> + */
> +.globl invalidate_tlbs_el1
> +invalidate_tlbs_el1:
> +	dsb	nshst
> +	tlbi	vmalle1
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + *	invalidate_tlbs_el2()
> + */
> +.globl invalidate_tlbs_el2
> +invalidate_tlbs_el2:
> +	dsb	nshst
> +	tlbi	alle2
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + * 	get_mm_feature_reg0_val - Get information about supported MM
> + * 	features
> + */
> +.globl get_mm_feature_reg0_val
> +get_mm_feature_reg0_val:
> +	mrs	x0, ID_AA64MMFR0_EL1
> +	ret
> +
> +/*
> + * 	get_current_el - Get information about current exception level
> + */
> +.globl get_current_el
> +get_current_el:
> +	mrs 	x0, CurrentEL
> +	lsr	x0, x0, #2
> +	ret
> +
> +/*
> + * 	invalidate_icache - Invalidate I-cache
> + */
> +.globl invalidate_icache
> +invalidate_icache:
> +	ic	iallu
> +	dsb	nsh
> +	isb
> +	ret
> +
> +/*
> + * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
> + * 	x0 - page_table - Page Table Base
> + * 	x1 - tcr_flags - TCR Flags to be set
> + */
> +.globl set_mair_tcr_ttbr_sctlr_el1
> +set_mair_tcr_ttbr_sctlr_el1:
> +	ldr	x2, =MEMORY_ATTRIBUTES
> +	msr	mair_el1, x2
> +	msr	tcr_el1, x1
> +	msr	ttbr0_el1, x0
> +	isb
> +	mrs	x0, sctlr_el1
> +	ldr	x3, =SCTLR_ELx_FLAGS
> +	orr	x0, x0, x3
> +	msr	sctlr_el1, x0
> +	isb
> +	ret
> +
> +/*
> + * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
> + * 	x0 - page_table - Page Table Base
> + * 	x1 - tcr_flags - TCR Flags to be set
> + */
> +.globl set_mair_tcr_ttbr_sctlr_el2
> +set_mair_tcr_ttbr_sctlr_el2:
> +	ldr	x2, =MEMORY_ATTRIBUTES
> +	msr	mair_el2, x2
> +	msr	tcr_el2, x1
> +	msr	ttbr0_el2, x0
> +	isb
> +	mrs	x0, sctlr_el2
> +	ldr	x3, =SCTLR_ELx_FLAGS
> +	orr	x0, x0, x3
> +	msr	sctlr_el2, x0
> +	isb
> +	ret
> +
> +/*
> + * reset_sctlr_el1 - disables cache and mmu
> + */
> +.globl reset_sctlr_el1
> +reset_sctlr_el1:
> +	mrs	x0, sctlr_el1
> +	bic	x0, x0, #SCTLR_ELx_C
> +	bic	x0, x0, #SCTLR_ELx_M
> +	msr	sctlr_el1, x0
> +	isb
> +	ret
> +
> +/*
> + * reset_sctlr_el2 - disables cache and mmu
> + */
> +.globl reset_sctlr_el2
> +reset_sctlr_el2:
> +	mrs	x0, sctlr_el2
> +	bic	x0, x0, #SCTLR_ELx_C
> +	bic	x0, x0, #SCTLR_ELx_M
> +	msr	sctlr_el2, x0
> +	isb
> +	ret
> diff --git a/purgatory/arch/arm64/cache.c b/purgatory/arch/arm64/cache.c
> new file mode 100644
> index 000000000000..3c7e058ccf11
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache.c
> @@ -0,0 +1,330 @@
> +/*
> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +/* We are supporting only 4K and 64K page sizes. This code will not work if
> + * a hardware is not supporting at least one of these page sizes.
> + * Therefore, D-cache is disabled by default and enabled only when
> + * "enable-dcache" is passed to the kexec().
> + * Since this is an identity mapped system, so VA_BITS will be same as max
> + * PA bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only
> + * one level of page table will be there with block descriptor entries.
> + * Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will
> + * have only table entries pointing to a level 1 lookup. Level 1 will have
> + * only block entries which will map 1GB block.For 64K mapping, TTBR points
> + * to level 1 lookups, which will have only table entries pointing to a
> + * level 2 lookup. Level 2 will have only block entries which will map
> + * 512MB block. If UART base address and RAM addresses are not at least 1GB
> + * and 512MB apart for 4K and 64K respectively, then mapping result could
> + * be unpredictable. In that case we need to support one more level of
> + * granularity, but until someone needs that keep it like this only.
> + * We can not allocate dynamic memory in purgatory. Therefore we keep page
> + * table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points
> + * to first level (having only table entries) and (page_table +
> + * MAX_PAGE_SIZE) points to table at next level (having block entries). If
> + * index for RAM area and UART area in first table is not same, then we
> + * will need another next level table which will be located at (page_table
> + * + 2 * MAX_PAGE_SIZE).
> + */
> +
> +#include <stdint.h>
> +#include <string.h>
> +#include <purgatory.h>
> +#include "cache.h"
> +
> +static uint64_t page_shift;
> +static uint64_t pgtable_level;
> +static uint64_t va_bits;
> +
> +static uint64_t page_table[PAGE_TABLE_SIZE / sizeof(uint64_t)] __attribute__ ((aligned (MAX_PAGE_SIZE))) = { };
> +static uint64_t page_table_used;
> +
> +#define PAGE_SIZE	(1 << page_shift)
> +/*
> + *	is_4k_page_supported - return true if 4k page is supported else
> + *	false
> + */
> +static int is_4k_page_supported(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN4_MASK) ==
> +			ID_AA64MMFR0_TGRAN4_SUPPORTED);
> +}
> +
> +/*
> + *	is_64k_page_supported - return true if 64k page is supported else
> + *	false
> + */
> +static int is_64k_page_supported(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN64_MASK) ==
> +			ID_AA64MMFR0_TGRAN64_SUPPORTED);
> +}
> +
> +/*
> + *	get_ips_bits - return supported IPS bits
> + */
> +static uint64_t get_ips_bits(void)
> +{
> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
> +			ID_AA64MMFR0_PARANGE_SHIFT);
> +}
> +
> +/*
> + *	get_va_bits - return supported VA bits (For identity mapping VA = PA)
> + */
> +static uint64_t get_va_bits(void)
> +{
> +	uint64_t ips = get_ips_bits();
> +
> +	switch(ips) {
> +	case ID_AA64MMFR0_PARANGE_48:
> +		return 48;
> +	case ID_AA64MMFR0_PARANGE_44:
> +		return 44;
> +	case ID_AA64MMFR0_PARANGE_42:
> +		return 42;
> +	case ID_AA64MMFR0_PARANGE_40:
> +		return 40;
> +	case ID_AA64MMFR0_PARANGE_36:
> +		return 36;
> +	default:
> +		return 32;
> +	}
> +}
> +
> +/*
> + *	get_section_shift - get block shift for supported page size
> + */
> +static uint64_t get_section_shift(void)
> +{
> +	if (page_shift == 16)
> +		return 29;
> +	else if(page_shift == 12)
> +		return 30;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	get_section_mask - get section mask for supported page size
> + */
> +static uint64_t get_section_mask(void)
> +{
> +	if (page_shift == 16)
> +		return 0x1FFF;
> +	else if(page_shift == 12)
> +		return 0x1FF;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	get_pgdir_shift - get pgdir shift for supported page size
> + */
> +static uint64_t get_pgdir_shift(void)
> +{
> +	if (page_shift == 16)
> +		return 42;
> +	else if(page_shift == 12)
> +		return 39;
> +	else
> +		return 0;
> +}
> +
> +/*
> + *	init_page_table - Initializes page table locations
> + */
> +
> +static void init_page_table(void)
> +{
> +	/*
> +	 * Invalidate the page tables to avoid potential dirty cache lines
> +	 * being evicted.
> +	 */

How do these lines get dirty? arm64_relocate_new_kernel() invalidated these
pages to PoC before it copied the data. If they were speculatively fetched (I
don't know the rules of when/how that happens) they may be wrong, but will be
clean and not written back. If they are changed in purgatory, they get
invalidated again from enable_mmu_dcache(). I don't think this is needed.


> +	inval_cache_range((uint64_t)page_table,
> +			(uint64_t)page_table + PAGE_TABLE_SIZE);
> +	memset(page_table, 0, PAGE_TABLE_SIZE);
> +}
> +/*
> + *	create_identity_mapping(start, end, flags)
> + *	start		- start address
> + *	end		- end address
> + *	flags 		- MMU Flags for Normal or Device type memory
> + */
> +static void create_identity_mapping(uint64_t start, uint64_t end,
> +					uint64_t flags)
> +{
> +	uint32_t sec_shift, pgdir_shift, sec_mask;
> +	uint64_t desc, s1, e1, s2, e2;
> +	uint64_t *table2;
> +
> +	s1 = start;
> +	e1 = end - 1;
> +
> +	sec_shift = get_section_shift();
> +	if (pgtable_level == 1) {
> +		s1 >>= sec_shift;
> +		e1 >>= sec_shift;
> +		do {
> +			desc = s1 << sec_shift;
> +			desc |= flags;
> +			page_table[s1] = desc;
> +			s1++;
> +		} while (s1 <= e1);
> +	} else {
> +		pgdir_shift = get_pgdir_shift();
> +		sec_mask = get_section_mask();
> +		s1 >>= pgdir_shift;
> +		e1 >>= pgdir_shift;
> +		do {
> +			/*
> +			 * If there is no table entry then write a new
> +			 * entry else, use old entry
> +			 */
> +			if (!page_table[s1]) {
> +				table2 = &page_table[(++page_table_used *
> +						MAX_PAGE_SIZE) /
> +						sizeof(uint64_t)];
> +				desc = (uint64_t)table2 | PMD_TYPE_TABLE;
> +				page_table[s1] = desc;
> +			} else {
> +				table2 = (uint64_t *)(page_table[s1] &
> +						~PMD_TYPE_MASK);
> +			}
> +			s1++;
> +			s2 = start >> sec_shift;
> +			s2 &= sec_mask;
> +			e2 = (end - 1) >> sec_shift;
> +			e2 &= sec_mask;
> +			do {
> +				desc = s2 << sec_shift;
> +				desc |= flags;
> +				table2[s2] = desc;
> +				s2++;
> +			} while (s2 <= e2);
> +		} while (s1 <= e1);
> +	}
> +}

(I will need to come back to this ... it looks pretty complicated. If you mimic
Linux's p?d/pte macros it will be more familiar and easier to read.)
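
For illustration, a minimal sketch of the sort of helpers meant here (names
assumed, 4K granule, 48-bit VA; not part of the patch):

#define PTRS_PER_TABLE		512
#define PGD_SHIFT		39
#define PUD_SHIFT		30
#define PMD_SHIFT		21
#define pgd_index(addr)		(((addr) >> PGD_SHIFT) & (PTRS_PER_TABLE - 1))
#define pud_index(addr)		(((addr) >> PUD_SHIFT) & (PTRS_PER_TABLE - 1))
#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_TABLE - 1))

Each level of the walk then reads as table[p?d_index(addr)] instead of
open-coded shift/mask arithmetic.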


> +
> +/*
> + *	enable_mmu_dcache: Enable mmu and D-cache in sctlr_el1
> + */
> +static void enable_mmu_dcache(void)
> +{
> +	uint64_t tcr_flags = TCR_FLAGS | TCR_T0SZ(va_bits);
> +
> +	switch(page_shift) {
> +	case 16:
> +		tcr_flags |= TCR_TG0_64K;
> +		break;
> +	case 12:
> +		tcr_flags |= TCR_TG0_4K;
> +		break;
> +	default:
> +		printf("page shift not supported\n");
> +		return;
> +	}
> +	/*
> +	 * Since the page tables have been populated with non-cacheable
> +	 * accesses (MMU disabled), invalidate the page tables to remove
> +	 * any speculatively loaded cache lines.
> +	 */
> +	inval_cache_range((uint64_t)page_table,
> +				(uint64_t)page_table + PAGE_TABLE_SIZE);
> +
> +	switch(get_current_el()) {
> +	case 2:
> +		invalidate_tlbs_el2();
> +		tcr_flags |= (get_ips_bits() << TCR_PS_EL2_SHIFT);
> +		set_mair_tcr_ttbr_sctlr_el2((uint64_t)page_table, tcr_flags);
> +		break;
> +	case 1:
> +		invalidate_tlbs_el1();
> +		tcr_flags |= (get_ips_bits() << TCR_IPS_EL1_SHIFT);
> +		set_mair_tcr_ttbr_sctlr_el1((uint64_t)page_table, tcr_flags);
> +		break;
> +	default:
> +		return;
> +	}

> +	invalidate_icache();

What is this protecting against? We have executed instructions between here and
setting the I+M bits in set_mair_tcr_ttbr_sctlr_el1(). (so it may be too late)

arm64_relocate_new_kernel() already did 'ic iallu' before it branched into the
purgatory code. No executable code has been changed or moved since then, so I
don't think this is necessary.


> +}
> +
> +/*
> + *	enable_dcache: Enable D-cache and set appropriate attributes
> + *	ram_start - Start address of RAM
> + *	ram_end - End address of RAM
> + *	uart_base - Base address of uart
> + */
> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base)
> +{
> +	va_bits = get_va_bits();
> +
> +	page_table_used = 0;
> +	if (is_64k_page_supported()) {
> +		page_shift = 16;
> +		if (va_bits <= 42)
> +			pgtable_level = 1;
> +		else
> +			pgtable_level = 2;
> +	} else if (is_4k_page_supported()) {
> +		page_shift = 12;
> +		if (va_bits <= 39)
> +			pgtable_level = 1;
> +		else
> +			pgtable_level = 2;
> +	} else {
> +		printf("Valid Page Granule not supported by hardware\n");
> +		return -1;
> +	}
> +	init_page_table();
> +	create_identity_mapping(ram_start, ram_end, MM_MMUFLAGS_NORMAL);
> +	printf("Normal identity mapping created from %lx to %lx\n",
> +			ram_start, ram_end);
> +	if (uart_base) {
> +		create_identity_mapping((uint64_t)uart_base,
> +					(uint64_t)uart_base + PAGE_SIZE,
> +					MM_MMUFLAGS_DEVICE);
> +		printf("Device identity mapping created from %lx to %lx\n",
> +				(uint64_t)uart_base,
> +				(uint64_t)uart_base + PAGE_SIZE);
> +	}
> +	enable_mmu_dcache();
> +	printf("Cache Enabled\n");
> +
> +	return 0;
> +}
> +
> +/*
> + *	disable_dcache: Disable D-cache and flush RAM locations
> + *	ram_start - Start address of RAM
> + *	ram_end - End address of RAM
> + */
> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
> +{
> +	switch(get_current_el()) {
> +	case 2:
> +		reset_sctlr_el2();
> +		break;
> +	case 1:
> +		reset_sctlr_el1();

You have C code running between disabling the MMU and cleaning the cache. The
compiler is allowed to move data on and off the stack in here, but after
disabling the MMU it will see whatever was on the stack before we turned the MMU
on. Any data written at the beginning of this function is left in the caches.

I'm afraid this sort of stuff needs to be done in assembly!


> +		break;
> +	default:
> +		return;
> +	}
> +	invalidate_icache();
> +	flush_dcache_range(ram_start, ram_end);
> +	printf("Cache Disabled\n");
> +}
> diff --git a/purgatory/arch/arm64/cache.h b/purgatory/arch/arm64/cache.h
> new file mode 100644
> index 000000000000..c988020566e3
> --- /dev/null
> +++ b/purgatory/arch/arm64/cache.h
> @@ -0,0 +1,79 @@
> +#ifndef	__CACHE_H__
> +#define __CACHE_H__
> +
> +#define MT_DEVICE_NGNRNE	0
> +#define MT_DEVICE_NGNRE		1
> +#define MT_DEVICE_GRE		2
> +#define MT_NORMAL_NC		3
> +#define MT_NORMAL		4

You only use two of these. I guess this is so the MAIR value matches the kernel?


> +
> +#ifndef __ASSEMBLER__
> +
> +#define MAX_PAGE_SIZE		0x10000
> +#define PAGE_TABLE_SIZE		(3 * MAX_PAGE_SIZE)
> +#define ID_AA64MMFR0_TGRAN64_SHIFT	24
> +#define ID_AA64MMFR0_TGRAN4_SHIFT	28
> +#define ID_AA64MMFR0_TGRAN64_MASK	(0xFUL << ID_AA64MMFR0_TGRAN64_SHIFT)
> +#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
> +#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
> +#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
> +#define ID_AA64MMFR0_PARANGE_SHIFT	0
> +#define ID_AA64MMFR0_PARANGE_MASK	(0xFUL << ID_AA64MMFR0_PARANGE_SHIFT)
> +#define ID_AA64MMFR0_PARANGE_48		0x5
> +#define ID_AA64MMFR0_PARANGE_44		0x4
> +#define ID_AA64MMFR0_PARANGE_42		0x3
> +#define ID_AA64MMFR0_PARANGE_40		0x2
> +#define ID_AA64MMFR0_PARANGE_36		0x1
> +#define ID_AA64MMFR0_PARANGE_32		0x0
> +
> +#define TCR_TG0_64K 		(1UL << 14)
> +#define TCR_TG0_4K 		(0UL << 14)
> +#define TCR_SHARED_NONE		(0UL << 12)
> +#define TCR_ORGN_WBWA		(1UL << 10)
> +#define TCR_IRGN_WBWA		(1UL << 8)
> +#define TCR_IPS_EL1_SHIFT	32
> +#define TCR_PS_EL2_SHIFT	16
> +#define TCR_T0SZ(x)		((unsigned long)(64 - (x)) << 0)
> +#define TCR_FLAGS (TCR_SHARED_NONE | TCR_ORGN_WBWA | TCR_IRGN_WBWA)
> +
> +#define PMD_TYPE_SECT		(1UL << 0)
> +#define PMD_TYPE_TABLE		(3UL << 0)
> +#define PMD_TYPE_MASK		0x3
> +#define PMD_SECT_AF		(1UL << 10)
> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
> +#define PMD_SECT_PXN		(1UL << 53)
> +#define PMD_SECT_UXN		(1UL << 54)
> +#define PMD_FLAGS_DEVICE	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_PXN | PMD_SECT_UXN)
> +#define MM_MMUFLAGS_NORMAL	PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL
> +#define MM_MMUFLAGS_DEVICE	PMD_ATTRINDX(MT_DEVICE_NGNRE) | PMD_FLAGS_DEVICE
> +
> +void disable_dcache(uint64_t ram_start, uint64_t ram_end);
> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base);
> +uint64_t get_mm_feature_reg0_val(void);
> +void inval_cache_range(uint64_t start, uint64_t end);
> +void flush_dcache_range(uint64_t start, uint64_t end);
> +uint64_t get_current_el(void);
> +void set_mair_tcr_ttbr_sctlr_el1(uint64_t page_table, uint64_t tcr_flags);
> +void set_mair_tcr_ttbr_sctlr_el2(uint64_t page_table, uint64_t tcr_flags);
> +void invalidate_tlbs_el1(void);
> +void invalidate_tlbs_el2(void);
> +void invalidate_icache(void);
> +void reset_sctlr_el1(void);
> +void reset_sctlr_el2(void);
> +#else
> +#define MEMORY_ATTRIBUTES	((0x00 << (MT_DEVICE_NGNRNE*8)) | \
> +				(0x04 << (MT_DEVICE_NGNRE*8)) | \
> +				(0x0C << (MT_DEVICE_GRE*8)) | \
> +				(0x44 << (MT_NORMAL_NC*8)) | \
> +				(0xFF << (MT_NORMAL*8)))

Again, you only use two of these.


> +/* Common SCTLR_ELx flags. */
> +#define SCTLR_ELx_I		(1 << 12)
> +#define SCTLR_ELx_C		(1 << 2)
> +#define SCTLR_ELx_M		(1 << 0)
> +
> +#define SCTLR_ELx_FLAGS (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
> +
> +#endif
> +#endif
> 


Thanks!

James


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
  2016-11-22 18:56   ` Geoff Levand
@ 2016-11-25 18:30     ` James Morse
  -1 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-11-25 18:30 UTC (permalink / raw)
  To: linux-arm-kernel

Hi guys,

On 22/11/16 18:56, Geoff Levand wrote:
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA verification.
> 
> As I had mentioned in another thread, I think -O2 optimization is
> sufficient considering the complexity of the code needed to enable
> the dcache.  Integrity checking is only needed for crash dump
> support.  If the crash reboot takes an extra 20 seconds does it
> matter?
> 
> For the re-boot of a stable system where the new kernel is loaded
> then immediately kexec'ed into integrity checking is not needed.

I agree.
If purgatory detects corruption in the new kernel or initramfs, all it can do is
spin in a loop. If we are very lucky it could print a debug message to the
serial console. If the planets line up, someone might see this message.

If we validate the checksum in the kernel kexec core code we can possibly fail
the syscall and return to a running system. We can use EFI runtime services to
try and reboot, or print a message to somewhere that might get seen such as
syslog or netconsole.

I agree kdump is different but I don't think 'we crashed' is performance critical.


Thanks,

James

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility
@ 2016-11-25 18:30     ` James Morse
  0 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-11-25 18:30 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: Geoff Levand, kexec, linux-arm-kernel

Hi guys,

On 22/11/16 18:56, Geoff Levand wrote:
> On 11/21/2016 08:32 PM, Pratyush Anand wrote:
>> It takes more that 2 minutes to verify SHA in purgatory when vmlinuz image
>> is around 13MB and initramfs is around 30MB. It takes more than 20 second
>> even when we have -O2 optimization enabled. However, if dcache is enabled
>> during purgatory execution then, it takes just a second in SHA verification.
> 
> As I had mentioned in another thread, I think -O2 optimization is
> sufficient considering the complexity of the code needed to enable
> the dcache.  Integrity checking is only needed for crash dump
> support.  If the crash reboot takes an extra 20 seconds does it
> matter?
> 
> For the re-boot of a stable system where the new kernel is loaded
> then immediately kexec'ed into integrity checking is not needed.

I agree.
If purgatory detects corruption in the new kernel or initramfs, all it can do is
spin in a loop. If we are very lucky it could print a debug message to the
serial console. If the planets line up, someone might see this message.

If we validate the checksum in the kernel kexec core code we can possibly fail
the syscall and return to a running system. We can use EFI runtime services to
try and reboot, or print a message to somewhere that might get seen such as
syslog or netconsole.

I agree kdump is different but I don't think 'we crashed' is performance critical.


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-11-25 18:30     ` James Morse
@ 2016-12-14  9:38       ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14  9:38 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

Thanks a lot for your review. It's helpful.

On Saturday 26 November 2016 12:00 AM, James Morse wrote:
> Hi Pratyush,
>
> (CC: Mark, mismatched memory attributes in paragraph 3?)
>
> On 22/11/16 04:32, Pratyush Anand wrote:
>> This patch adds support to enable/disable d-cache, which can be used for
>> faster purgatory sha256 verification.
>
> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>  the thread)
>
>
>> We are supporting only 4K and 64K page sizes. This code will not work if a
>> hardware is not supporting at least one of these page sizes.  Therefore,
>> D-cache is disabled by default and enabled only when "enable-dcache" is
>> passed to the kexec().
>
> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
> a lot simpler to only support one page size, which should be 4K as that is what
> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)

Ok.. So, I will implement a new version assuming that 4K will always be 
supported. If 4K is not supported by the hardware (which is very 
unlikely) then there would be no d-cache enabling feature.

>
> I would go as far as to generate the page tables at 'kexec -l' time, and only if

Ok.. So you mean that I create a new section which will have page table 
entries mapping the physical memory represented by the remaining sections, 
and then purgatory can just enable the MMU with the page table from that 
section, right? Seems doable; can do that.
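
Something like this is what I have in mind on the purgatory side (only a
rough sketch; EL1 case, assumed symbol name for the load-time table, EL2
variant and error handling left out):

#include <stdint.h>
#include "cache.h"

uint64_t arm64_pgtable_base;	/* assumed: filled in by kexec at load time */

void maybe_enable_dcache(void)
{
	uint64_t ips, tcr;

	if (!arm64_pgtable_base)
		return;

	ips = (get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
			ID_AA64MMFR0_PARANGE_SHIFT;
	tcr = TCR_FLAGS | TCR_T0SZ(48) | TCR_TG0_4K |
			(ips << TCR_IPS_EL1_SHIFT);

	invalidate_tlbs_el1();
	set_mair_tcr_ttbr_sctlr_el1(arm64_pgtable_base, tcr);
}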

> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
> support 4K pages). This would keep the purgatory code as simple as possible.

What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That 
can also tell us whether 4K is supported or not.
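
For example (a sketch only; the TGran4 field is bits [31:28] of
ID_AA64MMFR0_EL1, with 0x0 meaning the 4K granule is implemented; how the
register value is obtained is a separate question):

#include <stdint.h>

/* ID_AA64MMFR0_EL1.TGran4, bits [31:28]: 0x0 = 4K granule implemented */
static int mmfr0_supports_4k(uint64_t mmfr0)
{
	return ((mmfr0 >> 28) & 0xF) == 0x0;
}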

>
> I don't think the performance difference between 4K and 64K page sizes will be
> measurable, is purgatory really performance sensitive code?

I agree, implementing only 4K will make it very simple.

>
>
>> Since this is an identity mapped system, so VA_BITS will be same as max PA
>> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
>> level of page table will be there with block descriptor entries.
>> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
>> only table entries pointing to a level 1 lookup. Level 1 will have only
>> block entries which will map 1GB block. For 64K mapping, TTBR points to
>> level 1 lookups, which will have only table entries pointing to a level 2
>> lookup. Level 2 will have only block entries which will map 512MB block. If
>
> This is more complexity to pick a VA size. Why not always use the maximum 48bit
> VA? The cost is negligible compared to having simpler (easier to review!)
> purgatory code.
>
> By always using 1GB blocks you may be creating aliases with mismatched attributes:
> * If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
>   in the same 1GB of address space. This could be a reserved region with some
>   other memory attributes.
> * With kdump, we may have failed to park the other CPUs if they are executing
>   with interrupts masked and haven't yet handled the smp_send_stop() IPI.
> * One of these other CPUs could be reading/writing in this area as it doesn't
>   belong to the kdump reserved area, just happens to be in the same 1GB.
>
> I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
> sure this is well into the "don't do that" territory.
>
>
> It would be much better to force the memory areas to be a multiple of 2MB and
> 2MB aligned, which will allow you to use 2M section mappings for memory, (but
> not the uart). This way we only map regions we had reserved and know are memory.


OK. So, 48-bit VA, 4K page size, and a 3-level page table with 3rd-level 
entries representing 2MB blocks.
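
Roughly like this (illustrative sketch only; alloc_table() and the other
names are assumptions, PMD_TYPE_TABLE/PMD_TYPE_MASK as in cache.h):

#include <stdint.h>
#include "cache.h"

#define SZ_2M			(1UL << 21)
#define TBL_ENTRIES		512
#define PGD_SHIFT		39
#define PUD_SHIFT		30
#define PMD_SHIFT		21
#define IDX(va, shift)		(((va) >> (shift)) & (TBL_ENTRIES - 1))

uint64_t *alloc_table(void);	/* assumed: returns a zeroed, 4K-aligned table */

static void map_range_2m(uint64_t *pgd, uint64_t start, uint64_t end,
			 uint64_t flags)
{
	uint64_t va, *pud, *pmd;

	for (va = start & ~(SZ_2M - 1); va < end; va += SZ_2M) {
		if (!pgd[IDX(va, PGD_SHIFT)])
			pgd[IDX(va, PGD_SHIFT)] =
				(uint64_t)alloc_table() | PMD_TYPE_TABLE;
		pud = (uint64_t *)(pgd[IDX(va, PGD_SHIFT)] & ~PMD_TYPE_MASK);

		if (!pud[IDX(va, PUD_SHIFT)])
			pud[IDX(va, PUD_SHIFT)] =
				(uint64_t)alloc_table() | PMD_TYPE_TABLE;
		pmd = (uint64_t *)(pud[IDX(va, PUD_SHIFT)] & ~PMD_TYPE_MASK);

		/* one 2MB block descriptor: output address plus attributes */
		pmd[IDX(va, PMD_SHIFT)] = va | flags;
	}
}

(flags would be e.g. MM_MMUFLAGS_NORMAL for the RAM ranges; the uart would
still need a finer-grained mapping, as you note.)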


>
>
>> UART base address and RAM addresses are not at least 1GB and 512MB apart
>> for 4K and 64K respectively, then mapping result could be unpredictable. In
>> that case we need to support one more level of granularity, but until
>> someone needs that keep it like this only.
>>
>> We can not allocate dynamic memory in purgatory. Therefore we keep page
>> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
>> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
>> points to table at next level (having block entries).  If index for RAM
>> area and UART area in first table is not same, then we will need another
>> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).
>
>
>> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
>> new file mode 100644
>> index 000000000000..bef97ef48888
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache-asm.S
>> @@ -0,0 +1,186 @@
>> +/*
>> + * Some of the routines have been copied from Linux Kernel, therefore
>> + * copying the license as well.
>> + *
>> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "cache.h"
>> +
>> +/*
>> + * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
>> + */
>> +	.macro	dcache_line_size, reg, tmp
>> +	mrs	\tmp, ctr_el0			// read CTR
>> +	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
>> +	mov	\reg, #4			// bytes per word
>> +	lsl	\reg, \reg, \tmp		// actual cache line size
>> +	.endm
>> +
>> +/*
>> + *	inval_cache_range(start, end)
>> + *	- x0 - start	- start address of region
>> + *	- x1 - end	- end address of region
>> + */
>> +.globl inval_cache_range
>> +inval_cache_range:
>> +	dcache_line_size x2, x3
>> +	sub	x3, x2, #1
>> +	tst	x1, x3				// end cache line aligned?
>> +	bic	x1, x1, x3
>> +	b.eq	1f
>> +	dc	civac, x1			// clean & invalidate D / U line
>> +1:	tst	x0, x3				// start cache line aligned?
>> +	bic	x0, x0, x3
>> +	b.eq	2f
>> +	dc	civac, x0			// clean & invalidate D / U line
>> +	b	3f
>> +2:	dc	ivac, x0			// invalidate D / U line
>> +3:	add	x0, x0, x2
>> +	cmp	x0, x1
>> +	b.lo	2b
>> +	dsb	sy
>> +	ret
>> +/*
>> + *	flush_dcache_range(start, end)
>> + *	- x0 - start	- start address of region
>> + *	- x1 - end	- end address of region
>> + *
>> + */
>> +.globl flush_dcache_range
>> +flush_dcache_range:
>> +	dcache_line_size x2, x3
>> +	sub	x3, x2, #1
>> +	bic	x0, x0, x3
>> +1:	dc	civac, x0			// clean & invalidate D line / unified line
>> +	add	x0, x0, x2
>> +	cmp	x0, x1
>> +	b.lo	1b
>> +	dsb	sy
>> +	ret
>> +
>> +/*
>> + *	invalidate_tlbs_el1()
>> + */
>> +.globl invalidate_tlbs_el1
>> +invalidate_tlbs_el1:
>> +	dsb	nshst
>> +	tlbi	vmalle1
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + *	invalidate_tlbs_el2()
>> + */
>> +.globl invalidate_tlbs_el2
>> +invalidate_tlbs_el2:
>> +	dsb	nshst
>> +	tlbi	alle2
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	get_mm_feature_reg0_val - Get information about supported MM
>> + * 	features
>> + */
>> +.globl get_mm_feature_reg0_val
>> +get_mm_feature_reg0_val:
>> +	mrs	x0, ID_AA64MMFR0_EL1
>> +	ret
>> +
>> +/*
>> + * 	get_current_el - Get information about current exception level
>> + */
>> +.globl get_current_el
>> +get_current_el:
>> +	mrs 	x0, CurrentEL
>> +	lsr	x0, x0, #2
>> +	ret
>> +
>> +/*
>> + * 	invalidate_icache - Invalidate I-cache
>> + */
>> +.globl invalidate_icache
>> +invalidate_icache:
>> +	ic	iallu
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
>> + * 	x0 - page_table - Page Table Base
>> + * 	x1 - tcr_flags - TCR Flags to be set
>> + */
>> +.globl set_mair_tcr_ttbr_sctlr_el1
>> +set_mair_tcr_ttbr_sctlr_el1:
>> +	ldr	x2, =MEMORY_ATTRIBUTES
>> +	msr	mair_el1, x2
>> +	msr	tcr_el1, x1
>> +	msr	ttbr0_el1, x0
>> +	isb
>> +	mrs	x0, sctlr_el1
>> +	ldr	x3, =SCTLR_ELx_FLAGS
>> +	orr	x0, x0, x3
>> +	msr	sctlr_el1, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
>> + * 	x0 - page_table - Page Table Base
>> + * 	x1 - tcr_flags - TCR Flags to be set
>> + */
>> +.globl set_mair_tcr_ttbr_sctlr_el2
>> +set_mair_tcr_ttbr_sctlr_el2:
>> +	ldr	x2, =MEMORY_ATTRIBUTES
>> +	msr	mair_el2, x2
>> +	msr	tcr_el2, x1
>> +	msr	ttbr0_el2, x0
>> +	isb
>> +	mrs	x0, sctlr_el2
>> +	ldr	x3, =SCTLR_ELx_FLAGS
>> +	orr	x0, x0, x3
>> +	msr	sctlr_el2, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * reset_sctlr_el1 - disables cache and mmu
>> + */
>> +.globl reset_sctlr_el1
>> +reset_sctlr_el1:
>> +	mrs	x0, sctlr_el1
>> +	bic	x0, x0, #SCTLR_ELx_C
>> +	bic	x0, x0, #SCTLR_ELx_M
>> +	msr	sctlr_el1, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * reset_sctlr_el2 - disables cache and mmu
>> + */
>> +.globl reset_sctlr_el2
>> +reset_sctlr_el2:
>> +	mrs	x0, sctlr_el2
>> +	bic	x0, x0, #SCTLR_ELx_C
>> +	bic	x0, x0, #SCTLR_ELx_M
>> +	msr	sctlr_el2, x0
>> +	isb
>> +	ret
>> diff --git a/purgatory/arch/arm64/cache.c b/purgatory/arch/arm64/cache.c
>> new file mode 100644
>> index 000000000000..3c7e058ccf11
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache.c
>> @@ -0,0 +1,330 @@
>> +/*
>> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +/* We are supporting only 4K and 64K page sizes. This code will not work if
>> + * a hardware is not supporting at least one of these page sizes.
>> + * Therefore, D-cache is disabled by default and enabled only when
>> + * "enable-dcache" is passed to the kexec().
>> + * Since this is an identity mapped system, so VA_BITS will be same as max
>> + * PA bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only
>> + * one level of page table will be there with block descriptor entries.
>> + * Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will
>> + * have only table entries pointing to a level 1 lookup. Level 1 will have
>> + * only block entries which will map 1GB block.For 64K mapping, TTBR points
>> + * to level 1 lookups, which will have only table entries pointing to a
>> + * level 2 lookup. Level 2 will have only block entries which will map
>> + * 512MB block. If UART base address and RAM addresses are not at least 1GB
>> + * and 512MB apart for 4K and 64K respectively, then mapping result could
>> + * be unpredictable. In that case we need to support one more level of
>> + * granularity, but until someone needs that keep it like this only.
>> + * We can not allocate dynamic memory in purgatory. Therefore we keep page
>> + * table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points
>> + * to first level (having only table entries) and (page_table +
>> + * MAX_PAGE_SIZE) points to table at next level (having block entries). If
>> + * index for RAM area and UART area in first table is not same, then we
>> + * will need another next level table which will be located at (page_table
>> + * + 2 * MAX_PAGE_SIZE).
>> + */
>> +
>> +#include <stdint.h>
>> +#include <string.h>
>> +#include <purgatory.h>
>> +#include "cache.h"
>> +
>> +static uint64_t page_shift;
>> +static uint64_t pgtable_level;
>> +static uint64_t va_bits;
>> +
>> +static uint64_t page_table[PAGE_TABLE_SIZE / sizeof(uint64_t)] __attribute__ ((aligned (MAX_PAGE_SIZE))) = { };
>> +static uint64_t page_table_used;
>> +
>> +#define PAGE_SIZE	(1 << page_shift)
>> +/*
>> + *	is_4k_page_supported - return true if 4k page is supported else
>> + *	false
>> + */
>> +static int is_4k_page_supported(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN4_MASK) ==
>> +			ID_AA64MMFR0_TGRAN4_SUPPORTED);
>> +}
>> +
>> +/*
>> + *	is_64k_page_supported - return true if 64k page is supported else
>> + *	false
>> + */
>> +static int is_64k_page_supported(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN64_MASK) ==
>> +			ID_AA64MMFR0_TGRAN64_SUPPORTED);
>> +}
>> +
>> +/*
>> + *	get_ips_bits - return supported IPS bits
>> + */
>> +static uint64_t get_ips_bits(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
>> +			ID_AA64MMFR0_PARANGE_SHIFT);
>> +}
>> +
>> +/*
>> + *	get_va_bits - return supported VA bits (For identity mapping VA = PA)
>> + */
>> +static uint64_t get_va_bits(void)
>> +{
>> +	uint64_t ips = get_ips_bits();
>> +
>> +	switch(ips) {
>> +	case ID_AA64MMFR0_PARANGE_48:
>> +		return 48;
>> +	case ID_AA64MMFR0_PARANGE_44:
>> +		return 44;
>> +	case ID_AA64MMFR0_PARANGE_42:
>> +		return 42;
>> +	case ID_AA64MMFR0_PARANGE_40:
>> +		return 40;
>> +	case ID_AA64MMFR0_PARANGE_36:
>> +		return 36;
>> +	default:
>> +		return 32;
>> +	}
>> +}
>> +
>> +/*
>> + *	get_section_shift - get block shift for supported page size
>> + */
>> +static uint64_t get_section_shift(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 29;
>> +	else if(page_shift == 12)
>> +		return 30;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	get_section_mask - get section mask for supported page size
>> + */
>> +static uint64_t get_section_mask(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 0x1FFF;
>> +	else if(page_shift == 12)
>> +		return 0x1FF;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	get_pgdir_shift - get pgdir shift for supported page size
>> + */
>> +static uint64_t get_pgdir_shift(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 42;
>> +	else if(page_shift == 12)
>> +		return 39;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	init_page_table - Initializes page table locations
>> + */
>> +
>> +static void init_page_table(void)
>> +{
>> +	/*
>> +	 * Invalidate the page tables to avoid potential dirty cache lines
>> +	 * being evicted.
>> +	 */
>
> How do these lines get dirty? arm64_relocate_new_kernel() invalidated these
> pages to PoC before it copied the data. If they were speculatively fetched (I
> don't know the rules of when/how that happens) they may be wrong, but will be
> clean and not written back. If we change them in purgatory, you invalidate again
> from enable_mmu_dcache(). I don't think this is needed.
>

Had taken it from the kernel's arch/arm64/kernel/head.S.

But anyway, since this part will go to the kexec code as you suggested, we 
will not need it there for sure.

>
>> +	inval_cache_range((uint64_t)page_table,
>> +			(uint64_t)page_table + PAGE_TABLE_SIZE);
>> +	memset(page_table, 0, PAGE_TABLE_SIZE);
>> +}
>> +/*
>> + *	create_identity_mapping(start, end, flags)
>> + *	start		- start address
>> + *	end		- end address
>> + *	flags 		- MMU Flags for Normal or Device type memory
>> + */
>> +static void create_identity_mapping(uint64_t start, uint64_t end,
>> +					uint64_t flags)
>> +{
>> +	uint32_t sec_shift, pgdir_shift, sec_mask;
>> +	uint64_t desc, s1, e1, s2, e2;
>> +	uint64_t *table2;
>> +
>> +	s1 = start;
>> +	e1 = end - 1;
>> +
>> +	sec_shift = get_section_shift();
>> +	if (pgtable_level == 1) {
>> +		s1 >>= sec_shift;
>> +		e1 >>= sec_shift;
>> +		do {
>> +			desc = s1 << sec_shift;
>> +			desc |= flags;
>> +			page_table[s1] = desc;
>> +			s1++;
>> +		} while (s1 <= e1);
>> +	} else {
>> +		pgdir_shift = get_pgdir_shift();
>> +		sec_mask = get_section_mask();
>> +		s1 >>= pgdir_shift;
>> +		e1 >>= pgdir_shift;
>> +		do {
>> +			/*
>> +			 * If there is no table entry then write a new
>> +			 * entry else, use old entry
>> +			 */
>> +			if (!page_table[s1]) {
>> +				table2 = &page_table[(++page_table_used *
>> +						MAX_PAGE_SIZE) /
>> +						sizeof(uint64_t)];
>> +				desc = (uint64_t)table2 | PMD_TYPE_TABLE;
>> +				page_table[s1] = desc;
>> +			} else {
>> +				table2 = (uint64_t *)(page_table[s1] &
>> +						~PMD_TYPE_MASK);
>> +			}
>> +			s1++;
>> +			s2 = start >> sec_shift;
>> +			s2 &= sec_mask;
>> +			e2 = (end - 1) >> sec_shift;
>> +			e2 &= sec_mask;
>> +			do {
>> +				desc = s2 << sec_shift;
>> +				desc |= flags;
>> +				table2[s2] = desc;
>> +				s2++;
>> +			} while (s2 <= e2);
>> +		} while (s1 <= e1);
>> +	}
>> +}
>
> (I will need to come back to this ... it looks pretty complicated. If you mimic
> Linux's p?d/pte macros it will be more familiar and easier to read.)

Ok, will try to take definitions from head.S, as far as possible.
>
>
>> +
>> +/*
>> + *	enable_mmu_dcache: Enable mmu and D-cache in sctlr_el1
>> + */
>> +static void enable_mmu_dcache(void)
>> +{
>> +	uint64_t tcr_flags = TCR_FLAGS | TCR_T0SZ(va_bits);
>> +
>> +	switch(page_shift) {
>> +	case 16:
>> +		tcr_flags |= TCR_TG0_64K;
>> +		break;
>> +	case 12:
>> +		tcr_flags |= TCR_TG0_4K;
>> +		break;
>> +	default:
>> +		printf("page shift not supported\n");
>> +		return;
>> +	}
>> +	/*
>> +	 * Since the page tables have been populated with non-cacheable
>> +	 * accesses (MMU disabled), invalidate the page tables to remove
>> +	 * any speculatively loaded cache lines.
>> +	 */
>> +	inval_cache_range((uint64_t)page_table,
>> +				(uint64_t)page_table + PAGE_TABLE_SIZE);
>> +
>> +	switch(get_current_el()) {
>> +	case 2:
>> +		invalidate_tlbs_el2();
>> +		tcr_flags |= (get_ips_bits() << TCR_PS_EL2_SHIFT);
>> +		set_mair_tcr_ttbr_sctlr_el2((uint64_t)page_table, tcr_flags);
>> +		break;
>> +	case 1:
>> +		invalidate_tlbs_el1();
>> +		tcr_flags |= (get_ips_bits() << TCR_IPS_EL1_SHIFT);
>> +		set_mair_tcr_ttbr_sctlr_el1((uint64_t)page_table, tcr_flags);
>> +		break;
>> +	default:
>> +		return;
>> +	}
>
>> +	invalidate_icache();
>
> What is this protecting against? We have executed instructions between here and
> setting the I+M bits in set_mair_tcr_ttbr_sctlr_el1(). (so it may be too late)
>
> arm64_relocate_new_kernel() already did 'ic iallu' before it branched into the
> purgatory code. No executable code has been changed or moved since then, so I
> don't think this is necessary.

OK.

>
>
>> +}
>> +
>> +/*
>> + *	enable_dcache: Enable D-cache and set appropriate attributes
>> + *	ram_start - Start address of RAM
>> + *	ram_end - End address of RAM
>> + *	uart_base - Base address of uart
>> + */
>> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base)
>> +{
>> +	va_bits = get_va_bits();
>> +
>> +	page_table_used = 0;
>> +	if (is_64k_page_supported()) {
>> +		page_shift = 16;
>> +		if (va_bits <= 42)
>> +			pgtable_level = 1;
>> +		else
>> +			pgtable_level = 2;
>> +	} else if (is_4k_page_supported()) {
>> +		page_shift = 12;
>> +		if (va_bits <= 39)
>> +			pgtable_level = 1;
>> +		else
>> +			pgtable_level = 2;
>> +	} else {
>> +		printf("Valid Page Granule not supported by hardware\n");
>> +		return -1;
>> +	}
>> +	init_page_table();
>> +	create_identity_mapping(ram_start, ram_end, MM_MMUFLAGS_NORMAL);
>> +	printf("Normal identity mapping created from %lx to %lx\n",
>> +			ram_start, ram_end);
>> +	if (uart_base) {
>> +		create_identity_mapping((uint64_t)uart_base,
>> +					(uint64_t)uart_base + PAGE_SIZE,
>> +					MM_MMUFLAGS_DEVICE);
>> +		printf("Device identity mapping created from %lx to %lx\n",
>> +				(uint64_t)uart_base,
>> +				(uint64_t)uart_base + PAGE_SIZE);
>> +	}
>> +	enable_mmu_dcache();
>> +	printf("Cache Enabled\n");
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + *	disable_dcache: Disable D-cache and flush RAM locations
>> + *	ram_start - Start address of RAM
>> + *	ram_end - End address of RAM
>> + */
>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
>> +{
>> +	switch(get_current_el()) {
>> +	case 2:
>> +		reset_sctlr_el2();
>> +		break;
>> +	case 1:
>> +		reset_sctlr_el1();
>
> You have C code running between disabling the MMU and cleaning the cache. The
> compiler is allowed to move data on and off the stack in here, but after
> disabling the MMU it will see whatever was on the stack before we turned the MMU
> on. Any data written at the beginning of this function is left in the caches.
>
> I'm afraid this sort of stuff needs to be done in assembly!

All these routines are hand-coded in assembly even though they are called
from C, so they should be safe, I think. Anyway, I can keep all of them in
assembly as well.


>
>
>> +		break;
>> +	default:
>> +		return;
>> +	}
>> +	invalidate_icache();
>> +	flush_dcache_range(ram_start, ram_end);
>> +	printf("Cache Disabled\n");
>> +}
>> diff --git a/purgatory/arch/arm64/cache.h b/purgatory/arch/arm64/cache.h
>> new file mode 100644
>> index 000000000000..c988020566e3
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache.h
>> @@ -0,0 +1,79 @@
>> +#ifndef	__CACHE_H__
>> +#define __CACHE_H__
>> +
>> +#define MT_DEVICE_NGNRNE	0
>> +#define MT_DEVICE_NGNRE		1
>> +#define MT_DEVICE_GRE		2
>> +#define MT_NORMAL_NC		3
>> +#define MT_NORMAL		4
>
> You only use two of these. I guess this is so the MAIR value matches the kernel?

OK, I can remove the others. Yes, they match the kernel's values.

>
>
>> +
>> +#ifndef __ASSEMBLER__
>> +
>> +#define MAX_PAGE_SIZE		0x10000
>> +#define PAGE_TABLE_SIZE		(3 * MAX_PAGE_SIZE)
>> +#define ID_AA64MMFR0_TGRAN64_SHIFT	24
>> +#define ID_AA64MMFR0_TGRAN4_SHIFT	28
>> +#define ID_AA64MMFR0_TGRAN64_MASK	(0xFUL << ID_AA64MMFR0_TGRAN64_SHIFT)
>> +#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
>> +#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
>> +#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
>> +#define ID_AA64MMFR0_PARANGE_SHIFT	0
>> +#define ID_AA64MMFR0_PARANGE_MASK	(0xFUL << ID_AA64MMFR0_PARANGE_SHIFT)
>> +#define ID_AA64MMFR0_PARANGE_48		0x5
>> +#define ID_AA64MMFR0_PARANGE_44		0x4
>> +#define ID_AA64MMFR0_PARANGE_42		0x3
>> +#define ID_AA64MMFR0_PARANGE_40		0x2
>> +#define ID_AA64MMFR0_PARANGE_36		0x1
>> +#define ID_AA64MMFR0_PARANGE_32		0x0
>> +
>> +#define TCR_TG0_64K 		(1UL << 14)
>> +#define TCR_TG0_4K 		(0UL << 14)
>> +#define TCR_SHARED_NONE		(0UL << 12)
>> +#define TCR_ORGN_WBWA		(1UL << 10)
>> +#define TCR_IRGN_WBWA		(1UL << 8)
>> +#define TCR_IPS_EL1_SHIFT	32
>> +#define TCR_PS_EL2_SHIFT	16
>> +#define TCR_T0SZ(x)		((unsigned long)(64 - (x)) << 0)
>> +#define TCR_FLAGS (TCR_SHARED_NONE | TCR_ORGN_WBWA | TCR_IRGN_WBWA)
>> +
>> +#define PMD_TYPE_SECT		(1UL << 0)
>> +#define PMD_TYPE_TABLE		(3UL << 0)
>> +#define PMD_TYPE_MASK		0x3
>> +#define PMD_SECT_AF		(1UL << 10)
>> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
>> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
>> +#define PMD_SECT_PXN		(1UL << 53)
>> +#define PMD_SECT_UXN		(1UL << 54)
>> +#define PMD_FLAGS_DEVICE	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_PXN | PMD_SECT_UXN)
>> +#define MM_MMUFLAGS_NORMAL	PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL
>> +#define MM_MMUFLAGS_DEVICE	PMD_ATTRINDX(MT_DEVICE_NGNRE) | PMD_FLAGS_DEVICE
>> +
>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end);
>> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base);
>> +uint64_t get_mm_feature_reg0_val(void);
>> +void inval_cache_range(uint64_t start, uint64_t end);
>> +void flush_dcache_range(uint64_t start, uint64_t end);
>> +uint64_t get_current_el(void);
>> +void set_mair_tcr_ttbr_sctlr_el1(uint64_t page_table, uint64_t tcr_flags);
>> +void set_mair_tcr_ttbr_sctlr_el2(uint64_t page_table, uint64_t tcr_flags);
>> +void invalidate_tlbs_el1(void);
>> +void invalidate_tlbs_el2(void);
>> +void invalidate_icache(void);
>> +void reset_sctlr_el1(void);
>> +void reset_sctlr_el2(void);
>> +#else
>> +#define MEMORY_ATTRIBUTES	((0x00 << (MT_DEVICE_NGNRNE*8)) | \
>> +				(0x04 << (MT_DEVICE_NGNRE*8)) | \
>> +				(0x0C << (MT_DEVICE_GRE*8)) | \
>> +				(0x44 << (MT_NORMAL_NC*8)) | \
>> +				(0xFF << (MT_NORMAL*8)))
>
> Again, you only use two of these.

OK, will remove others.

>
>
>> +/* Common SCTLR_ELx flags. */
>> +#define SCTLR_ELx_I		(1 << 12)
>> +#define SCTLR_ELx_C		(1 << 2)
>> +#define SCTLR_ELx_M		(1 << 0)
>> +
>> +#define SCTLR_ELx_FLAGS (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
>> +
>> +#endif
>> +#endif
>>

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14  9:38       ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14  9:38 UTC (permalink / raw)
  To: James Morse; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel

Hi James,

Thanks a lot for your review. It's helpful.

On Saturday 26 November 2016 12:00 AM, James Morse wrote:
> Hi Pratyush,
>
> (CC: Mark, mismatched memory attributes in paragraph 3?)
>
> On 22/11/16 04:32, Pratyush Anand wrote:
>> This patch adds support to enable/disable d-cache, which can be used for
>> faster purgatory sha256 verification.
>
> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>  the thread)
>
>
>> We are supporting only 4K and 64K page sizes. This code will not work if a
>> hardware is not supporting at least one of these page sizes.  Therefore,
>> D-cache is disabled by default and enabled only when "enable-dcache" is
>> passed to the kexec().
>
> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
> a lot simpler to only support one page size, which should be 4K as that is what
> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)

Ok.. So, I will implement a new version assuming that 4K will always be 
supported. If 4K is not supported by the hardware (which is very 
unlikely) then there would be no d-cache enabling feature.

>
> I would go as far as to generate the page tables at 'kexec -l' time, and only if

Ok.. So you mean that I create a new section which will have page table 
entries mapping the physical memory represented by the remaining sections, 
and then purgatory can just enable the MMU with the page table from that 
section, right? Seems doable; can do that.

> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
> support 4K pages). This would keep the purgatory code as simple as possible.

What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That 
can also tell us whether 4K is supported or not.

>
> I don't think the performance difference between 4K and 64K page sizes will be
> measurable, is purgatory really performance sensitive code?

I agree, implementing only 4K will make it very simple.

>
>
>> Since this is an identity mapped system, so VA_BITS will be same as max PA
>> bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only one
>> level of page table will be there with block descriptor entries.
>> Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will have
>> only table entries pointing to a level 1 lookup. Level 1 will have only
>> block entries which will map 1GB block. For 64K mapping, TTBR points to
>> level 1 lookups, which will have only table entries pointing to a level 2
>> lookup. Level 2 will have only block entries which will map 512MB block. If
>
> This is more complexity to pick a VA size. Why not always use the maximum 48bit
> VA? The cost is negligible compared to having simpler (easier to review!)
> purgatory code.
>
> By always using 1GB blocks you may be creating aliases with mismatched attributes:
> * If kdump only reserves 128MB, your 1GB mapping will alias whatever else was
>   in the same 1GB of address space. This could be a reserved region with some
>   other memory attributes.
> * With kdump, we may have failed to park the other CPUs if they are executing
>   with interrupts masked and haven't yet handled the smp_send_stop() IPI.
> * One of these other CPUs could be reading/writing in this area as it doesn't
>   belong to the kdump reserved area, just happens to be in the same 1GB.
>
> I need to dig through the ARM-ARM to find out what happens next, but I'm pretty
> sure this is well into the "don't do that" territory.
>
>
> It would be much better to force the memory areas to be a multiple of 2MB and
> 2MB aligned, which will allow you to use 2M section mappings for memory, (but
> not the uart). This way we only map regions we had reserved and know are memory.


OK. So, 48-bit VA, 4K page size, and a 3-level page table with 3rd-level 
entries representing 2MB blocks.


>
>
>> UART base address and RAM addresses are not at least 1GB and 512MB apart
>> for 4K and 64K respectively, then mapping result could be unpredictable. In
>> that case we need to support one more level of granularity, but until
>> someone needs that keep it like this only.
>>
>> We can not allocate dynamic memory in purgatory. Therefore we keep page
>> table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points to
>> first level (having only table entries) and (page_table + MAX_PAGE_SIZE)
>> points to table at next level (having block entries).  If index for RAM
>> area and UART area in first table is not same, then we will need another
>> next level table which will be located at (page_table + 2 * MAX_PAGE_SIZE).
>
>
>> diff --git a/purgatory/arch/arm64/cache-asm.S b/purgatory/arch/arm64/cache-asm.S
>> new file mode 100644
>> index 000000000000..bef97ef48888
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache-asm.S
>> @@ -0,0 +1,186 @@
>> +/*
>> + * Some of the routines have been copied from Linux Kernel, therefore
>> + * copying the license as well.
>> + *
>> + * Copyright (C) 2001 Deep Blue Solutions Ltd.
>> + * Copyright (C) 2012 ARM Ltd.
>> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +#include "cache.h"
>> +
>> +/*
>> + * 	dcache_line_size - get the minimum D-cache line size from the CTR register.
>> + */
>> +	.macro	dcache_line_size, reg, tmp
>> +	mrs	\tmp, ctr_el0			// read CTR
>> +	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
>> +	mov	\reg, #4			// bytes per word
>> +	lsl	\reg, \reg, \tmp		// actual cache line size
>> +	.endm
>> +
>> +/*
>> + *	inval_cache_range(start, end)
>> + *	- x0 - start	- start address of region
>> + *	- x1 - end	- end address of region
>> + */
>> +.globl inval_cache_range
>> +inval_cache_range:
>> +	dcache_line_size x2, x3
>> +	sub	x3, x2, #1
>> +	tst	x1, x3				// end cache line aligned?
>> +	bic	x1, x1, x3
>> +	b.eq	1f
>> +	dc	civac, x1			// clean & invalidate D / U line
>> +1:	tst	x0, x3				// start cache line aligned?
>> +	bic	x0, x0, x3
>> +	b.eq	2f
>> +	dc	civac, x0			// clean & invalidate D / U line
>> +	b	3f
>> +2:	dc	ivac, x0			// invalidate D / U line
>> +3:	add	x0, x0, x2
>> +	cmp	x0, x1
>> +	b.lo	2b
>> +	dsb	sy
>> +	ret
>> +/*
>> + *	flush_dcache_range(start, end)
>> + *	- x0 - start	- start address of region
>> + *	- x1 - end	- end address of region
>> + *
>> + */
>> +.globl flush_dcache_range
>> +flush_dcache_range:
>> +	dcache_line_size x2, x3
>> +	sub	x3, x2, #1
>> +	bic	x0, x0, x3
>> +1:	dc	civac, x0			// clean & invalidate D line / unified line
>> +	add	x0, x0, x2
>> +	cmp	x0, x1
>> +	b.lo	1b
>> +	dsb	sy
>> +	ret
>> +
>> +/*
>> + *	invalidate_tlbs_el1()
>> + */
>> +.globl invalidate_tlbs_el1
>> +invalidate_tlbs_el1:
>> +	dsb	nshst
>> +	tlbi	vmalle1
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + *	invalidate_tlbs_el2()
>> + */
>> +.globl invalidate_tlbs_el2
>> +invalidate_tlbs_el2:
>> +	dsb	nshst
>> +	tlbi	alle2
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	get_mm_feature_reg0_val - Get information about supported MM
>> + * 	features
>> + */
>> +.globl get_mm_feature_reg0_val
>> +get_mm_feature_reg0_val:
>> +	mrs	x0, ID_AA64MMFR0_EL1
>> +	ret
>> +
>> +/*
>> + * 	get_current_el - Get information about current exception level
>> + */
>> +.globl get_current_el
>> +get_current_el:
>> +	mrs 	x0, CurrentEL
>> +	lsr	x0, x0, #2
>> +	ret
>> +
>> +/*
>> + * 	invalidate_icache - Invalidate I-cache
>> + */
>> +.globl invalidate_icache
>> +invalidate_icache:
>> +	ic	iallu
>> +	dsb	nsh
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	set_mair_tcr_ttbr_sctlr_el1(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
>> + * 	x0 - page_table - Page Table Base
>> + * 	x1 - tcr_flags - TCR Flags to be set
>> + */
>> +.globl set_mair_tcr_ttbr_sctlr_el1
>> +set_mair_tcr_ttbr_sctlr_el1:
>> +	ldr	x2, =MEMORY_ATTRIBUTES
>> +	msr	mair_el1, x2
>> +	msr	tcr_el1, x1
>> +	msr	ttbr0_el1, x0
>> +	isb
>> +	mrs	x0, sctlr_el1
>> +	ldr	x3, =SCTLR_ELx_FLAGS
>> +	orr	x0, x0, x3
>> +	msr	sctlr_el1, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * 	set_mair_tcr_ttbr_sctlr_el2(page_table, tcr_flags) - sets MAIR, TCR , TTBR and SCTLR registers
>> + * 	x0 - page_table - Page Table Base
>> + * 	x1 - tcr_flags - TCR Flags to be set
>> + */
>> +.globl set_mair_tcr_ttbr_sctlr_el2
>> +set_mair_tcr_ttbr_sctlr_el2:
>> +	ldr	x2, =MEMORY_ATTRIBUTES
>> +	msr	mair_el2, x2
>> +	msr	tcr_el2, x1
>> +	msr	ttbr0_el2, x0
>> +	isb
>> +	mrs	x0, sctlr_el2
>> +	ldr	x3, =SCTLR_ELx_FLAGS
>> +	orr	x0, x0, x3
>> +	msr	sctlr_el2, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * reset_sctlr_el1 - disables cache and mmu
>> + */
>> +.globl reset_sctlr_el1
>> +reset_sctlr_el1:
>> +	mrs	x0, sctlr_el1
>> +	bic	x0, x0, #SCTLR_ELx_C
>> +	bic	x0, x0, #SCTLR_ELx_M
>> +	msr	sctlr_el1, x0
>> +	isb
>> +	ret
>> +
>> +/*
>> + * reset_sctlr_el2 - disables cache and mmu
>> + */
>> +.globl reset_sctlr_el2
>> +reset_sctlr_el2:
>> +	mrs	x0, sctlr_el2
>> +	bic	x0, x0, #SCTLR_ELx_C
>> +	bic	x0, x0, #SCTLR_ELx_M
>> +	msr	sctlr_el2, x0
>> +	isb
>> +	ret
>> diff --git a/purgatory/arch/arm64/cache.c b/purgatory/arch/arm64/cache.c
>> new file mode 100644
>> index 000000000000..3c7e058ccf11
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache.c
>> @@ -0,0 +1,330 @@
>> +/*
>> + * Copyright (C) 2015 Pratyush Anand <panand@redhat.com>
>> + *
>> + * This program is free software; you can redistribute it and/or modify
>> + * it under the terms of the GNU General Public License version 2 as
>> + * published by the Free Software Foundation.
>> + *
>> + * This program is distributed in the hope that it will be useful,
>> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
>> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
>> + * GNU General Public License for more details.
>> + *
>> + * You should have received a copy of the GNU General Public License
>> + * along with this program. If not, see <http://www.gnu.org/licenses/>.
>> + */
>> +
>> +/* We are supporting only 4K and 64K page sizes. This code will not work if
>> + * a hardware is not supporting at least one of these page sizes.
>> + * Therefore, D-cache is disabled by default and enabled only when
>> + * "enable-dcache" is passed to the kexec().
>> + * Since this is an identity mapped system, so VA_BITS will be same as max
>> + * PA bits supported. If VA_BITS <= 42 for 64K and <= 39 for 4K then only
>> + * one level of page table will be there with block descriptor entries.
>> + * Otherwise, For 4K mapping, TTBR points to level 0 lookups, which will
>> + * have only table entries pointing to a level 1 lookup. Level 1 will have
>> + * only block entries which will map 1GB block.For 64K mapping, TTBR points
>> + * to level 1 lookups, which will have only table entries pointing to a
>> + * level 2 lookup. Level 2 will have only block entries which will map
>> + * 512MB block. If UART base address and RAM addresses are not at least 1GB
>> + * and 512MB apart for 4K and 64K respectively, then mapping result could
>> + * be unpredictable. In that case we need to support one more level of
>> + * granularity, but until someone needs that keep it like this only.
>> + * We can not allocate dynamic memory in purgatory. Therefore we keep page
>> + * table allocation size fixed as (3 * MAX_PAGE_SIZE). (page_table) points
>> + * to first level (having only table entries) and (page_table +
>> + * MAX_PAGE_SIZE) points to table at next level (having block entries). If
>> + * index for RAM area and UART area in first table is not same, then we
>> + * will need another next level table which will be located at (page_table
>> + * + 2 * MAX_PAGE_SIZE).
>> + */
>> +
>> +#include <stdint.h>
>> +#include <string.h>
>> +#include <purgatory.h>
>> +#include "cache.h"
>> +
>> +static uint64_t page_shift;
>> +static uint64_t pgtable_level;
>> +static uint64_t va_bits;
>> +
>> +static uint64_t page_table[PAGE_TABLE_SIZE / sizeof(uint64_t)] __attribute__ ((aligned (MAX_PAGE_SIZE))) = { };
>> +static uint64_t page_table_used;
>> +
>> +#define PAGE_SIZE	(1 << page_shift)
>> +/*
>> + *	is_4k_page_supported - return true if 4k page is supported else
>> + *	false
>> + */
>> +static int is_4k_page_supported(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN4_MASK) ==
>> +			ID_AA64MMFR0_TGRAN4_SUPPORTED);
>> +}
>> +
>> +/*
>> + *	is_64k_page_supported - return true if 64k page is supported else
>> + *	false
>> + */
>> +static int is_64k_page_supported(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_TGRAN64_MASK) ==
>> +			ID_AA64MMFR0_TGRAN64_SUPPORTED);
>> +}
>> +
>> +/*
>> + *	get_ips_bits - return supported IPS bits
>> + */
>> +static uint64_t get_ips_bits(void)
>> +{
>> +	return ((get_mm_feature_reg0_val() & ID_AA64MMFR0_PARANGE_MASK) >>
>> +			ID_AA64MMFR0_PARANGE_SHIFT);
>> +}
>> +
>> +/*
>> + *	get_va_bits - return supported VA bits (For identity mapping VA = PA)
>> + */
>> +static uint64_t get_va_bits(void)
>> +{
>> +	uint64_t ips = get_ips_bits();
>> +
>> +	switch(ips) {
>> +	case ID_AA64MMFR0_PARANGE_48:
>> +		return 48;
>> +	case ID_AA64MMFR0_PARANGE_44:
>> +		return 44;
>> +	case ID_AA64MMFR0_PARANGE_42:
>> +		return 42;
>> +	case ID_AA64MMFR0_PARANGE_40:
>> +		return 40;
>> +	case ID_AA64MMFR0_PARANGE_36:
>> +		return 36;
>> +	default:
>> +		return 32;
>> +	}
>> +}
>> +
>> +/*
>> + *	get_section_shift - get block shift for supported page size
>> + */
>> +static uint64_t get_section_shift(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 29;
>> +	else if(page_shift == 12)
>> +		return 30;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	get_section_mask - get section mask for supported page size
>> + */
>> +static uint64_t get_section_mask(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 0x1FFF;
>> +	else if(page_shift == 12)
>> +		return 0x1FF;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	get_pgdir_shift - get pgdir shift for supported page size
>> + */
>> +static uint64_t get_pgdir_shift(void)
>> +{
>> +	if (page_shift == 16)
>> +		return 42;
>> +	else if(page_shift == 12)
>> +		return 39;
>> +	else
>> +		return 0;
>> +}
>> +
>> +/*
>> + *	init_page_table - Initializes page table locations
>> + */
>> +
>> +static void init_page_table(void)
>> +{
>> +	/*
>> +	 * Invalidate the page tables to avoid potential dirty cache lines
>> +	 * being evicted.
>> +	 */
>
> How do these lines get dirty? arm64_relocate_new_kernel() invalidated these
> pages to PoC before it copied the data. If they were speculatively fetched (I
> don't know the rules of when/how that happens) they may be wrong, but will be
> clean and not written back. If we change them in purgatory, you invalidate again
> from enable_mmu_dcache(). I don't think this is needed.
>

Had taken it from the kernel's arch/arm64/kernel/head.S.

But anyway, since this part will go to the kexec code as you suggested, we 
will not need it there for sure.

>
>> +	inval_cache_range((uint64_t)page_table,
>> +			(uint64_t)page_table + PAGE_TABLE_SIZE);
>> +	memset(page_table, 0, PAGE_TABLE_SIZE);
>> +}
>> +/*
>> + *	create_identity_mapping(start, end, flags)
>> + *	start		- start address
>> + *	end		- end address
>> + *	flags 		- MMU Flags for Normal or Device type memory
>> + */
>> +static void create_identity_mapping(uint64_t start, uint64_t end,
>> +					uint64_t flags)
>> +{
>> +	uint32_t sec_shift, pgdir_shift, sec_mask;
>> +	uint64_t desc, s1, e1, s2, e2;
>> +	uint64_t *table2;
>> +
>> +	s1 = start;
>> +	e1 = end - 1;
>> +
>> +	sec_shift = get_section_shift();
>> +	if (pgtable_level == 1) {
>> +		s1 >>= sec_shift;
>> +		e1 >>= sec_shift;
>> +		do {
>> +			desc = s1 << sec_shift;
>> +			desc |= flags;
>> +			page_table[s1] = desc;
>> +			s1++;
>> +		} while (s1 <= e1);
>> +	} else {
>> +		pgdir_shift = get_pgdir_shift();
>> +		sec_mask = get_section_mask();
>> +		s1 >>= pgdir_shift;
>> +		e1 >>= pgdir_shift;
>> +		do {
>> +			/*
>> +			 * If there is no table entry then write a new
>> +			 * entry else, use old entry
>> +			 */
>> +			if (!page_table[s1]) {
>> +				table2 = &page_table[(++page_table_used *
>> +						MAX_PAGE_SIZE) /
>> +						sizeof(uint64_t)];
>> +				desc = (uint64_t)table2 | PMD_TYPE_TABLE;
>> +				page_table[s1] = desc;
>> +			} else {
>> +				table2 = (uint64_t *)(page_table[s1] &
>> +						~PMD_TYPE_MASK);
>> +			}
>> +			s1++;
>> +			s2 = start >> sec_shift;
>> +			s2 &= sec_mask;
>> +			e2 = (end - 1) >> sec_shift;
>> +			e2 &= sec_mask;
>> +			do {
>> +				desc = s2 << sec_shift;
>> +				desc |= flags;
>> +				table2[s2] = desc;
>> +				s2++;
>> +			} while (s2 <= e2);
>> +		} while (s1 <= e1);
>> +	}
>> +}
>
> (I will need to come back to this ... it looks pretty complicated. If you mimic
> Linux's p?d/pte macros it will be more familiar and easier to read.)

Ok, will try to take definitions from head.S, as far as possible.
>
>
>> +
>> +/*
>> + *	enable_mmu_dcache: Enable mmu and D-cache in sctlr_el1
>> + */
>> +static void enable_mmu_dcache(void)
>> +{
>> +	uint64_t tcr_flags = TCR_FLAGS | TCR_T0SZ(va_bits);
>> +
>> +	switch(page_shift) {
>> +	case 16:
>> +		tcr_flags |= TCR_TG0_64K;
>> +		break;
>> +	case 12:
>> +		tcr_flags |= TCR_TG0_4K;
>> +		break;
>> +	default:
>> +		printf("page shift not supported\n");
>> +		return;
>> +	}
>> +	/*
>> +	 * Since the page tables have been populated with non-cacheable
>> +	 * accesses (MMU disabled), invalidate the page tables to remove
>> +	 * any speculatively loaded cache lines.
>> +	 */
>> +	inval_cache_range((uint64_t)page_table,
>> +				(uint64_t)page_table + PAGE_TABLE_SIZE);
>> +
>> +	switch(get_current_el()) {
>> +	case 2:
>> +		invalidate_tlbs_el2();
>> +		tcr_flags |= (get_ips_bits() << TCR_PS_EL2_SHIFT);
>> +		set_mair_tcr_ttbr_sctlr_el2((uint64_t)page_table, tcr_flags);
>> +		break;
>> +	case 1:
>> +		invalidate_tlbs_el1();
>> +		tcr_flags |= (get_ips_bits() << TCR_IPS_EL1_SHIFT);
>> +		set_mair_tcr_ttbr_sctlr_el1((uint64_t)page_table, tcr_flags);
>> +		break;
>> +	default:
>> +		return;
>> +	}
>
>> +	invalidate_icache();
>
> What is this protecting against? We have executed instructions between here and
> setting the I+M bits in set_mair_tcr_ttbr_sctlr_el1(). (so it may be too late)
>
> arm64_relocate_new_kernel() already did 'ic iallu' before it branched into the
> purgatory code. No executable code has been changed or moved since then, so I
> don't think this is necessary.

OK.

>
>
>> +}
>> +
>> +/*
>> + *	enable_dcache: Enable D-cache and set appropriate attributes
>> + *	ram_start - Start address of RAM
>> + *	ram_end - End address of RAM
>> + *	uart_base - Base address of uart
>> + */
>> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base)
>> +{
>> +	va_bits = get_va_bits();
>> +
>> +	page_table_used = 0;
>> +	if (is_64k_page_supported()) {
>> +		page_shift = 16;
>> +		if (va_bits <= 42)
>> +			pgtable_level = 1;
>> +		else
>> +			pgtable_level = 2;
>> +	} else if (is_4k_page_supported()) {
>> +		page_shift = 12;
>> +		if (va_bits <= 39)
>> +			pgtable_level = 1;
>> +		else
>> +			pgtable_level = 2;
>> +	} else {
>> +		printf("Valid Page Granule not supported by hardware\n");
>> +		return -1;
>> +	}
>> +	init_page_table();
>> +	create_identity_mapping(ram_start, ram_end, MM_MMUFLAGS_NORMAL);
>> +	printf("Normal identity mapping created from %lx to %lx\n",
>> +			ram_start, ram_end);
>> +	if (uart_base) {
>> +		create_identity_mapping((uint64_t)uart_base,
>> +					(uint64_t)uart_base + PAGE_SIZE,
>> +					MM_MMUFLAGS_DEVICE);
>> +		printf("Device identity mapping created from %lx to %lx\n",
>> +				(uint64_t)uart_base,
>> +				(uint64_t)uart_base + PAGE_SIZE);
>> +	}
>> +	enable_mmu_dcache();
>> +	printf("Cache Enabled\n");
>> +
>> +	return 0;
>> +}
>> +
>> +/*
>> + *	disable_dcache: Disable D-cache and flush RAM locations
>> + *	ram_start - Start address of RAM
>> + *	ram_end - End address of RAM
>> + */
>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
>> +{
>> +	switch(get_current_el()) {
>> +	case 2:
>> +		reset_sctlr_el2();
>> +		break;
>> +	case 1:
>> +		reset_sctlr_el1();
>
> You have C code running between disabling the MMU and cleaning the cache. The
> compiler is allowed to move data on and off the stack in here, but after
> disabling the MMU it will see whatever was on the stack before we turned the MMU
> on. Any data written at the beginning of this function is left in the caches.
>
> I'm afraid this sort of stuff needs to be done in assembly!

All these routines are hand-coded in assembly even though they are called
from C, so they should be safe, I think. Anyway, I can keep all of them in
assembly as well.
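
Just as a sketch of that direction (the register usage, fixed 64-byte cache
line size and label are assumptions, not the posted cache-asm.S), a combined
EL1 routine could look like:

	/* x0 = ram_start, x1 = ram_end; no C runs after the MMU goes off. */
	.globl	disable_mmu_dcache_el1
disable_mmu_dcache_el1:
	mrs	x2, sctlr_el1
	bic	x2, x2, #(1 << 0)		/* clear M: MMU off */
	bic	x2, x2, #(1 << 2)		/* clear C: D-cache off */
	msr	sctlr_el1, x2
	isb
1:	dc	civac, x0			/* clean+invalidate by VA */
	add	x0, x0, #64			/* assuming 64-byte cache lines */
	cmp	x0, x1
	b.lo	1b
	dsb	sy
	ret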


>
>
>> +		break;
>> +	default:
>> +		return;
>> +	}
>> +	invalidate_icache();
>> +	flush_dcache_range(ram_start, ram_end);
>> +	printf("Cache Disabled\n");
>> +}
>> diff --git a/purgatory/arch/arm64/cache.h b/purgatory/arch/arm64/cache.h
>> new file mode 100644
>> index 000000000000..c988020566e3
>> --- /dev/null
>> +++ b/purgatory/arch/arm64/cache.h
>> @@ -0,0 +1,79 @@
>> +#ifndef	__CACHE_H__
>> +#define __CACHE_H__
>> +
>> +#define MT_DEVICE_NGNRNE	0
>> +#define MT_DEVICE_NGNRE		1
>> +#define MT_DEVICE_GRE		2
>> +#define MT_NORMAL_NC		3
>> +#define MT_NORMAL		4
>
> You only use two of these. I guess this is so the MAIR value matches the kernel?

OK, I can remove the others. Yes, they match the kernel's values.

>
>
>> +
>> +#ifndef __ASSEMBLER__
>> +
>> +#define MAX_PAGE_SIZE		0x10000
>> +#define PAGE_TABLE_SIZE		(3 * MAX_PAGE_SIZE)
>> +#define ID_AA64MMFR0_TGRAN64_SHIFT	24
>> +#define ID_AA64MMFR0_TGRAN4_SHIFT	28
>> +#define ID_AA64MMFR0_TGRAN64_MASK	(0xFUL << ID_AA64MMFR0_TGRAN64_SHIFT)
>> +#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
>> +#define ID_AA64MMFR0_TGRAN64_SUPPORTED	0x0
>> +#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0
>> +#define ID_AA64MMFR0_PARANGE_SHIFT	0
>> +#define ID_AA64MMFR0_PARANGE_MASK	(0xFUL << ID_AA64MMFR0_PARANGE_SHIFT)
>> +#define ID_AA64MMFR0_PARANGE_48		0x5
>> +#define ID_AA64MMFR0_PARANGE_44		0x4
>> +#define ID_AA64MMFR0_PARANGE_42		0x3
>> +#define ID_AA64MMFR0_PARANGE_40		0x2
>> +#define ID_AA64MMFR0_PARANGE_36		0x1
>> +#define ID_AA64MMFR0_PARANGE_32		0x0
>> +
>> +#define TCR_TG0_64K 		(1UL << 14)
>> +#define TCR_TG0_4K 		(0UL << 14)
>> +#define TCR_SHARED_NONE		(0UL << 12)
>> +#define TCR_ORGN_WBWA		(1UL << 10)
>> +#define TCR_IRGN_WBWA		(1UL << 8)
>> +#define TCR_IPS_EL1_SHIFT	32
>> +#define TCR_PS_EL2_SHIFT	16
>> +#define TCR_T0SZ(x)		((unsigned long)(64 - (x)) << 0)
>> +#define TCR_FLAGS (TCR_SHARED_NONE | TCR_ORGN_WBWA | TCR_IRGN_WBWA)
>> +
>> +#define PMD_TYPE_SECT		(1UL << 0)
>> +#define PMD_TYPE_TABLE		(3UL << 0)
>> +#define PMD_TYPE_MASK		0x3
>> +#define PMD_SECT_AF		(1UL << 10)
>> +#define PMD_ATTRINDX(t)		((unsigned long)(t) << 2)
>> +#define PMD_FLAGS_NORMAL	(PMD_TYPE_SECT | PMD_SECT_AF)
>> +#define PMD_SECT_PXN		(1UL << 53)
>> +#define PMD_SECT_UXN		(1UL << 54)
>> +#define PMD_FLAGS_DEVICE	(PMD_TYPE_SECT | PMD_SECT_AF | PMD_SECT_PXN | PMD_SECT_UXN)
>> +#define MM_MMUFLAGS_NORMAL	PMD_ATTRINDX(MT_NORMAL) | PMD_FLAGS_NORMAL
>> +#define MM_MMUFLAGS_DEVICE	PMD_ATTRINDX(MT_DEVICE_NGNRE) | PMD_FLAGS_DEVICE
>> +
>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end);
>> +int enable_dcache(uint64_t ram_start, uint64_t ram_end, uint64_t uart_base);
>> +uint64_t get_mm_feature_reg0_val(void);
>> +void inval_cache_range(uint64_t start, uint64_t end);
>> +void flush_dcache_range(uint64_t start, uint64_t end);
>> +uint64_t get_current_el(void);
>> +void set_mair_tcr_ttbr_sctlr_el1(uint64_t page_table, uint64_t tcr_flags);
>> +void set_mair_tcr_ttbr_sctlr_el2(uint64_t page_table, uint64_t tcr_flags);
>> +void invalidate_tlbs_el1(void);
>> +void invalidate_tlbs_el2(void);
>> +void invalidate_icache(void);
>> +void reset_sctlr_el1(void);
>> +void reset_sctlr_el2(void);
>> +#else
>> +#define MEMORY_ATTRIBUTES	((0x00 << (MT_DEVICE_NGNRNE*8)) | \
>> +				(0x04 << (MT_DEVICE_NGNRE*8)) | \
>> +				(0x0C << (MT_DEVICE_GRE*8)) | \
>> +				(0x44 << (MT_NORMAL_NC*8)) | \
>> +				(0xFF << (MT_NORMAL*8)))
>
> Again, you only use two of these.

OK, I will remove the others.
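
(For example, keeping only the two attributes that are actually used while
leaving their index values matching the kernel -- just what the trimmed header
might look like, not the posted code:)

#define MT_DEVICE_NGNRE		1
#define MT_NORMAL		4

#define MEMORY_ATTRIBUTES	((0x04 << (MT_DEVICE_NGNRE*8)) | \
				(0xFF << (MT_NORMAL*8)))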

>
>
>> +/* Common SCTLR_ELx flags. */
>> +#define SCTLR_ELx_I		(1 << 12)
>> +#define SCTLR_ELx_C		(1 << 2)
>> +#define SCTLR_ELx_M		(1 << 0)
>> +
>> +#define SCTLR_ELx_FLAGS (SCTLR_ELx_M | SCTLR_ELx_C | SCTLR_ELx_I)
>> +
>> +#endif
>> +#endif
>>

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14  9:38       ` Pratyush Anand
@ 2016-12-14 10:12         ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 10:12 UTC (permalink / raw)
  To: linux-arm-kernel



On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>
>>
>> I would go as far as to generate the page tables at 'kexec -l' time,
>> and only if
>
> Ok..So you mean that I create a new section which will have page table
> entries mapping physicalmemory represented by remaining section, and
> then purgatory can just enable mmu with page table from that section,
> right? Seems doable. can do that.

I see a problem here. If we create the page table as a new segment, how
can we verify in purgatory that the SHA for the page table is correct?
We need the page table before SHA verification starts, and we cannot
rely on the page table created by the first kernel until its SHA is
verified. So it is a chicken-and-egg problem.

I think creating the page table will take just a fraction of a second
and should be fine even in purgatory. What do you say?
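
(Rough arithmetic to back that up -- the RAM size below is made up for
illustration:)

    4 GB of RAM / 2 MB per block entry = 2048 entries
    i.e. a few thousand uncached stores -- milliseconds at worst, nowhere
    near the 20s sha256 problem.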

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 10:12         ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 10:12 UTC (permalink / raw)
  To: James Morse; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel



On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>
>>
>> I would go as far as to generate the page tables at 'kexec -l' time,
>> and only if
>
> Ok..So you mean that I create a new section which will have page table
> entries mapping physicalmemory represented by remaining section, and
> then purgatory can just enable mmu with page table from that section,
> right? Seems doable. can do that.

I see a problem here. If we create the page table as a new segment, how
can we verify in purgatory that the SHA for the page table is correct?
We need the page table before SHA verification starts, and we cannot
rely on the page table created by the first kernel until its SHA is
verified. So it is a chicken-and-egg problem.

I think creating the page table will take just a fraction of a second
and should be fine even in purgatory. What do you say?

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14  9:38       ` Pratyush Anand
@ 2016-12-14 11:16         ` James Morse
  -1 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 11:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

On 14/12/16 09:38, Pratyush Anand wrote:
> On Saturday 26 November 2016 12:00 AM, James Morse wrote:
>> On 22/11/16 04:32, Pratyush Anand wrote:
>>> This patch adds support to enable/disable d-cache, which can be used for
>>> faster purgatory sha256 verification.
>>
>> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>>  the thread)
>>
>>
>>> We are supporting only 4K and 64K page sizes. This code will not work if a
>>> hardware is not supporting at least one of these page sizes.  Therefore,
>>> D-cache is disabled by default and enabled only when "enable-dcache" is
>>> passed to the kexec().
>>
>> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
>> a lot simpler to only support one page size, which should be 4K as that is what
>> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)
> 
> Ok.. So, I will implement a new version after considering that 4K will always be
> supported. If 4K is not supported by hw(which is very unlikely) then there would
> be no d-cache enabling feature.

Sounds good to me. I think it's important to keep the purgatory code as small
and as simple as possible, as it's very hard to debug. If we do get bug reports
they are likely to be 'it did nothing', with no further details. If it only
fails on some platform we don't have access to, it's basically impossible.


>> I would go as far as to generate the page tables at 'kexec -l' time, and only if
> 
> Ok..So you mean that I create a new section which will have page table entries
> mapping physicalmemory represented by remaining section, and then purgatory can
> just enable mmu with page table from that section, right? Seems doable. can do
> that.
> 
>> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
>> support 4K pages). This would keep the purgatory code as simple as possible.
> 
> What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That can also
> tell us that whether 4K is supported or not?

If you're doing it at EL1/EL2 in the purgatory code, sure. But if you generate
the page tables at 'kexec -l' time you can't read this register from EL0 so you
need another way to guess if 4K pages are supported (or just assume they are and
test that register once you're in purgatory).
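
(A minimal sketch of that purgatory-side test, reusing the ID_AA64MMFR0
definitions already in cache.h; the helper name is made up:)

#include <stdint.h>

/* TGRAN4 field definitions, as in cache.h above. */
#define ID_AA64MMFR0_TGRAN4_SHIFT	28
#define ID_AA64MMFR0_TGRAN4_MASK	(0xFUL << ID_AA64MMFR0_TGRAN4_SHIFT)
#define ID_AA64MMFR0_TGRAN4_SUPPORTED	0x0

/* Only valid once running at EL1/EL2 in purgatory, not at EL0. */
static int tgran4_supported(void)
{
	uint64_t mmfr0;

	asm volatile("mrs %0, id_aa64mmfr0_el1" : "=r" (mmfr0));
	return ((mmfr0 & ID_AA64MMFR0_TGRAN4_MASK)
			>> ID_AA64MMFR0_TGRAN4_SHIFT) == ID_AA64MMFR0_TGRAN4_SUPPORTED;
}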

I was looking for some way to print a message at 'kexec -l' time that the sha256
would be slow as 4K wasn't supported. (a message printed at any other time won't
get seen).
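
(Something along these lines at load time, perhaps -- the message text and
function name are made up; only the /sys/firmware/efi check comes from the
discussion:)

#include <stdio.h>
#include <unistd.h>

/* Called at 'kexec -l' time, where we run at EL0 and can't read ID registers. */
static void warn_if_purgatory_sha_may_be_slow(void)
{
	if (access("/sys/firmware/efi", F_OK) != 0)
		fprintf(stderr, "Warning: not a UEFI boot, 4K pages not guaranteed; "
				"purgatory sha256 verification may be slow\n");
}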


>>> +/*
>>> + *    disable_dcache: Disable D-cache and flush RAM locations
>>> + *    ram_start - Start address of RAM
>>> + *    ram_end - End address of RAM
>>> + */
>>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
>>> +{
>>> +    switch(get_current_el()) {
>>> +    case 2:
>>> +        reset_sctlr_el2();
>>> +        break;
>>> +    case 1:
>>> +        reset_sctlr_el1();
>>
>> You have C code running between disabling the MMU and cleaning the cache. The
>> compiler is allowed to move data on and off the stack in here, but after
>> disabling the MMU it will see whatever was on the stack before we turned the MMU
>> on. Any data written at the beginning of this function is left in the caches.
>>
>> I'm afraid this sort of stuff needs to be done in assembly!
> 
> All these routines are self coded in assembly even though they are called
> from C, so should be safe I think. Anyway, I can keep all of them in
> assembly as well.

You can't tell the compiler that the stack data is inaccessible until the dcache
clean call completes. Some future version may do really crazy things in here.
You can decompile what your compiler version produces to check it doesn't
load/store to the stack, but that doesn't mean my compiler version does the
same. This is the kind of thing that is extremely difficult to debug; it's best
not to take the risk.


Thanks,

James

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 11:16         ` James Morse
  0 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 11:16 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel

Hi Pratyush,

On 14/12/16 09:38, Pratyush Anand wrote:
> On Saturday 26 November 2016 12:00 AM, James Morse wrote:
>> On 22/11/16 04:32, Pratyush Anand wrote:
>>> This patch adds support to enable/disable d-cache, which can be used for
>>> faster purgatory sha256 verification.
>>
>> (I'm not clear why we want the sha256, but that is being discussed elsewhere on
>>  the thread)
>>
>>
>>> We are supporting only 4K and 64K page sizes. This code will not work if a
>>> hardware is not supporting at least one of these page sizes.  Therefore,
>>> D-cache is disabled by default and enabled only when "enable-dcache" is
>>> passed to the kexec().
>>
>> I don't think the maybe-4K/maybe-64K/maybe-neither logic is needed. It would be
>> a lot simpler to only support one page size, which should be 4K as that is what
>> UEFI requires. (If there are CPUs that only support one size, I bet its 4K!)
> 
> Ok.. So, I will implement a new version after considering that 4K will always be
> supported. If 4K is not supported by hw(which is very unlikely) then there would
> be no d-cache enabling feature.

Sounds good to me. I think it's important to keep the purgatory code as small
and as simple as possible, as it's very hard to debug. If we do get bug reports
they are likely to be 'it did nothing', with no further details. If it only
fails on some platform we don't have access to, it's basically impossible.


>> I would go as far as to generate the page tables at 'kexec -l' time, and only if
> 
> Ok..So you mean that I create a new section which will have page table entries
> mapping physicalmemory represented by remaining section, and then purgatory can
> just enable mmu with page table from that section, right? Seems doable. can do
> that.
> 
>> '/sys/firmware/efi' exists to indicate we booted via UEFI. (and therefore must
>> support 4K pages). This would keep the purgatory code as simple as possible.
> 
> What about reading ID_AA64MMFR0_EL1 instead of /sys/firmware/efi? That can also
> tell us that whether 4K is supported or not?

If you're doing it at EL1/EL2 in the purgatory code, sure. But if you generate
the page tables at 'kexec -l' time you can't read this register from EL0 so you
need another way to guess if 4K pages are supported (or just assume they are and
test that register once you're in purgatory).

I was looking for some way to print a message at 'kexec -l' time that the sha256
would be slow as 4K wasn't supported. (a message printed at any other time won't
get seen).


>>> +/*
>>> + *    disable_dcache: Disable D-cache and flush RAM locations
>>> + *    ram_start - Start address of RAM
>>> + *    ram_end - End address of RAM
>>> + */
>>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
>>> +{
>>> +    switch(get_current_el()) {
>>> +    case 2:
>>> +        reset_sctlr_el2();
>>> +        break;
>>> +    case 1:
>>> +        reset_sctlr_el1();
>>
>> You have C code running between disabling the MMU and cleaning the cache. The
>> compiler is allowed to move data on and off the stack in here, but after
>> disabling the MMU it will see whatever was on the stack before we turned the MMU
>> on. Any data written at the beginning of this function is left in the caches.
>>
>> I'm afraid this sort of stuff needs to be done in assembly!
> 
> All these routines are self coded in assembly even though they are called
> from C, so should be safe I think. Anyway, I can keep all of them in
> assembly as well.

You can't tell the compiler that the stack data is inaccessible until the dcache
clean call completes. Some future version may do really crazy things in here.
You can decompile what your compiler version produces to check it doesn't
load/store to the stack, but that doesn't mean my compiler version does the
same. This is the kind of thing that is extremely difficult to debug, its best
not to take the risk.


Thanks,

James


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 10:12         ` Pratyush Anand
@ 2016-12-14 11:16           ` James Morse
  -1 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 11:16 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pratyush,

On 14/12/16 10:12, Pratyush Anand wrote:
> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>> and only if
>>
>> Ok..So you mean that I create a new section which will have page table
>> entries mapping physicalmemory represented by remaining section, and
>> then purgatory can just enable mmu with page table from that section,
>> right? Seems doable. can do that.
> 
> I see a problem here. If we create  page table as a new segment then, how can we
> verify in purgatory that sha for page table is correct? We need page table
> before sha verification start,and we can not rely the page table created by
> first kernel until it's sha is verified. So a chicken-egg problem.

There is more than one of those! What happens if your sha256 calculation code is
corrupted? You have to run it before you know. The same goes for all the
purgatory code.

This is why I think it's better to do this in the kernel before we exit to
purgatory, but obviously that doesn't work for kdump.


> I think, creating page table will just take fraction of second and should be
> good even in purgatory, What do you say?

If it's for kdump it's best-effort. I think it's easier/simpler to generate and
debug them at 'kexec -l' time, but if you're worried about the increased area
that could be corrupted then do it in purgatory.


Thanks,

James

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 11:16           ` James Morse
  0 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 11:16 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel

Hi Pratyush,

On 14/12/16 10:12, Pratyush Anand wrote:
> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>> and only if
>>
>> Ok..So you mean that I create a new section which will have page table
>> entries mapping physicalmemory represented by remaining section, and
>> then purgatory can just enable mmu with page table from that section,
>> right? Seems doable. can do that.
> 
> I see a problem here. If we create  page table as a new segment then, how can we
> verify in purgatory that sha for page table is correct? We need page table
> before sha verification start,and we can not rely the page table created by
> first kernel until it's sha is verified. So a chicken-egg problem.

There is more than one of those! What happens if your sha256 calculation code is
corrupted? You have to run it before you know. The same goes for all the
purgatory code.

This is why I think it's better to do this in the kernel before we exit to
purgatory, but obviously that doesn't work for kdump.


> I think, creating page table will just take fraction of second and should be
> good even in purgatory, What do you say?

If it's for kdump it's best-effort. I think it's easier/simpler to generate and
debug them at 'kexec -l' time, but if you're worried about the increased area
that could be corrupted then do it in purgatory.


Thanks,

James


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 11:16         ` James Morse
@ 2016-12-14 11:28           ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 11:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 14, 2016 at 11:16:07AM +0000, James Morse wrote:
> Hi Pratyush,
> On 14/12/16 09:38, Pratyush Anand wrote:
> > On Saturday 26 November 2016 12:00 AM, James Morse wrote:
> >> On 22/11/16 04:32, Pratyush Anand wrote:
> >>> +/*
> >>> + *    disable_dcache: Disable D-cache and flush RAM locations
> >>> + *    ram_start - Start address of RAM
> >>> + *    ram_end - End address of RAM
> >>> + */
> >>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
> >>> +{
> >>> +    switch(get_current_el()) {
> >>> +    case 2:
> >>> +        reset_sctlr_el2();
> >>> +        break;
> >>> +    case 1:
> >>> +        reset_sctlr_el1();
> >>
> >> You have C code running between disabling the MMU and cleaning the cache. The
> >> compiler is allowed to move data on and off the stack in here, but after
> >> disabling the MMU it will see whatever was on the stack before we turned the MMU
> >> on. Any data written at the beginning of this function is left in the caches.
> >>
> >> I'm afraid this sort of stuff needs to be done in assembly!
> > 
> > All these routines are self coded in assembly even though they are called
> > from C, so should be safe I think. Anyway, I can keep all of them in
> > assembly as well.
> 
> You can't tell the compiler that the stack data is inaccessible until the dcache
> clean call completes. Some future version may do really crazy things in here.
> You can decompile what your compiler version produces to check it doesn't
> load/store to the stack, but that doesn't mean my compiler version does the
> same. This is the kind of thing that is extremely difficult to debug, its best
> not to take the risk.

FWIW, I completely agree.

We've been bitten in the past; see commit 5e051531447259e5 ("arm64:
convert part of soft_restart() to assembly") for an example.

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 11:28           ` Mark Rutland
  0 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 11:28 UTC (permalink / raw)
  To: James Morse; +Cc: Pratyush Anand, geoff, kexec, linux-arm-kernel

On Wed, Dec 14, 2016 at 11:16:07AM +0000, James Morse wrote:
> Hi Pratyush,
> On 14/12/16 09:38, Pratyush Anand wrote:
> > On Saturday 26 November 2016 12:00 AM, James Morse wrote:
> >> On 22/11/16 04:32, Pratyush Anand wrote:
> >>> +/*
> >>> + *    disable_dcache: Disable D-cache and flush RAM locations
> >>> + *    ram_start - Start address of RAM
> >>> + *    ram_end - End address of RAM
> >>> + */
> >>> +void disable_dcache(uint64_t ram_start, uint64_t ram_end)
> >>> +{
> >>> +    switch(get_current_el()) {
> >>> +    case 2:
> >>> +        reset_sctlr_el2();
> >>> +        break;
> >>> +    case 1:
> >>> +        reset_sctlr_el1();
> >>
> >> You have C code running between disabling the MMU and cleaning the cache. The
> >> compiler is allowed to move data on and off the stack in here, but after
> >> disabling the MMU it will see whatever was on the stack before we turned the MMU
> >> on. Any data written at the beginning of this function is left in the caches.
> >>
> >> I'm afraid this sort of stuff needs to be done in assembly!
> > 
> > All these routines are self coded in assembly even though they are called
> > from C, so should be safe I think. Anyway, I can keep all of them in
> > assembly as well.
> 
> You can't tell the compiler that the stack data is inaccessible until the dcache
> clean call completes. Some future version may do really crazy things in here.
> You can decompile what your compiler version produces to check it doesn't
> load/store to the stack, but that doesn't mean my compiler version does the
> same. This is the kind of thing that is extremely difficult to debug, its best
> not to take the risk.

FWIW, I completely agree.

We've been bitten in the past; see commit 5e051531447259e5 ("arm64:
convert part of soft_restart() to assembly") for an example.

Thanks,
Mark.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 11:16           ` James Morse
@ 2016-12-14 11:37             ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 11:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
> Hi Pratyush,
> 
> On 14/12/16 10:12, Pratyush Anand wrote:
> > On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
> >>> I would go as far as to generate the page tables at 'kexec -l' time,
> >>> and only if
> >>
> >> Ok..So you mean that I create a new section which will have page table
> >> entries mapping physicalmemory represented by remaining section, and
> >> then purgatory can just enable mmu with page table from that section,
> >> right? Seems doable. can do that.
> > 
> > I see a problem here. If we create  page table as a new segment then, how can we
> > verify in purgatory that sha for page table is correct? We need page table
> > before sha verification start,and we can not rely the page table created by
> > first kernel until it's sha is verified. So a chicken-egg problem.
> 
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.
> 
> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.

I see in an earlier message that the need for sha256 was being discussed
in another thread. Do either of you happen to have a pointer to that?

To me, it seems like it doesn't come with much benefit for the kdump
case given that's best-effort anyway, and as above the verification code
could have been corrupted. In the non-kdump case it's not strictly
necessary and seems like a debugging aid rather than a necessary piece
of functionality -- if that's the case, a 20 second delay isn't the end
of the world...

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 11:37             ` Mark Rutland
  0 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 11:37 UTC (permalink / raw)
  To: James Morse; +Cc: Pratyush Anand, geoff, kexec, linux-arm-kernel

On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
> Hi Pratyush,
> 
> On 14/12/16 10:12, Pratyush Anand wrote:
> > On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
> >>> I would go as far as to generate the page tables at 'kexec -l' time,
> >>> and only if
> >>
> >> Ok..So you mean that I create a new section which will have page table
> >> entries mapping physicalmemory represented by remaining section, and
> >> then purgatory can just enable mmu with page table from that section,
> >> right? Seems doable. can do that.
> > 
> > I see a problem here. If we create  page table as a new segment then, how can we
> > verify in purgatory that sha for page table is correct? We need page table
> > before sha verification start,and we can not rely the page table created by
> > first kernel until it's sha is verified. So a chicken-egg problem.
> 
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.
> 
> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.

I see in an earlier message that the need for sha256 was being discussed
in another thread. Do either of you happen to have a pointer to that?

To me, it seems like it doesn't come with much benefit for the kdump
case given that's best-effort anyway, and as above the verification code
could have been corrupted. In the non-kdump case it's not strictly
necessary and seems like a debugging aid rather than a necessary piece
of functionality -- if that's the case, a 20 second delay isn't the end
of the world...

Thanks,
Mark.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 11:37             ` Mark Rutland
@ 2016-12-14 12:11               ` James Morse
  -1 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 12:11 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Mark,

On 14/12/16 11:37, Mark Rutland wrote:
> On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
> 
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.

https://www.spinics.net/lists/arm-kernel/msg544472.html


Thanks,

James

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 12:11               ` James Morse
  0 siblings, 0 replies; 48+ messages in thread
From: James Morse @ 2016-12-14 12:11 UTC (permalink / raw)
  To: Mark Rutland; +Cc: Pratyush Anand, geoff, kexec, linux-arm-kernel

Hi Mark,

On 14/12/16 11:37, Mark Rutland wrote:
> On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
> 
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.

https://www.spinics.net/lists/arm-kernel/msg544472.html


Thanks,

James

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 11:16           ` James Morse
@ 2016-12-14 12:13             ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 12:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

Thanks for your input !!

On Wednesday 14 December 2016 04:46 PM, James Morse wrote:
> Hi Pratyush,
>
> On 14/12/16 10:12, Pratyush Anand wrote:
>> > On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>> >>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>> >>> and only if
>>> >>
>>> >> Ok..So you mean that I create a new section which will have page table
>>> >> entries mapping physicalmemory represented by remaining section, and
>>> >> then purgatory can just enable mmu with page table from that section,
>>> >> right? Seems doable. can do that.
>> >
>> > I see a problem here. If we create  page table as a new segment then, how can we
>> > verify in purgatory that sha for page table is correct? We need page table
>> > before sha verification start,and we can not rely the page table created by
>> > first kernel until it's sha is verified. So a chicken-egg problem.
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.
>

OK, seems reasonable... will do it in kexec code.

> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.
>
>
>> > I think, creating page table will just take fraction of second and should be
>> > good even in purgatory, What do you say?
> If it's for kdump its best-effort. I think its easier/simpler to generate and
> debug them at 'kexec -l' time, but if you're worried about the increased area
> that could be corrupted then do it in purgatory.
>

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 12:13             ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 12:13 UTC (permalink / raw)
  To: James Morse; +Cc: geoff, Mark Rutland, kexec, linux-arm-kernel

Hi James,

Thanks for your input !!

On Wednesday 14 December 2016 04:46 PM, James Morse wrote:
> Hi Pratyush,
>
> On 14/12/16 10:12, Pratyush Anand wrote:
>> > On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>> >>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>> >>> and only if
>>> >>
>>> >> Ok..So you mean that I create a new section which will have page table
>>> >> entries mapping physicalmemory represented by remaining section, and
>>> >> then purgatory can just enable mmu with page table from that section,
>>> >> right? Seems doable. can do that.
>> >
>> > I see a problem here. If we create  page table as a new segment then, how can we
>> > verify in purgatory that sha for page table is correct? We need page table
>> > before sha verification start,and we can not rely the page table created by
>> > first kernel until it's sha is verified. So a chicken-egg problem.
> There is more than one of those! What happens if your sha256 calculation code is
> corrupted? You have to run it before you know. The same goes for all the
> purgatory code.
>

OK, seems reasonable... will do it in kexec code.

> This is why I think its better to do this in the kernel before we exit to
> purgatory, but obviously that doesn't work for kdump.
>
>
>> > I think, creating page table will just take fraction of second and should be
>> > good even in purgatory, What do you say?
> If it's for kdump its best-effort. I think its easier/simpler to generate and
> debug them at 'kexec -l' time, but if you're worried about the increased area
> that could be corrupted then do it in purgatory.
>

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 11:37             ` Mark Rutland
@ 2016-12-14 12:21               ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 12:21 UTC (permalink / raw)
  To: linux-arm-kernel



On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:
> On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
>> Hi Pratyush,
>>
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
>
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.
>

patch 0/2 of this series.

> To me, it seems like it doesn't come with much benefit for the kdump
> case given that's best-effort anyway, and as above the verification code
> could have been be corrupted. In the non-kdump case it's not strictly
> necessary and seems like a debugging aid rather than a necessary piece
> of functionality -- if that's the case, a 20 second delay isn't the end
> of the world...


Even for the non-kdump, i.e. `kexec -l`, case we do not have functionality 
to bypass SHA verification in kexec-tools. A --lite option for kexec-tools 
was discouraged and not accepted. So, it is 20s for both `kexec -l` and 
`kexec -p`.
Also, other arches like x86_64 take negligible time in SHA verification.

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 12:21               ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 12:21 UTC (permalink / raw)
  To: Mark Rutland, James Morse; +Cc: geoff, Dave Young, kexec, linux-arm-kernel



On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:
> On Wed, Dec 14, 2016 at 11:16:17AM +0000, James Morse wrote:
>> Hi Pratyush,
>>
>> On 14/12/16 10:12, Pratyush Anand wrote:
>>> On Wednesday 14 December 2016 03:08 PM, Pratyush Anand wrote:
>>>>> I would go as far as to generate the page tables at 'kexec -l' time,
>>>>> and only if
>>>>
>>>> Ok..So you mean that I create a new section which will have page table
>>>> entries mapping physicalmemory represented by remaining section, and
>>>> then purgatory can just enable mmu with page table from that section,
>>>> right? Seems doable. can do that.
>>>
>>> I see a problem here. If we create  page table as a new segment then, how can we
>>> verify in purgatory that sha for page table is correct? We need page table
>>> before sha verification start,and we can not rely the page table created by
>>> first kernel until it's sha is verified. So a chicken-egg problem.
>>
>> There is more than one of those! What happens if your sha256 calculation code is
>> corrupted? You have to run it before you know. The same goes for all the
>> purgatory code.
>>
>> This is why I think its better to do this in the kernel before we exit to
>> purgatory, but obviously that doesn't work for kdump.
>
> I see in an earlier message that the need for sha256 was being discussed
> in another thread. Do either of you happen to have a pointer to that.
>

patch 0/2 of this series.

> To me, it seems like it doesn't come with much benefit for the kdump
> case given that's best-effort anyway, and as above the verification code
> could have been be corrupted. In the non-kdump case it's not strictly
> necessary and seems like a debugging aid rather than a necessary piece
> of functionality -- if that's the case, a 20 second delay isn't the end
> of the world...


Even for the non-kdump, i.e. `kexec -l`, case we do not have functionality 
to bypass SHA verification in kexec-tools. A --lite option for kexec-tools 
was discouraged and not accepted. So, it is 20s for both `kexec -l` and 
`kexec -p`.
Also, other arches like x86_64 take negligible time in SHA verification.

~Pratyush

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 12:21               ` Pratyush Anand
@ 2016-12-14 13:44                 ` Mark Rutland
  -1 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 13:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Wed, Dec 14, 2016 at 05:51:05PM +0530, Pratyush Anand wrote:
> 
> On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:
> >I see in an earlier message that the need for sha256 was being discussed
> >in another thread. Do either of you happen to have a pointer to that.
> 
> patch 0/2 of this series.

AFAICT, that just says that the existing sha256 check is slow, not *why*
a sha256 check of some description is necessary. I'm still at a loss as
to why it is considered necessary, rather than being a debugging aid or
sanity check.

> >To me, it seems like it doesn't come with much benefit for the kdump
> >case given that's best-effort anyway, and as above the verification code
> >could have been be corrupted. In the non-kdump case it's not strictly
> >necessary and seems like a debugging aid rather than a necessary piece
> >of functionality -- if that's the case, a 20 second delay isn't the end
> >of the world...
> 
> Even for the non-kdump ie `kexec -l` case we do not have a
> functionality to bypass sha verification in kexec-tools. --lite
> option with the kexec-tools was discouraged and not accepted.

Ok. Do you have a pointer to the thread regarding that, for context?

> So,it is 20s for both `kexec -l` and `kexec -p`.

Well, unless we can have a --{no-,}sha-check, and make the default NO
for arm64.

> Also other arch like x86_64 takes negligible time in sha verification.

That's certainly an argument for not changing the other architectures,
but given it's slow for arm64, we could have a different default...

Thanks,
Mark.

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 13:44                 ` Mark Rutland
  0 siblings, 0 replies; 48+ messages in thread
From: Mark Rutland @ 2016-12-14 13:44 UTC (permalink / raw)
  To: Pratyush Anand; +Cc: geoff, kexec, Dave Young, James Morse, linux-arm-kernel

Hi,

On Wed, Dec 14, 2016 at 05:51:05PM +0530, Pratyush Anand wrote:
> 
> On Wednesday 14 December 2016 05:07 PM, Mark Rutland wrote:
> >I see in an earlier message that the need for sha256 was being discussed
> >in another thread. Do either of you happen to have a pointer to that.
> 
> patch 0/2 of this series.

AFAICT, that just says that the existing sha256 check is slow, not *why*
a sha256 check of some description is necessary. I'm still at a loss as
to why it is considered necessary, rather than being a debugging aid or
sanity check.

> >To me, it seems like it doesn't come with much benefit for the kdump
> >case given that's best-effort anyway, and as above the verification code
> >could have been be corrupted. In the non-kdump case it's not strictly
> >necessary and seems like a debugging aid rather than a necessary piece
> >of functionality -- if that's the case, a 20 second delay isn't the end
> >of the world...
> 
> Even for the non-kdump ie `kexec -l` case we do not have a
> functionality to bypass sha verification in kexec-tools. --lite
> option with the kexec-tools was discouraged and not accepted.

Ok. Do you have a pointer to the thread regarding that, for context?

> So,it is 20s for both `kexec -l` and `kexec -p`.

Well, unless we can have a --{no-,}sha-check, and make the default NO
for arm64.

> Also other arch like x86_64 takes negligible time in sha verification.

That's certainly an argument for not changing the other architectures,
but given it's slow for arm64, we could have a different default...

Thanks,
Mark.

_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
  2016-12-14 13:44                 ` Mark Rutland
@ 2016-12-14 14:13                   ` Pratyush Anand
  -1 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 14:13 UTC (permalink / raw)
  To: linux-arm-kernel



On Wednesday 14 December 2016 07:14 PM, Mark Rutland wrote:
>> Even for the non-kdump ie `kexec -l` case we do not have a
>> > functionality to bypass sha verification in kexec-tools. --lite
>> > option with the kexec-tools was discouraged and not accepted.
> Ok. Do you have a pointer to the thread regarding that, for context?
>

https://lists.ozlabs.org/pipermail/petitboot/2015-October/000141.html
https://lists.ozlabs.org/pipermail/petitboot/2015-October/000136.html

~Pratyush

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory
@ 2016-12-14 14:13                   ` Pratyush Anand
  0 siblings, 0 replies; 48+ messages in thread
From: Pratyush Anand @ 2016-12-14 14:13 UTC (permalink / raw)
  To: Mark Rutland; +Cc: geoff, kexec, Dave Young, James Morse, linux-arm-kernel



On Wednesday 14 December 2016 07:14 PM, Mark Rutland wrote:
>> Even for the non-kdump ie `kexec -l` case we do not have a
>> > functionality to bypass sha verification in kexec-tools. --lite
>> > option with the kexec-tools was discouraged and not accepted.
> Ok. Do you have a pointer to the thread regarding that, for context?
>

https://lists.ozlabs.org/pipermail/petitboot/2015-October/000141.html
https://lists.ozlabs.org/pipermail/petitboot/2015-October/000136.html

~Pratyush


_______________________________________________
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2016-12-14 14:14 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-11-22  4:32 [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility Pratyush Anand
2016-11-22  4:32 ` Pratyush Anand
2016-11-22  4:32 ` [PATCH 1/2] arm64: Add enable/disable d-cache support for purgatory Pratyush Anand
2016-11-22  4:32   ` Pratyush Anand
2016-11-25 18:30   ` James Morse
2016-11-25 18:30     ` James Morse
2016-12-14  9:38     ` Pratyush Anand
2016-12-14  9:38       ` Pratyush Anand
2016-12-14 10:12       ` Pratyush Anand
2016-12-14 10:12         ` Pratyush Anand
2016-12-14 11:16         ` James Morse
2016-12-14 11:16           ` James Morse
2016-12-14 11:37           ` Mark Rutland
2016-12-14 11:37             ` Mark Rutland
2016-12-14 12:11             ` James Morse
2016-12-14 12:11               ` James Morse
2016-12-14 12:21             ` Pratyush Anand
2016-12-14 12:21               ` Pratyush Anand
2016-12-14 13:44               ` Mark Rutland
2016-12-14 13:44                 ` Mark Rutland
2016-12-14 14:13                 ` Pratyush Anand
2016-12-14 14:13                   ` Pratyush Anand
2016-12-14 12:13           ` Pratyush Anand
2016-12-14 12:13             ` Pratyush Anand
2016-12-14 11:16       ` James Morse
2016-12-14 11:16         ` James Morse
2016-12-14 11:28         ` Mark Rutland
2016-12-14 11:28           ` Mark Rutland
2016-11-22  4:32 ` [PATCH 2/2] arm64: Pass RAM boundary and enable-dcache flag to purgatory Pratyush Anand
2016-11-22  4:32   ` Pratyush Anand
2016-11-22 18:57   ` Geoff Levand
2016-11-22 18:57     ` Geoff Levand
2016-11-23  1:46     ` Pratyush Anand
2016-11-23  1:46       ` Pratyush Anand
2016-11-23  2:03       ` Dave Young
2016-11-23  2:03         ` Dave Young
2016-11-23  2:11         ` Pratyush Anand
2016-11-23  2:11           ` Pratyush Anand
2016-11-23  8:08           ` Simon Horman
2016-11-23  8:08             ` Simon Horman
2016-11-23  8:17             ` Pratyush Anand
2016-11-23  8:17               ` Pratyush Anand
2016-11-22 18:56 ` [PATCH 0/2] kexec-tools: arm64: Add dcache enabling facility Geoff Levand
2016-11-22 18:56   ` Geoff Levand
2016-11-23  1:39   ` Pratyush Anand
2016-11-23  1:39     ` Pratyush Anand
2016-11-25 18:30   ` James Morse
2016-11-25 18:30     ` James Morse
