linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk
@ 2015-10-27 17:29 James Morse
  2015-10-27 17:29 ` [PATCH v2 01/11] arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap James Morse
                   ` (10 more replies)
  0 siblings, 11 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

This version of the series follows Lorenzo's option one, described at [0],
when cleaning executable code that may be held in data caches after
hibernate/resume. Patch ten adds the necessary hook to kernel/power/snapshot.c.

This allows the architecture's hibernate assembly code to clean all the
pages that it copies, meaning the for_each_process(); for_each_vma(); version
of this can be removed.

The other changes were fixes so that this code works both before and after
Ard Biesheuvel's 'relax Image placement rules' series [1] that moves the
kernel text out of the linear map.

Finally the series has picked up Lorenzo's t0sz fix [2], as it refactors
all of those changes out, and the first four patches of kexec v10, [3].

(Version one here: [4])


James

[0] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/380153.html
[1] http://www.spinics.net/lists/arm-kernel/msg446929.html
[2] http://www.spinics.net/lists/arm-kernel/msg454862.html
[3] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/379268.html
[4] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html

AKASHI Takahiro (1):
  arm64: kvm: allows kvm cpu hotplug

Geoff Levand (3):
  arm64: Fold proc-macros.S into assembler.h
  arm64: Convert hcalls to use HVC immediate value
  arm64: Add new hcall HVC_CALL_FUNC

James Morse (6):
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by
    va
  arm64: kernel: Include _AC definition in page.h
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  PM / Hibernate: clean cached pages on architectures that require it
  arm64: kernel: Add support for hibernate/suspend-to-disk.

Lorenzo Pieralisi (1):
  arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap

 arch/arm/include/asm/kvm_host.h    |  10 +-
 arch/arm/include/asm/kvm_mmu.h     |   1 +
 arch/arm/kvm/arm.c                 |  79 ++++----
 arch/arm/kvm/mmu.c                 |   5 +
 arch/arm64/Kconfig                 |   3 +
 arch/arm64/include/asm/assembler.h |  48 ++++-
 arch/arm64/include/asm/kvm_host.h  |  16 +-
 arch/arm64/include/asm/kvm_mmu.h   |   1 +
 arch/arm64/include/asm/memory.h    |   3 +
 arch/arm64/include/asm/page.h      |   2 +
 arch/arm64/include/asm/suspend.h   |  31 +++-
 arch/arm64/include/asm/virt.h      |  49 +++++
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/asm-offsets.c    |   9 +-
 arch/arm64/kernel/head.S           |   6 +-
 arch/arm64/kernel/hibernate-asm.S  | 118 ++++++++++++
 arch/arm64/kernel/hibernate.c      | 359 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S       |  43 +++--
 arch/arm64/kernel/setup.c          |   1 -
 arch/arm64/kernel/sleep.S          | 148 ++++++---------
 arch/arm64/kernel/suspend.c        |  99 ++++------
 arch/arm64/kernel/vmlinux.lds.S    |  15 ++
 arch/arm64/kvm/hyp-init.S          |  34 +++-
 arch/arm64/kvm/hyp.S               |  44 ++++-
 arch/arm64/mm/cache.S              |   2 -
 arch/arm64/mm/proc-macros.S        |  64 -------
 arch/arm64/mm/proc.S               |  30 +---
 kernel/power/snapshot.c            |   4 +
 28 files changed, 900 insertions(+), 325 deletions(-)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.1.4


* [PATCH v2 01/11] arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h James Morse
                   ` (9 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>

Commit dd006da21646 ("arm64: mm: increase VA range of identity map")
introduced a mechanism to extend the virtual memory map range
to support arm64 systems with system RAM located at very high offset,
where the identity mapping used to enable/disable the MMU requires
additional translation levels to map the physical memory at an equal
virtual offset.

At boot time the kernel detects the tcr_el1.t0sz value required by the
identity mapping and sets up the tcr_el1.t0sz register field accordingly
any time the identity map is required in the kernel (i.e. when enabling
the MMU).

In the cold boot path, after enabling the MMU, the kernel resets
tcr_el1.t0sz to its default value (i.e. the actual configuration value
for the system virtual address space) so that the memory space
translated by ttbr0_el1 is restored as expected.

Commit dd006da21646 ("arm64: mm: increase VA range of identity map")
also added code to set up the tcr_el1.t0sz value when the kernel resumes
from low-power states with the MMU off through cpu_resume(), so that the
identity mapping can be used to enable the MMU. However, it failed to add
the code required to restore tcr_el1.t0sz to its default value when the
core returns to the kernel with the MMU enabled. As a result, the kernel
might end up running with the tcr_el1.t0sz value set up for the identity
mapping, which can be lower than the value required by the actual virtual
address space, resulting in an erroneous set-up.

This patch adds code in the resume path that restores the tcr_el1.t0sz
default value upon core resume, mirroring the cold boot path behaviour
and thereby fixing the issue.
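
To make the failure mode concrete, here is a small C model (not kernel
code). It relies on the architectural relationship T0SZ = 64 - VA bits
covered by TTBR0. The struct, the helper names and the 39-bit/48-bit
figures used with it are illustrative assumptions, not taken from the
patch.

```c
#include <assert.h>

/* Architectural relationship: TTBR0 covers 2^(64 - T0SZ) bytes. */
static unsigned int t0sz_for_va_bits(unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits <= 64);
	return 64 - va_bits;
}

/* Hypothetical model of the one field this patch is concerned with. */
struct cpu_model {
	unsigned int t0sz;	/* models tcr_el1.t0sz */
};

/* Before the fix: the idmap value programmed on resume is left behind. */
static void resume_without_fix(struct cpu_model *cpu, unsigned int idmap_t0sz)
{
	cpu->t0sz = idmap_t0sz;
}

/* After the fix: the default value is restored once the MMU is on. */
static void resume_with_fix(struct cpu_model *cpu, unsigned int idmap_t0sz,
			    unsigned int default_t0sz)
{
	cpu->t0sz = idmap_t0sz;		/* needed to enable the MMU via the idmap */
	cpu->t0sz = default_t0sz;	/* models cpu_set_default_tcr_t0sz() */
}
```

With a 39-bit default VA space and an idmap extended to 48 bits, the
default T0SZ is 25 and the idmap T0SZ is 16, so the stale value left by
the unfixed path is the lower of the two, as the commit message
describes.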

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Acked-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Fixes: dd006da21646 ("arm64: mm: increase VA range of identity map")
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/suspend.c | 22 +++++++++++++---------
 1 file changed, 13 insertions(+), 9 deletions(-)

diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 8297d502217e..44ca4143b013 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -80,17 +80,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	if (ret == 0) {
 		/*
 		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; restore the active_mm mappings in
-		 * TTBR0_EL1 unless the active_mm == &init_mm, in which case
-		 * the thread entered cpu_suspend with TTBR0_EL1 set to
-		 * reserved TTBR0 page tables and should be restored as such.
+		 * idmap to enable the MMU; set the TTBR0 to the reserved
+		 * page tables to prevent speculative TLB allocations, flush
+		 * the local tlb and set the default tcr_el1.t0sz so that
+		 * the TTBR0 address space set-up is properly restored.
+		 * If the current active_mm != &init_mm we entered cpu_suspend
+		 * with mappings in TTBR0 that must be restored, so we switch
+		 * them back to complete the address space configuration
+		 * restoration before returning.
 		 */
-		if (mm == &init_mm)
-			cpu_set_reserved_ttbr0();
-		else
-			cpu_switch_mm(mm->pgd, mm);
-
+		cpu_set_reserved_ttbr0();
 		flush_tlb_all();
+		cpu_set_default_tcr_t0sz();
+
+		if (mm != &init_mm)
+			cpu_switch_mm(mm->pgd, mm);
 
 		/*
 		 * Restore per-cpu offset before any kernel
-- 
2.1.4


* [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
  2015-10-27 17:29 ` [PATCH v2 01/11] arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-11-14 21:25   ` Pavel Machek
  2015-10-27 17:29 ` [PATCH v2 03/11] arm64: Convert hcalls to use HVC immediate value James Morse
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to
be used outside the mm code, move the contents of proc-macros.S to
asm/assembler.h.  Also, delete proc-macros.S, and fix up all references
to proc-macros.S.
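
For readers unfamiliar with the macros being moved, the arithmetic done
by dcache_line_size and icache_line_size can be sketched in C as below.
This is an illustration only: the C function names are made up for the
sketch, and the CTR_EL0 value is passed in as a plain integer rather
than read with mrs.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes per cache line, given log2(words per line) as encoded in CTR_EL0. */
static uint32_t line_size_from_encoding(uint32_t log2_words)
{
	assert(log2_words <= 0xf);
	return 4u << log2_words;	/* 4 bytes per word */
}

/* Mirrors dcache_line_size: DminLine is CTR_EL0 bits [19:16]. */
static uint32_t dcache_line_size(uint64_t ctr_el0)
{
	return line_size_from_encoding((uint32_t)(ctr_el0 >> 16) & 0xf);
}

/* Mirrors icache_line_size: IminLine is CTR_EL0 bits [3:0]. */
static uint32_t icache_line_size(uint64_t ctr_el0)
{
	return line_size_from_encoding((uint32_t)ctr_el0 & 0xf);
}
```

An encoding of 4 in either field therefore corresponds to a 64-byte
line (4 << 4).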

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 48 +++++++++++++++++++++++++++-
 arch/arm64/kernel/head.S           |  1 -
 arch/arm64/kvm/hyp-init.S          |  1 -
 arch/arm64/mm/cache.S              |  2 --
 arch/arm64/mm/proc-macros.S        | 64 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 6 files changed, 47 insertions(+), 72 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index b51f2cc22ca9..91cb311d33de 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -193,4 +195,48 @@ lr	.req	x30		// link register
 	str	\src, [\tmp, :lo12:\sym]
 	.endm
 
+/*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
 #endif	/* __ASM_ASSEMBLER_H */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 90d09eddd5b2..9ad8b1f15b19 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -31,7 +31,6 @@
 #include <asm/cputype.h>
 #include <asm/memory.h>
 #include <asm/thread_info.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 #include <asm/page.h>
 #include <asm/virt.h>
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 178ba2248a98..2e67a4872c51 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -20,7 +20,6 @@
 #include <asm/assembler.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
-#include <asm/pgtable-hwdef.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index eb48d5df4a0f..9e13cb53c927 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index 4c4d93c4bf65..000000000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index e4ee7bd8830a..456c1c5f8ecd 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,11 +23,8 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #else
-- 
2.1.4


* [PATCH v2 03/11] arm64: Convert hcalls to use HVC immediate value
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
  2015-10-27 17:29 ` [PATCH v2 01/11] arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap James Morse
  2015-10-27 17:29 ` [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 04/11] arm64: Add new hcall HVC_CALL_FUNC James Morse
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

The existing arm64 hcall implementations are limited in that they only allow
for two distinct hcalls, selected by whether the x0 register is zero or
non-zero.  Also, the API of the hyp-stub exception vector routines and the KVM
exception vector routines differ; hyp-stub uses a non-zero value in x0 to
implement __hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp.

To allow for additional hcalls to be defined and to make the arm64 hcall API
more consistent across exception vector routines, change the hcall
implementations to use the 16-bit immediate value of the HVC instruction to
specify the hcall type.

Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS to be used as hcall type specifiers and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines
to use these new macros when executing an HVC call.  Also, change the
corresponding hyp-stub and KVM el1_sync exception vector routines to use these
new macros.
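
The dispatch this introduces in the hyp-stub el1_sync handler can be
modelled in C as follows. This is a sketch, not the handler itself: the
struct el2_model type and the esr_for_hvc() and stub_el1_sync() helpers
are hypothetical, and the ESR_ELx_* values are written out here on the
assumption that they match the kernel's definitions (EC in bits
[31:26], ISS in bits [24:0], HVC from AArch64 reported as EC 0x16).

```c
#include <assert.h>
#include <stdint.h>

#define ESR_ELx_EC_SHIFT	26
#define ESR_ELx_EC_HVC64	0x16u
#define ESR_ELx_ISS_MASK	0x01ffffffu

#define HVC_CALL_HYP		0
#define HVC_GET_VECTORS		1
#define HVC_SET_VECTORS		2

/* Hypothetical stand-in for the EL2 state the stub touches. */
struct el2_model {
	uint64_t vbar_el2;
};

/* ESR_EL2 value for "hvc #imm": EC = HVC64, IL (bit 25) set, ISS = imm. */
static uint32_t esr_for_hvc(uint16_t imm)
{
	return (ESR_ELx_EC_HVC64 << ESR_ELx_EC_SHIFT) | (1u << 25) | imm;
}

/* x0 is both the argument and the return register, so x0 is returned. */
static uint64_t stub_el1_sync(struct el2_model *el2, uint32_t esr, uint64_t x0)
{
	uint32_t ec = esr >> ESR_ELx_EC_SHIFT;
	uint32_t iss = esr & ESR_ELx_ISS_MASK;

	assert(el2 != 0);

	if (ec != ESR_ELx_EC_HVC64)
		return x0;			/* not an HVC trap */
	if (iss == HVC_GET_VECTORS)
		return el2->vbar_el2;		/* mrs x0, vbar_el2 */
	if (iss == HVC_SET_VECTORS)
		el2->vbar_el2 = x0;		/* msr vbar_el2, x0 */
	return x0;				/* other hcalls are ignored by the stub */
}
```

Because the hcall type now travels in the instruction's immediate, x0
is free to carry an argument for every hcall, which is what makes room
for further hcall types.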

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S          | 16 +++++++++-------
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df5252dd7..eb10368c329e 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f335c289..017ab519aaf1 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index e5836138ec42..073b8bf3daf7 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -29,6 +29,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/memory.h>
+#include <asm/virt.h>
 
 #define CPU_GP_REG_OFFSET(x)	(CPU_GP_REGS + x)
 #define CPU_XREG_OFFSET(x)	CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
@@ -924,12 +925,9 @@ __hyp_panic_str:
  * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c).  Return values are
  * passed in r0 and r1.
  *
- * A function pointer with a value of 0 has a special meaning, and is
- * used to implement __hyp_get_vectors in the same way as in
- * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(kvm_call_hyp)
 
@@ -960,6 +958,7 @@ el1_sync:					// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -968,15 +967,18 @@ el1_sync:					// Guest trapped into EL2
 	cbnz	x3, el1_trap			// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	pop	x2, x3
 	pop	x0, x1
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	push	lr, xzr
+1:	/* Default to HVC_CALL_HYP. */
+
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.1.4


* [PATCH v2 04/11] arm64: Add new hcall HVC_CALL_FUNC
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (2 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 03/11] arm64: Convert hcalls to use HVC immediate value James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 05/11] arm64: kvm: allows kvm cpu hotplug James Morse
                   ` (6 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which is
needed to support an hcall at CPU reset, this implementation uses register x18
to store the link register across the caller-provided function.  This dictates
that the caller-provided function must preserve the contents of register x18.
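
The register convention can be pictured with the C model below. It is
only a sketch: hvc_call_func(), addr_of() and sum3() are hypothetical
names, a C function pointer stands in for the physical address passed
in x0, and the x18/lr handling has no C equivalent, so it is not
modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Shape of a function reachable through HVC_CALL_FUNC in this model. */
typedef uint64_t (*el2_func_t)(uint64_t, uint64_t, uint64_t);

/* Hypothetical helper: turn a function pointer into an "x0" value. */
static uint64_t addr_of(el2_func_t fn)
{
	return (uint64_t)(uintptr_t)fn;
}

/*
 * Models the handler's shuffle: x0 names the function, and x1..x3
 * become its first, second and third arguments.  The result comes
 * back in x0.
 */
static uint64_t hvc_call_func(uint64_t x0, uint64_t x1, uint64_t x2,
			      uint64_t x3)
{
	el2_func_t fn = (el2_func_t)(uintptr_t)x0;

	assert(fn != 0);
	return fn(x1, x2, x3);
}

/* Example callee, used only to exercise the model. */
static uint64_t sum3(uint64_t a, uint64_t b, uint64_t c)
{
	return a + b + c;
}
```

In this model, hvc_call_func(addr_of(sum3), 1, 2, 3) hands 1, 2 and 3
to sum3() as its three arguments.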

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368c329e..30700961f28c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519aaf1..e8febe90c036 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -67,8 +67,19 @@ el1_sync:
 	b	2f
 
 1:	cmp	x18, #HVC_SET_VECTORS
-	b.ne	2f
+	b.ne	1f
 	msr	vbar_el2, x0
+	b	2f
+
+1:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	2f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
 
 2:	eret
 ENDPROC(el1_sync)
-- 
2.1.4


* [PATCH v2 05/11] arm64: kvm: allows kvm cpu hotplug
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (3 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 04/11] arm64: Add new hcall HVC_CALL_FUNC James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 06/11] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
                   ` (5 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shut down a core in terms of
kvm. In particular, this prevents kexec from rebooting the system on a
boot core in EL2.

This patch adds a cpu tear-down function and moves the existing cpu-init
code into a separate function: kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We no longer need an arm64-specific cpu hotplug hook.

Since this patch modifies code shared between arm and arm64, one stub
definition, __cpu_reset_hyp_mode(), is added on the arm side to avoid
compile errors.
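
The enable/disable pairing and the CPU PM handling in the diff that
follows can be summarised with the C model below. It is a sketch under
stated assumptions: all names are hypothetical, a boolean stands in for
the "__hyp_get_vectors() != hyp_default_vectors" test, and power_loss()
models the hardware dropping its EL2 state in a low-power state.

```c
#include <assert.h>
#include <stdbool.h>

enum pm_cmd { PM_ENTER, PM_EXIT };	/* stand-ins for CPU_PM_ENTER/EXIT */

struct cpu_kvm_model {
	bool hyp_vectors_installed;	/* false: the stub vectors are active */
	bool was_enabled;		/* models kvm_arm_hardware_enabled */
};

/* Models kvm_arch_hardware_enable(): does nothing if already set up. */
static void hardware_enable(struct cpu_kvm_model *cpu)
{
	if (cpu->hyp_vectors_installed)
		return;
	cpu->hyp_vectors_installed = true;
}

/* Models kvm_arch_hardware_disable(): does nothing if already torn down. */
static void hardware_disable(struct cpu_kvm_model *cpu)
{
	if (!cpu->hyp_vectors_installed)
		return;
	cpu->hyp_vectors_installed = false;
}

/* Models the hardware losing its EL2 configuration while powered down. */
static void power_loss(struct cpu_kvm_model *cpu)
{
	cpu->hyp_vectors_installed = false;
}

/* Models hyp_init_cpu_pm_notifier(): remember on entry, restore on exit. */
static void pm_notify(struct cpu_kvm_model *cpu, enum pm_cmd cmd)
{
	assert(cpu != 0);

	switch (cmd) {
	case PM_ENTER:
		cpu->was_enabled = cpu->hyp_vectors_installed;
		break;
	case PM_EXIT:
		if (cpu->was_enabled)
			hardware_enable(cpu);
		break;
	}
}
```

The point of the per-cpu flag is that a core which had been torn down
before entering a low-power state is not re-initialised on exit.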

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 10 ++++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
 arch/arm/kvm/mmu.c                |  5 +++
 arch/arm64/include/asm/kvm_host.h | 16 +++++++-
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/include/asm/virt.h     |  9 +++++
 arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
 arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
 9 files changed, 138 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index c4072d9f32c7..4c6a38cafd31 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -211,6 +211,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -223,7 +232,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa1883307..dc6fadfd0407 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 78b286994577..5005e6eafc55 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -915,7 +911,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+int kvm_arch_hardware_enable(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -923,6 +919,9 @@ static void cpu_init_hyp_mode(void *dummy)
 	unsigned long stack_page;
 	unsigned long vector_ptr;
 
+	if (__hyp_get_vectors() != hyp_default_vectors)
+		return 0;
+
 	/* Switch from the HYP stub to our own HYP init vector */
 	__hyp_set_vectors(kvm_get_idmap_vector());
 
@@ -935,38 +934,50 @@ static void cpu_init_hyp_mode(void *dummy)
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
 	kvm_arm_init_debug();
+
+	return 0;
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+void kvm_arch_hardware_disable(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
-	}
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
 
-	return NOTIFY_OK;
-}
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__hyp_get_vectors() != hyp_default_vectors)
+			__this_cpu_write(kvm_arm_hardware_enabled, 1);
+		else
+			__this_cpu_write(kvm_arm_hardware_enabled, 0);
+		/*
+		 * don't call kvm_arch_hardware_disable() in case of
+		 * CPU_PM_ENTER because it doesn't actually save any state.
+		 */
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			kvm_arch_hardware_enable();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1064,11 +1075,6 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
-	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
-
-	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
@@ -1142,26 +1148,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342da13d..69b4a33d232d 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index ed039688c221..e86421a6065b 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -220,6 +220,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -244,7 +245,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * Call reset code, and switch back to stub hyp vectors.
+	 */
+	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+}
+
+struct vgic_sr_vectors {
+	void	*save_vgic;
+	void	*restore_vgic;
+};
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 61505676d085..ff5a08777e11 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 30700961f28c..bca79f90178c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -58,9 +58,18 @@
 
 #define HVC_CALL_FUNC 3
 
+/*
+ * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
+ *
+ * @x0: entry address in trampoline code in va
+ * @x1: identical mapping page table in pa
+ */
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
+#define HVC_RESET_CPU 4
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 2e67a4872c51..192516332e47 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -139,6 +139,39 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1		// Clear SCTLR_EL2.M and the other flag bits
+	msr	sctlr_el2, x0
+	isb
+
+	/* Install stub vectors */
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 073b8bf3daf7..c9804ac3ac70 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -931,6 +931,11 @@ ENTRY(kvm_call_hyp)
 	ret
 ENDPROC(kvm_call_hyp)
 
+ENTRY(kvm_call_reset)
+	hvc	#HVC_RESET_CPU
+	ret
+ENDPROC(kvm_call_reset)
+
 .macro invalid_vector	label, target
 	.align	2
 \label:
@@ -974,10 +979,27 @@ el1_sync:					// Guest trapped into EL2
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
-
-1:	/* Default to HVC_CALL_HYP. */
+	b	do_eret
 
+	/* jump into trampoline code */
+1:	cmp	x18, #HVC_RESET_CPU
+	b.ne	2f
+	/*
+	 * Entry point is:
+	 *	TRAMPOLINE_VA
+	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
+	 */
+	adrp	x2, __kvm_hyp_reset
+	add	x2, x2, #:lo12:__kvm_hyp_reset
+	adrp	x3, __hyp_idmap_text_start
+	add	x3, x3, #:lo12:__hyp_idmap_text_start
+	and	x3, x3, PAGE_MASK
+	sub	x2, x2, x3
+	ldr	x3, =TRAMPOLINE_VA
+	add	x2, x2, x3
+	br	x2				// no return
+
+2:	/* Default to HVC_CALL_HYP. */
 	push	lr, xzr
 
 	/*
@@ -991,7 +1013,9 @@ el1_sync:					// Guest trapped into EL2
 	blr	lr
 
 	pop	lr, xzr
-2:	eret
+
+do_eret:
+	eret
 
 el1_trap:
 	/*
-- 
2.1.4

* [PATCH v2 06/11] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (4 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 05/11] arm64: kvm: allows kvm cpu hotplug James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 07/11] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hibernate could make use of the cpu_suspend() code to save/restore cpu
state; however, it needs to be able to return '0' from the 'finisher'.

Rework cpu_suspend() so that the finisher is called from C code,
independently from the save/restore of cpu state. Space to save the context
in is allocated in the caller's stack frame, and passed into
__cpu_suspend_enter().

Hibernate's use of this API will look like a copy of the cpu_suspend()
function.
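
A user-space analogy may help when reading the reworked control flow. The
sketch below is not kernel code: setjmp()/longjmp() stand in for
__cpu_suspend_enter() and cpu_resume(), and the names suspend_sketch,
saved_state and the finisher_* helpers are made up for illustration. Note
that setjmp() returns 0 on the save path, which is the opposite polarity to
__cpu_suspend_enter().

```c
#include <errno.h>
#include <setjmp.h>

/* Hypothetical stand-in for struct sleep_stack_data; the caller owns it. */
struct saved_state {
	jmp_buf regs;
};

/* Same shape as the finisher passed to cpu_suspend(). */
typedef int (*finisher_t)(unsigned long arg);

/* Where the "resume" path finds the saved state (cf. the stash). */
static struct saved_state *resume_target;

/* Finisher that "powers down": control only comes back via resume. */
int finisher_power_down(unsigned long arg)
{
	(void)arg;
	longjmp(resume_target->regs, 1);	/* plays the role of cpu_resume() */
	return 0;				/* not reached */
}

/* Finisher that did not power down but wrongly reports success. */
int finisher_returns_zero(unsigned long arg)
{
	(void)arg;
	return 0;
}

/* Finisher that fails with a real error code. */
int finisher_fails(unsigned long arg)
{
	(void)arg;
	return -EBUSY;
}

int suspend_sketch(unsigned long arg, finisher_t fn)
{
	struct saved_state state;	/* lives in the caller's frame */
	int ret;

	resume_target = &state;
	if (setjmp(state.regs) == 0) {
		/* Save path: state is stored, call the finisher from C. */
		ret = fn(arg);

		/* Returning here at all is a failure; never report 0. */
		if (!ret)
			ret = -EOPNOTSUPP;
	} else {
		/* Resume path: the equivalent of __cpu_suspend_exit(). */
		ret = 0;
	}
	return ret;
}
```

Because the finisher is called from C and the saved state sits in the
caller's frame, a second caller (hibernate, in this series) can copy this
shape and choose for itself how to treat a finisher that returns 0.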

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
Changes since v1:
 * Added/fixed two comments identified by Lorenzo
 * Rebased onto Lorenzo's tcr_el1.t0sz fix.

 arch/arm64/include/asm/suspend.h | 20 +++++++++
 arch/arm64/kernel/asm-offsets.c  |  2 +
 arch/arm64/kernel/sleep.S        | 93 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 89 ++++++++++++++++++++++----------------
 4 files changed, 108 insertions(+), 96 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 59a5b0f1e81c..ccd26da93d03 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -2,6 +2,7 @@
 #define __ASM_SUSPEND_H
 
 #define NR_CTX_REGS 11
+#define NR_CALLEE_SAVED_REGS 12
 
 /*
  * struct cpu_suspend_ctx must be 16-byte aligned since it is allocated on
@@ -21,6 +22,25 @@ struct sleep_save_sp {
 	phys_addr_t save_ptr_stash_phys;
 };
 
+/*
+ * Memory to save the cpu state is allocated on the stack by
+ * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
+ * This data must survive until cpu_resume() is called.
+ *
+ * This struct describes the size and the layout of the saved cpu state.
+ * The layout of the callee_saved_regs is defined by the implementation
+ * of __cpu_suspend_enter(), and cpu_resume(). This struct must be passed
+ * in by the caller as __cpu_suspend_enter()'s stack-frame is gone once it
+ * returns, and the data would be subsequently corrupted by the call to the
+ * finisher.
+ */
+struct sleep_stack_data {
+	struct cpu_suspend_ctx	system_regs;
+	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
+};
+
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
+int __cpu_suspend_enter(struct sleep_stack_data *state);
+void __cpu_suspend_exit(struct mm_struct *mm);
 #endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 8d89cf8dae55..5daa4e692932 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -160,6 +160,8 @@ int main(void)
   DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
   DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
   DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
+  DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
+  DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index f586f7c875e2..1fa40573db13 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -49,37 +49,30 @@
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	.endm
 /*
- * Save CPU state for a suspend and execute the suspend finisher.
- * On success it will return 0 through cpu_resume - ie through a CPU
- * soft/hard reboot from the reset vector.
- * On failure it returns the suspend finisher return value or force
- * -EOPNOTSUPP if the finisher erroneously returns 0 (the suspend finisher
- * is not allowed to return, if it does this must be considered failure).
- * It saves callee registers, and allocates space on the kernel stack
- * to save the CPU specific registers + some other data for resume.
+ * Save CPU state in the provided sleep_stack_data area, and publish its
+ * location for cpu_resume()'s use in sleep_save_stash.
  *
- *  x0 = suspend finisher argument
- *  x1 = suspend finisher function pointer
+ * cpu_resume() will restore this saved state, and return. Because the
+ * link-register is saved and restored, it will appear to return from this
+ * function. So that the caller can tell the suspend/resume paths apart,
+ * __cpu_suspend_enter() will always return a non-zero value, whereas the
+ * path through cpu_resume() will return 0.
+ *
+ *  x0 = struct sleep_stack_data area
  */
 ENTRY(__cpu_suspend_enter)
-	stp	x29, lr, [sp, #-96]!
-	stp	x19, x20, [sp,#16]
-	stp	x21, x22, [sp,#32]
-	stp	x23, x24, [sp,#48]
-	stp	x25, x26, [sp,#64]
-	stp	x27, x28, [sp,#80]
-	/*
-	 * Stash suspend finisher and its argument in x20 and x19
-	 */
-	mov	x19, x0
-	mov	x20, x1
+	stp	x29, lr, [x0, #SLEEP_STACK_DATA_CALLEE_REGS]
+	stp	x19, x20, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+16]
+	stp	x21, x22, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+32]
+	stp	x23, x24, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+48]
+	stp	x25, x26, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+64]
+	stp	x27, x28, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+80]
+
+	/* save the sp in cpu_suspend_ctx */
 	mov	x2, sp
-	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
-	mov	x0, sp
-	/*
-	 * x0 now points to struct cpu_suspend_ctx allocated on the stack
-	 */
-	str	x2, [x0, #CPU_CTX_SP]
+	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
+
+	/* find the mpidr_hash */
 	ldr	x1, =sleep_save_sp
 	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
 	mrs	x7, mpidr_el1
@@ -93,34 +86,11 @@ ENTRY(__cpu_suspend_enter)
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
+
+	push	x29, lr
 	bl	__cpu_suspend_save
-	/*
-	 * Grab suspend finisher in x20 and its argument in x19
-	 */
-	mov	x0, x19
-	mov	x1, x20
-	/*
-	 * We are ready for power down, fire off the suspend finisher
-	 * in x1, with argument in x0
-	 */
-	blr	x1
-        /*
-	 * Never gets here, unless suspend finisher fails.
-	 * Successful cpu_suspend should return from cpu_resume, returning
-	 * through this code path is considered an error
-	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
-	 * to make sure a proper error condition is propagated
-	 */
-	cmp	x0, #0
-	mov	x3, #-EOPNOTSUPP
-	csel	x0, x3, x0, eq
-	add	sp, sp, #CPU_SUSPEND_SZ	// rewind stack pointer
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
+	pop	x29, lr
+	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
@@ -146,12 +116,6 @@ ENDPROC(cpu_resume_mmu)
 	.popsection
 cpu_resume_after_mmu:
 	mov	x0, #0			// return zero on success
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
 	ret
 ENDPROC(cpu_resume_after_mmu)
 
@@ -168,6 +132,8 @@ ENTRY(cpu_resume)
         /* x7 contains hash index, let's use it to grab context pointer */
 	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	x0, [x0, x7, lsl #3]
+	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
 	/* load physical address of identity map page table in x1 */
@@ -178,5 +144,12 @@ ENTRY(cpu_resume)
 	 * pointer and x1 to contain physical address of 1:1 page tables
 	 */
 	bl	cpu_do_resume		// PC relative jump, MMU off
+	/* Can't access these by physical address once the MMU is on */
+	ldp	x19, x20, [x29, #16]
+	ldp	x21, x22, [x29, #32]
+	ldp	x23, x24, [x29, #48]
+	ldp	x25, x26, [x29, #64]
+	ldp	x27, x28, [x29, #80]
+	ldp	x29, lr, [x29]
 	b	cpu_resume_mmu		// Resume MMU, never returns
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 44ca4143b013..eeb088d15bcb 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -9,22 +9,22 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 
-extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long));
+
 /*
  * This is called by __cpu_suspend_enter() to save the state, and do whatever
  * flushing is required to ensure that when the CPU goes to sleep we have
  * the necessary data available when the caches are not searched.
  *
- * ptr: CPU context virtual address
+ * ptr: sleep_stack_data containing cpu state virtual address.
  * save_ptr: address of the location where the context physical address
  *           must be saved
  */
-void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr,
+void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
 				phys_addr_t *save_ptr)
 {
 	*save_ptr = virt_to_phys(ptr);
 
-	cpu_do_suspend(ptr);
+	cpu_do_suspend(&ptr->system_regs);
 	/*
 	 * Only flush the context that must be retrieved with the MMU
 	 * off. VA primitives ensure the flush is applied to all
@@ -50,6 +50,41 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 	hw_breakpoint_restore = hw_bp_restore;
 }
 
+void notrace __cpu_suspend_exit(struct mm_struct *mm)
+{
+	/*
+	 * We are resuming from reset with TTBR0_EL1 set to the
+	 * idmap to enable the MMU; set the TTBR0 to the reserved
+	 * page tables to prevent speculative TLB allocations, flush
+	 * the local tlb and set the default tcr_el1.t0sz so that
+	 * the TTBR0 address space set-up is properly restored.
+	 * If the current active_mm != &init_mm we entered cpu_suspend
+	 * with mappings in TTBR0 that must be restored, so we switch
+	 * them back to complete the address space configuration
+	 * restoration before returning.
+	 */
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+	cpu_set_default_tcr_t0sz();
+
+	if (mm != &init_mm)
+		cpu_switch_mm(mm->pgd, mm);
+
+	/*
+	 * Restore per-cpu offset before any kernel
+	 * subsystem relying on it has a chance to run.
+	 */
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+
+	/*
+	 * Restore HW breakpoint registers to sane values
+	 * before debug exceptions are possibly reenabled
+	 * through local_dbg_restore.
+	 */
+	if (hw_breakpoint_restore)
+		hw_breakpoint_restore(NULL);
+}
+
 /*
  * cpu_suspend
  *
@@ -60,8 +95,9 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 {
 	struct mm_struct *mm = current->active_mm;
-	int ret;
+	int ret = 0;
 	unsigned long flags;
+	struct sleep_stack_data state;
 
 	/*
 	 * From this point debug exceptions are disabled to prevent
@@ -76,40 +112,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * page tables, so that the thread address space is properly
 	 * set-up on function return.
 	 */
-	ret = __cpu_suspend_enter(arg, fn);
-	if (ret == 0) {
-		/*
-		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; set the TTBR0 to the reserved
-		 * page tables to prevent speculative TLB allocations, flush
-		 * the local tlb and set the default tcr_el1.t0sz so that
-		 * the TTBR0 address space set-up is properly restored.
-		 * If the current active_mm != &init_mm we entered cpu_suspend
-		 * with mappings in TTBR0 that must be restored, so we switch
-		 * them back to complete the address space configuration
-		 * restoration before returning.
-		 */
-		cpu_set_reserved_ttbr0();
-		flush_tlb_all();
-		cpu_set_default_tcr_t0sz();
-
-		if (mm != &init_mm)
-			cpu_switch_mm(mm->pgd, mm);
-
-		/*
-		 * Restore per-cpu offset before any kernel
-		 * subsystem relying on it has a chance to run.
-		 */
-		set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	if (__cpu_suspend_enter(&state)) {
+		/* Call the suspend finisher */
+		ret = fn(arg);
 
 		/*
-		 * Restore HW breakpoint registers to sane values
-		 * before debug exceptions are possibly reenabled
-		 * through local_dbg_restore.
+		 * Never gets here, unless suspend finisher fails.
+		 * Successful cpu_suspend should return from cpu_resume,
+		 * returning through this code path is considered an error
+		 * If the return value is set to 0 force ret = -EOPNOTSUPP
+		 * to make sure a proper error condition is propagated
 		 */
-		if (hw_breakpoint_restore)
-			hw_breakpoint_restore(NULL);
-	}
+		if (!ret)
+			ret = -EOPNOTSUPP;
+	} else
+		__cpu_suspend_exit(mm);
 
 	/*
 	 * Restore pstate flags. OS lock and mdscr have been already
-- 
2.1.4

* [PATCH v2 07/11] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (5 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 06/11] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 08/11] arm64: kernel: Include _AC definition in page.h James Morse
                   ` (3 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack can
be accessed by VA, which avoids the need to convert addresses and clean to
PoC on the suspend path.

MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.

struct sleep_save_sp is removed and replaced with a single array of
pointers.

cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However, these values all contain res0 bits that may be used
to enable future features.
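
For readers unfamiliar with the stash, here is a rough C model of "a single
array of pointers" indexed by an MPIDR hash. It is a sketch under
assumptions: the hash arithmetic follows my reading of the
compute_mpidr_hash macro in sleep.S, the struct only mirrors the mask and
shift fields that macro consumes, and example_hash describes a made-up
2-cluster, 4-CPU-per-cluster system. All names ending in _sketch, plus
example_hash and the stash_* helpers, are hypothetical.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical mirror of the fields used by the hash computation. */
struct mpidr_hash_sketch {
	uint64_t mask;		/* MPIDR bits that differ between CPUs */
	uint32_t shift_aff[4];	/* right-shift applied to each affinity level */
};

/* Made-up topology: Aff0 uses bits [1:0], Aff1 uses bit 8. */
const struct mpidr_hash_sketch example_hash = {
	.mask = 0x103,
	.shift_aff = { 0, 6, 0, 0 },
};

/* Pack the significant affinity bits into a dense index. */
uint64_t mpidr_hash_index(const struct mpidr_hash_sketch *h, uint64_t mpidr)
{
	uint64_t m = mpidr & h->mask;
	uint64_t idx;

	idx  = (m & 0xffULL) >> h->shift_aff[0];		/* Aff0 */
	idx |= (m & 0xff00ULL) >> h->shift_aff[1];		/* Aff1 */
	idx |= (m & 0xff0000ULL) >> h->shift_aff[2];		/* Aff2 */
	idx |= (m & 0xff00000000ULL) >> h->shift_aff[3];	/* Aff3 */
	return idx;
}

/* One pointer per hash bucket, accessed by virtual address only. */
static void **stash;
static size_t stash_buckets;

int stash_init(size_t buckets)
{
	stash = calloc(buckets, sizeof(*stash));
	if (!stash)
		return -1;
	stash_buckets = buckets;
	return 0;
}

/* Suspend side: publish where this CPU's state was saved. */
int stash_publish(uint64_t idx, void *state)
{
	if (idx >= stash_buckets)
		return -1;
	stash[idx] = state;
	return 0;
}

/* Resume side: find the state again from the same index. */
void *stash_lookup(uint64_t idx)
{
	return idx < stash_buckets ? stash[idx] : NULL;
}
```

With the MMU already enabled on the resume side, both publish and lookup use
the same virtual addresses, so the model needs no physical-address copy of
the array.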

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/suspend.h |  7 +----
 arch/arm64/kernel/asm-offsets.c  |  3 ---
 arch/arm64/kernel/head.S         |  2 +-
 arch/arm64/kernel/setup.c        |  1 -
 arch/arm64/kernel/sleep.S        | 57 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 52 +++---------------------------------
 arch/arm64/mm/proc.S             | 27 +++++--------------
 7 files changed, 33 insertions(+), 116 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index ccd26da93d03..5faa3ce1fa3a 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
-#define NR_CTX_REGS 11
+#define NR_CTX_REGS 10
 #define NR_CALLEE_SAVED_REGS 12
 
 /*
@@ -17,11 +17,6 @@ struct cpu_suspend_ctx {
 	u64 sp;
 } __aligned(16);
 
-struct sleep_save_sp {
-	phys_addr_t *save_ptr_stash;
-	phys_addr_t save_ptr_stash_phys;
-};
-
 /*
  * Memory to save the cpu state is allocated on the stack by
  * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 5daa4e692932..3cb1383d3deb 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -157,9 +157,6 @@ int main(void)
   DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
   DEFINE(MPIDR_HASH_MASK,	offsetof(struct mpidr_hash, mask));
   DEFINE(MPIDR_HASH_SHIFTS,	offsetof(struct mpidr_hash, shift_aff));
-  DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
-  DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
-  DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 9ad8b1f15b19..cf4e0bdf6533 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -630,7 +630,7 @@ ENDPROC(__secondary_switched)
  * other registers depend on the function called upon completion
  */
 	.section	".idmap.text", "ax"
-__enable_mmu:
+ENTRY(__enable_mmu)
 	ldr	x5, =vectors
 	msr	vbar_el1, x5
 	msr	ttbr0_el1, x25			// load TTBR0
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 232247945b1c..5a338235ba1a 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -192,7 +192,6 @@ static void __init smp_build_mpidr_hash(void)
 	 */
 	if (mpidr_hash_size() > 4 * num_possible_cpus())
 		pr_warn("Large number of MPIDR hash buckets detected\n");
-	__flush_dcache_area(&mpidr_hash, sizeof(struct mpidr_hash));
 }
 
 static void __init setup_processor(void)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 1fa40573db13..07e005d756b0 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -73,8 +73,8 @@ ENTRY(__cpu_suspend_enter)
 	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
 
 	/* find the mpidr_hash */
-	ldr	x1, =sleep_save_sp
-	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
+	ldr	x1, =sleep_save_stash
+	ldr	x1, [x1]
 	mrs	x7, mpidr_el1
 	ldr	x9, =mpidr_hash
 	ldr	x10, [x9, #MPIDR_HASH_MASK]
@@ -87,40 +87,26 @@ ENTRY(__cpu_suspend_enter)
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
 
+	str	x0, [x1]
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	push	x29, lr
-	bl	__cpu_suspend_save
+	bl	cpu_do_suspend
 	pop	x29, lr
 	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
 
-/*
- * x0 must contain the sctlr value retrieved from restored context
- */
-	.pushsection	".idmap.text", "ax"
-ENTRY(cpu_resume_mmu)
-	ldr	x3, =cpu_resume_after_mmu
-	msr	sctlr_el1, x0		// restore sctlr_el1
-	isb
-	/*
-	 * Invalidate the local I-cache so that any instructions fetched
-	 * speculatively from the PoC are discarded, since they may have
-	 * been dynamically patched at the PoU.
-	 */
-	ic	iallu
-	dsb	nsh
-	isb
-	br	x3			// global jump to virtual address
-ENDPROC(cpu_resume_mmu)
-	.popsection
-cpu_resume_after_mmu:
-	mov	x0, #0			// return zero on success
-	ret
-ENDPROC(cpu_resume_after_mmu)
-
 ENTRY(cpu_resume)
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
+	/* enable the MMU early - so we can access sleep_save_stash by va */
+	adr_l	lr, __enable_mmu	/* __cpu_setup will return here */
+	ldr	x27, =_cpu_resume	/* __enable_mmu will branch here */
+	adrp	x25, idmap_pg_dir
+	adrp	x26, swapper_pg_dir
+	b	__cpu_setup
+
+ENTRY(_cpu_resume)
 	mrs	x1, mpidr_el1
 	adrp	x8, mpidr_hash
 	add x8, x8, #:lo12:mpidr_hash // x8 = struct mpidr_hash phys address
@@ -130,26 +116,23 @@ ENTRY(cpu_resume)
 	ldp	w5, w6, [x8, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x7, x3, x4, x5, x6, x1, x2
         /* x7 contains hash index, let's use it to grab context pointer */
-	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
+	ldr_l	x0, sleep_save_stash
 	ldr	x0, [x0, x7, lsl #3]
 	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
 	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	/* load physical address of identity map page table in x1 */
-	adrp	x1, idmap_pg_dir
 	mov	sp, x2
-	/*
-	 * cpu_do_resume expects x0 to contain context physical address
-	 * pointer and x1 to contain physical address of 1:1 page tables
-	 */
-	bl	cpu_do_resume		// PC relative jump, MMU off
-	/* Can't access these by physical address once the MMU is on */
+	bl	cpu_do_resume
+	msr	sctlr_el1, x0
+	isb
+
 	ldp	x19, x20, [x29, #16]
 	ldp	x21, x22, [x29, #32]
 	ldp	x23, x24, [x29, #48]
 	ldp	x25, x26, [x29, #64]
 	ldp	x27, x28, [x29, #80]
 	ldp	x29, lr, [x29]
-	b	cpu_resume_mmu		// Resume MMU, never returns
+	mov	x0, #0
+	ret
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index eeb088d15bcb..03ef74c02939 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -11,30 +11,6 @@
 
 
 /*
- * This is called by __cpu_suspend_enter() to save the state, and do whatever
- * flushing is required to ensure that when the CPU goes to sleep we have
- * the necessary data available when the caches are not searched.
- *
- * ptr: sleep_stack_data containing cpu state virtual address.
- * save_ptr: address of the location where the context physical address
- *           must be saved
- */
-void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
-				phys_addr_t *save_ptr)
-{
-	*save_ptr = virt_to_phys(ptr);
-
-	cpu_do_suspend(&ptr->system_regs);
-	/*
-	 * Only flush the context that must be retrieved with the MMU
-	 * off. VA primitives ensure the flush is applied to all
-	 * cache levels so context is pushed to DRAM.
-	 */
-	__flush_dcache_area(ptr, sizeof(*ptr));
-	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
-}
-
-/*
  * This hook is provided so that cpu_suspend code can restore HW
  * breakpoints as early as possible in the resume path, before reenabling
  * debug exceptions. Code cannot be run from a CPU PM notifier since by the
@@ -52,21 +28,6 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 
 void notrace __cpu_suspend_exit(struct mm_struct *mm)
 {
-	/*
-	 * We are resuming from reset with TTBR0_EL1 set to the
-	 * idmap to enable the MMU; set the TTBR0 to the reserved
-	 * page tables to prevent speculative TLB allocations, flush
-	 * the local tlb and set the default tcr_el1.t0sz so that
-	 * the TTBR0 address space set-up is properly restored.
-	 * If the current active_mm != &init_mm we entered cpu_suspend
-	 * with mappings in TTBR0 that must be restored, so we switch
-	 * them back to complete the address space configuration
-	 * restoration before returning.
-	 */
-	cpu_set_reserved_ttbr0();
-	flush_tlb_all();
-	cpu_set_default_tcr_t0sz();
-
 	if (mm != &init_mm)
 		cpu_switch_mm(mm->pgd, mm);
 
@@ -138,22 +99,17 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	return ret;
 }
 
-struct sleep_save_sp sleep_save_sp;
+unsigned long *sleep_save_stash;
 
 static int __init cpu_suspend_init(void)
 {
-	void *ctx_ptr;
-
 	/* ctx_ptr is an array of physical addresses */
-	ctx_ptr = kcalloc(mpidr_hash_size(), sizeof(phys_addr_t), GFP_KERNEL);
+	sleep_save_stash = kcalloc(mpidr_hash_size(), sizeof(*sleep_save_stash),
+				   GFP_KERNEL);
 
-	if (WARN_ON(!ctx_ptr))
+	if (WARN_ON(!sleep_save_stash))
 		return -ENOMEM;
 
-	sleep_save_sp.save_ptr_stash = ctx_ptr;
-	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
-	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
-
 	return 0;
 }
 early_initcall(cpu_suspend_init);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 456c1c5f8ecd..b3afb6123c81 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -59,20 +59,17 @@ ENTRY(cpu_do_suspend)
 	mrs	x2, tpidr_el0
 	mrs	x3, tpidrro_el0
 	mrs	x4, contextidr_el1
-	mrs	x5, mair_el1
 	mrs	x6, cpacr_el1
-	mrs	x7, ttbr1_el1
 	mrs	x8, tcr_el1
 	mrs	x9, vbar_el1
 	mrs	x10, mdscr_el1
 	mrs	x11, oslsr_el1
 	mrs	x12, sctlr_el1
 	stp	x2, x3, [x0]
-	stp	x4, x5, [x0, #16]
-	stp	x6, x7, [x0, #32]
-	stp	x8, x9, [x0, #48]
-	stp	x10, x11, [x0, #64]
-	str	x12, [x0, #80]
+	stp	x4, xzr, [x0, #16]
+	stp	x6, x8, [x0, #32]
+	stp	x9, x10, [x0, #48]
+	stp	x11, x12, [x0, #64]
 	ret
 ENDPROC(cpu_do_suspend)
 
@@ -80,29 +77,20 @@ ENDPROC(cpu_do_suspend)
  * cpu_do_resume - restore CPU register context
  *
  * x0: Physical address of context pointer
- * x1: ttbr0_el1 to be restored
  *
  * Returns:
  *	sctlr_el1 value in x0
  */
 ENTRY(cpu_do_resume)
-	/*
-	 * Invalidate local tlb entries before turning on MMU
-	 */
-	tlbi	vmalle1
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
-	ldp	x6, x7, [x0, #32]
-	ldp	x8, x9, [x0, #48]
-	ldp	x10, x11, [x0, #64]
-	ldr	x12, [x0, #80]
+	ldp	x6, x8, [x0, #32]
+	ldp	x9, x10, [x0, #48]
+	ldp	x11, x12, [x0, #64]
 	msr	tpidr_el0, x2
 	msr	tpidrro_el0, x3
 	msr	contextidr_el1, x4
-	msr	mair_el1, x5
 	msr	cpacr_el1, x6
-	msr	ttbr0_el1, x1
-	msr	ttbr1_el1, x7
 	tcr_set_idmap_t0sz x8, x7
 	msr	tcr_el1, x8
 	msr	vbar_el1, x9
@@ -113,7 +101,6 @@ ENTRY(cpu_do_resume)
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	mov	x0, x12
-	dsb	nsh		// Make sure local tlb invalidation completed
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.1.4

* [PATCH v2 08/11] arm64: kernel: Include _AC definition in page.h
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (6 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 07/11] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 09/11] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
                   ` (2 subsequent siblings)
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.
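
As a reminder of why the missing include matters, the sketch below models
what _AC does. It is an illustration, not a copy of the kernel headers: the
macros are modelled on _AC in linux/const.h but carry a SKETCH_ prefix to
avoid clashing with system headers, and the 4K page shift and
page_align_down() helper are assumptions made for the example.

```c
/*
 * In assembly the type suffix must be dropped, because the assembler does
 * not understand "1UL". In C the suffix is pasted on so the constant has
 * the intended type.
 */
#ifdef __ASSEMBLY__
#define SKETCH_AC(X, Y)		X
#else
#define SKETCH__AC(X, Y)	(X##Y)
#define SKETCH_AC(X, Y)		SKETCH__AC(X, Y)
#endif

/* Assumed 4K pages for the example. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(SKETCH_AC(1, UL) << SKETCH_PAGE_SHIFT)
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))

/* Hypothetical helper: round an address down to its page boundary. */
unsigned long page_align_down(unsigned long addr)
{
	return addr & SKETCH_PAGE_MASK;
}
```

Without the definition of the suffix macro in scope, a use of PAGE_SIZE in a
.S file would expand to an unresolved macro call, which is why the header
that uses it should include the header that defines it.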

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/page.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 7d9c7e4a424b..74f67d049a63 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -19,6 +19,8 @@
 #ifndef __ASM_PAGE_H
 #define __ASM_PAGE_H
 
+#include <linux/const.h>
+
 /* PAGE_SHIFT determines the page size */
 #ifdef CONFIG_ARM64_64K_PAGES
 #define PAGE_SHIFT		16
-- 
2.1.4

* [PATCH v2 09/11] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (7 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 08/11] arm64: kernel: Include _AC definition in page.h James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
  2015-10-27 17:29 ` [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
  10 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

KERNEL_START and KERNEL_END are useful outside head.S; move them to a
header file.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/memory.h | 3 +++
 arch/arm64/kernel/head.S        | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 6b4c3ad75a2a..1383491f3db4 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -72,6 +72,9 @@
 #error Top of 64-bit user space clashes with start of module space
 #endif
 
+#define KERNEL_START      _text
+#define KERNEL_END        _end
+
 /*
  * Physical vs virtual RAM address space conversion.  These are
  * private definitions which should NOT be used outside memory.h
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index cf4e0bdf6533..537406273f96 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -55,9 +55,6 @@
 #define TABLE_SHIFT	PUD_SHIFT
 #endif
 
-#define KERNEL_START	_text
-#define KERNEL_END	_end
-
 /*
  * Initial memory map attributes.
  */
-- 
2.1.4

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (8 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 09/11] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-11-11 11:40   ` Lorenzo Pieralisi
                     ` (2 more replies)
  2015-10-27 17:29 ` [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
  10 siblings, 3 replies; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches so that the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Add a call to flush_icache_range(), which is provided by
architectures that require it, to perform the maintenance.

This mirrors the kernel's behaviour when loading kernel modules and when
mapping executable pages to user space.

Signed-off-by: James Morse <james.morse@arm.com>
---
 kernel/power/snapshot.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 5235dd4e1e2f..139fc449ad75 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -31,6 +31,7 @@
 #include <linux/ktime.h>
 
 #include <asm/uaccess.h>
+#include <asm/cacheflush.h>
 #include <asm/mmu_context.h>
 #include <asm/pgtable.h>
 #include <asm/tlbflush.h>
@@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
 static inline void do_copy_page(long *dst, long *src)
 {
 	int n;
+	unsigned long __maybe_unused start = (unsigned long)dst;
 
 	for (n = PAGE_SIZE / sizeof(long); n; n--)
 		*dst++ = *src++;
+
+	flush_icache_range(start, start+PAGE_SIZE);
 }
 
 
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
                   ` (9 preceding siblings ...)
  2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
@ 2015-10-27 17:29 ` James Morse
  2015-11-14 21:34   ` Pavel Machek
  10 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2015-10-27 17:29 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for hibernate/suspend-to-disk.

Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.

Restore creates a set of temporary page tables, covering the kernel and the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. It calls into cpu_resume(), and then follows the
normal cpu_suspend() path back into the suspend code.

The implementation assumes that exactly the same kernel is booted on the
same hardware, and that the kernel is loaded at the same physical address.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig                |   3 +
 arch/arm64/include/asm/suspend.h  |   8 +
 arch/arm64/kernel/Makefile        |   1 +
 arch/arm64/kernel/asm-offsets.c   |   4 +
 arch/arm64/kernel/hibernate-asm.S | 118 +++++++++++++
 arch/arm64/kernel/hibernate.c     | 359 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S   |  15 ++
 7 files changed, 508 insertions(+)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 07d1811aa03f..d081dbc35335 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -707,6 +707,9 @@ menu "Power management options"
 
 source "kernel/power/Kconfig"
 
+config ARCH_HIBERNATION_POSSIBLE
+	def_bool y
+
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 5faa3ce1fa3a..e75ad7aa268c 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,3 +1,5 @@
+#include <linux/suspend.h>
+
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
@@ -34,6 +36,12 @@ struct sleep_stack_data {
 	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
 };
 
+extern int swsusp_arch_suspend(void);
+extern int swsusp_arch_resume(void);
+int swsusp_arch_suspend_enter(struct cpu_suspend_ctx *ptr);
+void __noreturn swsusp_arch_suspend_exit(phys_addr_t tmp_pg_dir,
+					 phys_addr_t swapper_pg_dir,
+					 void *kernel_start, void *kernel_end);
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 int __cpu_suspend_enter(struct sleep_stack_data *state);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 22dc9bc781be..b9151ae4a7ae 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -36,6 +36,7 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-stub.o efi-entry.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 3cb1383d3deb..b5d9495a94a1 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
+#include <linux/suspend.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -160,5 +161,8 @@ int main(void)
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
+  DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
+  DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
+  DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   return 0;
 }
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
new file mode 100644
index 000000000000..2cead4779804
--- /dev/null
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -0,0 +1,118 @@
+#include <linux/linkage.h>
+#include <linux/errno.h>
+
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+#include <asm/cputype.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+
+/*
+ * Corrupt memory.
+ *
+ * Loads temporary page tables then restores the memory image.
+ * Finally branches to cpu_resume() to restore the state saved by
+ * swsusp_arch_suspend().
+ *
+ * Because this code has to be copied to a safe_page, it can't call out to
+ * other functions by pc-relative address. Also remember that it may be
+ * mid-way through over-writing other functions. For this reason it contains
+ * a copy of copy_page() and code from flush_icache_range().
+ *
+ * All of memory gets written to, including code. We need to clean the kernel
+ * text to the PoC before secondary cores can be booted. Because kernel modules,
+ * and executable pages mapped to user space are also written as data, we
+ * clean all pages we touch to the PoU.
+ *
+ * x0: physical address of temporary page tables
+ * x1: physical address of swapper page tables
+ * x2: address of kernel_start
+ * x3: address of kernel_end
+ */
+.pushsection    ".hibernate_exit.text", "ax"
+ENTRY(swsusp_arch_suspend_exit)
+	/* Temporary page tables are a copy, so no need for a trampoline here */
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	isb
+
+	mov	x21, x1
+	mov	x22, x2
+	mov	x23, x3
+
+	/* walk the restore_pblist and use copy_page() to over-write memory */
+	ldr	x19, =restore_pblist
+	ldr	x19, [x19]
+
+2:	ldr	x10, [x19, #HIBERN_PBE_ORIG]
+	mov	x0, x10
+	ldr	x1, [x19, #HIBERN_PBE_ADDR]
+
+	/* arch/arm64/lib/copy_page.S:copy_page() */
+	prfm	pldl1strm, [x1, #64]
+3:	ldp	x2, x3, [x1]
+	ldp	x4, x5, [x1, #16]
+	ldp	x6, x7, [x1, #32]
+	ldp	x8, x9, [x1, #48]
+	add	x1, x1, #64
+	prfm	pldl1strm, [x1, #64]
+	stnp	x2, x3, [x0]
+	stnp	x4, x5, [x0, #16]
+	stnp	x6, x7, [x0, #32]
+	stnp	x8, x9, [x0, #48]
+	add	x0, x0, #64
+	tst	x1, #(PAGE_SIZE - 1)
+	b.ne	3b
+
+	dsb	ish		//  memory restore must finish before cleaning
+
+	add	x1, x10, #PAGE_SIZE
+	/* Clean the copied page to PoU - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x10, x3
+4:	dc	cvau, x4	// clean D line / unified line
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	4b
+
+	ldr	x19, [x19, #HIBERN_PBE_NEXT]
+	cbnz	x19, 2b
+
+	/* Clean the kernel text to PoC - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x22, x3
+5:	dc	cvac, x4
+	add	x4, x4, x2
+	cmp	x4, x23
+	b.lo	5b
+
+	/*
+	 * branch into the restored kernel - so that when we restore the page
+	 * tables, code continues to be executable.
+	 */
+	ldr	x1, =__hibernate_exit
+	mov	x0, x21		// physical address of swapper page tables.
+	br	x1
+
+	.ltorg
+ENDPROC(swsusp_arch_suspend_exit)
+.popsection
+
+/*
+ * Reset the page tables, and wake up in cpu_resume().
+ * Temporary page tables were a copy, so again, no trampoline here.
+ *
+ * x0: physical address of swapper_pg_dir
+ */
+ENTRY(__hibernate_exit)
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	isb
+	b	_cpu_resume
+ENDPROC(__hibernate_exit)
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
new file mode 100644
index 000000000000..40eb55bcee15
--- /dev/null
+++ b/arch/arm64/kernel/hibernate.c
@@ -0,0 +1,359 @@
+/*
+ * Hibernate support specific for ARM64
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *  https://lkml.org/lkml/2010/6/18/4
+ *  https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *  https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+#define pr_fmt(x) "hibernate: " x
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/version.h>
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/irqflags.h>
+#include <asm/memory.h>
+#include <asm/mmu_context.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/sections.h>
+#include <asm/suspend.h>
+
+/* These are necessary to build without ifdefery */
+#ifndef pmd_index
+#define pmd_index(x)	0
+#endif
+#ifndef pud_index
+#define pud_index(x)	0
+#endif
+
+/*
+ * Start/end of the hibernate exit code, this must be copied to a 'safe'
+ * location in memory, and executed from there.
+ */
+extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = virt_to_pfn(&__nosave_begin);
+	unsigned long nosave_end_pfn = virt_to_pfn(&__nosave_end - 1);
+
+	return (pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	local_fiq_enable();
+}
+
+/*
+ * Copies length bytes, starting at src_start, into a new page,
+ * performs cache maintenance, then maps it at the top of memory as executable.
+ *
+ * This is used by hibernate to copy the code it needs to execute when
+ * overwriting the kernel text.
+ *
+ * Suggested allocators are get_safe_page() or get_zeroed_page(). Your chosen
+ * mask must cause zero'd pages to be returned.
+ */
+static int create_safe_exec_page(void *src_start, size_t length,
+				 void **dst_addr,
+				 unsigned long (*allocator)(gfp_t mask),
+				 gfp_t mask)
+{
+	int rc = 0;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	unsigned long dst = allocator(mask);
+	if (!dst) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	memcpy((void *)dst, src_start, length);
+	flush_icache_range(dst, dst + length);
+
+	pgd = pgd_offset(&init_mm, (unsigned long)-1);
+	if (!pgd_val(*pgd) && PTRS_PER_PGD > 1) {
+		pud = (pud_t *)allocator(mask);
+		if (!pud) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pgd(pgd, __pgd(virt_to_phys(pud) | PUD_TYPE_TABLE));
+	}
+
+	pud = pud_offset(pgd, (unsigned long)-1);
+	if (!pud_val(*pud) && PTRS_PER_PUD > 1) {
+		pmd = (pmd_t *)allocator(mask);
+		if (!pmd) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pud(pud, __pud(virt_to_phys(pmd) | PUD_TYPE_TABLE));
+	}
+
+	pmd = pmd_offset(pud, (unsigned long)-1);
+	if (!pmd_val(*pmd) && PTRS_PER_PMD > 1) {
+		pte = (pte_t *)allocator(mask);
+		if (!pte) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pmd(pmd, __pmd(virt_to_phys(pte) | PMD_TYPE_TABLE));
+	}
+
+	pte = pte_offset_kernel(pmd, (unsigned long)-1);
+	set_pte_at(&init_mm, dst, pte,
+		   __pte(virt_to_phys((void *)dst) | PAGE_KERNEL_EXEC));
+
+	/* this is a new mapping, so no need for a tlbi */
+
+	*dst_addr = (void *)((unsigned long)-1 & PAGE_MASK);
+
+out:
+	return rc;
+}
+
+
+int swsusp_arch_suspend(void)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct sleep_stack_data state;
+	struct mm_struct *mm = current->active_mm;
+
+	local_dbg_save(flags);
+
+	if (__cpu_suspend_enter(&state))
+		ret = swsusp_save();
+	else
+		__cpu_suspend_exit(mm);
+
+	local_dbg_restore(flags);
+
+	return ret;
+}
+
+static int copy_pte(pmd_t *dst, pmd_t *src, unsigned long *start_addr)
+{
+	int i;
+	pte_t *old_pte = pte_offset_kernel(src, *start_addr);
+	pte_t *new_pte = pte_offset_kernel(dst, *start_addr);
+
+	for (i = pte_index(*start_addr); i < PTRS_PER_PTE;
+	     i++, old_pte++, new_pte++) {
+		if (pte_val(*old_pte))
+			set_pte(new_pte,
+				__pte(pte_val(*old_pte) & ~PTE_RDONLY));
+	}
+
+	*start_addr &= PAGE_MASK;
+
+	return 0;
+}
+
+static int copy_pmd(pud_t *dst, pud_t *src, unsigned long *start_addr)
+{
+	int i;
+	int rc = 0;
+	pte_t *new_pte;
+	pmd_t *old_pmd = pmd_offset(src, *start_addr);
+	pmd_t *new_pmd = pmd_offset(dst, *start_addr);
+
+	for (i = pmd_index(*start_addr); i < PTRS_PER_PMD;
+	     i++, *start_addr += PMD_SIZE, old_pmd++, new_pmd++) {
+		if (!pmd_val(*old_pmd))
+			continue;
+
+		if (pmd_table(*(old_pmd))) {
+			new_pte = (pte_t *)get_safe_page(GFP_ATOMIC);
+			if (!new_pte) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pmd(new_pmd, __pmd(virt_to_phys(new_pte)
+					       | PMD_TYPE_TABLE));
+
+			rc = copy_pte(new_pmd, old_pmd, start_addr);
+			if (rc)
+				break;
+		} else
+			set_pmd(new_pmd,
+				__pmd(pmd_val(*old_pmd) & ~PMD_SECT_RDONLY));
+
+		*start_addr &= PMD_MASK;
+	}
+
+	return rc;
+}
+
+static int copy_pud(pgd_t *dst, pgd_t *src, unsigned long *start_addr)
+{
+	int i;
+	int rc = 0;
+	pmd_t *new_pmd;
+	pud_t *old_pud = pud_offset(src, *start_addr);
+	pud_t *new_pud = pud_offset(dst, *start_addr);
+
+	for (i = pud_index(*start_addr); i < PTRS_PER_PUD;
+	     i++, *start_addr += PUD_SIZE, old_pud++, new_pud++) {
+		if (!pud_val(*old_pud))
+			continue;
+
+		if (pud_table(*(old_pud))) {
+			if (PTRS_PER_PMD != 1) {
+				new_pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+				if (!new_pmd) {
+					rc = -ENOMEM;
+					break;
+				}
+
+				set_pud(new_pud, __pud(virt_to_phys(new_pmd)
+						       | PUD_TYPE_TABLE));
+			}
+
+			rc = copy_pmd(new_pud, old_pud, start_addr);
+			if (rc)
+				break;
+		} else
+			set_pud(new_pud,
+				__pud(pud_val(*old_pud) & ~PMD_SECT_RDONLY));
+
+		*start_addr &= PUD_MASK;
+	}
+
+	return rc;
+}
+
+static int copy_page_tables(pgd_t *new_pgd, unsigned long start_addr)
+{
+	int i;
+	int rc = 0;
+	pud_t *new_pud;
+	pgd_t *old_pgd = pgd_offset_k(start_addr);
+
+	new_pgd += pgd_index(start_addr);
+
+	for (i = pgd_index(start_addr); i < PTRS_PER_PGD;
+	     i++, start_addr += PGDIR_SIZE, old_pgd++, new_pgd++) {
+		if (!pgd_val(*old_pgd))
+			continue;
+
+		if (PTRS_PER_PUD != 1) {
+			new_pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+			if (!new_pud) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pgd(new_pgd, __pgd(virt_to_phys(new_pud)
+					       | PUD_TYPE_TABLE));
+		}
+
+		rc = copy_pud(new_pgd, old_pgd, &start_addr);
+		if (rc)
+			break;
+
+		start_addr &= PGDIR_MASK;
+	}
+
+	return rc;
+}
+
+/*
+ * Setup then Resume from the hibernate image using swsusp_arch_suspend_exit().
+ *
+ * Memory allocated by get_safe_page() will be dealt with by the hibernate code,
+ * we don't need to free it here.
+ *
+ * Allocate a safe zero page to use as ttbr0, as all existing page tables, and
+ * even the empty_zero_page will be overwritten.
+ */
+int swsusp_arch_resume(void)
+{
+	int rc = 0;
+	size_t exit_size;
+	pgd_t *tmp_pg_dir;
+	void *safe_zero_page_mem;
+	unsigned long tmp_pg_start;
+	void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *, void *);
+
+	/* Copy swsusp_arch_suspend_exit() to a safe page. */
+	exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
+	rc = create_safe_exec_page(__hibernate_exit_text_start, exit_size,
+			(void **)&hibernate_exit, get_safe_page, GFP_ATOMIC);
+	if (rc) {
+		pr_err("Failed to create safe executable page for"
+		       " hibernate_exit code.");
+		goto out;
+	}
+
+	/*
+	 * Even the zero page may get overwritten during restore.
+	 * get_safe_page() only returns zero'd pages.
+	 */
+	safe_zero_page_mem = (void *)get_safe_page(GFP_ATOMIC);
+	if (!safe_zero_page_mem) {
+		pr_err("Failed to allocate memory for zero page.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	empty_zero_page = virt_to_page(safe_zero_page_mem);
+	cpu_set_reserved_ttbr0();
+
+	/*
+	 * Restoring the memory image will overwrite the ttbr1 page tables.
+	 * Create a second copy, of the kernel and linear map, and use this
+	 * when restoring.
+	 */
+	tmp_pg_dir = (pgd_t *)get_safe_page(GFP_ATOMIC);
+	if (!tmp_pg_dir) {
+		pr_err("Failed to allocate memory for temporary page tables.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	tmp_pg_start = min((unsigned long)KERNEL_START,
+			   (unsigned long)PAGE_OFFSET);
+	rc = copy_page_tables(tmp_pg_dir, tmp_pg_start);
+	if (rc)
+		goto out;
+
+	/*
+	 * EL2 may get upset if we overwrite its page-tables/stack.
+	 * kvm_arch_hardware_disable() returns EL2 to the hyp stub. This
+	 * isn't needed on normal suspend/resume as PSCI prevents us from
+	 * ruining EL2.
+	 */
+	if (IS_ENABLED(CONFIG_KVM_ARM_HOST))
+		kvm_arch_hardware_disable();
+
+	hibernate_exit(virt_to_phys(tmp_pg_dir), virt_to_phys(swapper_pg_dir),
+		       KERNEL_START, KERNEL_END);
+
+out:
+	return rc;
+}
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 98073332e2d0..3d8284d91f4c 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -44,6 +44,16 @@ jiffies = jiffies_64;
 	*(.idmap.text)					\
 	VMLINUX_SYMBOL(__idmap_text_end) = .;
 
+#ifdef CONFIG_HIBERNATION
+#define HIBERNATE_TEXT					\
+	. = ALIGN(SZ_4K);				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_start) = .;\
+	*(.hibernate_exit.text)				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_end) = .;
+#else
+#define HIBERNATE_TEXT
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -102,6 +112,7 @@ SECTIONS
 			LOCK_TEXT
 			HYPERVISOR_TEXT
 			IDMAP_TEXT
+			HIBERNATE_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -181,6 +192,10 @@ ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"HYP init code too big or misaligned")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"ID map text too big or misaligned")
+#ifdef CONFIG_HIBERNATION
+ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
+	<= SZ_4K, "Hibernate exit text too big or misaligned")
+#endif
 
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
-- 
2.1.4

^ permalink raw reply related	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
@ 2015-11-11 11:40   ` Lorenzo Pieralisi
  2015-11-12  0:48     ` Rafael J. Wysocki
  2015-11-12  2:53     ` Chen, Yu C
  2015-11-14 20:26   ` Pavel Machek
  2015-11-26 14:23   ` James Morse
  2 siblings, 2 replies; 33+ messages in thread
From: Lorenzo Pieralisi @ 2015-11-11 11:40 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Pavel, Rafael,

Do you have any feedback on this patch ?

It is fundamental to this series and affects Hibernate core code so if you
have any feedback that would be much appreciated.

Thanks a lot,
Lorenzo

On Tue, Oct 27, 2015 at 05:29:19PM +0000, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches so that the processor can fetch them as new
> instructions.
> 
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Add a call to flush_icache_range(), which is provided by
> architectures that require it, to perform the maintenance.
> 
> This mirrors the kernel's behaviour when loading kernel modules and when
> mapping executable pages to user space.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  kernel/power/snapshot.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 5235dd4e1e2f..139fc449ad75 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -31,6 +31,7 @@
>  #include <linux/ktime.h>
>  
>  #include <asm/uaccess.h>
> +#include <asm/cacheflush.h>
>  #include <asm/mmu_context.h>
>  #include <asm/pgtable.h>
>  #include <asm/tlbflush.h>
> @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
>  static inline void do_copy_page(long *dst, long *src)
>  {
>  	int n;
> +	unsigned long __maybe_unused start = (unsigned long)dst;
>  
>  	for (n = PAGE_SIZE / sizeof(long); n; n--)
>  		*dst++ = *src++;
> +
> +	flush_icache_range(start, start+PAGE_SIZE);
>  }
>  
>  
> -- 
> 2.1.4
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-11 11:40   ` Lorenzo Pieralisi
@ 2015-11-12  0:48     ` Rafael J. Wysocki
  2015-11-12 11:47       ` Lorenzo Pieralisi
  2015-11-12  2:53     ` Chen, Yu C
  1 sibling, 1 reply; 33+ messages in thread
From: Rafael J. Wysocki @ 2015-11-12  0:48 UTC (permalink / raw)
  To: linux-arm-kernel

On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> Hi Pavel, Rafael,
> 
> Do you have any feedback on this patch ?
> 
> It is fundamental to this series and affects Hibernate core code so if you
> have any feedback that would be much appreciated.

I'm really not familiar with the flush_icache_range() interface.

What exactly does it do?

Rafael


> On Tue, Oct 27, 2015 at 05:29:19PM +0000, James Morse wrote:
> > Some architectures require code written to memory as if it were data to be
> > 'cleaned' from any data caches so that the processor can fetch them as new
> > instructions.
> > 
> > During resume from hibernate, the snapshot code copies some pages directly,
> > meaning these architectures do not get a chance to perform their cache
> > maintenance. Add a call to flush_icache_range(), which is provided by
> > architectures that require it, to perform the maintenance.
> > 
> > This mirrors the kernel's behaviour when loading kernel modules and when
> > mapping executable pages to user space.
> > 
> > Signed-off-by: James Morse <james.morse@arm.com>
> > ---
> >  kernel/power/snapshot.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 5235dd4e1e2f..139fc449ad75 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/ktime.h>
> >  
> >  #include <asm/uaccess.h>
> > +#include <asm/cacheflush.h>
> >  #include <asm/mmu_context.h>
> >  #include <asm/pgtable.h>
> >  #include <asm/tlbflush.h>
> > @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
> >  static inline void do_copy_page(long *dst, long *src)
> >  {
> >  	int n;
> > +	unsigned long __maybe_unused start = (unsigned long)dst;
> >  
> >  	for (n = PAGE_SIZE / sizeof(long); n; n--)
> >  		*dst++ = *src++;
> > +
> > +	flush_icache_range(start, start+PAGE_SIZE);
> >  }
> >  
> >  
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-11 11:40   ` Lorenzo Pieralisi
  2015-11-12  0:48     ` Rafael J. Wysocki
@ 2015-11-12  2:53     ` Chen, Yu C
  2015-11-12 11:52       ` Lorenzo Pieralisi
  1 sibling, 1 reply; 33+ messages in thread
From: Chen, Yu C @ 2015-11-12  2:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

> -----Original Message-----
> From: linux-pm-owner at vger.kernel.org [mailto:linux-pm-
> owner at vger.kernel.org] On Behalf Of Lorenzo Pieralisi
> Sent: Wednesday, November 11, 2015 7:41 PM
> To: James Morse; Rafael J. Wysocki; Pavel Machek
> Cc: linux-arm-kernel at lists.infradead.org; linux-pm at vger.kernel.org; Will
> Deacon; Sudeep Holla; Kevin Kang; Geoff Levand; Catalin Marinas; Mark
> Rutland; AKASHI Takahiro; wangfei; Marc Zyngier
> Subject: Re: [PATCH v2 10/11] PM / Hibernate: clean cached pages on
> architectures that require it
> 
> Hi Pavel, Rafael,
> 
> Do you have any feedback on this patch ?
> 
> It is fundamental to this series and affects Hibernate core code so if you have
> any feedback that would be much appreciated.
> 
> Thanks a lot,
> Lorenzo
> 
> On Tue, Oct 27, 2015 at 05:29:19PM +0000, James Morse wrote:
> > Some architectures require code written to memory as if it were data
> > to be 'cleaned' from any data caches so that the processor can fetch
> > them as new instructions.
> >
> > During resume from hibernate, the snapshot code copies some pages
> > directly, meaning these architectures do not get a chance to perform
> > their cache maintenance. Add a call to flush_icache_range(), which is
> > provided by architectures that require it, to perform the maintenance.
> >
> > This mirrors the kernel's behaviour when loading kernel modules and
> > when mapping executable pages to user space.
> >
> > Signed-off-by: James Morse <james.morse@arm.com>
> > ---
> >  kernel/power/snapshot.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > index 5235dd4e1e2f..139fc449ad75 100644
> > --- a/kernel/power/snapshot.c
> > +++ b/kernel/power/snapshot.c
> > @@ -31,6 +31,7 @@
> >  #include <linux/ktime.h>
> >
> >  #include <asm/uaccess.h>
> > +#include <asm/cacheflush.h>
> >  #include <asm/mmu_context.h>
> >  #include <asm/pgtable.h>
> >  #include <asm/tlbflush.h>
> > @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
> >  static inline void do_copy_page(long *dst, long *src)
> >  {
> >  	int n;
> > +	unsigned long __maybe_unused start = (unsigned long)dst;
> >
> >  	for (n = PAGE_SIZE / sizeof(long); n; n--)
> >  		*dst++ = *src++;
> > +
> > +	flush_icache_range(start, start+PAGE_SIZE);
> >  }
How about invalidating all icache lines before doing do_copy_page? Since
do_copy_page might deal with both data pages and executable pages, it might be
redundant to do it for every page.

Yu

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-12  0:48     ` Rafael J. Wysocki
@ 2015-11-12 11:47       ` Lorenzo Pieralisi
  2015-11-13 23:38         ` Rafael J. Wysocki
  0 siblings, 1 reply; 33+ messages in thread
From: Lorenzo Pieralisi @ 2015-11-12 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 12, 2015 at 01:48:32AM +0100, Rafael J. Wysocki wrote:
> On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> > Hi Pavel, Rafael,
> > 
> > Do you have any feedback on this patch ?
> > 
> > It is fundamental to this series and affects Hibernate core code so if you
> > have any feedback that would be much appreciated.
> 
> I'm really not familiar with the flush_icache_range() interface.
> 
> What exactly does it do?

It is used to sync a memory range that has been written to (e.g. when loading
modules; copying from a snapshot is basically the same thing: it reads from
storage and restores pages that might well be executable code). In particular
it syncs the I-cache and the D-cache: on arm64, for example, the page that the
snapshot code is copying might be executable code that has to be cleaned from
the D-cache so that it is made visible to the I-cache.

On x86 it is a NOP AFAIK.
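
A minimal user-space sketch of the copy-then-sync pattern described above,
not the kernel code itself. The names sketch_copy_page() and
sketch_sync_icache() are hypothetical, SKETCH_PAGE_SIZE assumes 4K pages, and
the GCC/Clang builtin __builtin___clear_cache() is used here only as a
stand-in for flush_icache_range():

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4K pages */

/*
 * Hypothetical stand-in for flush_icache_range(): make instructions
 * written as data in [start, end) visible to instruction fetch.
 * On architectures with coherent I/D caches this compiles to nothing.
 */
static void sketch_sync_icache(void *start, void *end)
{
	__builtin___clear_cache((char *)start, (char *)end);
}

/* Copy one page word by word, then sync the destination range. */
static void sketch_copy_page(long *dst, const long *src)
{
	long *start = dst;
	size_t n;

	assert(dst != NULL && src != NULL);

	for (n = SKETCH_PAGE_SIZE / sizeof(long); n; n--)
		*dst++ = *src++;

	/* The page may hold code, so sync it unconditionally. */
	sketch_sync_icache(start, (char *)start + SKETCH_PAGE_SIZE);
}
```

The sync is done per page because, at copy time, nothing says whether the
page holds data or code.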

Thanks,
Lorenzo

> 
> Rafael
> 
> 
> > On Tue, Oct 27, 2015 at 05:29:19PM +0000, James Morse wrote:
> > > Some architectures require code written to memory as if it were data to be
> > > 'cleaned' from any data caches so that the processor can fetch them as new
> > > instructions.
> > > 
> > > During resume from hibernate, the snapshot code copies some pages directly,
> > > meaning these architectures do not get a chance to perform their cache
> > > maintenance. Add a call to flush_icache_range(), which is provided by
> > > architectures that require it, to perform the maintenance.
> > > 
> > > This mirrors the kernel's behaviour when loading kernel modules and when
> > > mapping executable pages to user space.
> > > 
> > > Signed-off-by: James Morse <james.morse@arm.com>
> > > ---
> > >  kernel/power/snapshot.c | 4 ++++
> > >  1 file changed, 4 insertions(+)
> > > 
> > > diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> > > index 5235dd4e1e2f..139fc449ad75 100644
> > > --- a/kernel/power/snapshot.c
> > > +++ b/kernel/power/snapshot.c
> > > @@ -31,6 +31,7 @@
> > >  #include <linux/ktime.h>
> > >  
> > >  #include <asm/uaccess.h>
> > > +#include <asm/cacheflush.h>
> > >  #include <asm/mmu_context.h>
> > >  #include <asm/pgtable.h>
> > >  #include <asm/tlbflush.h>
> > > @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
> > >  static inline void do_copy_page(long *dst, long *src)
> > >  {
> > >  	int n;
> > > +	unsigned long __maybe_unused start = (unsigned long)dst;
> > >  
> > >  	for (n = PAGE_SIZE / sizeof(long); n; n--)
> > >  		*dst++ = *src++;
> > > +
> > > +	flush_icache_range(start, start+PAGE_SIZE);
> > >  }
> > >  
> > >  
> 
> -- 
> I speak only for myself.
> Rafael J. Wysocki, Intel Open Source Technology Center.
> 

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-12  2:53     ` Chen, Yu C
@ 2015-11-12 11:52       ` Lorenzo Pieralisi
  0 siblings, 0 replies; 33+ messages in thread
From: Lorenzo Pieralisi @ 2015-11-12 11:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Nov 12, 2015 at 02:53:35AM +0000, Chen, Yu C wrote:

[...]

> > >  static inline void do_copy_page(long *dst, long *src)
> > >  {
> > >  	int n;
> > > +	unsigned long __maybe_unused start = (unsigned long)dst;
> > >
> > >  	for (n = PAGE_SIZE / sizeof(long); n; n--)
> > >  		*dst++ = *src++;
> > > +
> > > +	flush_icache_range(start, start+PAGE_SIZE);
> > >  }
> How about invalidating all icache lines before doing do_copy_page? Since
> do_copy_page might deal with both data pages and executable pages, it might be
> redundant to do it for every page.

The point here is to make sure I-cache and D-cache are in sync, because
the page we are copying can be executable code, so flush_icache_range is
there to make sure that the code is cleaned from the D-cache to a cache
level that is visible to the I-cache (yes, the I-cache range is
invalidated too in the process).

I agree it is redundant for data pages. The point is that you do not really
know what you are copying at this stage. We could extend the snapshot
code to flag the pages accordingly, but that might well be overkill, so
we posted this patch to get consensus before proceeding.

Thanks,
Lorenzo
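
As an illustration of the point above, here is a minimal userspace sketch of
"copy a page, then sync the caches for that range". It is not the kernel code:
sketch_copy_page(), sketch_flush_icache_range() and SKETCH_PAGE_SIZE are names
made up for this example, a 4 KiB page is assumed, and the GCC/Clang builtin
__builtin___clear_cache() stands in for the kernel's flush_icache_range().

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Userspace stand-in for the kernel's flush_icache_range(start, end). */
static void sketch_flush_icache_range(unsigned long start, unsigned long end)
{
	__builtin___clear_cache((char *)start, (char *)end);
}

/* Copy one page word by word, then sync I-cache and D-cache for it. */
static void sketch_copy_page(long *dst, const long *src)
{
	unsigned long start = (unsigned long)dst;
	size_t n;

	assert(dst != NULL && src != NULL);

	for (n = SKETCH_PAGE_SIZE / sizeof(long); n; n--)
		*dst++ = *src++;

	/* The copier cannot tell code from data, so every page is synced. */
	sketch_flush_icache_range(start, start + SKETCH_PAGE_SIZE);
}
```

On architectures with coherent instruction and data caches the builtin
compiles to nothing, which matches the "redundant for data pages, harmless
elsewhere" trade-off discussed in this thread.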

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-12 11:47       ` Lorenzo Pieralisi
@ 2015-11-13 23:38         ` Rafael J. Wysocki
  2015-11-17 12:38           ` Lorenzo Pieralisi
  0 siblings, 1 reply; 33+ messages in thread
From: Rafael J. Wysocki @ 2015-11-13 23:38 UTC (permalink / raw)
  To: linux-arm-kernel

On Thursday, November 12, 2015 11:47:05 AM Lorenzo Pieralisi wrote:
> On Thu, Nov 12, 2015 at 01:48:32AM +0100, Rafael J. Wysocki wrote:
> > On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> > > Hi Pavel, Rafael,
> > > 
> > > Do you have any feedback on this patch ?
> > > 
> > > It is fundamental to this series and affects Hibernate core code so if you
> > > have any feedback that would be much appreciated.
> > 
> > I'm really not familiar with the flush_icache_range() interface.
> > 
> > What exactly does it do?
> 
> It is used to sync a memory range that is written into (eg loading
> modules, copying from snapshot is basically the same thing, reads from
> storage and restore pages that might well be executable code), in particular
> to sync the I-cache and the D-cache, eg on arm64 the page that the snapshot
> code is copying might be executable code that has to be cleaned from the
> D-cache so that it is made visible to the I-cache.
> 
> On x86 it is a NOP AFAIK.

If that's the case, I have no problems with this change as long as the code
works on architectures with non-trivial flush_icache_range().

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
  2015-11-11 11:40   ` Lorenzo Pieralisi
@ 2015-11-14 20:26   ` Pavel Machek
  2015-11-16 12:27     ` James Morse
  2015-11-26 14:23   ` James Morse
  2 siblings, 1 reply; 33+ messages in thread
From: Pavel Machek @ 2015-11-14 20:26 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue 2015-10-27 17:29:19, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches so that the processor can fetch them as new
> instructions.
> 
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Add a call to flush_icache_range(), which is provided by
> architectures that require it, to perform the maintenance.
> 
> This mirrors the kernel's behaviour when loading kernel modules and when
> mapping executable pages to user space.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Looks reasonable.

Acked-by: Pavel Machek <pavel@ucw.cz>

> ---
>  kernel/power/snapshot.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> index 5235dd4e1e2f..139fc449ad75 100644
> --- a/kernel/power/snapshot.c
> +++ b/kernel/power/snapshot.c
> @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
>  static inline void do_copy_page(long *dst, long *src)
>  {
>  	int n;
> +	unsigned long __maybe_unused start = (unsigned long)dst;

Why the "maybe unused"?

>  	for (n = PAGE_SIZE / sizeof(long); n; n--)
>  		*dst++ = *src++;
> +
> +	flush_icache_range(start, start+PAGE_SIZE);
>  }

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h
  2015-10-27 17:29 ` [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h James Morse
@ 2015-11-14 21:25   ` Pavel Machek
  2015-11-16 18:44     ` Geoff Levand
  0 siblings, 1 reply; 33+ messages in thread
From: Pavel Machek @ 2015-11-14 21:25 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue 2015-10-27 17:29:11, James Morse wrote:
> From: Geoff Levand <geoff@infradead.org>
> 
> To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to
> be used outside the mm code move the contents of proc-macros.S to
> asm/assembler.h.  Also, delete proc-macros.S, and fix up all references
> to proc-macros.S.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
>  arch/arm64/include/asm/assembler.h | 48 +++++++++++++++++++++++++++-
>  arch/arm64/kernel/head.S           |  1 -
>  arch/arm64/kvm/hyp-init.S          |  1 -
>  arch/arm64/mm/cache.S              |  2 --
>  arch/arm64/mm/proc-macros.S        | 64 --------------------------------------
>  arch/arm64/mm/proc.S               |  3 --
>  6 files changed, 47 insertions(+), 72 deletions(-)
>  delete mode 100644 arch/arm64/mm/proc-macros.S
> 
> diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> index b51f2cc22ca9..91cb311d33de 100644
> --- a/arch/arm64/include/asm/assembler.h
> +++ b/arch/arm64/include/asm/assembler.h
> @@ -1,5 +1,5 @@
>  /*
> - * Based on arch/arm/include/asm/assembler.h
> + * * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S

Why the two stars?

Otherwise looks ok.

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-10-27 17:29 ` [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
@ 2015-11-14 21:34   ` Pavel Machek
  2015-11-16 12:29     ` James Morse
  0 siblings, 1 reply; 33+ messages in thread
From: Pavel Machek @ 2015-11-14 21:34 UTC (permalink / raw)
  To: linux-arm-kernel

Hi!

> The implementation assumes that exactly the same kernel is booted on the
> same hardware, and that the kernel is loaded at the same physical address.

BTW... on newer implementations (and I have patch for x86, too), we
try to make it so that resume kernel does not have to be same as
suspend one. It would be nice to move there with arm64, too. 

> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

> +/*
> + * Corrupt memory.
> + *

Umm. Really?

> + * Because this code has to be copied to a safe_page, it can't call out to
> + * other functions by pc-relative address. Also remember that it

PC-relative?


> + * mid-way through over-writing other functions. For this reason it contains
> + * a copy of copy_page() and code from flush_icache_range().
> + *
> + * All of memory gets written to, including code. We need to clean the kernel
> + * text to the PoC before secondary cores can be booted. Because kernel modules,

"the kernel modules" and no ","?

> + * and executable pages mapped to user space are also written as data, we
> + * clean all pages we touch to the PoU.

What is PoC and PoU?

Thanks,									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-14 20:26   ` Pavel Machek
@ 2015-11-16 12:27     ` James Morse
  2015-11-16 12:36       ` Pavel Machek
  0 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2015-11-16 12:27 UTC (permalink / raw)
  To: linux-arm-kernel

On 14/11/15 20:26, Pavel Machek wrote:
> On Tue 2015-10-27 17:29:19, James Morse wrote:
>> Some architectures require code written to memory as if it were data to be
>> 'cleaned' from any data caches so that the processor can fetch them as new
>> instructions.
>>
>> During resume from hibernate, the snapshot code copies some pages directly,
>> meaning these architectures do not get a chance to perform their cache
>> maintenance. Add a call to flush_icache_range(), which is provided by
>> architectures that require it, to perform the maintenance.
>>
>> This mirrors the kernel's behaviour when loading kernel modules and when
>> mapping executable pages to user space.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
> 
> Looks reasonable.
> 
> Acked-by: Pavel Machek <pavel@ucw.cz>

Thanks!

> 
>> ---
>>  kernel/power/snapshot.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
>> index 5235dd4e1e2f..139fc449ad75 100644
>> --- a/kernel/power/snapshot.c
>> +++ b/kernel/power/snapshot.c
>> @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
>>  static inline void do_copy_page(long *dst, long *src)
>>  {
>>  	int n;
>> +	unsigned long __maybe_unused start = (unsigned long)dst;
> 
> Why the "maybe unused"?

To avoid a build warning on x86_64, and any other architectures that don't
use the arguments in their flush_icache_range() implementation.

Without __maybe_unused, gcc 4.9.2, building for x86_64:
> ../kernel/power/snapshot.c: In function 'do_copy_page':
> ../kernel/power/snapshot.c:1200:16: warning: unused variable 'start'
> [-Wunused-variable]
>  unsigned long start = (unsigned long)dst;


>>  	for (n = PAGE_SIZE / sizeof(long); n; n--)
>>  		*dst++ = *src++;
>> +
>> +	flush_icache_range(start, start+PAGE_SIZE);
>>  }


Thanks,

James
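
A small standalone sketch of why the warning appears, assuming a
flush_icache_range() that is an empty macro (as it is on architectures where
no maintenance is needed). The sketch_ names and SKETCH_PAGE_SIZE are
illustrative only; __maybe_unused is defined here the way the kernel's
compiler headers define it for gcc.

```c
#include <assert.h>

#ifndef __maybe_unused
#define __maybe_unused __attribute__((unused))
#endif

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Empty-macro no-op: both arguments disappear during preprocessing. */
#define sketch_flush_icache_range(start, end) do { } while (0)

static void sketch_copy_page(long *dst, const long *src)
{
	/*
	 * After preprocessing nothing references 'start', so without the
	 * attribute gcc -Wall reports "unused variable 'start'".
	 */
	unsigned long __maybe_unused start = (unsigned long)dst;
	unsigned long n;

	assert(dst && src);

	for (n = SKETCH_PAGE_SIZE / sizeof(long); n; n--)
		*dst++ = *src++;

	sketch_flush_icache_range(start, start + SKETCH_PAGE_SIZE);
}
```

Removing __maybe_unused from the declaration and building with -Wall should
reproduce the warning quoted above.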

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-14 21:34   ` Pavel Machek
@ 2015-11-16 12:29     ` James Morse
  2015-11-16 12:41       ` Pavel Machek
  0 siblings, 1 reply; 33+ messages in thread
From: James Morse @ 2015-11-16 12:29 UTC (permalink / raw)
  To: linux-arm-kernel

Hi!

On 14/11/15 21:34, Pavel Machek wrote:
>> The implementation assumes that exactly the same kernel is booted on the
>> same hardware, and that the kernel is loaded at the same physical address.
> 
> BTW... on newer implementations (and I have patch for x86, too), we
> try to make it so that resume kernel does not have to be same as
> suspend one. It would be nice to move there with arm64, too. 

Yes, that is a neat trick, can I leave it as future work?


>> Signed-off-by: James Morse <james.morse@arm.com>
> 
> Acked-by: Pavel Machek <pavel@ucw.cz>

Thanks!

(I have a fixup for this patch to add a missing __pgprot() in the page
 table copy code.)


>> +/*
>> + * Corrupt memory.
>> + *
> 
> Umm. Really?

Effectively. Until it finishes copying, you have to assume all memory is
corrupt: code, page tables, etc. It was more a tongue-in-cheek reminder to
myself to be careful.


>> + * Because this code has to be copied to a safe_page, it can't call out to
>> + * other functions by pc-relative address. Also remember that it
> 
> PC-relative?

The linker may (often!) use program-counter relative addresses for loads
and stores. This code gets copied, so the linker doesn't know where the
code will be executed from, so any instructions using pc-relative addresses
will get the wrong result, (if they reference something outside the function).


>> + * mid-way through over-writing other functions. For this reason it contains
>> + * a copy of copy_page() and code from flush_icache_range().
>> + *
>> + * All of memory gets written to, including code. We need to clean the kernel
>> + * text to the PoC before secondary cores can be booted. Because kernel modules,
> 
> "the kernel modules" and no ","?

Sure.


>> + * and executable pages mapped to user space are also written as data, we
>> + * clean all pages we touch to the PoU.
> 
> What is PoC and PoU?

They are points in the CPU's cache hierarchy:

ARM processors are of a 'modified Harvard' architecture, their paths to
read instructions and data are different. The 'Point of Unification' is the
first point in the cache hierarchy that is the same for both. On ARM,
flush_icache_range() makes sure code written as data is pushed through any
data caches to this point, and then evicts any stale copies in the
instruction caches.

PoC is the 'Point of Coherency', it is the first point that is the same for
all devices, (e.g. a cpu with caches turned on, and one with them off), it
is normally main memory. The kernel text has to be pushed to this point, so
that secondary cores, while running early-boot code with their MMU and
caches turned off, don't get incorrect code/data from before resume.
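
For readers who want to see the clean-to-PoU step spelled out, here is a
hedged userspace sketch. It is not the kernel's flush_icache_range():
sketch_sync_icache() is a made-up name, the arm64 branch assumes Linux with a
GNU-compatible compiler (where EL0 may read CTR_EL0 and issue these
maintenance instructions), and every other target falls back to the compiler
builtin __builtin___clear_cache().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

static void sketch_sync_icache(void *buf, size_t len)
{
	assert(buf != NULL);
#if defined(__aarch64__) && defined(__linux__) && defined(__GNUC__)
	{
		uintptr_t start = (uintptr_t)buf, end = start + len, p;
		uint64_t ctr;
		size_t dline, iline;

		/* CTR_EL0: DminLine is bits [19:16], IminLine bits [3:0], log2(words). */
		__asm__ volatile("mrs %0, ctr_el0" : "=r"(ctr));
		dline = (size_t)4 << ((ctr >> 16) & 0xf);
		iline = (size_t)4 << (ctr & 0xf);

		/* Clean D-cache lines by VA to the Point of Unification. */
		for (p = start & ~(uintptr_t)(dline - 1); p < end; p += dline)
			__asm__ volatile("dc cvau, %0" : : "r"(p) : "memory");
		__asm__ volatile("dsb ish" : : : "memory");

		/* Invalidate I-cache lines by VA to the Point of Unification. */
		for (p = start & ~(uintptr_t)(iline - 1); p < end; p += iline)
			__asm__ volatile("ic ivau, %0" : : "r"(p) : "memory");
		__asm__ volatile("dsb ish" : : : "memory");
		__asm__ volatile("isb" : : : "memory");
	}
#else
	/* Fallback: let the compiler pick the right operation (often a no-op). */
	__builtin___clear_cache((char *)buf, (char *)buf + len);
#endif
}
```

Cleaning to the Point of Coherency would use "dc cvac" in place of "dc cvau";
the rest of the loop structure stays the same.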

I have resisted the urge to draw some ascii-art!



Thanks for your comments,

James

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-16 12:27     ` James Morse
@ 2015-11-16 12:36       ` Pavel Machek
  0 siblings, 0 replies; 33+ messages in thread
From: Pavel Machek @ 2015-11-16 12:36 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon 2015-11-16 12:27:07, James Morse wrote:
> On 14/11/15 20:26, Pavel Machek wrote:
> > On Tue 2015-10-27 17:29:19, James Morse wrote:
> >> Some architectures require code written to memory as if it were data to be
> >> 'cleaned' from any data caches so that the processor can fetch them as new
> >> instructions.
> >>
> >> During resume from hibernate, the snapshot code copies some pages directly,
> >> meaning these architectures do not get a chance to perform their cache
> >> maintenance. Add a call to flush_icache_range(), which is provided by
> >> architectures that require it, to perform the maintenance.
> >>
> >> This mirrors the kernel's behaviour when loading kernel modules and when
> >> mapping executable pages to user space.
> >>
> >> Signed-off-by: James Morse <james.morse@arm.com>
> > 
> > Looks reasonable.
> > 
> > Acked-by: Pavel Machek <pavel@ucw.cz>
> 
> Thanks!
> 
> > 
> >> ---
> >>  kernel/power/snapshot.c | 4 ++++
> >>  1 file changed, 4 insertions(+)
> >>
> >> diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
> >> index 5235dd4e1e2f..139fc449ad75 100644
> >> --- a/kernel/power/snapshot.c
> >> +++ b/kernel/power/snapshot.c
> >> @@ -1196,9 +1197,12 @@ static unsigned int count_data_pages(void)
> >>  static inline void do_copy_page(long *dst, long *src)
> >>  {
> >>  	int n;
> >> +	unsigned long __maybe_unused start = (unsigned long)dst;
> > 
> > Why the "maybe unused"?
> 
> To avoid a build warning on x86_64, and any other architectures that don't
> use the arguments in their flush_icache_range() implementation.

That's the wrong fix, I believe.

Implementations of flush_icache_range() should use their arguments. We
should not have all the callers caring about this.
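
One way to read this suggestion, sketched under assumptions: if the no-op is
a static inline function rather than an empty macro, its arguments are
consumed at the call site and the caller needs no annotation. The sketch_
names and SKETCH_PAGE_SIZE are illustrative, not taken from any tree.

```c
#include <assert.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */

/* Hypothetical no-op for architectures with coherent I- and D-caches. */
static inline void sketch_flush_icache_range(unsigned long start,
					     unsigned long end)
{
	/* Nothing to do, but the caller's variables still count as used. */
	(void)start;
	(void)end;
}

static void sketch_copy_page(long *dst, const long *src)
{
	unsigned long start = (unsigned long)dst;	/* no __maybe_unused */
	unsigned long n;

	assert(dst && src);

	for (n = SKETCH_PAGE_SIZE / sizeof(long); n; n--)
		*dst++ = *src++;

	sketch_flush_icache_range(start, start + SKETCH_PAGE_SIZE);
}
```

The compiler still optimises the empty inline away, so the generated code
should match the macro version.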

								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-16 12:29     ` James Morse
@ 2015-11-16 12:41       ` Pavel Machek
  2015-11-16 14:01         ` James Morse
  0 siblings, 1 reply; 33+ messages in thread
From: Pavel Machek @ 2015-11-16 12:41 UTC (permalink / raw)
  To: linux-arm-kernel

Hi!

> On 14/11/15 21:34, Pavel Machek wrote:
> >> The implementation assumes that exactly the same kernel is booted on the
> >> same hardware, and that the kernel is loaded at the same physical address.
> > 
> > BTW... on newer implementations (and I have patch for x86, too), we
> > try to make it so that resume kernel does not have to be same as
> > suspend one. It would be nice to move there with arm64, too. 
> 
> Yes, that is a neat trick, can I leave it as future work?

Yes. But it is really not hard.

> >> + * Because this code has to be copied to a safe_page, it can't call out to
> >> + * other functions by pc-relative address. Also remember that it
> > 
> > PC-relative?
> 
> The linker may (often!) use program-counter relative addresses for loads
> and stores. This code gets copied, so the linker doesn't know where the
> code will be executed from, so any instructions using pc-relative addresses
> will get the wrong result, (if they reference something outside the
> function).

I was wondering if it should be spelled "PC-relative", not
"pc-relative" :-).

> >> + * and executable pages mapped to user space are also written as data, we
> >> + * clean all pages we touch to the PoU.
> > 
> > What is PoC and PoU?
> 
> They are points in the CPU's cache hierarchy:
> 
> ARM processors are of a 'modified Harvard' architecture, their paths to
> read instructions and data are different. The 'Point of Unification' is the
> first point in the cache hierarchy that is the same for both. On ARM,
> flush_icache_range() makes sure code written as data is pushed through any
> data caches to this point, and then evicts any stale copies in the
> instruction caches.
> 
> PoC is the 'Point of Coherency', it is the first point that is the same for
> all devices, (e.g. a cpu with caches turned on, and one with them off), it
> is normally main memory. The kernel text has to be pushed to this point, so
> that secondary cores, while running early-boot code with their MMU and
> caches turned off, don't get incorrect code/data from before resume.
> 
> I have resisted the urge to draw some ascii-art!

That's ok, you just might want to replace PoU -> 'Point of
Unification' and PoC -> 'Point of Coherency' in the comments. That
should make googling easier for people not familiar with arm
terminology.

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-16 12:41       ` Pavel Machek
@ 2015-11-16 14:01         ` James Morse
  2015-11-16 14:23           ` Mark Rutland
  2015-11-16 18:01           ` Pavel Machek
  0 siblings, 2 replies; 33+ messages in thread
From: James Morse @ 2015-11-16 14:01 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On 16/11/15 12:41, Pavel Machek wrote:
>> On 14/11/15 21:34, Pavel Machek wrote:
>>>> The implementation assumes that exactly the same kernel is booted on the
>>>> same hardware, and that the kernel is loaded at the same physical address.
>>>
>>> BTW... on newer implementations (and I have patch for x86, too), we
>>> try to make it so that resume kernel does not have to be same as
>>> suspend one. It would be nice to move there with arm64, too. 
>>
>> Yes, that is a neat trick, can I leave it as future work?
> 
> Yes. But it is really not hard.

I think it's harder than it looks:
It means the MMU has to be turned off, as two different kernels may not
have used the same configuration for the MMU - and I don't think it's safe
to change while the MMU is running. There are also going to be
complications with resetting the hypervisor/el2 configuration, which I need
to spend more time thinking about (and probably ask for advice!).


>>>> + * Because this code has to be copied to a safe_page, it can't call out to
>>>> + * other functions by pc-relative address. Also remember that it
>>>
>>> PC-relative?
>>
>> The linker may (often!) use program-counter relative addresses for loads
>> and stores. This code gets copied, so the linker doesn't know where the
>> code will be executed from, so any instructions using pc-relative addresses
>> will get the wrong result, (if they reference something outside the
>> function).
> 
> I was wondering if it should be spelled "PC-relative", not
> "pc-relative" :-).

Hah - sorry!
Fixed.


>>>> + * and executable pages mapped to user space are also written as data, we
>>>> + * clean all pages we touch to the PoU.
>>>
>>> What is PoC and PoU?
>>
>> They are points in the CPU's cache hierarchy:
>>
>> ARM processors are of a 'modified Harvard' architecture, their paths to
>> read instructions and data are different. The 'Point of Unification' is the
>> first point in the cache hierarchy that is the same for both. On ARM,
>> flush_icache_range() makes sure code written as data is pushed through any
>> data caches to this point, and then evicts any stale copies in the
>> instruction caches.
>>
>> PoC is the 'Point of Coherency', it is the first point that is the same for
>> all devices, (e.g. a cpu with caches turned on, and one with them off), it
>> is normally main memory. The kernel text has to be pushed to this point, so
>> that secondary cores, while running early-boot code with their MMU and
>> caches turned off, don't get incorrect code/data from before resume.
>>
>> I have resisted the urge to draw some ascii-art!
> 
> That's ok, you just might want to replace PoU -> 'Point of
> Unification' and PoC -> 'Point of Coherency' in the comments. That
> should make googling easier for people not familiar with arm
> terminology.

There aren't any other points under arch/arm64 that use the full expansion,
but it can't hurt to include both.



Thanks,

James

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-16 14:01         ` James Morse
@ 2015-11-16 14:23           ` Mark Rutland
  2015-11-16 18:01           ` Pavel Machek
  1 sibling, 0 replies; 33+ messages in thread
From: Mark Rutland @ 2015-11-16 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

> >>> What is PoC and PoU?
> >>
> >> They are points in the CPU's cache hierarchy:
> >>
> >> ARM processors are of a 'modified Harvard' architecture, their paths to
> >> read instructions and data are different. The 'Point of Unification' is the
> >> first point in the cache hierarchy that is the same for both. On ARM,
> >> flush_icache_range() makes sure code written as data is pushed through any
> >> data caches to this point, and then evicts any stale copies in the
> >> instruction caches.
> >>
> >> PoC is the 'Point of Coherency', it is the first point that is the same for
> >> all devices, (e.g. a cpu with caches turned on, and one with them off), it
> >> is normally main memory. The kernel text has to be pushed to this point, so
> >> that secondary cores, while running early-boot code with their MMU and
> >> caches turned off, don't get incorrect code/data from before resume.
> >>
> >> I have resisted the urge to draw some ascii-art!
> > 
> > > That's ok, you just might want to replace PoU -> 'Point of
> > Unification' and PoC -> 'Point of Coherency' in the comments. That
> > should make googling easier for people not familiar with arm
> > terminology.
> 
> There aren't any other points under arch/arm64 that use the full expansion,
> but it can't hurt to include both.

I would prefer to keep PoC/PoU as they are for consistency.

The abbreviations are provided by the architecture, and much like
register names that we do not expand, to reason about them you need to
read the architecture documentation anyway. So the short forms alone are
fine.

Mark.

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-16 14:01         ` James Morse
  2015-11-16 14:23           ` Mark Rutland
@ 2015-11-16 18:01           ` Pavel Machek
  1 sibling, 0 replies; 33+ messages in thread
From: Pavel Machek @ 2015-11-16 18:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon 2015-11-16 14:01:04, James Morse wrote:
> Hi,
> 
> On 16/11/15 12:41, Pavel Machek wrote:
> >> On 14/11/15 21:34, Pavel Machek wrote:
> >>>> The implementation assumes that exactly the same kernel is booted on the
> >>>> same hardware, and that the kernel is loaded at the same physical address.
> >>>
> >>> BTW... on newer implementations (and I have patch for x86, too), we
> >>> try to make it so that resume kernel does not have to be same as
> >>> suspend one. It would be nice to move there with arm64, too. 
> >>
> >> Yes, that is a neat trick, can I leave it as future work?
> > 
> > Yes. But it is really not hard.
> 
> I think it's harder than it looks:
> It means the MMU has to be turned off, as two different kernels may not
> have used the same configuration for the MMU - and I don't think it's safe
> to change while the MMU is running. There are also going to be
> complications with resetting the hypervisor/el2 configuration, which I need
> to spend more time thinking about (and probably ask for advice!).

Well, you can simplify it here: If MMU configuration is stable between
4.3, 4.4 and 4.5 kernels, you just use suspend signature for 4.3-like
kernels. If you need to change it in future, you change the signature.

(If it changes too often, this might not work).
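
A rough sketch of the signature idea, with everything in it hypothetical: the
header layout, the signature string and the function name are invented for
illustration and do not describe the actual hibernate image format.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical: names the resume-time MMU setup, not the kernel version. */
#define SKETCH_RESUME_SIG "arm64-resume-v1"	/* 15 characters plus NUL */

struct sketch_image_header {
	char signature[16];
};

/* Returns 1 if the image was written by a kernel with a compatible setup. */
static int sketch_image_compatible(const struct sketch_image_header *hdr)
{
	assert(hdr != NULL);
	return memcmp(hdr->signature, SKETCH_RESUME_SIG,
		      sizeof(SKETCH_RESUME_SIG)) == 0;
}
```

Kernels that share an MMU configuration would keep the same string; a kernel
that changes the configuration would bump it, so older images are rejected
instead of being resumed incorrectly.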

Best regards,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h
  2015-11-14 21:25   ` Pavel Machek
@ 2015-11-16 18:44     ` Geoff Levand
  0 siblings, 0 replies; 33+ messages in thread
From: Geoff Levand @ 2015-11-16 18:44 UTC (permalink / raw)
  To: linux-arm-kernel

Hi,

On Sat, 2015-11-14 at 22:25 +0100, Pavel Machek wrote:
> On Tue 2015-10-27 17:29:11, James Morse wrote:

> > From: Geoff Levand <geoff@infradead.org>
> Why the two stars?

I have a newer version that fixed that typo in
my kexec-v11 branch:

http://git.kernel.org/cgit/linux/kernel/git/geoff/linux-kexec.git/commit/?h=kexec-v11&id=afbeea6ab2e1aa3ea8c4683521ac155667f86ca1

-Geoff

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-13 23:38         ` Rafael J. Wysocki
@ 2015-11-17 12:38           ` Lorenzo Pieralisi
  2015-11-17 13:13             ` Pavel Machek
  0 siblings, 1 reply; 33+ messages in thread
From: Lorenzo Pieralisi @ 2015-11-17 12:38 UTC (permalink / raw)
  To: linux-arm-kernel

[Cc'ed maintainers of affected arches]

On Sat, Nov 14, 2015 at 12:38:50AM +0100, Rafael J. Wysocki wrote:
> On Thursday, November 12, 2015 11:47:05 AM Lorenzo Pieralisi wrote:
> > On Thu, Nov 12, 2015 at 01:48:32AM +0100, Rafael J. Wysocki wrote:
> > > On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> > > > Hi Pavel, Rafael,
> > > > 
> > > > Do you have any feedback on this patch ?
> > > > 
> > > > It is fundamental to this series and affects Hibernate core code so if you
> > > > have any feedback that would be much appreciated.
> > > 
> > > I'm really not familiar with the flush_icache_range() interface.
> > > 
> > > What exactly does it do?
> > 
> > It is used to sync a memory range that is written into (eg loading
> > modules, copying from snapshot is basically the same thing, reads from
> > storage and restore pages that might well be executable code), in particular
> > to sync the I-cache and the D-cache, eg on arm64 the page that the snapshot
> > code is copying might be executable code that has to be cleaned from the
> > D-cache so that it is made visible to the I-cache.
> > 
> > On x86 it is a NOP AFAIK.
> 
> If that's the case, I have no problems with this change as long as the code
> works on architectures with non-trivial flush_icache_range().

I Cc'ed the respective arches maintainers, it should work (it may
make resuming a bit slower, owing to the cache syncing), problem is
that we have no way of testing it on platforms other than arm/arm64.

How do you want us to go on about this ? Should we add a config option
to prevent calling flush_icache_range() on all platforms (where it
is not a nop) ?

Thanks a lot !
Lorenzo

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-17 12:38           ` Lorenzo Pieralisi
@ 2015-11-17 13:13             ` Pavel Machek
  2015-11-17 13:43               ` Lorenzo Pieralisi
  0 siblings, 1 reply; 33+ messages in thread
From: Pavel Machek @ 2015-11-17 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue 2015-11-17 12:38:07, Lorenzo Pieralisi wrote:
> [Cc'ed maintainers of affected arches]
> 
> On Sat, Nov 14, 2015 at 12:38:50AM +0100, Rafael J. Wysocki wrote:
> > On Thursday, November 12, 2015 11:47:05 AM Lorenzo Pieralisi wrote:
> > > On Thu, Nov 12, 2015 at 01:48:32AM +0100, Rafael J. Wysocki wrote:
> > > > On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> > > > > Hi Pavel, Rafael,
> > > > > 
> > > > > Do you have any feedback on this patch ?
> > > > > 
> > > > > It is fundamental to this series and affects Hibernate core code so if you
> > > > > have any feedback that would be much appreciated.
> > > > 
> > > > I'm really not familiar with the flush_icache_range() interface.
> > > > 
> > > > What exactly does it do?
> > > 
> > > It is used to sync a memory range that is written into (eg loading
> > > modules, copying from snapshot is basically the same thing, reads from
> > > storage and restore pages that might well be executable code), in particular
> > > to sync the I-cache and the D-cache, eg on arm64 the page that the snapshot
> > > code is copying might be executable code that has to be cleaned from the
> > > D-cache so that it is made visible to the I-cache.
> > > 
> > > On x86 it is a NOP AFAIK.
> > 
> > If that's the case, I have no problems with this change as long as the code
> > works on architectures with non-trivial flush_icache_range().
> 
> I Cc'ed the respective arches maintainers, it should work (it may
> make resuming a bit slower, owing to the cache syncing), problem is
> that we have no way of testing it on platforms other than arm/arm64.

Surely you can find an x86 or x86-64 machine near you?

And as hibernation is only supported on x86 and arm, that should be
ok.

Or just merge it to the next and let the world do testing for you...

> How do you want us to go on about this ? Should we add a config option
> to prevent calling flush_icache_range() on all platforms (where it
> is not a nop) ?

Config option is definitely not an option ;-).
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 33+ messages in thread

* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-11-17 13:13             ` Pavel Machek
@ 2015-11-17 13:43               ` Lorenzo Pieralisi
  0 siblings, 0 replies; 33+ messages in thread
From: Lorenzo Pieralisi @ 2015-11-17 13:43 UTC (permalink / raw)
  To: linux-arm-kernel

On Tue, Nov 17, 2015 at 02:13:45PM +0100, Pavel Machek wrote:
> On Tue 2015-11-17 12:38:07, Lorenzo Pieralisi wrote:
> > [Cc'ed maintainers of affected arches]
> > 
> > On Sat, Nov 14, 2015 at 12:38:50AM +0100, Rafael J. Wysocki wrote:
> > > On Thursday, November 12, 2015 11:47:05 AM Lorenzo Pieralisi wrote:
> > > > On Thu, Nov 12, 2015 at 01:48:32AM +0100, Rafael J. Wysocki wrote:
> > > > > On Wednesday, November 11, 2015 11:40:39 AM Lorenzo Pieralisi wrote:
> > > > > > Hi Pavel, Rafael,
> > > > > > 
> > > > > > Do you have any feedback on this patch ?
> > > > > > 
> > > > > > It is fundamental to this series and affects Hibernate core code so if you
> > > > > > have any feedback that would be much appreciated.
> > > > > 
> > > > > I'm really not familiar with the flush_icache_range() interface.
> > > > > 
> > > > > What exactly does it do?
> > > > 
> > > > It is used to sync a memory range that is written into (eg loading
> > > > modules, copying from snapshot is basically the same thing, reads from
> > > > storage and restore pages that might well be executable code), in particular
> > > > to sync the I-cache and the D-cache, eg on arm64 the page that the snapshot
> > > > code is copying might be executable code that has to be cleaned from the
> > > > D-cache so that it is made visible to the I-cache.
> > > > 
> > > > On x86 it is a NOP AFAIK.
> > > 
> > > If that's the case, I have no problems with this change as long as the code
> > > works on architectures with non-trivial flush_icache_range().
> > 
> > I Cc'ed the respective arches maintainers, it should work (it may
> > make resuming a bit slower, owing to the cache syncing), problem is
> > that we have no way of testing it on platforms other than arm/arm64.
> 
> Surely you can find an x86 or x86-64 machine near you?

Yes but on x86 this patch is a NOP so testing on it does not really
add much.

> And as hibernation is only supported on x86 and arm, that should be
> ok.

And what are the other arches' swsusp_arch_{suspend/resume}
implementations there for, then?

I just had a look at the arch code implementing
swsusp_arch_{suspend/resume}. If it only has to work on x86 and arm, we
will give it a spin on arm (32-bit) platforms (I really doubt the
slowdown is going to be even noticeable) and we are done.

> Or just merge it into -next and let the world do the testing for you...

I wanted to ask first because I am not sure that's something people
regularly test, and I do not want to end up with regressions that could
have been prevented just by enquiring. I am not sure -next will help
detect them either.

> > How do you want us to go on about this ? Should we add a config option
> > to prevent calling flush_icache_range() on all platforms (where it
> > is not a nop) ?
> 
> Config option is definitely not an option ;-).

Ok, thanks.

Lorenzo


* [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it
  2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
  2015-11-11 11:40   ` Lorenzo Pieralisi
  2015-11-14 20:26   ` Pavel Machek
@ 2015-11-26 14:23   ` James Morse
  2 siblings, 0 replies; 33+ messages in thread
From: James Morse @ 2015-11-26 14:23 UTC (permalink / raw)
  To: linux-arm-kernel

On 27/10/15 17:29, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches so that the processor can fetch them as new
> instructions.
> 
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Add a call to flush_icache_range(), which is provided by
> architectures that require it, to perform the maintenance.
> 
> This mirrors the kernel's behaviour when loading kernel modules and when
> mapping executable pages to user space.

While trying to benchmark the impact of this patch on 32bit ARM, I've
discovered the fix is in the wrong place! do_copy_page() isn't used on
the resume path for the pages restored 'in place'.
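
For readers following along, the copy-then-sync pattern this patch was
aiming for can be sketched in userspace roughly as below. This is only an
illustration under stated assumptions, not the kernel code:
sketch_copy_page_and_sync(), sketch_sync_icache() and SKETCH_PAGE_SIZE are
hypothetical names, and the GCC/Clang builtin __builtin___clear_cache()
stands in for the kernel's flush_icache_range().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical page size for the sketch; kernel code would use PAGE_SIZE. */
#define SKETCH_PAGE_SIZE 4096

/*
 * Userspace stand-in for flush_icache_range(start, end).
 * __builtin___clear_cache() is a GCC/Clang builtin; it has no effect on
 * targets that do not need explicit I-cache maintenance.
 */
static void sketch_sync_icache(void *start, void *end)
{
	__builtin___clear_cache((char *)start, (char *)end);
}

/*
 * Hypothetical helper: copy one page as data, then sync the caches so
 * that the copied bytes could safely be fetched as instructions.
 */
static void sketch_copy_page_and_sync(void *dst, const void *src)
{
	memcpy(dst, src, SKETCH_PAGE_SIZE);
	sketch_sync_icache(dst, (char *)dst + SKETCH_PAGE_SIZE);
}
```

The point of the sketch is only the ordering: the sync has to follow the
copy, in whichever routine actually performs the copy on the resume path.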

I will produce another version of the series - hopefully later today.



James


Thread overview: 33+ messages
2015-10-27 17:29 [PATCH v2 00/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2015-10-27 17:29 ` [PATCH v2 01/11] arm64: kernel: fix tcr_el1.t0sz restore on systems with extended idmap James Morse
2015-10-27 17:29 ` [PATCH v2 02/11] arm64: Fold proc-macros.S into assembler.h James Morse
2015-11-14 21:25   ` Pavel Machek
2015-11-16 18:44     ` Geoff Levand
2015-10-27 17:29 ` [PATCH v2 03/11] arm64: Convert hcalls to use HVC immediate value James Morse
2015-10-27 17:29 ` [PATCH v2 04/11] arm64: Add new hcall HVC_CALL_FUNC James Morse
2015-10-27 17:29 ` [PATCH v2 05/11] arm64: kvm: allows kvm cpu hotplug James Morse
2015-10-27 17:29 ` [PATCH v2 06/11] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
2015-10-27 17:29 ` [PATCH v2 07/11] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
2015-10-27 17:29 ` [PATCH v2 08/11] arm64: kernel: Include _AC definition in page.h James Morse
2015-10-27 17:29 ` [PATCH v2 09/11] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
2015-10-27 17:29 ` [PATCH v2 10/11] PM / Hibernate: clean cached pages on architectures that require it James Morse
2015-11-11 11:40   ` Lorenzo Pieralisi
2015-11-12  0:48     ` Rafael J. Wysocki
2015-11-12 11:47       ` Lorenzo Pieralisi
2015-11-13 23:38         ` Rafael J. Wysocki
2015-11-17 12:38           ` Lorenzo Pieralisi
2015-11-17 13:13             ` Pavel Machek
2015-11-17 13:43               ` Lorenzo Pieralisi
2015-11-12  2:53     ` Chen, Yu C
2015-11-12 11:52       ` Lorenzo Pieralisi
2015-11-14 20:26   ` Pavel Machek
2015-11-16 12:27     ` James Morse
2015-11-16 12:36       ` Pavel Machek
2015-11-26 14:23   ` James Morse
2015-10-27 17:29 ` [PATCH v2 11/11] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2015-11-14 21:34   ` Pavel Machek
2015-11-16 12:29     ` James Morse
2015-11-16 12:41       ` Pavel Machek
2015-11-16 14:01         ` James Morse
2015-11-16 14:23           ` Mark Rutland
2015-11-16 18:01           ` Pavel Machek
