All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v6 00/14] arm64: kernel: Add support for hibernate/suspend-to-disk
@ 2016-02-29 17:20 ` James Morse
  0 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, Marc Zyngier,
	James Morse, Rafael J . Wysocki, Pavel Machek, linux-pm

Hi all,

for-next/core has changed quite a bit with the staged inclusion of kaslr.
This series is a rebase onto the most recent commit "arm64: lse: deal with
clobbered IP registers after branch via PLT", and then tested with some
nasty hacks in the efi stub to provide a psuedo-random address layout on Juno.

After Lorenzo's suggestion, el2_setup() now gets called from the assembly code,
meaning we have to store its physical and virtual addresses in the arch header.
This means patch 11 from v5 is no longer needed.

The last patch that prevents resume with a different kernel version has been
changed to use UTS_VERSION, meaning it has to be the same build of the kernel.


This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b hibernate/v6


Changes since v5:
 * Call el2_setup() from hibernate's assembly code
 * Use UTS_VERSION to catch hibernate/resume with different builds

Changes since v4:
 * Mask in current T0SZ bits when restoring tcr_el1.
 * Use el2_setup() to reconfigure el2.
 * Remove fiq calls.
 * Removed kvm_arch_hardware_disable() call.
 * Added 'in_suspend = 0' to avoid 'if (uninitialised-memory) ...'.
 * Removed icache-flush from hibernate-asm.S, this was done in hibernate.c.
 * Commented tlbis.
 * Removed dsb after copy_page(), this behaviour is guaranteed by the
   architecture.

Changes since v3:
 * To work with kaslr:
   * hibernate now uses the arch-header to store the address of the page
     tables, and the point to re-enter the resumed kernel.
   * The el2 vectors are reloaded to point to the 'safe' page, then back to the
     resumed kernel.
   * PoC cleaning is done after the jump to the resumed kernel, as we don't
     know the restored kernel's boundaries in advance.
   * Some variables are accessed via aliases in the linear map, as the kernel
     text is not mapped during resume. restore_pblist is one example.
   * Execute the safe page from the bottom of memory, not the top, so that we
     can restore the resumed kernel's page tables directly.

 * Rebased the common patches onto v13 of kexec
 * Changed hibernate-asm.S to use the new copy_page macro.
 * Changed copy_p?d()s to use the do { } while(); pattern.
 * Added some missing barriers. (dsb after ic ialluis).

Changes since v2:
 * Rewrote restore in-place patch - we can't clean pages in copy_page(), we
   need to publish a list for the architecture to clean
 * Added missing pgprot_val() in hibernate.c, spotted by STRICT_MM_TYPECHECKS
 * Removed 'tcr_set_idmap_t0sz' from proc.S - I missed this when rebase-ing
 * Re-imported the first four patches from kexec v12
 * Rebased onto v4.4-rc2
 * Changes from Pavel Machek's comments

Changes since v1:
 * Removed for_each_process(){ for_each_vma() { } }; cache cleaning, replaced
   with icache_flush_range() call in core hibernate code
 * Rebased onto conflicting tcr_ek1.t0sz bug-fix patch

[v5] http://www.spinics.net/lists/arm-kernel/msg483595.html
[v4] http://www.spinics.net/lists/arm-kernel/msg477769.html
[v3] http://www.spinics.net/lists/arm-kernel/msg463590.html
[v2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html
[v1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html


AKASHI Takahiro (1):
  arm64: kvm: allows kvm cpu hotplug

Geoff Levand (5):
  arm64: Fold proc-macros.S into assembler.h
  arm64: Cleanup SCTLR flags
  arm64: Convert hcalls to use HVC immediate value
  arm64: Add new hcall HVC_CALL_FUNC
  arm64: Add new asm macro copy_page

James Morse (8):
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by
    va
  arm64: kernel: Include _AC definition in page.h
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  arm64: head.S: el2_setup() to accept sctlr_el1 as an argument
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: kernel: Add support for hibernate/suspend-to-disk
  arm64: hibernate: Prevent resume from a different kernel version

 arch/arm/include/asm/kvm_host.h    |  10 +-
 arch/arm/include/asm/kvm_mmu.h     |   1 +
 arch/arm/kvm/arm.c                 |  98 +++++---
 arch/arm/kvm/mmu.c                 |   5 +
 arch/arm64/Kconfig                 |   7 +
 arch/arm64/include/asm/assembler.h | 111 ++++++++-
 arch/arm64/include/asm/kvm_arm.h   |  11 -
 arch/arm64/include/asm/kvm_host.h  |   1 -
 arch/arm64/include/asm/kvm_mmu.h   |  20 ++
 arch/arm64/include/asm/memory.h    |   3 +
 arch/arm64/include/asm/page.h      |   2 +
 arch/arm64/include/asm/suspend.h   |  32 ++-
 arch/arm64/include/asm/sysreg.h    |  19 +-
 arch/arm64/include/asm/virt.h      |  40 ++++
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/asm-offsets.c    |  10 +-
 arch/arm64/kernel/head.S           |  24 +-
 arch/arm64/kernel/hibernate-asm.S  | 166 +++++++++++++
 arch/arm64/kernel/hibernate.c      | 478 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S       |  49 ++--
 arch/arm64/kernel/setup.c          |   1 -
 arch/arm64/kernel/sleep.S          | 146 ++++-------
 arch/arm64/kernel/suspend.c        |  96 ++++----
 arch/arm64/kernel/vmlinux.lds.S    |  15 ++
 arch/arm64/kvm/hyp-init.S          |  47 +++-
 arch/arm64/kvm/hyp.S               |   3 +-
 arch/arm64/kvm/hyp/hyp-entry.S     |   9 +-
 arch/arm64/mm/cache.S              |   2 -
 arch/arm64/mm/proc-macros.S        |  98 --------
 arch/arm64/mm/proc.S               |  56 ++---
 kernel/power/swap.c                |  18 ++
 31 files changed, 1203 insertions(+), 376 deletions(-)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.7.0


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 00/14] arm64: kernel: Add support for hibernate/suspend-to-disk
@ 2016-02-29 17:20 ` James Morse
  0 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hi all,

for-next/core has changed quite a bit with the staged inclusion of kaslr.
This series is a rebase onto the most recent commit "arm64: lse: deal with
clobbered IP registers after branch via PLT", and then tested with some
nasty hacks in the efi stub to provide a psuedo-random address layout on Juno.

After Lorenzo's suggestion, el2_setup() now gets called from the assembly code,
meaning we have to store its physical and virtual addresses in the arch header.
This means patch 11 from v5 is no longer needed.

The last patch that prevents resume with a different kernel version has been
changed to use UTS_VERSION, meaning it has to be the same build of the kernel.


This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b hibernate/v6


Changes since v5:
 * Call el2_setup() from hibernate's assembly code
 * Use UTS_VERSION to catch hibernate/resume with different builds

Changes since v4:
 * Mask in current T0SZ bits when restoring tcr_el1.
 * Use el2_setup() to reconfigure el2.
 * Remove fiq calls.
 * Removed kvm_arch_hardware_disable() call.
 * Added 'in_suspend = 0' to avoid 'if (uninitialised-memory) ...'.
 * Removed icache-flush from hibernate-asm.S, this was done in hibernate.c.
 * Commented tlbis.
 * Removed dsb after copy_page(), this behaviour is guaranteed by the
   architecture.

Changes since v3:
 * To work with kaslr:
   * hibernate now uses the arch-header to store the address of the page
     tables, and the point to re-enter the resumed kernel.
   * The el2 vectors are reloaded to point to the 'safe' page, then back to the
     resumed kernel.
   * PoC cleaning is done after the jump to the resumed kernel, as we don't
     know the restored kernel's boundaries in advance.
   * Some variables are accessed via aliases in the linear map, as the kernel
     text is not mapped during resume. restore_pblist is one example.
   * Execute the safe page from the bottom of memory, not the top, so that we
     can restore the resumed kernel's page tables directly.

 * Rebased the common patches onto v13 of kexec
 * Changed hibernate-asm.S to use the new copy_page macro.
 * Changed copy_p?d()s to use the do { } while(); pattern.
 * Added some missing barriers. (dsb after ic ialluis).

Changes since v2:
 * Rewrote restore in-place patch - we can't clean pages in copy_page(), we
   need to publish a list for the architecture to clean
 * Added missing pgprot_val() in hibernate.c, spotted by STRICT_MM_TYPECHECKS
 * Removed 'tcr_set_idmap_t0sz' from proc.S - I missed this when rebase-ing
 * Re-imported the first four patches from kexec v12
 * Rebased onto v4.4-rc2
 * Changes from Pavel Machek's comments

Changes since v1:
 * Removed for_each_process(){ for_each_vma() { } }; cache cleaning, replaced
   with icache_flush_range() call in core hibernate code
 * Rebased onto conflicting tcr_ek1.t0sz bug-fix patch

[v5] http://www.spinics.net/lists/arm-kernel/msg483595.html
[v4] http://www.spinics.net/lists/arm-kernel/msg477769.html
[v3] http://www.spinics.net/lists/arm-kernel/msg463590.html
[v2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html
[v1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html


AKASHI Takahiro (1):
  arm64: kvm: allows kvm cpu hotplug

Geoff Levand (5):
  arm64: Fold proc-macros.S into assembler.h
  arm64: Cleanup SCTLR flags
  arm64: Convert hcalls to use HVC immediate value
  arm64: Add new hcall HVC_CALL_FUNC
  arm64: Add new asm macro copy_page

James Morse (8):
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by
    va
  arm64: kernel: Include _AC definition in page.h
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  arm64: head.S: el2_setup() to accept sctlr_el1 as an argument
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: kernel: Add support for hibernate/suspend-to-disk
  arm64: hibernate: Prevent resume from a different kernel version

 arch/arm/include/asm/kvm_host.h    |  10 +-
 arch/arm/include/asm/kvm_mmu.h     |   1 +
 arch/arm/kvm/arm.c                 |  98 +++++---
 arch/arm/kvm/mmu.c                 |   5 +
 arch/arm64/Kconfig                 |   7 +
 arch/arm64/include/asm/assembler.h | 111 ++++++++-
 arch/arm64/include/asm/kvm_arm.h   |  11 -
 arch/arm64/include/asm/kvm_host.h  |   1 -
 arch/arm64/include/asm/kvm_mmu.h   |  20 ++
 arch/arm64/include/asm/memory.h    |   3 +
 arch/arm64/include/asm/page.h      |   2 +
 arch/arm64/include/asm/suspend.h   |  32 ++-
 arch/arm64/include/asm/sysreg.h    |  19 +-
 arch/arm64/include/asm/virt.h      |  40 ++++
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/asm-offsets.c    |  10 +-
 arch/arm64/kernel/head.S           |  24 +-
 arch/arm64/kernel/hibernate-asm.S  | 166 +++++++++++++
 arch/arm64/kernel/hibernate.c      | 478 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S       |  49 ++--
 arch/arm64/kernel/setup.c          |   1 -
 arch/arm64/kernel/sleep.S          | 146 ++++-------
 arch/arm64/kernel/suspend.c        |  96 ++++----
 arch/arm64/kernel/vmlinux.lds.S    |  15 ++
 arch/arm64/kvm/hyp-init.S          |  47 +++-
 arch/arm64/kvm/hyp.S               |   3 +-
 arch/arm64/kvm/hyp/hyp-entry.S     |   9 +-
 arch/arm64/mm/cache.S              |   2 -
 arch/arm64/mm/proc-macros.S        |  98 --------
 arch/arm64/mm/proc.S               |  56 ++---
 kernel/power/swap.c                |  18 ++
 31 files changed, 1203 insertions(+), 376 deletions(-)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.7.0

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 01/14] arm64: Fold proc-macros.S into assembler.h
  2016-02-29 17:20 ` James Morse
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to
be used outside the mm code move the contents of proc-macros.S to
asm/assembler.h.  Also, delete proc-macros.S, and fix up all references
to proc-macros.S.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
[rebased, included dcache_by_line_op]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 82 ++++++++++++++++++++++++++++++-
 arch/arm64/mm/cache.S              |  2 -
 arch/arm64/mm/proc-macros.S        | 98 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 4 files changed, 81 insertions(+), 104 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 70f7b9e04598..c72474c2d4dc 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -212,6 +214,84 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
+/*
+ * Macro to perform a data cache maintenance for the interval
+ * [kaddr, kaddr + size)
+ *
+ * 	op:		operation passed to dc instruction
+ * 	domain:		domain used in dsb instruciton
+ * 	kaddr:		starting virtual address of the region
+ * 	size:		size of the region
+ * 	Corrupts:	kaddr, size, tmp1, tmp2
+ */
+	.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+	dcache_line_size \tmp1, \tmp2
+	add	\size, \kaddr, \size
+	sub	\tmp2, \tmp1, #1
+	bic	\kaddr, \kaddr, \tmp2
+9998:	dc	\op, \kaddr
+	add	\kaddr, \kaddr, \tmp1
+	cmp	\kaddr, \size
+	b.lo	9998b
+	dsb	\domain
+	.endm
+
+/*
+ * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
+ */
+	.macro	reset_pmuserenr_el0, tmpreg
+	mrs	\tmpreg, id_aa64dfr0_el1	// Check ID_AA64DFR0_EL1 PMUVer
+	sbfx	\tmpreg, \tmpreg, #8, #4
+	cmp	\tmpreg, #1			// Skip if no PMU present
+	b.lt	9000f
+	msr	pmuserenr_el0, xzr		// Disable PMU access from EL0
+9000:
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 6df07069a025..50ff9ba3a236 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index e6a30e1268a8..000000000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,98 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
-
-/*
- * Macro to perform a data cache maintenance for the interval
- * [kaddr, kaddr + size)
- *
- * 	op:		operation passed to dc instruction
- * 	domain:		domain used in dsb instruciton
- * 	kaddr:		starting virtual address of the region
- * 	size:		size of the region
- * 	Corrupts: 	kaddr, size, tmp1, tmp2
- */
-	.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
-	dcache_line_size \tmp1, \tmp2
-	add	\size, \kaddr, \size
-	sub	\tmp2, \tmp1, #1
-	bic	\kaddr, \kaddr, \tmp2
-9998:	dc	\op, \kaddr
-	add	\kaddr, \kaddr, \tmp1
-	cmp	\kaddr, \size
-	b.lo	9998b
-	dsb	\domain
-	.endm
-
-/*
- * reset_pmuserenr_el0 - reset PMUSERENR_EL0 if PMUv3 present
- */
-	.macro	reset_pmuserenr_el0, tmpreg
-	mrs	\tmpreg, id_aa64dfr0_el1	// Check ID_AA64DFR0_EL1 PMUVer
-	sbfx	\tmpreg, \tmpreg, #8, #4
-	cmp	\tmpreg, #1			// Skip if no PMU present
-	b.lt	9000f
-	msr	pmuserenr_el0, xzr		// Disable PMU access from EL0
-9000:
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 543f5198005a..9f6deacf41d2 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,13 +23,10 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #elif defined(CONFIG_ARM64_16K_PAGES)
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 02/14] arm64: Cleanup SCTLR flags
  2016-02-29 17:20 ` James Morse
  (?)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

We currently have macros defining flags for the arm64 sctlr registers in
both kvm_arm.h and sysreg.h.  To clean things up and simplify move the
definitions of the SCTLR_EL2 flags from kvm_arm.h to sysreg.h, rename any
SCTLR_EL1 or SCTLR_EL2 flags that are common to both registers to be
SCTLR_ELx, with 'x' indicating a common flag, and fixup all files to
include the proper header or to use the new macro names.

Signed-off-by: Geoff Levand <geoff@infradead.org>
[Restored pgtable-hwdef.h include]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/kvm_arm.h | 11 -----------
 arch/arm64/include/asm/sysreg.h  | 19 +++++++++++++++----
 arch/arm64/kvm/hyp-init.S        |  5 +++--
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index bef6e9243c63..9e1c0e7b0b73 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -83,17 +83,6 @@
 #define HCR_INT_OVERRIDE   (HCR_FMO | HCR_IMO)
 
 
-/* Hyp System Control Register (SCTLR_EL2) bits */
-#define SCTLR_EL2_EE	(1 << 25)
-#define SCTLR_EL2_WXN	(1 << 19)
-#define SCTLR_EL2_I	(1 << 12)
-#define SCTLR_EL2_SA	(1 << 3)
-#define SCTLR_EL2_C	(1 << 2)
-#define SCTLR_EL2_A	(1 << 1)
-#define SCTLR_EL2_M	1
-#define SCTLR_EL2_FLAGS	(SCTLR_EL2_M | SCTLR_EL2_A | SCTLR_EL2_C |	\
-			 SCTLR_EL2_SA | SCTLR_EL2_I)
-
 /* TCR_EL2 Registers bits */
 #define TCR_EL2_RES1	((1 << 31) | (1 << 23))
 #define TCR_EL2_TBI	(1 << 20)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 1a78d6e2a78b..0961a24e8d48 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -86,10 +86,21 @@
 #define SET_PSTATE_UAO(x) __inst_arm(0xd5000000 | REG_PSTATE_UAO_IMM |\
 				     (!!x)<<8 | 0x1f)
 
-/* SCTLR_EL1 */
-#define SCTLR_EL1_CP15BEN	(0x1 << 5)
-#define SCTLR_EL1_SED		(0x1 << 8)
-#define SCTLR_EL1_SPAN		(0x1 << 23)
+/* Common SCTLR_ELx flags. */
+#define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_I	(1 << 12)
+#define SCTLR_ELx_SA	(1 << 3)
+#define SCTLR_ELx_C	(1 << 2)
+#define SCTLR_ELx_A	(1 << 1)
+#define SCTLR_ELx_M	1
+
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
+			 SCTLR_ELx_SA | SCTLR_ELx_I)
+
+/* SCTLR_EL1 specific flags. */
+#define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_SED		(1 << 8)
+#define SCTLR_EL1_CP15BEN	(1 << 5)
 
 
 /* id_aa64isar0 */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 3e568dcd907b..dc6335a7353e 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -21,6 +21,7 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/pgtable-hwdef.h>
+#include <asm/sysreg.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
@@ -114,8 +115,8 @@ __do_hyp_init:
 	dsb	sy
 
 	mrs	x4, sctlr_el2
-	and	x4, x4, #SCTLR_EL2_EE	// preserve endianness of EL2
-	ldr	x5, =SCTLR_EL2_FLAGS
+	and	x4, x4, #SCTLR_ELx_EE	// preserve endianness of EL2
+	ldr	x5, =SCTLR_ELx_FLAGS
 	orr	x4, x4, x5
 	msr	sctlr_el2, x4
 	isb
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 03/14] arm64: Convert hcalls to use HVC immediate value
  2016-02-29 17:20 ` James Morse
                   ` (2 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

The existing arm64 hcall implementations are limited in that they only
allow for two distinct hcalls; with the x0 register either zero or not
zero.  Also, the API of the hyp-stub exception vector routines and the
KVM exception vector routines differ; hyp-stub uses a non-zero value in
x0 to implement __hyp_set_vectors, whereas KVM uses it to implement
kvm_call_hyp.

To allow for additional hcalls to be defined and to make the arm64 hcall
API more consistent across exception vector routines, change the hcall
implementations to use the 16 bit immediate value of the HVC instruction
to specify the hcall type.

Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS to be used as hcall type specifiers and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp()
routines to use these new macros when executing an HVC call.  Also,
change the corresponding hyp-stub and KVM el1_sync exception vector
routines to use these new macros.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/virt.h  | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S   | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S           |  3 ++-
 arch/arm64/kvm/hyp/hyp-entry.S |  9 ++++++---
 4 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df5252dd7..eb10368c329e 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f335c289..017ab519aaf1 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 870578f84b1c..29e93388ebab 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -18,6 +18,7 @@
 #include <linux/linkage.h>
 
 #include <asm/assembler.h>
+#include <asm/virt.h>
 
 /*
  * u64 __kvm_call_hyp(void *hypfn, ...);
@@ -38,6 +39,6 @@
  * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(__kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(__kvm_call_hyp)
diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 93e8d983c0bd..a1edf7743df6 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -43,6 +43,7 @@ el1_sync:				// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -51,14 +52,16 @@ el1_sync:				// Guest trapped into EL2
 	cbnz	x3, el1_trap		// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	restore_x0_to_x3
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	stp	lr, xzr, [sp, #-16]!
+1:     /* Default to HVC_CALL_HYP. */
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 04/14] arm64: Add new hcall HVC_CALL_FUNC
  2016-02-29 17:20 ` James Morse
                   ` (3 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which
is needed to support an hcall at CPU reset, this implementation uses
register x18 to store the link register across the caller provided
function.  This dictates that the caller provided function must preserve
the contents of register x18.

Signed-off-by: Geoff Levand <geoff@infradead.org>
[Renumbered labels in el1_sync()]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 19 +++++++++++++++----
 2 files changed, 28 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368c329e..30700961f28c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519aaf1..cfacfe55e3ef 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -59,18 +59,29 @@ el1_sync:
 	and	x18, x18, #ESR_ELx_ISS_MASK
 
 	cmp	x17, #ESR_ELx_EC_HVC64
-	b.ne	2f				// Not an HVC trap
+	b.ne	9f				// Not an HVC trap
 
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
+	b	9f
 
 1:	cmp	x18, #HVC_SET_VECTORS
 	b.ne	2f
 	msr	vbar_el2, x0
-
-2:	eret
+	b	9f
+
+2:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	9f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
+
+9:	eret
 ENDPROC(el1_sync)
 
 .macro invalid_vector	label
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 05/14] arm64: kvm: allows kvm cpu hotplug
  2016-02-29 17:20 ` James Morse
                   ` (4 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shutdown a core in terms of
kvm. This prevents kexec from rebooting the system at EL2.

This patch adds a cpu tear-down function and also puts an existing cpu-init
code into a separate function, kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We don't need the arm64 specific cpu hotplug hook any more.

Since this patch modifies common code between arm and arm64, one stub
definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compilation errors.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[Moved __kvm_hyp_reset() to use kvm_call_hyp(), instead of having its own
 dedicated entry point in el1_sync. Added some comments and a tlbi.]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 10 +++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 98 ++++++++++++++++++++++++---------------
 arch/arm/kvm/mmu.c                |  5 ++
 arch/arm64/include/asm/kvm_host.h |  1 -
 arch/arm64/include/asm/kvm_mmu.h  | 20 ++++++++
 arch/arm64/kvm/hyp-init.S         | 42 +++++++++++++++++
 7 files changed, 137 insertions(+), 40 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index f9f27792d8ed..8af531d64771 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -220,6 +220,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -232,7 +241,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a520b7987a29..4fd9ddb48c0f 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index 975da6cfbf59..ee1bee7884a1 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -65,6 +64,8 @@ static DEFINE_SPINLOCK(kvm_vmid_lock);
 
 static bool vgic_present;
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -89,11 +90,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -585,7 +581,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		/*
 		 * Re-check atomic conditions
 		 */
-		if (signal_pending(current)) {
+		if (unlikely(!__this_cpu_read(kvm_arm_hardware_enabled))) {
+			/* cpu has been torn down */
+			ret = 0;
+			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+			run->fail_entry.hardware_entry_failure_reason
+					= (u64)-ENOEXEC;
+		} else if (signal_pending(current)) {
 			ret = -EINTR;
 			run->exit_reason = KVM_EXIT_INTR;
 		}
@@ -967,7 +969,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+static void cpu_init_hyp_mode(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -989,36 +991,61 @@ static void cpu_init_hyp_mode(void *dummy)
 	kvm_arm_init_debug();
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+static void cpu_reset_hyp_mode(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
+
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
+
+int kvm_arch_hardware_enable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+		cpu_init_hyp_mode();
+		__this_cpu_write(kvm_arm_hardware_enabled, 1);
 	}
 
-	return NOTIFY_OK;
+	return 0;
 }
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+void kvm_arch_hardware_disable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled))
+		return;
+
+	cpu_reset_hyp_mode();
+	__this_cpu_write(kvm_arm_hardware_enabled, 0);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	/*
+	 * kvm_arm_hardware_enabled is left with its old value over
+	 * PM_ENTER->PM_EXIT. It is used to indicate PM_EXIT should
+	 * re-enable hyp.
+	 */
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_reset_hyp_mode();
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_init_hyp_mode();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1124,14 +1151,20 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
 	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
+	preempt_disable();
+	cpu_init_hyp_mode();
 
 	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
 	switch (err) {
 	case 0:
 		vgic_present = true;
@@ -1215,26 +1248,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index aba61fd3697a..7a3aed62499a 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1643,6 +1643,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index e3d67ff8798b..cdf49ce6c1ea 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -332,7 +332,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		       hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 2ac4a22d3119..e13b032c6bee 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -99,6 +99,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
@@ -310,5 +311,24 @@ static inline unsigned int kvm_get_vmid_bits(void)
 	return (cpuid_feature_extract_unsigned_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
 }
 
+void __kvm_hyp_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
+
+/*
+ * Call reset code, and switch back to stub hyp vectors. We need to execute
+ * __kvm_hyp_reset() from the trampoline page, we calculate its address here,
+ * without kimage_voffset translation.
+ */
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	unsigned long trampoline_hyp_reset;
+
+	trampoline_hyp_reset = TRAMPOLINE_VA +
+			       ((unsigned long)__kvm_hyp_reset & ~PAGE_MASK);
+
+	__kvm_call_hyp((void *)trampoline_hyp_reset,
+		     boot_pgd_ptr, phys_idmap_start);
+}
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index dc6335a7353e..d20d86c7f9d8 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -150,6 +150,48 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/*
+	 * Retrieve lr from the stack (pushed by el1_sync()), so we can eret
+	 * from here.
+	 */
+	ldp	lr, xzr, [sp], #16
+
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Ensure the PA branch doesn't find a stale tlb entry. */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_ELx_FLAGS
+	bic	x0, x0, x1		// Clear SCTL_M and etc
+	msr	sctlr_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Install stub vectors */
+	adr_l	x0, __hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 06/14] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  2016-02-29 17:20 ` James Morse
                   ` (5 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

Hibernate could make use of the cpu_suspend() code to save/restore cpu
state, however it needs to be able to return '0' from the 'finisher'.

Rework cpu_suspend() so that the finisher is called from C code,
independently from the save/restore of cpu state. Space to save the context
in is allocated in the caller's stack frame, and passed into
__cpu_suspend_enter().

Hibernate's use of this API will look like a copy of the cpu_suspend()
function.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/suspend.h | 20 +++++++++
 arch/arm64/kernel/asm-offsets.c  |  2 +
 arch/arm64/kernel/sleep.S        | 93 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 66 +++++++++++++++++-----------
 4 files changed, 97 insertions(+), 84 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 59a5b0f1e81c..365d8cdd0073 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -2,6 +2,7 @@
 #define __ASM_SUSPEND_H
 
 #define NR_CTX_REGS 11
+#define NR_CALLEE_SAVED_REGS 12
 
 /*
  * struct cpu_suspend_ctx must be 16-byte aligned since it is allocated on
@@ -21,6 +22,25 @@ struct sleep_save_sp {
 	phys_addr_t save_ptr_stash_phys;
 };
 
+/*
+ * Memory to save the cpu state is allocated on the stack by
+ * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
+ * This data must survive until cpu_resume() is called.
+ *
+ * This struct desribes the size and the layout of the saved cpu state.
+ * The layout of the callee_saved_regs is defined by the implementation
+ * of __cpu_suspend_enter(), and cpu_resume(). This struct must be passed
+ * in by the caller as __cpu_suspend_enter()'s stack-frame is gone once it
+ * returns, and the data would be subsequently corrupted by the call to the
+ * finisher.
+ */
+struct sleep_stack_data {
+	struct cpu_suspend_ctx	system_regs;
+	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
+};
+
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
+int __cpu_suspend_enter(struct sleep_stack_data *state);
+void __cpu_suspend_exit(void);
 #endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 463fcf1aca8a..56dcb25c7bc0 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -125,6 +125,8 @@ int main(void)
   DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
   DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
   DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
+  DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
+  DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
   DEFINE(ARM_SMCCC_RES_X0_OFFS,	offsetof(struct arm_smccc_res, a0));
   DEFINE(ARM_SMCCC_RES_X2_OFFS,	offsetof(struct arm_smccc_res, a2));
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index e33fe33876ab..dca81612fe90 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -49,37 +49,30 @@
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	.endm
 /*
- * Save CPU state for a suspend and execute the suspend finisher.
- * On success it will return 0 through cpu_resume - ie through a CPU
- * soft/hard reboot from the reset vector.
- * On failure it returns the suspend finisher return value or force
- * -EOPNOTSUPP if the finisher erroneously returns 0 (the suspend finisher
- * is not allowed to return, if it does this must be considered failure).
- * It saves callee registers, and allocates space on the kernel stack
- * to save the CPU specific registers + some other data for resume.
+ * Save CPU state in the provided sleep_stack_data area, and publish its
+ * location for cpu_resume()'s use in sleep_save_stash.
  *
- *  x0 = suspend finisher argument
- *  x1 = suspend finisher function pointer
+ * cpu_resume() will restore this saved state, and return. Because the
+ * link-register is saved and restored, it will appear to return from this
+ * function. So that the caller can tell the suspend/resume paths apart,
+ * __cpu_suspend_enter() will always return a non-zero value, whereas the
+ * path through cpu_resume() will return 0.
+ *
+ *  x0 = struct sleep_stack_data area
  */
 ENTRY(__cpu_suspend_enter)
-	stp	x29, lr, [sp, #-96]!
-	stp	x19, x20, [sp,#16]
-	stp	x21, x22, [sp,#32]
-	stp	x23, x24, [sp,#48]
-	stp	x25, x26, [sp,#64]
-	stp	x27, x28, [sp,#80]
-	/*
-	 * Stash suspend finisher and its argument in x20 and x19
-	 */
-	mov	x19, x0
-	mov	x20, x1
+	stp	x29, lr, [x0, #SLEEP_STACK_DATA_CALLEE_REGS]
+	stp	x19, x20, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+16]
+	stp	x21, x22, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+32]
+	stp	x23, x24, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+48]
+	stp	x25, x26, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+64]
+	stp	x27, x28, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+80]
+
+	/* save the sp in cpu_suspend_ctx */
 	mov	x2, sp
-	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
-	mov	x0, sp
-	/*
-	 * x0 now points to struct cpu_suspend_ctx allocated on the stack
-	 */
-	str	x2, [x0, #CPU_CTX_SP]
+	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
+
+	/* find the mpidr_hash */
 	ldr	x1, =sleep_save_sp
 	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
 	mrs	x7, mpidr_el1
@@ -93,34 +86,11 @@ ENTRY(__cpu_suspend_enter)
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
+
+	push	x29, lr
 	bl	__cpu_suspend_save
-	/*
-	 * Grab suspend finisher in x20 and its argument in x19
-	 */
-	mov	x0, x19
-	mov	x1, x20
-	/*
-	 * We are ready for power down, fire off the suspend finisher
-	 * in x1, with argument in x0
-	 */
-	blr	x1
-        /*
-	 * Never gets here, unless suspend finisher fails.
-	 * Successful cpu_suspend should return from cpu_resume, returning
-	 * through this code path is considered an error
-	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
-	 * to make sure a proper error condition is propagated
-	 */
-	cmp	x0, #0
-	mov	x3, #-EOPNOTSUPP
-	csel	x0, x3, x0, eq
-	add	sp, sp, #CPU_SUSPEND_SZ	// rewind stack pointer
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
+	pop	x29, lr
+	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
@@ -146,12 +116,6 @@ ENDPROC(cpu_resume_mmu)
 	.popsection
 cpu_resume_after_mmu:
 	mov	x0, #0			// return zero on success
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
 	ret
 ENDPROC(cpu_resume_after_mmu)
 
@@ -168,6 +132,8 @@ ENTRY(cpu_resume)
         /* x7 contains hash index, let's use it to grab context pointer */
 	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	x0, [x0, x7, lsl #3]
+	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
 	/* load physical address of identity map page table in x1 */
@@ -181,5 +147,12 @@ ENTRY(cpu_resume)
 	 * pointer and x1 to contain physical address of 1:1 page tables
 	 */
 	bl	cpu_do_resume		// PC relative jump, MMU off
+	/* Can't access these by physical address once the MMU is on */
+	ldp	x19, x20, [x29, #16]
+	ldp	x21, x22, [x29, #32]
+	ldp	x23, x24, [x29, #48]
+	ldp	x25, x26, [x29, #64]
+	ldp	x27, x28, [x29, #80]
+	ldp	x29, lr, [x29]
 	b	cpu_resume_mmu		// Resume MMU, never returns
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 66055392f445..6fe46100685a 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -10,22 +10,22 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 
-extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long));
+
 /*
  * This is called by __cpu_suspend_enter() to save the state, and do whatever
  * flushing is required to ensure that when the CPU goes to sleep we have
  * the necessary data available when the caches are not searched.
  *
- * ptr: CPU context virtual address
+ * ptr: sleep_stack_data containing cpu state virtual address.
  * save_ptr: address of the location where the context physical address
  *           must be saved
  */
-void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr,
+void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
 				phys_addr_t *save_ptr)
 {
 	*save_ptr = virt_to_phys(ptr);
 
-	cpu_do_suspend(ptr);
+	cpu_do_suspend(&ptr->system_regs);
 	/*
 	 * Only flush the context that must be retrieved with the MMU
 	 * off. VA primitives ensure the flush is applied to all
@@ -51,6 +51,30 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 	hw_breakpoint_restore = hw_bp_restore;
 }
 
+void notrace __cpu_suspend_exit(void)
+{
+	/*
+	 * We are resuming from reset with the idmap active in TTBR0_EL1.
+	 * We must uninstall the idmap and restore the expected MMU
+	 * state before we can possibly return to userspace.
+	 */
+	cpu_uninstall_idmap();
+
+	/*
+	 * Restore per-cpu offset before any kernel
+	 * subsystem relying on it has a chance to run.
+	 */
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+
+	/*
+	 * Restore HW breakpoint registers to sane values
+	 * before debug exceptions are possibly reenabled
+	 * through local_dbg_restore.
+	 */
+	if (hw_breakpoint_restore)
+		hw_breakpoint_restore(NULL);
+}
+
 /*
  * cpu_suspend
  *
@@ -60,8 +84,9 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
  */
 int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 {
-	int ret;
+	int ret = 0;
 	unsigned long flags;
+	struct sleep_stack_data state;
 
 	/*
 	 * From this point debug exceptions are disabled to prevent
@@ -83,28 +108,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * page tables, so that the thread address space is properly
 	 * set-up on function return.
 	 */
-	ret = __cpu_suspend_enter(arg, fn);
-	if (ret == 0) {
-		/*
-		 * We are resuming from reset with the idmap active in TTBR0_EL1.
-		 * We must uninstall the idmap and restore the expected MMU
-		 * state before we can possibly return to userspace.
-		 */
-		cpu_uninstall_idmap();
-
-		/*
-		 * Restore per-cpu offset before any kernel
-		 * subsystem relying on it has a chance to run.
-		 */
-		set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	if (__cpu_suspend_enter(&state)) {
+		/* Call the suspend finisher */
+		ret = fn(arg);
 
 		/*
-		 * Restore HW breakpoint registers to sane values
-		 * before debug exceptions are possibly reenabled
-		 * through local_dbg_restore.
+		 * Never gets here, unless the suspend finisher fails.
+		 * Successful cpu_suspend() should return from cpu_resume(),
+		 * returning through this code path is considered an error
+		 * If the return value is set to 0 force ret = -EOPNOTSUPP
+		 * to make sure a proper error condition is propagated
 		 */
-		if (hw_breakpoint_restore)
-			hw_breakpoint_restore(NULL);
+		if (!ret)
+			ret = -EOPNOTSUPP;
+	} else {
+		__cpu_suspend_exit();
 	}
 
 	unpause_graph_tracing();
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 07/14] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-02-29 17:20 ` James Morse
                   ` (6 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack can
be accessed by VA, which avoids the need to convert-addresses and clean to
PoC on the suspend path.

MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.

struct sleep_save_sp is removed, replacing it with a single array of
pointers.

cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However these values all contain res0 bits that may be used
to enable future features.

Signed-off-by: James Morse <james.morse@arm.com>
Reviewed-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/suspend.h |  9 +++----
 arch/arm64/kernel/asm-offsets.c  |  3 ---
 arch/arm64/kernel/head.S         |  2 +-
 arch/arm64/kernel/setup.c        |  1 -
 arch/arm64/kernel/sleep.S        | 54 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 38 +++++-----------------------
 arch/arm64/mm/proc.S             | 53 ++++++++++++++++-----------------------
 7 files changed, 50 insertions(+), 110 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 365d8cdd0073..29d3c71433e1 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
-#define NR_CTX_REGS 11
+#define NR_CTX_REGS 10
 #define NR_CALLEE_SAVED_REGS 12
 
 /*
@@ -17,11 +17,6 @@ struct cpu_suspend_ctx {
 	u64 sp;
 } __aligned(16);
 
-struct sleep_save_sp {
-	phys_addr_t *save_ptr_stash;
-	phys_addr_t save_ptr_stash_phys;
-};
-
 /*
  * Memory to save the cpu state is allocated on the stack by
  * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
@@ -39,6 +34,8 @@ struct sleep_stack_data {
 	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
 };
 
+extern unsigned long *sleep_save_stash;
+
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 int __cpu_suspend_enter(struct sleep_stack_data *state);
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 56dcb25c7bc0..925ae2f48448 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -122,9 +122,6 @@ int main(void)
   DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
   DEFINE(MPIDR_HASH_MASK,	offsetof(struct mpidr_hash, mask));
   DEFINE(MPIDR_HASH_SHIFTS,	offsetof(struct mpidr_hash, shift_aff));
-  DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
-  DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
-  DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 50c2134a4aaf..7526a777384c 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -731,7 +731,7 @@ ENTRY(__early_cpu_boot_status)
  * If it isn't, park the CPU
  */
 	.section	".idmap.text", "ax"
-__enable_mmu:
+ENTRY(__enable_mmu)
 	mrs	x18, sctlr_el1			// preserve old SCTLR_EL1 value
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 42371f69def3..1e33d967c0ae 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -175,7 +175,6 @@ static void __init smp_build_mpidr_hash(void)
 	 */
 	if (mpidr_hash_size() > 4 * num_possible_cpus())
 		pr_warn("Large number of MPIDR hash buckets detected\n");
-	__flush_dcache_area(&mpidr_hash, sizeof(struct mpidr_hash));
 }
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index dca81612fe90..0e2b36f1fb44 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -73,8 +73,8 @@ ENTRY(__cpu_suspend_enter)
 	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
 
 	/* find the mpidr_hash */
-	ldr	x1, =sleep_save_sp
-	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
+	ldr	x1, =sleep_save_stash
+	ldr	x1, [x1]
 	mrs	x7, mpidr_el1
 	ldr	x9, =mpidr_hash
 	ldr	x10, [x9, #MPIDR_HASH_MASK]
@@ -87,40 +87,26 @@ ENTRY(__cpu_suspend_enter)
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
 
+	str	x0, [x1]
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	push	x29, lr
-	bl	__cpu_suspend_save
+	bl	cpu_do_suspend
 	pop	x29, lr
 	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
 
-/*
- * x0 must contain the sctlr value retrieved from restored context
- */
-	.pushsection	".idmap.text", "ax"
-ENTRY(cpu_resume_mmu)
-	ldr	x3, =cpu_resume_after_mmu
-	msr	sctlr_el1, x0		// restore sctlr_el1
-	isb
-	/*
-	 * Invalidate the local I-cache so that any instructions fetched
-	 * speculatively from the PoC are discarded, since they may have
-	 * been dynamically patched at the PoU.
-	 */
-	ic	iallu
-	dsb	nsh
-	isb
-	br	x3			// global jump to virtual address
-ENDPROC(cpu_resume_mmu)
-	.popsection
-cpu_resume_after_mmu:
-	mov	x0, #0			// return zero on success
-	ret
-ENDPROC(cpu_resume_after_mmu)
-
 ENTRY(cpu_resume)
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
+	/* enable the MMU early - so we can access sleep_save_stash by va */
+	adr_l	lr, __enable_mmu	/* __cpu_setup will return here */
+	ldr	x27, =_cpu_resume	/* __enable_mmu will branch here */
+	adrp	x25, idmap_pg_dir
+	adrp	x26, swapper_pg_dir
+	b	__cpu_setup
+
+ENTRY(_cpu_resume)
 	mrs	x1, mpidr_el1
 	adrp	x8, mpidr_hash
 	add x8, x8, #:lo12:mpidr_hash // x8 = struct mpidr_hash phys address
@@ -130,29 +116,27 @@ ENTRY(cpu_resume)
 	ldp	w5, w6, [x8, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x7, x3, x4, x5, x6, x1, x2
         /* x7 contains hash index, let's use it to grab context pointer */
-	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
+	ldr_l	x0, sleep_save_stash
 	ldr	x0, [x0, x7, lsl #3]
 	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
 	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	/* load physical address of identity map page table in x1 */
-	adrp	x1, idmap_pg_dir
 	mov	sp, x2
 	/* save thread_info */
 	and	x2, x2, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x2
 	/*
-	 * cpu_do_resume expects x0 to contain context physical address
-	 * pointer and x1 to contain physical address of 1:1 page tables
+	 * cpu_do_resume expects x0 to contain context address pointer
 	 */
-	bl	cpu_do_resume		// PC relative jump, MMU off
-	/* Can't access these by physical address once the MMU is on */
+	bl	cpu_do_resume
+
 	ldp	x19, x20, [x29, #16]
 	ldp	x21, x22, [x29, #32]
 	ldp	x23, x24, [x29, #48]
 	ldp	x25, x26, [x29, #64]
 	ldp	x27, x28, [x29, #80]
 	ldp	x29, lr, [x29]
-	b	cpu_resume_mmu		// Resume MMU, never returns
+	mov	x0, #0
+	ret
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 6fe46100685a..c6457cedd664 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -10,30 +10,11 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 
-
 /*
- * This is called by __cpu_suspend_enter() to save the state, and do whatever
- * flushing is required to ensure that when the CPU goes to sleep we have
- * the necessary data available when the caches are not searched.
- *
- * ptr: sleep_stack_data containing cpu state virtual address.
- * save_ptr: address of the location where the context physical address
- *           must be saved
+ * This is allocated by cpu_suspend_init(), and used to store a pointer to
+ * the 'struct sleep_stack_data' the contains a particular CPUs state.
  */
-void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
-				phys_addr_t *save_ptr)
-{
-	*save_ptr = virt_to_phys(ptr);
-
-	cpu_do_suspend(&ptr->system_regs);
-	/*
-	 * Only flush the context that must be retrieved with the MMU
-	 * off. VA primitives ensure the flush is applied to all
-	 * cache levels so context is pushed to DRAM.
-	 */
-	__flush_dcache_area(ptr, sizeof(*ptr));
-	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
-}
+unsigned long *sleep_save_stash;
 
 /*
  * This hook is provided so that cpu_suspend code can restore HW
@@ -137,22 +118,15 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	return ret;
 }
 
-struct sleep_save_sp sleep_save_sp;
-
 static int __init cpu_suspend_init(void)
 {
-	void *ctx_ptr;
-
 	/* ctx_ptr is an array of physical addresses */
-	ctx_ptr = kcalloc(mpidr_hash_size(), sizeof(phys_addr_t), GFP_KERNEL);
+	sleep_save_stash = kcalloc(mpidr_hash_size(), sizeof(*sleep_save_stash),
+				   GFP_KERNEL);
 
-	if (WARN_ON(!ctx_ptr))
+	if (WARN_ON(!sleep_save_stash))
 		return -ENOMEM;
 
-	sleep_save_sp.save_ptr_stash = ctx_ptr;
-	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
-	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
-
 	return 0;
 }
 early_initcall(cpu_suspend_init);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 9f6deacf41d2..c4317879b938 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -24,6 +24,7 @@
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
 #include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
@@ -63,62 +64,50 @@ ENTRY(cpu_do_suspend)
 	mrs	x2, tpidr_el0
 	mrs	x3, tpidrro_el0
 	mrs	x4, contextidr_el1
-	mrs	x5, mair_el1
-	mrs	x6, cpacr_el1
-	mrs	x7, ttbr1_el1
-	mrs	x8, tcr_el1
-	mrs	x9, vbar_el1
-	mrs	x10, mdscr_el1
-	mrs	x11, oslsr_el1
-	mrs	x12, sctlr_el1
+	mrs	x5, cpacr_el1
+	mrs	x6, tcr_el1
+	mrs	x7, vbar_el1
+	mrs	x8, mdscr_el1
+	mrs	x9, oslsr_el1
+	mrs	x10, sctlr_el1
 	stp	x2, x3, [x0]
-	stp	x4, x5, [x0, #16]
-	stp	x6, x7, [x0, #32]
-	stp	x8, x9, [x0, #48]
-	stp	x10, x11, [x0, #64]
-	str	x12, [x0, #80]
+	stp	x4, xzr, [x0, #16]
+	stp	x5, x6, [x0, #32]
+	stp	x7, x8, [x0, #48]
+	stp	x9, x10, [x0, #64]
 	ret
 ENDPROC(cpu_do_suspend)
 
 /**
  * cpu_do_resume - restore CPU register context
  *
- * x0: Physical address of context pointer
- * x1: ttbr0_el1 to be restored
- *
- * Returns:
- *	sctlr_el1 value in x0
+ * x0: Address of context pointer
  */
 ENTRY(cpu_do_resume)
-	/*
-	 * Invalidate local tlb entries before turning on MMU
-	 */
-	tlbi	vmalle1
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
-	ldp	x6, x7, [x0, #32]
-	ldp	x8, x9, [x0, #48]
-	ldp	x10, x11, [x0, #64]
-	ldr	x12, [x0, #80]
+	ldp	x6, x8, [x0, #32]
+	ldp	x9, x10, [x0, #48]
+	ldp	x11, x12, [x0, #64]
 	msr	tpidr_el0, x2
 	msr	tpidrro_el0, x3
 	msr	contextidr_el1, x4
-	msr	mair_el1, x5
 	msr	cpacr_el1, x6
-	msr	ttbr0_el1, x1
-	msr	ttbr1_el1, x7
-	tcr_set_idmap_t0sz x8, x7
+
+	/* Don't change t0sz here, mask those bits when restoring */
+	mrs	x5, tcr_el1
+	bfi	x8, x5, TCR_T0SZ_OFFSET, TCR_TxSZ_WIDTH
+
 	msr	tcr_el1, x8
 	msr	vbar_el1, x9
 	msr	mdscr_el1, x10
+	msr	sctlr_el1, x12
 	/*
 	 * Restore oslsr_el1 by writing oslar_el1
 	 */
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	reset_pmuserenr_el0 x0			// Disable PMU access from EL0
-	mov	x0, x12
-	dsb	nsh		// Make sure local tlb invalidation completed
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 08/14] arm64: kernel: Include _AC definition in page.h
  2016-02-29 17:20 ` James Morse
                   ` (7 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/page.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index ae615b9d9a55..17b45f7d96d3 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -19,6 +19,8 @@
 #ifndef __ASM_PAGE_H
 #define __ASM_PAGE_H
 
+#include <linux/const.h>
+
 /* PAGE_SHIFT determines the page size */
 /* CONT_SHIFT determines the number of pages which can be tracked together  */
 #ifdef CONFIG_ARM64_64K_PAGES
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 09/14] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  2016-02-29 17:20 ` James Morse
                   ` (8 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

KERNEL_START and KERNEL_END are useful outside head.S, move them to a
header file.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/memory.h | 3 +++
 arch/arm64/kernel/head.S        | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 5f8667a99e41..22c8c53d4d65 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -71,6 +71,9 @@
 
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
 
+#define KERNEL_START      _text
+#define KERNEL_END        _end
+
 /*
  * The size of the KASAN shadow region. This should be 1/8th of the
  * size of the entire kernel virtual address space.
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 7526a777384c..7e4f840e1de7 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -50,9 +50,6 @@
 #error TEXT_OFFSET must be less than 2MB
 #endif
 
-#define KERNEL_START	_text
-#define KERNEL_END	_end
-
 /*
  * Kernel startup entry point.
  * ---------------------------
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 10/14] arm64: Add new asm macro copy_page
  2016-02-29 17:20 ` James Morse
                   ` (9 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

Kexec and hibernate need to copy pages of memory, but may not have all
of the kernel mapped, and are unable to call copy_page().

Add a simplistic copy_page() macro, that can be inlined in these
situations. lib/copy_page.S provides a bigger better version, but
uses more registers.

Signed-off-by: Geoff Levand <geoff@infradead.org>
[Changed asm label to 9998, added commit message]
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index c72474c2d4dc..a04fd7a9c102 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -24,6 +24,7 @@
 #define __ASM_ASSEMBLER_H
 
 #include <asm/asm-offsets.h>
+#include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
@@ -292,6 +293,24 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+9998:	ldp	\t1, \t2, [\src]
+	ldp	\t3, \t4, [\src, #16]
+	ldp	\t5, \t6, [\src, #32]
+	ldp	\t7, \t8, [\src, #48]
+	add	\src, \src, #64
+	stnp	\t1, \t2, [\dest]
+	stnp	\t3, \t4, [\dest, #16]
+	stnp	\t5, \t6, [\dest, #32]
+	stnp	\t7, \t8, [\dest, #48]
+	add	\dest, \dest, #64
+	tst	\src, #(PAGE_SIZE - 1)
+	b.ne	9998b
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 11/14] arm64: head.S: el2_setup() to accept sctlr_el1 as an argument
  2016-02-29 17:20 ` James Morse
                   ` (10 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

el2_setup() doesn't just configure el2, it configures el1 too. This
means we can't use it to re-configure el2 after resume from hibernate,
as we will be returned to el1 with the MMU turned off.

Split the sctlr_el1 setting code up, so that el2_setup() accepts an
initial value as an argument. This value will be ignored if el2_setup()
is called at el1: the running value will be preserved with endian
correction.

Hibernate can now call el2_setup() to re-configure el2, passing the
current sctlr_el1 as an argument.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/assembler.h | 10 ++++++++++
 arch/arm64/kernel/head.S           | 19 ++++++++++++-------
 arch/arm64/kernel/sleep.S          |  1 +
 3 files changed, 23 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index a04fd7a9c102..627d66efec68 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -311,6 +311,16 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * Generate the initial sctlr_el1 value for el2_setup to set if we boot at EL2.
+ */
+	.macro init_sctlr_el1 reg
+		mov	\reg, #0x0800		// Set/clear RES{1,0} bits
+CPU_BE(		movk	\reg, #0x33d0, lsl #16)	// Set EE and E0E on BE systems
+CPU_LE(		movk	\reg, #0x30d0, lsl #16)	// Clear EE and E0E on LE systems
+	.endm
+
+/*
+
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 7e4f840e1de7..190010225e93 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -207,6 +207,7 @@ section_table:
 
 ENTRY(stext)
 	bl	preserve_boot_args
+	init_sctlr_el1	x0
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
 	mov	x23, xzr			// KASLR offset, defaults to 0
 	adrp	x24, __PHYS_OFFSET
@@ -513,8 +514,12 @@ ENTRY(kimage_vaddr)
  *
  * Returns either BOOT_CPU_MODE_EL1 or BOOT_CPU_MODE_EL2 in x20 if
  * booted in EL1 or EL2 respectively.
+ *
+ * If booted in EL2, SCTLR_EL1 will be initialised with the value in x0
+ * (otherwise the existing value will be preserved, with endian correction).
  */
 ENTRY(el2_setup)
+	mov	x1, x0				// preserve passed-in sctlr_el1
 	mrs	x0, CurrentEL
 	cmp	x0, #CurrentEL_EL2
 	b.ne	1f
@@ -523,7 +528,7 @@ CPU_BE(	orr	x0, x0, #(1 << 25)	)	// Set the EE bit for EL2
 CPU_LE(	bic	x0, x0, #(1 << 25)	)	// Clear the EE bit for EL2
 	msr	sctlr_el2, x0
 	b	2f
-1:	mrs	x0, sctlr_el1
+1:	mrs	x0, sctlr_el1			// ignore passed-in sctlr_el1
 CPU_BE(	orr	x0, x0, #(3 << 24)	)	// Set the EE and E0E bits for EL1
 CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
 	msr	sctlr_el1, x0
@@ -559,6 +564,10 @@ CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
 
 3:
 #endif
+	/* use sctlr_el1 value we were provided with */
+CPU_BE(	orr	x1, x1, #(3 << 24)	)	// Set the EE and E0E bits for EL1
+CPU_LE(	bic	x1, x1, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
+	msr	sctlr_el1, x1
 
 	/* Populate ID registers. */
 	mrs	x0, midr_el1
@@ -566,12 +575,6 @@ CPU_LE(	bic	x0, x0, #(3 << 24)	)	// Clear the EE and E0E bits for EL1
 	msr	vpidr_el2, x0
 	msr	vmpidr_el2, x1
 
-	/* sctlr_el1 */
-	mov	x0, #0x0800			// Set/clear RES{1,0} bits
-CPU_BE(	movk	x0, #0x33d0, lsl #16	)	// Set EE and E0E on BE systems
-CPU_LE(	movk	x0, #0x30d0, lsl #16	)	// Clear EE and E0E on LE systems
-	msr	sctlr_el1, x0
-
 	/* Coprocessor traps. */
 	mov	x0, #0x33ff
 	msr	cptr_el2, x0			// Disable copro. traps to EL2
@@ -641,6 +644,7 @@ ENTRY(__boot_cpu_mode)
 	 * cores are held until we're ready for them to initialise.
 	 */
 ENTRY(secondary_holding_pen)
+	init_sctlr_el1	x0
 	bl	el2_setup			// Drop to EL1, w20=cpu_boot_mode
 	bl	set_cpu_boot_mode_flag
 	mrs	x0, mpidr_el1
@@ -659,6 +663,7 @@ ENDPROC(secondary_holding_pen)
 	 * be used where CPUs are brought online dynamically by the kernel.
 	 */
 ENTRY(secondary_entry)
+	init_sctlr_el1	x0
 	bl	el2_setup			// Drop to EL1
 	bl	set_cpu_boot_mode_flag
 	b	secondary_startup
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 0e2b36f1fb44..5c66252d2b9f 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -98,6 +98,7 @@ ENDPROC(__cpu_suspend_enter)
 	.ltorg
 
 ENTRY(cpu_resume)
+	init_sctlr_el1	x0
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
 	/* enable the MMU early - so we can access sleep_save_stash by va */
 	adr_l	lr, __enable_mmu	/* __cpu_setup will return here */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 12/14] PM / Hibernate: Call flush_icache_range() on pages restored in-place
  2016-02-29 17:20 ` James Morse
@ 2016-02-29 17:20   ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, Marc Zyngier,
	James Morse, Rafael J . Wysocki, Pavel Machek, linux-pm

Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
---
 kernel/power/swap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 12cd989dadf6..a30645d2e93f 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -37,6 +37,14 @@
 #define HIBERNATE_SIG	"S1SUSPEND"
 
 /*
+ * When reading an {un,}compressed image, we may restore pages in place,
+ * in which case some architectures need these pages cleaning before they
+ * can be executed. We don't know which pages these may be, so clean the lot.
+ */
+bool clean_pages_on_read = false;
+bool clean_pages_on_decompress = false;
+
+/*
  *	The swap map is a data structure used for keeping track of each page
  *	written to a swap partition.  It consists of many swap_map_page
  *	structures that contain each an array of MAP_PAGE_ENTRIES swap entries.
@@ -241,6 +249,9 @@ static void hib_end_io(struct bio *bio)
 
 	if (bio_data_dir(bio) == WRITE)
 		put_page(page);
+	else if (clean_pages_on_read)
+		flush_icache_range((unsigned long)page_address(page),
+				   (unsigned long)page_address(page) + PAGE_SIZE);
 
 	if (bio->bi_error && !hb->error)
 		hb->error = bio->bi_error;
@@ -1049,6 +1060,7 @@ static int load_image(struct swap_map_handle *handle,
 
 	hib_init_batch(&hb);
 
+	clean_pages_on_read = true;
 	printk(KERN_INFO "PM: Loading image data pages (%u pages)...\n",
 		nr_to_read);
 	m = nr_to_read / 10;
@@ -1124,6 +1136,10 @@ static int lzo_decompress_threadfn(void *data)
 		d->unc_len = LZO_UNC_SIZE;
 		d->ret = lzo1x_decompress_safe(d->cmp + LZO_HEADER, d->cmp_len,
 		                               d->unc, &d->unc_len);
+		if (clean_pages_on_decompress)
+			flush_icache_range((unsigned long)d->unc,
+					   (unsigned long)d->unc + d->unc_len);
+
 		atomic_set(&d->stop, 1);
 		wake_up(&d->done);
 	}
@@ -1189,6 +1205,8 @@ static int load_image_lzo(struct swap_map_handle *handle,
 	}
 	memset(crc, 0, offsetof(struct crc_data, go));
 
+	clean_pages_on_decompress = true;
+
 	/*
 	 * Start the decompression threads.
 	 */
-- 
2.7.0


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 12/14] PM / Hibernate: Call flush_icache_range() on pages restored in-place
@ 2016-02-29 17:20   ` James Morse
  0 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
Acked-by: Rafael J. Wysocki <rjw@rjwysocki.net>
---
 kernel/power/swap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 12cd989dadf6..a30645d2e93f 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -37,6 +37,14 @@
 #define HIBERNATE_SIG	"S1SUSPEND"
 
 /*
+ * When reading an {un,}compressed image, we may restore pages in place,
+ * in which case some architectures need these pages cleaning before they
+ * can be executed. We don't know which pages these may be, so clean the lot.
+ */
+bool clean_pages_on_read = false;
+bool clean_pages_on_decompress = false;
+
+/*
  *	The swap map is a data structure used for keeping track of each page
  *	written to a swap partition.  It consists of many swap_map_page
  *	structures that contain each an array of MAP_PAGE_ENTRIES swap entries.
@@ -241,6 +249,9 @@ static void hib_end_io(struct bio *bio)
 
 	if (bio_data_dir(bio) == WRITE)
 		put_page(page);
+	else if (clean_pages_on_read)
+		flush_icache_range((unsigned long)page_address(page),
+				   (unsigned long)page_address(page) + PAGE_SIZE);
 
 	if (bio->bi_error && !hb->error)
 		hb->error = bio->bi_error;
@@ -1049,6 +1060,7 @@ static int load_image(struct swap_map_handle *handle,
 
 	hib_init_batch(&hb);
 
+	clean_pages_on_read = true;
 	printk(KERN_INFO "PM: Loading image data pages (%u pages)...\n",
 		nr_to_read);
 	m = nr_to_read / 10;
@@ -1124,6 +1136,10 @@ static int lzo_decompress_threadfn(void *data)
 		d->unc_len = LZO_UNC_SIZE;
 		d->ret = lzo1x_decompress_safe(d->cmp + LZO_HEADER, d->cmp_len,
 		                               d->unc, &d->unc_len);
+		if (clean_pages_on_decompress)
+			flush_icache_range((unsigned long)d->unc,
+					   (unsigned long)d->unc + d->unc_len);
+
 		atomic_set(&d->stop, 1);
 		wake_up(&d->done);
 	}
@@ -1189,6 +1205,8 @@ static int load_image_lzo(struct swap_map_handle *handle,
 	}
 	memset(crc, 0, offsetof(struct crc_data, go));
 
+	clean_pages_on_decompress = true;
+
 	/*
 	 * Start the decompression threads.
 	 */
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 13/14] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-02-29 17:20 ` James Morse
                   ` (12 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  -1 siblings, 0 replies; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for hibernate/suspend-to-disk.

Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.

Restore creates a set of temporary page tables, covering only the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. The copied code executes in the lower half of the
address space, and once complete, restores the original kernel's page
tables. It then calls into cpu_resume(), and follows the normal
cpu_suspend() path back into the suspend code.

To restore a kernel using KASLR, the address of the page tables, and
cpu_resume() are stored in the hibernate arch-header and the el2
vectors are pivotted via the 'safe' page in low memory. This also permits
us to resume using a different version of the kernel to the version that
hibernated, but because the MMU isn't turned off during resume, the
MMU settings must be the same between both kernels. To ensure this, the
value of the translation control register (TCR_EL1) is also included in the
hibernate arch-header, this means your resume kernel must have the same
page size, and virtual address space size.

Signed-off-by: James Morse <james.morse@arm.com>
Tested-by: Kevin Hilman <khilman@baylibre.com> # Tested on Juno R2
---
 arch/arm64/Kconfig                |   7 +
 arch/arm64/include/asm/suspend.h  |   7 +
 arch/arm64/kernel/Makefile        |   1 +
 arch/arm64/kernel/asm-offsets.c   |   5 +
 arch/arm64/kernel/hibernate-asm.S | 166 +++++++++++++
 arch/arm64/kernel/hibernate.c     | 503 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S   |  15 ++
 7 files changed, 704 insertions(+)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index dbd47bb9caf2..a007441bfd52 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -942,6 +942,13 @@ menu "Power management options"
 
 source "kernel/power/Kconfig"
 
+config ARCH_HIBERNATION_POSSIBLE
+	def_bool y
+
+config ARCH_HIBERNATION_HEADER
+	def_bool y
+	depends on HIBERNATION
+
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 29d3c71433e1..024d623f662e 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -40,4 +40,11 @@ extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 int __cpu_suspend_enter(struct sleep_stack_data *state);
 void __cpu_suspend_exit(void);
+void _cpu_resume(void);
+
+int swsusp_arch_suspend(void);
+int swsusp_arch_resume(void);
+int arch_hibernation_header_save(void *addr, unsigned int max_size);
+int arch_hibernation_header_restore(void *addr);
+
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 3793003e16a2..2173149d8954 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -45,6 +45,7 @@ arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_ARM64_ACPI_PARKING_PROTOCOL)	+= acpi_parking_protocol.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
 arm64-obj-$(CONFIG_RANDOMIZE_BASE)	+= kaslr.o
+arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 925ae2f48448..278f610048eb 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
+#include <linux/suspend.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -127,5 +128,9 @@ int main(void)
 #endif
   DEFINE(ARM_SMCCC_RES_X0_OFFS,	offsetof(struct arm_smccc_res, a0));
   DEFINE(ARM_SMCCC_RES_X2_OFFS,	offsetof(struct arm_smccc_res, a2));
+  BLANK();
+  DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
+  DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
+  DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   return 0;
 }
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
new file mode 100644
index 000000000000..28c814aee608
--- /dev/null
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -0,0 +1,166 @@
+#include <linux/linkage.h>
+#include <linux/errno.h>
+
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+#include <asm/cputype.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+#include <asm/virt.h>
+
+/*
+ * Corrupt memory.
+ *
+ * Loads temporary page tables then restores the memory image.
+ * Finally branches to cpu_resume() to restore the state saved by
+ * swsusp_arch_suspend().
+ *
+ * Because this code has to be copied to a safe_page, it can't call out to
+ * other functions by PC-relative address. Also remember that it may be
+ * mid-way through over-writing other functions. For this reason it contains
+ * code from flush_icache_range() and uses the copy_page() macro.
+ *
+ * All of memory gets written to, including code. We need to clean the kernel
+ * text to the Point of Coherence (PoC) before secondary cores can be booted.
+ * Because the kernel modules and executable pages mapped to user space are
+ * also written as data, we clean all pages we touch to the Point of
+ * Unification (PoU).
+ *
+ * We use el2_setup() to reconfigure el2. This code needs cleaning to PoC by VA,
+ * but called with its physical address, as we left el2 with the MMU turned off.
+ *
+ * x0: physical address of temporary page tables
+ * x1: physical address of swapper page tables
+ * x2: address of cpu_resume
+ * x3: linear map address of restore_pblist in the current kernel
+ * x4: virtual address of the page containing el2_setup, used to clean to PoC
+ * x5: physical address of el2_setup, used to execute at el2
+ */
+.pushsection    ".hibernate_exit.text", "ax"
+ENTRY(swsusp_arch_suspend_exit)
+	/* Temporary page tables are a copy, so no need for a trampoline here */
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is	/* invalidate intermediate caching entries */
+	dsb	ish
+
+	mov	x21, x1
+	mov	x30, x2		/* el2_setup() will eret to the lr directly */
+	mov	x24, x4
+	mov	x25, x5
+
+	/* walk the restore_pblist and use copy_page() to over-write memory */
+	mov	x19, x3
+
+1:	ldr	x10, [x19, #HIBERN_PBE_ORIG]
+	mov	x0, x10
+	ldr	x1, [x19, #HIBERN_PBE_ADDR]
+
+	copy_page	x0, x1, x2, x3, x4, x5, x6, x7, x8, x9
+
+	add	x1, x10, #PAGE_SIZE
+	/* Clean the copied page to PoU - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x10, x3
+2:	dc	cvau, x4	/* clean D line / unified line */
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	2b
+
+	ldr	x19, [x19, #HIBERN_PBE_NEXT]
+	cbnz	x19, 1b
+
+
+	/* switch to the restored kernels page tables, to reconfigure el2 */
+	msr	ttbr1_el1, x21  /* physical address of swapper page tables */
+	isb
+	tlbi	vmalle1is	/* invalidate intermediate caching entries */
+	ic	ialluis
+	dsb	ish		/* also waits for PoU cleaning to finish */
+	isb
+
+
+	cbz	x24, 4f		/* Did we boot at el1? */
+	/* Clean el2_setup's page to PoC */
+	mov	x0, x24
+	/*
+	 * We don't know if el2_setup() overlaps a page boundary, clean two
+	 * pages, just in case.
+	 */
+	add	x1, x0, #2*PAGE_SIZE
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x0, x3
+3:	dc	cvac, x4
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	3b
+
+	/* reconfigure el2 */
+	mrs	x0, sctlr_el1
+	hvc	#0
+
+	/*
+	 * el2_setup() will eret to the location in x30, so we
+	 * only get here if we booted at el1.
+	 */
+
+4:	ret
+
+	.ltorg
+ENDPROC(swsusp_arch_suspend_exit)
+
+/*
+ * Restore the hyp stub. Once we know where in memory el2_setup is, we
+ * can use it to re-initialise el2. This must be done before the hibernate
+ * page is unmapped.
+ *
+ * x0: The current sctlr_el1 value, to be re-loaded
+ * x25: The physical address of el2_setup __pa(el2_setup)
+ * x30: Where el2_setup() should eret to
+ */
+el1_sync:
+	br	x25
+ENDPROC(el1_sync)
+
+.macro invalid_vector	label
+\label:
+	b \label
+ENDPROC(\label)
+.endm
+
+	invalid_vector	el2_sync_invalid
+	invalid_vector	el2_irq_invalid
+	invalid_vector	el2_fiq_invalid
+	invalid_vector	el2_error_invalid
+	invalid_vector	el1_sync_invalid
+	invalid_vector	el1_irq_invalid
+	invalid_vector	el1_fiq_invalid
+	invalid_vector	el1_error_invalid
+
+/* el2 vectors - switch el2 here while we restore the memory image. */
+	.align 11
+ENTRY(hibernate_el2_vectors)
+	ventry	el2_sync_invalid		// Synchronous EL2t
+	ventry	el2_irq_invalid			// IRQ EL2t
+	ventry	el2_fiq_invalid			// FIQ EL2t
+	ventry	el2_error_invalid		// Error EL2t
+
+	ventry	el2_sync_invalid		// Synchronous EL2h
+	ventry	el2_irq_invalid			// IRQ EL2h
+	ventry	el2_fiq_invalid			// FIQ EL2h
+	ventry	el2_error_invalid		// Error EL2h
+
+	ventry	el1_sync			// Synchronous 64-bit EL1
+	ventry	el1_irq_invalid			// IRQ 64-bit EL1
+	ventry	el1_fiq_invalid			// FIQ 64-bit EL1
+	ventry	el1_error_invalid		// Error 64-bit EL1
+
+	ventry	el1_sync_invalid		// Synchronous 32-bit EL1
+	ventry	el1_irq_invalid			// IRQ 32-bit EL1
+	ventry	el1_fiq_invalid			// FIQ 32-bit EL1
+	ventry	el1_error_invalid		// Error 32-bit EL1
+END(hibernate_el2_vectors)
+
+.popsection
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
new file mode 100644
index 000000000000..279c556ee24b
--- /dev/null
+++ b/arch/arm64/kernel/hibernate.c
@@ -0,0 +1,503 @@
+/*:
+ * Hibernate support specific for ARM64
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *  https://lkml.org/lkml/2010/6/18/4
+ *  https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *  https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+#define pr_fmt(x) "hibernate: " x
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/version.h>
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/irqflags.h>
+#include <asm/memory.h>
+#include <asm/mmu_context.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/sections.h>
+#include <asm/suspend.h>
+#include <asm/virt.h>
+
+/* These are necessary to build without ifdefery */
+#ifndef pmd_index
+#define pmd_index(x)	0
+#endif
+#ifndef pud_index
+#define pud_index(x)	0
+#endif
+
+#define TCR_IPS_BITS (0x7UL<<32)
+
+/*
+ * Hibernate core relies on this value being 0 on resume, and marks it
+ * __nosavedata assuming it will keep the resume kernel's '0' value. This
+ * doesn't happen with either KASLR or resuming with a different kernel.
+ *
+ * defined as "__visible int in_suspend __nosavedata" in
+ * kernel/power/hibernate.c
+ */
+extern int in_suspend;
+
+/*
+ * This value is written to the hibernate arch header, and prevents resuming
+ * from a hibernate image produced by an incompatible kernel. If you change
+ * a value that isn't saved/restored by hibernate, you should change this value.
+ *
+ * For example, if the mair_el1 values used by the kernel are changed, you
+ * should prevent resuming from a kernel with incompatible attributes, as these
+ * aren't saved/restored.
+ */
+#define HIBERNATE_VERSION	KERNEL_VERSION(4, 6, 0)
+
+/* Find a symbols alias in the linear map */
+#define LMADDR(x)	phys_to_virt(virt_to_phys(x))
+
+/*
+ * Start/end of the hibernate exit code, this must be copied to a 'safe'
+ * location in memory, and executed from there.
+ */
+extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
+
+/* temporary el2 vectors in the __hibernate_exit_text section. */
+extern char hibernate_el2_vectors[];
+
+/* el2_setup(), in head.S, use to re-configure el2 */
+extern char el2_setup[];
+
+struct arch_hibernate_hdr_invariants {
+	unsigned long	version;
+	unsigned long	tcr_el1;	/* page_size, va bit etc */
+};
+
+/* These values need to be know across a hibernate/restore. */
+static struct arch_hibernate_hdr {
+	struct arch_hibernate_hdr_invariants invariants;
+
+	/* These are needed to find the relocated kernel if built with kaslr */
+	phys_addr_t	ttbr1_el1;
+	void		(*reenter_kernel)(void);
+
+	/*
+	 * We need to know where el2_setup() is after restore to re-configure
+	 * el2. But first, we need to clean it to PoC by va, as we left el2 with
+	 * the MMU off. Both values will be 0 if we booted@el1.
+	 */
+	phys_addr_t	el2_setup_phys;
+	unsigned long	el2_setup_page;
+} resume_hdr;
+
+static inline void arch_hdr_invariants(struct arch_hibernate_hdr_invariants *i)
+{
+	i->version = HIBERNATE_VERSION;
+	asm volatile("mrs	%0, tcr_el1" : "=r"(i->tcr_el1));
+
+	/* IPS bits vary on big/little systems, mask them out */
+	i->tcr_el1 &= ~TCR_IPS_BITS;
+}
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = virt_to_pfn(&__nosave_begin);
+	unsigned long nosave_end_pfn = virt_to_pfn(&__nosave_end - 1);
+
+	return (pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+}
+
+void notrace restore_processor_state(void)
+{
+}
+
+int arch_hibernation_header_save(void *addr, unsigned int max_size)
+{
+	struct arch_hibernate_hdr *hdr = addr;
+
+	if (max_size < sizeof(*hdr))
+		return -EOVERFLOW;
+
+	arch_hdr_invariants(&hdr->invariants);
+	hdr->ttbr1_el1		= virt_to_phys(swapper_pg_dir);
+	hdr->reenter_kernel	= _cpu_resume;
+	if (is_hyp_mode_available()) {
+		hdr->el2_setup_page	= (unsigned long)el2_setup & PAGE_MASK;
+		hdr->el2_setup_phys	= virt_to_phys(el2_setup);
+	} else {
+		hdr->el2_setup_page = 0;
+		hdr->el2_setup_phys = 0;
+	}
+
+	return 0;
+}
+EXPORT_SYMBOL(arch_hibernation_header_save);
+
+int arch_hibernation_header_restore(void *addr)
+{
+	struct arch_hibernate_hdr_invariants invariants;
+	struct arch_hibernate_hdr *hdr = addr;
+
+	/*
+	 * If this header is ancient, it may be smaller than we expect.
+	 * Test the version first.
+	 */
+	if (hdr->invariants.version != HIBERNATE_VERSION) {
+		pr_crit("Hibernate image not compatible with this kernel version!\n");
+		return -EINVAL;
+	}
+
+	arch_hdr_invariants(&invariants);
+	if (memcmp(&hdr->invariants, &invariants, sizeof(invariants))) {
+		pr_crit("Hibernate image not compatible with this kernel configuration!\n");
+		return -EINVAL;
+	}
+
+	if ((is_hyp_mode_available() && !hdr->el2_setup_page) ||
+	    (!is_hyp_mode_available() && hdr->el2_setup_page)) {
+		pr_crit("Hibernate/resume kernels booted with differing virtualisation support. (EL1/EL2) ");
+		return -EINVAL;
+	}
+
+	resume_hdr = *hdr;
+
+	return 0;
+}
+EXPORT_SYMBOL(arch_hibernation_header_restore);
+
+/*
+ * Copies length bytes, starting at src_start into an new page,
+ * perform cache maintentance, then map it (nearly) at the bottom of memory
+ * as executable.
+ *
+ * This is used by hibernate to copy the code it needs to execute when
+ * overwriting the kernel text. This function generates a new set of page
+ * tables, which it loads into ttbr0.
+ *
+ * Length is provided as we probably only want 4K of data, even on a 64K
+ * page system. We don't use the very bottom page, so that dereferencing
+ * NULL continues to have the expected behaviour.
+ */
+static int create_safe_exec_page(void *src_start, size_t length,
+				 void **dst_addr, phys_addr_t *phys_dst_addr,
+				 unsigned long (*allocator)(gfp_t mask),
+				 gfp_t mask)
+{
+	int rc = 0;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	unsigned long dst = allocator(mask);
+
+	if (!dst) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	memcpy((void *)dst, src_start, length);
+	flush_icache_range(dst, dst + length);
+
+	pgd = (pgd_t *)allocator(mask) + pgd_index(PAGE_SIZE);
+	if (PTRS_PER_PGD > 1) {
+		pud = (pud_t *)allocator(mask);
+		if (!pud) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pgd(pgd, __pgd(virt_to_phys(pud) | PUD_TYPE_TABLE));
+	}
+
+	pud = pud_offset(pgd, PAGE_SIZE);
+	if (PTRS_PER_PUD > 1) {
+		pmd = (pmd_t *)allocator(mask);
+		if (!pmd) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pud(pud, __pud(virt_to_phys(pmd) | PUD_TYPE_TABLE));
+	}
+
+	pmd = pmd_offset(pud, PAGE_SIZE);
+	if (PTRS_PER_PMD > 1) {
+		pte = (pte_t *)allocator(mask);
+		if (!pte) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pmd(pmd, __pmd(virt_to_phys(pte) | PMD_TYPE_TABLE));
+	}
+
+	pte = pte_offset_kernel(pmd, PAGE_SIZE);
+	set_pte_at(&init_mm, dst, pte,
+		   __pte(virt_to_phys((void *)dst) |
+			 pgprot_val(PAGE_KERNEL_EXEC)));
+
+	/* Load our new page tables */
+	asm volatile("msr	ttbr0_el1, %0;"
+		     "isb;"
+		     "tlbi	vmalle1is;"
+		     "dsb	ish" : : "r"(virt_to_phys(pgd)));
+
+	*dst_addr = (void *)(PAGE_SIZE);
+	*phys_dst_addr = virt_to_phys((void *)dst);
+
+out:
+	return rc;
+}
+
+
+int swsusp_arch_suspend(void)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct sleep_stack_data state;
+
+	local_dbg_save(flags);
+
+	if (__cpu_suspend_enter(&state)) {
+		ret = swsusp_save();
+	} else {
+		void *lm_kernel_start;
+
+		/* Clean kernel to PoC for secondary core startup */
+		lm_kernel_start = LMADDR(KERNEL_START);
+		__flush_dcache_area(lm_kernel_start, KERNEL_END - KERNEL_START);
+
+		/*
+		 * Tell the hibernation core that we've just restored
+		 * the memory
+		 */
+		in_suspend = 0;
+
+		__cpu_suspend_exit();
+	}
+
+	local_dbg_restore(flags);
+
+	return ret;
+}
+
+static int copy_pte(pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long start,
+		    unsigned long end)
+{
+	unsigned long next;
+	unsigned long addr = start;
+	pte_t *src_pte = pte_offset_kernel(src_pmd, start);
+	pte_t *dst_pte = pte_offset_kernel(dst_pmd, start);
+
+	do {
+		next = addr + PAGE_SIZE;
+		if (pte_val(*src_pte))
+			set_pte(dst_pte,
+				__pte(pte_val(*src_pte) & ~PTE_RDONLY));
+	} while (dst_pte++, src_pte++, addr = next, addr != end);
+
+	return 0;
+}
+
+static int copy_pmd(pud_t *dst_pud, pud_t *src_pud, unsigned long start,
+		    unsigned long end)
+{
+	int rc = 0;
+	pte_t *dst_pte;
+	unsigned long next;
+	unsigned long addr = start;
+	pmd_t *src_pmd = pmd_offset(src_pud, start);
+	pmd_t *dst_pmd = pmd_offset(dst_pud, start);
+
+	do {
+		next = pmd_addr_end(addr, end);
+		if (!pmd_val(*src_pmd))
+			continue;
+
+		if (pmd_table(*(src_pmd))) {
+			dst_pte = (pte_t *)get_safe_page(GFP_ATOMIC);
+			if (!dst_pte) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pmd(dst_pmd, __pmd(virt_to_phys(dst_pte)
+					       | PMD_TYPE_TABLE));
+
+			rc = copy_pte(dst_pmd, src_pmd, addr, next);
+			if (rc)
+				break;
+		} else {
+			set_pmd(dst_pmd,
+				__pmd(pmd_val(*src_pmd) & ~PMD_SECT_RDONLY));
+		}
+	} while (dst_pmd++, src_pmd++, addr = next, addr != end);
+
+	return rc;
+}
+
+static int copy_pud(pgd_t *dst_pgd, pgd_t *src_pgd, unsigned long start,
+		    unsigned long end)
+{
+	int rc = 0;
+	pmd_t *dst_pmd;
+	unsigned long next;
+	unsigned long addr = start;
+	pud_t *src_pud = pud_offset(src_pgd, start);
+	pud_t *dst_pud = pud_offset(dst_pgd, start);
+
+	do {
+		next = pud_addr_end(addr, end);
+		if (!pud_val(*src_pud))
+			continue;
+
+		if (pud_table(*(src_pud))) {
+			if (PTRS_PER_PMD != 1) {
+				dst_pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+				if (!dst_pmd) {
+					rc = -ENOMEM;
+					break;
+				}
+
+				set_pud(dst_pud, __pud(virt_to_phys(dst_pmd)
+						       | PUD_TYPE_TABLE));
+			}
+
+			rc = copy_pmd(dst_pud, src_pud, addr, next);
+			if (rc)
+				break;
+		} else {
+			set_pud(dst_pud,
+				__pud(pud_val(*src_pud) & ~PMD_SECT_RDONLY));
+		}
+	} while (dst_pud++, src_pud++, addr = next, addr != end);
+
+	return rc;
+}
+
+static int copy_page_tables(pgd_t *dst_pgd, unsigned long start,
+			    unsigned long end)
+{
+	int rc = 0;
+	pud_t *dst_pud;
+	unsigned long next;
+	unsigned long addr = start;
+	pgd_t *src_pgd = pgd_offset_k(start);
+
+	dst_pgd += pgd_index(start);
+
+	do {
+		next = pgd_addr_end(addr, end);
+		if (!pgd_val(*src_pgd))
+			continue;
+
+		if (PTRS_PER_PUD != 1) {
+			dst_pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+			if (!dst_pud) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pgd(dst_pgd, __pgd(virt_to_phys(dst_pud)
+					       | PUD_TYPE_TABLE));
+		}
+
+		rc = copy_pud(dst_pgd, src_pgd, addr, next);
+		if (rc)
+			break;
+	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
+
+	return rc;
+}
+
+/*
+ * Setup then Resume from the hibernate image using swsusp_arch_suspend_exit().
+ *
+ * Memory allocated by get_safe_page() will be dealt with by the hibernate code,
+ * we don't need to free it here.
+ */
+int swsusp_arch_resume(void)
+{
+	int rc = 0;
+	size_t exit_size;
+	pgd_t *tmp_pg_dir;
+	void *lm_restore_pblist;
+	phys_addr_t phys_hibernate_exit;
+	void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *,
+					  void *, unsigned long, phys_addr_t);
+
+	/*
+	 * Copy swsusp_arch_suspend_exit() to a safe page. This will generate
+	 * a new set of ttbr0 page tables and load them.
+	 */
+	exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
+	rc = create_safe_exec_page(__hibernate_exit_text_start, exit_size,
+				   (void **)&hibernate_exit,
+				   &phys_hibernate_exit,
+				   get_safe_page, GFP_ATOMIC);
+	if (rc) {
+		pr_err("Failed to create safe executable page for hibernate_exit code.");
+		goto out;
+	}
+
+	/*
+	 * The hibernate exit text contains a set of el2 vectors, that will
+	 * be executed at el2 with the mmu off in order to reload hyp-stub.
+	 */
+	__flush_dcache_area(hibernate_exit, exit_size);
+
+	/*
+	 * Restoring the memory image will overwrite the ttbr1 page tables.
+	 * Create a second copy of just the linear map, and use this when
+	 * restoring.
+	 */
+	tmp_pg_dir = (pgd_t *)get_safe_page(GFP_ATOMIC);
+	if (!tmp_pg_dir) {
+		pr_err("Failed to allocate memory for temporary page tables.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	rc = copy_page_tables(tmp_pg_dir, PAGE_OFFSET, 0);
+	if (rc)
+		goto out;
+
+	/*
+	 * Since we only copied the linear map, we need to find restore_pblist's
+	 * linear map address.
+	 */
+	lm_restore_pblist = LMADDR(restore_pblist);
+
+	/*
+	 * Both KASLR and restoring with a different kernel version will cause
+	 * the el2 vectors to be in a different location in the resumed kernel.
+	 * Load hibernate's temporary copy into el2.
+	 */
+	if (is_hyp_mode_available()) {
+		phys_addr_t el2_vectors = phys_hibernate_exit;  /* base */
+		el2_vectors += hibernate_el2_vectors -
+			       __hibernate_exit_text_start;     /* offset */
+
+		__hyp_set_vectors(el2_vectors);
+	}
+
+	hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1,
+		       resume_hdr.reenter_kernel, lm_restore_pblist,
+		       resume_hdr.el2_setup_page, resume_hdr.el2_setup_phys);
+
+out:
+	return rc;
+}
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 4c56e7a0621b..9041a842b9f3 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -46,6 +46,16 @@ jiffies = jiffies_64;
 	*(.idmap.text)					\
 	VMLINUX_SYMBOL(__idmap_text_end) = .;
 
+#ifdef CONFIG_HIBERNATION
+#define HIBERNATE_TEXT					\
+	. = ALIGN(SZ_4K);				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_start) = .;\
+	*(.hibernate_exit.text)				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_end) = .;
+#else
+#define HIBERNATE_TEXT
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -108,6 +118,7 @@ SECTIONS
 			LOCK_TEXT
 			HYPERVISOR_TEXT
 			IDMAP_TEXT
+			HIBERNATE_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -199,6 +210,10 @@ ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"HYP init code too big or misaligned")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"ID map text too big or misaligned")
+#ifdef CONFIG_HIBERNATION
+ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
+	<= SZ_4K, "Hibernate exit text too big or misaligned")
+#endif
 
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version
  2016-02-29 17:20 ` James Morse
                   ` (13 preceding siblings ...)
  (?)
@ 2016-02-29 17:20 ` James Morse
  2016-04-10 11:49   ` Pavel Machek
  -1 siblings, 1 reply; 20+ messages in thread
From: James Morse @ 2016-02-29 17:20 UTC (permalink / raw)
  To: linux-arm-kernel

Resuming using a different kernel version is fragile, while there are
sufficient details in the hibernate arch-header to perform the restore,
changes in the boot process can have a long-lasting impact on the system.
In particular, if the EFI stub causes more runtime services memory to be
allocated, the amount of memory left for linux is reduced. If we are
lucky, this will cause restore to fail with the message:
> PM: Image mismatch: memory size
If we are unlucky, the system will explode sometime later when an EFI
runtime services call is made.

Prevent resuming with a different kernel build, by storing UTS_VERSION
in the hibernate header. This also ensures the page size and va bits
configuration remain the same.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/hibernate.c | 35 +++++------------------------------
 1 file changed, 5 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 279c556ee24b..486315249f2a 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -20,6 +20,7 @@
 #include <linux/pm.h>
 #include <linux/sched.h>
 #include <linux/suspend.h>
+#include <linux/utsname.h>
 #include <linux/version.h>
 
 #include <asm/barrier.h>
@@ -42,8 +43,6 @@
 #define pud_index(x)	0
 #endif
 
-#define TCR_IPS_BITS (0x7UL<<32)
-
 /*
  * Hibernate core relies on this value being 0 on resume, and marks it
  * __nosavedata assuming it will keep the resume kernel's '0' value. This
@@ -54,17 +53,6 @@
  */
 extern int in_suspend;
 
-/*
- * This value is written to the hibernate arch header, and prevents resuming
- * from a hibernate image produced by an incompatible kernel. If you change
- * a value that isn't saved/restored by hibernate, you should change this value.
- *
- * For example, if the mair_el1 values used by the kernel are changed, you
- * should prevent resuming from a kernel with incompatible attributes, as these
- * aren't saved/restored.
- */
-#define HIBERNATE_VERSION	KERNEL_VERSION(4, 6, 0)
-
 /* Find a symbols alias in the linear map */
 #define LMADDR(x)	phys_to_virt(virt_to_phys(x))
 
@@ -81,8 +69,7 @@ extern char hibernate_el2_vectors[];
 extern char el2_setup[];
 
 struct arch_hibernate_hdr_invariants {
-	unsigned long	version;
-	unsigned long	tcr_el1;	/* page_size, va bit etc */
+	char		uts_version[__NEW_UTS_LEN + 1];
 };
 
 /* These values need to be know across a hibernate/restore. */
@@ -104,11 +91,8 @@ static struct arch_hibernate_hdr {
 
 static inline void arch_hdr_invariants(struct arch_hibernate_hdr_invariants *i)
 {
-	i->version = HIBERNATE_VERSION;
-	asm volatile("mrs	%0, tcr_el1" : "=r"(i->tcr_el1));
-
-	/* IPS bits vary on big/little systems, mask them out */
-	i->tcr_el1 &= ~TCR_IPS_BITS;
+	memset(i, 0, sizeof(*i));
+	memcpy(i->uts_version, init_utsname()->version, sizeof(i->uts_version));
 }
 
 int pfn_is_nosave(unsigned long pfn)
@@ -155,18 +139,9 @@ int arch_hibernation_header_restore(void *addr)
 	struct arch_hibernate_hdr_invariants invariants;
 	struct arch_hibernate_hdr *hdr = addr;
 
-	/*
-	 * If this header is ancient, it may be smaller than we expect.
-	 * Test the version first.
-	 */
-	if (hdr->invariants.version != HIBERNATE_VERSION) {
-		pr_crit("Hibernate image not compatible with this kernel version!\n");
-		return -EINVAL;
-	}
-
 	arch_hdr_invariants(&invariants);
 	if (memcmp(&hdr->invariants, &invariants, sizeof(invariants))) {
-		pr_crit("Hibernate image not compatible with this kernel configuration!\n");
+		pr_crit("Hibernate image not generated by this kernel!\n");
 		return -EINVAL;
 	}
 
-- 
2.7.0

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version
  2016-02-29 17:20 ` [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version James Morse
@ 2016-04-10 11:49   ` Pavel Machek
  2016-04-14 12:47     ` James Morse
  0 siblings, 1 reply; 20+ messages in thread
From: Pavel Machek @ 2016-04-10 11:49 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon 2016-02-29 17:20:23, James Morse wrote:
> Resuming using a different kernel version is fragile, while there are
> sufficient details in the hibernate arch-header to perform the restore,
> changes in the boot process can have a long-lasting impact on the system.
> In particular, if the EFI stub causes more runtime services memory to be
> allocated, the amount of memory left for linux is reduced. If we are
> lucky, this will cause restore to fail with the message:
> > PM: Image mismatch: memory size
> If we are unlucky, the system will explode sometime later when an EFI
> runtime services call is made.
> 
> Prevent resuming with a different kernel build, by storing UTS_VERSION
> in the hibernate header. This also ensures the page size and va bits
> configuration remain the same.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Better solution would be to store the current bits _and_ UTS version. Then bump UTS_VERSION
every time Linux behaviour changes w.r.t. EFI. For a while, it would change a lot, but then
it should stabilize, as the port stabilizes


> ---
>  arch/arm64/kernel/hibernate.c | 35 +++++------------------------------
>  1 file changed, 5 insertions(+), 30 deletions(-)
> 
> diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
> index 279c556ee24b..486315249f2a 100644
> --- a/arch/arm64/kernel/hibernate.c
> +++ b/arch/arm64/kernel/hibernate.c
> @@ -20,6 +20,7 @@
>  #include <linux/pm.h>
>  #include <linux/sched.h>
>  #include <linux/suspend.h>
> +#include <linux/utsname.h>
>  #include <linux/version.h>
>  
>  #include <asm/barrier.h>
> @@ -42,8 +43,6 @@
>  #define pud_index(x)	0
>  #endif
>  
> -#define TCR_IPS_BITS (0x7UL<<32)
> -
>  /*
>   * Hibernate core relies on this value being 0 on resume, and marks it
>   * __nosavedata assuming it will keep the resume kernel's '0' value. This
> @@ -54,17 +53,6 @@
>   */
>  extern int in_suspend;
>  
> -/*
> - * This value is written to the hibernate arch header, and prevents resuming
> - * from a hibernate image produced by an incompatible kernel. If you change
> - * a value that isn't saved/restored by hibernate, you should change this value.
> - *
> - * For example, if the mair_el1 values used by the kernel are changed, you
> - * should prevent resuming from a kernel with incompatible attributes, as these
> - * aren't saved/restored.
> - */
> -#define HIBERNATE_VERSION	KERNEL_VERSION(4, 6, 0)
> -
>  /* Find a symbols alias in the linear map */
>  #define LMADDR(x)	phys_to_virt(virt_to_phys(x))
>  
> @@ -81,8 +69,7 @@ extern char hibernate_el2_vectors[];
>  extern char el2_setup[];
>  
>  struct arch_hibernate_hdr_invariants {
> -	unsigned long	version;
> -	unsigned long	tcr_el1;	/* page_size, va bit etc */
> +	char		uts_version[__NEW_UTS_LEN + 1];
>  };
>  
>  /* These values need to be know across a hibernate/restore. */
> @@ -104,11 +91,8 @@ static struct arch_hibernate_hdr {
>  
>  static inline void arch_hdr_invariants(struct arch_hibernate_hdr_invariants *i)
>  {
> -	i->version = HIBERNATE_VERSION;
> -	asm volatile("mrs	%0, tcr_el1" : "=r"(i->tcr_el1));
> -
> -	/* IPS bits vary on big/little systems, mask them out */
> -	i->tcr_el1 &= ~TCR_IPS_BITS;
> +	memset(i, 0, sizeof(*i));
> +	memcpy(i->uts_version, init_utsname()->version, sizeof(i->uts_version));
>  }
>  
>  int pfn_is_nosave(unsigned long pfn)
> @@ -155,18 +139,9 @@ int arch_hibernation_header_restore(void *addr)
>  	struct arch_hibernate_hdr_invariants invariants;
>  	struct arch_hibernate_hdr *hdr = addr;
>  
> -	/*
> -	 * If this header is ancient, it may be smaller than we expect.
> -	 * Test the version first.
> -	 */
> -	if (hdr->invariants.version != HIBERNATE_VERSION) {
> -		pr_crit("Hibernate image not compatible with this kernel version!\n");
> -		return -EINVAL;
> -	}
> -
>  	arch_hdr_invariants(&invariants);
>  	if (memcmp(&hdr->invariants, &invariants, sizeof(invariants))) {
> -		pr_crit("Hibernate image not compatible with this kernel configuration!\n");
> +		pr_crit("Hibernate image not generated by this kernel!\n");
>  		return -EINVAL;
>  	}
>  
> -- 
> 2.7.0

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version
  2016-04-10 11:49   ` Pavel Machek
@ 2016-04-14 12:47     ` James Morse
  2016-04-14 13:13       ` Pavel Machek
  0 siblings, 1 reply; 20+ messages in thread
From: James Morse @ 2016-04-14 12:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 10/04/16 12:49, Pavel Machek wrote:
> On Mon 2016-02-29 17:20:23, James Morse wrote:
>> Prevent resuming with a different kernel build, by storing UTS_VERSION
>> in the hibernate header. This also ensures the page size and va bits
>> configuration remain the same.
>>
>> Signed-off-by: James Morse <james.morse@arm.com>
> 
> Better solution would be to store the current bits _and_ UTS version. Then bump UTS_VERSION
> every time Linux behaviour changes w.r.t. EFI. For a while, it would change a lot, but then
> it should stabilize, as the port stabilizes

If we want to support resuming with a different kernel version. Patch 13 (of
this series) does this, it has 'HIBERNATE_VERSION', which we would have to
increase every time an incompatible change is made, but I'm worried that its a
constant game of catch-up to work out which changes are compatible. The first
sign a user gets that this isn't supported, (who reads the changelog?), is when
it fails to resume, and they loose any data that was written to the hibernate image.

What does restoring with a different kernel version get used for? (I agree its a
neat trick!) I can see it would be useful on laptops, where the distribution can
upgrade the kernel and the user can still use hibernate before they reboot. But
the distribution can switch this value to be just the kernel version if they are
willing to test hibernate/resume over any updates they make.

If I repost this series, I intend to squash this patch into the main 'add
support for' patch.



Thanks,

James

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version
  2016-04-14 12:47     ` James Morse
@ 2016-04-14 13:13       ` Pavel Machek
  0 siblings, 0 replies; 20+ messages in thread
From: Pavel Machek @ 2016-04-14 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi!

> >> Signed-off-by: James Morse <james.morse@arm.com>
> > 
> > Better solution would be to store the current bits _and_ UTS version. Then bump UTS_VERSION
> > every time Linux behaviour changes w.r.t. EFI. For a while, it would change a lot, but then
> > it should stabilize, as the port stabilizes
> 
> If we want to support resuming with a different kernel version. Patch 13 (of
> this series) does this, it has 'HIBERNATE_VERSION', which we would have to
> increase every time an incompatible change is made, but I'm worried that its a
> constant game of catch-up to work out which changes are compatible. The first
> sign a user gets that this isn't supported, (who reads the changelog?), is when
> it fails to resume, and they loose any data that was written to the hibernate image.
> 

> What does restoring with a different kernel version get used for? (I agree its a
> neat trick!) I can see it would be useful on laptops, where the distribution can
> upgrade the kernel and the user can still use hibernate before they reboot. But
> the distribution can switch this value to be just the kernel version if they are
> willing to test hibernate/resume over any updates they make.

Well, it is way too easy to forget that you've updated your kernel,
and then hibernate. I've been burned more than once. 

And for example updating from 4.4.1 to 4.4.2 to 4.4.3 to... should be
very safe, and pretty important for distribution users.

Best regards,
								Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2016-04-14 13:13 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-02-29 17:20 [PATCH v6 00/14] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2016-02-29 17:20 ` James Morse
2016-02-29 17:20 ` [PATCH v6 01/14] arm64: Fold proc-macros.S into assembler.h James Morse
2016-02-29 17:20 ` [PATCH v6 02/14] arm64: Cleanup SCTLR flags James Morse
2016-02-29 17:20 ` [PATCH v6 03/14] arm64: Convert hcalls to use HVC immediate value James Morse
2016-02-29 17:20 ` [PATCH v6 04/14] arm64: Add new hcall HVC_CALL_FUNC James Morse
2016-02-29 17:20 ` [PATCH v6 05/14] arm64: kvm: allows kvm cpu hotplug James Morse
2016-02-29 17:20 ` [PATCH v6 06/14] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
2016-02-29 17:20 ` [PATCH v6 07/14] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
2016-02-29 17:20 ` [PATCH v6 08/14] arm64: kernel: Include _AC definition in page.h James Morse
2016-02-29 17:20 ` [PATCH v6 09/14] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
2016-02-29 17:20 ` [PATCH v6 10/14] arm64: Add new asm macro copy_page James Morse
2016-02-29 17:20 ` [PATCH v6 11/14] arm64: head.S: el2_setup() to accept sctlr_el1 as an argument James Morse
2016-02-29 17:20 ` [PATCH v6 12/14] PM / Hibernate: Call flush_icache_range() on pages restored in-place James Morse
2016-02-29 17:20   ` James Morse
2016-02-29 17:20 ` [PATCH v6 13/14] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2016-02-29 17:20 ` [PATCH v6 14/14] arm64: hibernate: Prevent resume from a different kernel version James Morse
2016-04-10 11:49   ` Pavel Machek
2016-04-14 12:47     ` James Morse
2016-04-14 13:13       ` Pavel Machek

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.