* [PATCH v3 00/10] arm64: kernel: Add support for hibernate/suspend-to-disk
From: James Morse @ 2015-11-26 17:32 UTC
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

Hi all,

v2's "clean cached pages on architectures that require it" is wrong:
do_copy_page() isn't used on the resume path for pages restored 'in place',
so there is nowhere we can add the call to flush_icache_range(). Instead,
patch 9 of this series adds a new 'pbe' list of pages restored in place.
Architectures that need to clean these pages before they can be executed can
walk this list. This preserves the status quo a little better, as existing
architectures won't get flush_icache_range() called on these pages.
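
For illustration, an architecture could consume the new list along the
lines of the sketch below. This is a sketch only: 'restored_inplace_pblist'
is an assumed name for the new list head, and the hook itself is
hypothetical - struct pbe (with its orig_address/next fields) comes from
<linux/suspend.h>.

#include <linux/suspend.h>
#include <asm/cacheflush.h>

/* Hypothetical arch hook: make pages restored in place safe to execute. */
static void clean_restored_inplace_pages(void)
{
	struct pbe *pbe;

	for (pbe = restored_inplace_pblist; pbe; pbe = pbe->next)
		flush_icache_range((unsigned long)pbe->orig_address,
				   (unsigned long)pbe->orig_address + PAGE_SIZE);
}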

The first four patches are borrowed from kexec v12; they are used to return
EL2 to the hyp stub, as its stack and vectors may be overwritten.

Patches 5 and 6 provide some cleanup of the cpu_suspend() API:
* allowing it to be used with a 'finisher' that needs to return success,
* and turning the MMU on early so that sleep_save_sp can be accessed by VA.
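
A sketch of the resulting call, for illustration only (the two-argument
cpu_suspend() form is taken from patch 5; the finisher here is made up):

/* A finisher that reports success without the CPU losing context. */
static int save_image_finisher(unsigned long arg)
{
	/* save state here; returning 0 reports success to the caller */
	return 0;
}

static int example_suspend(void)
{
	return cpu_suspend(0, save_image_finisher);
}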

The last patch adds hibernate support. Following the x86 approach, it creates
a temporary set of page tables and copies the hibernate_exit code. The
implementation requires that exactly the same kernel is booted on the
same hardware, and that the kernel is loaded at the same physical address.

This series is based on v4.4-rc2, and can be pulled from:
git://linux-arm.org/linux-jm.git -b hibernate/v3

Changes from v2:
 * Rewrote patch 9 - we can't clean pages in copy_page(), so we need to
   publish a list for the architecture to clean
 * Updated patch 10 to match the rewritten patch 9
 * Added missing pgprot_val() in hibernate.c, spotted by STRICT_MM_TYPECHECKS
 * Removed 'tcr_set_idmap_t0sz' from proc.S - I missed this when rebasing
 * Re-imported the first four patches from kexec v12
 * Rebased onto v4.4-rc2
 * Changes from Pavel Machek's comments

Changes from v1:
 * Removed the for_each_process() { for_each_vma() { } } cache cleaning;
   replaced it with a flush_icache_range() call in the core hibernate code
 * Rebased onto the conflicting tcr_el1.t0sz bug-fix patch

[v2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html
[v1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html


AKASHI Takahiro (1):
  arm64: kvm: allows kvm cpu hotplug

Geoff Levand (3):
  arm64: Fold proc-macros.S into assembler.h
  arm64: Convert hcalls to use HVC immediate value
  arm64: Add new hcall HVC_CALL_FUNC

James Morse (6):
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by
    va
  arm64: kernel: Include _AC definition in page.h
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  PM / Hibernate: Publish pages restored in-place to arch code
  arm64: kernel: Add support for hibernate/suspend-to-disk.

 arch/arm/include/asm/kvm_host.h    |  10 +-
 arch/arm/include/asm/kvm_mmu.h     |   1 +
 arch/arm/kvm/arm.c                 |  79 ++++----
 arch/arm/kvm/mmu.c                 |   5 +
 arch/arm64/Kconfig                 |   3 +
 arch/arm64/include/asm/assembler.h |  48 ++++-
 arch/arm64/include/asm/kvm_host.h  |  16 +-
 arch/arm64/include/asm/kvm_mmu.h   |   1 +
 arch/arm64/include/asm/memory.h    |   3 +
 arch/arm64/include/asm/page.h      |   2 +
 arch/arm64/include/asm/suspend.h   |  31 ++-
 arch/arm64/include/asm/virt.h      |  49 +++++
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/asm-offsets.c    |   9 +-
 arch/arm64/kernel/head.S           |   5 +-
 arch/arm64/kernel/hibernate-asm.S  | 119 ++++++++++++
 arch/arm64/kernel/hibernate.c      | 376 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S       |  43 +++--
 arch/arm64/kernel/setup.c          |   1 -
 arch/arm64/kernel/sleep.S          | 148 +++++----------
 arch/arm64/kernel/suspend.c        | 103 ++++------
 arch/arm64/kernel/vmlinux.lds.S    |  15 ++
 arch/arm64/kvm/hyp-init.S          |  34 +++-
 arch/arm64/kvm/hyp.S               |  44 ++++-
 arch/arm64/mm/cache.S              |   2 -
 arch/arm64/mm/proc-macros.S        |  64 -------
 arch/arm64/mm/proc.S               |  31 +--
 include/linux/suspend.h            |   1 +
 kernel/power/snapshot.c            |  42 +++--
 29 files changed, 943 insertions(+), 343 deletions(-)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.6.2



* [PATCH v3 01/10] arm64: Fold proc-macros.S into assembler.h
From: James Morse @ 2015-11-26 17:32 UTC
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

From: Geoff Levand <geoff@infradead.org>

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to be
used outside the mm code, move the contents of proc-macros.S into
asm/assembler.h.  Also, delete proc-macros.S and fix up all references to it.
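
As an aside, the line-size computation that the dcache_line_size macro
performs can be written in C. A rough equivalent, illustrative only and
not part of this patch:

/* CTR_EL0.DminLine (bits 19:16) is log2 of the D-cache line size in
 * 4-byte words, so the size in bytes is 4 << DminLine. */
static inline unsigned int dcache_line_size_bytes(void)
{
	unsigned long ctr;

	asm volatile("mrs %0, ctr_el0" : "=r" (ctr));
	return 4U << ((ctr >> 16) & 0xf);
}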

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>

---
 arch/arm64/include/asm/assembler.h | 48 +++++++++++++++++++++++++++-
 arch/arm64/kvm/hyp-init.S          |  1 -
 arch/arm64/mm/cache.S              |  2 --
 arch/arm64/mm/proc-macros.S        | 64 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 5 files changed, 47 insertions(+), 71 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 12eff928ef8b..21979a43e84b 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -194,6 +196,50 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 178ba2248a98..2e67a4872c51 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -20,7 +20,6 @@
 #include <asm/assembler.h>
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
-#include <asm/pgtable-hwdef.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index cfa44a6adc0a..f49041dcfbd1 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index 4c4d93c4bf65..000000000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,64 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index cacecc4ad3e5..7ab3a9097369 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,11 +23,8 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #elif defined(CONFIG_ARM64_16K_PAGES)
-- 
2.6.2



* [PATCH v3 02/10] arm64: Convert hcalls to use HVC immediate value
From: James Morse @ 2015-11-26 17:32 UTC
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

From: Geoff Levand <geoff@infradead.org>

The existing arm64 hcall implementations are limited in that they only allow
for two distinct hcalls: the x0 register is either zero or non-zero.  Also,
the APIs of the hyp-stub exception vector routines and the KVM exception
vector routines differ: hyp-stub uses a non-zero value in x0 to implement
__hyp_set_vectors, whereas KVM uses it to implement kvm_call_hyp.

To allow additional hcalls to be defined, and to make the arm64 hcall API
more consistent across exception vector routines, change the hcall
implementations to use the 16-bit immediate value of the HVC instruction to
specify the hcall type.

Define three new preprocessor macros, HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS, to be used as hcall type specifiers, and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp() routines
to use them when executing an HVC call.  Also change the corresponding
hyp-stub and KVM el1_sync exception vector routines to use these
new macros.
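
For illustration, a typed hcall could be issued from C along these lines.
This is a hypothetical helper, not part of this patch: the HVC immediate
must be a build-time constant, so it is pasted into the instruction string.

#include <linux/stringify.h>

/* Hypothetical: issue an hcall of the given type. x17/x18 may be
 * clobbered by the EL2 handler, per the comment in virt.h. */
#define issue_hcall(type)					\
	asm volatile("hvc #" __stringify(type)			\
		     : : : "x0", "x17", "x18")

issue_hcall(HVC_GET_VECTORS), for example, assembles to 'hvc #1'.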

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>

---
 arch/arm64/include/asm/virt.h | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S          | 16 +++++++++-------
 3 files changed, 57 insertions(+), 18 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df5252dd7..eb10368c329e 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f335c289..017ab519aaf1 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1599701ef044..1bef8db4a13d 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -29,6 +29,7 @@
 #include <asm/kvm_asm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/memory.h>
+#include <asm/virt.h>
 
 #define CPU_GP_REG_OFFSET(x)	(CPU_GP_REGS + x)
 #define CPU_XREG_OFFSET(x)	CPU_GP_REG_OFFSET(CPU_USER_PT_REGS + 8*x)
@@ -932,12 +933,9 @@ __hyp_panic_str:
  * in Hyp mode (see init_hyp_mode in arch/arm/kvm/arm.c).  Return values are
  * passed in r0 and r1.
  *
- * A function pointer with a value of 0 has a special meaning, and is
- * used to implement __hyp_get_vectors in the same way as in
- * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(kvm_call_hyp)
 
@@ -968,6 +966,7 @@ el1_sync:					// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -976,15 +975,18 @@ el1_sync:					// Guest trapped into EL2
 	cbnz	x3, el1_trap			// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	pop	x2, x3
 	pop	x0, x1
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	push	lr, xzr
+1:	/* Default to HVC_CALL_HYP. */
+
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.6.2



* [PATCH v3 03/10] arm64: Add new hcall HVC_CALL_FUNC
From: James Morse @ 2015-11-26 17:32 UTC
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

From: Geoff Levand <geoff@infradead.org>

Add the new hcall HVC_CALL_FUNC, which allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception-level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which is
needed to support an hcall at CPU reset, this implementation uses register x18
to store the link register across the caller-provided function.  This dictates
that the caller-provided function must preserve the contents of register x18.
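
A sketch of what a caller might look like, given the register convention
above (hypothetical wrapper, not part of this patch):

#include <linux/stringify.h>
#include <linux/types.h>
#include <asm/virt.h>

/* Hypothetical: run 'fn' at EL2 with up to three arguments. The called
 * function must preserve x18, as described above; its return value
 * comes back in x0. */
static inline unsigned long hvc_call_func(phys_addr_t fn, unsigned long a0,
					  unsigned long a1, unsigned long a2)
{
	register unsigned long x0 asm("x0") = fn;
	register unsigned long x1 asm("x1") = a0;
	register unsigned long x2 asm("x2") = a1;
	register unsigned long x3 asm("x3") = a2;

	asm volatile("hvc #" __stringify(HVC_CALL_FUNC)
		     : "+r" (x0), "+r" (x1), "+r" (x2), "+r" (x3)
		     : : "x17", "x18");

	return x0;
}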

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>

---
 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368c329e..30700961f28c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519aaf1..e8febe90c036 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -67,8 +67,19 @@ el1_sync:
 	b	2f
 
 1:	cmp	x18, #HVC_SET_VECTORS
-	b.ne	2f
+	b.ne	1f
 	msr	vbar_el2, x0
+	b	2f
+
+1:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	2f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
 
 2:	eret
 ENDPROC(el1_sync)
-- 
2.6.2



* [PATCH v3 04/10] arm64: kvm: allows kvm cpu hotplug
From: James Morse @ 2015-11-26 17:32 UTC
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shut down a core in terms of
kvm. In particular, this prevents kexec from rebooting the system on a
boot core in EL2.

This patch adds a cpu tear-down function, and moves the existing cpu-init
code into a separate function: kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We no longer need an arm64-specific cpu hotplug hook.

Since this patch modifies code that is common to arm and arm64, one stub
definition, __cpu_reset_hyp_mode(), is added on the arm side to avoid
compile errors.
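
For context, the generic KVM core (virt/kvm/kvm_main.c) already drives
these hooks from its own CPU-hotplug notifier, roughly along these lines
(condensed sketch; see kvm_main.c for the real code, which also handles
errors):

static void hardware_enable_nolock(void *junk)
{
	int cpu = raw_smp_processor_id();

	if (cpumask_test_cpu(cpu, cpus_hardware_enabled))
		return;

	cpumask_set_cpu(cpu, cpus_hardware_enabled);
	kvm_arch_hardware_enable();
}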

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 10 ++++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
 arch/arm/kvm/mmu.c                |  5 +++
 arch/arm64/include/asm/kvm_host.h | 16 +++++++-
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/include/asm/virt.h     |  9 +++++
 arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
 arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
 9 files changed, 138 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6692982c9b57..924276571658 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa1883307..dc6fadfd0407 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index eab83b2435b8..a5d9d74f5c75 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+int kvm_arch_hardware_enable(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
 	unsigned long stack_page;
 	unsigned long vector_ptr;
 
+	if (__hyp_get_vectors() != hyp_default_vectors)
+		return 0;
+
 	/* Switch from the HYP stub to our own HYP init vector */
 	__hyp_set_vectors(kvm_get_idmap_vector());
 
@@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
 	kvm_arm_init_debug();
+
+	return 0;
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+void kvm_arch_hardware_disable(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
-	}
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
 
-	return NOTIFY_OK;
-}
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__hyp_get_vectors() != hyp_default_vectors)
+			__this_cpu_write(kvm_arm_hardware_enabled, 1);
+		else
+			__this_cpu_write(kvm_arm_hardware_enabled, 0);
+		/*
+		 * don't call kvm_arch_hardware_disable() in case of
+		 * CPU_PM_ENTER because it doesn't actually save any state.
+		 */
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			kvm_arch_hardware_enable();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
-	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
-
-	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
@@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342da13d..69b4a33d232d 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a35ce7266aac..0b540f852ec1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * Call reset code, and switch back to stub hyp vectors.
+	 */
+	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+}
+
+struct vgic_sr_vectors {
+	void	*save_vgic;
+	void	*restore_vgic;
+};
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 61505676d085..ff5a08777e11 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 30700961f28c..bca79f90178c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -58,9 +58,18 @@
 
 #define HVC_CALL_FUNC 3
 
+/*
+ * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
+ *
+ * @x0: entry address in trampoline code in va
+ * @x1: identical mapping page table in pa
+ */
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
+#define HVC_RESET_CPU 4
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 2e67a4872c51..192516332e47 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -139,6 +139,39 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1		// Clear SCTLR_EL2.M, etc.
+	msr	sctlr_el2, x0
+	isb
+
+	/* Install stub vectors */
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1bef8db4a13d..aca11d6b996e 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
 	ret
 ENDPROC(kvm_call_hyp)
 
+ENTRY(kvm_call_reset)
+	hvc	#HVC_RESET_CPU
+	ret
+ENDPROC(kvm_call_reset)
+
 .macro invalid_vector	label, target
 	.align	2
 \label:
@@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
-
-1:	/* Default to HVC_CALL_HYP. */
+	b	do_eret
 
+	/* jump into trampoline code */
+1:	cmp	x18, #HVC_RESET_CPU
+	b.ne	2f
+	/*
+	 * Entry point is:
+	 *	TRAMPOLINE_VA
+	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
+	 */
+	adrp	x2, __kvm_hyp_reset
+	add	x2, x2, #:lo12:__kvm_hyp_reset
+	adrp	x3, __hyp_idmap_text_start
+	add	x3, x3, #:lo12:__hyp_idmap_text_start
+	and	x3, x3, PAGE_MASK
+	sub	x2, x2, x3
+	ldr	x3, =TRAMPOLINE_VA
+	add	x2, x2, x3
+	br	x2				// no return
+
+2:	/* Default to HVC_CALL_HYP. */
 	push	lr, xzr
 
 	/*
@@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
 	blr	lr
 
 	pop	lr, xzr
-2:	eret
+
+do_eret:
+	eret
 
 el1_trap:
 	/*
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 04/10] arm64: kvm: allows kvm cpu hotplug
@ 2015-11-26 17:32   ` James Morse
  0 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shutdown a core in terms of
kvm. This prevents, especially, kexec from rebooting the system on a boot
core in EL2.

This patch adds a cpu tear-down function and also puts an existing cpu-init
code into a separate function, kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We don't need arm64-specific cpu hotplug hook any more.

Since this patch modifies common part of code between arm and arm64, one
stub definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compiling errors.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm/include/asm/kvm_host.h   | 10 ++++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 79 ++++++++++++++++++---------------------
 arch/arm/kvm/mmu.c                |  5 +++
 arch/arm64/include/asm/kvm_host.h | 16 +++++++-
 arch/arm64/include/asm/kvm_mmu.h  |  1 +
 arch/arm64/include/asm/virt.h     |  9 +++++
 arch/arm64/kvm/hyp-init.S         | 33 ++++++++++++++++
 arch/arm64/kvm/hyp.S              | 32 ++++++++++++++--
 9 files changed, 138 insertions(+), 48 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index 6692982c9b57..924276571658 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -214,6 +214,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -226,7 +235,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index 405aa1883307..dc6fadfd0407 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index eab83b2435b8..a5d9d74f5c75 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -61,6 +60,8 @@ static atomic64_t kvm_vmid_gen = ATOMIC64_INIT(1);
 static u8 kvm_next_vmid;
 static DEFINE_SPINLOCK(kvm_vmid_lock);
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -85,11 +86,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -959,7 +955,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+int kvm_arch_hardware_enable(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -967,6 +963,9 @@ static void cpu_init_hyp_mode(void *dummy)
 	unsigned long stack_page;
 	unsigned long vector_ptr;
 
+	if (__hyp_get_vectors() != hyp_default_vectors)
+		return 0;
+
 	/* Switch from the HYP stub to our own HYP init vector */
 	__hyp_set_vectors(kvm_get_idmap_vector());
 
@@ -979,38 +978,50 @@ static void cpu_init_hyp_mode(void *dummy)
 	__cpu_init_hyp_mode(boot_pgd_ptr, pgd_ptr, hyp_stack_ptr, vector_ptr);
 
 	kvm_arm_init_debug();
+
+	return 0;
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+void kvm_arch_hardware_disable(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
-	}
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
 
-	return NOTIFY_OK;
-}
+	if (__hyp_get_vectors() == hyp_default_vectors)
+		return;
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__hyp_get_vectors() != hyp_default_vectors)
+			__this_cpu_write(kvm_arm_hardware_enabled, 1);
+		else
+			__this_cpu_write(kvm_arm_hardware_enabled, 0);
+		/*
+		 * don't call kvm_arch_hardware_disable() in case of
+		 * CPU_PM_ENTER because it does't actually save any state.
+		 */
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			kvm_arch_hardware_enable();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1108,11 +1119,6 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
-	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
-
-	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
@@ -1186,26 +1192,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index 6984342da13d..69b4a33d232d 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1644,6 +1644,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index a35ce7266aac..0b540f852ec1 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -223,6 +223,7 @@ struct kvm_vcpu *kvm_arm_get_running_vcpu(void);
 struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void);
 
 u64 kvm_call_hyp(void *hypfn, ...);
+void kvm_call_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
 void force_vm_exit(const cpumask_t *mask);
 void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
@@ -247,7 +248,20 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * Call reset code, and switch back to stub hyp vectors.
+	 */
+	kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+}
+
+struct vgic_sr_vectors {
+	void	*save_vgic;
+	void	*restore_vgic;
+};
+
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 61505676d085..ff5a08777e11 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -98,6 +98,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 30700961f28c..bca79f90178c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -58,9 +58,18 @@
 
 #define HVC_CALL_FUNC 3
 
+/*
+ * HVC_RESET_CPU - Reset cpu in EL2 to initial state.
+ *
+ * @x0: entry address of the trampoline code (VA)
+ * @x1: identity-map page table (PA)
+ */
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
+#define HVC_RESET_CPU 4
+
 #ifndef __ASSEMBLY__
 
 /*
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 2e67a4872c51..192516332e47 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -139,6 +139,39 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_EL2_FLAGS
+	bic	x0, x0, x1		// Clear SCTLR_EL2.M, etc.
+	msr	sctlr_el2, x0
+	isb
+
+	/* Install stub vectors */
+	adrp	x0, __hyp_stub_vectors
+	add	x0, x0, #:lo12:__hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 1bef8db4a13d..aca11d6b996e 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -939,6 +939,11 @@ ENTRY(kvm_call_hyp)
 	ret
 ENDPROC(kvm_call_hyp)
 
+ENTRY(kvm_call_reset)
+	hvc	#HVC_RESET_CPU
+	ret
+ENDPROC(kvm_call_reset)
+
 .macro invalid_vector	label, target
 	.align	2
 \label:
@@ -982,10 +987,27 @@ el1_sync:					// Guest trapped into EL2
 	cmp	x18, #HVC_GET_VECTORS
 	b.ne	1f
 	mrs	x0, vbar_el2
-	b	2f
-
-1:	/* Default to HVC_CALL_HYP. */
+	b	do_eret
 
+	/* jump into trampoline code */
+1:	cmp	x18, #HVC_RESET_CPU
+	b.ne	2f
+	/*
+	 * Entry point is:
+	 *	TRAMPOLINE_VA
+	 *	+ (__kvm_hyp_reset - (__hyp_idmap_text_start & PAGE_MASK))
+	 */
+	adrp	x2, __kvm_hyp_reset
+	add	x2, x2, #:lo12:__kvm_hyp_reset
+	adrp	x3, __hyp_idmap_text_start
+	add	x3, x3, #:lo12:__hyp_idmap_text_start
+	and	x3, x3, PAGE_MASK
+	sub	x2, x2, x3
+	ldr	x3, =TRAMPOLINE_VA
+	add	x2, x2, x3
+	br	x2				// no return
+
+2:	/* Default to HVC_CALL_HYP. */
 	push	lr, xzr
 
 	/*
@@ -999,7 +1021,9 @@ el1_sync:					// Guest trapped into EL2
 	blr	lr
 
 	pop	lr, xzr
-2:	eret
+
+do_eret:
+	eret
 
 el1_trap:
 	/*
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 05/10] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

Hibernate could make use of the cpu_suspend() code to save/restore cpu
state; however, it needs to be able to return '0' from the 'finisher'.

Rework cpu_suspend() so that the finisher is called from C code,
independently of the save/restore of cpu state. Space to save the
context is allocated in the caller's stack frame and passed into
__cpu_suspend_enter().

Hibernate's use of this API will look like a copy of the cpu_suspend()
function.
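
For reference, a minimal C sketch of the calling convention this patch
introduces (simplified from the diff below; the debug-exception masking
and graph-tracing pause are omitted):

	struct sleep_stack_data state;
	int ret = 0;

	if (__cpu_suspend_enter(&state)) {
		/* suspend path: run the finisher; it must not return */
		ret = fn(arg);
		if (!ret)
			ret = -EOPNOTSUPP;	/* returning at all is an error */
	} else {
		/* resume path: reached via cpu_resume() restoring 'state' */
		__cpu_suspend_exit(mm);
	}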

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/suspend.h | 20 +++++++++
 arch/arm64/kernel/asm-offsets.c  |  2 +
 arch/arm64/kernel/sleep.S        | 93 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 89 ++++++++++++++++++++++----------------
 4 files changed, 108 insertions(+), 96 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 59a5b0f1e81c..ccd26da93d03 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -2,6 +2,7 @@
 #define __ASM_SUSPEND_H
 
 #define NR_CTX_REGS 11
+#define NR_CALLEE_SAVED_REGS 12
 
 /*
  * struct cpu_suspend_ctx must be 16-byte aligned since it is allocated on
@@ -21,6 +22,25 @@ struct sleep_save_sp {
 	phys_addr_t save_ptr_stash_phys;
 };
 
+/*
+ * Memory to save the cpu state is allocated on the stack by
+ * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
+ * This data must survive until cpu_resume() is called.
+ *
+ * This struct describes the size and the layout of the saved cpu state.
+ * The layout of the callee_saved_regs is defined by the implementation
+ * of __cpu_suspend_enter(), and cpu_resume(). This struct must be passed
+ * in by the caller as __cpu_suspend_enter()'s stack-frame is gone once it
+ * returns, and the data would be subsequently corrupted by the call to the
+ * finisher.
+ */
+struct sleep_stack_data {
+	struct cpu_suspend_ctx	system_regs;
+	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
+};
+
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
+int __cpu_suspend_enter(struct sleep_stack_data *state);
+void __cpu_suspend_exit(struct mm_struct *mm);
 #endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 25de8b244961..1f13aabb9f39 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -160,6 +160,8 @@ int main(void)
   DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
   DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
   DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
+  DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
+  DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
   return 0;
 }
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index f586f7c875e2..1fa40573db13 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -49,37 +49,30 @@
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	.endm
 /*
- * Save CPU state for a suspend and execute the suspend finisher.
- * On success it will return 0 through cpu_resume - ie through a CPU
- * soft/hard reboot from the reset vector.
- * On failure it returns the suspend finisher return value or force
- * -EOPNOTSUPP if the finisher erroneously returns 0 (the suspend finisher
- * is not allowed to return, if it does this must be considered failure).
- * It saves callee registers, and allocates space on the kernel stack
- * to save the CPU specific registers + some other data for resume.
+ * Save CPU state in the provided sleep_stack_data area, and publish its
+ * location for cpu_resume()'s use in sleep_save_stash.
  *
- *  x0 = suspend finisher argument
- *  x1 = suspend finisher function pointer
+ * cpu_resume() will restore this saved state, and return. Because the
+ * link-register is saved and restored, it will appear to return from this
+ * function. So that the caller can tell the suspend/resume paths apart,
+ * __cpu_suspend_enter() will always return a non-zero value, whereas the
+ * path through cpu_resume() will return 0.
+ *
+ *  x0 = struct sleep_stack_data area
  */
 ENTRY(__cpu_suspend_enter)
-	stp	x29, lr, [sp, #-96]!
-	stp	x19, x20, [sp,#16]
-	stp	x21, x22, [sp,#32]
-	stp	x23, x24, [sp,#48]
-	stp	x25, x26, [sp,#64]
-	stp	x27, x28, [sp,#80]
-	/*
-	 * Stash suspend finisher and its argument in x20 and x19
-	 */
-	mov	x19, x0
-	mov	x20, x1
+	stp	x29, lr, [x0, #SLEEP_STACK_DATA_CALLEE_REGS]
+	stp	x19, x20, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+16]
+	stp	x21, x22, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+32]
+	stp	x23, x24, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+48]
+	stp	x25, x26, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+64]
+	stp	x27, x28, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+80]
+
+	/* save the sp in cpu_suspend_ctx */
 	mov	x2, sp
-	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
-	mov	x0, sp
-	/*
-	 * x0 now points to struct cpu_suspend_ctx allocated on the stack
-	 */
-	str	x2, [x0, #CPU_CTX_SP]
+	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
+
+	/* find the mpidr_hash */
 	ldr	x1, =sleep_save_sp
 	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
 	mrs	x7, mpidr_el1
@@ -93,34 +86,11 @@ ENTRY(__cpu_suspend_enter)
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
+
+	push	x29, lr
 	bl	__cpu_suspend_save
-	/*
-	 * Grab suspend finisher in x20 and its argument in x19
-	 */
-	mov	x0, x19
-	mov	x1, x20
-	/*
-	 * We are ready for power down, fire off the suspend finisher
-	 * in x1, with argument in x0
-	 */
-	blr	x1
-        /*
-	 * Never gets here, unless suspend finisher fails.
-	 * Successful cpu_suspend should return from cpu_resume, returning
-	 * through this code path is considered an error
-	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
-	 * to make sure a proper error condition is propagated
-	 */
-	cmp	x0, #0
-	mov	x3, #-EOPNOTSUPP
-	csel	x0, x3, x0, eq
-	add	sp, sp, #CPU_SUSPEND_SZ	// rewind stack pointer
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
+	pop	x29, lr
+	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
@@ -146,12 +116,6 @@ ENDPROC(cpu_resume_mmu)
 	.popsection
 cpu_resume_after_mmu:
 	mov	x0, #0			// return zero on success
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
 	ret
 ENDPROC(cpu_resume_after_mmu)
 
@@ -168,6 +132,8 @@ ENTRY(cpu_resume)
         /* x7 contains hash index, let's use it to grab context pointer */
 	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	x0, [x0, x7, lsl #3]
+	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
 	/* load physical address of identity map page table in x1 */
@@ -178,5 +144,12 @@ ENTRY(cpu_resume)
 	 * pointer and x1 to contain physical address of 1:1 page tables
 	 */
 	bl	cpu_do_resume		// PC relative jump, MMU off
+	/* Can't access these by physical address once the MMU is on */
+	ldp	x19, x20, [x29, #16]
+	ldp	x21, x22, [x29, #32]
+	ldp	x23, x24, [x29, #48]
+	ldp	x25, x26, [x29, #64]
+	ldp	x27, x28, [x29, #80]
+	ldp	x29, lr, [x29]
 	b	cpu_resume_mmu		// Resume MMU, never returns
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1095aa483a1c..1de29063e2e4 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -10,22 +10,22 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 
-extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long));
+
 /*
  * This is called by __cpu_suspend_enter() to save the state, and do whatever
  * flushing is required to ensure that when the CPU goes to sleep we have
  * the necessary data available when the caches are not searched.
  *
- * ptr: CPU context virtual address
+ * ptr: virtual address of the sleep_stack_data containing the cpu state.
  * save_ptr: address of the location where the context physical address
  *           must be saved
  */
-void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr,
+void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
 				phys_addr_t *save_ptr)
 {
 	*save_ptr = virt_to_phys(ptr);
 
-	cpu_do_suspend(ptr);
+	cpu_do_suspend(&ptr->system_regs);
 	/*
 	 * Only flush the context that must be retrieved with the MMU
 	 * off. VA primitives ensure the flush is applied to all
@@ -51,6 +51,41 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 	hw_breakpoint_restore = hw_bp_restore;
 }
 
+void notrace __cpu_suspend_exit(struct mm_struct *mm)
+{
+	/*
+	 * We are resuming from reset with TTBR0_EL1 set to the
+	 * idmap to enable the MMU; set the TTBR0 to the reserved
+	 * page tables to prevent speculative TLB allocations, flush
+	 * the local tlb and set the default tcr_el1.t0sz so that
+	 * the TTBR0 address space set-up is properly restored.
+	 * If the current active_mm != &init_mm we entered cpu_suspend
+	 * with mappings in TTBR0 that must be restored, so we switch
+	 * them back to complete the address space configuration
+	 * restoration before returning.
+	 */
+	cpu_set_reserved_ttbr0();
+	local_flush_tlb_all();
+	cpu_set_default_tcr_t0sz();
+
+	if (mm != &init_mm)
+		cpu_switch_mm(mm->pgd, mm);
+
+	/*
+	 * Restore per-cpu offset before any kernel
+	 * subsystem relying on it has a chance to run.
+	 */
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+
+	/*
+	 * Restore HW breakpoint registers to sane values
+	 * before debug exceptions are possibly reenabled
+	 * through local_dbg_restore.
+	 */
+	if (hw_breakpoint_restore)
+		hw_breakpoint_restore(NULL);
+}
+
 /*
  * cpu_suspend
  *
@@ -61,8 +96,9 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 {
 	struct mm_struct *mm = current->active_mm;
-	int ret;
+	int ret = 0;
 	unsigned long flags;
+	struct sleep_stack_data state;
 
 	/*
 	 * From this point debug exceptions are disabled to prevent
@@ -84,40 +120,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * page tables, so that the thread address space is properly
 	 * set-up on function return.
 	 */
-	ret = __cpu_suspend_enter(arg, fn);
-	if (ret == 0) {
-		/*
-		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; set the TTBR0 to the reserved
-		 * page tables to prevent speculative TLB allocations, flush
-		 * the local tlb and set the default tcr_el1.t0sz so that
-		 * the TTBR0 address space set-up is properly restored.
-		 * If the current active_mm != &init_mm we entered cpu_suspend
-		 * with mappings in TTBR0 that must be restored, so we switch
-		 * them back to complete the address space configuration
-		 * restoration before returning.
-		 */
-		cpu_set_reserved_ttbr0();
-		local_flush_tlb_all();
-		cpu_set_default_tcr_t0sz();
-
-		if (mm != &init_mm)
-			cpu_switch_mm(mm->pgd, mm);
-
-		/*
-		 * Restore per-cpu offset before any kernel
-		 * subsystem relying on it has a chance to run.
-		 */
-		set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	if (__cpu_suspend_enter(&state)) {
+		/* Call the suspend finisher */
+		ret = fn(arg);
 
 		/*
-		 * Restore HW breakpoint registers to sane values
-		 * before debug exceptions are possibly reenabled
-		 * through local_dbg_restore.
+		 * Never gets here unless the suspend finisher fails.
+		 * A successful cpu_suspend() returns from cpu_resume();
+		 * returning through this path is an error. If ret is 0,
+		 * force it to -EOPNOTSUPP to propagate a proper error.
 		 */
-		if (hw_breakpoint_restore)
-			hw_breakpoint_restore(NULL);
-	}
+		if (!ret)
+			ret = -EOPNOTSUPP;
+	} else
+		__cpu_suspend_exit(mm);
 
 	unpause_graph_tracing();
 
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 06/10] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

By enabling the MMU early in cpu_resume(), the sleep_save_sp array and
the stack can be accessed by VA, which avoids the need to convert
addresses and clean them to the PoC on the suspend path.

MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.

struct sleep_save_sp is removed and replaced with a single array of
pointers.

cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However, these values all contain RES0 bits that may be
used to enable future features.
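
With the MMU on, the resume path can find the saved context with an
ordinary virtual-address load. Conceptually (a hedged C rendering of
what the assembly in _cpu_resume does; 'hash' stands for the computed
mpidr_el1 hash index):

	extern unsigned long *sleep_save_stash;	/* allocated in cpu_suspend_init() */
	struct sleep_stack_data *state;

	state = (struct sleep_stack_data *)sleep_save_stash[hash];
	/* restore sp and the system registers from state->system_regs,
	 * then the callee-saved registers, and return to the caller */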

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/suspend.h |  7 +----
 arch/arm64/kernel/asm-offsets.c  |  3 ---
 arch/arm64/kernel/head.S         |  2 +-
 arch/arm64/kernel/setup.c        |  1 -
 arch/arm64/kernel/sleep.S        | 57 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 52 +++---------------------------------
 arch/arm64/mm/proc.S             | 28 +++++---------------
 7 files changed, 33 insertions(+), 117 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index ccd26da93d03..5faa3ce1fa3a 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
-#define NR_CTX_REGS 11
+#define NR_CTX_REGS 10
 #define NR_CALLEE_SAVED_REGS 12
 
 /*
@@ -17,11 +17,6 @@ struct cpu_suspend_ctx {
 	u64 sp;
 } __aligned(16);
 
-struct sleep_save_sp {
-	phys_addr_t *save_ptr_stash;
-	phys_addr_t save_ptr_stash_phys;
-};
-
 /*
  * Memory to save the cpu state is allocated on the stack by
  * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 1f13aabb9f39..7e0be84e1bdc 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -157,9 +157,6 @@ int main(void)
   DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
   DEFINE(MPIDR_HASH_MASK,	offsetof(struct mpidr_hash, mask));
   DEFINE(MPIDR_HASH_SHIFTS,	offsetof(struct mpidr_hash, shift_aff));
-  DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
-  DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
-  DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 23cfc08fc8ba..7cec62a76f50 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -622,7 +622,7 @@ ENDPROC(__secondary_switched)
  * If it isn't, park the CPU
  */
 	.section	".idmap.text", "ax"
-__enable_mmu:
+ENTRY(__enable_mmu)
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 8119479147db..1c4bc180efbe 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -174,7 +174,6 @@ static void __init smp_build_mpidr_hash(void)
 	 */
 	if (mpidr_hash_size() > 4 * num_possible_cpus())
 		pr_warn("Large number of MPIDR hash buckets detected\n");
-	__flush_dcache_area(&mpidr_hash, sizeof(struct mpidr_hash));
 }
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index 1fa40573db13..07e005d756b0 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -73,8 +73,8 @@ ENTRY(__cpu_suspend_enter)
 	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
 
 	/* find the mpidr_hash */
-	ldr	x1, =sleep_save_sp
-	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
+	ldr	x1, =sleep_save_stash
+	ldr	x1, [x1]
 	mrs	x7, mpidr_el1
 	ldr	x9, =mpidr_hash
 	ldr	x10, [x9, #MPIDR_HASH_MASK]
@@ -87,40 +87,26 @@ ENTRY(__cpu_suspend_enter)
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
 
+	str	x0, [x1]
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	push	x29, lr
-	bl	__cpu_suspend_save
+	bl	cpu_do_suspend
 	pop	x29, lr
 	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
 
-/*
- * x0 must contain the sctlr value retrieved from restored context
- */
-	.pushsection	".idmap.text", "ax"
-ENTRY(cpu_resume_mmu)
-	ldr	x3, =cpu_resume_after_mmu
-	msr	sctlr_el1, x0		// restore sctlr_el1
-	isb
-	/*
-	 * Invalidate the local I-cache so that any instructions fetched
-	 * speculatively from the PoC are discarded, since they may have
-	 * been dynamically patched at the PoU.
-	 */
-	ic	iallu
-	dsb	nsh
-	isb
-	br	x3			// global jump to virtual address
-ENDPROC(cpu_resume_mmu)
-	.popsection
-cpu_resume_after_mmu:
-	mov	x0, #0			// return zero on success
-	ret
-ENDPROC(cpu_resume_after_mmu)
-
 ENTRY(cpu_resume)
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
+	/* enable the MMU early - so we can access sleep_save_stash by va */
+	adr_l	lr, __enable_mmu	/* __cpu_setup will return here */
+	ldr	x27, =_cpu_resume	/* __enable_mmu will branch here */
+	adrp	x25, idmap_pg_dir
+	adrp	x26, swapper_pg_dir
+	b	__cpu_setup
+
+ENTRY(_cpu_resume)
 	mrs	x1, mpidr_el1
 	adrp	x8, mpidr_hash
 	add x8, x8, #:lo12:mpidr_hash // x8 = struct mpidr_hash phys address
@@ -130,26 +116,23 @@ ENTRY(cpu_resume)
 	ldp	w5, w6, [x8, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x7, x3, x4, x5, x6, x1, x2
         /* x7 contains hash index, let's use it to grab context pointer */
-	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
+	ldr_l	x0, sleep_save_stash
 	ldr	x0, [x0, x7, lsl #3]
 	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
 	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	/* load physical address of identity map page table in x1 */
-	adrp	x1, idmap_pg_dir
 	mov	sp, x2
-	/*
-	 * cpu_do_resume expects x0 to contain context physical address
-	 * pointer and x1 to contain physical address of 1:1 page tables
-	 */
-	bl	cpu_do_resume		// PC relative jump, MMU off
-	/* Can't access these by physical address once the MMU is on */
+	bl	cpu_do_resume
+	msr	sctlr_el1, x0
+	isb
+
 	ldp	x19, x20, [x29, #16]
 	ldp	x21, x22, [x29, #32]
 	ldp	x23, x24, [x29, #48]
 	ldp	x25, x26, [x29, #64]
 	ldp	x27, x28, [x29, #80]
 	ldp	x29, lr, [x29]
-	b	cpu_resume_mmu		// Resume MMU, never returns
+	mov	x0, #0
+	ret
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1de29063e2e4..57e1c3b7e930 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -12,30 +12,6 @@
 
 
 /*
- * This is called by __cpu_suspend_enter() to save the state, and do whatever
- * flushing is required to ensure that when the CPU goes to sleep we have
- * the necessary data available when the caches are not searched.
- *
- * ptr: virtual address of the sleep_stack_data containing the cpu state.
- * save_ptr: address of the location where the context physical address
- *           must be saved
- */
-void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
-				phys_addr_t *save_ptr)
-{
-	*save_ptr = virt_to_phys(ptr);
-
-	cpu_do_suspend(&ptr->system_regs);
-	/*
-	 * Only flush the context that must be retrieved with the MMU
-	 * off. VA primitives ensure the flush is applied to all
-	 * cache levels so context is pushed to DRAM.
-	 */
-	__flush_dcache_area(ptr, sizeof(*ptr));
-	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
-}
-
-/*
  * This hook is provided so that cpu_suspend code can restore HW
  * breakpoints as early as possible in the resume path, before reenabling
  * debug exceptions. Code cannot be run from a CPU PM notifier since by the
@@ -53,21 +29,6 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 
 void notrace __cpu_suspend_exit(struct mm_struct *mm)
 {
-	/*
-	 * We are resuming from reset with TTBR0_EL1 set to the
-	 * idmap to enable the MMU; set the TTBR0 to the reserved
-	 * page tables to prevent speculative TLB allocations, flush
-	 * the local tlb and set the default tcr_el1.t0sz so that
-	 * the TTBR0 address space set-up is properly restored.
-	 * If the current active_mm != &init_mm we entered cpu_suspend
-	 * with mappings in TTBR0 that must be restored, so we switch
-	 * them back to complete the address space configuration
-	 * restoration before returning.
-	 */
-	cpu_set_reserved_ttbr0();
-	local_flush_tlb_all();
-	cpu_set_default_tcr_t0sz();
-
 	if (mm != &init_mm)
 		cpu_switch_mm(mm->pgd, mm);
 
@@ -148,22 +109,17 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	return ret;
 }
 
-struct sleep_save_sp sleep_save_sp;
+unsigned long *sleep_save_stash;
 
 static int __init cpu_suspend_init(void)
 {
-	void *ctx_ptr;
-
 	/* ctx_ptr is an array of physical addresses */
-	ctx_ptr = kcalloc(mpidr_hash_size(), sizeof(phys_addr_t), GFP_KERNEL);
+	sleep_save_stash = kcalloc(mpidr_hash_size(), sizeof(*sleep_save_stash),
+				   GFP_KERNEL);
 
-	if (WARN_ON(!ctx_ptr))
+	if (WARN_ON(!sleep_save_stash))
 		return -ENOMEM;
 
-	sleep_save_sp.save_ptr_stash = ctx_ptr;
-	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
-	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
-
 	return 0;
 }
 early_initcall(cpu_suspend_init);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 7ab3a9097369..874bc0c178d1 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -61,20 +61,17 @@ ENTRY(cpu_do_suspend)
 	mrs	x2, tpidr_el0
 	mrs	x3, tpidrro_el0
 	mrs	x4, contextidr_el1
-	mrs	x5, mair_el1
 	mrs	x6, cpacr_el1
-	mrs	x7, ttbr1_el1
 	mrs	x8, tcr_el1
 	mrs	x9, vbar_el1
 	mrs	x10, mdscr_el1
 	mrs	x11, oslsr_el1
 	mrs	x12, sctlr_el1
 	stp	x2, x3, [x0]
-	stp	x4, x5, [x0, #16]
-	stp	x6, x7, [x0, #32]
-	stp	x8, x9, [x0, #48]
-	stp	x10, x11, [x0, #64]
-	str	x12, [x0, #80]
+	stp	x4, xzr, [x0, #16]
+	stp	x6, x8, [x0, #32]
+	stp	x9, x10, [x0, #48]
+	stp	x11, x12, [x0, #64]
 	ret
 ENDPROC(cpu_do_suspend)
 
@@ -82,30 +79,20 @@ ENDPROC(cpu_do_suspend)
  * cpu_do_resume - restore CPU register context
  *
  * x0: Physical address of context pointer
- * x1: ttbr0_el1 to be restored
  *
  * Returns:
  *	sctlr_el1 value in x0
  */
 ENTRY(cpu_do_resume)
-	/*
-	 * Invalidate local tlb entries before turning on MMU
-	 */
-	tlbi	vmalle1
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
-	ldp	x6, x7, [x0, #32]
-	ldp	x8, x9, [x0, #48]
-	ldp	x10, x11, [x0, #64]
-	ldr	x12, [x0, #80]
+	ldp	x6, x8, [x0, #32]
+	ldp	x9, x10, [x0, #48]
+	ldp	x11, x12, [x0, #64]
 	msr	tpidr_el0, x2
 	msr	tpidrro_el0, x3
 	msr	contextidr_el1, x4
-	msr	mair_el1, x5
 	msr	cpacr_el1, x6
-	msr	ttbr0_el1, x1
-	msr	ttbr1_el1, x7
-	tcr_set_idmap_t0sz x8, x7
 	msr	tcr_el1, x8
 	msr	vbar_el1, x9
 	msr	mdscr_el1, x10
@@ -115,7 +102,6 @@ ENTRY(cpu_do_resume)
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	mov	x0, x12
-	dsb	nsh		// Make sure local tlb invalidation completed
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 07/10] arm64: kernel: Include _AC definition in page.h
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.
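
For reference, _AC() (from linux/const.h) is what lets the same
constant work in both C and assembly; a sketch of the relevant
definitions (paraphrased, not the exact header text):

	#ifdef __ASSEMBLY__
	#define _AC(X, Y)	X		/* asm sees a bare constant */
	#else
	#define _AC(X, Y)	(X##Y)		/* C sees e.g. 1UL */
	#endif

	#define PAGE_SIZE	(_AC(1, UL) << PAGE_SHIFT)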

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/page.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 9b2f5a9d019d..fbafd0ad16df 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -19,6 +19,8 @@
 #ifndef __ASM_PAGE_H
 #define __ASM_PAGE_H
 
+#include <linux/const.h>
+
 /* PAGE_SHIFT determines the page size */
 /* CONT_SHIFT determines the number of pages which can be tracked together  */
 #ifdef CONFIG_ARM64_64K_PAGES
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 08/10] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

KERNEL_START and KERNEL_END are useful outside head.S; move them to a
header file.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/memory.h | 3 +++
 arch/arm64/kernel/head.S        | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953cd1f08..5773a6629f10 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -70,6 +70,9 @@
 
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
 
+#define KERNEL_START      _text
+#define KERNEL_END        _end
+
 /*
  * Physical vs virtual RAM address space conversion.  These are
  * private definitions which should NOT be used outside memory.h
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 7cec62a76f50..c58ede3398db 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -48,9 +48,6 @@
 #error TEXT_OFFSET must be less than 2MB
 #endif
 
-#define KERNEL_START	_text
-#define KERNEL_END	_end
-
 /*
  * Kernel startup entry point.
  * ---------------------------
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Pavel Machek
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, James Morse

Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch it as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Create a new list of pages that were restored in place, so
that the arch code can perform this maintenance when necessary.

Signed-off-by: James Morse <james.morse@arm.com>
---
 include/linux/suspend.h |  1 +
 kernel/power/snapshot.c | 42 ++++++++++++++++++++++++++++--------------
 2 files changed, 29 insertions(+), 14 deletions(-)

diff --git a/include/linux/suspend.h b/include/linux/suspend.h
index 8b6ec7ef0854..b17cf6081bca 100644
--- a/include/linux/suspend.h
+++ b/include/linux/suspend.h
@@ -384,6 +384,7 @@ extern bool system_entering_hibernation(void);
 extern bool hibernation_available(void);
 asmlinkage int swsusp_save(void);
 extern struct pbe *restore_pblist;
+extern struct pbe *restored_inplace_pblist;
 #else /* CONFIG_HIBERNATION */
 static inline void register_nosave_region(unsigned long b, unsigned long e) {}
 static inline void register_nosave_region_late(unsigned long b, unsigned long e) {}
diff --git a/kernel/power/snapshot.c b/kernel/power/snapshot.c
index 3a970604308f..f251f5af49fb 100644
--- a/kernel/power/snapshot.c
+++ b/kernel/power/snapshot.c
@@ -74,6 +74,11 @@ void __init hibernate_image_size_init(void)
  */
 struct pbe *restore_pblist;
 
+/* List of PBEs that were restored in place. Modified-Harvard architectures
+ * need to 'clean' these pages before they can be executed.
+ */
+struct pbe *restored_inplace_pblist;
+
 /* Pointer to an auxiliary buffer (1 page) */
 static void *buffer;
 
@@ -1359,6 +1364,7 @@ out:
 	nr_copy_pages = 0;
 	nr_meta_pages = 0;
 	restore_pblist = NULL;
+	restored_inplace_pblist = NULL;
 	buffer = NULL;
 	alloc_normal = 0;
 	alloc_highmem = 0;
@@ -2072,6 +2078,7 @@ load_header(struct swsusp_info *info)
 	int error;
 
 	restore_pblist = NULL;
+	restored_inplace_pblist = NULL;
 	error = check_header(info);
 	if (!error) {
 		nr_copy_pages = info->image_pages;
@@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
 	if (PageHighMem(page))
 		return get_highmem_page_buffer(page, ca);
 
-	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page))
-		/* We have allocated the "original" page frame and we can
-		 * use it directly to store the loaded page.
-		 */
-		return page_address(page);
-
-	/* The "original" page frame has not been allocated and we have to
-	 * use a "safe" page frame to store the loaded page.
-	 */
 	pbe = chain_alloc(ca, sizeof(struct pbe));
 	if (!pbe) {
 		swsusp_free();
 		return ERR_PTR(-ENOMEM);
 	}
-	pbe->orig_address = page_address(page);
-	pbe->address = safe_pages_list;
-	safe_pages_list = safe_pages_list->next;
-	pbe->next = restore_pblist;
-	restore_pblist = pbe;
+
+	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page)) {
+		/* We have allocated the "original" page frame and we can
+		 * use it directly to store the loaded page.
+		 */
+		pbe->orig_address = NULL;
+		pbe->address = page_address(page);
+		pbe->next = restored_inplace_pblist;
+		restored_inplace_pblist = pbe;
+	} else {
+		/* The "original" page frame has not been allocated and we
+		 * have to use a "safe" page frame to store the loaded page.
+		 */
+		pbe->orig_address = page_address(page);
+		pbe->address = safe_pages_list;
+		safe_pages_list = safe_pages_list->next;
+		pbe->next = restore_pblist;
+		restore_pblist = pbe;
+	}
+
 	return pbe->address;
 }
 
@@ -2513,6 +2526,7 @@ int snapshot_write_next(struct snapshot_handle *handle)
 			chain_init(&ca, GFP_ATOMIC, PG_SAFE);
 			memory_bm_position_reset(&orig_bm);
 			restore_pblist = NULL;
+			restored_inplace_pblist = NULL;
 			handle->buffer = get_buffer(&orig_bm, &ca);
 			handle->sync_read = 0;
 			if (IS_ERR(handle->buffer))
-- 
2.6.2
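
As a sketch of how an architecture consumes the new list: patch 10 of this
series drains it on arm64 before jumping to the relocated exit code, roughly:

    /* from swsusp_arch_resume() in patch 10 */
    while (restored_inplace_pblist != NULL) {
        struct pbe *pbe = restored_inplace_pblist;

        /* pbe->address is a page that was restored in place */
        flush_icache_range((unsigned long)pbe->address,
                           (unsigned long)pbe->address + PAGE_SIZE);
        restored_inplace_pblist = pbe->next;
    }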


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* [PATCH v3 10/10] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-26 17:32 ` James Morse
@ 2015-11-26 17:32   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-11-26 17:32 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

Add support for hibernate/suspend-to-disk.

Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.

Restore creates a set of temporary page tables, covering the kernel and the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. It calls into cpu_resume(),
and then follows the normal cpu_suspend() path back into the suspend code.

The implementation assumes that exactly the same kernel is booted on the
same hardware, and that the kernel is loaded at the same physical address.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Pavel Machek <pavel@ucw.cz>
---
 arch/arm64/Kconfig                |   3 +
 arch/arm64/include/asm/suspend.h  |   8 +
 arch/arm64/kernel/Makefile        |   1 +
 arch/arm64/kernel/asm-offsets.c   |   4 +
 arch/arm64/kernel/hibernate-asm.S | 119 ++++++++++++
 arch/arm64/kernel/hibernate.c     | 376 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S   |  15 ++
 7 files changed, 526 insertions(+)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 9ac16a482ff1..b15d831f4016 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -769,6 +769,9 @@ menu "Power management options"
 
 source "kernel/power/Kconfig"
 
+config ARCH_HIBERNATION_POSSIBLE
+	def_bool y
+
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 5faa3ce1fa3a..e75ad7aa268c 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,3 +1,5 @@
+#include <linux/suspend.h>
+
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
@@ -34,6 +36,12 @@ struct sleep_stack_data {
 	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
 };
 
+extern int swsusp_arch_suspend(void);
+extern int swsusp_arch_resume(void);
+int swsusp_arch_suspend_enter(struct cpu_suspend_ctx *ptr);
+void __noreturn swsusp_arch_suspend_exit(phys_addr_t tmp_pg_dir,
+					 phys_addr_t swapper_pg_dir,
+					 void *kernel_start, void *kernel_end);
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 int __cpu_suspend_enter(struct sleep_stack_data *state);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 474691f8b13a..71da22197963 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -41,6 +41,7 @@ arm64-obj-$(CONFIG_EFI)			+= efi.o efi-entry.stub.o
 arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
+arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 7e0be84e1bdc..15c36aae6fac 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
+#include <linux/suspend.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -160,5 +161,8 @@ int main(void)
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
+  DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
+  DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
+  DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   return 0;
 }
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
new file mode 100644
index 000000000000..db3e4d6bb68e
--- /dev/null
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -0,0 +1,119 @@
+#include <linux/linkage.h>
+#include <linux/errno.h>
+
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+#include <asm/cputype.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+
+/*
+ * Corrupt memory.
+ *
+ * Loads temporary page tables then restores the memory image.
+ * Finally branches to cpu_resume() to restore the state saved by
+ * swsusp_arch_suspend().
+ *
+ * Because this code has to be copied to a safe_page, it can't call out to
+ * other functions by PC-relative address. Also remember that it may be
+ * mid-way through over-writing other functions. For this reason it contains
+ * a copy of copy_page() and code from flush_icache_range().
+ *
+ * All of memory gets written to, including code. We need to clean the kernel
+ * text to the Point of Coherence (PoC) before secondary cores can be booted.
+ * Because the kernel modules and executable pages mapped to user space are
+ * also written as data, we clean all pages we touch to the Point of
+ * Unification (PoU).
+ *
+ * x0: physical address of temporary page tables
+ * x1: physical address of swapper page tables
+ * x2: address of kernel_start
+ * x3: address of kernel_end
+ */
+.pushsection    ".hibernate_exit.text", "ax"
+ENTRY(swsusp_arch_suspend_exit)
+	/* Temporary page tables are a copy, so no need for a trampoline here */
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	isb
+
+	mov	x21, x1
+	mov	x22, x2
+	mov	x23, x3
+
+	/* walk the restore_pblist and use copy_page() to over-write memory */
+	ldr	x19, =restore_pblist
+	ldr	x19, [x19]
+
+2:	ldr	x10, [x19, #HIBERN_PBE_ORIG]
+	mov	x0, x10
+	ldr	x1, [x19, #HIBERN_PBE_ADDR]
+
+	/* arch/arm64/lib/copy_page.S:copy_page() */
+	prfm	pldl1strm, [x1, #64]
+3:	ldp	x2, x3, [x1]
+	ldp	x4, x5, [x1, #16]
+	ldp	x6, x7, [x1, #32]
+	ldp	x8, x9, [x1, #48]
+	add	x1, x1, #64
+	prfm	pldl1strm, [x1, #64]
+	stnp	x2, x3, [x0]
+	stnp	x4, x5, [x0, #16]
+	stnp	x6, x7, [x0, #32]
+	stnp	x8, x9, [x0, #48]
+	add	x0, x0, #64
+	tst	x1, #(PAGE_SIZE - 1)
+	b.ne	3b
+
+	dsb	ish		//  memory restore must finish before cleaning
+
+	add	x1, x10, #PAGE_SIZE
+	/* Clean the copied page to PoU - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x10, x3
+4:	dc	cvau, x4	// clean D line / unified line
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	4b
+
+	ldr	x19, [x19, #HIBERN_PBE_NEXT]
+	cbnz	x19, 2b
+
+	/* Clean the kernel text to PoC - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x22, x3
+5:	dc	cvac, x4
+	add	x4, x4, x2
+	cmp	x4, x23
+	b.lo	5b
+
+	/*
+	 * branch into the restored kernel - so that when we restore the page
+	 * tables, code continues to be executable.
+	 */
+	ldr	x1, =__hibernate_exit
+	mov	x0, x21		// physical address of swapper page tables.
+	br	x1
+
+	.ltorg
+ENDPROC(swsusp_arch_suspend_exit)
+.popsection
+
+/*
+ * Reset the page tables, and wake up in cpu_resume().
+ * Temporary page tables were a copy, so again, no trampoline here.
+ *
+ * x0: physical address of swapper_pg_dir
+ */
+ENTRY(__hibernate_exit)
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	isb
+	b	_cpu_resume
+ENDPROC(__hibernate_exit)
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
new file mode 100644
index 000000000000..ff2fe72e7d95
--- /dev/null
+++ b/arch/arm64/kernel/hibernate.c
@@ -0,0 +1,376 @@
+/*
+ * Hibernate support specific for ARM64
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *  https://lkml.org/lkml/2010/6/18/4
+ *  https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *  https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+#define pr_fmt(x) "hibernate: " x
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/version.h>
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/irqflags.h>
+#include <asm/memory.h>
+#include <asm/mmu_context.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/sections.h>
+#include <asm/suspend.h>
+
+/* These are necessary to build without ifdefery */
+#ifndef pmd_index
+#define pmd_index(x)	0
+#endif
+#ifndef pud_index
+#define pud_index(x)	0
+#endif
+
+/*
+ * Start/end of the hibernate exit code, this must be copied to a 'safe'
+ * location in memory, and executed from there.
+ */
+extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = virt_to_pfn(&__nosave_begin);
+	unsigned long nosave_end_pfn = virt_to_pfn(&__nosave_end - 1);
+
+	return (pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	local_fiq_enable();
+}
+
+/*
+ * Copies src_length bytes, starting at src_start, into a new page,
+ * performs cache maintenance, then maps it at the top of memory as executable.
+ *
+ * This is used by hibernate to copy the code it needs to execute when
+ * overwriting the kernel text.
+ *
+ * Suggested allocators are get_safe_page() or get_zeroed_page(). Your chosen
+ * mask must cause zero'd pages to be returned.
+ */
+static int create_safe_exec_page(void *src_start, size_t length,
+				 void **dst_addr,
+				 unsigned long (*allocator)(gfp_t mask),
+				 gfp_t mask)
+{
+	int rc = 0;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	unsigned long dst = allocator(mask);
+
+	if (!dst) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	memcpy((void *)dst, src_start, length);
+	flush_icache_range(dst, dst + length);
+
+	pgd = pgd_offset(&init_mm, (unsigned long)-1);
+	if (!pgd_val(*pgd) && PTRS_PER_PGD > 1) {
+		pud = (pud_t *)allocator(mask);
+		if (!pud) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pgd(pgd, __pgd(virt_to_phys(pud) | PUD_TYPE_TABLE));
+	}
+
+	pud = pud_offset(pgd, (unsigned long)-1);
+	if (!pud_val(*pud) && PTRS_PER_PUD > 1) {
+		pmd = (pmd_t *)allocator(mask);
+		if (!pmd) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pud(pud, __pud(virt_to_phys(pmd) | PUD_TYPE_TABLE));
+	}
+
+	pmd = pmd_offset(pud, (unsigned long)-1);
+	if (!pmd_val(*pmd) && PTRS_PER_PMD > 1) {
+		pte = (pte_t *)allocator(mask);
+		if (!pte) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pmd(pmd, __pmd(virt_to_phys(pte) | PMD_TYPE_TABLE));
+	}
+
+	pte = pte_offset_kernel(pmd, (unsigned long)-1);
+	set_pte_at(&init_mm, dst, pte,
+		   __pte(virt_to_phys((void *)dst)
+			 | pgprot_val(PAGE_KERNEL_EXEC)));
+
+	/* this is a new mapping, so no need for a tlbi */
+
+	*dst_addr = (void *)((unsigned long)-1 & PAGE_MASK);
+
+out:
+	return rc;
+}
+
+
+int swsusp_arch_suspend(void)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct sleep_stack_data state;
+	struct mm_struct *mm = current->active_mm;
+
+	local_dbg_save(flags);
+
+	if (__cpu_suspend_enter(&state))
+		ret = swsusp_save();
+	else
+		__cpu_suspend_exit(mm);
+
+	local_dbg_restore(flags);
+
+	return ret;
+}
+
+static int copy_pte(pmd_t *dst, pmd_t *src, unsigned long *start_addr)
+{
+	int i;
+	pte_t *old_pte = pte_offset_kernel(src, *start_addr);
+	pte_t *new_pte = pte_offset_kernel(dst, *start_addr);
+
+	for (i = pte_index(*start_addr); i < PTRS_PER_PTE;
+	     i++, old_pte++, new_pte++) {
+		if (pte_val(*old_pte))
+			set_pte(new_pte,
+				__pte(pte_val(*old_pte) & ~PTE_RDONLY));
+	}
+
+	*start_addr &= PAGE_MASK;
+
+	return 0;
+}
+
+static int copy_pmd(pud_t *dst, pud_t *src, unsigned long *start_addr)
+{
+	int i;
+	int rc = 0;
+	pte_t *new_pte;
+	pmd_t *old_pmd = pmd_offset(src, *start_addr);
+	pmd_t *new_pmd = pmd_offset(dst, *start_addr);
+
+	for (i = pmd_index(*start_addr); i < PTRS_PER_PMD;
+	     i++, *start_addr += PMD_SIZE, old_pmd++, new_pmd++) {
+		if (!pmd_val(*old_pmd))
+			continue;
+
+		if (pmd_table(*(old_pmd))) {
+			new_pte = (pte_t *)get_safe_page(GFP_ATOMIC);
+			if (!new_pte) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pmd(new_pmd, __pmd(virt_to_phys(new_pte)
+					       | PMD_TYPE_TABLE));
+
+			rc = copy_pte(new_pmd, old_pmd, start_addr);
+			if (rc)
+				break;
+		} else
+			set_pmd(new_pmd,
+				__pmd(pmd_val(*old_pmd) & ~PMD_SECT_RDONLY));
+
+		*start_addr &= PMD_MASK;
+	}
+
+	return rc;
+}
+
+static int copy_pud(pgd_t *dst, pgd_t *src, unsigned long *start_addr)
+{
+	int i;
+	int rc = 0;
+	pmd_t *new_pmd;
+	pud_t *old_pud = pud_offset(src, *start_addr);
+	pud_t *new_pud = pud_offset(dst, *start_addr);
+
+	for (i = pud_index(*start_addr); i < PTRS_PER_PUD;
+	     i++, *start_addr += PUD_SIZE, old_pud++, new_pud++) {
+		if (!pud_val(*old_pud))
+			continue;
+
+		if (pud_table(*(old_pud))) {
+			if (PTRS_PER_PMD != 1) {
+				new_pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+				if (!new_pmd) {
+					rc = -ENOMEM;
+					break;
+				}
+
+				set_pud(new_pud, __pud(virt_to_phys(new_pmd)
+						       | PUD_TYPE_TABLE));
+			}
+
+			rc = copy_pmd(new_pud, old_pud, start_addr);
+			if (rc)
+				break;
+		} else
+			set_pud(new_pud,
+				__pud(pud_val(*old_pud) & ~PMD_SECT_RDONLY));
+
+		*start_addr &= PUD_MASK;
+	}
+
+	return rc;
+}
+
+static int copy_page_tables(pgd_t *new_pgd, unsigned long start_addr)
+{
+	int i;
+	int rc = 0;
+	pud_t *new_pud;
+	pgd_t *old_pgd = pgd_offset_k(start_addr);
+
+	new_pgd += pgd_index(start_addr);
+
+	for (i = pgd_index(start_addr); i < PTRS_PER_PGD;
+	     i++, start_addr += PGDIR_SIZE, old_pgd++, new_pgd++) {
+		if (!pgd_val(*old_pgd))
+			continue;
+
+		if (PTRS_PER_PUD != 1) {
+			new_pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+			if (!new_pud) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pgd(new_pgd, __pgd(virt_to_phys(new_pud)
+					       | PUD_TYPE_TABLE));
+		}
+
+		rc = copy_pud(new_pgd, old_pgd, &start_addr);
+		if (rc)
+			break;
+
+		start_addr &= PGDIR_MASK;
+	}
+
+	return rc;
+}
+
+/*
+ * Set up and resume from the hibernate image using swsusp_arch_suspend_exit().
+ *
+ * Memory allocated by get_safe_page() will be dealt with by the hibernate
+ * code; we don't need to free it here.
+ *
+ * Allocate a safe zero page to use as ttbr0, as all existing page tables, and
+ * even the empty_zero_page will be overwritten.
+ */
+int swsusp_arch_resume(void)
+{
+	int rc = 0;
+	size_t exit_size;
+	pgd_t *tmp_pg_dir;
+	void *safe_zero_page_mem;
+	unsigned long tmp_pg_start;
+	void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t,
+					  void *, void *);
+
+	/* Copy swsusp_arch_suspend_exit() to a safe page. */
+	exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
+	rc = create_safe_exec_page(__hibernate_exit_text_start, exit_size,
+			(void **)&hibernate_exit, get_safe_page, GFP_ATOMIC);
+	if (rc) {
+		pr_err("Failed to create safe executable page for"
+		       " hibernate_exit code.");
+		goto out;
+	}
+
+	/*
+	 * Even the zero page may get overwritten during restore.
+	 * get_safe_page() only returns zero'd pages.
+	 */
+	safe_zero_page_mem = (void *)get_safe_page(GFP_ATOMIC);
+	if (!safe_zero_page_mem) {
+		pr_err("Failed to allocate memory for zero page.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	empty_zero_page = virt_to_page(safe_zero_page_mem);
+	cpu_set_reserved_ttbr0();
+
+	/*
+	 * Restoring the memory image will overwrite the ttbr1 page tables.
+	 * Create a second copy, of the kernel and linear map, and use this
+	 * when restoring.
+	 */
+	tmp_pg_dir = (pgd_t *)get_safe_page(GFP_ATOMIC);
+	if (!tmp_pg_dir) {
+		pr_err("Failed to allocate memory for temporary page tables.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	tmp_pg_start = min((unsigned long)KERNEL_START,
+			   (unsigned long)PAGE_OFFSET);
+	rc = copy_page_tables(tmp_pg_dir, tmp_pg_start);
+	if (rc)
+		goto out;
+
+	/*
+	 * EL2 may get upset if we overwrite its page-tables/stack.
+	 * kvm_arch_hardware_disable() returns EL2 to the hyp stub. This
+	 * isn't needed on normal suspend/resume as PSCI prevents us from
+	 * ruining EL2.
+	 */
+	if (IS_ENABLED(CONFIG_KVM_ARM_HOST))
+		kvm_arch_hardware_disable();
+
+	/*
+	 * Some pages are read directly into their final location by
+	 * kernel/power/snapshot.c, these are listed in
+	 * restored_inplace_pblist. Some of them may be executable, we
+	 * need to clean them to the PoU.
+	 */
+	while (restored_inplace_pblist != NULL) {
+		struct pbe *pbe = restored_inplace_pblist;
+
+		flush_icache_range((unsigned long)pbe->address,
+				   (unsigned long)pbe->address + PAGE_SIZE);
+		restored_inplace_pblist = pbe->next;
+	}
+
+	hibernate_exit(virt_to_phys(tmp_pg_dir), virt_to_phys(swapper_pg_dir),
+		       KERNEL_START, KERNEL_END);
+
+out:
+	return rc;
+}
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index 1ee2c3937d4e..87addff02009 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -45,6 +45,16 @@ jiffies = jiffies_64;
 	*(.idmap.text)					\
 	VMLINUX_SYMBOL(__idmap_text_end) = .;
 
+#ifdef CONFIG_HIBERNATION
+#define HIBERNATE_TEXT					\
+	. = ALIGN(SZ_4K);				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_start) = .;\
+	*(.hibernate_exit.text)				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_end) = .;
+#else
+#define HIBERNATE_TEXT
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -106,6 +116,7 @@ SECTIONS
 			LOCK_TEXT
 			HYPERVISOR_TEXT
 			IDMAP_TEXT
+			HIBERNATE_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -185,6 +196,10 @@ ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"HYP init code too big or misaligned")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"ID map text too big or misaligned")
+#ifdef CONFIG_HIBERNATION
+ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
+	<= SZ_4K, "Hibernate exit text too big or misaligned")
+#endif
 
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
-- 
2.6.2
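
In outline, the resume path added above reduces to the following sketch (error
handling and the KVM/zero-page details omitted):

    /* simplified shape of swsusp_arch_resume() */
    create_safe_exec_page(__hibernate_exit_text_start, exit_size,
                          (void **)&hibernate_exit, get_safe_page,
                          GFP_ATOMIC);              /* relocate exit code */
    copy_page_tables(tmp_pg_dir, tmp_pg_start);     /* temporary ttbr1 copy */
    /* clean pages restored in place, then jump to the safe copy: */
    hibernate_exit(virt_to_phys(tmp_pg_dir), virt_to_phys(swapper_pg_dir),
                   KERNEL_START, KERNEL_END);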


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 01/10] arm64: Fold proc-macros.S into assembler.h
  2015-11-26 17:32   ` James Morse
@ 2015-12-01  9:18     ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-01  9:18 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	linux-pm

On Thu 2015-11-26 17:32:39, James Morse wrote:
> From: Geoff Levand <geoff@infradead.org>
> 
> To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to be used
> outside the mm code move the contents of proc-macros.S to asm/assembler.h.  Also,
> delete proc-macros.S, and fix up all references to proc-macros.S.
> 
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 07/10] arm64: kernel: Include _AC definition in page.h
  2015-11-26 17:32   ` James Morse
@ 2015-12-01  9:28     ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-01  9:28 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	linux-pm

On Thu 2015-11-26 17:32:45, James Morse wrote:
> page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
> linux/const.h where this is defined. This produces build warnings when only
> asm/page.h is included by asm code.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

> ---
>  arch/arm64/include/asm/page.h | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
> index 9b2f5a9d019d..fbafd0ad16df 100644
> --- a/arch/arm64/include/asm/page.h
> +++ b/arch/arm64/include/asm/page.h
> @@ -19,6 +19,8 @@
>  #ifndef __ASM_PAGE_H
>  #define __ASM_PAGE_H
>  
> +#include <linux/const.h>
> +
>  /* PAGE_SHIFT determines the page size */
>  /* CONT_SHIFT determines the number of pages which can be tracked together  */
>  #ifdef CONFIG_ARM64_64K_PAGES

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 08/10] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  2015-11-26 17:32   ` James Morse
@ 2015-12-01  9:28     ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-01  9:28 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	linux-pm

On Thu 2015-11-26 17:32:46, James Morse wrote:
> KERNEL_START and KERNEL_END are useful outside head.S, move them to a
> header file.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

> diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
> index 853953cd1f08..5773a6629f10 100644
> --- a/arch/arm64/include/asm/memory.h
> +++ b/arch/arm64/include/asm/memory.h
> @@ -70,6 +70,9 @@
>  
>  #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
>  
> +#define KERNEL_START      _text
> +#define KERNEL_END        _end
> +
>  /*
>   * Physical vs virtual RAM address space conversion.  These are
>   * private definitions which should NOT be used outside memory.h
> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> index 7cec62a76f50..c58ede3398db 100644
> --- a/arch/arm64/kernel/head.S
> +++ b/arch/arm64/kernel/head.S
> @@ -48,9 +48,6 @@
>  #error TEXT_OFFSET must be less than 2MB
>  #endif
>  
> -#define KERNEL_START	_text
> -#define KERNEL_END	_end
> -
>  /*
>   * Kernel startup entry point.
>   * ---------------------------

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 10/10] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-26 17:32   ` James Morse
@ 2015-12-01  9:31     ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-01  9:31 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	linux-pm

Hi!

> Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
> before calling swsusp_save() to save the memory image.
> 
> Restore creates a set of temporary page tables, covering the kernel and the
> linear map, copies the restore code to a 'safe' page, then uses the copy to
> restore the memory image. It calls into cpu_resume(),
> and then follows the normal cpu_suspend() path back into the suspend code.
> 
> The implementation assumes that exactly the same kernel is booted on the
> same hardware, and that the kernel is loaded at the same physical address.
> 
> Signed-off-by: James Morse <james.morse@arm.com>
> Acked-by: Pavel Machek <pavel@ucw.cz>

> diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
> index 5faa3ce1fa3a..e75ad7aa268c 100644
> --- a/arch/arm64/include/asm/suspend.h
> +++ b/arch/arm64/include/asm/suspend.h
> @@ -1,3 +1,5 @@
> +#include <linux/suspend.h>
> +
>  #ifndef __ASM_SUSPEND_H
>  #define __ASM_SUSPEND_H
>

Actually... even additional includes should go after the #ifdef
guards.
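
For reference, the conventional layout keeps the include inside the
guard - a minimal sketch of what the corrected header would look like:

	#ifndef __ASM_SUSPEND_H
	#define __ASM_SUSPEND_H

	#include <linux/suspend.h>

	/* ... */

	#endif /* __ASM_SUSPEND_H */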

Thanks,
									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-11-26 17:32   ` James Morse
@ 2015-12-03 12:09     ` Lorenzo Pieralisi
  -1 siblings, 0 replies; 50+ messages in thread
From: Lorenzo Pieralisi @ 2015-12-03 12:09 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Pavel Machek,
	Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

On Thu, Nov 26, 2015 at 05:32:47PM +0000, James Morse wrote:

[...]

> @@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
>  	if (PageHighMem(page))
>  		return get_highmem_page_buffer(page, ca);

I know it is not a problem for arm64, but you should export the
"restored" highmem pages list too because arch code may need to use
it for the same reasons this patch was implemented.

> -	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page))
> -		/* We have allocated the "original" page frame and we can
> -		 * use it directly to store the loaded page.
> -		 */
> -		return page_address(page);
> -
> -	/* The "original" page frame has not been allocated and we have to
> -	 * use a "safe" page frame to store the loaded page.
> -	 */
>  	pbe = chain_alloc(ca, sizeof(struct pbe));
>  	if (!pbe) {
>  		swsusp_free();
>  		return ERR_PTR(-ENOMEM);
>  	}
> -	pbe->orig_address = page_address(page);
> -	pbe->address = safe_pages_list;
> -	safe_pages_list = safe_pages_list->next;
> -	pbe->next = restore_pblist;
> -	restore_pblist = pbe;
> +
> +	if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page)) {
> +		/* We have allocated the "original" page frame and we can
> +		 * use it directly to store the loaded page.
> +		 */
> +		pbe->orig_address = NULL;
> +		pbe->address = page_address(page);
> +		pbe->next = restored_inplace_pblist;
> +		restored_inplace_pblist = pbe;
> +	} else {
> +		/* The "original" page frame has not been allocated and we
> +		 * have to use a "safe" page frame to store the loaded page.
> +		 */
> +		pbe->orig_address = page_address(page);
> +		pbe->address = safe_pages_list;
> +		safe_pages_list = safe_pages_list->next;
> +		pbe->next = restore_pblist;
> +		restore_pblist = pbe;
> +	}

This makes sense to me, more so given that the pbe data is already
pre-allocated in prepare_image() regardless (ie it should not change
the current behaviour, apart from calling chain_alloc() for every page
we are restoring); you are just adding a pointer to stash that
information. Hence, if it is ok with Pavel and Rafael, I think this
patch can be considered for merging.

Feedback appreciated.

Thanks,
Lorenzo

> +
>  	return pbe->address;
>  }
>  
> @@ -2513,6 +2526,7 @@ int snapshot_write_next(struct snapshot_handle *handle)
>  			chain_init(&ca, GFP_ATOMIC, PG_SAFE);
>  			memory_bm_position_reset(&orig_bm);
>  			restore_pblist = NULL;
> +			restored_inplace_pblist = NULL;
>  			handle->buffer = get_buffer(&orig_bm, &ca);
>  			handle->sync_read = 0;
>  			if (IS_ERR(handle->buffer))
> -- 
> 2.6.2
> 

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-12-03 12:09     ` Lorenzo Pieralisi
@ 2015-12-04 16:26       ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-04 16:26 UTC (permalink / raw)
  To: Lorenzo Pieralisi
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Pavel Machek,
	Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

Hi Lorenzo,

On 03/12/15 12:09, Lorenzo Pieralisi wrote:
> On Thu, Nov 26, 2015 at 05:32:47PM +0000, James Morse wrote:
>> @@ -2427,25 +2434,31 @@ static void *get_buffer(struct memory_bitmap *bm, struct chain_allocator *ca)
>>  	if (PageHighMem(page))
>>  		return get_highmem_page_buffer(page, ca);
> 
> I know it is not a problem for arm64, but you should export the
> "restored" highmem pages list too because arch code may need to use
> it for the same reasons this patch was implemented.

I'm not sure it can be for the same reasons:
kernel/power/snapshot.c:swap_two_pages_data() does the copy for highmem
pages, and then kunmap-s them... if the page is to be used for
execution, it needs to be re-mapped first, possibly at a different address.
The place to do cache maintenance is in the kunmap/kmap calls.
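
That is, any caller-side maintenance would have to happen while the
transient mapping still exists - a hypothetical sketch, not what the
snapshot code does today:

	void *va = kmap(page);

	memcpy(va, buf, PAGE_SIZE);
	/* only covers this mapping of the page */
	flush_icache_range((unsigned long)va,
			   (unsigned long)va + PAGE_SIZE);
	kunmap(page);

and even then it only helps if the page is later executed via the same
mapping.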


Thanks,

James

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-11-26 17:32   ` James Morse
@ 2015-12-05  9:35     ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-05  9:35 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Will Deacon,
	Sudeep Holla, Kevin Kang, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

On Thu 2015-11-26 17:32:47, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches before the processor can fetch them as new
> instructions.
> 
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Create a new list of pages that were restored in place, so
> that the arch code can perform this maintenance when necessary.

Umm. Could the copy function be modified to do the necessary
flushing, instead?

Alternatively, can you just clean the whole cache before jumping to
the new kernel?
								Pavel

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-12-05  9:35     ` Pavel Machek
@ 2015-12-07 11:28       ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-07 11:28 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Will Deacon,
	Sudeep Holla, Kevin Kang, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

Hi Pavel,

On 05/12/15 09:35, Pavel Machek wrote:
> On Thu 2015-11-26 17:32:47, James Morse wrote:
>> Some architectures require code written to memory as if it were data to be
>> 'cleaned' from any data caches before the processor can fetch them as new
>> instructions.
>>
>> During resume from hibernate, the snapshot code copies some pages directly,
>> meaning these architectures do not get a chance to perform their cache
>> maintenance. Create a new list of pages that were restored in place, so
>> that the arch code can perform this maintenance when necessary.
> 
> Umm. Could the copy function be modified to do the necessary
> flushing, instead?

The copying is done by load_image_lzo() using memcpy() if you have
compression enabled, and by load_image() using swap_read_page() if you
don't.

I didn't do it here as it would clean every page copied, which was the
worrying part of the previous approach. If there is an architecture
where this cache-clean operation is expensive, it would slow down
restore. I was trying to benchmark the impact of this on 32bit arm when
I spotted it was broken.

This allocated-same-page code path doesn't happen very often, so we
don't want this to have an impact on the 'normal' code path. On 32bit
arm I saw ~20 of these allocations out of ~60,000 pages.

This new way allocates a few extra pages during restore, and doesn't
assume that flush_cache_range() needs calling. It should have no impact
on architectures that aren't using the new list.


> Alternatively, can you just clean the whole cache before jumping to
> the new kernel?

On arm64, cleaning the whole cache means cleaning all of memory by
virtual address, which would be a high price to pay when we only need to
clean the pages we copied. The current implementation does clean all the
pages it copies; the problem is the ~0.03% that are copied behind its
back. This patch publishes where those pages are.



Thanks!

James


^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-12-07 11:28       ` James Morse
@ 2015-12-08  8:19         ` Pavel Machek
  -1 siblings, 0 replies; 50+ messages in thread
From: Pavel Machek @ 2015-12-08  8:19 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Will Deacon,
	Sudeep Holla, Kevin Kang, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

Hi!

> > Umm. Could the copy function be modified to do the necessary
> > flushing, instead?
> 
> The copying is done by load_image_lzo() using memcpy() if you have
> compression enabled, and by load_image() using swap_read_page() if you
> don't.
> 
> I didn't do it here as it would clean every page copied, which was the
> worrying part of the previous approach. If there is an architecture
> where this cache-clean operation is expensive, it would slow down
> restore. I was trying to benchmark the impact of this on 32bit arm when
> I spotted it was broken.

You have just loaded the page from slow storage (hard drive,
MMC). Cleaning a page should be pretty fast compared to that.

> This allocated-same-page code path doesn't happen very often, so we
> don't want this to have an impact on the 'normal' code path. On 32bit
> arm I saw ~20 of these allocations out of ~60,000 pages.
> 
> This new way allocates a few extra pages during restore, and doesn't
> assume that flush_cache_range() needs calling. It should have no impact
> on architectures that aren't using the new list.

It is also complex.

> > Alternatively, can you just clean the whole cache before jumping to
> > the new kernel?
> 
> On arm64, cleaning the whole cache means cleaning all of memory by
> virtual address, which would be a high price to pay when we only need to
> clean the pages we copied. The current implementation does clean all

How high a price to pay? I mean, hibernation/restore takes
_seconds_. Paying milliseconds to have cleaner code is an acceptable
price.


									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 10/10] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-12-01  9:31     ` Pavel Machek
@ 2015-12-08 10:39       ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-08 10:39 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	linux-pm

Hi Pavel,

On 01/12/15 09:31, Pavel Machek wrote:
>> diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
>> index 5faa3ce1fa3a..e75ad7aa268c 100644
>> --- a/arch/arm64/include/asm/suspend.h
>> +++ b/arch/arm64/include/asm/suspend.h
>> @@ -1,3 +1,5 @@
>> +#include <linux/suspend.h>
>> +
>>  #ifndef __ASM_SUSPEND_H
>>  #define __ASM_SUSPEND_H
>>
> 
> Actually... even additional includes should go after the #ifdef
> guards.

Oops - thanks for spotting that! I will send a fixup.

(I'm expecting to publish a new version if there are newer versions of
the four patches shared with kexec.)


Thanks,

James

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [PATCH] fixup! arm64: kernel: Add support for hibernate/suspend-to-disk
  2015-11-26 17:32   ` James Morse
@ 2015-12-08 11:48     ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-08 11:48 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, Marc Zyngier, Rafael J . Wysocki,
	Pavel Machek, linux-pm, James Morse

additional includes should go after the #ifdef guards

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/suspend.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index e75ad7aa268c..6682178783eb 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,8 +1,8 @@
-#include <linux/suspend.h>
-
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
+#include <linux/suspend.h>
+
 #define NR_CTX_REGS 10
 #define NR_CALLEE_SAVED_REGS 12
 
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 10/10] arm64: kernel: Add support for hibernate/suspend-to-disk.
  2015-11-26 17:32   ` James Morse
@ 2015-12-15 17:42     ` Catalin Marinas
  -1 siblings, 0 replies; 50+ messages in thread
From: Catalin Marinas @ 2015-12-15 17:42 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Mark Rutland, Rafael J . Wysocki,
	Lorenzo Pieralisi, linux-pm, Geoff Levand, Will Deacon,
	AKASHI Takahiro, Pavel Machek, Sudeep Holla, Marc Zyngier,
	wangfei, Kevin Kang

On Thu, Nov 26, 2015 at 05:32:48PM +0000, James Morse wrote:
> +static int copy_page_tables(pgd_t *new_pgd, unsigned long start_addr)
> +{
> +	int i;
> +	int rc = 0;
> +	pud_t *new_pud;
> +	pgd_t *old_pgd = pgd_offset_k(start_addr);
> +
> +	new_pgd += pgd_index(start_addr);
> +
> +	for (i = pgd_index(start_addr); i < PTRS_PER_PGD;
> +	     i++, start_addr += PGDIR_SIZE, old_pgd++, new_pgd++) {
> +		if (!pgd_val(*old_pgd))
> +			continue;
> +
> +		if (PTRS_PER_PUD != 1) {
> +			new_pud = (pud_t *)get_safe_page(GFP_ATOMIC);
> +			if (!new_pud) {
> +				rc = -ENOMEM;
> +				break;
> +			}
> +
> +			set_pgd(new_pgd, __pgd(virt_to_phys(new_pud)
> +					       | PUD_TYPE_TABLE));
> +		}
> +
> +		rc = copy_pud(new_pgd, old_pgd, &start_addr);
> +		if (rc)
> +			break;
> +
> +		start_addr &= PGDIR_MASK;
> +	}
> +
> +	return rc;
> +}

I think you could implement the above with fewer lines if you followed a
common pattern for page table walking based on do...while(),
pgd_addr_end() etc. See copy_page_range() as a (more complex) example.
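
Something like this shape, as an untested sketch (the pud allocation is
elided, and copy_pud() would need its signature changing to take the
range):

	pgd_t *old_pgd = pgd_offset_k(start);
	unsigned long addr = start, next;
	int rc = 0;

	new_pgd += pgd_index(start);
	do {
		next = pgd_addr_end(addr, end);
		if (pgd_none(*old_pgd))
			continue;
		rc = copy_pud(new_pgd, old_pgd, addr, next);
		if (rc)
			break;
	} while (new_pgd++, old_pgd++, addr = next, addr != end);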

-- 
Catalin

^ permalink raw reply	[flat|nested] 50+ messages in thread

* Re: [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code
  2015-12-08  8:19         ` Pavel Machek
@ 2015-12-16  9:55           ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-16  9:55 UTC (permalink / raw)
  To: Pavel Machek
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Will Deacon,
	Sudeep Holla, Kevin Kang, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, wangfei,
	Marc Zyngier

Hi Pavel,

On 08/12/15 08:19, Pavel Machek wrote:
>> I didn't do it here as it would clean every page copied, which was the
>> worrying part of the previous approach. If there is an architecture
>> where this cache-clean operation is expensive, it would slow down
>> restore. I was trying to benchmark the impact of this on 32bit arm when
>> I spotted it was broken.
> 
> You have just loaded the page from slow storage (hard drive,
> MMC). Cleaning a page should be pretty fast compared to that.

(One day I hope to own a laptop that hibernates to almost-memory speed
nvram!)

Speed is one issue - another is that I don't think it's correct to
assume that any architecture with a flush_icache_range() function
can/should have that called on any page of data.

There is also the possibility that an architecture needs to do something
other than flush_icache_range() on the pages that were copied. (I can
see lots of s390 hooks for 'page keys'; Intel's memory protection keys
may want something similar...)

This patch is the general-purpose fix, matching the existing list of
'these pages need copying' with a 'these pages were already copied'.
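
For arm64, consuming the list is then a short walk - a sketch using the
struct pbe fields this patch populates (the exact hook point is up to
the arch code):

	struct pbe *pbe;

	for (pbe = restored_inplace_pblist; pbe; pbe = pbe->next)
		flush_icache_range((unsigned long)pbe->address,
				   (unsigned long)pbe->address + PAGE_SIZE);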


>> This allocated-same-page code path doesn't happen very often, so we
>> don't want this to have an impact on the 'normal' code path. On 32bit
>> arm I saw ~20 of these allocations out of ~60,000 pages.
>>
>> This new way allocates a few extra pages during restore, and doesn't
>> assume that flush_cache_range() needs calling. It should have no impact
>> on architectures that aren't using the new list.
> 
> It is also complex.

It's symmetric with the existing restore_pblist code; I think this is
the simplest way of doing it.


>>> Alternatively, can you just clean the whole cache before jumping to
>>> the new kernel?
>>
>> On arm64, cleaning the whole cache means cleaning all of memory by
>> virtual address, which would be a high price to pay when we only need to
>> clean the pages we copied. The current implementation does clean all
> 
> How high a price to pay? I mean, hibernation/restore takes
> _seconds_. Paying milliseconds to have cleaner code is an acceptable
> price.

I agree, but the code to clean all 8GB of memory on Juno takes ~3
seconds, and this will probably scale linearly. We only need to clean
the ~250MB that was copied by hibernate (and of that, only the
executable pages). The sticking point is the few pages it copies but
doesn't tell us about.
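(Scaling the Juno figure linearly, cleaning just that ~250MB would cost
something like 0.1 seconds.)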

I will put together the flush_icache_range()-during-decompression
version of this patch... it looks like powerpc will suffer the most
from this: judging by the comments, its flush_icache_range() code
pushes data all the way out to memory...


Thanks,

James

^ permalink raw reply	[flat|nested] 50+ messages in thread

* [ALT-PATCH v3 9/10] PM / Hibernate: Call flush_icache_range() on pages restored in-place
  2015-11-26 17:32 ` James Morse
@ 2015-12-18 11:37   ` James Morse
  -1 siblings, 0 replies; 50+ messages in thread
From: James Morse @ 2015-12-18 11:37 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Rafael J . Wysocki, Pavel Machek, linux-pm, mpe, James Morse

Some architectures require code written to memory as if it were data to be
'cleaned' from any data caches before the processor can fetch them as new
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.

Signed-off-by: James Morse <james.morse@arm.com>
CC: Michael Ellerman <mpe@ellerman.id.au>

---
Hi,

This is an alternative version of 9/10 [0] requested by Pavel. It isn't
possible to know which pages are being restored in place, so this code
cleans all the pages restored. This is unnecessary 99.95% of the time,
as the pages will be copied (and cleaned again) by arch code walking
restore_pblist.

I may be being over-cautious, but this looks like it will have the most
impact on powerpc, where the cache will be cleaned to memory, a page at
a time.

Patch 10/10 of the series won't build with this 9/10; I will repost the
series if we choose to use this version.


James

[0] http://www.spinics.net/lists/arm-kernel/msg463600.html

 kernel/power/swap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 12cd989dadf6..a30645d2e93f 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -37,6 +37,14 @@
 #define HIBERNATE_SIG	"S1SUSPEND"
 
 /*
+ * When reading an {un,}compressed image, we may restore pages in place,
+ * in which case some architectures need these pages cleaning before they
+ * can be executed. We don't know which pages these may be, so clean the lot.
+ */
+bool clean_pages_on_read = false;
+bool clean_pages_on_decompress = false;
+
+/*
  *	The swap map is a data structure used for keeping track of each page
  *	written to a swap partition.  It consists of many swap_map_page
  *	structures that contain each an array of MAP_PAGE_ENTRIES swap entries.
@@ -241,6 +249,9 @@ static void hib_end_io(struct bio *bio)
 
 	if (bio_data_dir(bio) == WRITE)
 		put_page(page);
+	else if (clean_pages_on_read)
+		flush_icache_range((unsigned long)page_address(page),
+				   (unsigned long)page_address(page) + PAGE_SIZE);
 
 	if (bio->bi_error && !hb->error)
 		hb->error = bio->bi_error;
@@ -1049,6 +1060,7 @@ static int load_image(struct swap_map_handle *handle,
 
 	hib_init_batch(&hb);
 
+	clean_pages_on_read = true;
 	printk(KERN_INFO "PM: Loading image data pages (%u pages)...\n",
 		nr_to_read);
 	m = nr_to_read / 10;
@@ -1124,6 +1136,10 @@ static int lzo_decompress_threadfn(void *data)
 		d->unc_len = LZO_UNC_SIZE;
 		d->ret = lzo1x_decompress_safe(d->cmp + LZO_HEADER, d->cmp_len,
 		                               d->unc, &d->unc_len);
+		if (clean_pages_on_decompress)
+			flush_icache_range((unsigned long)d->unc,
+					   (unsigned long)d->unc + d->unc_len);
+
 		atomic_set(&d->stop, 1);
 		wake_up(&d->done);
 	}
@@ -1189,6 +1205,8 @@ static int load_image_lzo(struct swap_map_handle *handle,
 	}
 	memset(crc, 0, offsetof(struct crc_data, go));
 
+	clean_pages_on_decompress = true;
+
 	/*
 	 * Start the decompression threads.
 	 */
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 50+ messages in thread

end of thread, other threads:[~2015-12-18 11:39 UTC | newest]

Thread overview: 50+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-11-26 17:32 [PATCH v3 00/10] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2015-11-26 17:32 ` [PATCH v3 01/10] arm64: Fold proc-macros.S into assembler.h James Morse
2015-12-01  9:18   ` Pavel Machek
2015-11-26 17:32 ` [PATCH v3 02/10] arm64: Convert hcalls to use HVC immediate value James Morse
2015-11-26 17:32 ` [PATCH v3 03/10] arm64: Add new hcall HVC_CALL_FUNC James Morse
2015-11-26 17:32 ` [PATCH v3 04/10] arm64: kvm: allows kvm cpu hotplug James Morse
2015-11-26 17:32 ` [PATCH v3 05/10] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
2015-11-26 17:32 ` [PATCH v3 06/10] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
2015-11-26 17:32 ` [PATCH v3 07/10] arm64: kernel: Include _AC definition in page.h James Morse
2015-12-01  9:28   ` Pavel Machek
2015-11-26 17:32 ` [PATCH v3 08/10] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
2015-12-01  9:28   ` Pavel Machek
2015-11-26 17:32 ` [PATCH v3 09/10] PM / Hibernate: Publish pages restored in-place to arch code James Morse
2015-12-03 12:09   ` Lorenzo Pieralisi
2015-12-04 16:26     ` James Morse
2015-12-05  9:35   ` Pavel Machek
2015-12-07 11:28     ` James Morse
2015-12-08  8:19       ` Pavel Machek
2015-12-16  9:55         ` James Morse
2015-11-26 17:32 ` [PATCH v3 10/10] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2015-12-01  9:31   ` Pavel Machek
2015-12-08 10:39     ` James Morse
2015-12-08 11:48   ` [PATCH] fixup! " James Morse
2015-12-15 17:42   ` [PATCH v3 10/10] " Catalin Marinas
2015-12-18 11:37 ` [ALT-PATCH v3 9/10] PM / Hibernate: Call flush_icache_range() on pages restored in-place James Morse
