* [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
@ 2016-01-28 10:42 ` James Morse
  0 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, James Morse, linux-pm,
	Rafael J. Wysocki, Pavel Machek, Marc Zyngier

Hi all,

This version of hibernate is rebased onto v4.5-rc1, including updated patches
shared with kexec v13 [0] (1-5, 10).

I've also tested it with Ard's KASLR series [1].
The most significant change is the use of the arch-header to store the
location of the page tables and resume code when it has been moved by KASLR.

This implicitly allows resuming with a different kernel version (the work
needed to do this is the same). Patch 12 restricts resume to kernels with the
same MMU configuration (page size, number of page-table levels, etc.).
Patch 13 forbids restoring with a different kernel version altogether - do
we want to support this?

Due to the arch-header changes, I dropped Pavel's Ack on patch 12.

Parts of patch 5 were reworked from Akashi Takahiro's original version, to
avoid another round of changes to hyp-entry.S:el1_sync.

This series can be retrieved from:
git://linux-arm.org/linux-jm.git -b hibernate/v4

Changes since v3:
 * To work with kaslr:
   * hibernate now uses the arch-header to store the address of the page
     tables, and the address at which to re-enter the resumed kernel.
   * The el2 vectors are reloaded to point to the 'safe' page, then back to the
     resumed kernel.
   * PoC cleaning is done after the jump to the resumed kernel, as we don't
     know the restored kernel's boundaries in advance.
   * Some variables are accessed via aliases in the linear map, as the kernel
     text is not mapped during resume. restore_pblist is one example.
   * Execute the safe page from the bottom of memory, not the top, so that we
     can restore the resumed kernel's page tables directly.

 * Rebased the common patches onto v13 of kexec
 * Changed hibernate-asm.S to use the new copy_page macro.
 * Changed copy_p?d()s to use the do { } while(); pattern.
 * Added some missing barriers (dsb after ic ialluis).

Changes from v2:
 * Rewrote the restore-in-place patch: we can't clean pages in copy_page(),
   so we publish a list of pages for the architecture to clean
 * Added missing pgprot_val() in hibernate.c, spotted by STRICT_MM_TYPECHECKS
 * Removed 'tcr_set_idmap_t0sz' from proc.S - I missed this when rebasing
 * Re-imported the first four patches from kexec v12
 * Rebased onto v4.4-rc2
 * Changes from Pavel Machek's comments

Changes from v1:
 * Removed the for_each_process(){ for_each_vma() { } } cache cleaning and
   replaced it with a flush_icache_range() call in core hibernate code
 * Rebased onto the conflicting tcr_el1.t0sz bug-fix patch


[v3] http://www.spinics.net/lists/arm-kernel/msg463590.html
[v2] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html
[v1] http://lists.infradead.org/pipermail/linux-arm-kernel/2015-October/376450.html

[0] http://www.spinics.net/lists/arm-kernel/msg474889.html
[1] http://permalink.gmane.org/gmane.linux.kernel/2116531

AKASHI Takahiro (1):
  arm64: kvm: allows kvm cpu hotplug

Geoff Levand (5):
  arm64: Fold proc-macros.S into assembler.h
  arm64: Cleanup SCTLR flags
  arm64: Convert hcalls to use HVC immediate value
  arm64: Add new hcall HVC_CALL_FUNC
  arm64: Add new asm macro copy_page

James Morse (7):
  arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  arm64: Change cpu_resume() to enable mmu early then access sleep_sp by
    va
  arm64: kernel: Include _AC definition in page.h
  arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  PM / Hibernate: Call flush_icache_range() on pages restored in-place
  arm64: kernel: Add support for hibernate/suspend-to-disk
  arm64: hibernate: Prevent resume from a different kernel version

 arch/arm/include/asm/kvm_host.h    |  10 +-
 arch/arm/include/asm/kvm_mmu.h     |   1 +
 arch/arm/kvm/arm.c                 |  98 +++++---
 arch/arm/kvm/mmu.c                 |   5 +
 arch/arm64/Kconfig                 |   7 +
 arch/arm64/include/asm/assembler.h |  89 ++++++-
 arch/arm64/include/asm/kvm_arm.h   |  11 -
 arch/arm64/include/asm/kvm_host.h  |   1 -
 arch/arm64/include/asm/kvm_mmu.h   |  19 ++
 arch/arm64/include/asm/memory.h    |   3 +
 arch/arm64/include/asm/page.h      |   2 +
 arch/arm64/include/asm/suspend.h   |  30 ++-
 arch/arm64/include/asm/sysreg.h    |  19 +-
 arch/arm64/include/asm/virt.h      |  40 ++++
 arch/arm64/kernel/Makefile         |   1 +
 arch/arm64/kernel/asm-offsets.c    |  10 +-
 arch/arm64/kernel/head.S           |   5 +-
 arch/arm64/kernel/hibernate-asm.S  | 137 +++++++++++
 arch/arm64/kernel/hibernate.c      | 472 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S       |  43 +++-
 arch/arm64/kernel/setup.c          |   1 -
 arch/arm64/kernel/sleep.S          | 145 ++++--------
 arch/arm64/kernel/suspend.c        | 108 ++++-----
 arch/arm64/kernel/vmlinux.lds.S    |  15 ++
 arch/arm64/kvm/hyp-init.S          |  47 +++-
 arch/arm64/kvm/hyp.S               |   3 +-
 arch/arm64/kvm/hyp/hyp-entry.S     |   9 +-
 arch/arm64/mm/cache.S              |   2 -
 arch/arm64/mm/proc-macros.S        |  86 -------
 arch/arm64/mm/proc.S               |  38 +--
 kernel/power/swap.c                |  18 ++
 31 files changed, 1114 insertions(+), 361 deletions(-)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c
 delete mode 100644 arch/arm64/mm/proc-macros.S

-- 
2.6.2


^ permalink raw reply	[flat|nested] 34+ messages in thread


* [PATCH v4 01/13] arm64: Fold proc-macros.S into assembler.h
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

To allow the assembler macros defined in arch/arm64/mm/proc-macros.S to
be used outside the mm code, move the contents of proc-macros.S into
asm/assembler.h.  Also, delete proc-macros.S and fix up all references
to it.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Acked-by: Pavel Machek <pavel@ucw.cz>
[rebased, included dcache_by_line_op]
Signed-off-by: James Morse <james.morse@arm.com>
---

This patch is from v13 of kexec

 arch/arm64/include/asm/assembler.h | 70 ++++++++++++++++++++++++++++++-
 arch/arm64/mm/cache.S              |  2 -
 arch/arm64/mm/proc-macros.S        | 86 --------------------------------------
 arch/arm64/mm/proc.S               |  3 --
 4 files changed, 69 insertions(+), 92 deletions(-)
 delete mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index bb7b72734c24..137ee5b11eb7 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -1,5 +1,5 @@
 /*
- * Based on arch/arm/include/asm/assembler.h
+ * Based on arch/arm/include/asm/assembler.h, arch/arm/mm/proc-macros.S
  *
  * Copyright (C) 1996-2000 Russell King
  * Copyright (C) 2012 ARM Ltd.
@@ -23,6 +23,8 @@
 #ifndef __ASM_ASSEMBLER_H
 #define __ASM_ASSEMBLER_H
 
+#include <asm/asm-offsets.h>
+#include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
 
@@ -205,6 +207,72 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
+ */
+	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
+#ifndef CONFIG_ARM64_VA_BITS_48
+	ldr_l	\tmpreg, idmap_t0sz
+	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
+#endif
+	.endm
+
+/*
+ * Macro to perform a data cache maintenance for the interval
+ * [kaddr, kaddr + size)
+ *
+ * 	op:		operation passed to dc instruction
+ * 	domain:		domain used in dsb instruction
+ * 	kaddr:		starting virtual address of the region
+ * 	size:		size of the region
+ * 	Corrupts:	kaddr, size, tmp1, tmp2
+ */
+	.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
+	dcache_line_size \tmp1, \tmp2
+	add	\size, \kaddr, \size
+	sub	\tmp2, \tmp1, #1
+	bic	\kaddr, \kaddr, \tmp2
+9998:	dc	\op, \kaddr
+	add	\kaddr, \kaddr, \tmp1
+	cmp	\kaddr, \size
+	b.lo	9998b
+	dsb	\domain
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
index 6df07069a025..50ff9ba3a236 100644
--- a/arch/arm64/mm/cache.S
+++ b/arch/arm64/mm/cache.S
@@ -24,8 +24,6 @@
 #include <asm/cpufeature.h>
 #include <asm/alternative.h>
 
-#include "proc-macros.S"
-
 /*
  *	flush_icache_range(start,end)
  *
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
deleted file mode 100644
index 146bd99a7532..000000000000
--- a/arch/arm64/mm/proc-macros.S
+++ /dev/null
@@ -1,86 +0,0 @@
-/*
- * Based on arch/arm/mm/proc-macros.S
- *
- * Copyright (C) 2012 ARM Ltd.
- *
- * This program is free software; you can redistribute it and/or modify
- * it under the terms of the GNU General Public License version 2 as
- * published by the Free Software Foundation.
- *
- * This program is distributed in the hope that it will be useful,
- * but WITHOUT ANY WARRANTY; without even the implied warranty of
- * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
- * GNU General Public License for more details.
- *
- * You should have received a copy of the GNU General Public License
- * along with this program.  If not, see <http://www.gnu.org/licenses/>.
- */
-
-#include <asm/asm-offsets.h>
-#include <asm/thread_info.h>
-
-/*
- * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
- */
-	.macro	vma_vm_mm, rd, rn
-	ldr	\rd, [\rn, #VMA_VM_MM]
-	.endm
-
-/*
- * mmid - get context id from mm pointer (mm->context.id)
- */
-	.macro	mmid, rd, rn
-	ldr	\rd, [\rn, #MM_CONTEXT_ID]
-	.endm
-
-/*
- * dcache_line_size - get the minimum D-cache line size from the CTR register.
- */
-	.macro	dcache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	ubfm	\tmp, \tmp, #16, #19		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * icache_line_size - get the minimum I-cache line size from the CTR register.
- */
-	.macro	icache_line_size, reg, tmp
-	mrs	\tmp, ctr_el0			// read CTR
-	and	\tmp, \tmp, #0xf		// cache line size encoding
-	mov	\reg, #4			// bytes per word
-	lsl	\reg, \reg, \tmp		// actual cache line size
-	.endm
-
-/*
- * tcr_set_idmap_t0sz - update TCR.T0SZ so that we can load the ID map
- */
-	.macro	tcr_set_idmap_t0sz, valreg, tmpreg
-#ifndef CONFIG_ARM64_VA_BITS_48
-	ldr_l	\tmpreg, idmap_t0sz
-	bfi	\valreg, \tmpreg, #TCR_T0SZ_OFFSET, #TCR_TxSZ_WIDTH
-#endif
-	.endm
-
-/*
- * Macro to perform a data cache maintenance for the interval
- * [kaddr, kaddr + size)
- *
- * 	op:		operation passed to dc instruction
- * 	domain:		domain used in dsb instruciton
- * 	kaddr:		starting virtual address of the region
- * 	size:		size of the region
- * 	Corrupts: 	kaddr, size, tmp1, tmp2
- */
-	.macro dcache_by_line_op op, domain, kaddr, size, tmp1, tmp2
-	dcache_line_size \tmp1, \tmp2
-	add	\size, \kaddr, \size
-	sub	\tmp2, \tmp1, #1
-	bic	\kaddr, \kaddr, \tmp2
-9998:	dc	\op, \kaddr
-	add	\kaddr, \kaddr, \tmp1
-	cmp	\kaddr, \size
-	b.lo	9998b
-	dsb	\domain
-	.endm
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index a3d867e723b4..3c7d170de822 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,11 +23,8 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
-#include <asm/pgtable-hwdef.h>
 #include <asm/pgtable.h>
 
-#include "proc-macros.S"
-
 #ifdef CONFIG_ARM64_64K_PAGES
 #define TCR_TG_FLAGS	TCR_TG0_64K | TCR_TG1_64K
 #elif defined(CONFIG_ARM64_16K_PAGES)
-- 
2.6.2


* [PATCH v4 02/13] arm64: Cleanup SCTLR flags
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

We currently have macros defining flags for the arm64 SCTLR registers in
both kvm_arm.h and sysreg.h.  To clean things up and simplify, move the
definitions of the SCTLR_EL2 flags from kvm_arm.h to sysreg.h, rename any
SCTLR_EL1 or SCTLR_EL2 flags that are common to both registers to
SCTLR_ELx (with 'x' indicating a common flag), and fix up all files to
include the proper header or to use the new macro names.

Signed-off-by: Geoff Levand <geoff@infradead.org>
[Restored pgtable-hwdef.h include]
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch is from v13 of kexec

 arch/arm64/include/asm/kvm_arm.h | 11 -----------
 arch/arm64/include/asm/sysreg.h  | 19 +++++++++++++++----
 arch/arm64/kvm/hyp-init.S        |  5 +++--
 3 files changed, 18 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/include/asm/kvm_arm.h b/arch/arm64/include/asm/kvm_arm.h
index 738a95f93e49..afbf68d32ecb 100644
--- a/arch/arm64/include/asm/kvm_arm.h
+++ b/arch/arm64/include/asm/kvm_arm.h
@@ -83,17 +83,6 @@
 #define HCR_INT_OVERRIDE   (HCR_FMO | HCR_IMO)
 
 
-/* Hyp System Control Register (SCTLR_EL2) bits */
-#define SCTLR_EL2_EE	(1 << 25)
-#define SCTLR_EL2_WXN	(1 << 19)
-#define SCTLR_EL2_I	(1 << 12)
-#define SCTLR_EL2_SA	(1 << 3)
-#define SCTLR_EL2_C	(1 << 2)
-#define SCTLR_EL2_A	(1 << 1)
-#define SCTLR_EL2_M	1
-#define SCTLR_EL2_FLAGS	(SCTLR_EL2_M | SCTLR_EL2_A | SCTLR_EL2_C |	\
-			 SCTLR_EL2_SA | SCTLR_EL2_I)
-
 /* TCR_EL2 Registers bits */
 #define TCR_EL2_RES1	((1 << 31) | (1 << 23))
 #define TCR_EL2_TBI	(1 << 20)
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index 4aeebec3d882..99dac2f9f21a 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -82,10 +82,21 @@
 #define SET_PSTATE_PAN(x) __inst_arm(0xd5000000 | REG_PSTATE_PAN_IMM |\
 				     (!!x)<<8 | 0x1f)
 
-/* SCTLR_EL1 */
-#define SCTLR_EL1_CP15BEN	(0x1 << 5)
-#define SCTLR_EL1_SED		(0x1 << 8)
-#define SCTLR_EL1_SPAN		(0x1 << 23)
+/* Common SCTLR_ELx flags. */
+#define SCTLR_ELx_EE    (1 << 25)
+#define SCTLR_ELx_I	(1 << 12)
+#define SCTLR_ELx_SA	(1 << 3)
+#define SCTLR_ELx_C	(1 << 2)
+#define SCTLR_ELx_A	(1 << 1)
+#define SCTLR_ELx_M	1
+
+#define SCTLR_ELx_FLAGS	(SCTLR_ELx_M | SCTLR_ELx_A | SCTLR_ELx_C | \
+			 SCTLR_ELx_SA | SCTLR_ELx_I)
+
+/* SCTLR_EL1 specific flags. */
+#define SCTLR_EL1_SPAN		(1 << 23)
+#define SCTLR_EL1_SED		(1 << 8)
+#define SCTLR_EL1_CP15BEN	(1 << 5)
 
 
 /* id_aa64isar0 */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index 3e568dcd907b..dc6335a7353e 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -21,6 +21,7 @@
 #include <asm/kvm_arm.h>
 #include <asm/kvm_mmu.h>
 #include <asm/pgtable-hwdef.h>
+#include <asm/sysreg.h>
 
 	.text
 	.pushsection	.hyp.idmap.text, "ax"
@@ -114,8 +115,8 @@ __do_hyp_init:
 	dsb	sy
 
 	mrs	x4, sctlr_el2
-	and	x4, x4, #SCTLR_EL2_EE	// preserve endianness of EL2
-	ldr	x5, =SCTLR_EL2_FLAGS
+	and	x4, x4, #SCTLR_ELx_EE	// preserve endianness of EL2
+	ldr	x5, =SCTLR_ELx_FLAGS
 	orr	x4, x4, x5
 	msr	sctlr_el2, x4
 	isb
-- 
2.6.2


* [PATCH v4 03/13] arm64: Convert hcalls to use HVC immediate value
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

The existing arm64 hcall implementations are limited in that they only
allow for two distinct hcalls, selected by whether the x0 register is
zero or non-zero.  Also, the API of the hyp-stub exception vector routines
and the KVM exception vector routines differ: hyp-stub uses a non-zero
value in x0 to implement __hyp_set_vectors, whereas KVM uses it to
implement kvm_call_hyp.

To allow for additional hcalls to be defined and to make the arm64 hcall
API more consistent across exception vector routines, change the hcall
implementations to use the 16 bit immediate value of the HVC instruction
to specify the hcall type.

Define three new preprocessor macros HVC_CALL_HYP, HVC_GET_VECTORS, and
HVC_SET_VECTORS to be used as hcall type specifiers and convert the
existing __hyp_get_vectors(), __hyp_set_vectors() and kvm_call_hyp()
routines to use these new macros when executing an HVC call.  Also,
change the corresponding hyp-stub and KVM el1_sync exception vector
routines to use these new macros.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch is from v13 of kexec

 arch/arm64/include/asm/virt.h  | 27 +++++++++++++++++++++++++++
 arch/arm64/kernel/hyp-stub.S   | 32 +++++++++++++++++++++-----------
 arch/arm64/kvm/hyp.S           |  3 ++-
 arch/arm64/kvm/hyp/hyp-entry.S |  9 ++++++---
 4 files changed, 56 insertions(+), 15 deletions(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index 7a5df5252dd7..eb10368c329e 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -18,6 +18,33 @@
 #ifndef __ASM__VIRT_H
 #define __ASM__VIRT_H
 
+/*
+ * The arm64 hcall implementation uses the ISS field of the ESR_EL2 register to
+ * specify the hcall type.  The exception handlers are allowed to use registers
+ * x17 and x18 in their implementation.  Any routine issuing an hcall must not
+ * expect these registers to be preserved.
+ */
+
+/*
+ * HVC_CALL_HYP - Execute a hyp routine.
+ */
+
+#define HVC_CALL_HYP 0
+
+/*
+ * HVC_GET_VECTORS - Return the value of the vbar_el2 register.
+ */
+
+#define HVC_GET_VECTORS 1
+
+/*
+ * HVC_SET_VECTORS - Set the value of the vbar_el2 register.
+ *
+ * @x0: Physical address of the new vector table.
+ */
+
+#define HVC_SET_VECTORS 2
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index a272f335c289..017ab519aaf1 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -22,6 +22,7 @@
 #include <linux/irqchip/arm-gic-v3.h>
 
 #include <asm/assembler.h>
+#include <asm/kvm_arm.h>
 #include <asm/ptrace.h>
 #include <asm/virt.h>
 
@@ -53,14 +54,22 @@ ENDPROC(__hyp_stub_vectors)
 	.align 11
 
 el1_sync:
-	mrs	x1, esr_el2
-	lsr	x1, x1, #26
-	cmp	x1, #0x16
+	mrs	x18, esr_el2
+	lsr	x17, x18, #ESR_ELx_EC_SHIFT
+	and	x18, x18, #ESR_ELx_ISS_MASK
+
+	cmp	x17, #ESR_ELx_EC_HVC64
 	b.ne	2f				// Not an HVC trap
-	cbz	x0, 1f
-	msr	vbar_el2, x0			// Set vbar_el2
+
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
+	mrs	x0, vbar_el2
 	b	2f
-1:	mrs	x0, vbar_el2			// Return vbar_el2
+
+1:	cmp	x18, #HVC_SET_VECTORS
+	b.ne	2f
+	msr	vbar_el2, x0
+
 2:	eret
 ENDPROC(el1_sync)
 
@@ -100,11 +109,12 @@ ENDPROC(\label)
  * initialisation entry point.
  */
 
-ENTRY(__hyp_get_vectors)
-	mov	x0, xzr
-	// fall through
 ENTRY(__hyp_set_vectors)
-	hvc	#0
+	hvc	#HVC_SET_VECTORS
 	ret
-ENDPROC(__hyp_get_vectors)
 ENDPROC(__hyp_set_vectors)
+
+ENTRY(__hyp_get_vectors)
+	hvc	#HVC_GET_VECTORS
+	ret
+ENDPROC(__hyp_get_vectors)
diff --git a/arch/arm64/kvm/hyp.S b/arch/arm64/kvm/hyp.S
index 0ccdcbbef3c2..a598f9ec1e95 100644
--- a/arch/arm64/kvm/hyp.S
+++ b/arch/arm64/kvm/hyp.S
@@ -18,6 +18,7 @@
 #include <linux/linkage.h>
 
 #include <asm/assembler.h>
+#include <asm/virt.h>
 
 /*
  * u64 kvm_call_hyp(void *hypfn, ...);
@@ -38,6 +39,6 @@
  * arch/arm64/kernel/hyp_stub.S.
  */
 ENTRY(kvm_call_hyp)
-	hvc	#0
+	hvc	#HVC_CALL_HYP
 	ret
 ENDPROC(kvm_call_hyp)
diff --git a/arch/arm64/kvm/hyp/hyp-entry.S b/arch/arm64/kvm/hyp/hyp-entry.S
index 93e8d983c0bd..a1edf7743df6 100644
--- a/arch/arm64/kvm/hyp/hyp-entry.S
+++ b/arch/arm64/kvm/hyp/hyp-entry.S
@@ -43,6 +43,7 @@ el1_sync:				// Guest trapped into EL2
 
 	mrs	x1, esr_el2
 	lsr	x2, x1, #ESR_ELx_EC_SHIFT
+	and	x0, x1, #ESR_ELx_ISS_MASK
 
 	cmp	x2, #ESR_ELx_EC_HVC64
 	b.ne	el1_trap
@@ -51,14 +52,16 @@ el1_sync:				// Guest trapped into EL2
 	cbnz	x3, el1_trap		// called HVC
 
 	/* Here, we're pretty sure the host called HVC. */
+	mov	x18, x0
 	restore_x0_to_x3
 
-	/* Check for __hyp_get_vectors */
-	cbnz	x0, 1f
+	cmp	x18, #HVC_GET_VECTORS
+	b.ne	1f
 	mrs	x0, vbar_el2
 	b	2f
 
-1:	stp	lr, xzr, [sp, #-16]!
+1:     /* Default to HVC_CALL_HYP. */
+	push	lr, xzr
 
 	/*
 	 * Compute the function address in EL2, and shuffle the parameters.
-- 
2.6.2


* [PATCH v4 04/13] arm64: Add new hcall HVC_CALL_FUNC
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42 ` James Morse
  2016-02-02  6:53   ` AKASHI Takahiro
  -1 siblings, 1 reply; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
During CPU reset the CPU must be brought to the exception level it had on
entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
needed for this exception level switch.

To allow the HVC_CALL_FUNC exception vector to work without a stack, which
is needed to support an hcall at CPU reset, this implementation uses
register x18 to store the link register across the caller-provided
function.  This dictates that the caller-provided function must preserve
the contents of register x18.

Signed-off-by: Geoff Levand <geoff@infradead.org>
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch is from v13 of kexec

 arch/arm64/include/asm/virt.h | 13 +++++++++++++
 arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
index eb10368c329e..30700961f28c 100644
--- a/arch/arm64/include/asm/virt.h
+++ b/arch/arm64/include/asm/virt.h
@@ -45,6 +45,19 @@
 
 #define HVC_SET_VECTORS 2
 
+/*
+ * HVC_CALL_FUNC - Execute a function at EL2.
+ *
+ * @x0: Physical address of the function to be executed.
+ * @x1: Passed as the first argument to the function.
+ * @x2: Passed as the second argument to the function.
+ * @x3: Passed as the third argument to the function.
+ *
+ * The called function must preserve the contents of register x18.
+ */
+
+#define HVC_CALL_FUNC 3
+
 #define BOOT_CPU_MODE_EL1	(0xe11)
 #define BOOT_CPU_MODE_EL2	(0xe12)
 
diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
index 017ab519aaf1..e8febe90c036 100644
--- a/arch/arm64/kernel/hyp-stub.S
+++ b/arch/arm64/kernel/hyp-stub.S
@@ -67,8 +67,19 @@ el1_sync:
 	b	2f
 
 1:	cmp	x18, #HVC_SET_VECTORS
-	b.ne	2f
+	b.ne	1f
 	msr	vbar_el2, x0
+	b	2f
+
+1:	cmp	x18, #HVC_CALL_FUNC
+	b.ne	2f
+	mov	x18, lr
+	mov	lr, x0
+	mov	x0, x1
+	mov	x1, x2
+	mov	x2, x3
+	blr	lr
+	mov	lr, x18
 
 2:	eret
 ENDPROC(el1_sync)
-- 
2.6.2


* [PATCH v4 05/13] arm64: kvm: allows kvm cpu hotplug
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42 ` James Morse
  2016-02-02  6:46   ` AKASHI Takahiro
  -1 siblings, 1 reply; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: AKASHI Takahiro <takahiro.akashi@linaro.org>

The current kvm implementation on arm64 does cpu-specific initialization
at system boot, and has no way to gracefully shut down a core in terms of
kvm. This prevents kexec from rebooting the system at EL2.

This patch adds a cpu tear-down function and also moves the existing
cpu-init code into a separate function: kvm_arch_hardware_disable() and
kvm_arch_hardware_enable() respectively.
We no longer need the arm64-specific cpu hotplug hook.

Since this patch modifies common code between arm and arm64, one stub
definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
compilation errors.

Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
[Moved __kvm_hyp_reset() to use kvm_call_hyp(), instead of having its own
 dedicated entry point in el1_sync. Added some comments and a tlbi.]
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch is from v13 of kexec, see my [changes] above.

 arch/arm/include/asm/kvm_host.h   | 10 +++-
 arch/arm/include/asm/kvm_mmu.h    |  1 +
 arch/arm/kvm/arm.c                | 98 ++++++++++++++++++++++++---------------
 arch/arm/kvm/mmu.c                |  5 ++
 arch/arm64/include/asm/kvm_host.h |  1 -
 arch/arm64/include/asm/kvm_mmu.h  | 19 ++++++++
 arch/arm64/kvm/hyp-init.S         | 42 +++++++++++++++++
 7 files changed, 136 insertions(+), 40 deletions(-)

diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
index f9f27792d8ed..8af531d64771 100644
--- a/arch/arm/include/asm/kvm_host.h
+++ b/arch/arm/include/asm/kvm_host.h
@@ -220,6 +220,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
 }
 
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	/*
+	 * TODO
+	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
+	 */
+}
+
 static inline int kvm_arch_dev_ioctl_check_extension(long ext)
 {
 	return 0;
@@ -232,7 +241,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
 
 struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
index a520b7987a29..4fd9ddb48c0f 100644
--- a/arch/arm/include/asm/kvm_mmu.h
+++ b/arch/arm/include/asm/kvm_mmu.h
@@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
index dda1959f0dde..f060567e9c0a 100644
--- a/arch/arm/kvm/arm.c
+++ b/arch/arm/kvm/arm.c
@@ -16,7 +16,6 @@
  * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
  */
 
-#include <linux/cpu.h>
 #include <linux/cpu_pm.h>
 #include <linux/errno.h>
 #include <linux/err.h>
@@ -65,6 +64,8 @@ static DEFINE_SPINLOCK(kvm_vmid_lock);
 
 static bool vgic_present;
 
+static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
+
 static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
 {
 	BUG_ON(preemptible());
@@ -89,11 +90,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
 	return &kvm_arm_running_vcpu;
 }
 
-int kvm_arch_hardware_enable(void)
-{
-	return 0;
-}
-
 int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
 {
 	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
@@ -585,7 +581,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
 		/*
 		 * Re-check atomic conditions
 		 */
-		if (signal_pending(current)) {
+		if (unlikely(!__this_cpu_read(kvm_arm_hardware_enabled))) {
+			/* cpu has been torn down */
+			ret = 0;
+			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
+			run->fail_entry.hardware_entry_failure_reason
+					= (u64)-ENOEXEC;
+		} else if (signal_pending(current)) {
 			ret = -EINTR;
 			run->exit_reason = KVM_EXIT_INTR;
 		}
@@ -967,7 +969,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
 	}
 }
 
-static void cpu_init_hyp_mode(void *dummy)
+static void cpu_init_hyp_mode(void)
 {
 	phys_addr_t boot_pgd_ptr;
 	phys_addr_t pgd_ptr;
@@ -989,36 +991,61 @@ static void cpu_init_hyp_mode(void *dummy)
 	kvm_arm_init_debug();
 }
 
-static int hyp_init_cpu_notify(struct notifier_block *self,
-			       unsigned long action, void *cpu)
+static void cpu_reset_hyp_mode(void)
 {
-	switch (action) {
-	case CPU_STARTING:
-	case CPU_STARTING_FROZEN:
-		if (__hyp_get_vectors() == hyp_default_vectors)
-			cpu_init_hyp_mode(NULL);
-		break;
+	phys_addr_t boot_pgd_ptr;
+	phys_addr_t phys_idmap_start;
+
+	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
+	phys_idmap_start = kvm_get_idmap_start();
+
+	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
+}
+
+int kvm_arch_hardware_enable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
+		cpu_init_hyp_mode();
+		__this_cpu_write(kvm_arm_hardware_enabled, 1);
 	}
 
-	return NOTIFY_OK;
+	return 0;
 }
 
-static struct notifier_block hyp_init_cpu_nb = {
-	.notifier_call = hyp_init_cpu_notify,
-};
+void kvm_arch_hardware_disable(void)
+{
+	if (!__this_cpu_read(kvm_arm_hardware_enabled))
+		return;
+
+	cpu_reset_hyp_mode();
+	__this_cpu_write(kvm_arm_hardware_enabled, 0);
+}
 
 #ifdef CONFIG_CPU_PM
 static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
 				    unsigned long cmd,
 				    void *v)
 {
-	if (cmd == CPU_PM_EXIT &&
-	    __hyp_get_vectors() == hyp_default_vectors) {
-		cpu_init_hyp_mode(NULL);
+	/*
+	 * kvm_arm_hardware_enabled is left with its old value over
+	 * PM_ENTER->PM_EXIT. It is used to indicate PM_EXIT should
+	 * re-enable hyp.
+	 */
+	switch (cmd) {
+	case CPU_PM_ENTER:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_reset_hyp_mode();
+
+		return NOTIFY_OK;
+	case CPU_PM_EXIT:
+		if (__this_cpu_read(kvm_arm_hardware_enabled))
+			cpu_init_hyp_mode();
+
 		return NOTIFY_OK;
-	}
 
-	return NOTIFY_DONE;
+	default:
+		return NOTIFY_DONE;
+	}
 }
 
 static struct notifier_block hyp_init_cpu_pm_nb = {
@@ -1122,14 +1149,20 @@ static int init_hyp_mode(void)
 	}
 
 	/*
-	 * Execute the init code on each CPU.
+	 * Init this CPU temporarily to execute kvm_hyp_call()
+	 * during kvm_vgic_hyp_init().
 	 */
-	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
+	preempt_disable();
+	cpu_init_hyp_mode();
 
 	/*
 	 * Init HYP view of VGIC
 	 */
 	err = kvm_vgic_hyp_init();
+
+	cpu_reset_hyp_mode();
+	preempt_enable();
+
 	switch (err) {
 	case 0:
 		vgic_present = true;
@@ -1213,26 +1246,15 @@ int kvm_arch_init(void *opaque)
 		}
 	}
 
-	cpu_notifier_register_begin();
-
 	err = init_hyp_mode();
 	if (err)
 		goto out_err;
 
-	err = __register_cpu_notifier(&hyp_init_cpu_nb);
-	if (err) {
-		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
-		goto out_err;
-	}
-
-	cpu_notifier_register_done();
-
 	hyp_cpu_pm_init();
 
 	kvm_coproc_table_init();
 	return 0;
 out_err:
-	cpu_notifier_register_done();
 	return err;
 }
 
diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
index aba61fd3697a..7a3aed62499a 100644
--- a/arch/arm/kvm/mmu.c
+++ b/arch/arm/kvm/mmu.c
@@ -1643,6 +1643,11 @@ phys_addr_t kvm_get_idmap_vector(void)
 	return hyp_idmap_vector;
 }
 
+phys_addr_t kvm_get_idmap_start(void)
+{
+	return hyp_idmap_start;
+}
+
 int kvm_mmu_init(void)
 {
 	int err;
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 689d4c95e12f..7d6d75616fb5 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -332,7 +332,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
 		     hyp_stack_ptr, vector_ptr);
 }
 
-static inline void kvm_arch_hardware_disable(void) {}
 static inline void kvm_arch_hardware_unsetup(void) {}
 static inline void kvm_arch_sync_events(struct kvm *kvm) {}
 static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
index 736433912a1e..1d48208a904a 100644
--- a/arch/arm64/include/asm/kvm_mmu.h
+++ b/arch/arm64/include/asm/kvm_mmu.h
@@ -99,6 +99,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
 phys_addr_t kvm_mmu_get_httbr(void);
 phys_addr_t kvm_mmu_get_boot_httbr(void);
 phys_addr_t kvm_get_idmap_vector(void);
+phys_addr_t kvm_get_idmap_start(void);
 int kvm_mmu_init(void);
 void kvm_clear_hyp_idmap(void);
 
@@ -310,5 +311,23 @@ static inline unsigned int kvm_get_vmid_bits(void)
 	return (cpuid_feature_extract_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
 }
 
+void __kvm_hyp_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
+
+/*
+ * Call reset code, and switch back to stub hyp vectors. We need to execute
+ * __kvm_hyp_reset() from the trampoline page, we calculate its address here.
+ */
+static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
+					phys_addr_t phys_idmap_start)
+{
+	unsigned long trampoline_hyp_reset;
+
+	trampoline_hyp_reset = TRAMPOLINE_VA +
+			       ((unsigned long)__kvm_hyp_reset & ~PAGE_MASK);
+
+	kvm_call_hyp((void *)trampoline_hyp_reset,
+		     boot_pgd_ptr, phys_idmap_start);
+}
+
 #endif /* __ASSEMBLY__ */
 #endif /* __ARM64_KVM_MMU_H__ */
diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
index dc6335a7353e..d20d86c7f9d8 100644
--- a/arch/arm64/kvm/hyp-init.S
+++ b/arch/arm64/kvm/hyp-init.S
@@ -150,6 +150,48 @@ merged:
 	eret
 ENDPROC(__kvm_hyp_init)
 
+	/*
+	 * x0: HYP boot pgd
+	 * x1: HYP phys_idmap_start
+	 */
+ENTRY(__kvm_hyp_reset)
+	/*
+	 * Retrieve lr from the stack (pushed by el1_sync()), so we can eret
+	 * from here.
+	 */
+	ldp	lr, xzr, [sp], #16
+
+	/* We're in trampoline code in VA, switch back to boot page tables */
+	msr	ttbr0_el2, x0
+	isb
+
+	/* Ensure the PA branch doesn't find a stale tlb entry. */
+	tlbi	alle2
+	dsb	sy
+
+	/* Branch into PA space */
+	adr	x0, 1f
+	bfi	x1, x0, #0, #PAGE_SHIFT
+	br	x1
+
+	/* We're now in idmap, disable MMU */
+1:	mrs	x0, sctlr_el2
+	ldr	x1, =SCTLR_ELx_FLAGS
+	bic	x0, x0, x1		// Clear SCTLR_ELx.M etc.
+	msr	sctlr_el2, x0
+	isb
+
+	/* Invalidate the old TLBs */
+	tlbi	alle2
+	dsb	sy
+
+	/* Install stub vectors */
+	adr_l	x0, __hyp_stub_vectors
+	msr	vbar_el2, x0
+
+	eret
+ENDPROC(__kvm_hyp_reset)
+
 	.ltorg
 
 	.popsection
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 06/13] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter().
  2016-01-28 10:42 ` James Morse
                   ` (5 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  0 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

Hibernate could make use of the cpu_suspend() code to save/restore cpu
state; however, it needs to be able to return '0' from the 'finisher'.

Rework cpu_suspend() so that the finisher is called from C code,
independently from the save/restore of cpu state. Space to save the context
in is allocated in the caller's stack frame, and passed into
__cpu_suspend_enter().

Hibernate's use of this API will look like a copy of the cpu_suspend()
function.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 arch/arm64/include/asm/suspend.h | 20 +++++++++
 arch/arm64/kernel/asm-offsets.c  |  2 +
 arch/arm64/kernel/sleep.S        | 93 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 90 ++++++++++++++++++++++----------------
 4 files changed, 109 insertions(+), 96 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 59a5b0f1e81c..ccd26da93d03 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -2,6 +2,7 @@
 #define __ASM_SUSPEND_H
 
 #define NR_CTX_REGS 11
+#define NR_CALLEE_SAVED_REGS 12
 
 /*
  * struct cpu_suspend_ctx must be 16-byte aligned since it is allocated on
@@ -21,6 +22,25 @@ struct sleep_save_sp {
 	phys_addr_t save_ptr_stash_phys;
 };
 
+/*
+ * Memory to save the cpu state is allocated on the stack by
+ * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
+ * This data must survive until cpu_resume() is called.
+ *
+ * This struct describes the size and the layout of the saved cpu state.
+ * The layout of the callee_saved_regs is defined by the implementation
+ * of __cpu_suspend_enter(), and cpu_resume(). This struct must be passed
+ * in by the caller as __cpu_suspend_enter()'s stack-frame is gone once it
+ * returns, and the data would be subsequently corrupted by the call to the
+ * finisher.
+ */
+struct sleep_stack_data {
+	struct cpu_suspend_ctx	system_regs;
+	unsigned long		callee_saved_regs[NR_CALLEE_SAVED_REGS];
+};
+
 extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
+int __cpu_suspend_enter(struct sleep_stack_data *state);
+void __cpu_suspend_exit(struct mm_struct *mm);
 #endif
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index fffa4ac6c25a..7d7994774e82 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -123,6 +123,8 @@ int main(void)
   DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
   DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
   DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
+  DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
+  DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
   DEFINE(ARM_SMCCC_RES_X0_OFFS,	offsetof(struct arm_smccc_res, a0));
   DEFINE(ARM_SMCCC_RES_X2_OFFS,	offsetof(struct arm_smccc_res, a2));
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index e33fe33876ab..dca81612fe90 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -49,37 +49,30 @@
 	orr	\dst, \dst, \mask		// dst|=(aff3>>rs3)
 	.endm
 /*
- * Save CPU state for a suspend and execute the suspend finisher.
- * On success it will return 0 through cpu_resume - ie through a CPU
- * soft/hard reboot from the reset vector.
- * On failure it returns the suspend finisher return value or force
- * -EOPNOTSUPP if the finisher erroneously returns 0 (the suspend finisher
- * is not allowed to return, if it does this must be considered failure).
- * It saves callee registers, and allocates space on the kernel stack
- * to save the CPU specific registers + some other data for resume.
+ * Save CPU state in the provided sleep_stack_data area, and publish its
+ * location for cpu_resume()'s use in sleep_save_stash.
  *
- *  x0 = suspend finisher argument
- *  x1 = suspend finisher function pointer
+ * cpu_resume() will restore this saved state, and return. Because the
+ * link-register is saved and restored, it will appear to return from this
+ * function. So that the caller can tell the suspend/resume paths apart,
+ * __cpu_suspend_enter() will always return a non-zero value, whereas the
+ * path through cpu_resume() will return 0.
+ *
+ *  x0 = struct sleep_stack_data area
  */
 ENTRY(__cpu_suspend_enter)
-	stp	x29, lr, [sp, #-96]!
-	stp	x19, x20, [sp,#16]
-	stp	x21, x22, [sp,#32]
-	stp	x23, x24, [sp,#48]
-	stp	x25, x26, [sp,#64]
-	stp	x27, x28, [sp,#80]
-	/*
-	 * Stash suspend finisher and its argument in x20 and x19
-	 */
-	mov	x19, x0
-	mov	x20, x1
+	stp	x29, lr, [x0, #SLEEP_STACK_DATA_CALLEE_REGS]
+	stp	x19, x20, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+16]
+	stp	x21, x22, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+32]
+	stp	x23, x24, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+48]
+	stp	x25, x26, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+64]
+	stp	x27, x28, [x0,#SLEEP_STACK_DATA_CALLEE_REGS+80]
+
+	/* save the sp in cpu_suspend_ctx */
 	mov	x2, sp
-	sub	sp, sp, #CPU_SUSPEND_SZ	// allocate cpu_suspend_ctx
-	mov	x0, sp
-	/*
-	 * x0 now points to struct cpu_suspend_ctx allocated on the stack
-	 */
-	str	x2, [x0, #CPU_CTX_SP]
+	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
+
+	/* find the mpidr_hash */
 	ldr	x1, =sleep_save_sp
 	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
 	mrs	x7, mpidr_el1
@@ -93,34 +86,11 @@ ENTRY(__cpu_suspend_enter)
 	ldp	w5, w6, [x9, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
+
+	push	x29, lr
 	bl	__cpu_suspend_save
-	/*
-	 * Grab suspend finisher in x20 and its argument in x19
-	 */
-	mov	x0, x19
-	mov	x1, x20
-	/*
-	 * We are ready for power down, fire off the suspend finisher
-	 * in x1, with argument in x0
-	 */
-	blr	x1
-        /*
-	 * Never gets here, unless suspend finisher fails.
-	 * Successful cpu_suspend should return from cpu_resume, returning
-	 * through this code path is considered an error
-	 * If the return value is set to 0 force x0 = -EOPNOTSUPP
-	 * to make sure a proper error condition is propagated
-	 */
-	cmp	x0, #0
-	mov	x3, #-EOPNOTSUPP
-	csel	x0, x3, x0, eq
-	add	sp, sp, #CPU_SUSPEND_SZ	// rewind stack pointer
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
+	pop	x29, lr
+	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
@@ -146,12 +116,6 @@ ENDPROC(cpu_resume_mmu)
 	.popsection
 cpu_resume_after_mmu:
 	mov	x0, #0			// return zero on success
-	ldp	x19, x20, [sp, #16]
-	ldp	x21, x22, [sp, #32]
-	ldp	x23, x24, [sp, #48]
-	ldp	x25, x26, [sp, #64]
-	ldp	x27, x28, [sp, #80]
-	ldp	x29, lr, [sp], #96
 	ret
 ENDPROC(cpu_resume_after_mmu)
 
@@ -168,6 +132,8 @@ ENTRY(cpu_resume)
         /* x7 contains hash index, let's use it to grab context pointer */
 	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
 	ldr	x0, [x0, x7, lsl #3]
+	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
 	/* load physical address of identity map page table in x1 */
@@ -181,5 +147,12 @@ ENTRY(cpu_resume)
 	 * pointer and x1 to contain physical address of 1:1 page tables
 	 */
 	bl	cpu_do_resume		// PC relative jump, MMU off
+	/* Can't access these by physical address once the MMU is on */
+	ldp	x19, x20, [x29, #16]
+	ldp	x21, x22, [x29, #32]
+	ldp	x23, x24, [x29, #48]
+	ldp	x25, x26, [x29, #64]
+	ldp	x27, x28, [x29, #80]
+	ldp	x29, lr, [x29]
 	b	cpu_resume_mmu		// Resume MMU, never returns
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index 1095aa483a1c..fbc14774af6f 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -10,22 +10,22 @@
 #include <asm/suspend.h>
 #include <asm/tlbflush.h>
 
-extern int __cpu_suspend_enter(unsigned long arg, int (*fn)(unsigned long));
+
 /*
  * This is called by __cpu_suspend_enter() to save the state, and do whatever
  * flushing is required to ensure that when the CPU goes to sleep we have
  * the necessary data available when the caches are not searched.
  *
- * ptr: CPU context virtual address
+ * ptr: sleep_stack_data containing cpu state virtual address.
  * save_ptr: address of the location where the context physical address
  *           must be saved
  */
-void notrace __cpu_suspend_save(struct cpu_suspend_ctx *ptr,
+void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
 				phys_addr_t *save_ptr)
 {
 	*save_ptr = virt_to_phys(ptr);
 
-	cpu_do_suspend(ptr);
+	cpu_do_suspend(&ptr->system_regs);
 	/*
 	 * Only flush the context that must be retrieved with the MMU
 	 * off. VA primitives ensure the flush is applied to all
@@ -51,6 +51,41 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 	hw_breakpoint_restore = hw_bp_restore;
 }
 
+void notrace __cpu_suspend_exit(struct mm_struct *mm)
+{
+	/*
+	 * We are resuming from reset with TTBR0_EL1 set to the
+	 * idmap to enable the MMU; set the TTBR0 to the reserved
+	 * page tables to prevent speculative TLB allocations, flush
+	 * the local tlb and set the default tcr_el1.t0sz so that
+	 * the TTBR0 address space set-up is properly restored.
+	 * If the current active_mm != &init_mm we entered cpu_suspend
+	 * with mappings in TTBR0 that must be restored, so we switch
+	 * them back to complete the address space configuration
+	 * restoration before returning.
+	 */
+	cpu_set_reserved_ttbr0();
+	local_flush_tlb_all();
+	cpu_set_default_tcr_t0sz();
+
+	if (mm != &init_mm)
+		cpu_switch_mm(mm->pgd, mm);
+
+	/*
+	 * Restore per-cpu offset before any kernel
+	 * subsystem relying on it has a chance to run.
+	 */
+	set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+
+	/*
+	 * Restore HW breakpoint registers to sane values
+	 * before debug exceptions are possibly reenabled
+	 * through local_dbg_restore.
+	 */
+	if (hw_breakpoint_restore)
+		hw_breakpoint_restore(NULL);
+}
+
 /*
  * cpu_suspend
  *
@@ -60,9 +95,10 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
  */
 int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 {
-	struct mm_struct *mm = current->active_mm;
-	int ret;
+	int ret = 0;
 	unsigned long flags;
+	struct sleep_stack_data state;
+	struct mm_struct *mm = current->active_mm;
 
 	/*
 	 * From this point debug exceptions are disabled to prevent
@@ -84,39 +120,21 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	 * page tables, so that the thread address space is properly
 	 * set-up on function return.
 	 */
-	ret = __cpu_suspend_enter(arg, fn);
-	if (ret == 0) {
-		/*
-		 * We are resuming from reset with TTBR0_EL1 set to the
-		 * idmap to enable the MMU; set the TTBR0 to the reserved
-		 * page tables to prevent speculative TLB allocations, flush
-		 * the local tlb and set the default tcr_el1.t0sz so that
-		 * the TTBR0 address space set-up is properly restored.
-		 * If the current active_mm != &init_mm we entered cpu_suspend
-		 * with mappings in TTBR0 that must be restored, so we switch
-		 * them back to complete the address space configuration
-		 * restoration before returning.
-		 */
-		cpu_set_reserved_ttbr0();
-		local_flush_tlb_all();
-		cpu_set_default_tcr_t0sz();
-
-		if (mm != &init_mm)
-			cpu_switch_mm(mm->pgd, mm);
-
-		/*
-		 * Restore per-cpu offset before any kernel
-		 * subsystem relying on it has a chance to run.
-		 */
-		set_my_cpu_offset(per_cpu_offset(smp_processor_id()));
+	if (__cpu_suspend_enter(&state)) {
+		/* Call the suspend finisher */
+		ret = fn(arg);
 
 		/*
-		 * Restore HW breakpoint registers to sane values
-		 * before debug exceptions are possibly reenabled
-		 * through local_dbg_restore.
+		 * Never gets here, unless the suspend finisher fails.
+		 * Successful cpu_suspend() should return from cpu_resume(),
+		 * returning through this code path is considered an error.
+		 * If the return value is set to 0 force ret = -EOPNOTSUPP
+		 * to make sure a proper error condition is propagated
 		 */
-		if (hw_breakpoint_restore)
-			hw_breakpoint_restore(NULL);
+		if (!ret)
+			ret = -EOPNOTSUPP;
+	} else {
+		__cpu_suspend_exit(mm);
 	}
 
 	unpause_graph_tracing();
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-01-28 10:42 ` James Morse
                   ` (6 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  2016-02-05 16:26   ` Lorenzo Pieralisi
  0 siblings, 1 reply; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack can
be accessed by VA, which avoids the need to convert addresses and clean to
the PoC on the suspend path.

MMU setup is shared with the boot path, meaning the swapper_pg_dir is
restored directly: ttbr1_el1 is no longer saved/restored.

struct sleep_save_sp is removed and replaced with a single array of
pointers.

cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
__cpu_setup(). However, these values all contain res0 bits that may be used
to enable future features.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/suspend.h |  7 +-----
 arch/arm64/kernel/asm-offsets.c  |  3 ---
 arch/arm64/kernel/head.S         |  2 +-
 arch/arm64/kernel/setup.c        |  1 -
 arch/arm64/kernel/sleep.S        | 54 ++++++++++++++--------------------------
 arch/arm64/kernel/suspend.c      | 52 +++++---------------------------------
 arch/arm64/mm/proc.S             | 35 +++++++-------------------
 7 files changed, 36 insertions(+), 118 deletions(-)

diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index ccd26da93d03..5faa3ce1fa3a 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -1,7 +1,7 @@
 #ifndef __ASM_SUSPEND_H
 #define __ASM_SUSPEND_H
 
-#define NR_CTX_REGS 11
+#define NR_CTX_REGS 10
 #define NR_CALLEE_SAVED_REGS 12
 
 /*
@@ -17,11 +17,6 @@ struct cpu_suspend_ctx {
 	u64 sp;
 } __aligned(16);
 
-struct sleep_save_sp {
-	phys_addr_t *save_ptr_stash;
-	phys_addr_t save_ptr_stash_phys;
-};
-
 /*
  * Memory to save the cpu state is allocated on the stack by
  * __cpu_suspend_enter()'s caller, and populated by __cpu_suspend_enter().
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index 7d7994774e82..d6119c57f28a 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -120,9 +120,6 @@ int main(void)
   DEFINE(CPU_CTX_SP,		offsetof(struct cpu_suspend_ctx, sp));
   DEFINE(MPIDR_HASH_MASK,	offsetof(struct mpidr_hash, mask));
   DEFINE(MPIDR_HASH_SHIFTS,	offsetof(struct mpidr_hash, shift_aff));
-  DEFINE(SLEEP_SAVE_SP_SZ,	sizeof(struct sleep_save_sp));
-  DEFINE(SLEEP_SAVE_SP_PHYS,	offsetof(struct sleep_save_sp, save_ptr_stash_phys));
-  DEFINE(SLEEP_SAVE_SP_VIRT,	offsetof(struct sleep_save_sp, save_ptr_stash));
   DEFINE(SLEEP_STACK_DATA_SYSTEM_REGS,	offsetof(struct sleep_stack_data, system_regs));
   DEFINE(SLEEP_STACK_DATA_CALLEE_REGS,	offsetof(struct sleep_stack_data, callee_saved_regs));
 #endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index ffe9c2b6431b..85db49d3b191 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -626,7 +626,7 @@ ENDPROC(__secondary_switched)
  * If it isn't, park the CPU
  */
 	.section	".idmap.text", "ax"
-__enable_mmu:
+ENTRY(__enable_mmu)
 	mrs	x1, ID_AA64MMFR0_EL1
 	ubfx	x2, x1, #ID_AA64MMFR0_TGRAN_SHIFT, 4
 	cmp	x2, #ID_AA64MMFR0_TGRAN_SUPPORTED
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 8119479147db..1c4bc180efbe 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -174,7 +174,6 @@ static void __init smp_build_mpidr_hash(void)
 	 */
 	if (mpidr_hash_size() > 4 * num_possible_cpus())
 		pr_warn("Large number of MPIDR hash buckets detected\n");
-	__flush_dcache_area(&mpidr_hash, sizeof(struct mpidr_hash));
 }
 
 static void __init setup_machine_fdt(phys_addr_t dt_phys)
diff --git a/arch/arm64/kernel/sleep.S b/arch/arm64/kernel/sleep.S
index dca81612fe90..0e2b36f1fb44 100644
--- a/arch/arm64/kernel/sleep.S
+++ b/arch/arm64/kernel/sleep.S
@@ -73,8 +73,8 @@ ENTRY(__cpu_suspend_enter)
 	str	x2, [x0, #SLEEP_STACK_DATA_SYSTEM_REGS + CPU_CTX_SP]
 
 	/* find the mpidr_hash */
-	ldr	x1, =sleep_save_sp
-	ldr	x1, [x1, #SLEEP_SAVE_SP_VIRT]
+	ldr	x1, =sleep_save_stash
+	ldr	x1, [x1]
 	mrs	x7, mpidr_el1
 	ldr	x9, =mpidr_hash
 	ldr	x10, [x9, #MPIDR_HASH_MASK]
@@ -87,40 +87,26 @@ ENTRY(__cpu_suspend_enter)
 	compute_mpidr_hash x8, x3, x4, x5, x6, x7, x10
 	add	x1, x1, x8, lsl #3
 
+	str	x0, [x1]
+	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	push	x29, lr
-	bl	__cpu_suspend_save
+	bl	cpu_do_suspend
 	pop	x29, lr
 	mov	x0, #1
 	ret
 ENDPROC(__cpu_suspend_enter)
 	.ltorg
 
-/*
- * x0 must contain the sctlr value retrieved from restored context
- */
-	.pushsection	".idmap.text", "ax"
-ENTRY(cpu_resume_mmu)
-	ldr	x3, =cpu_resume_after_mmu
-	msr	sctlr_el1, x0		// restore sctlr_el1
-	isb
-	/*
-	 * Invalidate the local I-cache so that any instructions fetched
-	 * speculatively from the PoC are discarded, since they may have
-	 * been dynamically patched at the PoU.
-	 */
-	ic	iallu
-	dsb	nsh
-	isb
-	br	x3			// global jump to virtual address
-ENDPROC(cpu_resume_mmu)
-	.popsection
-cpu_resume_after_mmu:
-	mov	x0, #0			// return zero on success
-	ret
-ENDPROC(cpu_resume_after_mmu)
-
 ENTRY(cpu_resume)
 	bl	el2_setup		// if in EL2 drop to EL1 cleanly
+	/* enable the MMU early - so we can access sleep_save_stash by va */
+	adr_l	lr, __enable_mmu	/* __cpu_setup will return here */
+	ldr	x27, =_cpu_resume	/* __enable_mmu will branch here */
+	adrp	x25, idmap_pg_dir
+	adrp	x26, swapper_pg_dir
+	b	__cpu_setup
+
+ENTRY(_cpu_resume)
 	mrs	x1, mpidr_el1
 	adrp	x8, mpidr_hash
 	add x8, x8, #:lo12:mpidr_hash // x8 = struct mpidr_hash phys address
@@ -130,29 +116,27 @@ ENTRY(cpu_resume)
 	ldp	w5, w6, [x8, #(MPIDR_HASH_SHIFTS + 8)]
 	compute_mpidr_hash x7, x3, x4, x5, x6, x1, x2
         /* x7 contains hash index, let's use it to grab context pointer */
-	ldr_l	x0, sleep_save_sp + SLEEP_SAVE_SP_PHYS
+	ldr_l	x0, sleep_save_stash
 	ldr	x0, [x0, x7, lsl #3]
 	add	x29, x0, #SLEEP_STACK_DATA_CALLEE_REGS
 	add	x0, x0, #SLEEP_STACK_DATA_SYSTEM_REGS
 	/* load sp from context */
 	ldr	x2, [x0, #CPU_CTX_SP]
-	/* load physical address of identity map page table in x1 */
-	adrp	x1, idmap_pg_dir
 	mov	sp, x2
 	/* save thread_info */
 	and	x2, x2, #~(THREAD_SIZE - 1)
 	msr	sp_el0, x2
 	/*
-	 * cpu_do_resume expects x0 to contain context physical address
-	 * pointer and x1 to contain physical address of 1:1 page tables
+	 * cpu_do_resume expects x0 to contain context address pointer
 	 */
-	bl	cpu_do_resume		// PC relative jump, MMU off
-	/* Can't access these by physical address once the MMU is on */
+	bl	cpu_do_resume
+
 	ldp	x19, x20, [x29, #16]
 	ldp	x21, x22, [x29, #32]
 	ldp	x23, x24, [x29, #48]
 	ldp	x25, x26, [x29, #64]
 	ldp	x27, x28, [x29, #80]
 	ldp	x29, lr, [x29]
-	b	cpu_resume_mmu		// Resume MMU, never returns
+	mov	x0, #0
+	ret
 ENDPROC(cpu_resume)
diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
index fbc14774af6f..800fde85cd75 100644
--- a/arch/arm64/kernel/suspend.c
+++ b/arch/arm64/kernel/suspend.c
@@ -12,30 +12,6 @@
 
 
 /*
- * This is called by __cpu_suspend_enter() to save the state, and do whatever
- * flushing is required to ensure that when the CPU goes to sleep we have
- * the necessary data available when the caches are not searched.
- *
- * ptr: sleep_stack_data containing cpu state virtual address.
- * save_ptr: address of the location where the context physical address
- *           must be saved
- */
-void notrace __cpu_suspend_save(struct sleep_stack_data *ptr,
-				phys_addr_t *save_ptr)
-{
-	*save_ptr = virt_to_phys(ptr);
-
-	cpu_do_suspend(&ptr->system_regs);
-	/*
-	 * Only flush the context that must be retrieved with the MMU
-	 * off. VA primitives ensure the flush is applied to all
-	 * cache levels so context is pushed to DRAM.
-	 */
-	__flush_dcache_area(ptr, sizeof(*ptr));
-	__flush_dcache_area(save_ptr, sizeof(*save_ptr));
-}
-
-/*
  * This hook is provided so that cpu_suspend code can restore HW
  * breakpoints as early as possible in the resume path, before reenabling
  * debug exceptions. Code cannot be run from a CPU PM notifier since by the
@@ -54,20 +30,9 @@ void __init cpu_suspend_set_dbg_restorer(void (*hw_bp_restore)(void *))
 void notrace __cpu_suspend_exit(struct mm_struct *mm)
 {
 	/*
-	 * We are resuming from reset with TTBR0_EL1 set to the
-	 * idmap to enable the MMU; set the TTBR0 to the reserved
-	 * page tables to prevent speculative TLB allocations, flush
-	 * the local tlb and set the default tcr_el1.t0sz so that
-	 * the TTBR0 address space set-up is properly restored.
-	 * If the current active_mm != &init_mm we entered cpu_suspend
-	 * with mappings in TTBR0 that must be restored, so we switch
-	 * them back to complete the address space configuration
-	 * restoration before returning.
+	 * We resume from suspend directly into the swapper_pg_dir. We may
+	 * also need to load user-space page tables.
 	 */
-	cpu_set_reserved_ttbr0();
-	local_flush_tlb_all();
-	cpu_set_default_tcr_t0sz();
-
 	if (mm != &init_mm)
 		cpu_switch_mm(mm->pgd, mm);
 
@@ -149,22 +114,17 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
 	return ret;
 }
 
-struct sleep_save_sp sleep_save_sp;
+unsigned long *sleep_save_stash;
 
 static int __init cpu_suspend_init(void)
 {
-	void *ctx_ptr;
-
 	/* ctx_ptr is an array of physical addresses */
-	ctx_ptr = kcalloc(mpidr_hash_size(), sizeof(phys_addr_t), GFP_KERNEL);
+	sleep_save_stash = kcalloc(mpidr_hash_size(), sizeof(*sleep_save_stash),
+				   GFP_KERNEL);
 
-	if (WARN_ON(!ctx_ptr))
+	if (WARN_ON(!sleep_save_stash))
 		return -ENOMEM;
 
-	sleep_save_sp.save_ptr_stash = ctx_ptr;
-	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
-	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
-
 	return 0;
 }
 early_initcall(cpu_suspend_init);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 3c7d170de822..a755108aaa75 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -61,62 +61,45 @@ ENTRY(cpu_do_suspend)
 	mrs	x2, tpidr_el0
 	mrs	x3, tpidrro_el0
 	mrs	x4, contextidr_el1
-	mrs	x5, mair_el1
 	mrs	x6, cpacr_el1
-	mrs	x7, ttbr1_el1
 	mrs	x8, tcr_el1
 	mrs	x9, vbar_el1
 	mrs	x10, mdscr_el1
 	mrs	x11, oslsr_el1
 	mrs	x12, sctlr_el1
 	stp	x2, x3, [x0]
-	stp	x4, x5, [x0, #16]
-	stp	x6, x7, [x0, #32]
-	stp	x8, x9, [x0, #48]
-	stp	x10, x11, [x0, #64]
-	str	x12, [x0, #80]
+	stp	x4, xzr, [x0, #16]
+	stp	x6, x8, [x0, #32]
+	stp	x9, x10, [x0, #48]
+	stp	x11, x12, [x0, #64]
 	ret
 ENDPROC(cpu_do_suspend)
 
 /**
  * cpu_do_resume - restore CPU register context
  *
- * x0: Physical address of context pointer
- * x1: ttbr0_el1 to be restored
- *
- * Returns:
- *	sctlr_el1 value in x0
+ * x0: Address of context pointer
  */
 ENTRY(cpu_do_resume)
-	/*
-	 * Invalidate local tlb entries before turning on MMU
-	 */
-	tlbi	vmalle1
 	ldp	x2, x3, [x0]
 	ldp	x4, x5, [x0, #16]
-	ldp	x6, x7, [x0, #32]
-	ldp	x8, x9, [x0, #48]
-	ldp	x10, x11, [x0, #64]
-	ldr	x12, [x0, #80]
+	ldp	x6, x8, [x0, #32]
+	ldp	x9, x10, [x0, #48]
+	ldp	x11, x12, [x0, #64]
 	msr	tpidr_el0, x2
 	msr	tpidrro_el0, x3
 	msr	contextidr_el1, x4
-	msr	mair_el1, x5
 	msr	cpacr_el1, x6
-	msr	ttbr0_el1, x1
-	msr	ttbr1_el1, x7
-	tcr_set_idmap_t0sz x8, x7
 	msr	tcr_el1, x8
 	msr	vbar_el1, x9
 	msr	mdscr_el1, x10
+	msr	sctlr_el1, x12
 	/*
 	 * Restore oslsr_el1 by writing oslar_el1
 	 */
 	ubfx	x11, x11, #1, #1
 	msr	oslar_el1, x11
 	msr	pmuserenr_el0, xzr		// Disable PMU access from EL0
-	mov	x0, x12
-	dsb	nsh		// Make sure local tlb invalidation completed
 	isb
 	ret
 ENDPROC(cpu_do_resume)
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 08/13] arm64: kernel: Include _AC definition in page.h
  2016-01-28 10:42 ` James Morse
                   ` (7 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

page.h uses '_AC' in the definition of PAGE_SIZE, but doesn't include
linux/const.h where this is defined. This produces build warnings when only
asm/page.h is included by asm code.

Signed-off-by: James Morse <james.morse@arm.com>
Acked-by: Mark Rutland <mark.rutland@arm.com>
---
 arch/arm64/include/asm/page.h | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
index 9b2f5a9d019d..fbafd0ad16df 100644
--- a/arch/arm64/include/asm/page.h
+++ b/arch/arm64/include/asm/page.h
@@ -19,6 +19,8 @@
 #ifndef __ASM_PAGE_H
 #define __ASM_PAGE_H
 
+#include <linux/const.h>
+
 /* PAGE_SHIFT determines the page size */
 /* CONT_SHIFT determines the number of pages which can be tracked together  */
 #ifdef CONFIG_ARM64_64K_PAGES
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 09/13] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file
  2016-01-28 10:42 ` James Morse
                   ` (8 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

KERNEL_START and KERNEL_END are useful outside head.S, move them to a
header file.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/include/asm/memory.h | 3 +++
 arch/arm64/kernel/head.S        | 3 ---
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
index 853953cd1f08..5773a6629f10 100644
--- a/arch/arm64/include/asm/memory.h
+++ b/arch/arm64/include/asm/memory.h
@@ -70,6 +70,9 @@
 
 #define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
 
+#define KERNEL_START      _text
+#define KERNEL_END        _end
+
 /*
  * Physical vs virtual RAM address space conversion.  These are
  * private definitions which should NOT be used outside memory.h
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
index 85db49d3b191..817027c0be00 100644
--- a/arch/arm64/kernel/head.S
+++ b/arch/arm64/kernel/head.S
@@ -48,9 +48,6 @@
 #error TEXT_OFFSET must be less than 2MB
 #endif
 
-#define KERNEL_START	_text
-#define KERNEL_END	_end
-
 /*
  * Kernel startup entry point.
  * ---------------------------
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 10/13] arm64: Add new asm macro copy_page
  2016-01-28 10:42 ` James Morse
                   ` (9 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

From: Geoff Levand <geoff@infradead.org>

Kexec and hibernate need to copy pages of memory, but may not have all
of the kernel mapped, and are unable to call copy_page().

Convert copy_page() to a macro, so that it can be inlined in these
situations.

Signed-off-by: Geoff Levand <geoff@infradead.org>
[Changed asm label to 9998, added commit message]
Signed-off-by: James Morse <james.morse@arm.com>
---
This patch is from v13 of kexec, see my [changes] above.

 arch/arm64/include/asm/assembler.h | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
index 137ee5b11eb7..13d8b46bd0bb 100644
--- a/arch/arm64/include/asm/assembler.h
+++ b/arch/arm64/include/asm/assembler.h
@@ -24,6 +24,7 @@
 #define __ASM_ASSEMBLER_H
 
 #include <asm/asm-offsets.h>
+#include <asm/page.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/ptrace.h>
 #include <asm/thread_info.h>
@@ -273,6 +274,24 @@ lr	.req	x30		// link register
 	.endm
 
 /*
+ * copy_page - copy src to dest using temp registers t1-t8
+ */
+	.macro copy_page dest:req src:req t1:req t2:req t3:req t4:req t5:req t6:req t7:req t8:req
+9998:	ldp	\t1, \t2, [\src]
+	ldp	\t3, \t4, [\src, #16]
+	ldp	\t5, \t6, [\src, #32]
+	ldp	\t7, \t8, [\src, #48]
+	add	\src, \src, #64
+	stnp	\t1, \t2, [\dest]
+	stnp	\t3, \t4, [\dest, #16]
+	stnp	\t5, \t6, [\dest, #32]
+	stnp	\t7, \t8, [\dest, #48]
+	add	\dest, \dest, #64
+	tst	\src, #(PAGE_SIZE - 1)
+	b.ne	9998b
+	.endm
+
+/*
  * Annotate a function as position independent, i.e., safe to be called before
  * the kernel virtual mapping is activated.
  */
-- 
2.6.2

^ permalink raw reply related	[flat|nested] 34+ messages in thread

* [PATCH v4 11/13] PM / Hibernate: Call flush_icache_range() on pages restored in-place
  2016-01-28 10:42 ` James Morse
@ 2016-01-28 10:42   ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Pavel Machek
  Cc: Will Deacon, Sudeep Holla, Kevin Kang, Geoff Levand,
	Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, James Morse

Some architectures require code that has been written to memory as data to
be 'cleaned' from the data caches before the processor can fetch it as
instructions.

During resume from hibernate, the snapshot code copies some pages directly,
meaning these architectures do not get a chance to perform their cache
maintenance. Modify the read and decompress code to call
flush_icache_range() on all pages that are restored, so that the restored
in-place pages are guaranteed to be executable on these architectures.

Signed-off-by: James Morse <james.morse@arm.com>
---
 kernel/power/swap.c | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/kernel/power/swap.c b/kernel/power/swap.c
index 12cd989dadf6..a30645d2e93f 100644
--- a/kernel/power/swap.c
+++ b/kernel/power/swap.c
@@ -37,6 +37,14 @@
 #define HIBERNATE_SIG	"S1SUSPEND"
 
 /*
+ * When reading an {un,}compressed image, we may restore pages in place,
+ * in which case some architectures need these pages cleaning before they
+ * can be executed. We don't know which pages these may be, so clean the lot.
+ */
+bool clean_pages_on_read = false;
+bool clean_pages_on_decompress = false;
+
+/*
  *	The swap map is a data structure used for keeping track of each page
  *	written to a swap partition.  It consists of many swap_map_page
  *	structures that contain each an array of MAP_PAGE_ENTRIES swap entries.
@@ -241,6 +249,9 @@ static void hib_end_io(struct bio *bio)
 
 	if (bio_data_dir(bio) == WRITE)
 		put_page(page);
+	else if (clean_pages_on_read)
+		flush_icache_range((unsigned long)page_address(page),
+				   (unsigned long)page_address(page) + PAGE_SIZE);
 
 	if (bio->bi_error && !hb->error)
 		hb->error = bio->bi_error;
@@ -1049,6 +1060,7 @@ static int load_image(struct swap_map_handle *handle,
 
 	hib_init_batch(&hb);
 
+	clean_pages_on_read = true;
 	printk(KERN_INFO "PM: Loading image data pages (%u pages)...\n",
 		nr_to_read);
 	m = nr_to_read / 10;
@@ -1124,6 +1136,10 @@ static int lzo_decompress_threadfn(void *data)
 		d->unc_len = LZO_UNC_SIZE;
 		d->ret = lzo1x_decompress_safe(d->cmp + LZO_HEADER, d->cmp_len,
 		                               d->unc, &d->unc_len);
+		if (clean_pages_on_decompress)
+			flush_icache_range((unsigned long)d->unc,
+					   (unsigned long)d->unc + d->unc_len);
+
 		atomic_set(&d->stop, 1);
 		wake_up(&d->done);
 	}
@@ -1189,6 +1205,8 @@ static int load_image_lzo(struct swap_map_handle *handle,
 	}
 	memset(crc, 0, offsetof(struct crc_data, go));
 
+	clean_pages_on_decompress = true;
+
 	/*
 	 * Start the decompression threads.
 	 */
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 34+ messages in thread


* [PATCH v4 12/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-01-28 10:42 ` James Morse
                   ` (11 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

Add support for hibernate/suspend-to-disk.

Suspend borrows code from cpu_suspend() to write cpu state onto the stack,
before calling swsusp_save() to save the memory image.

Restore creates a set of temporary page tables, covering only the
linear map, copies the restore code to a 'safe' page, then uses the copy to
restore the memory image. The copied code executes in the lower half of the
address space, and once complete, restores the original kernel's page
tables. It then calls into cpu_resume(), and follows the normal
cpu_suspend() path back into the suspend code.

To restore a kernel with KASLR enabled, the addresses of the page tables
and cpu_resume() are stored in the hibernate arch-header, and the el2
vectors are pivoted via the 'safe' page in low memory. This also permits
resuming with a different kernel version from the one that hibernated.
However, because the MMU isn't turned off during resume, the MMU
configuration must be the same in both kernels. To enforce this, the value
of the translation control register (TCR_EL1) is also included in the
hibernate arch-header, which means the resume kernel must use the same
page size and virtual address space size.

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/Kconfig                |   7 +
 arch/arm64/include/asm/suspend.h  |   7 +
 arch/arm64/kernel/Makefile        |   1 +
 arch/arm64/kernel/asm-offsets.c   |   5 +
 arch/arm64/kernel/hibernate-asm.S | 137 +++++++++++
 arch/arm64/kernel/hibernate.c     | 477 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/vmlinux.lds.S   |  15 ++
 7 files changed, 649 insertions(+)
 create mode 100644 arch/arm64/kernel/hibernate-asm.S
 create mode 100644 arch/arm64/kernel/hibernate.c

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 8cc62289a63e..6ba01c7ba00e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -838,6 +838,13 @@ menu "Power management options"
 
 source "kernel/power/Kconfig"
 
+config ARCH_HIBERNATION_POSSIBLE
+	def_bool y
+
+config ARCH_HIBERNATION_HEADER
+	def_bool y
+	depends on HIBERNATION
+
 config ARCH_SUSPEND_POSSIBLE
 	def_bool y
 
diff --git a/arch/arm64/include/asm/suspend.h b/arch/arm64/include/asm/suspend.h
index 5faa3ce1fa3a..488e03064426 100644
--- a/arch/arm64/include/asm/suspend.h
+++ b/arch/arm64/include/asm/suspend.h
@@ -38,4 +38,11 @@ extern int cpu_suspend(unsigned long arg, int (*fn)(unsigned long));
 extern void cpu_resume(void);
 int __cpu_suspend_enter(struct sleep_stack_data *state);
 void __cpu_suspend_exit(struct mm_struct *mm);
+void _cpu_resume(void);
+
+int swsusp_arch_suspend(void);
+int swsusp_arch_resume(void);
+int arch_hibernation_header_save(void *addr, unsigned int max_size);
+int arch_hibernation_header_restore(void *addr);
+
 #endif
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index 83cd7e68e83b..09f7da0f2cf9 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -42,6 +42,7 @@ arm64-obj-$(CONFIG_PCI)			+= pci.o
 arm64-obj-$(CONFIG_ARMV8_DEPRECATED)	+= armv8_deprecated.o
 arm64-obj-$(CONFIG_ACPI)		+= acpi.o
 arm64-obj-$(CONFIG_PARAVIRT)		+= paravirt.o
+arm64-obj-$(CONFIG_HIBERNATION)		+= hibernate.o hibernate-asm.o
 
 obj-y					+= $(arm64-obj-y) vdso/
 obj-m					+= $(arm64-obj-m)
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
index d6119c57f28a..2ac2789fda49 100644
--- a/arch/arm64/kernel/asm-offsets.c
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -22,6 +22,7 @@
 #include <linux/mm.h>
 #include <linux/dma-mapping.h>
 #include <linux/kvm_host.h>
+#include <linux/suspend.h>
 #include <asm/thread_info.h>
 #include <asm/memory.h>
 #include <asm/smp_plat.h>
@@ -125,5 +126,9 @@ int main(void)
 #endif
   DEFINE(ARM_SMCCC_RES_X0_OFFS,	offsetof(struct arm_smccc_res, a0));
   DEFINE(ARM_SMCCC_RES_X2_OFFS,	offsetof(struct arm_smccc_res, a2));
+  BLANK();
+  DEFINE(HIBERN_PBE_ORIG,	offsetof(struct pbe, orig_address));
+  DEFINE(HIBERN_PBE_ADDR,	offsetof(struct pbe, address));
+  DEFINE(HIBERN_PBE_NEXT,	offsetof(struct pbe, next));
   return 0;
 }
diff --git a/arch/arm64/kernel/hibernate-asm.S b/arch/arm64/kernel/hibernate-asm.S
new file mode 100644
index 000000000000..90b578d22a62
--- /dev/null
+++ b/arch/arm64/kernel/hibernate-asm.S
@@ -0,0 +1,137 @@
+#include <linux/linkage.h>
+#include <linux/errno.h>
+
+#include <asm/asm-offsets.h>
+#include <asm/assembler.h>
+#include <asm/cputype.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+#include <asm/virt.h>
+
+/*
+ * Corrupt memory.
+ *
+ * Loads temporary page tables then restores the memory image.
+ * Finally branches to cpu_resume() to restore the state saved by
+ * swsusp_arch_suspend().
+ *
+ * Because this code has to be copied to a safe_page, it can't call out to
+ * other functions by PC-relative address. Also remember that it may be
+ * mid-way through over-writing other functions. For this reason it contains
+ * code from flush_icache_range() and uses the copy_page() macro.
+ *
+ * All of memory gets written to, including code. We need to clean the kernel
+ * text to the Point of Coherence (PoC) before secondary cores can be booted.
+ * Because the kernel modules and executable pages mapped to user space are
+ * also written as data, we clean all pages we touch to the Point of
+ * Unification (PoU).
+ *
+ * x0: physical address of temporary page tables
+ * x1: physical address of swapper page tables
+ * x2: address of cpu_resume
+ * x3: linear map address of restore_pblist in the current kernel
+ */
+.pushsection    ".hibernate_exit.text", "ax"
+ENTRY(swsusp_arch_suspend_exit)
+	/* Temporary page tables are a copy, so no need for a trampoline here */
+	msr	ttbr1_el1, x0
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	dsb	ish
+
+	mov	x21, x1
+	mov	x22, x2
+
+	/* walk the restore_pblist and use copy_page() to over-write memory */
+	mov	x19, x3
+
+2:	ldr	x10, [x19, #HIBERN_PBE_ORIG]
+	mov	x0, x10
+	ldr	x1, [x19, #HIBERN_PBE_ADDR]
+
+	copy_page	x0, x1, x2, x3, x4, x5, x6, x7, x8, x9
+
+	dsb	ish		//  memory restore must finish before cleaning
+
+	add	x1, x10, #PAGE_SIZE
+	/* Clean the copied page to PoU - based on flush_icache_range() */
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x10, x3
+4:	dc	cvau, x4	// clean D line / unified line
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	4b
+
+	ldr	x19, [x19, #HIBERN_PBE_NEXT]
+	cbnz	x19, 2b
+
+	/*
+	 * switch to the restored kernel's page tables, and branch to re-enter
+	 * the kernel.
+	 */
+	msr	ttbr1_el1, x21  // physical address of swapper page tables.
+	isb
+	tlbi	vmalle1is
+	ic	ialluis
+	dsb	ish		// also waits for PoU cleaning to finish
+	isb			// code at x22 may now be different
+
+	br	x22
+
+	.ltorg
+ENDPROC(swsusp_arch_suspend_exit)
+
+/*
+ * Restore the hyp stub. Once we know where in memory the hyp-stub is, we
+ * can reload vbar_el2. This must be done before the hibernate page is
+ * unmapped.
+ *
+ * x0: The hyp-stub address for vbar_el2 __pa(__hyp_stub_vectors)
+ */
+el1_sync:
+	msr	vbar_el2, x0
+	eret
+ENDPROC(el1_sync)
+
+.macro invalid_vector	label
+\label:
+	b \label
+ENDPROC(\label)
+.endm
+
+	invalid_vector	el2_sync_invalid
+	invalid_vector	el2_irq_invalid
+	invalid_vector	el2_fiq_invalid
+	invalid_vector	el2_error_invalid
+	invalid_vector	el1_sync_invalid
+	invalid_vector	el1_irq_invalid
+	invalid_vector	el1_fiq_invalid
+	invalid_vector	el1_error_invalid
+
+/* el2 vectors - switch el2 here while we restore the memory image. */
+	.align 11
+ENTRY(hibernate_el2_vectors)
+	ventry	el2_sync_invalid		// Synchronous EL2t
+	ventry	el2_irq_invalid			// IRQ EL2t
+	ventry	el2_fiq_invalid			// FIQ EL2t
+	ventry	el2_error_invalid		// Error EL2t
+
+	ventry	el2_sync_invalid		// Synchronous EL2h
+	ventry	el2_irq_invalid			// IRQ EL2h
+	ventry	el2_fiq_invalid			// FIQ EL2h
+	ventry	el2_error_invalid		// Error EL2h
+
+	ventry	el1_sync			// Synchronous 64-bit EL1
+	ventry	el1_irq_invalid			// IRQ 64-bit EL1
+	ventry	el1_fiq_invalid			// FIQ 64-bit EL1
+	ventry	el1_error_invalid		// Error 64-bit EL1
+
+	ventry	el1_sync_invalid		// Synchronous 32-bit EL1
+	ventry	el1_irq_invalid			// IRQ 32-bit EL1
+	ventry	el1_fiq_invalid			// FIQ 32-bit EL1
+	ventry	el1_error_invalid		// Error 32-bit EL1
+END(hibernate_el2_vectors)
+
+.popsection
diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
new file mode 100644
index 000000000000..3789ae174b66
--- /dev/null
+++ b/arch/arm64/kernel/hibernate.c
@@ -0,0 +1,477 @@
+/*
+ * Hibernate support specific for ARM64
+ *
+ * Derived from work on ARM hibernation support by:
+ *
+ * Ubuntu project, hibernation support for mach-dove
+ * Copyright (C) 2010 Nokia Corporation (Hiroshi Doyu)
+ * Copyright (C) 2010 Texas Instruments, Inc. (Teerth Reddy et al.)
+ *  https://lkml.org/lkml/2010/6/18/4
+ *  https://lists.linux-foundation.org/pipermail/linux-pm/2010-June/027422.html
+ *  https://patchwork.kernel.org/patch/96442/
+ *
+ * Copyright (C) 2006 Rafael J. Wysocki <rjw@sisk.pl>
+ *
+ * License terms: GNU General Public License (GPL) version 2
+ */
+#define pr_fmt(x) "hibernate: " x
+#include <linux/kvm_host.h>
+#include <linux/mm.h>
+#include <linux/pm.h>
+#include <linux/sched.h>
+#include <linux/suspend.h>
+#include <linux/version.h>
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/irqflags.h>
+#include <asm/memory.h>
+#include <asm/mmu_context.h>
+#include <asm/pgalloc.h>
+#include <asm/pgtable.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/sections.h>
+#include <asm/suspend.h>
+#include <asm/virt.h>
+
+/* These are necessary to build without ifdefery */
+#ifndef pmd_index
+#define pmd_index(x)	0
+#endif
+#ifndef pud_index
+#define pud_index(x)	0
+#endif
+
+#define TCR_IPS_BITS (0x7UL<<32)
+
+/*
+ * This value is written to the hibernate arch header, and prevents resuming
+ * from a hibernate image produced by an incompatible kernel. If you change
+ * a value that isn't saved/restored by hibernate, you should change this value.
+ *
+ * For example, if the mair_el1 values used by the kernel are changed, you
+ * should prevent resuming from a kernel with incompatible attributes, as these
+ * aren't saved/restored.
+ */
+#define HIBERNATE_VERSION	KERNEL_VERSION(4, 6, 0)
+
+/*
+ * Start/end of the hibernate exit code, this must be copied to a 'safe'
+ * location in memory, and executed from there.
+ */
+extern char __hibernate_exit_text_start[], __hibernate_exit_text_end[];
+
+/* temporary el2 vectors in the __hibernate_exit_text section. */
+extern char hibernate_el2_vectors[];
+
+/* the hyp stubs el2 vectors, defined in kernel/hyp-stub.S */
+extern char __hyp_stub_vectors[];
+
+struct arch_hibernate_hdr_invariants {
+	unsigned long	version;
+	unsigned long	tcr_el1;	/* page size, VA bits, etc. */
+};
+
+/* These values need to be known across a hibernate/restore. */
+static struct arch_hibernate_hdr {
+	struct arch_hibernate_hdr_invariants invariants;
+
+	/* These are needed to find the relocated kernel if built with kaslr */
+	phys_addr_t	ttbr1_el1;
+	void		(*reenter_kernel)(void);
+} resume_hdr;
+
+static inline struct arch_hibernate_hdr_invariants arch_hdr_invariants(void)
+{
+	struct arch_hibernate_hdr_invariants rv;
+
+	rv.version = HIBERNATE_VERSION;
+	asm volatile("mrs	%0, tcr_el1" : "=r"(rv.tcr_el1));
+
+	/* IPS bits vary on big/little systems, mask them out */
+	rv.tcr_el1 &= ~TCR_IPS_BITS;
+
+	return rv;
+}
+
+int pfn_is_nosave(unsigned long pfn)
+{
+	unsigned long nosave_begin_pfn = virt_to_pfn(&__nosave_begin);
+	unsigned long nosave_end_pfn = virt_to_pfn(&__nosave_end - 1);
+
+	return (pfn >= nosave_begin_pfn) && (pfn <= nosave_end_pfn);
+}
+
+void notrace save_processor_state(void)
+{
+	WARN_ON(num_online_cpus() != 1);
+	local_fiq_disable();
+}
+
+void notrace restore_processor_state(void)
+{
+	local_fiq_enable();
+}
+
+int arch_hibernation_header_save(void *addr, unsigned int max_size)
+{
+	struct arch_hibernate_hdr *hdr = addr;
+
+	if (max_size < sizeof(*hdr))
+		return -EOVERFLOW;
+
+	hdr->invariants		= arch_hdr_invariants();
+	hdr->ttbr1_el1		= virt_to_phys(swapper_pg_dir);
+	hdr->reenter_kernel	= &_cpu_resume;
+
+	return 0;
+}
+EXPORT_SYMBOL(arch_hibernation_header_save);
+
+int arch_hibernation_header_restore(void *addr)
+{
+	struct arch_hibernate_hdr_invariants invariants;
+	struct arch_hibernate_hdr *hdr = addr;
+
+	/*
+	 * If this header is ancient, it may be smaller than we expect.
+	 * Test the version first.
+	 */
+	if (hdr->invariants.version != HIBERNATE_VERSION) {
+		pr_crit("Hibernate image not compatible with this kernel version!\n");
+		return -EINVAL;
+	}
+
+	invariants = arch_hdr_invariants();
+	if (memcmp(&hdr->invariants, &invariants, sizeof(invariants))) {
+		pr_crit("Hibernate image not compatible with this kernel configuration!\n");
+		return -EINVAL;
+	}
+
+	resume_hdr = *hdr;
+
+	return 0;
+}
+EXPORT_SYMBOL(arch_hibernation_header_restore);
+
+/*
+ * Copies length bytes, starting at src_start, into a new page,
+ * performs cache maintenance, then maps it (nearly) at the bottom of memory
+ * as executable.
+ *
+ * This is used by hibernate to copy the code it needs to execute when
+ * overwriting the kernel text. This function generates a new set of page
+ * tables, which it loads into ttbr0.
+ *
+ * Length is provided as we probably only want 4K of data, even on a 64K
+ * page system. We don't use the very bottom page, so that dereferencing
+ * NULL continues to have the expected behaviour.
+ */
+static int create_safe_exec_page(void *src_start, size_t length,
+				 void **dst_addr, phys_addr_t *phys_dst_addr,
+				 unsigned long (*allocator)(gfp_t mask),
+				 gfp_t mask)
+{
+	int rc = 0;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	unsigned long dst = allocator(mask);
+
+	if (!dst) {
+		rc = -ENOMEM;
+		goto out;
+	}
+
+	memcpy((void *)dst, src_start, length);
+	flush_icache_range(dst, dst + length);
+
+	pgd = (pgd_t *)allocator(mask) + pgd_index(PAGE_SIZE);
+	if (PTRS_PER_PGD > 1) {
+		pud = (pud_t *)allocator(mask);
+		if (!pud) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pgd(pgd, __pgd(virt_to_phys(pud) | PUD_TYPE_TABLE));
+	}
+
+	pud = pud_offset(pgd, PAGE_SIZE);
+	if (PTRS_PER_PUD > 1) {
+		pmd = (pmd_t *)allocator(mask);
+		if (!pmd) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pud(pud, __pud(virt_to_phys(pmd) | PUD_TYPE_TABLE));
+	}
+
+	pmd = pmd_offset(pud, PAGE_SIZE);
+	if (PTRS_PER_PMD > 1) {
+		pte = (pte_t *)allocator(mask);
+		if (!pte) {
+			rc = -ENOMEM;
+			goto out;
+		}
+		set_pmd(pmd, __pmd(virt_to_phys(pte) | PMD_TYPE_TABLE));
+	}
+
+	pte = pte_offset_kernel(pmd, PAGE_SIZE);
+	set_pte_at(&init_mm, dst, pte,
+		   __pte(virt_to_phys((void *)dst) |
+			 pgprot_val(PAGE_KERNEL_EXEC)));
+
+	/* Load our new page tables */
+	asm volatile("msr	ttbr0_el1, %0;"
+		     "isb;"
+		     "tlbi	vmalle1is;"
+		     "dsb	ish" : : "r"(virt_to_phys(pgd)));
+
+	*dst_addr = (void *)(PAGE_SIZE);
+	*phys_dst_addr = virt_to_phys((void *)dst);
+
+out:
+	return rc;
+}
+
+
+int swsusp_arch_suspend(void)
+{
+	int ret = 0;
+	unsigned long flags;
+	struct sleep_stack_data state;
+	struct mm_struct *mm = current->mm;
+
+	local_dbg_save(flags);
+
+	if (__cpu_suspend_enter(&state)) {
+		ret = swsusp_save();
+	} else {
+		void *lm_kernel_start;
+
+		/* Clean kernel to PoC for secondary core startup */
+		lm_kernel_start = phys_to_virt(virt_to_phys(KERNEL_START));
+		__flush_dcache_area(lm_kernel_start, KERNEL_END - KERNEL_START);
+
+		/* Reload the hyp-stub */
+		if (is_hyp_mode_available())
+			__hyp_set_vectors(virt_to_phys(__hyp_stub_vectors));
+
+		__cpu_suspend_exit(mm);
+	}
+
+	local_dbg_restore(flags);
+
+	return ret;
+}
+
+static int copy_pte(pmd_t *dst_pmd, pmd_t *src_pmd, unsigned long start,
+		    unsigned long end)
+{
+	unsigned long next;
+	unsigned long addr = start;
+	pte_t *src_pte = pte_offset_kernel(src_pmd, start);
+	pte_t *dst_pte = pte_offset_kernel(dst_pmd, start);
+
+	do {
+		next = addr + PAGE_SIZE;
+		if (pte_val(*src_pte))
+			set_pte(dst_pte,
+				__pte(pte_val(*src_pte) & ~PTE_RDONLY));
+	} while (dst_pte++, src_pte++, addr = next, addr != end);
+
+	return 0;
+}
+
+static int copy_pmd(pud_t *dst_pud, pud_t *src_pud, unsigned long start,
+		    unsigned long end)
+{
+	int rc = 0;
+	pte_t *dst_pte;
+	unsigned long next;
+	unsigned long addr = start;
+	pmd_t *src_pmd = pmd_offset(src_pud, start);
+	pmd_t *dst_pmd = pmd_offset(dst_pud, start);
+
+	do {
+		next = pmd_addr_end(addr, end);
+		if (!pmd_val(*src_pmd))
+			continue;
+
+		if (pmd_table(*(src_pmd))) {
+			dst_pte = (pte_t *)get_safe_page(GFP_ATOMIC);
+			if (!dst_pte) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pmd(dst_pmd, __pmd(virt_to_phys(dst_pte)
+					       | PMD_TYPE_TABLE));
+
+			rc = copy_pte(dst_pmd, src_pmd, addr, next);
+			if (rc)
+				break;
+		} else
+			set_pmd(dst_pmd,
+				__pmd(pmd_val(*src_pmd) & ~PMD_SECT_RDONLY));
+	} while (dst_pmd++, src_pmd++, addr = next, addr != end);
+
+	return rc;
+}
+
+static int copy_pud(pgd_t *dst_pgd, pgd_t *src_pgd, unsigned long start,
+		    unsigned long end)
+{
+	int rc = 0;
+	pmd_t *dst_pmd;
+	unsigned long next;
+	unsigned long addr = start;
+	pud_t *src_pud = pud_offset(src_pgd, start);
+	pud_t *dst_pud = pud_offset(dst_pgd, start);
+
+	do {
+		next = pud_addr_end(addr, end);
+		if (!pud_val(*src_pud))
+			continue;
+
+		if (pud_table(*(src_pud))) {
+			if (PTRS_PER_PMD != 1) {
+				dst_pmd = (pmd_t *)get_safe_page(GFP_ATOMIC);
+				if (!dst_pmd) {
+					rc = -ENOMEM;
+					break;
+				}
+
+				set_pud(dst_pud, __pud(virt_to_phys(dst_pmd)
+						       | PUD_TYPE_TABLE));
+			}
+
+			rc = copy_pmd(dst_pud, src_pud, addr, next);
+			if (rc)
+				break;
+		} else {
+			set_pud(dst_pud,
+				__pud(pud_val(*src_pud) & ~PMD_SECT_RDONLY));
+		}
+	} while (dst_pud++, src_pud++, addr = next, addr != end);
+
+	return rc;
+}
+
+static int copy_page_tables(pgd_t *dst_pgd, unsigned long start,
+			    unsigned long end)
+{
+	int rc = 0;
+	pud_t *dst_pud;
+	unsigned long next;
+	unsigned long addr = start;
+	pgd_t *src_pgd = pgd_offset_k(start);
+
+	dst_pgd += pgd_index(start);
+
+	do {
+		next = pgd_addr_end(addr, end);
+		if (!pgd_val(*src_pgd))
+			continue;
+
+		if (PTRS_PER_PUD != 1) {
+			dst_pud = (pud_t *)get_safe_page(GFP_ATOMIC);
+			if (!dst_pud) {
+				rc = -ENOMEM;
+				break;
+			}
+
+			set_pgd(dst_pgd, __pgd(virt_to_phys(dst_pud)
+					       | PUD_TYPE_TABLE));
+		}
+
+		rc = copy_pud(dst_pgd, src_pgd, addr, next);
+		if (rc)
+			break;
+	} while (dst_pgd++, src_pgd++, addr = next, addr != end);
+
+	return rc;
+}
+
+/*
+ * Set up, then resume from the hibernate image using swsusp_arch_suspend_exit().
+ *
+ * Memory allocated by get_safe_page() will be dealt with by the hibernate
+ * code; we don't need to free it here.
+ */
+int swsusp_arch_resume(void)
+{
+	int rc = 0;
+	size_t exit_size;
+	pgd_t *tmp_pg_dir;
+	void *lm_restore_pblist;
+	phys_addr_t phys_hibernate_exit;
+	void __noreturn (*hibernate_exit)(phys_addr_t, phys_addr_t, void *, void *);
+
+	/*
+	 * Copy swsusp_arch_suspend_exit() to a safe page. This will generate
+	 * a new set of ttbr0 page tables and load them.
+	 */
+	exit_size = __hibernate_exit_text_end - __hibernate_exit_text_start;
+	rc = create_safe_exec_page(__hibernate_exit_text_start, exit_size,
+				   (void **)&hibernate_exit,
+				   &phys_hibernate_exit,
+				   get_safe_page, GFP_ATOMIC);
+	if (rc) {
+		pr_err("Failed to create safe executable page for hibernate_exit code.");
+		goto out;
+	}
+
+	/*
+	 * The hibernate exit text contains a set of el2 vectors, that will
+	 * be executed at el2 with the mmu off in order to reload hyp-stub.
+	 */
+	__flush_dcache_area(hibernate_exit, exit_size);
+
+	/*
+	 * Restoring the memory image will overwrite the ttbr1 page tables.
+	 * Create a second copy of just the linear map, and use this when
+	 * restoring.
+	 */
+	tmp_pg_dir = (pgd_t *)get_safe_page(GFP_ATOMIC);
+	if (!tmp_pg_dir) {
+		pr_err("Failed to allocate memory for temporary page tables.");
+		rc = -ENOMEM;
+		goto out;
+	}
+	rc = copy_page_tables(tmp_pg_dir, PAGE_OFFSET, 0);
+	if (rc)
+		goto out;
+
+	/*
+	 * Since we only copied the linear map, we need to find restore_pblist's
+	 * linear map address.
+	 */
+	lm_restore_pblist = phys_to_virt(virt_to_phys(restore_pblist));
+
+	/*
+	 * EL2 may get upset if we overwrite its page-tables/stack.
+	 * kvm_arch_hardware_disable() returns EL2 to the hyp stub.
+	 */
+	if (IS_ENABLED(CONFIG_KVM_ARM_HOST))
+		kvm_arch_hardware_disable();
+
+	/*
+	 * KASLR means the el2 vectors will be in a different location in the
+	 * resumed kernel. Load hibernate's temporary copy into el2.
+	 */
+	if (is_hyp_mode_available()) {
+		phys_addr_t el2_vectors = phys_hibernate_exit;  /* base */
+		el2_vectors += hibernate_el2_vectors -
+			       __hibernate_exit_text_start;     /* offset */
+
+		__hyp_set_vectors(el2_vectors);
+	}
+
+	hibernate_exit(virt_to_phys(tmp_pg_dir), resume_hdr.ttbr1_el1,
+		       resume_hdr.reenter_kernel, lm_restore_pblist);
+
+out:
+	return rc;
+}
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
index e3928f578891..0289ae86bc57 100644
--- a/arch/arm64/kernel/vmlinux.lds.S
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -46,6 +46,16 @@ jiffies = jiffies_64;
 	*(.idmap.text)					\
 	VMLINUX_SYMBOL(__idmap_text_end) = .;
 
+#ifdef CONFIG_HIBERNATION
+#define HIBERNATE_TEXT					\
+	. = ALIGN(SZ_4K);				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_start) = .;\
+	*(.hibernate_exit.text)				\
+	VMLINUX_SYMBOL(__hibernate_exit_text_end) = .;
+#else
+#define HIBERNATE_TEXT
+#endif
+
 /*
  * The size of the PE/COFF section that covers the kernel image, which
  * runs from stext to _edata, must be a round multiple of the PE/COFF
@@ -107,6 +117,7 @@ SECTIONS
 			LOCK_TEXT
 			HYPERVISOR_TEXT
 			IDMAP_TEXT
+			HIBERNATE_TEXT
 			*(.fixup)
 			*(.gnu.warning)
 		. = ALIGN(16);
@@ -183,6 +194,10 @@ ASSERT(__hyp_idmap_text_end - (__hyp_idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"HYP init code too big or misaligned")
 ASSERT(__idmap_text_end - (__idmap_text_start & ~(SZ_4K - 1)) <= SZ_4K,
 	"ID map text too big or misaligned")
+#ifdef CONFIG_HIBERNATION
+ASSERT(__hibernate_exit_text_end - (__hibernate_exit_text_start & ~(SZ_4K - 1))
+	<= SZ_4K, "Hibernate exit text too big or misaligned")
+#endif
 
 /*
  * If padding is applied before .head.text, virt<->phys conversions will fail.
-- 
2.6.2


* [PATCH v4 13/13] arm64: hibernate: Prevent resume from a different kernel version
  2016-01-28 10:42 ` James Morse
                   ` (12 preceding siblings ...)
  (?)
@ 2016-01-28 10:42 ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-01-28 10:42 UTC (permalink / raw)
  To: linux-arm-kernel

Resuming with a different kernel version is fragile: while there are
sufficient details in the hibernate arch-header to perform the restore,
changes in the boot process can have a long-lasting impact on the system.
In particular, if the EFI stub causes more memory to be allocated, the
amount of memory left for Linux is reduced. If we are lucky, this will
cause restore to fail with the message:
> PM: Image mismatch: memory size
If we are unlucky, the system will explode sometime later when an EFI
runtime services call is made.

Prevent resuming with a different kernel version by making
HIBERNATE_VERSION the current kernel version (LINUX_VERSION_CODE).

Signed-off-by: James Morse <james.morse@arm.com>
---
 arch/arm64/kernel/hibernate.c | 9 ++-------
 1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/hibernate.c b/arch/arm64/kernel/hibernate.c
index 3789ae174b66..c9ce2e320c07 100644
--- a/arch/arm64/kernel/hibernate.c
+++ b/arch/arm64/kernel/hibernate.c
@@ -46,14 +46,9 @@
 
 /*
  * This value is written to the hibernate arch header, and prevents resuming
- * from a hibernate image produced by an incompatible kernel. If you change
- * a value that isn't saved/restored by hibernate, you should change this value.
- *
- * For example, if the mair_el1 values used by the kernel are changed, you
- * should prevent resuming from a kernel with incompatible attributes, as these
- * aren't saved/restored.
+ * from a hibernate image produced by a different kernel version.
  */
-#define HIBERNATE_VERSION	KERNEL_VERSION(4, 6, 0)
+#define HIBERNATE_VERSION	LINUX_VERSION_CODE
 
 /*
  * Start/end of the hibernate exit code, this must be copied to a 'safe'
-- 
2.6.2


* Re: [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-01-28 10:42 ` James Morse
@ 2016-01-29 22:34   ` Kevin Hilman
  -1 siblings, 0 replies; 34+ messages in thread
From: Kevin Hilman @ 2016-01-29 22:34 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, linux-pm, Rafael J. Wysocki,
	Pavel Machek, Marc Zyngier

Hi James,

James Morse <james.morse@arm.com> writes:

> This version of hibernate is rebased onto v4.5-rc1, including updated patches
> shared with kexec v13 [0] (1-5, 10).

Thanks for this series!

I'd like to help in testing this so I'm just curious which platforms
you've been testing this on.  I'm assuming a Juno (r2?), anything else?

Are you testing the resume from cold boot, or just from kexec?

For cold boot on Juno, I'm assuming there would be some
bootloader/firmware changes needed to find and boot from the hibernation
image?  Is that being worked on?  If not Juno, are you aware of any
other platforms that could be tested with resume from cold boot?

Not knowing the answers to the above, I tested your branch using arm64
defconfig + CONFIG_HIBERNATION=y on my Juno and noticed that it didn't
stay suspended (full suspend log below) so I'm looking for
ideas/recommendations on how to test this.

FWIW, my Juno is running the latest ATF + u-boot firmware from the
Linaro ARM landing team.

Kevin

/ # echo disk > /sys/power/state
[   29.563430] PM: Syncing filesystems ... done.
[   29.567901] Freezing user space processes ... (elapsed 0.001 seconds) done.
[   29.576989] PM: Preallocating image memory... done (allocated 65512 pages)
[   31.969977] PM: Allocated 262048 kbytes in 2.38 seconds (110.10 MB/s)
[   31.976362] Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
[   31.985303] Suspending console(s) (use no_console_suspend to debug)
[   31.992900] PM: freeze of devices complete after 1.370 msecs
[   31.993877] PM: late freeze of devices complete after 0.960 msecs
[   31.995109] PM: noirq freeze of devices complete after 1.225 msecs
[   31.995112] Disabling non-boot CPUs ...
[   32.012751] CPU1: shutdown
[   32.028584] psci: Retrying again to check for CPU kill
[   32.028587] psci: CPU1 killed.
[   32.060689] CPU2: shutdown
[   32.076579] psci: Retrying again to check for CPU kill
[   32.076583] psci: CPU2 killed.
[   32.108734] CPU3: shutdown
[   32.124579] psci: Retrying again to check for CPU kill
[   32.124582] psci: CPU3 killed.
[   32.160690] CPU4: shutdown
[   32.176578] psci: Retrying again to check for CPU kill
[   32.176581] psci: CPU4 killed.
[   32.212685] CPU5: shutdown
[   32.228580] psci: Retrying again to check for CPU kill
[   32.228584] psci: CPU5 killed.
[   32.241393] PM: Creating hibernation image:
[   32.241393] PM: Need to copy 63772 pages
[   32.241393] PM: Hibernation image created (63772 pages copied)
[   32.241440] Enabling non-boot CPUs ...
[   32.274634] Detected PIPT I-cache on CPU1
[   32.274680] CPU1: Booted secondary processor [410fd080]
[   32.274896]  cache: parent cpu1 should not be sleeping
[   32.275095] CPU1 is up
[   32.306712] Detected PIPT I-cache on CPU2
[   32.306738] CPU2: Booted secondary processor [410fd080]
[   32.306922]  cache: parent cpu2 should not be sleeping
[   32.307120] CPU2 is up
[   32.338913] Detected VIPT I-cache on CPU3
[   32.338959] CPU3: Booted secondary processor [410fd033]
[   32.339184]  cache: parent cpu3 should not be sleeping
[   32.339378] CPU3 is up
[   32.371098] Detected VIPT I-cache on CPU4
[   32.371125] CPU4: Booted secondary processor [410fd033]
[   32.371341]  cache: parent cpu4 should not be sleeping
[   32.371536] CPU4 is up
[   32.403281] Detected VIPT I-cache on CPU5
[   32.403309] CPU5: Booted secondary processor [410fd033]
[   32.403531]  cache: parent cpu5 should not be sleeping
[   32.403741] CPU5 is up
[   32.404202] PM: noirq thaw of devices complete after 0.454 msecs
[   32.404979] PM: early thaw of devices complete after 0.723 msecs
[...]


* Re: [PATCH v4 11/13] PM / Hibernate: Call flush_icache_range() on pages restored in-place
  2016-01-28 10:42   ` James Morse
@ 2016-01-31 17:25     ` Pavel Machek
  -1 siblings, 0 replies; 34+ messages in thread
From: Pavel Machek @ 2016-01-31 17:25 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, linux-pm, Rafael J. Wysocki, Will Deacon,
	Sudeep Holla, Kevin Kang, Geoff Levand, Catalin Marinas,
	Lorenzo Pieralisi, Mark Rutland, AKASHI Takahiro, wangfei

On Thu 2016-01-28 10:42:44, James Morse wrote:
> Some architectures require code written to memory as if it were data to be
> 'cleaned' from any data caches before the processor can fetch them as new
> instructions.
> 
> During resume from hibernate, the snapshot code copies some pages directly,
> meaning these architectures do not get a chance to perform their cache
> maintenance. Modify the read and decompress code to call
> flush_icache_range() on all pages that are restored, so that the restored
> in-place pages are guaranteed to be executable on these architectures.
> 
> Signed-off-by: James Morse <james.morse@arm.com>

Acked-by: Pavel Machek <pavel@ucw.cz>

-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


* Re: [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-01-29 22:34   ` Kevin Hilman
@ 2016-02-01  8:53     ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-02-01  8:53 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, linux-pm, Rafael J. Wysocki,
	Pavel Machek, Marc Zyngier

Hi Kevin,

On 29/01/16 22:34, Kevin Hilman wrote:
> James Morse <james.morse@arm.com> writes:
> I'd like to help in testing this so I'm just curious which platforms
> you've been testing this on.  I'm assuming a Juno (r2?), anything else?

That would be great - thanks!

I've done most of the testing on a Juno r1, but also gave it a spin on a
stray Seattle.


> Are you testing the resume from cold boot, or just from kexec?

From cold boot. I haven't tried with kexec, but I doubt that's a use-case anyone
wants, as you would resume immediately (though it might be interesting for testing).


> For cold boot on Juno, I'm assuming there would be some
> booloader/firmware changes needed to find and boot from the hibernation
> image?

Not at all! Your firmware only needs to support some mechanism for
turning CPUs off.

If you add 'resume=/dev/sda2' (or wherever your swap partition is located) to
the kernel command line, the kernel will check this partition for the
hibernate-image signature; if found, it will resume from that partition.
Otherwise booting proceeds as normal. (There is one hoop to jump through: you
must ensure your rootfs hasn't been mounted before the kernel starts resume, as
you could corrupt it. An initramfs built into the kernel is the best fix for this.)

No firmware changes needed.

> Is that being worked on?  If not Juno, are you aware of any
> other platforms that could be tested with resume from cold boot?

Any arm64 platform with persistent storage should work. I've been using a swap
partition on a usb drive.


> Not knowing the answers to the above, I tested your branch using arm64
> defconfig + CONFIG_HIBERNATION=y on my Juno and noticed that it didn't
> stay suspended (full suspend log below) so I'm looking for
> ideas/recommenations on how to test this.

That trace looks quite normal (one of mine below[0] for comparison). Any
failure would have happened after the point you stopped ... did you have a swap
partition 'on'?

'syscore' will freeze all processes and stop all devices, then create a
copy of the minimum amount of memory it needs to save. Then it starts
all the devices again, as it needs to write this image out to swap. This is what
you are seeing.

Once it has done this it calls poweroff or reboot.


Thanks,

James



[0] kernel output for hibernate/resume on Juno-r1:
root@localhost:/sys/power# swapon /dev/sda2
Adding 236540k swap on /dev/sda2.  Priority:-1 extents:1 across:236540k
root@localhost:/sys/power# ls
disk        pm_async           reserved_size  state
image_size  pm_freeze_timeout  resume         wakeup_count
root@localhost:/sys/power# cat disk
[shutdown] reboot suspend
root@localhost:/sys/power# echo disk > state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
PM: Preallocating image memory... done (allocated 93967 pages)
PM: Allocated 375868 kbytes in 3.74 seconds (100.49 MB/s)
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: freeze of devices complete after 17.794 msecs
PM: late freeze of devices complete after 1.358 msecs
PM: noirq freeze of devices complete after 1.771 msecs
Disabling non-boot CPUs ...
PM: Creating hibernation image:
PM: Need to copy 93868 pages
PM: Hibernation image created (93868 pages copied)
PM: noirq thaw of devices complete after 1.319 msecs
PM: early thaw of devices complete after 0.890 msecs
serio serio0: device_attach() failed for 1c060000.kmi (1c060000.kmi), error: -51
7
serio serio1: device_attach() failed for 1c070000.kmi (1c070000.kmi), error: -51
7
PM: thaw of devices complete after 92.523 msecs
PM: Using 1 thread(s) for compression.
PM: Compressing and saving image data (94052 pages)...
PM: Image saving progress:   0%
atkbd serio0: keyboard reset failed on 1c060000.kmi
atkbd serio1: keyboard reset failed on 1c070000.kmi
PM: Image saving progress:  10%
PM: Image saving progress:  20%
PM: Image saving progress:  30%
PM: Image saving progress:  40%
PM: Image saving progress:  50%
PM: Image saving progress:  60%
PM: Image saving progress:  70%
PM: Image saving progress:  80%
PM: Image saving progress:  90%
PM: Image saving progress: 100%
PM: Image saving done.
PM: Wrote 376208 kbytes in 21.76 seconds (17.28 MB/s)
PM: S|
kvm: exiting hardware virtualization
reboot: Power down

Board powered down, use REBOOT to restart.

Cmd>

ARM V2M-Juno Boot loader v1.0.0
HBI0262 build 1635

ARM V2M_Juno r1 Firmware v1.3.3
Build Date: Mar 31 2015

Time :  08:25:55
Date :  01:02:2016

Press Enter to stop auto boot...
Powering up system...

Switching on ATXPSU...
PMIC RAM configuration (pms_v104.bin)...
MBtemp   : 28 degC

Configuring motherboard (rev C, var A)...
IOFPGA image \MB\HBI0262C\io_b117.bit
IOFPGA  config: PASSED
OSC CLK config: PASSED

Configuring SCC registers...
Writing SCC 0x00000054 with 0x0007FFFE
Writing SCC 0x0000005C with 0x00FE001E
Writing SCC 0x00000100 with 0x003F1000
Writing SCC 0x00000104 with 0x0001F300
Writing SCC 0x00000108 with 0x00331000
Writing SCC 0x0000010C with 0x00019300
Writing SCC 0x00000118 with 0x003F1000
Writing SCC 0x0000011C with 0x0001F100
Writing SCC 0x000000F4 with 0x00000018
Writing SCC 0x000000F8 with 0x0BEC0000
Writing SCC 0x000000FC with 0xABE40000
Writing SCC 0x00000A14 with 0x00000000
Writing SCC 0x0000000C with 0x000000C2
Writing SCC 0x00000010 with 0x000000C2

Peripheral ID0:0x000000AD
Peripheral ID1:0x000000B0
Peripheral ID2:0x0000001B
Peripheral ID3:0x00000000
Peripheral ID4:0x0000000D
Peripheral ID5:0x000000F0
Peripheral ID6:0x00000005
Peripheral ID7:0x000000B1

Programming NOR Flash
PCIE clock configured...

Testing motherboard interfaces (FPGA build 117)...
SRAM 32MB test: PASSED
LAN9118   test: PASSED
KMI1/2    test: PASSED
MMC       test: PASSED
PB/LEDs   test: PASSED
FPGA UART test: PASSED
PCIe init test: PASSED
MAC addrs test: PASSED

The default boot selection will start in   1 seconds
Downloading the file <Image> from the TFTP server
[=======================================>]    9656 Kb
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
Booting Linux on physical CPU 0x100
Linux version 4.5.0-rc1+ (morse@melchizedek) (gcc version 4.9.3 20141031 (prerel
ease) (Linaro GCC 2014.11) ) #1839 SMP PREEMPT Fri Jan 29 15:05:57 GMT 2016
Boot CPU: AArch64 Processor [410fd033]
earlycon: Early serial console at MMIO 0x7ff80000 (options '')
bootconsole [uart0] enabled
efi: Getting EFI parameters from FDT:
EFI v2.50 by ARM Juno EFI Nov 24 2015 12:36:35
efi:  ACPI=0xf95b0000  ACPI 2.0=0xf95b0014  PROP=0xfe8db4d8
cma: Reserved 16 MiB at 0x00000000fd800000
psci: probing for conduit method from DT.
psci: PSCIv1.0 detected in firmware.
psci: Using standard PSCI v0.2 function IDs
psci: Trusted OS migration not required
PERCPU: Embedded 20 pages/cpu @ffff80097ff4f000 s42240 r8192 d31488 u81920
Detected VIPT I-cache on CPU0
CPU features: enabling workaround for ARM erratum 845719
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 2060048
Kernel command line: console=ttyAMA0,115200 earlycon=pl011,0x7ff80000 root=/dev/
nfs nfsroot=10.xx.xx.xx:xx/aarch64-current resume=/dev/sda2 no_console_s
uspend ip=dhcp rw init=/bin/bash crashkernel=256M maxcpus=1
log_buf_len individual max cpu contribution: 4096 bytes
log_buf_len total cpu_extra contributions: 20480 bytes
log_buf_len min size: 16384 bytes
log_buf_len: 65536 bytes
early log buf free: 14304(87%)
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
software IO TLB [mem 0xf54a0000-0xf94a0000] (64MB) mapped at [ffff8000754a0000-f
fff80007949ffff]
Memory: 8130452K/8371264K available (6007K kernel code, 524K rwdata, 2524K rodat
a, 592K init, 229K bss, 224428K reserved, 16384K cma-reserved)
Virtual kernel memory layout:
    vmalloc : 0xffff000000000000 - 0xffff7bffbfff0000   (126974 GB)
    vmemmap : 0xffff7bffc0000000 - 0xffff7fffc0000000   (  4096 GB maximum)
              0xffff7bffc2000000 - 0xffff7bffe8000000   (   608 MB actual)
    fixed   : 0xffff7ffffa7fd000 - 0xffff7ffffac00000   (  4108 KB)
    PCI I/O : 0xffff7ffffae00000 - 0xffff7ffffbe00000   (    16 MB)
    modules : 0xffff7ffffc000000 - 0xffff800000000000   (    64 MB)
    memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
      .init : 0xffff8000008d7000 - 0xffff80000096b000   (   592 KB)
      .text : 0xffff800000080000 - 0xffff8000008d6c64   (  8540 KB)
      .data : 0xffff80000096b000 - 0xffff8000009ee200   (   525 KB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Preemptible hierarchical RCU implementation.
        Build-time adjustment of leaf fanout to 64.
        RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=6.
RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=6
NR_IRQS:64 nr_irqs:64 0
GIC: Using split EOI/Deactivate mode
GICv2m: range[mem 0x2c1c0000-0x2c1c0fff], SPI[224:255]
Architected cp15 and mmio timer(s) running at 50.00MHz (phys/phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, m
ax_idle_ns: 440795202655 ns
sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 100.0
0 BogoMIPS (lpj=200000)
pid_max: default: 32768 minimum: 301
Security Framework initialized
Mount-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes)
ASID allocator initialised with 65536 entries
Remapping and enabling EFI services.
 EFI remap 0x0000000008000000 => 0000000020000000
  EFI remap 0x000000001c170000 => 0000000024000000
  EFI remap 0x00000000f94a0000 => 0000000024010000
  EFI remap 0x00000000f9520000 => 0000000024020000
  EFI remap 0x00000000f9530000 => 0000000024030000
  EFI remap 0x00000000f9540000 => 0000000024040000
  EFI remap 0x00000000f9570000 => 0000000024070000
  EFI remap 0x00000000f9580000 => 0000000024080000
  EFI remap 0x00000000f95c0000 => 00000000240a0000
  EFI remap 0x00000000f9640000 => 0000000024120000
  EFI remap 0x00000000f9650000 => 0000000024130000
  EFI remap 0x00000000f9720000 => 0000000024200000
  EFI remap 0x00000000f9730000 => 0000000024210000
  EFI remap 0x00000000f9740000 => 0000000024220000
  EFI remap 0x00000000f9780000 => 0000000024260000
  EFI remap 0x00000000f9790000 => 0000000024270000
  EFI remap 0x00000000f9800000 => 00000000242e0000
  EFI remap 0x00000000f9810000 => 00000000242f0000
  EFI remap 0x00000000f9820000 => 0000000024300000
  EFI remap 0x00000000fe820000 => 0000000024330000
  EFI remap 0x00000000fe830000 => 0000000024340000
  EFI remap 0x00000000fe840000 => 0000000024350000
  EFI remap 0x00000000fe870000 => 0000000024370000
Brought up 1 CPUs
SMP: Total of 1 processors activated.
CPU: All CPU(s) started at EL2
alternatives: patching kernel code
devtmpfs: initialized
DMI not present or invalid.
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645
041785100000 ns
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
cpuidle: using governor ladder
cpuidle: using governor menu
vdso: 2 pages (1 code @ ffff800000972000, 1 data @ ffff800000971000)
hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
DMA: preallocated 256 KiB pool for atomic allocations
Serial: AMBA PL011 UART driver
7ff80000.uart: ttyAMA0 at MMIO 0x7ff80000 (irq = 25, base_baud = 0) is a PL011 r
ev3
console [ttyAMA0] enabled
console [ttyAMA0] enabled
bootconsole [uart0] disabled
bootconsole [uart0] disabled
HugeTLB registered 2 MB page size, pre-allocated 0 pages
ACPI: Interpreter disabled.
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
dmi: Firmware registration failed.
clocksource: Switched to clocksource arch_sys_counter
VFS: Disk quotas dquot_6.6.0
VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
pnp: PnP ACPI: disabled
NET: Registered protocol family 2
TCP established hash table entries: 65536 (order: 7, 524288 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 65536 bind 65536)
UDP hash table entries: 4096 (order: 5, 131072 bytes)
UDP-Lite hash table entries: 4096 (order: 5, 131072 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
kvm [1]: interrupt-controller@2c04f000 IRQ14
kvm [1]: timer IRQ4
kvm [1]: 8-bit VMID
kvm [1]: Hyp mode initialized successfully
futex hash table entries: 2048 (order: 6, 262144 bytes)
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(0.432:1): initialized
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
fuse init (API version 7.24)
9p: Installing v9fs 9p2000 file system support
io scheduler noop registered
io scheduler cfq registered (default)
PCI host bridge /pcie-controller@30000000 ranges:
   IO 0x5f800000..0x5fffffff -> 0x5f800000
  MEM 0x50000000..0x57ffffff -> 0x50000000
  MEM 0x4000000000..0x40ffffffff -> 0x4000000000
pci-host-generic 40000000.pcie-controller: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [io  0x0000-0x7fffff] (bus address [0x5f80000
0-0x5fffffff])
pci_bus 0000:00: root bus resource [mem 0x50000000-0x57ffffff]
pci_bus 0000:00: root bus resource [mem 0x4000000000-0x40ffffffff pref]
pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with
 'pcie_aspm=force'
pci 0000:08:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:00:00.0: BAR 8: assigned [mem 0x50000000-0x501fffff]
* [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
@ 2016-02-01  8:53     ` James Morse
  0 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-02-01  8:53 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Kevin,

On 29/01/16 22:34, Kevin Hilman wrote:
> James Morse <james.morse@arm.com> writes:
> I'd like to help in testing this so I'm just curious which platforms
> you've been testing this on.  I'm assuming a Juno (r2?), anything else?

That would be great - thanks!

I've done most of the testing on a Juno r1, but also gave it a spin on a
stray Seattle.


> Are you testing the resume from cold boot, or just from kexec?

From cold boot. I haven't tried with kexec, but I doubt that's a use-case anyone
wants, as you would resume immediately. (It might be interesting for testing, though.)


> For cold boot on Juno, I'm assuming there would be some
> booloader/firmware changes needed to find and boot from the hibernation
> image?

Not at all! Your firmware only needs to support some mechanism for
turning CPUs off.

If you add 'resume=/dev/sda2' (or wherever your swap partition is located) to the
kernel command line, the kernel will check that partition for the hibernate-image
signature; if it is found, it will resume from that partition. Otherwise booting
proceeds as normal. (There is one hoop to jump through: make sure your rootfs
hasn't been mounted before the kernel starts the resume, as you could corrupt it
- building an initramfs into the kernel is the best fix for this.)

No firmware changes needed.
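For reference, a minimal way to exercise this (assuming your swap lives on
/dev/sda2 - adjust for your board) is:

```shell
# Boot with 'resume=/dev/sda2' appended to the kernel command line, then:
swapon /dev/sda2              # enable the swap partition
echo disk > /sys/power/state  # freeze, write the image to swap, power off
# On the next cold boot the kernel finds the image signature on /dev/sda2
# and resumes instead of booting normally.
```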

> Is that being worked on?  If not Juno, are you aware of any
> other platforms that could be tested with resume from cold boot?

Any arm64 platform with persistent storage should work. I've been using a swap
partition on a usb drive.


> Not knowing the answers to the above, I tested your branch using arm64
> defconfig + CONFIG_HIBERNATION=y on my Juno and noticed that it didn't
> stay suspended (full suspend log below) so I'm looking for
> ideas/recommendations on how to test this.

That trace looks quite normal (one of mine is below [0] for comparison). Any
failure would have happened after the point you stopped ... did you have a swap
partition 'on'?

'syscore' will freeze all processes and stop all devices, then create a
copy of the minimum amount of memory it needs to save. It then starts all
the devices again, as it needs to write this image out to swap - this is what
you are seeing.

Once it has done this, it calls poweroff or reboot.
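The poweroff-vs-reboot choice is driven by /sys/power/disk (shown in the log
below [0] as '[shutdown] reboot suspend', brackets marking the current mode);
for example:

```shell
cat /sys/power/disk            # e.g. "[shutdown] reboot suspend"
echo reboot > /sys/power/disk  # reboot (rather than power off) after writing
echo disk > /sys/power/state   # trigger hibernation
```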


Thanks,

James



[0] kernel output for hibernate/resume on Juno-r1:
root@localhost:/sys/power# swapon /dev/sda2
Adding 236540k swap on /dev/sda2.  Priority:-1 extents:1 across:236540k
root@localhost:/sys/power# ls
disk        pm_async           reserved_size  state
image_size  pm_freeze_timeout  resume         wakeup_count
root@localhost:/sys/power# cat disk
[shutdown] reboot suspend
root@localhost:/sys/power# echo disk > state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.001 seconds) done.
PM: Preallocating image memory... done (allocated 93967 pages)
PM: Allocated 375868 kbytes in 3.74 seconds (100.49 MB/s)
Freezing remaining freezable tasks ... (elapsed 0.001 seconds) done.
PM: freeze of devices complete after 17.794 msecs
PM: late freeze of devices complete after 1.358 msecs
PM: noirq freeze of devices complete after 1.771 msecs
Disabling non-boot CPUs ...
PM: Creating hibernation image:
PM: Need to copy 93868 pages
PM: Hibernation image created (93868 pages copied)
PM: noirq thaw of devices complete after 1.319 msecs
PM: early thaw of devices complete after 0.890 msecs
serio serio0: device_attach() failed for 1c060000.kmi (1c060000.kmi), error: -51
7
serio serio1: device_attach() failed for 1c070000.kmi (1c070000.kmi), error: -51
7
PM: thaw of devices complete after 92.523 msecs
PM: Using 1 thread(s) for compression.
PM: Compressing and saving image data (94052 pages)...
PM: Image saving progress:   0%
atkbd serio0: keyboard reset failed on 1c060000.kmi
atkbd serio1: keyboard reset failed on 1c070000.kmi
PM: Image saving progress:  10%
PM: Image saving progress:  20%
PM: Image saving progress:  30%
PM: Image saving progress:  40%
PM: Image saving progress:  50%
PM: Image saving progress:  60%
PM: Image saving progress:  70%
PM: Image saving progress:  80%
PM: Image saving progress:  90%
PM: Image saving progress: 100%
PM: Image saving done.
PM: Wrote 376208 kbytes in 21.76 seconds (17.28 MB/s)
PM: S|
kvm: exiting hardware virtualization
reboot: Power down

Board powered down, use REBOOT to restart.

Cmd>

ARM V2M-Juno Boot loader v1.0.0
HBI0262 build 1635

ARM V2M_Juno r1 Firmware v1.3.3
Build Date: Mar 31 2015

Time :  08:25:55
Date :  01:02:2016

Press Enter to stop auto boot...
Powering up system...

Switching on ATXPSU...
PMIC RAM configuration (pms_v104.bin)...
MBtemp   : 28 degC

Configuring motherboard (rev C, var A)...
IOFPGA image \MB\HBI0262C\io_b117.bit
IOFPGA  config: PASSED
OSC CLK config: PASSED

Configuring SCC registers...
Writing SCC 0x00000054 with 0x0007FFFE
Writing SCC 0x0000005C with 0x00FE001E
Writing SCC 0x00000100 with 0x003F1000
Writing SCC 0x00000104 with 0x0001F300
Writing SCC 0x00000108 with 0x00331000
Writing SCC 0x0000010C with 0x00019300
Writing SCC 0x00000118 with 0x003F1000
Writing SCC 0x0000011C with 0x0001F100
Writing SCC 0x000000F4 with 0x00000018
Writing SCC 0x000000F8 with 0x0BEC0000
Writing SCC 0x000000FC with 0xABE40000
Writing SCC 0x00000A14 with 0x00000000
Writing SCC 0x0000000C with 0x000000C2
Writing SCC 0x00000010 with 0x000000C2

Peripheral ID0:0x000000AD
Peripheral ID1:0x000000B0
Peripheral ID2:0x0000001B
Peripheral ID3:0x00000000
Peripheral ID4:0x0000000D
Peripheral ID5:0x000000F0
Peripheral ID6:0x00000005
Peripheral ID7:0x000000B1

Programming NOR Flash
PCIE clock configured...

Testing motherboard interfaces (FPGA build 117)...
SRAM 32MB test: PASSED
LAN9118   test: PASSED
KMI1/2    test: PASSED
MMC       test: PASSED
PB/LEDs   test: PASSED
FPGA UART test: PASSED
PCIe init test: PASSED
MAC addrs test: PASSED

The default boot selection will start in   1 seconds
Downloading the file <Image> from the TFTP server
[=======================================>]    9656 Kb
EFI stub: Booting Linux Kernel...
EFI stub: Using DTB from configuration table
EFI stub: Exiting boot services and installing virtual address map...
Booting Linux on physical CPU 0x100
Linux version 4.5.0-rc1+ (morse@melchizedek) (gcc version 4.9.3 20141031 (prerelease) (Linaro GCC 2014.11) ) #1839 SMP PREEMPT Fri Jan 29 15:05:57 GMT 2016
Boot CPU: AArch64 Processor [410fd033]
earlycon: Early serial console at MMIO 0x7ff80000 (options '')
bootconsole [uart0] enabled
efi: Getting EFI parameters from FDT:
EFI v2.50 by ARM Juno EFI Nov 24 2015 12:36:35
efi:  ACPI=0xf95b0000  ACPI 2.0=0xf95b0014  PROP=0xfe8db4d8
cma: Reserved 16 MiB at 0x00000000fd800000
psci: probing for conduit method from DT.
psci: PSCIv1.0 detected in firmware.
psci: Using standard PSCI v0.2 function IDs
psci: Trusted OS migration not required
PERCPU: Embedded 20 pages/cpu @ffff80097ff4f000 s42240 r8192 d31488 u81920
Detected VIPT I-cache on CPU0
CPU features: enabling workaround for ARM erratum 845719
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 2060048
Kernel command line: console=ttyAMA0,115200 earlycon=pl011,0x7ff80000 root=/dev/nfs nfsroot=10.xx.xx.xx:xx/aarch64-current resume=/dev/sda2 no_console_suspend ip=dhcp rw init=/bin/bash crashkernel=256M maxcpus=1
log_buf_len individual max cpu contribution: 4096 bytes
log_buf_len total cpu_extra contributions: 20480 bytes
log_buf_len min size: 16384 bytes
log_buf_len: 65536 bytes
early log buf free: 14304(87%)
PID hash table entries: 4096 (order: 3, 32768 bytes)
Dentry cache hash table entries: 1048576 (order: 11, 8388608 bytes)
Inode-cache hash table entries: 524288 (order: 10, 4194304 bytes)
software IO TLB [mem 0xf54a0000-0xf94a0000] (64MB) mapped at [ffff8000754a0000-ffff80007949ffff]
Memory: 8130452K/8371264K available (6007K kernel code, 524K rwdata, 2524K rodata, 592K init, 229K bss, 224428K reserved, 16384K cma-reserved)
Virtual kernel memory layout:
    vmalloc : 0xffff000000000000 - 0xffff7bffbfff0000   (126974 GB)
    vmemmap : 0xffff7bffc0000000 - 0xffff7fffc0000000   (  4096 GB maximum)
              0xffff7bffc2000000 - 0xffff7bffe8000000   (   608 MB actual)
    fixed   : 0xffff7ffffa7fd000 - 0xffff7ffffac00000   (  4108 KB)
    PCI I/O : 0xffff7ffffae00000 - 0xffff7ffffbe00000   (    16 MB)
    modules : 0xffff7ffffc000000 - 0xffff800000000000   (    64 MB)
    memory  : 0xffff800000000000 - 0xffff800980000000   ( 38912 MB)
      .init : 0xffff8000008d7000 - 0xffff80000096b000   (   592 KB)
      .text : 0xffff800000080000 - 0xffff8000008d6c64   (  8540 KB)
      .data : 0xffff80000096b000 - 0xffff8000009ee200   (   525 KB)
SLUB: HWalign=64, Order=0-3, MinObjects=0, CPUs=6, Nodes=1
Preemptible hierarchical RCU implementation.
        Build-time adjustment of leaf fanout to 64.
        RCU restricting CPUs from NR_CPUS=64 to nr_cpu_ids=6.
RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=6
NR_IRQS:64 nr_irqs:64 0
GIC: Using split EOI/Deactivate mode
GICv2m: range[mem 0x2c1c0000-0x2c1c0fff], SPI[224:255]
Architected cp15 and mmio timer(s) running at 50.00MHz (phys/phys).
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0xb8812736b, max_idle_ns: 440795202655 ns
sched_clock: 56 bits at 50MHz, resolution 20ns, wraps every 4398046511100ns
Console: colour dummy device 80x25
Calibrating delay loop (skipped), value calculated using timer frequency.. 100.00 BogoMIPS (lpj=200000)
pid_max: default: 32768 minimum: 301
Security Framework initialized
Mount-cache hash table entries: 16384 (order: 5, 131072 bytes)
Mountpoint-cache hash table entries: 16384 (order: 5, 131072 bytes)
ASID allocator initialised with 65536 entries
Remapping and enabling EFI services.
 EFI remap 0x0000000008000000 => 0000000020000000
  EFI remap 0x000000001c170000 => 0000000024000000
  EFI remap 0x00000000f94a0000 => 0000000024010000
  EFI remap 0x00000000f9520000 => 0000000024020000
  EFI remap 0x00000000f9530000 => 0000000024030000
  EFI remap 0x00000000f9540000 => 0000000024040000
  EFI remap 0x00000000f9570000 => 0000000024070000
  EFI remap 0x00000000f9580000 => 0000000024080000
  EFI remap 0x00000000f95c0000 => 00000000240a0000
  EFI remap 0x00000000f9640000 => 0000000024120000
  EFI remap 0x00000000f9650000 => 0000000024130000
  EFI remap 0x00000000f9720000 => 0000000024200000
  EFI remap 0x00000000f9730000 => 0000000024210000
  EFI remap 0x00000000f9740000 => 0000000024220000
  EFI remap 0x00000000f9780000 => 0000000024260000
  EFI remap 0x00000000f9790000 => 0000000024270000
  EFI remap 0x00000000f9800000 => 00000000242e0000
  EFI remap 0x00000000f9810000 => 00000000242f0000
  EFI remap 0x00000000f9820000 => 0000000024300000
  EFI remap 0x00000000fe820000 => 0000000024330000
  EFI remap 0x00000000fe830000 => 0000000024340000
  EFI remap 0x00000000fe840000 => 0000000024350000
  EFI remap 0x00000000fe870000 => 0000000024370000
Brought up 1 CPUs
SMP: Total of 1 processors activated.
CPU: All CPU(s) started at EL2
alternatives: patching kernel code
devtmpfs: initialized
DMI not present or invalid.
clocksource: jiffies: mask: 0xffffffff max_cycles: 0xffffffff, max_idle_ns: 7645041785100000 ns
pinctrl core: initialized pinctrl subsystem
NET: Registered protocol family 16
cpuidle: using governor ladder
cpuidle: using governor menu
vdso: 2 pages (1 code @ ffff800000972000, 1 data @ ffff800000971000)
hw-breakpoint: found 6 breakpoint and 4 watchpoint registers.
DMA: preallocated 256 KiB pool for atomic allocations
Serial: AMBA PL011 UART driver
7ff80000.uart: ttyAMA0 at MMIO 0x7ff80000 (irq = 25, base_baud = 0) is a PL011 rev3
console [ttyAMA0] enabled
console [ttyAMA0] enabled
bootconsole [uart0] disabled
bootconsole [uart0] disabled
HugeTLB registered 2 MB page size, pre-allocated 0 pages
ACPI: Interpreter disabled.
vgaarb: loaded
SCSI subsystem initialized
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
dmi: Firmware registration failed.
clocksource: Switched to clocksource arch_sys_counter
VFS: Disk quotas dquot_6.6.0
VFS: Dquot-cache hash table entries: 512 (order 0, 4096 bytes)
pnp: PnP ACPI: disabled
NET: Registered protocol family 2
TCP established hash table entries: 65536 (order: 7, 524288 bytes)
TCP bind hash table entries: 65536 (order: 8, 1048576 bytes)
TCP: Hash tables configured (established 65536 bind 65536)
UDP hash table entries: 4096 (order: 5, 131072 bytes)
UDP-Lite hash table entries: 4096 (order: 5, 131072 bytes)
NET: Registered protocol family 1
RPC: Registered named UNIX socket transport module.
RPC: Registered udp transport module.
RPC: Registered tcp transport module.
RPC: Registered tcp NFSv4.1 backchannel transport module.
hw perfevents: enabled with armv8_pmuv3 PMU driver, 7 counters available
kvm [1]: interrupt-controller@2c04f000 IRQ14
kvm [1]: timer IRQ4
kvm [1]: 8-bit VMID
kvm [1]: Hyp mode initialized successfully
futex hash table entries: 2048 (order: 6, 262144 bytes)
audit: initializing netlink subsys (disabled)
audit: type=2000 audit(0.432:1): initialized
NFS: Registering the id_resolver key type
Key type id_resolver registered
Key type id_legacy registered
fuse init (API version 7.24)
9p: Installing v9fs 9p2000 file system support
io scheduler noop registered
io scheduler cfq registered (default)
PCI host bridge /pcie-controller@30000000 ranges:
   IO 0x5f800000..0x5fffffff -> 0x5f800000
  MEM 0x50000000..0x57ffffff -> 0x50000000
  MEM 0x4000000000..0x40ffffffff -> 0x4000000000
pci-host-generic 40000000.pcie-controller: PCI host bridge to bus 0000:00
pci_bus 0000:00: root bus resource [bus 00-ff]
pci_bus 0000:00: root bus resource [io  0x0000-0x7fffff] (bus address [0x5f800000-0x5fffffff])
pci_bus 0000:00: root bus resource [mem 0x50000000-0x57ffffff]
pci_bus 0000:00: root bus resource [mem 0x4000000000-0x40ffffffff pref]
pci 0000:03:00.0: reg 0x20: initial BAR value 0x00000000 invalid
pci 0000:03:00.0: disabling ASPM on pre-1.1 PCIe device.  You can enable it with 'pcie_aspm=force'
pci 0000:08:00.0: reg 0x18: initial BAR value 0x00000000 invalid
pci 0000:00:00.0: BAR 8: assigned [mem 0x50000000-0x501fffff]
pci 0000:00:00.0: BAR 0: assigned [mem 0x4000000000-0x4000003fff 64bit pref]
pci 0000:00:00.0: BAR 7: assigned [io  0x1000-0x2fff]
pci 0000:01:00.0: BAR 8: assigned [mem 0x50000000-0x501fffff]
pci 0000:01:00.0: BAR 7: assigned [io  0x1000-0x2fff]
pci 0000:02:01.0: BAR 8: assigned [mem 0x50000000-0x500fffff]
pci 0000:02:1f.0: BAR 8: assigned [mem 0x50100000-0x501fffff]
pci 0000:02:01.0: BAR 7: assigned [io  0x1000-0x1fff]
pci 0000:02:1f.0: BAR 7: assigned [io  0x2000-0x2fff]
pci 0000:03:00.0: BAR 6: assigned [mem 0x50000000-0x5007ffff pref]
pci 0000:03:00.0: BAR 2: assigned [mem 0x50080000-0x50083fff 64bit]
pci 0000:03:00.0: BAR 0: assigned [mem 0x50084000-0x5008407f 64bit]
pci 0000:03:00.0: BAR 4: assigned [io  0x1000-0x107f]
pci 0000:02:01.0: PCI bridge to [bus 03]
pci 0000:02:01.0:   bridge window [io  0x1000-0x1fff]
pci 0000:02:01.0:   bridge window [mem 0x50000000-0x500fffff]
pci 0000:02:02.0: PCI bridge to [bus 04]
pci 0000:02:03.0: PCI bridge to [bus 05]
pci 0000:02:0c.0: PCI bridge to [bus 06]
pci 0000:02:10.0: PCI bridge to [bus 07]
pci 0000:08:00.0: BAR 0: assigned [mem 0x50100000-0x50103fff 64bit]
pci 0000:08:00.0: BAR 2: assigned [io  0x2000-0x20ff]
pci 0000:02:1f.0: PCI bridge to [bus 08]
pci 0000:02:1f.0:   bridge window [io  0x2000-0x2fff]
pci 0000:02:1f.0:   bridge window [mem 0x50100000-0x501fffff]
pci 0000:01:00.0: PCI bridge to [bus 02-08]
pci 0000:01:00.0:   bridge window [io  0x1000-0x2fff]
pci 0000:01:00.0:   bridge window [mem 0x50000000-0x501fffff]
pci 0000:00:00.0: PCI bridge to [bus 01-08]
pci 0000:00:00.0:   bridge window [io  0x1000-0x2fff]
pci 0000:00:00.0:   bridge window [mem 0x50000000-0x501fffff]
pcieport 0000:00:00.0: Signaling PME through PCIe PME interrupt
pci 0000:01:00.0: Signaling PME through PCIe PME interrupt
pci 0000:02:01.0: Signaling PME through PCIe PME interrupt
pci 0000:03:00.0: Signaling PME through PCIe PME interrupt
pci 0000:02:02.0: Signaling PME through PCIe PME interrupt
pci 0000:02:03.0: Signaling PME through PCIe PME interrupt
pci 0000:02:0c.0: Signaling PME through PCIe PME interrupt
pci 0000:02:10.0: Signaling PME through PCIe PME interrupt
pci 0000:02:1f.0: Signaling PME through PCIe PME interrupt
pci 0000:08:00.0: Signaling PME through PCIe PME interrupt
xenfs: not registering filesystem on non-xen platform
Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
msm_serial: driver initialized
loop: module loaded
tun: Universal TUN/TAP device driver, 1.6
tun: (C) 1999-2004 Max Krasnyansky <maxk@qualcomm.com>
sky2: driver version 1.30
sky2 0000:08:00.0: enabling device (0000 -> 0003)
sky2 0000:08:00.0: Yukon-2 UL 2 chip revision 0
sky2 0000:08:00.0 (unnamed net_device) (uninitialized): Invalid MAC address, defaulting to random
sky2 0000:08:00.0 eth0: addr c6:94:53:3d:70:00
libphy: smsc911x-mdio: probed
Generic PHY 18000000.etherne:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=18000000.etherne:01, irq=-1)
smsc911x 18000000.ethernet eth1: MAC Address: 00:02:f7:00:60:c5
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci-pci: EHCI PCI platform driver
ehci-platform: EHCI generic platform driver
ehci-platform 7ffc0000.ehci: EHCI Host Controller
ehci-platform 7ffc0000.ehci: new USB bus registered, assigned bus number 1
ehci-platform 7ffc0000.ehci: irq 28, io mem 0x7ffc0000
ehci-platform 7ffc0000.ehci: USB 2.0 started, EHCI 1.00
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 1 port detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci-pci: OHCI PCI platform driver
ohci-platform: OHCI generic platform driver
ohci-platform 7ffb0000.ohci: Generic Platform OHCI controller
ohci-platform 7ffb0000.ohci: new USB bus registered, assigned bus number 2
ohci-platform 7ffb0000.ohci: irq 27, io mem 0x7ffb0000
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 1 port detected
usbcore: registered new interface driver usb-storage
mousedev: PS/2 mouse device common for all mice
rtc-efi rtc-efi: rtc core: registered rtc-efi as rtc0
mmci-pl18x 1c050000.mmci: mmc0: PL180 manf 41 rev0 at 0x1c050000 irq 33,0 (pio)
mmci-pl18x 1c050000.mmci: DMA channels RX none, TX none
sdhci: Secure Digital Host Controller Interface driver
sdhci: Copyright(c) Pierre Ossman
Synopsys Designware Multimedia Card Interface Driver
sdhci-pltfm: SDHCI platform and OF driver helper
ledtrig-cpu: registered to indicate activity on CPUs
EFI Variables Facility v0.08 2004-May-17
usbcore: registered new interface driver usbhid
usbhid: USB HID core driver
NET: Registered protocol family 17
9pnet: Installing 9P2000 support
Key type dns_resolver registered
registered taskstats version 1
rtc-efi rtc-efi: setting system clock to 2016-02-01 08:26:43 UTC (1454315203)
sky2 0000:08:00.0 eth0: enabling interface
smsc911x 18000000.ethernet eth1: SMSC911x/921x identified at 0xffff000000120000, IRQ: 31
usb 1-1: new high-speed USB device number 2 using ehci-platform
hub 1-1:1.0: USB hub found
hub 1-1:1.0: 4 ports detected
usb 1-1.2: new high-speed USB device number 3 using ehci-platform
usb-storage 1-1.2:1.0: USB Mass Storage device detected
scsi host0: usb-storage 1-1.2:1.0
atkbd serio0: keyboard reset failed on 1c060000.kmi
scsi 0:0:0:0: Direct-Access     TOSHIBA  TransMemory      1.00 PQ: 0 ANSI: 4
sd 0:0:0:0: [sda] 15155200 512-byte logical blocks: (7.76 GB/7.23 GiB)
sd 0:0:0:0: [sda] Write Protect is off
sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2
sd 0:0:0:0: [sda] Attached SCSI removable disk
atkbd serio1: keyboard reset failed on 1c070000.kmi
Sending DHCP requests ., OK
IP-Config: Got DHCP answer from 10.xx.xx.xx, my address is 10.xx.xx.xx
sky2 0000:08:00.0 eth0: disabling interface
IP-Config: Complete:
     device=eth1, hwaddr=00:02:f7:xx:xx:xx, ipaddr=10.xx.xx.xx, mask=255.255.255.0, gw=10.xx.xx.xx
     host=10.xx.xx.xx, domain=cambridge.arm.com, nis-domain=(none)
     bootserver=0.0.0.0, rootserver=10.xx.xx.xx, rootpath=     nameserver0=10.xx.xx.xx, nameserver1=10.xx.xx.xx
Freezing user space processes ... (elapsed 0.000 seconds) done.
PM: Using 1 thread(s) for decompression.
PM: Loading and decompressing image data (94052 pages)...
PM: Image loading progress:   0%
PM: Image loading progress:  10%
PM: Image loading progress:  20%
PM: Image loading progress:  30%
PM: Image loading progress:  40%
PM: Image loading progress:  50%
PM: Image loading progress:  60%
PM: Image loading progress:  70%
PM: Image loading progress:  80%
PM: Image loading progress:  90%
PM: Image loading progress: 100%
PM: Image loading done.
PM: Read 376208 kbytes in 7.09 seconds (53.06 MB/s)
PM: quiesce of devices complete after 16.499 msecs
PM: late quiesce of devices complete after 0.938 msecs
PM: noirq quiesce of devices complete after 11.044 msecs
Disabling non-boot CPUs ...
PM: noirq restore of devices complete after 2.397 msecs
PM: early restore of devices complete after 0.917 msecs
serio serio0: device_attach() failed for 1c060000.kmi (1c060000.kmi), error: -517
serio serio1: device_attach() failed for 1c070000.kmi (1c070000.kmi), error: -517
PM: restore of devices complete after 93.241 msecs
serio serio0: device_attach() failed for 1c060000.kmi (1c060000.kmi), error: -517
Restarting tasks ... done.
atkbd serio0: keyboard reset failed on 1c060000.kmi
atkbd serio1: keyboard reset failed on 1c070000.kmi
atkbd serio1: keyboard reset failed on 1c070000.kmi
root@localhost:/sys/power#
root@localhost:/sys/power# uptime
 08:27:00 up 2 days, 16:57,  2 users,  load average: 0.37, 0.14, 0.09
root@localhost:/sys/power#


* [PATCH v4 05/13] arm64: kvm: allows kvm cpu hotplug
  2016-01-28 10:42 ` [PATCH v4 05/13] arm64: kvm: allows kvm cpu hotplug James Morse
@ 2016-02-02  6:46   ` AKASHI Takahiro
  0 siblings, 0 replies; 34+ messages in thread
From: AKASHI Takahiro @ 2016-02-02  6:46 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/28/2016 07:42 PM, James Morse wrote:
> From: AKASHI Takahiro <takahiro.akashi@linaro.org>
>
> The current kvm implementation on arm64 does cpu-specific initialization
> at system boot, and has no way to gracefully shutdown a core in terms of
> kvm. This prevents kexec from rebooting the system at EL2.
>
> This patch adds a cpu tear-down function and also puts an existing cpu-init
> code into a separate function, kvm_arch_hardware_disable() and
> kvm_arch_hardware_enable() respectively.
> We don't need the arm64 specific cpu hotplug hook any more.
>
> Since this patch modifies common code between arm and arm64, one stub
> definition, __cpu_reset_hyp_mode(), is added on arm side to avoid
> compilation errors.
>
> Signed-off-by: AKASHI Takahiro <takahiro.akashi@linaro.org>
> [Moved __kvm_hyp_reset() to use kvm_call_hyp(), instead of having its own
>   dedicated entry point in el1_sync. Added some comments and a tlbi.]
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> This patch is from v13 of kexec, see my [changes] above.
>
>   arch/arm/include/asm/kvm_host.h   | 10 +++-
>   arch/arm/include/asm/kvm_mmu.h    |  1 +
>   arch/arm/kvm/arm.c                | 98 ++++++++++++++++++++++++---------------
>   arch/arm/kvm/mmu.c                |  5 ++
>   arch/arm64/include/asm/kvm_host.h |  1 -
>   arch/arm64/include/asm/kvm_mmu.h  | 19 ++++++++
>   arch/arm64/kvm/hyp-init.S         | 42 +++++++++++++++++
>   7 files changed, 136 insertions(+), 40 deletions(-)
>
> diff --git a/arch/arm/include/asm/kvm_host.h b/arch/arm/include/asm/kvm_host.h
> index f9f27792d8ed..8af531d64771 100644
> --- a/arch/arm/include/asm/kvm_host.h
> +++ b/arch/arm/include/asm/kvm_host.h
> @@ -220,6 +220,15 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>   	kvm_call_hyp((void*)hyp_stack_ptr, vector_ptr, pgd_ptr);
>   }
>
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	/*
> +	 * TODO
> +	 * kvm_call_reset(boot_pgd_ptr, phys_idmap_start);
> +	 */
> +}
> +
>   static inline int kvm_arch_dev_ioctl_check_extension(long ext)
>   {
>   	return 0;
> @@ -232,7 +241,6 @@ void kvm_mmu_wp_memory_region(struct kvm *kvm, int slot);
>
>   struct kvm_vcpu *kvm_mpidr_to_vcpu(struct kvm *kvm, unsigned long mpidr);
>
> -static inline void kvm_arch_hardware_disable(void) {}
>   static inline void kvm_arch_hardware_unsetup(void) {}
>   static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>   static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm/include/asm/kvm_mmu.h b/arch/arm/include/asm/kvm_mmu.h
> index a520b7987a29..4fd9ddb48c0f 100644
> --- a/arch/arm/include/asm/kvm_mmu.h
> +++ b/arch/arm/include/asm/kvm_mmu.h
> @@ -66,6 +66,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>   phys_addr_t kvm_mmu_get_httbr(void);
>   phys_addr_t kvm_mmu_get_boot_httbr(void);
>   phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>   int kvm_mmu_init(void);
>   void kvm_clear_hyp_idmap(void);
>
> diff --git a/arch/arm/kvm/arm.c b/arch/arm/kvm/arm.c
> index dda1959f0dde..f060567e9c0a 100644
> --- a/arch/arm/kvm/arm.c
> +++ b/arch/arm/kvm/arm.c
> @@ -16,7 +16,6 @@
>    * Foundation, 51 Franklin Street, Fifth Floor, Boston, MA  02110-1301, USA.
>    */
>
> -#include <linux/cpu.h>
>   #include <linux/cpu_pm.h>
>   #include <linux/errno.h>
>   #include <linux/err.h>
> @@ -65,6 +64,8 @@ static DEFINE_SPINLOCK(kvm_vmid_lock);
>
>   static bool vgic_present;
>
> +static DEFINE_PER_CPU(unsigned char, kvm_arm_hardware_enabled);
> +
>   static void kvm_arm_set_running_vcpu(struct kvm_vcpu *vcpu)
>   {
>   	BUG_ON(preemptible());
> @@ -89,11 +90,6 @@ struct kvm_vcpu * __percpu *kvm_get_running_vcpus(void)
>   	return &kvm_arm_running_vcpu;
>   }
>
> -int kvm_arch_hardware_enable(void)
> -{
> -	return 0;
> -}
> -
>   int kvm_arch_vcpu_should_kick(struct kvm_vcpu *vcpu)
>   {
>   	return kvm_vcpu_exiting_guest_mode(vcpu) == IN_GUEST_MODE;
> @@ -585,7 +581,13 @@ int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
>   		/*
>   		 * Re-check atomic conditions
>   		 */
> -		if (signal_pending(current)) {
> +		if (unlikely(!__this_cpu_read(kvm_arm_hardware_enabled))) {
> +			/* cpu has been torn down */
> +			ret = 0;
> +			run->exit_reason = KVM_EXIT_FAIL_ENTRY;
> +			run->fail_entry.hardware_entry_failure_reason
> +					= (u64)-ENOEXEC;
> +		} else if (signal_pending(current)) {
>   			ret = -EINTR;
>   			run->exit_reason = KVM_EXIT_INTR;
>   		}
> @@ -967,7 +969,7 @@ long kvm_arch_vm_ioctl(struct file *filp,
>   	}
>   }
>
> -static void cpu_init_hyp_mode(void *dummy)
> +static void cpu_init_hyp_mode(void)
>   {
>   	phys_addr_t boot_pgd_ptr;
>   	phys_addr_t pgd_ptr;
> @@ -989,36 +991,61 @@ static void cpu_init_hyp_mode(void *dummy)
>   	kvm_arm_init_debug();
>   }
>
> -static int hyp_init_cpu_notify(struct notifier_block *self,
> -			       unsigned long action, void *cpu)
> +static void cpu_reset_hyp_mode(void)
>   {
> -	switch (action) {
> -	case CPU_STARTING:
> -	case CPU_STARTING_FROZEN:
> -		if (__hyp_get_vectors() == hyp_default_vectors)
> -			cpu_init_hyp_mode(NULL);
> -		break;
> +	phys_addr_t boot_pgd_ptr;
> +	phys_addr_t phys_idmap_start;
> +
> +	boot_pgd_ptr = kvm_mmu_get_boot_httbr();
> +	phys_idmap_start = kvm_get_idmap_start();
> +
> +	__cpu_reset_hyp_mode(boot_pgd_ptr, phys_idmap_start);
> +}
> +
> +int kvm_arch_hardware_enable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled)) {
> +		cpu_init_hyp_mode();
> +		__this_cpu_write(kvm_arm_hardware_enabled, 1);
>   	}
>
> -	return NOTIFY_OK;
> +	return 0;
>   }
>
> -static struct notifier_block hyp_init_cpu_nb = {
> -	.notifier_call = hyp_init_cpu_notify,
> -};
> +void kvm_arch_hardware_disable(void)
> +{
> +	if (!__this_cpu_read(kvm_arm_hardware_enabled))
> +		return;
> +
> +	cpu_reset_hyp_mode();
> +	__this_cpu_write(kvm_arm_hardware_enabled, 0);
> +}
>
>   #ifdef CONFIG_CPU_PM
>   static int hyp_init_cpu_pm_notifier(struct notifier_block *self,
>   				    unsigned long cmd,
>   				    void *v)
>   {
> -	if (cmd == CPU_PM_EXIT &&
> -	    __hyp_get_vectors() == hyp_default_vectors) {
> -		cpu_init_hyp_mode(NULL);
> +	/*
> +	 * kvm_arm_hardware_enabled is left with its old value over
> +	 * PM_ENTER->PM_EXIT. It is used to indicate PM_EXIT should
> +	 * re-enable hyp.
> +	 */
> +	switch (cmd) {
> +	case CPU_PM_ENTER:
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			cpu_reset_hyp_mode();
> +
> +		return NOTIFY_OK;
> +	case CPU_PM_EXIT:
> +		if (__this_cpu_read(kvm_arm_hardware_enabled))
> +			cpu_init_hyp_mode();
> +
>   		return NOTIFY_OK;
> -	}
>
> -	return NOTIFY_DONE;
> +	default:
> +		return NOTIFY_DONE;
> +	}
>   }
>
>   static struct notifier_block hyp_init_cpu_pm_nb = {
> @@ -1122,14 +1149,20 @@ static int init_hyp_mode(void)
>   	}
>
>   	/*
> -	 * Execute the init code on each CPU.
> +	 * Init this CPU temporarily to execute kvm_hyp_call()
> +	 * during kvm_vgic_hyp_init().
>   	 */
> -	on_each_cpu(cpu_init_hyp_mode, NULL, 1);
> +	preempt_disable();
> +	cpu_init_hyp_mode();
>
>   	/*
>   	 * Init HYP view of VGIC
>   	 */
>   	err = kvm_vgic_hyp_init();
> +
> +	cpu_reset_hyp_mode();
> +	preempt_enable();
> +
>   	switch (err) {
>   	case 0:
>   		vgic_present = true;
> @@ -1213,26 +1246,15 @@ int kvm_arch_init(void *opaque)
>   		}
>   	}
>
> -	cpu_notifier_register_begin();
> -
>   	err = init_hyp_mode();
>   	if (err)
>   		goto out_err;
>
> -	err = __register_cpu_notifier(&hyp_init_cpu_nb);
> -	if (err) {
> -		kvm_err("Cannot register HYP init CPU notifier (%d)\n", err);
> -		goto out_err;
> -	}
> -
> -	cpu_notifier_register_done();
> -
>   	hyp_cpu_pm_init();
>
>   	kvm_coproc_table_init();
>   	return 0;
>   out_err:
> -	cpu_notifier_register_done();
>   	return err;
>   }
>
> diff --git a/arch/arm/kvm/mmu.c b/arch/arm/kvm/mmu.c
> index aba61fd3697a..7a3aed62499a 100644
> --- a/arch/arm/kvm/mmu.c
> +++ b/arch/arm/kvm/mmu.c
> @@ -1643,6 +1643,11 @@ phys_addr_t kvm_get_idmap_vector(void)
>   	return hyp_idmap_vector;
>   }
>
> +phys_addr_t kvm_get_idmap_start(void)
> +{
> +	return hyp_idmap_start;
> +}
> +
>   int kvm_mmu_init(void)
>   {
>   	int err;
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 689d4c95e12f..7d6d75616fb5 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -332,7 +332,6 @@ static inline void __cpu_init_hyp_mode(phys_addr_t boot_pgd_ptr,
>   		     hyp_stack_ptr, vector_ptr);
>   }
>
> -static inline void kvm_arch_hardware_disable(void) {}
>   static inline void kvm_arch_hardware_unsetup(void) {}
>   static inline void kvm_arch_sync_events(struct kvm *kvm) {}
>   static inline void kvm_arch_vcpu_uninit(struct kvm_vcpu *vcpu) {}
> diff --git a/arch/arm64/include/asm/kvm_mmu.h b/arch/arm64/include/asm/kvm_mmu.h
> index 736433912a1e..1d48208a904a 100644
> --- a/arch/arm64/include/asm/kvm_mmu.h
> +++ b/arch/arm64/include/asm/kvm_mmu.h
> @@ -99,6 +99,7 @@ void kvm_mmu_free_memory_caches(struct kvm_vcpu *vcpu);
>   phys_addr_t kvm_mmu_get_httbr(void);
>   phys_addr_t kvm_mmu_get_boot_httbr(void);
>   phys_addr_t kvm_get_idmap_vector(void);
> +phys_addr_t kvm_get_idmap_start(void);
>   int kvm_mmu_init(void);
>   void kvm_clear_hyp_idmap(void);
>
> @@ -310,5 +311,23 @@ static inline unsigned int kvm_get_vmid_bits(void)
>   	return (cpuid_feature_extract_field(reg, ID_AA64MMFR1_VMIDBITS_SHIFT) == 2) ? 16 : 8;
>   }
>
> +void __kvm_hyp_reset(phys_addr_t boot_pgd_ptr, phys_addr_t phys_idmap_start);
> +
> +/*
> + * Call reset code, and switch back to stub hyp vectors. We need to execute
> + * __kvm_hyp_reset() from the trampoline page, we calculate its address here.
> + */
> +static inline void __cpu_reset_hyp_mode(phys_addr_t boot_pgd_ptr,
> +					phys_addr_t phys_idmap_start)
> +{
> +	unsigned long trampoline_hyp_reset;
> +
> +	trampoline_hyp_reset = TRAMPOLINE_VA +
> +			       ((unsigned long)__kvm_hyp_reset & ~PAGE_MASK);
> +
> +	kvm_call_hyp((void *)trampoline_hyp_reset,
> +		     boot_pgd_ptr, phys_idmap_start);
> +}
> +

I want to place this definition in kvm_host.h, alongside its counterpart __cpu_init_hyp_mode().

-Takahiro AKASHI

>   #endif /* __ASSEMBLY__ */
>   #endif /* __ARM64_KVM_MMU_H__ */
> diff --git a/arch/arm64/kvm/hyp-init.S b/arch/arm64/kvm/hyp-init.S
> index dc6335a7353e..d20d86c7f9d8 100644
> --- a/arch/arm64/kvm/hyp-init.S
> +++ b/arch/arm64/kvm/hyp-init.S
> @@ -150,6 +150,48 @@ merged:
>   	eret
>   ENDPROC(__kvm_hyp_init)
>
> +	/*
> +	 * x0: HYP boot pgd
> +	 * x1: HYP phys_idmap_start
> +	 */
> +ENTRY(__kvm_hyp_reset)
> +	/*
> +	 * Retrieve lr from the stack (pushed by el1_sync()), so we can eret
> +	 * from here.
> +	 */
> +	ldp	lr, xzr, [sp], #16
> +
> +	/* We're in trampoline code in VA, switch back to boot page tables */
> +	msr	ttbr0_el2, x0
> +	isb
> +
> +	/* Ensure the PA branch doesn't find a stale tlb entry. */
> +	tlbi	alle2
> +	dsb	sy
> +
> +	/* Branch into PA space */
> +	adr	x0, 1f
> +	bfi	x1, x0, #0, #PAGE_SHIFT
> +	br	x1
> +
> +	/* We're now in idmap, disable MMU */
> +1:	mrs	x0, sctlr_el2
> +	ldr	x1, =SCTLR_ELx_FLAGS
> +	bic	x0, x0, x1		// Clear SCTL_M and etc
> +	msr	sctlr_el2, x0
> +	isb
> +
> +	/* Invalidate the old TLBs */
> +	tlbi	alle2
> +	dsb	sy
> +
> +	/* Install stub vectors */
> +	adr_l	x0, __hyp_stub_vectors
> +	msr	vbar_el2, x0
> +
> +	eret
> +ENDPROC(__kvm_hyp_reset)
> +
>   	.ltorg
>
>   	.popsection
>

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 04/13] arm64: Add new hcall HVC_CALL_FUNC
  2016-01-28 10:42 ` [PATCH v4 04/13] arm64: Add new hcall HVC_CALL_FUNC James Morse
@ 2016-02-02  6:53   ` AKASHI Takahiro
  0 siblings, 0 replies; 34+ messages in thread
From: AKASHI Takahiro @ 2016-02-02  6:53 UTC (permalink / raw)
  To: linux-arm-kernel

On 01/28/2016 07:42 PM, James Morse wrote:
> From: Geoff Levand <geoff@infradead.org>
>
> Add the new hcall HVC_CALL_FUNC that allows execution of a function at EL2.
> During CPU reset the CPU must be brought to the exception level it had on
> entry to the kernel.  The HVC_CALL_FUNC hcall will provide the mechanism
> needed for this exception level switch.
>
> To allow the HVC_CALL_FUNC exception vector to work without a stack, which
> is needed to support an hcall at CPU reset, this implementation uses
> register x18 to store the link register across the caller provided
> function.  This dictates that the caller provided function must preserve
> the contents of register x18.
>
> Signed-off-by: Geoff Levand <geoff@infradead.org>
> Signed-off-by: James Morse <james.morse@arm.com>
> ---
> This patch is from v13 of kexec
>
>   arch/arm64/include/asm/virt.h | 13 +++++++++++++
>   arch/arm64/kernel/hyp-stub.S  | 13 ++++++++++++-
>   2 files changed, 25 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm64/include/asm/virt.h b/arch/arm64/include/asm/virt.h
> index eb10368c329e..30700961f28c 100644
> --- a/arch/arm64/include/asm/virt.h
> +++ b/arch/arm64/include/asm/virt.h
> @@ -45,6 +45,19 @@
>
>   #define HVC_SET_VECTORS 2
>
> +/*
> + * HVC_CALL_FUNC - Execute a function at EL2.
> + *
> + * @x0: Physical address of the function to be executed.
> + * @x1: Passed as the first argument to the function.
> + * @x2: Passed as the second argument to the function.
> + * @x3: Passed as the third argument to the function.
> + *
> + * The called function must preserve the contents of register x18.
> + */
> +
> +#define HVC_CALL_FUNC 3
> +
>   #define BOOT_CPU_MODE_EL1	(0xe11)
>   #define BOOT_CPU_MODE_EL2	(0xe12)
>
> diff --git a/arch/arm64/kernel/hyp-stub.S b/arch/arm64/kernel/hyp-stub.S
> index 017ab519aaf1..e8febe90c036 100644
> --- a/arch/arm64/kernel/hyp-stub.S
> +++ b/arch/arm64/kernel/hyp-stub.S
> @@ -67,8 +67,19 @@ el1_sync:
>   	b	2f
>
>   1:	cmp	x18, #HVC_SET_VECTORS
> -	b.ne	2f
> +	b.ne	1f
>   	msr	vbar_el2, x0
> +	b	2f
> +
> +1:	cmp	x18, #HVC_CALL_FUNC

I think we should avoid reusing the same label name ("1") within such a short distance.
(I know it's Geoff's code.)

-Takahiro AKASHI

> +	b.ne	2f
> +	mov	x18, lr
> +	mov	lr, x0
> +	mov	x0, x1
> +	mov	x1, x2
> +	mov	x2, x3
> +	blr	lr
> +	mov	lr, x18
>
>   2:	eret
>   ENDPROC(el1_sync)
>
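
For illustration, one way to address the duplicated-label concern is to rename the second local label — a hedged sketch of the same hunk, not what was actually merged:

```asm
1:	cmp	x18, #HVC_SET_VECTORS
	b.ne	3f			// was "1f"; now unambiguous
	msr	vbar_el2, x0
	b	2f

3:	cmp	x18, #HVC_CALL_FUNC	// was a second "1:" label
	b.ne	2f
	mov	x18, lr			// stash lr: no stack at this point
	mov	lr, x0			// x0 holds the function to call
	mov	x0, x1			// shuffle args down into x0-x2
	mov	x1, x2
	mov	x2, x3
	blr	lr
	mov	lr, x18			// restore lr before eret

2:	eret
```

GNU as local labels are reused by numeric reference (1f/1b), so the original code is legal; the rename is purely for readability.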

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-02-01  8:53     ` James Morse
@ 2016-02-03  0:42       ` Kevin Hilman
  -1 siblings, 0 replies; 34+ messages in thread
From: Kevin Hilman @ 2016-02-03  0:42 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, linux-pm, Rafael J. Wysocki,
	Pavel Machek, Marc Zyngier

James Morse <james.morse@arm.com> writes:

> On 29/01/16 22:34, Kevin Hilman wrote:
>> James Morse <james.morse@arm.com> writes:
>> I'd like to help in testing this so I'm just curious which platforms
>> you've been testing this on.  I'm assuming a Juno (r2?), anything else?
>
> That would be great - thanks!
>
> I've done most of the testing on a Juno r1, but also gave it a spin on a
> stray Seattle.

OK, I'm using a very recently arrived Juno R2, and I did get it
working.  I have a few other arm64 boards around that are part of
kernelci.org and will try those as well.

>> Are you testing the resume from cold boot, or just from kexec?
>
> From cold boot. I haven't tried with kexec, but I doubt that's a use-case anyone
> wants as you would resume immediately. (might be interesting for testing though)
>
>
>> For cold boot on Juno, I'm assuming there would be some
>> bootloader/firmware changes needed to find and boot from the hibernation
>> image?
>
> Not at all! Your firmware only needs to support some mechanism for
> turning CPUs off.
>
> If you add 'resume=/dev/sda2' (or wherever your swap partition is located), the
> kernel will check this partition for the hibernate-image signature, if found, it
> will resume from that partition. Otherwise booting is as normal. (there is one
> hoop to jump through to ensure your rootfs hasn't been mounted before the kernel
> starts resume, as you could corrupt it - an initramfs in the kernel is the best
> fix for this).
>
> No firmware changes needed.
>
>> Is that being worked on?  If not Juno, are you aware of any
>> other platforms that could be tested with resume from cold boot?
>
> Any arm64 platform with persistent storage should work. I've been using a swap
> partition on a usb drive.

>> Not knowing the answers to the above, I tested your branch using arm64
>> defconfig + CONFIG_HIBERNATION=y on my Juno and noticed that it didn't
>> stay suspended (full suspend log below) so I'm looking for
>> ideas/recommendations on how to test this.
>
> That trace looks quite normal, (one of mine below[0] for comparison). Any
> failure would have happened after the point you stopped ... did you have a swap
> partition 'on'?

I think I must not have set up swap correctly since after testing again
it's working fine.  Maybe I forgot the 'swapon'?  In any case, being a
little more careful, testing again and things are working fine on Juno.

I'm also using swap on USB storage for now.
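
The swap/resume setup described above can be sketched as follows (device names are examples; this assumes /dev/sda2 is a dedicated swap partition):

```shell
# Illustrative steps only -- adjust the device name for your board.
mkswap /dev/sda2               # initialise the partition as swap (destructive)
swapon /dev/sda2               # enable it so the hibernate image can be written
# Boot with 'resume=/dev/sda2' appended to the kernel command line, then:
echo disk > /sys/power/state   # trigger suspend-to-disk
```

On the next cold boot with the same resume= argument, the kernel checks that partition for the hibernate-image signature and resumes if one is found.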

> 'syscore' will freeze all processes and stop all devices, then create a
> copy of the minimum amount of memory it needs to save. Then it starts
> all the devices again, as it needs to write this image out to swap. This is what
> you are seeing.
>
> Once it has done this it calls poweroff or reboot.

Yeah, I wasn't seeing the call to poweroff, but most likely because it
was failing to fully write the image due to my missing/incorrect swap
setup.

Thanks for the help,

Kevin

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-02-03  0:42       ` Kevin Hilman
@ 2016-02-05 14:18         ` James Morse
  -1 siblings, 0 replies; 34+ messages in thread
From: James Morse @ 2016-02-05 14:18 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, linux-pm, Rafael J. Wysocki,
	Pavel Machek, Marc Zyngier

Hi Kevin,

On 03/02/16 00:42, Kevin Hilman wrote:
> James Morse <james.morse@arm.com> writes:
>> On 29/01/16 22:34, Kevin Hilman wrote:
>>> James Morse <james.morse@arm.com> writes:
>>> I'd like to help in testing this so I'm just curious which platforms
>>> you've been testing this on.  I'm assuming a Juno (r2?), anything else?
>>
>> That would be great - thanks!
>>
>> I've done most of the testing on a Juno r1, but also gave it a spin on a
>> stray Seattle.
> 
> OK, I'm using a very recently arrived Juno R2, and I did get it
> working.  I have a few other arm64 boards around that are part of
> kernelci.org and will try those as well.

Great,
It would be good to have a Tested-by if it works to your satisfaction...


Thanks,

James

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-01-28 10:42 ` [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
@ 2016-02-05 16:26   ` Lorenzo Pieralisi
  2016-02-08  9:03     ` James Morse
  0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Pieralisi @ 2016-02-05 16:26 UTC (permalink / raw)
  To: linux-arm-kernel

Hi James,

On Thu, Jan 28, 2016 at 10:42:40AM +0000, James Morse wrote:
> By enabling the MMU early in cpu_resume(), the sleep_save_sp and stack can
> be accessed by VA, which avoids the need to convert-addresses and clean to
> PoC on the suspend path.
> 
> MMU setup is shared with the boot path, meaning the swapper_pg_dir is
> restored directly: ttbr1_el1 is no longer saved/restored.
> 
> struct sleep_save_sp is removed, replacing it with a single array of
> pointers.
> 
> cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
> mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
> __cpu_setup(). However these values all contain res0 bits that may be used
> to enable future features.

This patch is a very nice clean-up, a comment below.

I think that for registers like tcr_el1 and sctlr_el1 we should define
a restore mask (to avoid overwriting bits set-up by __cpu_setup), eg
current code that restores the tcr_el1 seems wrong to me, see below.
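
The restore-mask idea could look roughly like this — a hedged sketch in plain C, with an invented mask value rather than the real TCR_EL1 bit layout:

```c
#include <stdint.h>

/* Illustrative only: bits inside RESTORE_MASK are taken from the saved
 * context; bits outside it keep whatever __cpu_setup() programmed.
 * The mask value below is made up for the example. */
#define RESTORE_MASK 0x00000000ffffffffULL

static uint64_t restore_with_mask(uint64_t live, uint64_t saved)
{
	/* preserve 'live' bits outside the mask, restore the rest */
	return (live & ~RESTORE_MASK) | (saved & RESTORE_MASK);
}
```

The same merge would be done in assembly in cpu_do_resume(); the point is only that fields owned by __cpu_setup() (such as T0SZ) never come from the saved image.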

[...]

> -/*
>   * This hook is provided so that cpu_suspend code can restore HW
>   * breakpoints as early as possible in the resume path, before reenabling
>   * debug exceptions. Code cannot be run from a CPU PM notifier since by the
>  {
>  	/*
> -	 * We are resuming from reset with TTBR0_EL1 set to the
> -	 * idmap to enable the MMU; set the TTBR0 to the reserved
> -	 * page tables to prevent speculative TLB allocations, flush
> -	 * the local tlb and set the default tcr_el1.t0sz so that
> -	 * the TTBR0 address space set-up is properly restored.
> -	 * If the current active_mm != &init_mm we entered cpu_suspend
> -	 * with mappings in TTBR0 that must be restored, so we switch
> -	 * them back to complete the address space configuration
> -	 * restoration before returning.
> +	 * We resume from suspend directly into the swapper_pg_dir. We may
> +	 * also need to load user-space page tables.
>  	 */
> -	cpu_set_reserved_ttbr0();
> -	local_flush_tlb_all();
> -	cpu_set_default_tcr_t0sz();

You remove the code above since you restore tcr_el1 in cpu_do_resume(),
but this is not the way it should be done, ie the restore should be done
with the code sequence above otherwise it is not safe.

>  	if (mm != &init_mm)
>  		cpu_switch_mm(mm->pgd, mm);
>  
> @@ -149,22 +114,17 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
>  	return ret;
>  }
>  
> -struct sleep_save_sp sleep_save_sp;
> +unsigned long *sleep_save_stash;
>  
>  static int __init cpu_suspend_init(void)
>  {
> -	void *ctx_ptr;
> -
>  	/* ctx_ptr is an array of physical addresses */
> -	ctx_ptr = kcalloc(mpidr_hash_size(), sizeof(phys_addr_t), GFP_KERNEL);
> +	sleep_save_stash = kcalloc(mpidr_hash_size(), sizeof(*sleep_save_stash),
> +				   GFP_KERNEL);
>  
> -	if (WARN_ON(!ctx_ptr))
> +	if (WARN_ON(!sleep_save_stash))
>  		return -ENOMEM;
>  
> -	sleep_save_sp.save_ptr_stash = ctx_ptr;
> -	sleep_save_sp.save_ptr_stash_phys = virt_to_phys(ctx_ptr);
> -	__flush_dcache_area(&sleep_save_sp, sizeof(struct sleep_save_sp));
> -
>  	return 0;
>  }
>  early_initcall(cpu_suspend_init);
> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> index 3c7d170de822..a755108aaa75 100644
> --- a/arch/arm64/mm/proc.S
> +++ b/arch/arm64/mm/proc.S
> @@ -61,62 +61,45 @@ ENTRY(cpu_do_suspend)
>  	mrs	x2, tpidr_el0
>  	mrs	x3, tpidrro_el0
>  	mrs	x4, contextidr_el1
> -	mrs	x5, mair_el1
>  	mrs	x6, cpacr_el1
> -	mrs	x7, ttbr1_el1
>  	mrs	x8, tcr_el1
>  	mrs	x9, vbar_el1
>  	mrs	x10, mdscr_el1
>  	mrs	x11, oslsr_el1
>  	mrs	x12, sctlr_el1
>  	stp	x2, x3, [x0]
> -	stp	x4, x5, [x0, #16]
> -	stp	x6, x7, [x0, #32]
> -	stp	x8, x9, [x0, #48]
> -	stp	x10, x11, [x0, #64]
> -	str	x12, [x0, #80]
> +	stp	x4, xzr, [x0, #16]
> +	stp	x6, x8, [x0, #32]
> +	stp	x9, x10, [x0, #48]
> +	stp	x11, x12, [x0, #64]
>  	ret
>  ENDPROC(cpu_do_suspend)
>  
>  /**
>   * cpu_do_resume - restore CPU register context
>   *
> - * x0: Physical address of context pointer
> - * x1: ttbr0_el1 to be restored
> - *
> - * Returns:
> - *	sctlr_el1 value in x0
> + * x0: Address of context pointer
>   */
>  ENTRY(cpu_do_resume)
> -	/*
> -	 * Invalidate local tlb entries before turning on MMU
> -	 */
> -	tlbi	vmalle1
>  	ldp	x2, x3, [x0]
>  	ldp	x4, x5, [x0, #16]
> -	ldp	x6, x7, [x0, #32]
> -	ldp	x8, x9, [x0, #48]
> -	ldp	x10, x11, [x0, #64]
> -	ldr	x12, [x0, #80]
> +	ldp	x6, x8, [x0, #32]
> +	ldp	x9, x10, [x0, #48]
> +	ldp	x11, x12, [x0, #64]
>  	msr	tpidr_el0, x2
>  	msr	tpidrro_el0, x3
>  	msr	contextidr_el1, x4
> -	msr	mair_el1, x5
>  	msr	cpacr_el1, x6
> -	msr	ttbr0_el1, x1
> -	msr	ttbr1_el1, x7
> -	tcr_set_idmap_t0sz x8, x7
>  	msr	tcr_el1, x8

I do not think that's correct. You restore tcr_el1 here, but this has
side effect of "restoring" t0sz too and that's not correct since this
has to be done with the sequence you removed above.

I'd rather not touch t0sz at all (use a mask) and restore t0sz in
__cpu_suspend_exit() as it is done at present using:

cpu_set_reserved_ttbr0();
local_flush_tlb_all();
cpu_set_default_tcr_t0sz();

That's the only safe way of doing it.

Other than that the patch seems fine to me.

Thanks,
Lorenzo

>  	msr	vbar_el1, x9
>  	msr	mdscr_el1, x10
> +	msr	sctlr_el1, x12
>  	/*
>  	 * Restore oslsr_el1 by writing oslar_el1
>  	 */
>  	ubfx	x11, x11, #1, #1
>  	msr	oslar_el1, x11
>  	msr	pmuserenr_el0, xzr		// Disable PMU access from EL0
> -	mov	x0, x12
> -	dsb	nsh		// Make sure local tlb invalidation completed
>  	isb
>  	ret
>  ENDPROC(cpu_do_resume)
> -- 
> 2.6.2
> 

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-02-05 16:26   ` Lorenzo Pieralisi
@ 2016-02-08  9:03     ` James Morse
  2016-02-08 11:55       ` Lorenzo Pieralisi
  0 siblings, 1 reply; 34+ messages in thread
From: James Morse @ 2016-02-08  9:03 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Lorenzo,

On 05/02/16 16:26, Lorenzo Pieralisi wrote:
>> cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
>> mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
>> __cpu_setup(). However these values all contain res0 bits that may be used
>> to enable future features.
> 
> This patch is a very nice clean-up, a comment below.
> 
> I think that for registers like tcr_el1 and sctlr_el1 we should define
> a restore mask (to avoid overwriting bits set-up by __cpu_setup), eg
> current code that restores the tcr_el1 seems wrong to me, see below.

Presumably this should be two masks, one to find RES0 bits that are set, and are
assumed to have some new meaning, and another to find RES1 bits that have been
cleared.


>> -	cpu_set_reserved_ttbr0();
>> -	local_flush_tlb_all();
>> -	cpu_set_default_tcr_t0sz();
> 
> You remove the code above since you restore tcr_el1 in cpu_do_resume(),
> but this is not the way it should be done, ie the restore should be done
> with the code sequence above otherwise it is not safe.

You're right - __cpu_setup() sets a guaranteed-incompatible t0sz value.
I removed it because all this now gets executed from the swapper page tables,
not the idmap, but there is more to it than that.


Thanks,

James

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-02-08  9:03     ` James Morse
@ 2016-02-08 11:55       ` Lorenzo Pieralisi
  2016-02-08 12:03         ` Mark Rutland
  0 siblings, 1 reply; 34+ messages in thread
From: Lorenzo Pieralisi @ 2016-02-08 11:55 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 08, 2016 at 09:03:21AM +0000, James Morse wrote:
> Hi Lorenzo,
> 
> On 05/02/16 16:26, Lorenzo Pieralisi wrote:
> >> cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
> >> mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
> >> __cpu_setup(). However these values all contain res0 bits that may be used
> >> to enable future features.
> > 
> > This patch is a very nice clean-up, a comment below.
> > 
> > I think that for registers like tcr_el1 and sctlr_el1 we should define
> > a restore mask (to avoid overwriting bits set-up by __cpu_setup), eg
> > current code that restores the tcr_el1 seems wrong to me, see below.
> 
> Presumably this should be two masks, one to find RES0 bits that are
> set, and are assumed to have some new meaning, and another to find
> RES1 bits that have been cleared.

For the time being, it is ok to just fix t0sz restore which means
that either you avoid overwriting tcr_el1.t0sz in cpu_do_resume()
or you force the t0sz value field to be whatever value is already
present in the register (ie set-up in __cpu_setup through
> tcr_set_idmap_t0sz) and finally set it to the correct default value
in __cpu_suspend_exit() using the correct procedure:

- set reserved ttbr0
- flush tlb
- cpu_set_default_tcr_t0sz

Thanks,
Lorenzo

^ permalink raw reply	[flat|nested] 34+ messages in thread

* [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va
  2016-02-08 11:55       ` Lorenzo Pieralisi
@ 2016-02-08 12:03         ` Mark Rutland
  0 siblings, 0 replies; 34+ messages in thread
From: Mark Rutland @ 2016-02-08 12:03 UTC (permalink / raw)
  To: linux-arm-kernel

On Mon, Feb 08, 2016 at 11:55:52AM +0000, Lorenzo Pieralisi wrote:
> On Mon, Feb 08, 2016 at 09:03:21AM +0000, James Morse wrote:
> > Hi Lorenzo,
> > 
> > On 05/02/16 16:26, Lorenzo Pieralisi wrote:
> > >> cpu_do_{suspend,resume} could be further reduced to not restore: cpacr_el1,
> > >> mdscr_el1, tcr_el1, vbar_el1 and sctlr_el1, all of which are set by
> > >> __cpu_setup(). However these values all contain res0 bits that may be used
> > >> to enable future features.
> > > 
> > > This patch is a very nice clean-up, a comment below.
> > > 
> > > I think that for registers like tcr_el1 and sctlr_el1 we should define
> > > a restore mask (to avoid overwriting bits set-up by __cpu_setup), eg
> > > current code that restores the tcr_el1 seems wrong to me, see below.
> > 
> > Presumably this should be two masks, one to find RES0 bits that are
> > set, and are assumed to have some new meaning, and another to find
> > RES1 bits that have been cleared.
> 
> For the time being, it is ok to just fix t0sz restore which means
> that either you avoid overwriting tcr_el1.t0sz in cpu_do_resume()
> or you force the t0sz value field to be whatever value is already
> present in the register (ie set-up in __cpu_setup through
> tcr_set_idmap_t0sz) and finally set it to correct the default value
> in __cpu_suspend_exit() using the correct procedure:
> 
> - set reserved ttbr0
> - flush tlb
> - cpu_set_default_tcr_t0sz

You can use cpu_uninstall_idmap() [1] to do that, which Catalin has
queued as part of my pagetable rework [2]. That will also install the
appropriate TTBR0 context for the current thread, if any.

Mark.

[1] https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/commit/?h=for-next/pgtable&id=7036610bbd05a5269fa1d25c1c000ad3465c2906
[2] https://git.kernel.org/cgit/linux/kernel/git/arm64/linux.git/log/?h=for-next/pgtable

^ permalink raw reply	[flat|nested] 34+ messages in thread

* Re: [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk
  2016-02-05 14:18         ` James Morse
@ 2016-02-08 21:20           ` Kevin Hilman
  -1 siblings, 0 replies; 34+ messages in thread
From: Kevin Hilman @ 2016-02-08 21:20 UTC (permalink / raw)
  To: James Morse
  Cc: linux-arm-kernel, Will Deacon, Sudeep Holla, Kevin Kang,
	Geoff Levand, Catalin Marinas, Lorenzo Pieralisi, Mark Rutland,
	AKASHI Takahiro, wangfei, linux-pm, Rafael J. Wysocki,
	Pavel Machek, Marc Zyngier

James Morse <james.morse@arm.com> writes:

> Hi Kevin,
>
> On 03/02/16 00:42, Kevin Hilman wrote:
>> James Morse <james.morse@arm.com> writes:
>>> On 29/01/16 22:34, Kevin Hilman wrote:
>>>> James Morse <james.morse@arm.com> writes:
>>>> I'd like to help in testing this so I'm just curious which platforms
>>>> you've been testing this on.  I'm assuming a Juno (r2?), anything else?
>>>
>>> That would be great - thanks!
>>>
>>> I've done most of the testing on a Juno r1, but also gave it a spin on a
>>> stray Seattle.
>> 
>> OK, I'm using a very recently arrived Juno R2, and I did get it
>> working.  I have a few other arm64 boards around that are part of
>> kernelci.org and will try those as well.
>
> Great,
> It would be good to have a Tested-by if it works to your satisfaction...

Yes, feel free to add

Tested-by: Kevin Hilman <khilman@baylibre.com> # Tested on Juno R2

I tried to also test on the qcom 96boards, but that platform doesn't
support suspend/resume yet. :(

Kevin


^ permalink raw reply	[flat|nested] 34+ messages in thread

end of thread, other threads:[~2016-02-08 21:20 UTC | newest]

Thread overview: 34+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-01-28 10:42 [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2016-01-28 10:42 ` James Morse
2016-01-28 10:42 ` [PATCH v4 01/13] arm64: Fold proc-macros.S into assembler.h James Morse
2016-01-28 10:42 ` [PATCH v4 02/13] arm64: Cleanup SCTLR flags James Morse
2016-01-28 10:42 ` [PATCH v4 03/13] arm64: Convert hcalls to use HVC immediate value James Morse
2016-01-28 10:42 ` [PATCH v4 04/13] arm64: Add new hcall HVC_CALL_FUNC James Morse
2016-02-02  6:53   ` AKASHI Takahiro
2016-01-28 10:42 ` [PATCH v4 05/13] arm64: kvm: allows kvm cpu hotplug James Morse
2016-02-02  6:46   ` AKASHI Takahiro
2016-01-28 10:42 ` [PATCH v4 06/13] arm64: kernel: Rework finisher callback out of __cpu_suspend_enter() James Morse
2016-01-28 10:42 ` [PATCH v4 07/13] arm64: Change cpu_resume() to enable mmu early then access sleep_sp by va James Morse
2016-02-05 16:26   ` Lorenzo Pieralisi
2016-02-08  9:03     ` James Morse
2016-02-08 11:55       ` Lorenzo Pieralisi
2016-02-08 12:03         ` Mark Rutland
2016-01-28 10:42 ` [PATCH v4 08/13] arm64: kernel: Include _AC definition in page.h James Morse
2016-01-28 10:42 ` [PATCH v4 09/13] arm64: Promote KERNEL_START/KERNEL_END definitions to a header file James Morse
2016-01-28 10:42 ` [PATCH v4 10/13] arm64: Add new asm macro copy_page James Morse
2016-01-28 10:42 ` [PATCH v4 11/13] PM / Hibernate: Call flush_icache_range() on pages restored in-place James Morse
2016-01-28 10:42   ` James Morse
2016-01-31 17:25   ` Pavel Machek
2016-01-31 17:25     ` Pavel Machek
2016-01-28 10:42 ` [PATCH v4 12/13] arm64: kernel: Add support for hibernate/suspend-to-disk James Morse
2016-01-28 10:42 ` [PATCH v4 13/13] arm64: hibernate: Prevent resume from a different kernel version James Morse
2016-01-29 22:34 ` [PATCH v4 00/13] arm64: kernel: Add support for hibernate/suspend-to-disk Kevin Hilman
2016-01-29 22:34   ` Kevin Hilman
2016-02-01  8:53   ` James Morse
2016-02-01  8:53     ` James Morse
2016-02-03  0:42     ` Kevin Hilman
2016-02-03  0:42       ` Kevin Hilman
2016-02-05 14:18       ` James Morse
2016-02-05 14:18         ` James Morse
2016-02-08 21:20         ` Kevin Hilman
2016-02-08 21:20           ` Kevin Hilman
