linux-kernel.vger.kernel.org archive mirror
* [PATCH v2 00/31] AArch64 Linux kernel port
@ 2012-08-14 17:52 Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 01/31] arm64: Assembly macros and definitions Catalin Marinas
                   ` (31 more replies)
  0 siblings, 32 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann

This is the 2nd version of the set of patches implementing Linux kernel
support for the 64-bit ARM architecture (AArch64). Thanks to all who
provided feedback on the previous version.

The Linux kernel patches are available on this tree:

git://git.kernel.org/pub/scm/linux/kernel/git/cmarinas/linux-aarch64.git upstream

The "master" branch in the above repository tracks the development
history.

Main changes from the previous version (for the full log see the
"master" branch above):

- Kernel port directory and related functions renamed to "arm64".
  "uname -m" reports "aarch64" as per the official name.
- NO_BOOTMEM enabled.
- "mem=" is now used for limiting the amount of memory rather than
  specifying the memory banks (already done via FDT).
- struct mem_type removed as static definitions are enough for
  ioremap().
- Support for ZONE_DMA32.
- Replaced "user_debug" with "/proc/sys/debug/exception-trace".
- Added a generic defconfig file.
- More clean-up (comments, code) and bug-fixes.

The generic patches were dropped from this series as they have been pushed
separately (and most of them already merged into mainline).


Background to the 64-bit ARM architecture:

AArch64 was introduced by ARM as part of the ARMv8 architecture. It
consists of a substantially revised exception model (with 4 exception
levels: EL0 - user, EL1 - kernel, EL2 - hypervisor, EL3 - secure
monitor), a new A64 instruction set based on a larger register file,
and new FP/SIMD instructions. The new ABI is LP64 and takes advantage
of the larger register file. It also mandates FP.

Publicly available AArch64 documentation (a click-through agreement is
required):

- Instruction Set Overview:
http://infocenter.arm.com/help/topic/com.arm.doc.genc010197a/index.html

- ABI (PCS, ELF, DWARF, C++):
http://infocenter.arm.com/help/topic/com.arm.doc.ihi0059a/index.html


Regards,

Catalin


Catalin Marinas (23):
  arm64: Assembly macros and definitions
  arm64: Kernel booting and initialisation
  arm64: Exception handling
  arm64: MMU definitions
  arm64: MMU initialisation
  arm64: MMU fault handling and page table management
  arm64: Process management
  arm64: CPU support
  arm64: Cache maintenance routines
  arm64: TLB maintenance functionality
  arm64: Atomic operations
  arm64: Device specific operations
  arm64: DMA mapping API
  arm64: SMP support
  arm64: ELF definitions
  arm64: System calls handling
  arm64: Signal handling support
  arm64: User access library functions
  arm64: Floating point and SIMD
  arm64: Add support for /proc/sys/debug/exception-trace
  arm64: Miscellaneous header files
  arm64: Build infrastructure
  arm64: MAINTAINERS update

Marc Zyngier (3):
  arm64: IRQ handling
  arm64: Miscellaneous library functions
  arm64: Generic timers support

Will Deacon (5):
  arm64: VDSO support
  arm64: 32-bit (compat) applications support
  arm64: Debugging support
  arm64: Performance counters support
  arm64: Loadable modules

 Documentation/arm64/booting.txt               |  141 +++
 Documentation/arm64/memory.txt                |   69 ++
 MAINTAINERS                                   |    6 +
 arch/arm64/Kconfig                            |  261 +++++
 arch/arm64/Kconfig.debug                      |   27 +
 arch/arm64/Makefile                           |   71 ++
 arch/arm64/boot/.gitignore                    |    2 +
 arch/arm64/boot/Makefile                      |   38 +
 arch/arm64/boot/install.sh                    |   52 +
 arch/arm64/configs/generic_defconfig          |   85 ++
 arch/arm64/include/asm/Kbuild                 |   51 +
 arch/arm64/include/asm/asm-offsets.h          |    1 +
 arch/arm64/include/asm/assembler.h            |  109 ++
 arch/arm64/include/asm/atomic.h               |  306 ++++++
 arch/arm64/include/asm/auxvec.h               |   22 +
 arch/arm64/include/asm/barrier.h              |   52 +
 arch/arm64/include/asm/bitops.h               |   74 ++
 arch/arm64/include/asm/bitsperlong.h          |   23 +
 arch/arm64/include/asm/byteorder.h            |   21 +
 arch/arm64/include/asm/cache.h                |   32 +
 arch/arm64/include/asm/cacheflush.h           |  209 ++++
 arch/arm64/include/asm/cachetype.h            |   48 +
 arch/arm64/include/asm/cmpxchg.h              |  180 ++++
 arch/arm64/include/asm/compat.h               |  232 +++++
 arch/arm64/include/asm/compiler.h             |   30 +
 arch/arm64/include/asm/cputype.h              |   49 +
 arch/arm64/include/asm/debug-monitors.h       |   88 ++
 arch/arm64/include/asm/device.h               |   26 +
 arch/arm64/include/asm/dma-mapping.h          |  124 +++
 arch/arm64/include/asm/elf.h                  |  176 ++++
 arch/arm64/include/asm/exception.h            |   23 +
 arch/arm64/include/asm/exec.h                 |   23 +
 arch/arm64/include/asm/fb.h                   |   34 +
 arch/arm64/include/asm/fcntl.h                |   29 +
 arch/arm64/include/asm/fpsimd.h               |   64 ++
 arch/arm64/include/asm/futex.h                |  134 +++
 arch/arm64/include/asm/hardirq.h              |   52 +
 arch/arm64/include/asm/hw_breakpoint.h        |  137 +++
 arch/arm64/include/asm/hwcap.h                |   57 +
 arch/arm64/include/asm/io.h                   |  263 +++++
 arch/arm64/include/asm/irq.h                  |    8 +
 arch/arm64/include/asm/irqflags.h             |   91 ++
 arch/arm64/include/asm/memblock.h             |   21 +
 arch/arm64/include/asm/memory.h               |  144 +++
 arch/arm64/include/asm/mmu.h                  |   27 +
 arch/arm64/include/asm/mmu_context.h          |  152 +++
 arch/arm64/include/asm/module.h               |   23 +
 arch/arm64/include/asm/page.h                 |   67 ++
 arch/arm64/include/asm/param.h                |   23 +
 arch/arm64/include/asm/perf_event.h           |   22 +
 arch/arm64/include/asm/pgalloc.h              |  113 ++
 arch/arm64/include/asm/pgtable-2level-hwdef.h |   43 +
 arch/arm64/include/asm/pgtable-2level-types.h |   60 ++
 arch/arm64/include/asm/pgtable-3level-hwdef.h |   50 +
 arch/arm64/include/asm/pgtable-3level-types.h |   66 ++
 arch/arm64/include/asm/pgtable-hwdef.h        |   94 ++
 arch/arm64/include/asm/pgtable.h              |  328 ++++++
 arch/arm64/include/asm/pmu.h                  |   82 ++
 arch/arm64/include/asm/proc-fns.h             |   51 +
 arch/arm64/include/asm/processor.h            |  174 ++++
 arch/arm64/include/asm/procinfo.h             |   44 +
 arch/arm64/include/asm/prom.h                 |    1 +
 arch/arm64/include/asm/ptrace.h               |  206 ++++
 arch/arm64/include/asm/setup.h                |   26 +
 arch/arm64/include/asm/shmparam.h             |   28 +
 arch/arm64/include/asm/sigcontext.h           |   69 ++
 arch/arm64/include/asm/siginfo.h              |   23 +
 arch/arm64/include/asm/signal.h               |   24 +
 arch/arm64/include/asm/signal32.h             |   54 +
 arch/arm64/include/asm/smp.h                  |   69 ++
 arch/arm64/include/asm/sparsemem.h            |   24 +
 arch/arm64/include/asm/spinlock.h             |  199 ++++
 arch/arm64/include/asm/spinlock_types.h       |   38 +
 arch/arm64/include/asm/stacktrace.h           |   29 +
 arch/arm64/include/asm/stat.h                 |   63 ++
 arch/arm64/include/asm/statfs.h               |   23 +
 arch/arm64/include/asm/syscall.h              |  101 ++
 arch/arm64/include/asm/syscalls.h             |   40 +
 arch/arm64/include/asm/system_misc.h          |   54 +
 arch/arm64/include/asm/thread_info.h          |  124 +++
 arch/arm64/include/asm/timex.h                |   32 +
 arch/arm64/include/asm/tlb.h                  |  190 ++++
 arch/arm64/include/asm/tlbflush.h             |  123 +++
 arch/arm64/include/asm/traps.h                |   30 +
 arch/arm64/include/asm/uaccess.h              |  377 +++++++
 arch/arm64/include/asm/ucontext.h             |   30 +
 arch/arm64/include/asm/unistd.h               |   27 +
 arch/arm64/include/asm/unistd32.h             |  758 ++++++++++++++
 arch/arm64/include/asm/vdso.h                 |   41 +
 arch/arm64/include/asm/vdso_datapage.h        |   43 +
 arch/arm64/kernel/.gitignore                  |    1 +
 arch/arm64/kernel/Makefile                    |   27 +
 arch/arm64/kernel/arm64ksyms.c                |   55 +
 arch/arm64/kernel/asm-offsets.c               |  108 ++
 arch/arm64/kernel/debug-monitors.c            |  288 ++++++
 arch/arm64/kernel/elf.c                       |   41 +
 arch/arm64/kernel/entry-fpsimd.S              |   80 ++
 arch/arm64/kernel/entry.S                     |  695 +++++++++++++
 arch/arm64/kernel/fpsimd.c                    |  106 ++
 arch/arm64/kernel/head.S                      |  521 ++++++++++
 arch/arm64/kernel/hw_breakpoint.c             |  880 ++++++++++++++++
 arch/arm64/kernel/io.c                        |   64 ++
 arch/arm64/kernel/irq.c                       |   84 ++
 arch/arm64/kernel/kuser32.S                   |   77 ++
 arch/arm64/kernel/module.c                    |  456 ++++++++
 arch/arm64/kernel/perf_event.c                | 1368 +++++++++++++++++++++++++
 arch/arm64/kernel/process.c                   |  416 ++++++++
 arch/arm64/kernel/ptrace.c                    |  834 +++++++++++++++
 arch/arm64/kernel/setup.c                     |  357 +++++++
 arch/arm64/kernel/signal.c                    |  436 ++++++++
 arch/arm64/kernel/signal32.c                  |  876 ++++++++++++++++
 arch/arm64/kernel/smp.c                       |  469 +++++++++
 arch/arm64/kernel/stacktrace.c                |  127 +++
 arch/arm64/kernel/sys.c                       |  138 +++
 arch/arm64/kernel/sys32.S                     |  283 +++++
 arch/arm64/kernel/sys_compat.c                |  177 ++++
 arch/arm64/kernel/time.c                      |   65 ++
 arch/arm64/kernel/traps.c                     |  357 +++++++
 arch/arm64/kernel/vdso.c                      |  261 +++++
 arch/arm64/kernel/vdso/.gitignore             |    2 +
 arch/arm64/kernel/vdso/Makefile               |   63 ++
 arch/arm64/kernel/vdso/gen_vdso_offsets.sh    |   15 +
 arch/arm64/kernel/vdso/gettimeofday.S         |  242 +++++
 arch/arm64/kernel/vdso/note.S                 |   28 +
 arch/arm64/kernel/vdso/sigreturn.S            |   37 +
 arch/arm64/kernel/vdso/vdso.S                 |   33 +
 arch/arm64/kernel/vdso/vdso.lds.S             |  100 ++
 arch/arm64/kernel/vmlinux.lds.S               |  146 +++
 arch/arm64/lib/Makefile                       |    5 +
 arch/arm64/lib/bitops.c                       |   25 +
 arch/arm64/lib/clear_page.S                   |   39 +
 arch/arm64/lib/clear_user.S                   |   58 ++
 arch/arm64/lib/copy_from_user.S               |   66 ++
 arch/arm64/lib/copy_in_user.S                 |   63 ++
 arch/arm64/lib/copy_page.S                    |   46 +
 arch/arm64/lib/copy_to_user.S                 |   61 ++
 arch/arm64/lib/delay.c                        |   55 +
 arch/arm64/lib/getuser.S                      |   75 ++
 arch/arm64/lib/putuser.S                      |   73 ++
 arch/arm64/lib/strncpy_from_user.S            |   50 +
 arch/arm64/lib/strnlen_user.S                 |   47 +
 arch/arm64/mm/Kconfig                         |    5 +
 arch/arm64/mm/Makefile                        |    6 +
 arch/arm64/mm/cache.S                         |  279 +++++
 arch/arm64/mm/context.c                       |  159 +++
 arch/arm64/mm/copypage.c                      |   34 +
 arch/arm64/mm/dma-mapping.c                   |  208 ++++
 arch/arm64/mm/extable.c                       |   17 +
 arch/arm64/mm/fault.c                         |  534 ++++++++++
 arch/arm64/mm/flush.c                         |  132 +++
 arch/arm64/mm/init.c                          |  416 ++++++++
 arch/arm64/mm/ioremap.c                       |   84 ++
 arch/arm64/mm/mm.h                            |    2 +
 arch/arm64/mm/mmap.c                          |  144 +++
 arch/arm64/mm/mmu.c                           |  395 +++++++
 arch/arm64/mm/pgd.c                           |   49 +
 arch/arm64/mm/proc-macros.S                   |   55 +
 arch/arm64/mm/proc-syms.c                     |   31 +
 arch/arm64/mm/proc.S                          |  193 ++++
 arch/arm64/mm/tlb.S                           |   71 ++
 drivers/clocksource/Kconfig                   |    5 +
 drivers/clocksource/Makefile                  |    1 +
 drivers/clocksource/arm_generic.c             |  309 ++++++
 include/clocksource/arm_generic.h             |   21 +
 init/Kconfig                                  |    3 +-
 kernel/sysctl.c                               |    2 +-
 lib/Kconfig.debug                             |    6 +-
 tools/perf/perf.h                             |    6 +
 168 files changed, 22089 insertions(+), 4 deletions(-)



* [PATCH v2 01/31] arm64: Assembly macros and definitions
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 12:57   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 02/31] arm64: Kernel booting and initialisation Catalin Marinas
                   ` (30 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch introduces several assembly macros and definitions used by
the .S files across arch/arm64/, such as IRQ disabling/enabling,
together with asm-offsets.c.
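
As a rough illustration of what the IRQ and debug macros do, here is a
small C model of the DAIF mask bits. It is only a sketch, not kernel
code: the helper names are made up for this example, and a plain
integer stands in for the register. It assumes the architectural
layout in which the 4-bit immediate of "msr daifset/daifclr" selects
D, A, I, F (8, 4, 2, 1) and the DAIF register view holds those bits at
positions 9..6.

```c
#include <assert.h>
#include <stdint.h>

/* Immediate values accepted by "msr daifset, #imm" / "msr daifclr, #imm". */
enum daif_imm {
	DAIF_IMM_F = 1,	/* FIQ mask */
	DAIF_IMM_I = 2,	/* IRQ mask, used by disable_irq/enable_irq */
	DAIF_IMM_A = 4,	/* SError (asynchronous abort) mask */
	DAIF_IMM_D = 8	/* debug mask, used by disable_dbg/enable_dbg */
};

/* D, A, I, F sit at bits 9..6 of the value read by "mrs xN, daif". */
#define DAIF_SHIFT	6

/* Model of "msr daifset, #imm": set the selected mask bits. */
static uint64_t daif_set(uint64_t daif, unsigned int imm)
{
	assert(imm <= 0xf);
	return daif | ((uint64_t)imm << DAIF_SHIFT);
}

/* Model of "msr daifclr, #imm": clear the selected mask bits. */
static uint64_t daif_clr(uint64_t daif, unsigned int imm)
{
	assert(imm <= 0xf);
	return daif & ~((uint64_t)imm << DAIF_SHIFT);
}

/* Model of save_and_disable_irqs: return the old value, then mask IRQs. */
static uint64_t save_and_disable_irqs(uint64_t *daif)
{
	uint64_t old = *daif;

	*daif = daif_set(*daif, DAIF_IMM_I);
	return old;
}

/* Model of restore_irqs: write the saved value back unchanged. */
static void restore_irqs(uint64_t *daif, uint64_t old)
{
	*daif = old;
}
```

Note that restore_irqs writes back the whole saved value, so any mask
bits changed between the save and the restore are overwritten as well.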

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/asm-offsets.h |    1 +
 arch/arm64/include/asm/assembler.h   |  109 ++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/asm-offsets.c      |  108 +++++++++++++++++++++++++++++++++
 arch/arm64/mm/proc-macros.S          |   55 +++++++++++++++++
 4 files changed, 273 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/asm-offsets.h
 create mode 100644 arch/arm64/include/asm/assembler.h
 create mode 100644 arch/arm64/kernel/asm-offsets.c
 create mode 100644 arch/arm64/mm/proc-macros.S

diff --git a/arch/arm64/include/asm/asm-offsets.h b/arch/arm64/include/asm/asm-offsets.h
new file mode 100644
index 0000000..d370ee3
--- /dev/null
+++ b/arch/arm64/include/asm/asm-offsets.h
@@ -0,0 +1 @@
+#include <generated/asm-offsets.h>
diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
new file mode 100644
index 0000000..da2a13e
--- /dev/null
+++ b/arch/arm64/include/asm/assembler.h
@@ -0,0 +1,109 @@
+/*
+ * Based on arch/arm/include/asm/assembler.h
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASSEMBLY__
+#error "Only include this from assembly code"
+#endif
+
+#include <asm/ptrace.h>
+
+/*
+ * Stack pushing/popping (register pairs only). Equivalent to store decrement
+ * before, load increment after.
+ */
+	.macro	push, xreg1, xreg2
+	stp	\xreg1, \xreg2, [sp, #-16]!
+	.endm
+
+	.macro	pop, xreg1, xreg2
+	ldp	\xreg1, \xreg2, [sp], #16
+	.endm
+
+/*
+ * Enable and disable interrupts.
+ */
+	.macro	disable_irq
+	msr	daifset, #2
+	.endm
+
+	.macro	enable_irq
+	msr	daifclr, #2
+	.endm
+
+/*
+ * Save/disable and restore interrupts.
+ */
+	.macro	save_and_disable_irqs, olddaif
+	mrs	\olddaif, daif
+	disable_irq
+	.endm
+
+	.macro	restore_irqs, olddaif
+	msr	daif, \olddaif
+	.endm
+
+/*
+ * Enable and disable debug exceptions.
+ */
+	.macro	disable_dbg
+	msr	daifset, #8
+	.endm
+
+	.macro	enable_dbg
+	msr	daifclr, #8
+	.endm
+
+	.macro	disable_step, tmp
+	mrs	\tmp, mdscr_el1
+	bic	\tmp, \tmp, #1
+	msr	mdscr_el1, \tmp
+	.endm
+
+	.macro	enable_step, tmp
+	mrs	\tmp, mdscr_el1
+	orr	\tmp, \tmp, #1
+	msr	mdscr_el1, \tmp
+	.endm
+
+	.macro	enable_dbg_if_not_stepping, tmp
+	mrs	\tmp, mdscr_el1
+	tbnz	\tmp, #1, 9990f
+	enable_dbg
+9990:
+	.endm
+
+/*
+ * SMP data memory barrier
+ */
+	.macro	smp_dmb, opt
+#ifdef CONFIG_SMP
+	dmb	\opt
+#endif
+	.endm
+
+#define USER(l, x...)				\
+9999:	x;					\
+	.section __ex_table,"a";		\
+	.align	3;				\
+	.quad	9999b,l;			\
+	.previous
+
+/*
+ * Register aliases.
+ */
+lr	.req	x30		// link register
diff --git a/arch/arm64/kernel/asm-offsets.c b/arch/arm64/kernel/asm-offsets.c
new file mode 100644
index 0000000..5120e51
--- /dev/null
+++ b/arch/arm64/kernel/asm-offsets.c
@@ -0,0 +1,108 @@
+/*
+ * Based on arch/arm/kernel/asm-offsets.c
+ *
+ * Copyright (C) 1995-2003 Russell King
+ *               2001-2002 Keith Owens
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/dma-mapping.h>
+#include <asm/thread_info.h>
+#include <asm/memory.h>
+#include <asm/procinfo.h>
+#include <asm/vdso_datapage.h>
+#include <linux/kbuild.h>
+
+int main(void)
+{
+  DEFINE(TSK_ACTIVE_MM,		offsetof(struct task_struct, active_mm));
+  BLANK();
+  DEFINE(TI_FLAGS,		offsetof(struct thread_info, flags));
+  DEFINE(TI_PREEMPT,		offsetof(struct thread_info, preempt_count));
+  DEFINE(TI_ADDR_LIMIT,		offsetof(struct thread_info, addr_limit));
+  DEFINE(TI_TASK,		offsetof(struct thread_info, task));
+  DEFINE(TI_EXEC_DOMAIN,	offsetof(struct thread_info, exec_domain));
+  DEFINE(TI_CPU,		offsetof(struct thread_info, cpu));
+  BLANK();
+  DEFINE(THREAD_CPU_CONTEXT,	offsetof(struct task_struct, thread.cpu_context));
+  BLANK();
+  DEFINE(S_X0,			offsetof(struct pt_regs, regs[0]));
+  DEFINE(S_X1,			offsetof(struct pt_regs, regs[1]));
+  DEFINE(S_X2,			offsetof(struct pt_regs, regs[2]));
+  DEFINE(S_X3,			offsetof(struct pt_regs, regs[3]));
+  DEFINE(S_X4,			offsetof(struct pt_regs, regs[4]));
+  DEFINE(S_X5,			offsetof(struct pt_regs, regs[5]));
+  DEFINE(S_X6,			offsetof(struct pt_regs, regs[6]));
+  DEFINE(S_X7,			offsetof(struct pt_regs, regs[7]));
+  DEFINE(S_LR,			offsetof(struct pt_regs, regs[30]));
+  DEFINE(S_SP,			offsetof(struct pt_regs, sp));
+#ifdef CONFIG_AARCH32_EMULATION
+  DEFINE(S_COMPAT_SP,		offsetof(struct pt_regs, compat_sp));
+#endif
+  DEFINE(S_PSTATE,		offsetof(struct pt_regs, pstate));
+  DEFINE(S_PC,			offsetof(struct pt_regs, pc));
+  DEFINE(S_ORIG_X0,		offsetof(struct pt_regs, orig_x0));
+  DEFINE(S_SYSCALLNO,		offsetof(struct pt_regs, syscallno));
+  DEFINE(S_FRAME_SIZE,		sizeof(struct pt_regs));
+  BLANK();
+  DEFINE(MM_CONTEXT_ID,		offsetof(struct mm_struct, context.id));
+  BLANK();
+  DEFINE(VMA_VM_MM,		offsetof(struct vm_area_struct, vm_mm));
+  DEFINE(VMA_VM_FLAGS,		offsetof(struct vm_area_struct, vm_flags));
+  BLANK();
+  DEFINE(VM_EXEC,	       	VM_EXEC);
+  BLANK();
+  DEFINE(PAGE_SZ,	       	PAGE_SIZE);
+  BLANK();
+  DEFINE(PROC_INFO_SZ,		sizeof(struct proc_info_list));
+  DEFINE(PROCINFO_INITFUNC,	offsetof(struct proc_info_list, __cpu_flush));
+  BLANK();
+  DEFINE(DMA_BIDIRECTIONAL,	DMA_BIDIRECTIONAL);
+  DEFINE(DMA_TO_DEVICE,		DMA_TO_DEVICE);
+  DEFINE(DMA_FROM_DEVICE,	DMA_FROM_DEVICE);
+  BLANK();
+  DEFINE(CLOCK_REALTIME,	CLOCK_REALTIME);
+  DEFINE(CLOCK_MONOTONIC,	CLOCK_MONOTONIC);
+  DEFINE(CLOCK_REALTIME_RES,	MONOTONIC_RES_NSEC);
+  DEFINE(CLOCK_REALTIME_COARSE,	CLOCK_REALTIME_COARSE);
+  DEFINE(CLOCK_MONOTONIC_COARSE,CLOCK_MONOTONIC_COARSE);
+  DEFINE(CLOCK_COARSE_RES,	LOW_RES_NSEC);
+  DEFINE(NSEC_PER_SEC,		NSEC_PER_SEC);
+  BLANK();
+  DEFINE(VDSO_CS_CYCLE_LAST,	offsetof(struct vdso_data, cs_cycle_last));
+  DEFINE(VDSO_XTIME_CLK_SEC,	offsetof(struct vdso_data, xtime_clock_sec));
+  DEFINE(VDSO_XTIME_CLK_NSEC,	offsetof(struct vdso_data, xtime_clock_nsec));
+  DEFINE(VDSO_XTIME_CRS_SEC,	offsetof(struct vdso_data, xtime_coarse_sec));
+  DEFINE(VDSO_XTIME_CRS_NSEC,	offsetof(struct vdso_data, xtime_coarse_nsec));
+  DEFINE(VDSO_WTM_CLK_SEC,	offsetof(struct vdso_data, wtm_clock_sec));
+  DEFINE(VDSO_WTM_CLK_NSEC,	offsetof(struct vdso_data, wtm_clock_nsec));
+  DEFINE(VDSO_TB_SEQ_COUNT,	offsetof(struct vdso_data, tb_seq_count));
+  DEFINE(VDSO_CS_MULT,		offsetof(struct vdso_data, cs_mult));
+  DEFINE(VDSO_CS_SHIFT,		offsetof(struct vdso_data, cs_shift));
+  DEFINE(VDSO_TZ_MINWEST,	offsetof(struct vdso_data, tz_minuteswest));
+  DEFINE(VDSO_TZ_DSTTIME,	offsetof(struct vdso_data, tz_dsttime));
+  DEFINE(VDSO_USE_SYSCALL,	offsetof(struct vdso_data, use_syscall));
+  BLANK();
+  DEFINE(TVAL_TV_SEC,		offsetof(struct timeval, tv_sec));
+  DEFINE(TVAL_TV_USEC,		offsetof(struct timeval, tv_usec));
+  DEFINE(TSPEC_TV_SEC,		offsetof(struct timespec, tv_sec));
+  DEFINE(TSPEC_TV_NSEC,		offsetof(struct timespec, tv_nsec));
+  BLANK();
+  DEFINE(TZ_MINWEST,		offsetof(struct timezone, tz_minuteswest));
+  DEFINE(TZ_DSTTIME,		offsetof(struct timezone, tz_dsttime));
+  return 0;
+}
diff --git a/arch/arm64/mm/proc-macros.S b/arch/arm64/mm/proc-macros.S
new file mode 100644
index 0000000..8957b82
--- /dev/null
+++ b/arch/arm64/mm/proc-macros.S
@@ -0,0 +1,55 @@
+/*
+ * Based on arch/arm/mm/proc-macros.S
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <asm/asm-offsets.h>
+#include <asm/thread_info.h>
+
+/*
+ * vma_vm_mm - get mm pointer from vma pointer (vma->vm_mm)
+ */
+	.macro	vma_vm_mm, rd, rn
+	ldr	\rd, [\rn, #VMA_VM_MM]
+	.endm
+
+/*
+ * mmid - get context id from mm pointer (mm->context.id)
+ */
+	.macro	mmid, rd, rn
+	ldr	\rd, [\rn, #MM_CONTEXT_ID]
+	.endm
+
+/*
+ * dcache_line_size - get the minimum D-cache line size from the CTR register.
+ */
+	.macro	dcache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	lsr	\tmp, \tmp, #16
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
+
+/*
+ * icache_line_size - get the minimum I-cache line size from the CTR register.
+ */
+	.macro	icache_line_size, reg, tmp
+	mrs	\tmp, ctr_el0			// read CTR
+	and	\tmp, \tmp, #0xf		// cache line size encoding
+	mov	\reg, #4			// bytes per word
+	lsl	\reg, \reg, \tmp		// actual cache line size
+	.endm
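
For readers less familiar with CTR_EL0, the dcache_line_size and
icache_line_size macros above can be sketched in C as follows. This is
an illustrative model only: the function and constant names are
assumptions made for the example, and the register value is passed in
as an argument rather than read with mrs. It relies on the
architectural encoding where DminLine (bits 19:16) and IminLine (bits
3:0) hold log2 of the line size in 4-byte words.

```c
#include <assert.h>
#include <stdint.h>

#define CTR_IMINLINE_SHIFT	0	/* smallest I-cache line, bits [3:0] */
#define CTR_DMINLINE_SHIFT	16	/* smallest D-cache line, bits [19:16] */
#define CTR_LINE_MASK		0xfu
#define BYTES_PER_WORD		4u

/* Decode one line-size field: bytes = 4 << log2(words per line). */
static unsigned int ctr_line_size(uint32_t ctr, unsigned int shift)
{
	unsigned int log2_words;

	assert(shift == CTR_IMINLINE_SHIFT || shift == CTR_DMINLINE_SHIFT);
	log2_words = (ctr >> shift) & CTR_LINE_MASK;
	return BYTES_PER_WORD << log2_words;
}

/* Same arithmetic as the dcache_line_size macro (lsr #16, and #0xf). */
static unsigned int dcache_line_size(uint32_t ctr)
{
	return ctr_line_size(ctr, CTR_DMINLINE_SHIFT);
}

/* Same arithmetic as the icache_line_size macro (and #0xf). */
static unsigned int icache_line_size(uint32_t ctr)
{
	return ctr_line_size(ctr, CTR_IMINLINE_SHIFT);
}
```

With a CTR value whose two fields are both 4, each function returns 64,
i.e. a 64-byte minimum line.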



* [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 01/31] arm64: Assembly macros and definitions Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 23:06   ` Olof Johansson
                     ` (4 more replies)
  2012-08-14 17:52 ` [PATCH v2 03/31] arm64: Exception handling Catalin Marinas
                   ` (29 subsequent siblings)
  31 siblings, 5 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds the kernel boot and initial setup code.
Documentation/arm64/booting.txt describes the boot protocol for the
AArch64 Linux kernel. The protocol is subject to change following the
work on boot standardisation and ACPI.
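
To make the boot-loader side of the protocol more concrete, the checks
that booting.txt implies could be sketched in C roughly as below. This
is not taken from any existing boot loader: the helper names are
hypothetical, a little-endian host is assumed when reading the header,
and the 512MB window for the device tree blob is interpreted (as an
assumption) as measured from the image start to the start of the blob.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SZ_2M			(UINT64_C(2) << 20)
#define SZ_512M			(UINT64_C(512) << 20)
#define ARM64_IMAGE_MAGIC	UINT32_C(0x14000008)	/* branch to stext */

/* Mirrors the 32-byte header listed in booting.txt. */
struct arm64_image_header {
	uint32_t magic;
	uint32_t res0;
	uint64_t text_offset;	/* image load offset from start of RAM */
	uint64_t res1;
	uint64_t res2;
};

static_assert(sizeof(struct arm64_image_header) == 32,
	      "header must be 32 bytes");

/* Hypothetical helper: copy the header out of a raw image and check magic. */
static bool read_header(const void *image, size_t len,
			struct arm64_image_header *hdr)
{
	if (len < sizeof(*hdr))
		return false;
	memcpy(hdr, image, sizeof(*hdr));
	return hdr->magic == ARM64_IMAGE_MAGIC;
}

/* RAM must start on a 2MB boundary; the image goes at text_offset above it. */
static bool image_load_address(uint64_t ram_base,
			       const struct arm64_image_header *hdr,
			       uint64_t *load_addr)
{
	if (ram_base & (SZ_2M - 1))
		return false;
	*load_addr = ram_base + hdr->text_offset;
	return true;
}

/* dtb: at most 2MB, 2MB-aligned, starting within 512MB of the image start. */
static bool dtb_placement_ok(uint64_t image_start, uint64_t dtb_addr,
			     uint64_t dtb_size)
{
	if (dtb_size > SZ_2M || (dtb_addr & (SZ_2M - 1)))
		return false;
	if (dtb_addr < image_start)
		return false;
	return dtb_addr - image_start < SZ_512M;
}
```

A loader would typically run these checks before quiescing DMA, masking
interrupts and branching to the image with x0 holding the dtb address.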

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 Documentation/arm64/booting.txt |  141 +++++++++++
 arch/arm64/include/asm/setup.h  |   26 ++
 arch/arm64/kernel/head.S        |  521 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c       |  357 +++++++++++++++++++++++++++
 4 files changed, 1045 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/arm64/booting.txt
 create mode 100644 arch/arm64/include/asm/setup.h
 create mode 100644 arch/arm64/kernel/head.S
 create mode 100644 arch/arm64/kernel/setup.c

diff --git a/Documentation/arm64/booting.txt b/Documentation/arm64/booting.txt
new file mode 100644
index 0000000..3197820
--- /dev/null
+++ b/Documentation/arm64/booting.txt
@@ -0,0 +1,141 @@
+			Booting AArch64 Linux
+			=====================
+
+Author: Will Deacon <will.deacon@arm.com>
+Date  : 25 April 2012
+
+This document is based on the ARM booting document by Russell King and
+is relevant to all public releases of the AArch64 Linux kernel.
+
+The AArch64 exception model is made up of a number of exception levels
+(EL0 - EL3), with EL0 and EL1 having a secure and a non-secure
+counterpart.  EL2 is the hypervisor level and exists only in non-secure
+mode. EL3 is the highest priority level and exists only in secure mode.
+
+For the purposes of this document, we will use the term `boot loader'
+simply to define all software that executes on the CPU(s) before control
+is passed to the Linux kernel.  This may include secure monitor and
+hypervisor code, or it may just be a handful of instructions for
+preparing a minimal boot environment.
+
+Essentially, the boot loader should provide (as a minimum) the
+following:
+
+1. Setup and initialise the RAM
+2. Setup the device tree
+3. Decompress the kernel image
+4. Call the kernel image
+
+
+1. Setup and initialise RAM
+---------------------------
+
+Requirement: MANDATORY
+
+The boot loader is expected to find and initialise all RAM that the
+kernel will use for volatile data storage in the system.  It performs
+this in a machine dependent manner.  (It may use internal algorithms
+to automatically locate and size all RAM, or it may use knowledge of
+the RAM in the machine, or any other method the boot loader designer
+sees fit.)
+
+
+2. Setup the device tree
+-------------------------
+
+Requirement: MANDATORY
+
+The device tree blob (dtb) must be no bigger than 2 megabytes in size
+and placed at a 2-megabyte boundary within the first 512 megabytes from
+the start of the kernel image. This is to allow the kernel to map the
+blob using a single section mapping in the initial page tables.
+
+
+3. Decompress the kernel image
+------------------------------
+
+Requirement: OPTIONAL
+
+The AArch64 kernel does not provide a decompressor and therefore
+requires gzip decompression to be performed by the boot loader if the
+default Image.gz target is used.  For bootloaders that do not implement
+this requirement, the larger Image target is available instead.
+
+
+4. Call the kernel image
+------------------------
+
+Requirement: MANDATORY
+
+The decompressed kernel image contains a 32-byte header as follows:
+
+  u32 magic	= 0x14000008;	/* branch to stext, little-endian */
+  u32 res0	= 0;		/* reserved */
+  u64 text_offset;		/* Image load offset */
+  u64 res1	= 0;		/* reserved */
+  u64 res2	= 0;		/* reserved */
+
+The image must be placed at the specified offset (currently 0x80000)
+from the start of the system RAM and called there. The start of the
+system RAM must be aligned to 2MB.
+
+Before jumping into the kernel, the following conditions must be met:
+
+- Quiesce all DMA capable devices so that memory does not get
+  corrupted by bogus network packets or disk data.  This will save
+  you many hours of debug.
+
+- Primary CPU general-purpose register settings
+  x0 = physical address of device tree blob (dtb) in system RAM.
+
+- CPU mode
+  All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError,
+  IRQ and FIQ).
+  The CPU must be in either EL2 (RECOMMENDED in order to have access to
+  the virtualisation extensions) or non-secure EL1.
+
+- Caches, MMUs
+  The MMU must be off.
+  Instruction cache may be on or off.
+  Data cache must be off and invalidated.
+
+- Architected timers
+  CNTFRQ must be programmed with the timer frequency.
+  If entering the kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0)
+  set where available.
+
+- Coherency
+  All CPUs to be booted by the kernel must be part of the same coherency
+  domain on entry to the kernel.  This may require IMPLEMENTATION DEFINED
+  initialisation to enable the receiving of maintenance operations on
+  each CPU.
+
+- System registers
+  All writable architected system registers at the exception level where
+  the kernel image will be entered must be initialised by software at a
+  higher exception level to prevent execution in an UNKNOWN state.
+
+The boot loader is expected to enter the kernel on each CPU in the
+following manner:
+
+- The primary CPU must jump directly to the first instruction of the
+  kernel image.  The device tree blob passed by this CPU must contain
+  for each CPU node:
+
+    1. An 'enable-method' property. Currently, the only supported value
+       for this field is the string "spin-table".
+
+    2. A 'cpu-release-addr' property identifying a 64-bit,
+       zero-initialised memory location.
+
+  It is expected that the bootloader will generate these device tree
+  properties and insert them into the blob prior to kernel entry.
+
+- Any secondary CPUs must spin outside of the kernel in a reserved area
+  of memory (communicated to the kernel by a /memreserve/ region in the
+  device tree) polling their cpu-release-addr location, which must be
+  contained in the reserved region.  A wfe instruction may be inserted
+  to reduce the overhead of the busy-loop and a sev will be issued by
+  the primary CPU.  When a read of the location pointed to by the
+  cpu-release-addr returns a non-zero value, the CPU must jump directly
+  to this value.
diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h
new file mode 100644
index 0000000..d766493
--- /dev/null
+++ b/arch/arm64/include/asm/setup.h
@@ -0,0 +1,26 @@
+/*
+ * Based on arch/arm/include/asm/setup.h
+ *
+ * Copyright (C) 1997-1999 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SETUP_H
+#define __ASM_SETUP_H
+
+#include <linux/types.h>
+
+#define COMMAND_LINE_SIZE 1024
+
+#endif
diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
new file mode 100644
index 0000000..34ccdc0
--- /dev/null
+++ b/arch/arm64/kernel/head.S
@@ -0,0 +1,521 @@
+/*
+ * Low-level CPU initialisation
+ * Based on arch/arm/kernel/head.S
+ *
+ * Copyright (C) 1994-2002 Russell King
+ * Copyright (C) 2003-2012 ARM Ltd.
+ * Authors:	Catalin Marinas <catalin.marinas@arm.com>
+ *		Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/init.h>
+
+#include <asm/assembler.h>
+#include <asm/ptrace.h>
+#include <asm/asm-offsets.h>
+#include <asm/memory.h>
+#include <asm/thread_info.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/pgtable.h>
+#include <asm/page.h>
+
+/*
+ * swapper_pg_dir is the virtual address of the initial page table. We place
+ * the page tables 3 * PAGE_SIZE below KERNEL_RAM_VADDR. The idmap_pg_dir has
+ * 2 pages and is placed below swapper_pg_dir.
+ */
+#define KERNEL_RAM_VADDR	(PAGE_OFFSET + TEXT_OFFSET)
+
+#if (KERNEL_RAM_VADDR & 0xfffff) != 0x80000
+#error KERNEL_RAM_VADDR must start at 0xXXX80000
+#endif
+
+#define SWAPPER_DIR_SIZE	(3 * PAGE_SIZE)
+#define IDMAP_DIR_SIZE		(2 * PAGE_SIZE)
+
+	.globl	swapper_pg_dir
+	.equ	swapper_pg_dir, KERNEL_RAM_VADDR - SWAPPER_DIR_SIZE
+
+	.globl	idmap_pg_dir
+	.equ	idmap_pg_dir, swapper_pg_dir - IDMAP_DIR_SIZE
+
+	.macro	pgtbl, ttb0, ttb1, phys
+	add	\ttb1, \phys, #TEXT_OFFSET - SWAPPER_DIR_SIZE
+	sub	\ttb0, \ttb1, #IDMAP_DIR_SIZE
+	.endm
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#define BLOCK_SHIFT	PAGE_SHIFT
+#define BLOCK_SIZE	PAGE_SIZE
+#else
+#define BLOCK_SHIFT	SECTION_SHIFT
+#define BLOCK_SIZE	SECTION_SIZE
+#endif
+
+#define KERNEL_START	KERNEL_RAM_VADDR
+#define KERNEL_END	_end
+
+/*
+ * Initial memory map attributes.
+ */
+#ifndef CONFIG_SMP
+#define PTE_FLAGS	PTE_ATTRINDX(MT_NORMAL) | PTE_AF
+#define PMD_FLAGS	PMD_ATTRINDX(MT_NORMAL) | PMD_SECT_AF
+#else
+#define PTE_FLAGS	PTE_ATTRINDX(MT_NORMAL) | PTE_AF | PTE_SHARED
+#define PMD_FLAGS	PMD_ATTRINDX(MT_NORMAL) | PMD_SECT_AF | PMD_SECT_S
+#endif
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#define MM_MMUFLAGS	PTE_TYPE_PAGE | PTE_FLAGS
+#define IO_MMUFLAGS	PTE_TYPE_PAGE | PTE_XN | PTE_FLAGS
+#else
+#define MM_MMUFLAGS	PMD_TYPE_SECT | PMD_FLAGS
+#define IO_MMUFLAGS	PMD_TYPE_SECT | PMD_SECT_XN | PMD_FLAGS
+#endif
+
+/*
+ * Kernel startup entry point.
+ * ---------------------------
+ *
+ * The requirements are:
+ *   MMU = off, D-cache = off, I-cache = on or off,
+ *   x0 = physical address to the FDT blob.
+ *
+ * This code is mostly position independent so you call this at
+ * __pa(PAGE_OFFSET + TEXT_OFFSET).
+ *
+ * Note that the callee-saved registers are used for storing variables
+ * that are useful before the MMU is enabled. The allocations are described
+ * in the entry routines.
+ */
+	__HEAD
+
+	/*
+	 * DO NOT MODIFY. Image header expected by Linux boot-loaders.
+	 */
+	b	stext				// branch to kernel start, magic
+	.long	0				// reserved
+	.quad	TEXT_OFFSET			// Image load offset from start of RAM
+	.quad	0				// reserved
+	.quad	0				// reserved
+
+ENTRY(stext)
+	mov	x21, x0				// x21=FDT
+	bl	el2_setup			// Drop to EL1
+	mrs	x22, midr_el1			// x22=cpuid
+	mov	x0, x22
+	bl	__lookup_processor_type
+	mov	x23, x0				// x23=procinfo
+	cbz	x23, __error_p			// invalid processor (x23=0)?
+	bl	__calc_phys_offset		// x24=phys offset
+	bl	__vet_fdt
+	bl	__create_page_tables		// x25=TTBR0, x26=TTBR1
+	/*
+	 * The following calls CPU specific code in a position independent
+	 * manner. See arch/arm64/mm/proc.S for details. x23 = base of
+	 * cpu_proc_info structure selected by __lookup_processor_type above.
+	 * On return, the CPU will be ready for the MMU to be turned on and
+	 * the TCR will have been set.
+	 */
+	ldr	x27, __switch_data		// address to jump to after
+						// MMU has been enabled
+	adr	lr, __enable_mmu		// return (PIC) address
+	add	x12, x23, #PROCINFO_INITFUNC
+	br	x12				// initialise processor
+ENDPROC(stext)
+
+/*
+ * If we're fortunate enough to boot at EL2, ensure that the world is
+ * sane before dropping to EL1.
+ */
+ENTRY(el2_setup)
+	mrs	x0, CurrentEL
+	cmp	x0, #PSR_MODE_EL2t
+	ccmp	x0, #PSR_MODE_EL2h, #0x4, ne
+	b.eq	1f
+	ret
+
+	/* Hyp configuration. */
+1:	mov	x0, #(1 << 31)			// 64-bit EL1
+	msr	hcr_el2, x0
+
+	/* Generic timers. */
+	mrs	x0, cnthctl_el2
+	orr	x0, x0, #3			// Enable EL1 physical timers
+	msr	cnthctl_el2, x0
+
+	/* Populate ID registers. */
+	mrs	x0, midr_el1
+	mrs	x1, mpidr_el1
+	msr	vpidr_el2, x0
+	msr	vmpidr_el2, x1
+
+	/* sctlr_el1 */
+	mov	x0, #0x0800			// Set/clear RES{1,0} bits
+	movk	x0, #0x30d0, lsl #16
+	msr	sctlr_el1, x0
+
+	/* Coprocessor traps. */
+	mov	x0, #0x33ff
+	msr	cptr_el2, x0			// Disable copro. traps to EL2
+
+#ifdef CONFIG_AARCH32_EMULATION
+	msr	hstr_el2, xzr			// Disable CP15 traps to EL2
+#endif
+
+	/* spsr */
+	mov	x0, #(PSR_F_BIT | PSR_I_BIT | PSR_A_BIT | PSR_D_BIT |\
+		      PSR_MODE_EL1h)
+	msr	spsr_el2, x0
+	msr	elr_el2, lr
+	eret
+ENDPROC(el2_setup)
+
+	.align	3
+2:	.quad	.
+	.quad	PAGE_OFFSET
+
+#ifdef CONFIG_SMP
+	.pushsection    .smp.pen.text, "ax"
+	.align	3
+1:	.quad	.
+	.quad	secondary_holding_pen_release
+
+	/*
+	 * This provides a "holding pen" for platforms to hold all secondary
+	 * cores until we're ready for them to initialise.
+	 */
+ENTRY(secondary_holding_pen)
+	bl	el2_setup			// Drop to EL1
+	mrs	x0, mpidr_el1
+	and	x0, x0, #15			// CPU number
+	adr	x1, 1b
+	ldp	x2, x3, [x1]
+	sub	x1, x1, x2
+	add	x3, x3, x1
+pen:	ldr	x4, [x3]
+	cmp	x4, x0
+	b.eq	secondary_startup
+	wfe
+	b	pen
+ENDPROC(secondary_holding_pen)
+	.popsection
+
+ENTRY(secondary_startup)
+	/*
+	 * Common entry point for secondary CPUs.
+	 */
+	mrs	x22, midr_el1			// x22=cpuid
+	mov	x0, x22
+	bl	__lookup_processor_type
+	mov	x23, x0				// x23=procinfo
+	cbz	x23, __error_p			// invalid processor (x23=0)?
+
+	bl	__calc_phys_offset		// x24=phys offset
+	pgtbl	x25, x26, x24			// x25=TTBR0, x26=TTBR1
+	add	x12, x23, #PROCINFO_INITFUNC
+	blr	x12				// initialise processor
+
+	ldr	x21, =secondary_data
+	ldr	x27, =__secondary_switched	// address to jump to after enabling the MMU
+	b	__enable_mmu
+ENDPROC(secondary_startup)
+
+ENTRY(__secondary_switched)
+	ldr	x0, [x21]			// get secondary_data.stack
+	mov	sp, x0
+	mov	x29, #0
+	b	secondary_start_kernel
+ENDPROC(__secondary_switched)
+#endif	/* CONFIG_SMP */
+
+/*
+ * Set up common bits before finally enabling the MMU. Essentially this is just
+ * loading the page table pointer and vector base registers.
+ *
+ * On entry to this code, x0 must contain the SCTLR_EL1 value for turning on
+ * the MMU.
+ */
+__enable_mmu:
+	ldr	x5, =vectors
+	msr	vbar_el1, x5
+	msr	ttbr0_el1, x25			// load TTBR0
+	msr	ttbr1_el1, x26			// load TTBR1
+	isb
+	b	__turn_mmu_on
+ENDPROC(__enable_mmu)
+
+/*
+ * Enable the MMU. This completely changes the structure of the visible memory
+ * space. You will not be able to trace execution through this.
+ *
+ *  x0  = system control register
+ *  x27 = *virtual* address to jump to upon completion
+ *
+ * other registers depend on the function called upon completion
+ */
+	.align	6
+__turn_mmu_on:
+	msr	sctlr_el1, x0
+	isb
+	br	x27
+ENDPROC(__turn_mmu_on)
+
+/*
+ * Calculate the start of physical memory.
+ */
+__calc_phys_offset:
+	adr	x0, 1f
+	ldp	x1, x2, [x0]
+	sub	x3, x0, x1			// PHYS_OFFSET - PAGE_OFFSET
+	add	x24, x2, x3			// x24=PHYS_OFFSET
+	ret
+ENDPROC(__calc_phys_offset)
+
+	.align 3
+1:	.quad	.
+	.quad	PAGE_OFFSET
+
+/*
+ * Macro to populate the PGD for the corresponding block entry in the next
+ * level (tbl) for the given virtual address.
+ *
+ * Preserves:	pgd, tbl, virt
+ * Corrupts:	tmp1, tmp2
+ */
+	.macro	create_pgd_entry, pgd, tbl, virt, tmp1, tmp2
+	lsr	\tmp1, \virt, #PGDIR_SHIFT
+	and	\tmp1, \tmp1, #PTRS_PER_PGD - 1	// PGD index
+	orr	\tmp2, \tbl, #3			// PGD entry table type
+	str	\tmp2, [\pgd, \tmp1, lsl #3]
+	.endm
+
+/*
+ * Macro to populate block entries in the page table for the start..end
+ * virtual range (inclusive).
+ *
+ * Preserves:	tbl, flags
+ * Corrupts:	phys, start, end, pstate
+ */
+	.macro	create_block_map, tbl, flags, phys, start, end, idmap=0
+	lsr	\phys, \phys, #BLOCK_SHIFT
+	.if	\idmap
+	and	\start, \phys, #PTRS_PER_PTE - 1	// table index
+	.else
+	lsr	\start, \start, #BLOCK_SHIFT
+	and	\start, \start, #PTRS_PER_PTE - 1	// table index
+	.endif
+	orr	\phys, \flags, \phys, lsl #BLOCK_SHIFT	// table entry
+	.ifnc	\start,\end
+	lsr	\end, \end, #BLOCK_SHIFT
+	and	\end, \end, #PTRS_PER_PTE - 1		// table end index
+	.endif
+9999:	str	\phys, [\tbl, \start, lsl #3]		// store the entry
+	.ifnc	\start,\end
+	add	\start, \start, #1			// next entry
+	add	\phys, \phys, #BLOCK_SIZE		// next block
+	cmp	\start, \end
+	b.ls	9999b
+	.endif
+	.endm
+
+/*
+ * Set up the initial page tables. We only set up the bare minimum that is
+ * required to get the kernel running. The following sections are required:
+ *   - identity mapping to enable the MMU (low address, TTBR0)
+ *   - first few MB of the kernel linear mapping to jump to once the MMU has
+ *     been enabled, including the FDT blob (TTBR1)
+ */
+__create_page_tables:
+	pgtbl	x25, x26, x24			// idmap_pg_dir and swapper_pg_dir addresses
+
+	/*
+	 * Clear the idmap and swapper page tables.
+	 */
+	mov	x0, x25
+	add	x6, x26, #SWAPPER_DIR_SIZE
+1:	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	stp	xzr, xzr, [x0], #16
+	cmp	x0, x6
+	b.lo	1b
+
+	ldr	x7, =MM_MMUFLAGS
+
+	/*
+	 * Create the identity mapping.
+	 */
+	add	x0, x25, #PAGE_SIZE		// section table address
+	adr	x3, __turn_mmu_on		// virtual/physical address
+	create_pgd_entry x25, x0, x3, x5, x6
+	create_block_map x0, x7, x3, x5, x5, idmap=1
+
+	/*
+	 * Map the kernel image (starting with PHYS_OFFSET).
+	 */
+	add	x0, x26, #PAGE_SIZE		// section table address
+	mov	x5, #PAGE_OFFSET
+	create_pgd_entry x26, x0, x5, x3, x6
+	ldr	x6, =KERNEL_END - 1
+	mov	x3, x24				// phys offset
+	create_block_map x0, x7, x3, x5, x6
+
+	/*
+	 * Map the FDT blob (maximum 2MB; must be within 512MB of
+	 * PHYS_OFFSET).
+	 */
+	mov	x3, x21				// FDT phys address
+	and	x3, x3, #~((1 << 21) - 1)	// 2MB aligned
+	mov	x6, #PAGE_OFFSET
+	sub	x5, x3, x24			// subtract PHYS_OFFSET
+	tst	x5, #~((1 << 29) - 1)		// within 512MB?
+	csel	x21, xzr, x21, ne		// zero the FDT pointer
+	b.ne	1f
+	add	x5, x5, x6			// __va(FDT blob)
+	add	x6, x5, #1 << 21		// 2MB for the FDT blob
+	sub	x6, x6, #1			// inclusive range
+	create_block_map x0, x7, x3, x5, x6
+1:
+	ret
+ENDPROC(__create_page_tables)
+	.ltorg
+
+	.align	3
+	.type	__switch_data, %object
+__switch_data:
+	.quad	__mmap_switched
+	.quad	__data_loc			// x4
+	.quad	_data				// x5
+	.quad	__bss_start			// x6
+	.quad	_end				// x7
+	.quad	processor_id			// x4
+	.quad	__fdt_pointer			// x5
+	.quad	memstart_addr			// x6
+	.quad	init_thread_union + THREAD_START_SP // sp
+
+/*
+ * The following fragment of code is executed with the MMU on, and uses
+ * absolute addresses; this is not position independent.
+ */
+__mmap_switched:
+	adr	x3, __switch_data + 8
+
+	ldp	x4, x5, [x3], #16
+	ldp	x6, x7, [x3], #16
+	cmp	x4, x5				// Copy data segment if needed
+1:	ccmp	x5, x6, #4, ne
+	b.eq	2f
+	ldr	x16, [x4], #8
+	str	x16, [x5], #8
+	b	1b
+2:
+1:	cmp	x6, x7
+	b.hs	2f
+	str	xzr, [x6], #8			// Clear BSS
+	b	1b
+2:
+	ldp	x4, x5, [x3], #16
+	ldr	x6, [x3], #8
+	ldr	x16, [x3]
+	mov	sp, x16
+	str	x22, [x4]			// Save processor ID
+	str	x21, [x5]			// Save FDT pointer
+	str	x24, [x6]			// Save PHYS_OFFSET
+	mov	x29, #0
+	b	start_kernel
+ENDPROC(__mmap_switched)
+
+/*
+ * Exception handling. Something went wrong and we can't proceed. We ought to
+ * tell the user, but since we don't have any guarantee that we're even
+ * running on the right architecture, we do virtually nothing.
+ */
+__error_p:
+ENDPROC(__error_p)
+
+__error:
+1:	nop
+	b	1b
+ENDPROC(__error)
+
+/*
+ * Read processor ID register and look up in the linker-built supported
+ * processor list. Note that we can't use the absolute addresses for the
+ * __proc_info lists since we aren't running with the MMU on (and therefore,
+ * we are not in the correct address space). We have to calculate the offset.
+ *
+ * This routine can be called via C code, so to avoid needlessly saving
+ * callee-saved registers, we take the CPUID in x0 and return the physical
+ * proc_info pointer in x0 as well.
+ */
+__lookup_processor_type:
+	adr	x1, __lookup_processor_type_data
+	ldr	x2, [x1]
+	ldp	x3, x4, [x1, #8]
+	sub	x1, x1, x2			// get offset between virt&phys
+	add	x3, x3, x1			// convert virt addresses to
+	add	x4, x4, x1			// physical address space
+1:
+	ldp	w5, w6, [x3]			// load cpu_val and cpu_mask
+	and	x6, x6, x0
+	cmp	x5, x6
+	b.eq	2f
+	add	x3, x3, #PROC_INFO_SZ
+	cmp	x3, x4
+	b.ne	1b
+	mov	x3, #0				// unknown processor
+2:
+	mov	x0, x3
+	ret
+ENDPROC(__lookup_processor_type)
+
+/*
+ * This provides a C-API version of the above function.
+ */
+ENTRY(lookup_processor_type)
+	mov	x8, lr
+	bl	__lookup_processor_type
+	ret	x8
+ENDPROC(lookup_processor_type)
+
+	.align	3
+	.type	__lookup_processor_type_data, %object
+__lookup_processor_type_data:
+	.quad	.
+	.quad	__proc_info_begin
+	.quad	__proc_info_end
+	.size	__lookup_processor_type_data, . - __lookup_processor_type_data
+
+/*
+ * Determine validity of the x21 FDT pointer.
+ * The dtb must be 8-byte aligned and live in the first 512M of memory.
+ */
+__vet_fdt:
+	tst	x21, #0x7
+	b.ne	1f
+	cmp	x21, x24
+	b.lt	1f
+	mov	x0, #(1 << 29)
+	add	x0, x0, x24
+	cmp	x21, x0
+	b.ge	1f
+	ret
+1:
+	mov	x21, #0
+	ret
+ENDPROC(__vet_fdt)
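
Reader's note (not part of the patch): the checks made by __vet_fdt can be
restated in C as the sketch below. The function name sketch_vet_fdt and the
SKETCH_FDT_WINDOW macro are hypothetical. One assumption differs from the
assembly: the sketch compares unsigned values, whereas the routine above uses
the signed condition codes b.lt and b.ge.

```c
#include <stdint.h>

/* The dtb must live in the first 512MB of memory (1 << 29 bytes). */
#define SKETCH_FDT_WINDOW	(UINT64_C(1) << 29)

/*
 * Hypothetical C restatement of __vet_fdt: return the FDT physical address
 * unchanged if it is 8-byte aligned and lies within 512MB of phys_offset,
 * otherwise return 0 (mirroring "mov x21, #0").
 */
static uint64_t sketch_vet_fdt(uint64_t fdt_phys, uint64_t phys_offset)
{
	if (fdt_phys & 0x7)			/* not 8-byte aligned */
		return 0;
	if (fdt_phys < phys_offset)		/* below the start of RAM */
		return 0;
	if (fdt_phys >= phys_offset + SKETCH_FDT_WINDOW)
		return 0;			/* beyond the first 512MB */
	return fdt_phys;
}
```

A zero result corresponds to the NULL pointer that setup_machine_fdt() later
reports as "NULL or invalid device tree blob".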
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
new file mode 100644
index 0000000..f25186f
--- /dev/null
+++ b/arch/arm64/kernel/setup.c
@@ -0,0 +1,357 @@
+/*
+ * Based on arch/arm/kernel/setup.c
+ *
+ * Copyright (C) 1995-2001 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/stddef.h>
+#include <linux/ioport.h>
+#include <linux/delay.h>
+#include <linux/utsname.h>
+#include <linux/initrd.h>
+#include <linux/console.h>
+#include <linux/bootmem.h>
+#include <linux/seq_file.h>
+#include <linux/screen_info.h>
+#include <linux/init.h>
+#include <linux/kexec.h>
+#include <linux/crash_dump.h>
+#include <linux/root_dev.h>
+#include <linux/cpu.h>
+#include <linux/interrupt.h>
+#include <linux/smp.h>
+#include <linux/fs.h>
+#include <linux/proc_fs.h>
+#include <linux/memblock.h>
+#include <linux/of_fdt.h>
+
+#include <asm/cputype.h>
+#include <asm/elf.h>
+#include <asm/procinfo.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+#include <asm/traps.h>
+#include <asm/memblock.h>
+
+extern void paging_init(void);
+
+unsigned int processor_id;
+EXPORT_SYMBOL(processor_id);
+
+unsigned int elf_hwcap __read_mostly;
+EXPORT_SYMBOL(elf_hwcap);
+
+static const char *cpu_name;
+static const char *machine_name;
+phys_addr_t __fdt_pointer __initdata;
+
+/*
+ * Standard memory resources
+ */
+static struct resource mem_res[] = {
+	{
+		.name = "Kernel code",
+		.start = 0,
+		.end = 0,
+		.flags = IORESOURCE_MEM
+	},
+	{
+		.name = "Kernel data",
+		.start = 0,
+		.end = 0,
+		.flags = IORESOURCE_MEM
+	}
+};
+
+#define kernel_code mem_res[0]
+#define kernel_data mem_res[1]
+
+/*
+ * These functions re-use the assembly code in head.S, which
+ * already provides the required functionality.
+ */
+extern struct proc_info_list *lookup_processor_type(unsigned int);
+
+void __init early_print(const char *str, ...)
+{
+	char buf[256];
+	va_list ap;
+
+	va_start(ap, str);
+	vsnprintf(buf, sizeof(buf), str, ap);
+	va_end(ap);
+
+	printk("%s", buf);
+}
+
+static void __init setup_processor(void)
+{
+	struct proc_info_list *list;
+
+	/*
+	 * locate processor in the list of supported processor
+	 * types.  The linker builds this table for us from the
+	 * entries in arch/arm64/mm/proc.S
+	 */
+	list = lookup_processor_type(read_cpuid_id());
+	if (!list) {
+		printk("CPU configuration botched (ID %08x), unable to continue.\n",
+		       read_cpuid_id());
+		while (1);
+	}
+
+	cpu_name = list->cpu_name;
+
+	printk("CPU: %s [%08x] revision %d\n",
+	       cpu_name, read_cpuid_id(), read_cpuid_id() & 15);
+
+	sprintf(init_utsname()->machine, "aarch64");
+	elf_hwcap = 0;
+
+	cpu_proc_init();
+}
+
+static void __init setup_machine_fdt(phys_addr_t dt_phys)
+{
+	struct boot_param_header *devtree;
+	unsigned long dt_root;
+
+	/* Check we have a non-NULL DT pointer */
+	if (!dt_phys) {
+		early_print("\n"
+			"Error: NULL or invalid device tree blob\n"
+			"The dtb must be 8-byte aligned and passed in the first 512MB of memory\n"
+			"\nPlease check your bootloader.\n");
+
+		while (true)
+			cpu_relax();
+
+	}
+
+	devtree = phys_to_virt(dt_phys);
+
+	/* Check device tree validity */
+	if (be32_to_cpu(devtree->magic) != OF_DT_HEADER) {
+		early_print("\n"
+			"Error: invalid device tree blob at physical address 0x%p (virtual address 0x%p)\n"
+			"Expected 0x%x, found 0x%x\n"
+			"\nPlease check your bootloader.\n",
+			dt_phys, devtree, OF_DT_HEADER,
+			be32_to_cpu(devtree->magic));
+
+		while (true)
+			cpu_relax();
+	}
+
+	initial_boot_params = devtree;
+	dt_root = of_get_flat_dt_root();
+
+	machine_name = of_get_flat_dt_prop(dt_root, "model", NULL);
+	if (!machine_name)
+		machine_name = of_get_flat_dt_prop(dt_root, "compatible", NULL);
+	if (!machine_name)
+		machine_name = "<unknown>";
+	pr_info("Machine: %s\n", machine_name);
+
+	/* Retrieve various information from the /chosen node */
+	of_scan_flat_dt(early_init_dt_scan_chosen, boot_command_line);
+	/* Initialize {size,address}-cells info */
+	of_scan_flat_dt(early_init_dt_scan_root, NULL);
+	/* Setup memory, calling early_init_dt_add_memory_arch */
+	of_scan_flat_dt(early_init_dt_scan_memory, NULL);
+}
+
+void __init early_init_dt_add_memory_arch(u64 base, u64 size)
+{
+	size &= PAGE_MASK;
+	memblock_add(base, size);
+}
+
+void * __init early_init_dt_alloc_memory_arch(u64 size, u64 align)
+{
+	return __va(memblock_alloc(size, align));
+}
+
+/*
+ * Limit the memory size that was specified via FDT.
+ */
+static int __init early_mem(char *p)
+{
+	phys_addr_t limit;
+
+	if (!p)
+		return 1;
+
+	limit = memparse(p, &p) & PAGE_MASK;
+	pr_notice("Memory limited to %lldMB\n", limit >> 20);
+
+	memblock_enforce_memory_limit(limit);
+
+	return 0;
+}
+early_param("mem", early_mem);
+
+static void __init request_standard_resources(void)
+{
+	struct memblock_region *region;
+	struct resource *res;
+
+	kernel_code.start   = virt_to_phys(_text);
+	kernel_code.end     = virt_to_phys(_etext - 1);
+	kernel_data.start   = virt_to_phys(_sdata);
+	kernel_data.end     = virt_to_phys(_end - 1);
+
+	for_each_memblock(memory, region) {
+		res = alloc_bootmem_low(sizeof(*res));
+		res->name  = "System RAM";
+		res->start = __pfn_to_phys(memblock_region_memory_base_pfn(region));
+		res->end = __pfn_to_phys(memblock_region_memory_end_pfn(region)) - 1;
+		res->flags = IORESOURCE_MEM | IORESOURCE_BUSY;
+
+		request_resource(&iomem_resource, res);
+
+		if (kernel_code.start >= res->start &&
+		    kernel_code.end <= res->end)
+			request_resource(res, &kernel_code);
+		if (kernel_data.start >= res->start &&
+		    kernel_data.end <= res->end)
+			request_resource(res, &kernel_data);
+	}
+}
+
+void __init setup_arch(char **cmdline_p)
+{
+	setup_processor();
+
+	setup_machine_fdt(__fdt_pointer);
+
+	init_mm.start_code = (unsigned long) _text;
+	init_mm.end_code   = (unsigned long) _etext;
+	init_mm.end_data   = (unsigned long) _edata;
+	init_mm.brk	   = (unsigned long) _end;
+
+	*cmdline_p = boot_command_line;
+
+	parse_early_param();
+
+	arm64_memblock_init();
+
+	paging_init();
+	request_standard_resources();
+
+	unflatten_device_tree();
+
+#ifdef CONFIG_SMP
+	smp_init_cpus();
+#endif
+
+#ifdef CONFIG_VT
+#if defined(CONFIG_VGA_CONSOLE)
+	conswitchp = &vga_con;
+#elif defined(CONFIG_DUMMY_CONSOLE)
+	conswitchp = &dummy_con;
+#endif
+#endif
+}
+
+static DEFINE_PER_CPU(struct cpu, cpu_data);
+
+static int __init topology_init(void)
+{
+	int i;
+
+	for_each_possible_cpu(i) {
+		struct cpu *cpu = &per_cpu(cpu_data, i);
+		cpu->hotpluggable = 1;
+		register_cpu(cpu, i);
+	}
+
+	return 0;
+}
+subsys_initcall(topology_init);
+
+static const char *hwcap_str[] = {
+	"fp",
+	"asimd",
+	NULL
+};
+
+static int c_show(struct seq_file *m, void *v)
+{
+	int i;
+
+	seq_printf(m, "Processor\t: %s rev %d (%s)\n",
+		   cpu_name, read_cpuid_id() & 15, ELF_PLATFORM);
+
+	for_each_online_cpu(i) {
+		/*
+		 * glibc reads /proc/cpuinfo to determine the number of
+		 * online processors, looking for lines beginning with
+		 * "processor".  Give glibc what it expects.
+		 */
+#ifdef CONFIG_SMP
+		seq_printf(m, "processor\t: %d\n", i);
+#endif
+		seq_printf(m, "BogoMIPS\t: %lu.%02lu\n\n",
+			   loops_per_jiffy / (500000UL/HZ),
+			   loops_per_jiffy / (5000UL/HZ) % 100);
+	}
+
+	/* dump out the processor features */
+	seq_puts(m, "Features\t: ");
+
+	for (i = 0; hwcap_str[i]; i++)
+		if (elf_hwcap & (1 << i))
+			seq_printf(m, "%s ", hwcap_str[i]);
+
+	seq_printf(m, "\nCPU implementer\t: 0x%02x\n", read_cpuid_id() >> 24);
+	seq_printf(m, "CPU architecture: AArch64\n");
+	seq_printf(m, "CPU variant\t: 0x%x\n", (read_cpuid_id() >> 20) & 15);
+	seq_printf(m, "CPU part\t: 0x%03x\n", (read_cpuid_id() >> 4) & 0xfff);
+	seq_printf(m, "CPU revision\t: %d\n", read_cpuid_id() & 15);
+
+	seq_puts(m, "\n");
+
+	seq_printf(m, "Hardware\t: %s\n", machine_name);
+
+	return 0;
+}
+
+static void *c_start(struct seq_file *m, loff_t *pos)
+{
+	return *pos < 1 ? (void *)1 : NULL;
+}
+
+static void *c_next(struct seq_file *m, void *v, loff_t *pos)
+{
+	++*pos;
+	return NULL;
+}
+
+static void c_stop(struct seq_file *m, void *v)
+{
+}
+
+const struct seq_operations cpuinfo_op = {
+	.start	= c_start,
+	.next	= c_next,
+	.stop	= c_stop,
+	.show	= c_show
+};



* [PATCH v2 03/31] arm64: Exception handling
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 01/31] arm64: Assembly macros and definitions Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 02/31] arm64: Kernel booting and initialisation Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 23:29   ` Olof Johansson
  2012-08-15 13:03   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 04/31] arm64: MMU definitions Catalin Marinas
                   ` (28 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

The patch contains the exception entry code (kernel/entry.S), pt_regs
structure and related accessors, undefined instruction trapping and
stack tracing.

The AArch64 Linux kernel (including kernel threads) runs in EL1 mode using
the SP1 stack. The vectors don't have a fixed address, only an alignment
requirement (2^11 bytes).

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/ptrace.h     |  206 +++++++++++
 arch/arm64/include/asm/stacktrace.h |   29 ++
 arch/arm64/include/asm/traps.h      |   30 ++
 arch/arm64/kernel/entry.S           |  695 +++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/stacktrace.c      |  127 +++++++
 arch/arm64/kernel/traps.c           |  357 ++++++++++++++++++
 6 files changed, 1444 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/ptrace.h
 create mode 100644 arch/arm64/include/asm/stacktrace.h
 create mode 100644 arch/arm64/include/asm/traps.h
 create mode 100644 arch/arm64/kernel/entry.S
 create mode 100644 arch/arm64/kernel/stacktrace.c
 create mode 100644 arch/arm64/kernel/traps.c
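
Reader's note (not part of the patch): the sketch below illustrates how the
pstate accessors introduced in ptrace.h classify a saved PSTATE value, and
what the 2^11 vector alignment requirement means in practice. The PSR_*
constants are copied from the patch; the sketch_* function names and
SKETCH_VECTORS_ALIGN are hypothetical and operate on a raw value rather than
on struct pt_regs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the ptrace.h definitions in this patch. */
#define PSR_MODE_EL0t	0x00000000
#define PSR_MODE_EL1h	0x00000005
#define PSR_MODE_MASK	0x0000000f
#define PSR_MODE32_BIT	0x00000010
#define PSR_I_BIT	0x00000080

/* Hypothetical name: the vector base only needs 2^11-byte alignment. */
#define SKETCH_VECTORS_ALIGN	(UINT64_C(1) << 11)

/* Mirrors user_mode(): the mode field is EL0t. */
static bool sketch_user_mode(uint64_t pstate)
{
	return (pstate & PSR_MODE_MASK) == PSR_MODE_EL0t;
}

/* Mirrors compat_user_mode(): EL0t with the AArch32 bit set. */
static bool sketch_compat_user_mode(uint64_t pstate)
{
	return (pstate & (PSR_MODE32_BIT | PSR_MODE_MASK)) ==
	       (PSR_MODE32_BIT | PSR_MODE_EL0t);
}

/* Mirrors interrupts_enabled(): IRQs are enabled when the I bit is clear. */
static bool sketch_interrupts_enabled(uint64_t pstate)
{
	return !(pstate & PSR_I_BIT);
}

/* True if a candidate vector base address satisfies the alignment rule. */
static bool sketch_vectors_aligned(uint64_t vbar)
{
	return (vbar & (SKETCH_VECTORS_ALIGN - 1)) == 0;
}
```

Note that, as in the patch, a compat (AArch32) user task also satisfies the
plain user-mode test, since only the low four mode bits are compared there.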

diff --git a/arch/arm64/include/asm/ptrace.h b/arch/arm64/include/asm/ptrace.h
new file mode 100644
index 0000000..a9abace
--- /dev/null
+++ b/arch/arm64/include/asm/ptrace.h
@@ -0,0 +1,206 @@
+/*
+ * Based on arch/arm/include/asm/ptrace.h
+ *
+ * Copyright (C) 1996-2003 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PTRACE_H
+#define __ASM_PTRACE_H
+
+#include <linux/types.h>
+
+#include <asm/hwcap.h>
+
+#define PTRACE_GETREGS		12
+#define PTRACE_SETREGS		13
+#define PTRACE_GETFPSIMDREGS	14
+#define PTRACE_SETFPSIMDREGS	15
+/* PTRACE_ATTACH is 16 */
+/* PTRACE_DETACH is 17 */
+#define PTRACE_GET_THREAD_AREA	22
+#define PTRACE_SET_SYSCALL	23
+#define PTRACE_GETHBPREGS	29
+#define PTRACE_SETHBPREGS	30
+
+/* AArch32-specific ptrace requests */
+#define COMPAT_PTRACE_GETVFPREGS	27
+#define COMPAT_PTRACE_SETVFPREGS	28
+
+/*
+ * PSR bits
+ */
+#define PSR_MODE_EL0t	0x00000000
+#define PSR_MODE_EL1t	0x00000004
+#define PSR_MODE_EL1h	0x00000005
+#define PSR_MODE_EL2t	0x00000008
+#define PSR_MODE_EL2h	0x00000009
+#define PSR_MODE_EL3t	0x0000000c
+#define PSR_MODE_EL3h	0x0000000d
+#define PSR_MODE_MASK	0x0000000f
+
+/* AArch32 CPSR bits */
+#define PSR_MODE32_BIT		0x00000010
+#define COMPAT_PSR_MODE_USR	0x00000010
+#define COMPAT_PSR_T_BIT	0x00000020
+#define COMPAT_PSR_IT_MASK	0x0600fc00	/* If-Then execution state mask */
+
+/* AArch64 SPSR bits */
+#define PSR_F_BIT	0x00000040
+#define PSR_I_BIT	0x00000080
+#define PSR_A_BIT	0x00000100
+#define PSR_D_BIT	0x00000200
+#define PSR_Q_BIT	0x08000000
+#define PSR_V_BIT	0x10000000
+#define PSR_C_BIT	0x20000000
+#define PSR_Z_BIT	0x40000000
+#define PSR_N_BIT	0x80000000
+
+/*
+ * Groups of PSR bits
+ */
+#define PSR_f		0xff000000	/* Flags		*/
+#define PSR_s		0x00ff0000	/* Status		*/
+#define PSR_x		0x0000ff00	/* Extension		*/
+#define PSR_c		0x000000ff	/* Control		*/
+
+/*
+ * These are 'magic' values for PTRACE_PEEKUSR that return info about where a
+ * process is located in memory.
+ */
+#define PT_TEXT_ADDR		0x10000
+#define PT_DATA_ADDR		0x10004
+#define PT_TEXT_END_ADDR	0x10008
+
+#ifndef __ASSEMBLY__
+
+/*
+ * User structures for general purpose and floating point registers.
+ */
+struct user_pt_regs {
+	__u64		regs[31];
+	__u64		sp;
+	__u64		pc;
+	__u64		pstate;
+};
+
+struct user_fpsimd_state {
+	__uint128_t	vregs[32];
+	__u32		fpsr;
+	__u32		fpcr;
+};
+
+#ifdef __KERNEL__
+
+#ifdef CONFIG_AARCH32_EMULATION
+/* sizeof(struct user) for AArch32 */
+#define COMPAT_USER_SZ	296
+/* AArch32 uses x13 as the stack pointer... */
+#define compat_sp	regs[13]
+/* ... and x14 as the link register. */
+#define compat_lr	regs[14]
+#endif
+
+/*
+ * This struct defines the way the registers are stored on the stack during an
+ * exception. Note that sizeof(struct pt_regs) has to be a multiple of 16 (for
+ * stack alignment). struct user_pt_regs must form a prefix of struct pt_regs.
+ */
+struct pt_regs {
+	union {
+		struct user_pt_regs user_regs;
+		struct {
+			u64 regs[31];
+			u64 sp;
+			u64 pc;
+			u64 pstate;
+		};
+	};
+	u64 orig_x0;
+	u64 syscallno;
+};
+
+#define arch_has_single_step()	(1)
+
+#ifdef CONFIG_AARCH32_EMULATION
+#define compat_thumb_mode(regs) \
+	(((regs)->pstate & COMPAT_PSR_T_BIT))
+#else
+#define compat_thumb_mode(regs) (0)
+#endif
+
+#define user_mode(regs)	\
+	(((regs)->pstate & PSR_MODE_MASK) == PSR_MODE_EL0t)
+
+#define compat_user_mode(regs)	\
+	(((regs)->pstate & (PSR_MODE32_BIT | PSR_MODE_MASK)) == \
+	 (PSR_MODE32_BIT | PSR_MODE_EL0t))
+
+#define processor_mode(regs) \
+	((regs)->pstate & PSR_MODE_MASK)
+
+#define interrupts_enabled(regs) \
+	(!((regs)->pstate & PSR_I_BIT))
+
+#define fast_interrupts_enabled(regs) \
+	(!((regs)->pstate & PSR_F_BIT))
+
+#define user_stack_pointer(regs) \
+	((regs)->sp)
+
+/*
+ * Are the current registers suitable for user mode? (used to maintain
+ * security in signal handlers)
+ */
+static inline int valid_user_regs(struct user_pt_regs *regs)
+{
+	if (user_mode(regs) && (regs->pstate & PSR_I_BIT) == 0) {
+		regs->pstate &= ~(PSR_F_BIT | PSR_A_BIT);
+
+		/* The T bit is reserved for AArch64 */
+		if (!(regs->pstate & PSR_MODE32_BIT))
+			regs->pstate &= ~COMPAT_PSR_T_BIT;
+
+		return 1;
+	}
+
+	/*
+	 * Force PSR to something logical...
+	 */
+	regs->pstate &= PSR_f | PSR_s | (PSR_x & ~PSR_A_BIT) | \
+			COMPAT_PSR_T_BIT | PSR_MODE32_BIT;
+
+	if (!(regs->pstate & PSR_MODE32_BIT)) {
+		regs->pstate &= ~COMPAT_PSR_T_BIT;
+		regs->pstate |= PSR_MODE_EL0t;
+	}
+
+	return 0;
+}
+
+#define instruction_pointer(regs)	(regs)->pc
+
+#ifdef CONFIG_SMP
+extern unsigned long profile_pc(struct pt_regs *regs);
+#else
+#define profile_pc(regs) instruction_pointer(regs)
+#endif
+
+extern int aarch32_break_trap(struct pt_regs *regs);
+
+#endif /* __KERNEL__ */
+
+#endif /* __ASSEMBLY__ */
+
+#endif
diff --git a/arch/arm64/include/asm/stacktrace.h b/arch/arm64/include/asm/stacktrace.h
new file mode 100644
index 0000000..7318f6d
--- /dev/null
+++ b/arch/arm64/include/asm/stacktrace.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_STACKTRACE_H
+#define __ASM_STACKTRACE_H
+
+struct stackframe {
+	unsigned long fp;
+	unsigned long sp;
+	unsigned long pc;
+};
+
+extern int unwind_frame(struct stackframe *frame);
+extern void walk_stackframe(struct stackframe *frame,
+			    int (*fn)(struct stackframe *, void *), void *data);
+
+#endif	/* __ASM_STACKTRACE_H */
diff --git a/arch/arm64/include/asm/traps.h b/arch/arm64/include/asm/traps.h
new file mode 100644
index 0000000..10ca8ff
--- /dev/null
+++ b/arch/arm64/include/asm/traps.h
@@ -0,0 +1,30 @@
+/*
+ * Based on arch/arm/include/asm/traps.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_TRAP_H
+#define __ASM_TRAP_H
+
+static inline int in_exception_text(unsigned long ptr)
+{
+	extern char __exception_text_start[];
+	extern char __exception_text_end[];
+
+	return ptr >= (unsigned long)&__exception_text_start &&
+	       ptr < (unsigned long)&__exception_text_end;
+}
+
+#endif
diff --git a/arch/arm64/kernel/entry.S b/arch/arm64/kernel/entry.S
new file mode 100644
index 0000000..32b96ab
--- /dev/null
+++ b/arch/arm64/kernel/entry.S
@@ -0,0 +1,695 @@
+/*
+ * Low-level exception handling code
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Authors:	Catalin Marinas <catalin.marinas@arm.com>
+ *		Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/errno.h>
+#include <asm/thread_info.h>
+#include <asm/unistd.h>
+
+/*
+ * Bad Abort numbers
+ *-----------------
+ */
+#define BAD_SYNC	0
+#define BAD_IRQ		1
+#define BAD_FIQ		2
+#define BAD_ERROR	3
+
+	.macro	kernel_entry, el, regsize = 64
+	sub	sp, sp, #S_FRAME_SIZE - S_LR	// room for LR, SP, SPSR, ELR
+	.if	\regsize == 32
+	mov	w0, w0				// zero upper 32 bits of x0
+	.endif
+	push	x28, x29
+	push	x26, x27
+	push	x24, x25
+	push	x22, x23
+	push	x20, x21
+	push	x18, x19
+	push	x16, x17
+	push	x14, x15
+	push	x12, x13
+	push	x10, x11
+	push	x8, x9
+	push	x6, x7
+	push	x4, x5
+	push	x2, x3
+	push	x0, x1
+	.if	\el == 0
+	mrs	x21, sp_el0
+	.else
+	add	x21, sp, #S_FRAME_SIZE
+	.endif
+	mrs	x22, elr_el1
+	mrs	x23, spsr_el1
+	stp	lr, x21, [sp, #S_LR]
+	stp	x22, x23, [sp, #S_PC]
+
+	/*
+	 * Set syscallno to -1 by default (overridden later if real syscall).
+	 */
+	.if	\el == 0
+	mvn	x21, xzr
+	str	x21, [sp, #S_SYSCALLNO]
+	.endif
+
+	/*
+	 * Registers that may be useful after this macro is invoked:
+	 *
+	 * x21 - aborted SP
+	 * x22 - aborted PC
+	 * x23 - aborted PSTATE
+	 */
+	.endm
+
+	.macro	kernel_exit, el, ret = 0
+	ldp	x21, x22, [sp, #S_PC]		// load ELR, SPSR
+	.if	\el == 0
+	ldr	x23, [sp, #S_SP]		// load return stack pointer
+	.endif
+	.if	\ret
+	ldr	x1, [sp, #S_X1]			// preserve x0 (syscall return)
+	add	sp, sp, S_X2
+	.else
+	pop	x0, x1
+	.endif
+	pop	x2, x3				// load the rest of the registers
+	pop	x4, x5
+	pop	x6, x7
+	pop	x8, x9
+	msr	elr_el1, x21			// set up the return data
+	msr	spsr_el1, x22
+	.if	\el == 0
+	msr	sp_el0, x23
+	.endif
+	pop	x10, x11
+	pop	x12, x13
+	pop	x14, x15
+	pop	x16, x17
+	pop	x18, x19
+	pop	x20, x21
+	pop	x22, x23
+	pop	x24, x25
+	pop	x26, x27
+	pop	x28, x29
+	ldr	lr, [sp], #S_FRAME_SIZE - S_LR	// load LR and restore SP
+	eret					// return to kernel
+	.endm
+
+	.macro	get_thread_info, rd
+	mov	\rd, sp
+	and	\rd, \rd, #~((1 << 13) - 1)	// top of 8K stack
+	.endm
+
+/*
+ * These are the registers used in the syscall handler, and allow us to
+ * have in theory up to 7 arguments to a function - x0 to x6.
+ *
+ * x7 is reserved for the system call number in 32-bit mode.
+ */
+sc_nr	.req	x25		// number of system calls
+scno	.req	x26		// syscall number
+stbl	.req	x27		// syscall table pointer
+tsk	.req	x28		// current thread_info
+
+/*
+ * Interrupt handling.
+ */
+	.macro	irq_handler
+	ldr	x1, handle_arch_irq
+	mov	x0, sp
+	blr	x1
+	.endm
+
+	.text
+
+/*
+ * Exception vectors.
+ */
+	.macro	ventry	label
+	.align	7
+	b	\label
+	.endm
+
+	.align	11
+ENTRY(vectors)
+	ventry	el1_sync_invalid		// Synchronous EL1t
+	ventry	el1_irq_invalid			// IRQ EL1t
+	ventry	el1_fiq_invalid			// FIQ EL1t
+	ventry	el1_error_invalid		// Error EL1t
+
+	ventry	el1_sync			// Synchronous EL1h
+	ventry	el1_irq				// IRQ EL1h
+	ventry	el1_fiq_invalid			// FIQ EL1h
+	ventry	el1_error_invalid		// Error EL1h
+
+	ventry	el0_sync			// Synchronous 64-bit EL0
+	ventry	el0_irq				// IRQ 64-bit EL0
+	ventry	el0_fiq_invalid			// FIQ 64-bit EL0
+	ventry	el0_error_invalid		// Error 64-bit EL0
+
+#ifdef CONFIG_AARCH32_EMULATION
+	ventry	el0_sync_compat			// Synchronous 32-bit EL0
+	ventry	el0_irq_compat			// IRQ 32-bit EL0
+	ventry	el0_fiq_invalid_compat		// FIQ 32-bit EL0
+	ventry	el0_error_invalid_compat	// Error 32-bit EL0
+#else
+	ventry	el0_sync_invalid		// Synchronous 32-bit EL0
+	ventry	el0_irq_invalid			// IRQ 32-bit EL0
+	ventry	el0_fiq_invalid			// FIQ 32-bit EL0
+	ventry	el0_error_invalid		// Error 32-bit EL0
+#endif
+END(vectors)
+
+/*
+ * Invalid mode handlers
+ */
+	.macro	inv_entry, el, reason, regsize = 64
+	kernel_entry el, \regsize
+	mov	x0, sp
+	mov	x1, #\reason
+	mrs	x2, esr_el1
+	b	bad_mode
+	.endm
+
+el0_sync_invalid:
+	inv_entry 0, BAD_SYNC
+ENDPROC(el0_sync_invalid)
+
+el0_irq_invalid:
+	inv_entry 0, BAD_IRQ
+ENDPROC(el0_irq_invalid)
+
+el0_fiq_invalid:
+	inv_entry 0, BAD_FIQ
+ENDPROC(el0_fiq_invalid)
+
+el0_error_invalid:
+	inv_entry 0, BAD_ERROR
+ENDPROC(el0_error_invalid)
+
+#ifdef CONFIG_AARCH32_EMULATION
+el0_fiq_invalid_compat:
+	inv_entry 0, BAD_FIQ, 32
+ENDPROC(el0_fiq_invalid_compat)
+
+el0_error_invalid_compat:
+	inv_entry 0, BAD_ERROR, 32
+ENDPROC(el0_error_invalid_compat)
+#endif
+
+el1_sync_invalid:
+	inv_entry 1, BAD_SYNC
+ENDPROC(el1_sync_invalid)
+
+el1_irq_invalid:
+	inv_entry 1, BAD_IRQ
+ENDPROC(el1_irq_invalid)
+
+el1_fiq_invalid:
+	inv_entry 1, BAD_FIQ
+ENDPROC(el1_fiq_invalid)
+
+el1_error_invalid:
+	inv_entry 1, BAD_ERROR
+ENDPROC(el1_error_invalid)
+
+/*
+ * EL1 mode handlers.
+ */
+	.align	6
+el1_sync:
+	kernel_entry 1
+	mrs	x1, esr_el1			// read the syndrome register
+	lsr	x24, x1, #26			// exception class
+	cmp	x24, #0x25			// data abort in EL1
+	b.eq	el1_da
+	cmp	x24, #0x18			// configurable trap
+	b.eq	el1_undef
+	cmp	x24, #0x26			// stack alignment exception
+	b.eq	el1_sp_pc
+	cmp	x24, #0x22			// pc alignment exception
+	b.eq	el1_sp_pc
+	cmp	x24, #0x00			// unknown exception in EL1
+	b.eq	el1_undef
+	cmp	x24, #0x30			// debug exception in EL1
+	b.ge	el1_dbg
+	b	el1_inv
+el1_da:
+	/*
+	 * Data abort handling
+	 */
+	mrs	x0, far_el1
+	enable_dbg_if_not_stepping x2
+	// re-enable interrupts if they were enabled in the aborted context
+	tbnz	x23, #7, 1f			// PSR_I_BIT
+	enable_irq
+1:
+	mov	x2, sp				// struct pt_regs
+	bl	do_mem_abort
+
+	// disable interrupts before pulling preserved data off the stack
+	disable_irq
+	kernel_exit 1
+el1_sp_pc:
+	/*
+	 * Stack or PC alignment exception handling
+	 */
+	mrs	x0, far_el1
+	mov	x1, x25
+	mov	x2, sp
+	b	do_sp_pc_abort
+el1_undef:
+	/*
+	 * Undefined instruction
+	 */
+	mov	x0, sp
+	b	do_undefinstr
+el1_dbg:
+	/*
+	 * Debug exception handling
+	 */
+	tbz	x24, #0, el1_inv		// EL1 only
+	mrs	x0, far_el1
+	mov	x2, sp				// struct pt_regs
+	bl	do_debug_exception
+
+	kernel_exit 1
+el1_inv:
+	// TODO: add support for undefined instructions in kernel mode
+	mov	x0, sp
+	mov	x1, #BAD_SYNC
+	mrs	x2, esr_el1
+	b	bad_mode
+ENDPROC(el1_sync)
+
+	.align	6
+el1_irq:
+	kernel_entry 1
+	enable_dbg_if_not_stepping x0
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_off
+#endif
+#ifdef CONFIG_PREEMPT
+	get_thread_info tsk
+	ldr	x24, [tsk, #TI_PREEMPT]		// get preempt count
+	add	x0, x24, #1			// increment it
+	str	x0, [tsk, #TI_PREEMPT]
+#endif
+	irq_handler
+#ifdef CONFIG_PREEMPT
+	str	x24, [tsk, #TI_PREEMPT]		// restore preempt count
+	cbnz	x24, 1f				// preempt count != 0
+	ldr	x0, [tsk, #TI_FLAGS]		// get flags
+	tbz	x0, #TIF_NEED_RESCHED, 1f	// needs rescheduling?
+	bl	el1_preempt
+1:
+#endif
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_on
+#endif
+	kernel_exit 1
+ENDPROC(el1_irq)
+
+#ifdef CONFIG_PREEMPT
+el1_preempt:
+	mov	x24, lr
+1:	enable_dbg
+	bl	preempt_schedule_irq		// irq en/disable is done inside
+	ldr	x0, [tsk, #TI_FLAGS]		// get new task's TI_FLAGS
+	tbnz	x0, #TIF_NEED_RESCHED, 1b	// needs rescheduling?
+	ret	x24
+#endif
+
+/*
+ * EL0 mode handlers.
+ */
+	.align	6
+el0_sync:
+	kernel_entry 0
+	mrs	x25, esr_el1			// read the syndrome register
+	lsr	x24, x25, #26			// exception class
+	cmp	x24, #0x15			// SVC in 64-bit state
+	b.eq	el0_svc
+	adr	lr, ret_from_exception
+	cmp	x24, #0x24			// data abort in EL0
+	b.eq	el0_da
+	cmp	x24, #0x20			// instruction abort in EL0
+	b.eq	el0_ia
+	cmp	x24, #0x07			// FP/ASIMD access
+	b.eq	el0_fpsimd_acc
+	cmp	x24, #0x2c			// FP/ASIMD exception
+	b.eq	el0_fpsimd_exc
+	cmp	x24, #0x18			// configurable trap
+	b.eq	el0_undef
+	cmp	x24, #0x26			// stack alignment exception
+	b.eq	el0_sp_pc
+	cmp	x24, #0x22			// pc alignment exception
+	b.eq	el0_sp_pc
+	cmp	x24, #0x00			// unknown exception in EL0
+	b.eq	el0_undef
+	cmp	x24, #0x30			// debug exception in EL0
+	b.ge	el0_dbg
+	b	el0_inv
+
+#ifdef CONFIG_AARCH32_EMULATION
+	.align	6
+el0_sync_compat:
+	kernel_entry 0, 32
+	mrs	x25, esr_el1			// read the syndrome register
+	lsr	x24, x25, #26			// exception class
+	cmp	x24, #0x11			// SVC in 32-bit state
+	b.eq	el0_svc_compat
+	adr	lr, ret_from_exception
+	cmp	x24, #0x24			// data abort in EL0
+	b.eq	el0_da
+	cmp	x24, #0x20			// instruction abort in EL0
+	b.eq	el0_ia
+	cmp	x24, #0x07			// FP/ASIMD access
+	b.eq	el0_fpsimd_acc
+	cmp	x24, #0x28			// FP/ASIMD exception
+	b.eq	el0_fpsimd_exc
+	cmp	x24, #0x00			// unknown exception in EL0
+	b.eq	el0_undef
+	cmp	x24, #0x30			// debug exception in EL0
+	b.ge	el0_dbg
+	b	el0_inv
+el0_svc_compat:
+	/*
+	 * AArch32 syscall handling
+	 */
+	adr	stbl, compat_sys_call_table	// load compat syscall table pointer
+	uxtw	scno, w7			// syscall number in w7 (r7)
+	mov	sc_nr, #__NR_compat_syscalls
+	b	el0_svc_naked
+
+	.align	6
+el0_irq_compat:
+	kernel_entry 0, 32
+	b	el0_irq_naked
+#endif
+
+el0_da:
+	/*
+	 * Data abort handling
+	 */
+	mrs	x0, far_el1
+	disable_step x1
+	isb
+	enable_dbg
+	// enable interrupts before calling the main handler
+	enable_irq
+	mov	x1, x25
+	mov	x2, sp
+	b	do_mem_abort
+el0_ia:
+	/*
+	 * Instruction abort handling
+	 */
+	mrs	x0, far_el1
+	disable_step x1
+	isb
+	enable_dbg
+	// enable interrupts before calling the main handler
+	enable_irq
+	orr	x1, x25, #1 << 24		// use reserved ISS bit for instruction aborts
+	mov	x2, sp
+	b	do_mem_abort
+el0_fpsimd_acc:
+	/*
+	 * Floating Point or Advanced SIMD access
+	 */
+	mov	x0, x25
+	mov	x1, sp
+	b	do_fpsimd_acc
+el0_fpsimd_exc:
+	/*
+	 * Floating Point or Advanced SIMD exception
+	 */
+	mov	x0, x25
+	mov	x1, sp
+	b	do_fpsimd_exc
+el0_sp_pc:
+	/*
+	 * Stack or PC alignment exception handling
+	 */
+	mrs	x0, far_el1
+	disable_step x1
+	isb
+	enable_dbg
+	// enable interrupts before calling the main handler
+	enable_irq
+	mov	x1, x25
+	mov	x2, sp
+	b	do_sp_pc_abort
+el0_undef:
+	/*
+	 * Undefined instruction
+	 */
+	mov	x0, sp
+	b	do_undefinstr
+el0_dbg:
+	/*
+	 * Debug exception handling
+	 */
+	tbnz	x24, #0, el0_inv		// EL0 only
+	mrs	x0, far_el1
+	disable_step x1
+	mov	x1, x25
+	mov	x2, sp
+	b	do_debug_exception
+el0_inv:
+	mov	x0, sp
+	mov	x1, #BAD_SYNC
+	mrs	x2, esr_el1
+	b	bad_mode
+ENDPROC(el0_sync)
+
+	.align	6
+el0_irq:
+	kernel_entry 0
+el0_irq_naked:
+	disable_step x1
+	isb
+	enable_dbg
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_off
+#endif
+	get_thread_info tsk
+#ifdef CONFIG_PREEMPT
+	ldr	x24, [tsk, #TI_PREEMPT]		// get preempt count
+	add	x23, x24, #1			// increment it
+	str	x23, [tsk, #TI_PREEMPT]
+#endif
+	irq_handler
+#ifdef CONFIG_PREEMPT
+	ldr	x0, [tsk, #TI_PREEMPT]
+	str	x24, [tsk, #TI_PREEMPT]
+	cmp	x0, x23
+	b.eq	1f
+	mov	x1, #0
+	str	x1, [x1]			// BUG
+1:
+#endif
+#ifdef CONFIG_TRACE_IRQFLAGS
+	bl	trace_hardirqs_on
+#endif
+	b	ret_to_user
+ENDPROC(el0_irq)
+
+/*
+ * This is the return code to user mode for abort handlers
+ */
+ENTRY(ret_from_exception)
+	get_thread_info tsk
+	b	ret_to_user
+ENDPROC(ret_from_exception)
+
+/*
+ * Register switch for AArch64. The callee-saved registers need to be saved
+ * and restored. On entry:
+ *   x0 = previous task_struct (must be preserved across the switch)
+ *   x1 = next task_struct
+ * Previous and next are guaranteed not to be the same.
+ *
+ */
+ENTRY(cpu_switch_to)
+	add	x8, x0, #THREAD_CPU_CONTEXT
+	mov	x9, sp
+	stp	x19, x20, [x8], #16		// store callee-saved registers
+	stp	x21, x22, [x8], #16
+	stp	x23, x24, [x8], #16
+	stp	x25, x26, [x8], #16
+	stp	x27, x28, [x8], #16
+	stp	x29, x9, [x8], #16
+	str	lr, [x8]
+	add	x8, x1, #THREAD_CPU_CONTEXT
+	ldp	x19, x20, [x8], #16		// restore callee-saved registers
+	ldp	x21, x22, [x8], #16
+	ldp	x23, x24, [x8], #16
+	ldp	x25, x26, [x8], #16
+	ldp	x27, x28, [x8], #16
+	ldp	x29, x9, [x8], #16
+	ldr	lr, [x8]
+	mov	sp, x9
+	ret
+ENDPROC(cpu_switch_to)
+
+/*
+ * This is the fast syscall return path.  We do as little as possible here,
+ * and this includes saving x0 back into the kernel stack.
+ */
+ret_fast_syscall:
+	disable_irq				// disable interrupts
+	ldr	x1, [tsk, #TI_FLAGS]
+	and	x2, x1, #_TIF_WORK_MASK
+	cbnz	x2, fast_work_pending
+	tbz	x1, #TIF_SINGLESTEP, fast_exit
+	disable_dbg
+	enable_step x2
+fast_exit:
+	kernel_exit 0, ret = 1
+
+/*
+ * Ok, we need to do extra processing, enter the slow path.
+ */
+fast_work_pending:
+	str	x0, [sp, #S_X0]			// returned x0
+work_pending:
+	tbnz	x1, #TIF_NEED_RESCHED, work_resched
+	/* TIF_SIGPENDING or TIF_NOTIFY_RESUME case */
+	ldr	x2, [sp, #S_PSTATE]
+	mov	x0, sp				// 'regs'
+	tst	x2, #PSR_MODE_MASK		// user mode regs?
+	b.ne	no_work_pending			// returning to kernel
+	bl	do_notify_resume
+	b	ret_to_user
+work_resched:
+	enable_dbg
+	bl	schedule
+
+/*
+ * "slow" syscall return path.
+ */
+ENTRY(ret_to_user)
+	disable_irq				// disable interrupts
+	ldr	x1, [tsk, #TI_FLAGS]
+	and	x2, x1, #_TIF_WORK_MASK
+	cbnz	x2, work_pending
+	tbz	x1, #TIF_SINGLESTEP, no_work_pending
+	disable_dbg
+	enable_step x2
+no_work_pending:
+	kernel_exit 0, ret = 0
+ENDPROC(ret_to_user)
+
+/*
+ * This is how we return from a fork.
+ */
+ENTRY(ret_from_fork)
+	bl	schedule_tail
+	get_thread_info tsk
+	b	ret_to_user
+ENDPROC(ret_from_fork)
+
+/*
+ * SVC handler.
+ */
+	.align	6
+ENTRY(el0_svc)
+	adrp	stbl, sys_call_table		// load syscall table pointer
+	uxtw	scno, w8			// syscall number in w8
+	mov	sc_nr, #__NR_syscalls
+el0_svc_naked:					// compat entry point
+	stp	x0, scno, [sp, #S_ORIG_X0]	// save the original x0 and syscall number
+	disable_step x16
+	isb
+	enable_dbg
+	enable_irq
+
+	get_thread_info tsk
+	ldr	x16, [tsk, #TI_FLAGS]		// check for syscall tracing
+	tbnz	x16, #TIF_SYSCALL_TRACE, __sys_trace // are we tracing syscalls?
+	adr	lr, ret_fast_syscall		// return address
+	cmp	scno, sc_nr			// check upper syscall limit
+	b.hs	ni_sys
+	ldr	x16, [stbl, scno, lsl #3]	// address in the syscall table
+	br	x16				// call sys_* routine
+ni_sys:
+	mov	x0, sp
+	b	do_ni_syscall
+ENDPROC(el0_svc)
+
+	/*
+	 * This is the really slow path.  We're going to be doing context
+	 * switches, and waiting for our parent to respond.
+	 */
+__sys_trace:
+	mov	x1, sp
+	mov	w0, #0				// trace entry
+	bl	syscall_trace
+	adr	lr, __sys_trace_return		// return address
+	uxtw	scno, w0			// syscall number (possibly new)
+	mov	x1, sp				// pointer to regs
+	cmp	scno, sc_nr			// check upper syscall limit
+	b.hs	ni_sys
+	ldp	x0, x1, [sp]			// restore the syscall args
+	ldp	x2, x3, [sp, #S_X2]
+	ldp	x4, x5, [sp, #S_X4]
+	ldp	x6, x7, [sp, #S_X6]
+	ldr	x16, [stbl, scno, lsl #3]	// address in the syscall table
+	br	x16				// call sys_* routine
+
+__sys_trace_return:
+	str	x0, [sp]			// save returned x0
+	mov	x1, sp
+	mov	w0, #1				// trace exit
+	bl	syscall_trace
+	b	ret_to_user
+
+/*
+ * Special system call wrappers.
+ */
+ENTRY(sys_execve_wrapper)
+	mov	x3, sp
+	b	sys_execve
+ENDPROC(sys_execve_wrapper)
+
+ENTRY(sys_clone_wrapper)
+	mov	x5, sp
+	b	sys_clone
+ENDPROC(sys_clone_wrapper)
+
+ENTRY(sys_rt_sigreturn_wrapper)
+	mov	x0, sp
+	b	sys_rt_sigreturn
+ENDPROC(sys_rt_sigreturn_wrapper)
+
+ENTRY(sys_sigaltstack_wrapper)
+	ldr	x2, [sp, #S_SP]
+	b	sys_sigaltstack
+ENDPROC(sys_sigaltstack_wrapper)
+
+ENTRY(handle_arch_irq)
+	.quad	0
diff --git a/arch/arm64/kernel/stacktrace.c b/arch/arm64/kernel/stacktrace.c
new file mode 100644
index 0000000..d25459f
--- /dev/null
+++ b/arch/arm64/kernel/stacktrace.c
@@ -0,0 +1,127 @@
+/*
+ * Stack tracing support
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/stacktrace.h>
+
+#include <asm/stacktrace.h>
+
+/*
+ * AArch64 PCS assigns the frame pointer to x29.
+ *
+ * A simple function prologue looks like this:
+ *	sub	sp, sp, #0x10
+ *	stp	x29, x30, [sp]
+ *	mov	x29, sp
+ *
+ * A simple function epilogue looks like this:
+ *	mov	sp, x29
+ *	ldp	x29, x30, [sp]
+ *	add	sp, sp, #0x10
+ */
+int unwind_frame(struct stackframe *frame)
+{
+	unsigned long high, low;
+	unsigned long fp = frame->fp;
+
+	low  = frame->sp;
+	high = ALIGN(low, THREAD_SIZE);
+
+	if (fp < low || fp > high || fp & 0xf)
+		return -EINVAL;
+
+	frame->sp = fp + 0x10;
+	frame->fp = *(unsigned long *)(fp);
+	frame->pc = *(unsigned long *)(fp + 8);
+
+	return 0;
+}
+
+void notrace walk_stackframe(struct stackframe *frame,
+		     int (*fn)(struct stackframe *, void *), void *data)
+{
+	while (1) {
+		int ret;
+
+		if (fn(frame, data))
+			break;
+		ret = unwind_frame(frame);
+		if (ret < 0)
+			break;
+	}
+}
+EXPORT_SYMBOL(walk_stackframe);
+
+#ifdef CONFIG_STACKTRACE
+struct stack_trace_data {
+	struct stack_trace *trace;
+	unsigned int no_sched_functions;
+	unsigned int skip;
+};
+
+static int save_trace(struct stackframe *frame, void *d)
+{
+	struct stack_trace_data *data = d;
+	struct stack_trace *trace = data->trace;
+	unsigned long addr = frame->pc;
+
+	if (data->no_sched_functions && in_sched_functions(addr))
+		return 0;
+	if (data->skip) {
+		data->skip--;
+		return 0;
+	}
+
+	trace->entries[trace->nr_entries++] = addr;
+
+	return trace->nr_entries >= trace->max_entries;
+}
+
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+	struct stack_trace_data data;
+	struct stackframe frame;
+
+	data.trace = trace;
+	data.skip = trace->skip;
+
+	if (tsk != current) {
+		data.no_sched_functions = 1;
+		frame.fp = thread_saved_fp(tsk);
+		frame.sp = thread_saved_sp(tsk);
+		frame.pc = thread_saved_pc(tsk);
+	} else {
+		register unsigned long current_sp asm("sp");
+		data.no_sched_functions = 0;
+		frame.fp = (unsigned long)__builtin_frame_address(0);
+		frame.sp = current_sp;
+		frame.pc = (unsigned long)save_stack_trace_tsk;
+	}
+
+	walk_stackframe(&frame, save_trace, &data);
+	if (trace->nr_entries < trace->max_entries)
+		trace->entries[trace->nr_entries++] = ULONG_MAX;
+}
+
+void save_stack_trace(struct stack_trace *trace)
+{
+	save_stack_trace_tsk(current, trace);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+#endif
diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
new file mode 100644
index 0000000..8712a8e
--- /dev/null
+++ b/arch/arm64/kernel/traps.c
@@ -0,0 +1,357 @@
+/*
+ * Based on arch/arm/kernel/traps.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/signal.h>
+#include <linux/personality.h>
+#include <linux/kallsyms.h>
+#include <linux/spinlock.h>
+#include <linux/uaccess.h>
+#include <linux/hardirq.h>
+#include <linux/kdebug.h>
+#include <linux/module.h>
+#include <linux/kexec.h>
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/syscalls.h>
+
+#include <asm/atomic.h>
+#include <asm/traps.h>
+#include <asm/stacktrace.h>
+#include <asm/exception.h>
+#include <asm/system_misc.h>
+
+static const char *handler[] = {
+	"Synchronous Abort",
+	"IRQ",
+	"FIQ",
+	"Error"
+};
+
+int show_unhandled_signals = 1;
+
+/*
+ * Dump out the contents of some memory nicely...
+ */
+static void dump_mem(const char *lvl, const char *str, unsigned long bottom,
+		     unsigned long top)
+{
+	unsigned long first;
+	mm_segment_t fs;
+	int i;
+
+	/*
+	 * We need to switch to kernel mode so that we can use __get_user
+	 * to safely read from kernel space.  Note that we now dump the
+	 * code first, just in case the backtrace kills us.
+	 */
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+
+	printk("%s%s(0x%016lx to 0x%016lx)\n", lvl, str, bottom, top);
+
+	for (first = bottom & ~31; first < top; first += 32) {
+		unsigned long p;
+		char str[sizeof(" 12345678") * 8 + 1];
+
+		memset(str, ' ', sizeof(str));
+		str[sizeof(str) - 1] = '\0';
+
+		for (p = first, i = 0; i < 8 && p < top; i++, p += 4) {
+			if (p >= bottom && p < top) {
+				unsigned int val;
+				if (__get_user(val, (unsigned int *)p) == 0)
+					sprintf(str + i * 9, " %08x", val);
+				else
+					sprintf(str + i * 9, " ????????");
+			}
+		}
+		printk("%s%04lx:%s\n", lvl, first & 0xffff, str);
+	}
+
+	set_fs(fs);
+}
+
+static void dump_backtrace_entry(unsigned long where, unsigned long stack)
+{
+	print_ip_sym(where);
+	if (in_exception_text(where))
+		dump_mem("", "Exception stack", stack,
+			 stack + sizeof(struct pt_regs));
+}
+
+static void dump_instr(const char *lvl, struct pt_regs *regs)
+{
+	unsigned long addr = instruction_pointer(regs);
+	mm_segment_t fs;
+	char str[sizeof("00000000 ") * 5 + 2 + 1], *p = str;
+	int i;
+
+	/*
+	 * We need to switch to kernel mode so that we can use __get_user
+	 * to safely read from kernel space.  Note that we now dump the
+	 * code first, just in case the backtrace kills us.
+	 */
+	fs = get_fs();
+	set_fs(KERNEL_DS);
+
+	for (i = -4; i < 1; i++) {
+		unsigned int val, bad;
+
+		bad = __get_user(val, &((u32 *)addr)[i]);
+
+		if (!bad)
+			p += sprintf(p, i == 0 ? "(%08x) " : "%08x ", val);
+		else {
+			p += sprintf(p, "bad PC value");
+			break;
+		}
+	}
+	printk("%sCode: %s\n", lvl, str);
+
+	set_fs(fs);
+}
+
+static void dump_backtrace(struct pt_regs *regs, struct task_struct *tsk)
+{
+	struct stackframe frame;
+	const register unsigned long current_sp asm ("sp");
+
+	pr_debug("%s(regs = %p tsk = %p)\n", __func__, regs, tsk);
+
+	if (!tsk)
+		tsk = current;
+
+	if (regs) {
+		frame.fp = regs->regs[29];
+		frame.sp = regs->sp;
+		frame.pc = regs->pc;
+	} else if (tsk == current) {
+		frame.fp = (unsigned long)__builtin_frame_address(0);
+		frame.sp = current_sp;
+		frame.pc = (unsigned long)dump_backtrace;
+	} else {
+		/*
+		 * task blocked in __switch_to
+		 */
+		frame.fp = thread_saved_fp(tsk);
+		frame.sp = thread_saved_sp(tsk);
+		frame.pc = thread_saved_pc(tsk);
+	}
+
+	printk("Call trace:\n");
+	while (1) {
+		unsigned long where = frame.pc;
+		int ret;
+
+		ret = unwind_frame(&frame);
+		if (ret < 0)
+			break;
+		dump_backtrace_entry(where, frame.sp);
+	}
+}
+
+void dump_stack(void)
+{
+	dump_backtrace(NULL, NULL);
+}
+
+EXPORT_SYMBOL(dump_stack);
+
+void show_stack(struct task_struct *tsk, unsigned long *sp)
+{
+	dump_backtrace(NULL, tsk);
+	barrier();
+}
+
+#ifdef CONFIG_PREEMPT
+#define S_PREEMPT " PREEMPT"
+#else
+#define S_PREEMPT ""
+#endif
+#ifdef CONFIG_SMP
+#define S_SMP " SMP"
+#else
+#define S_SMP ""
+#endif
+
+static int __die(const char *str, int err, struct thread_info *thread,
+		 struct pt_regs *regs)
+{
+	struct task_struct *tsk = thread->task;
+	static int die_counter;
+	int ret;
+
+	pr_emerg("Internal error: %s: %x [#%d]" S_PREEMPT S_SMP "\n",
+		 str, err, ++die_counter);
+
+	/* trap and error numbers are mostly meaningless on ARM */
+	ret = notify_die(DIE_OOPS, str, regs, err, 0, SIGSEGV);
+	if (ret == NOTIFY_STOP)
+		return ret;
+
+	print_modules();
+	__show_regs(regs);
+	pr_emerg("Process %.*s (pid: %d, stack limit = 0x%p)\n",
+		 TASK_COMM_LEN, tsk->comm, task_pid_nr(tsk), thread + 1);
+
+	if (!user_mode(regs) || in_interrupt()) {
+		dump_mem(KERN_EMERG, "Stack: ", regs->sp,
+			 THREAD_SIZE + (unsigned long)task_stack_page(tsk));
+		dump_backtrace(regs, tsk);
+		dump_instr(KERN_EMERG, regs);
+	}
+
+	return ret;
+}
+
+DEFINE_SPINLOCK(die_lock);
+
+/*
+ * This function is protected against re-entrancy.
+ */
+void die(const char *str, struct pt_regs *regs, int err)
+{
+	struct thread_info *thread = current_thread_info();
+	int ret;
+
+	oops_enter();
+
+	spin_lock_irq(&die_lock);
+	console_verbose();
+	bust_spinlocks(1);
+	ret = __die(str, err, thread, regs);
+
+	if (regs && kexec_should_crash(thread->task))
+		crash_kexec(regs);
+
+	bust_spinlocks(0);
+	add_taint(TAINT_DIE);
+	spin_unlock_irq(&die_lock);
+	oops_exit();
+
+	if (in_interrupt())
+		panic("Fatal exception in interrupt");
+	if (panic_on_oops)
+		panic("Fatal exception");
+	if (ret != NOTIFY_STOP)
+		do_exit(SIGSEGV);
+}
+
+void arm64_notify_die(const char *str, struct pt_regs *regs,
+		      struct siginfo *info, int err)
+{
+	if (user_mode(regs))
+		force_sig_info(info->si_signo, info, current);
+	else
+		die(str, regs, err);
+}
+
+asmlinkage void __exception do_undefinstr(struct pt_regs *regs)
+{
+	siginfo_t info;
+	void __user *pc = (void __user *)instruction_pointer(regs);
+
+#ifdef CONFIG_AARCH32_EMULATION
+	/* check for AArch32 breakpoint instructions */
+	if (compat_user_mode(regs) && aarch32_break_trap(regs) == 0)
+		return;
+#endif
+
+	if (show_unhandled_signals) {
+		pr_info("%s[%d]: undefined instruction: pc=%p\n",
+			current->comm, task_pid_nr(current), pc);
+		dump_instr(KERN_INFO, regs);
+	}
+
+	info.si_signo = SIGILL;
+	info.si_errno = 0;
+	info.si_code  = ILL_ILLOPC;
+	info.si_addr  = pc;
+
+	arm64_notify_die("Oops - undefined instruction", regs, &info, 0);
+}
+
+long compat_arm_syscall(struct pt_regs *regs);
+
+asmlinkage long do_ni_syscall(struct pt_regs *regs)
+{
+#ifdef CONFIG_AARCH32_EMULATION
+	long ret;
+	if (is_compat_task()) {
+		ret = compat_arm_syscall(regs);
+		if (ret != -ENOSYS)
+			return ret;
+	}
+#endif
+
+	if (show_unhandled_signals) {
+		pr_info("%s[%d]: syscall %d\n", current->comm,
+			task_pid_nr(current), (int)regs->syscallno);
+		dump_instr("", regs);
+		if (user_mode(regs))
+			__show_regs(regs);
+	}
+
+	return sys_ni_syscall();
+}
+
+/*
+ * bad_mode handles the impossible case in the exception vector.
+ */
+asmlinkage void bad_mode(struct pt_regs *regs, int reason, unsigned int esr)
+{
+	console_verbose();
+
+	pr_crit("Bad mode in %s handler detected, code 0x%08x\n",
+		handler[reason], esr);
+
+	die("Oops - bad mode", regs, 0);
+	local_irq_disable();
+	panic("bad mode");
+}
+
+
+void __bad_xchg(volatile void *ptr, int size)
+{
+	printk("xchg: bad data size: pc 0x%p, ptr 0x%p, size %d\n",
+		__builtin_return_address(0), ptr, size);
+	BUG();
+}
+EXPORT_SYMBOL(__bad_xchg);
+
+void __pte_error(const char *file, int line, unsigned long val)
+{
+	printk("%s:%d: bad pte %016lx.\n", file, line, val);
+}
+
+void __pmd_error(const char *file, int line, unsigned long val)
+{
+	printk("%s:%d: bad pmd %016lx.\n", file, line, val);
+}
+
+void __pgd_error(const char *file, int line, unsigned long val)
+{
+	printk("%s:%d: bad pgd %016lx.\n", file, line, val);
+}
+
+void __init trap_init(void)
+{
+	return;
+}
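
The frame-record walk done by unwind_frame() in stacktrace.c above can be
modelled in user space. The following is only a sketch under stated
assumptions: a 64-bit host, a fake stack held in an array, and the
hypothetical names frame_model, unwind_model() and demo_walk(), none of
which appear in the patch. Unlike the patch, the model rejects a frame
pointer closer than 0x10 bytes to the top of the stack so that it never
reads past the array; that stricter bound is a choice made for this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for struct stackframe. */
struct frame_model {
	uintptr_t fp;
	uintptr_t sp;
	uintptr_t pc;
};

/*
 * Model of one unwind step: the frame record at fp holds the caller's
 * frame pointer (x29) followed by the return address (x30).
 * Returns 0 on success, -1 when fp is out of range or misaligned.
 */
static int unwind_model(struct frame_model *frame, uintptr_t stack_top)
{
	uintptr_t fp = frame->fp;

	if (fp < frame->sp || fp > stack_top - 0x10 || (fp & 0xf))
		return -1;

	frame->sp = fp + 0x10;
	frame->fp = ((const uintptr_t *)fp)[0];	/* caller's x29 */
	frame->pc = ((const uintptr_t *)fp)[1];	/* saved x30 */
	return 0;
}

/* Build a fake stack with two frame records and count the unwind steps. */
static int demo_walk(void)
{
	_Alignas(16) uintptr_t stack[8] = { 0 };
	struct frame_model frame;
	int steps = 0;

	/* a 0x10-byte record holds exactly two words only on a 64-bit host */
	assert(sizeof(uintptr_t) == 8);

	stack[2] = (uintptr_t)&stack[4];	/* record A: link to record B */
	stack[3] = 0x1111;			/* record A: return address */
	stack[4] = 0;				/* record B: end of the chain */
	stack[5] = 0x2222;			/* record B: return address */

	frame.fp = (uintptr_t)&stack[2];
	frame.sp = (uintptr_t)&stack[0];
	frame.pc = 0;

	while (unwind_model(&frame, (uintptr_t)&stack[8]) == 0)
		steps++;

	return steps;
}
```

In this model the walk stops after the second record because the zero
frame pointer stored there falls below the current sp, which corresponds
to the -EINVAL return in the patch.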



* [PATCH v2 04/31] arm64: MMU definitions
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (2 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 03/31] arm64: Exception handling Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 13:30   ` Arnd Bergmann
  2012-08-17  9:04   ` Tony Lindgren
  2012-08-14 17:52 ` [PATCH v2 05/31] arm64: MMU initialisation Catalin Marinas
                   ` (27 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

The virtual memory layout is described in
Documentation/arm64/memory.txt. This patch adds the MMU definitions for
the 4KB and 64KB translation table configurations. SECTION_SIZE is 2MB
with the 4KB page configuration and 512MB with the 64KB page
configuration.
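
As a rough check of the two section sizes quoted above, here is a minimal
sketch. It assumes 8-byte translation table descriptors and tables that
fill exactly one page; the helper name section_size() is hypothetical and
is not defined by this patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: size of the region mapped by a single entry in the
 * table one level above the last one (a "section"), assuming 8-byte
 * descriptors and one page per table.
 */
static uint64_t section_size(unsigned int page_shift)
{
	/* entries per table = page size / 8 bytes per descriptor */
	unsigned int entries_shift = page_shift - 3;

	assert(page_shift >= 12 && page_shift <= 16);

	/* one section covers a full last-level table worth of pages */
	return UINT64_C(1) << (page_shift + entries_shift);
}
```

With 4KB pages (page_shift = 12) this gives 512 entries of 4KB each, and
with 64KB pages (page_shift = 16) it gives 8192 entries of 64KB each.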

PHYS_OFFSET is calculated at run-time and stored in a variable (no
run-time code patching at this stage).

In the current implementation, the user and kernel address spaces are
512GB (39-bit) each, with a maximum of 256GB for the RAM linear mapping.
Linux uses 3 levels of translation tables with the 4KB page configuration
and 2 levels with the 64KB page configuration. Extending the address
space beyond 39 bits with 4KB pages or 42 bits with 64KB pages requires
an additional level of translation tables.
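
The 39-bit and 42-bit limits follow from the page size and the number of
table levels. A minimal sketch of that arithmetic, again assuming 8-byte
descriptors and one page per table (va_bits() is a hypothetical name used
only for illustration):

```c
#include <assert.h>

/*
 * Hypothetical helper: maximum number of virtual address bits reachable
 * with the given page size and number of translation table levels.
 */
static unsigned int va_bits(unsigned int page_shift, unsigned int levels)
{
	/* each level resolves log2(page size / 8) bits of the address */
	unsigned int bits_per_level = page_shift - 3;

	assert(levels >= 1);

	/* in-page offset bits plus the index bits of every level */
	return page_shift + levels * bits_per_level;
}
```

Three levels with 4KB pages resolve 12 + 3 * 9 bits and two levels with
64KB pages resolve 16 + 2 * 13 bits, which is why going past either limit
needs one more level. Note that this is the maximum reach of each
configuration; the layout described here uses 39 bits in both cases.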

The SPARSEMEM configuration is global to all AArch64 platforms and
allows for 1GB sections with SPARSEMEM_VMEMMAP enabled by default.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 Documentation/arm64/memory.txt                |   69 +++++
 arch/arm64/include/asm/memory.h               |  144 +++++++++++
 arch/arm64/include/asm/mmu.h                  |   27 ++
 arch/arm64/include/asm/pgtable-2level-hwdef.h |   43 ++++
 arch/arm64/include/asm/pgtable-2level-types.h |   60 +++++
 arch/arm64/include/asm/pgtable-3level-hwdef.h |   50 ++++
 arch/arm64/include/asm/pgtable-3level-types.h |   66 +++++
 arch/arm64/include/asm/pgtable-hwdef.h        |   94 +++++++
 arch/arm64/include/asm/pgtable.h              |  328 +++++++++++++++++++++++++
 arch/arm64/include/asm/sparsemem.h            |   24 ++
 10 files changed, 905 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/arm64/memory.txt
 create mode 100644 arch/arm64/include/asm/memory.h
 create mode 100644 arch/arm64/include/asm/mmu.h
 create mode 100644 arch/arm64/include/asm/pgtable-2level-hwdef.h
 create mode 100644 arch/arm64/include/asm/pgtable-2level-types.h
 create mode 100644 arch/arm64/include/asm/pgtable-3level-hwdef.h
 create mode 100644 arch/arm64/include/asm/pgtable-3level-types.h
 create mode 100644 arch/arm64/include/asm/pgtable-hwdef.h
 create mode 100644 arch/arm64/include/asm/pgtable.h
 create mode 100644 arch/arm64/include/asm/sparsemem.h

diff --git a/Documentation/arm64/memory.txt b/Documentation/arm64/memory.txt
new file mode 100644
index 0000000..7210af7
--- /dev/null
+++ b/Documentation/arm64/memory.txt
@@ -0,0 +1,69 @@
+		     Memory Layout on AArch64 Linux
+		     ==============================
+
+Author: Catalin Marinas <catalin.marinas@arm.com>
+Date  : 20 February 2012
+
+This document describes the virtual memory layout used by the AArch64
+Linux kernel. The architecture allows up to 4 levels of translation
+tables with a 4KB page size and up to 3 levels with a 64KB page size.
+
+AArch64 Linux uses 3 levels of translation tables with the 4KB page
+configuration, allowing 39-bit (512GB) virtual addresses for both user
+and kernel. With 64KB pages, only 2 levels of translation tables are
+used but the memory layout is the same.
+
+User addresses have bits 63:39 set to 0 while the kernel addresses have
+the same bits set to 1. TTBRx selection is given by bit 63 of the
+virtual address. The swapper_pg_dir contains only kernel (global)
+mappings while the user pgd contains only user (non-global) mappings.
+The swapper_pg_dir address is written to TTBR1 and never written to
+TTBR0.
+
+
+AArch64 Linux memory layout:
+
+Start			End			Size		Use
+-----------------------------------------------------------------------
+0000000000000000	0000007fffffffff	 512GB		user
+
+ffffff8000000000	ffffffbbfffeffff	~240GB		vmalloc
+
+ffffffbbffff0000	ffffffbbffffffff	  64KB		[guard page]
+
+ffffffbc00000000	ffffffbdffffffff	   8GB		vmemmap
+
+ffffffbe00000000	ffffffbffbffffff	  ~8GB		[guard, future vmemmap]
+
+ffffffbffc000000	ffffffbfffffffff	  64MB		modules
+
+ffffffc000000000	ffffffffffffffff	 256GB		memory
+
+
+Translation table lookup with 4KB pages:
+
++--------+--------+--------+--------+--------+--------+--------+--------+
+|63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
++--------+--------+--------+--------+--------+--------+--------+--------+
+ |                 |         |         |         |         |
+ |                 |         |         |         |         v
+ |                 |         |         |         |   [11:0]  in-page offset
+ |                 |         |         |         +-> [20:12] L3 index
+ |                 |         |         +-----------> [29:21] L2 index
+ |                 |         +---------------------> [38:30] L1 index
+ |                 +-------------------------------> [47:39] L0 index (not used)
+ +-------------------------------------------------> [63] TTBR0/1
+
+
+Translation table lookup with 64KB pages:
+
++--------+--------+--------+--------+--------+--------+--------+--------+
+|63    56|55    48|47    40|39    32|31    24|23    16|15     8|7      0|
++--------+--------+--------+--------+--------+--------+--------+--------+
+ |                 |    |               |              |
+ |                 |    |               |              v
+ |                 |    |               |            [15:0]  in-page offset
+ |                 |    |               +----------> [28:16] L3 index
+ |                 |    +--------------------------> [41:29] L2 index (only 38:29 used)
+ |                 +-------------------------------> [47:42] L1 index (not used)
+ +-------------------------------------------------> [63] TTBR0/1
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h
new file mode 100644
index 0000000..3cfdc4b
--- /dev/null
+++ b/arch/arm64/include/asm/memory.h
@@ -0,0 +1,144 @@
+/*
+ * Based on arch/arm/include/asm/memory.h
+ *
+ * Copyright (C) 2000-2002 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Note: this file should not be included by non-asm/.h files
+ */
+#ifndef __ASM_MEMORY_H
+#define __ASM_MEMORY_H
+
+#include <linux/compiler.h>
+#include <linux/const.h>
+#include <linux/types.h>
+#include <asm/sizes.h>
+
+/*
+ * Allow for constants defined here to be used from assembly code
+ * by appending the UL suffix only when compiling actual C code.
+ */
+#define UL(x) _AC(x, UL)
+
+/*
+ * PAGE_OFFSET - the virtual address of the start of the kernel image.
+ * VA_BITS - the maximum number of bits for virtual addresses.
+ * TASK_SIZE - the maximum size of a user space task.
+ * TASK_UNMAPPED_BASE - the lower boundary of the mmap VM area.
+ * The module space lives between the addresses given by TASK_SIZE
+ * and PAGE_OFFSET - it must be within 128MB of the kernel text.
+ */
+#define PAGE_OFFSET		UL(0xffffffc000000000)
+#define MODULES_END		(PAGE_OFFSET)
+#define MODULES_VADDR		(MODULES_END - SZ_64M)
+#define VA_BITS			(39)
+#define TASK_SIZE_64		(UL(1) << VA_BITS)
+
+#ifdef CONFIG_AARCH32_EMULATION
+#define TASK_SIZE_32		UL(0x100000000)
+#define TASK_SIZE		(test_thread_flag(TIF_32BIT) ? \
+				TASK_SIZE_32 : TASK_SIZE_64)
+#else
+#define TASK_SIZE		TASK_SIZE_64
+#endif /* CONFIG_AARCH32_EMULATION */
+
+#define TASK_UNMAPPED_BASE	(PAGE_ALIGN(TASK_SIZE / 4))
+
+#if TASK_SIZE_64 > MODULES_VADDR
+#error Top of 64-bit user space clashes with start of module space
+#endif
+
+/*
+ * Physical vs virtual RAM address space conversion.  These are
+ * private definitions which should NOT be used outside memory.h
+ * files.  Use virt_to_phys/phys_to_virt/__pa/__va instead.
+ */
+#define __virt_to_phys(x)	(((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET))
+#define __phys_to_virt(x)	((unsigned long)((x) - PHYS_OFFSET + PAGE_OFFSET))
+
+/*
+ * Convert a physical address to a Page Frame Number and back
+ */
+#define	__phys_to_pfn(paddr)	((unsigned long)((paddr) >> PAGE_SHIFT))
+#define	__pfn_to_phys(pfn)	((phys_addr_t)(pfn) << PAGE_SHIFT)
+
+/*
+ * Convert a page to/from a physical address
+ */
+#define page_to_phys(page)	(__pfn_to_phys(page_to_pfn(page)))
+#define phys_to_page(phys)	(pfn_to_page(__phys_to_pfn(phys)))
+
+/*
+ * Memory types available.
+ */
+#define MT_DEVICE_nGnRnE	0
+#define MT_DEVICE_nGnRE		1
+#define MT_DEVICE_GRE		2
+#define MT_NORMAL_NC		3
+#define MT_NORMAL		4
+
+#ifndef __ASSEMBLY__
+
+extern phys_addr_t		memstart_addr;
+/* PHYS_OFFSET - the physical address of the start of memory. */
+#define PHYS_OFFSET		({ memstart_addr; })
+
+/*
+ * PFNs are used to describe any physical page; this means
+ * PFN 0 == physical address 0.
+ *
+ * This is the PFN of the first RAM page in the kernel
+ * direct-mapped view.  We assume this is the first page
+ * of RAM in the mem_map as well.
+ */
+#define PHYS_PFN_OFFSET	(PHYS_OFFSET >> PAGE_SHIFT)
+
+/*
+ * Note: Drivers should NOT use these.  They are the wrong
+ * translation for translating DMA addresses.  Use the driver
+ * DMA support - see dma-mapping.h.
+ */
+static inline phys_addr_t virt_to_phys(const volatile void *x)
+{
+	return __virt_to_phys((unsigned long)(x));
+}
+
+static inline void *phys_to_virt(phys_addr_t x)
+{
+	return (void *)(__phys_to_virt(x));
+}
+
+/*
+ * Drivers should NOT use these either.
+ */
+#define __pa(x)			__virt_to_phys((unsigned long)(x))
+#define __va(x)			((void *)__phys_to_virt((phys_addr_t)(x)))
+#define pfn_to_kaddr(pfn)	__va((pfn) << PAGE_SHIFT)
+
+/*
+ *  virt_to_page(k)	convert a _valid_ virtual address to struct page *
+ *  virt_addr_valid(k)	indicates whether a virtual address is valid
+ */
+#define ARCH_PFN_OFFSET		PHYS_PFN_OFFSET
+
+#define virt_to_page(kaddr)	pfn_to_page(__pa(kaddr) >> PAGE_SHIFT)
+#define	virt_addr_valid(kaddr)	(((void *)(kaddr) >= (void *)PAGE_OFFSET) && \
+				 ((void *)(kaddr) < (void *)high_memory))
+
+#endif
+
+#include <asm-generic/memory_model.h>
+
+#endif
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
new file mode 100644
index 0000000..981498a
--- /dev/null
+++ b/arch/arm64/include/asm/mmu.h
@@ -0,0 +1,27 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_MMU_H
+#define __ASM_MMU_H
+
+typedef struct {
+	unsigned int id;
+	spinlock_t id_lock;
+	void *vdso;
+} mm_context_t;
+
+#define ASID(mm)	((mm)->context.id & 0xffff)
+
+#endif
diff --git a/arch/arm64/include/asm/pgtable-2level-hwdef.h b/arch/arm64/include/asm/pgtable-2level-hwdef.h
new file mode 100644
index 0000000..0a8ed3f
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable-2level-hwdef.h
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_2LEVEL_HWDEF_H
+#define __ASM_PGTABLE_2LEVEL_HWDEF_H
+
+/*
+ * With LPAE and 64KB pages, there are 2 levels of page tables. Each level has
+ * 8192 entries of 8 bytes each, occupying a 64KB page. Levels 0 and 1 are not
+ * used. The 2nd level table (PGD for Linux) can cover a range of 4TB, each
+ * entry representing 512MB. The user and kernel address spaces are limited to
+ * 512GB and therefore we only use 1024 entries in the PGD.
+ */
+#define PTRS_PER_PTE		8192
+#define PTRS_PER_PGD		1024
+
+/*
+ * PGDIR_SHIFT determines the size a top-level page table entry can map.
+ */
+#define PGDIR_SHIFT		29
+#define PGDIR_SIZE		(_AC(1, UL) << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		29
+#define SECTION_SIZE		(_AC(1, UL) << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+#endif
diff --git a/arch/arm64/include/asm/pgtable-2level-types.h b/arch/arm64/include/asm/pgtable-2level-types.h
new file mode 100644
index 0000000..3c3ca7d
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable-2level-types.h
@@ -0,0 +1,60 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_2LEVEL_TYPES_H
+#define __ASM_PGTABLE_2LEVEL_TYPES_H
+
+typedef u64 pteval_t;
+typedef u64 pgdval_t;
+typedef pgdval_t pmdval_t;
+
+#undef STRICT_MM_TYPECHECKS
+
+#ifdef STRICT_MM_TYPECHECKS
+
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { pteval_t pte; } pte_t;
+typedef struct { pgdval_t pgd; } pgd_t;
+typedef struct { pteval_t pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pgd_val(x)	((x).pgd)
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pgd(x)	((pgd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else	/* !STRICT_MM_TYPECHECKS */
+
+typedef pteval_t pte_t;
+typedef pgdval_t pgd_t;
+typedef pteval_t pgprot_t;
+
+#define pte_val(x)	(x)
+#define pgd_val(x)	(x)
+#define pgprot_val(x)	(x)
+
+#define __pte(x)	(x)
+#define __pgd(x)	(x)
+#define __pgprot(x)	(x)
+
+#endif	/* STRICT_MM_TYPECHECKS */
+
+#include <asm-generic/pgtable-nopmd.h>
+
+#endif	/* __ASM_PGTABLE_2LEVEL_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable-3level-hwdef.h b/arch/arm64/include/asm/pgtable-3level-hwdef.h
new file mode 100644
index 0000000..3dbf941
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable-3level-hwdef.h
@@ -0,0 +1,50 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_3LEVEL_HWDEF_H
+#define __ASM_PGTABLE_3LEVEL_HWDEF_H
+
+/*
+ * With LPAE and 4KB pages, there are 3 levels of page tables. Each level has
+ * 512 entries of 8 bytes each, occupying a 4K page. The first level table
+ * covers a range of 512GB, each entry representing 1GB. The user and kernel
+ * address spaces are limited to 512GB each.
+ */
+#define PTRS_PER_PTE		512
+#define PTRS_PER_PMD		512
+#define PTRS_PER_PGD		512
+
+/*
+ * PGDIR_SHIFT determines the size a top-level page table entry can map.
+ */
+#define PGDIR_SHIFT		30
+#define PGDIR_SIZE		(_AC(1, UL) << PGDIR_SHIFT)
+#define PGDIR_MASK		(~(PGDIR_SIZE-1))
+
+/*
+ * PMD_SHIFT determines the size a middle-level page table entry can map.
+ */
+#define PMD_SHIFT		21
+#define PMD_SIZE		(_AC(1, UL) << PMD_SHIFT)
+#define PMD_MASK		(~(PMD_SIZE-1))
+
+/*
+ * section address mask and size definitions.
+ */
+#define SECTION_SHIFT		21
+#define SECTION_SIZE		(_AC(1, UL) << SECTION_SHIFT)
+#define SECTION_MASK		(~(SECTION_SIZE-1))
+
+#endif
diff --git a/arch/arm64/include/asm/pgtable-3level-types.h b/arch/arm64/include/asm/pgtable-3level-types.h
new file mode 100644
index 0000000..4489615
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable-3level-types.h
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_3LEVEL_TYPES_H
+#define __ASM_PGTABLE_3LEVEL_TYPES_H
+
+typedef u64 pteval_t;
+typedef u64 pmdval_t;
+typedef u64 pgdval_t;
+
+#undef STRICT_MM_TYPECHECKS
+
+#ifdef STRICT_MM_TYPECHECKS
+
+/*
+ * These are used to make use of C type-checking..
+ */
+typedef struct { pteval_t pte; } pte_t;
+typedef struct { pmdval_t pmd; } pmd_t;
+typedef struct { pgdval_t pgd; } pgd_t;
+typedef struct { pteval_t pgprot; } pgprot_t;
+
+#define pte_val(x)      ((x).pte)
+#define pmd_val(x)      ((x).pmd)
+#define pgd_val(x)	((x).pgd)
+#define pgprot_val(x)   ((x).pgprot)
+
+#define __pte(x)        ((pte_t) { (x) } )
+#define __pmd(x)        ((pmd_t) { (x) } )
+#define __pgd(x)	((pgd_t) { (x) } )
+#define __pgprot(x)     ((pgprot_t) { (x) } )
+
+#else	/* !STRICT_MM_TYPECHECKS */
+
+typedef pteval_t pte_t;
+typedef pmdval_t pmd_t;
+typedef pgdval_t pgd_t;
+typedef pteval_t pgprot_t;
+
+#define pte_val(x)	(x)
+#define pmd_val(x)	(x)
+#define pgd_val(x)	(x)
+#define pgprot_val(x)	(x)
+
+#define __pte(x)	(x)
+#define __pmd(x)	(x)
+#define __pgd(x)	(x)
+#define __pgprot(x)	(x)
+
+#endif	/* STRICT_MM_TYPECHECKS */
+
+#include <asm-generic/pgtable-nopud.h>
+
+#endif	/* __ASM_PGTABLE_3LEVEL_TYPES_H */
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h
new file mode 100644
index 0000000..561fb08
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable-hwdef.h
@@ -0,0 +1,94 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_HWDEF_H
+#define __ASM_PGTABLE_HWDEF_H
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#include <asm/pgtable-2level-hwdef.h>
+#else
+#include <asm/pgtable-3level-hwdef.h>
+#endif
+
+/*
+ * Hardware page table definitions.
+ *
+ * Level 2 descriptor (PMD).
+ */
+#define PMD_TYPE_MASK		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_FAULT		(_AT(pmdval_t, 0) << 0)
+#define PMD_TYPE_TABLE		(_AT(pmdval_t, 3) << 0)
+#define PMD_TYPE_SECT		(_AT(pmdval_t, 1) << 0)
+
+/*
+ * Section
+ */
+#define PMD_SECT_S		(_AT(pmdval_t, 3) << 8)
+#define PMD_SECT_AF		(_AT(pmdval_t, 1) << 10)
+#define PMD_SECT_NG		(_AT(pmdval_t, 1) << 11)
+#define PMD_SECT_XN		(_AT(pmdval_t, 1) << 54)
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define PMD_ATTRINDX(t)		(_AT(pmdval_t, (t)) << 2)
+#define PMD_ATTRINDX_MASK	(_AT(pmdval_t, 7) << 2)
+
+/*
+ * Level 3 descriptor (PTE).
+ */
+#define PTE_TYPE_MASK		(_AT(pteval_t, 3) << 0)
+#define PTE_TYPE_FAULT		(_AT(pteval_t, 0) << 0)
+#define PTE_TYPE_PAGE		(_AT(pteval_t, 3) << 0)
+#define PTE_USER		(_AT(pteval_t, 1) << 6)		/* AP[1] */
+#define PTE_RDONLY		(_AT(pteval_t, 1) << 7)		/* AP[2] */
+#define PTE_SHARED		(_AT(pteval_t, 3) << 8)		/* SH[1:0], inner shareable */
+#define PTE_AF			(_AT(pteval_t, 1) << 10)	/* Access Flag */
+#define PTE_NG			(_AT(pteval_t, 1) << 11)	/* nG */
+#define PTE_XN			(_AT(pteval_t, 1) << 54)	/* XN */
+
+/*
+ * AttrIndx[2:0] encoding (mapping attributes defined in the MAIR* registers).
+ */
+#define PTE_ATTRINDX(t)		(_AT(pteval_t, (t)) << 2)
+#define PTE_ATTRINDX_MASK	(_AT(pteval_t, 7) << 2)
+
+/*
+ * 40-bit physical address supported.
+ */
+#define PHYS_MASK_SHIFT		(40)
+#define PHYS_MASK		((1UL << PHYS_MASK_SHIFT) - 1)
+
+/*
+ * TCR flags.
+ */
+#define TCR_TxSZ(x)		(((64 - (x)) << 16) | ((64 - (x)) << 0))
+#define TCR_IRGN_NC		((0 << 8) | (0 << 24))
+#define TCR_IRGN_WBWA		((1 << 8) | (1 << 24))
+#define TCR_IRGN_WT		((2 << 8) | (2 << 24))
+#define TCR_IRGN_WBnWA		((3 << 8) | (3 << 24))
+#define TCR_IRGN_MASK		((3 << 8) | (3 << 24))
+#define TCR_ORGN_NC		((0 << 10) | (0 << 26))
+#define TCR_ORGN_WBWA		((1 << 10) | (1 << 26))
+#define TCR_ORGN_WT		((2 << 10) | (2 << 26))
+#define TCR_ORGN_WBnWA		((3 << 10) | (3 << 26))
+#define TCR_ORGN_MASK		((3 << 10) | (3 << 26))
+#define TCR_SHARED		((3 << 12) | (3 << 28))
+#define TCR_TG0_64K		(1 << 14)
+#define TCR_TG1_64K		(1 << 30)
+#define TCR_IPS_40BIT		(_AC(2, UL) << 32)
+#define TCR_ASID16		(_AC(1, UL) << 36)
+
+#endif
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
new file mode 100644
index 0000000..6981da0
--- /dev/null
+++ b/arch/arm64/include/asm/pgtable.h
@@ -0,0 +1,328 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGTABLE_H
+#define __ASM_PGTABLE_H
+
+#include <asm/proc-fns.h>
+
+#include <asm/memory.h>
+#include <asm/pgtable-hwdef.h>
+
+/*
+ * Software defined PTE bits definition.
+ */
+#define PTE_VALID		(_AT(pteval_t, 1) << 0)	/* pte_present() check */
+#define PTE_FILE		(_AT(pteval_t, 1) << 2)	/* only when !pte_present() */
+#define PTE_DIRTY		(_AT(pteval_t, 1) << 55)
+#define PTE_SPECIAL		(_AT(pteval_t, 1) << 56)
+
+/*
+ * VMALLOC and SPARSEMEM_VMEMMAP ranges.
+ */
+#define VMALLOC_START		UL(0xffffff8000000000)
+#define VMALLOC_END		(PAGE_OFFSET - UL(0x400000000) - SZ_64K)
+
+#define vmemmap			((struct page *)(VMALLOC_END + SZ_64K))
+
+#define FIRST_USER_ADDRESS	0
+
+#ifndef __ASSEMBLY__
+extern void __pte_error(const char *file, int line, unsigned long val);
+extern void __pmd_error(const char *file, int line, unsigned long val);
+extern void __pgd_error(const char *file, int line, unsigned long val);
+
+#define pte_ERROR(pte)		__pte_error(__FILE__, __LINE__, pte_val(pte))
+#ifndef CONFIG_ARM64_64K_PAGES
+#define pmd_ERROR(pmd)		__pmd_error(__FILE__, __LINE__, pmd_val(pmd))
+#endif
+#define pgd_ERROR(pgd)		__pgd_error(__FILE__, __LINE__, pgd_val(pgd))
+
+/*
+ * The pgprot_* and protection_map entries will be fixed up at runtime to
+ * include the cachable and bufferable bits based on memory policy, as well as
+ * any architecture dependent bits like global/ASID and SMP shared mapping
+ * bits.
+ */
+#define _PAGE_DEFAULT		(PTE_TYPE_PAGE | PTE_AF)
+
+extern pgprot_t pgprot_default;
+
+#define _MOD_PROT(p, b)	__pgprot(pgprot_val(p) | (b))
+
+#define PAGE_NONE		_MOD_PROT(pgprot_default, PTE_NG | PTE_XN | PTE_RDONLY)
+#define PAGE_SHARED		_MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_XN)
+#define PAGE_SHARED_EXEC	_MOD_PROT(pgprot_default, PTE_USER | PTE_NG)
+#define PAGE_COPY		_MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_XN | PTE_RDONLY)
+#define PAGE_COPY_EXEC		_MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_RDONLY)
+#define PAGE_READONLY		_MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_XN | PTE_RDONLY)
+#define PAGE_READONLY_EXEC	_MOD_PROT(pgprot_default, PTE_USER | PTE_NG | PTE_RDONLY)
+#define PAGE_KERNEL		_MOD_PROT(pgprot_default, PTE_XN | PTE_DIRTY)
+#define PAGE_KERNEL_EXEC	_MOD_PROT(pgprot_default, PTE_DIRTY)
+
+#define __PAGE_NONE		__pgprot(_PAGE_DEFAULT | PTE_NG | PTE_XN | PTE_RDONLY)
+#define __PAGE_SHARED		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_XN)
+#define __PAGE_SHARED_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG)
+#define __PAGE_COPY		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_XN | PTE_RDONLY)
+#define __PAGE_COPY_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_RDONLY)
+#define __PAGE_READONLY		__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_XN | PTE_RDONLY)
+#define __PAGE_READONLY_EXEC	__pgprot(_PAGE_DEFAULT | PTE_USER | PTE_NG | PTE_RDONLY)
+
+#endif /* __ASSEMBLY__ */
+
+#define __P000  __PAGE_NONE
+#define __P001  __PAGE_READONLY
+#define __P010  __PAGE_COPY
+#define __P011  __PAGE_COPY
+#define __P100  __PAGE_READONLY_EXEC
+#define __P101  __PAGE_READONLY_EXEC
+#define __P110  __PAGE_COPY_EXEC
+#define __P111  __PAGE_COPY_EXEC
+
+#define __S000  __PAGE_NONE
+#define __S001  __PAGE_READONLY
+#define __S010  __PAGE_SHARED
+#define __S011  __PAGE_SHARED
+#define __S100  __PAGE_READONLY_EXEC
+#define __S101  __PAGE_READONLY_EXEC
+#define __S110  __PAGE_SHARED_EXEC
+#define __S111  __PAGE_SHARED_EXEC
+
+#ifndef __ASSEMBLY__
+/*
+ * ZERO_PAGE is a global shared page that is always zero: used
+ * for zero-mapped memory areas etc..
+ */
+extern struct page *empty_zero_page;
+#define ZERO_PAGE(vaddr)	(empty_zero_page)
+
+#define pte_pfn(pte)		((pte_val(pte) & PHYS_MASK) >> PAGE_SHIFT)
+
+#define pfn_pte(pfn,prot)	(__pte(((phys_addr_t)(pfn) << PAGE_SHIFT) | pgprot_val(prot)))
+
+#define pte_none(pte)		(!pte_val(pte))
+#define pte_clear(mm,addr,ptep)	set_pte(ptep, __pte(0))
+#define pte_page(pte)		(pfn_to_page(pte_pfn(pte)))
+#define pte_offset_kernel(dir,addr)	(pmd_page_vaddr(*(dir)) + __pte_index(addr))
+
+#define pte_offset_map(dir,addr)	pte_offset_kernel((dir), (addr))
+#define pte_offset_map_nested(dir,addr)	pte_offset_kernel((dir), (addr))
+#define pte_unmap(pte)			do { } while (0)
+#define pte_unmap_nested(pte)		do { } while (0)
+
+/*
+ * The following only work if pte_present(). Undefined behaviour otherwise.
+ */
+#define pte_present(pte)	(pte_val(pte) & PTE_VALID)
+#define pte_dirty(pte)		(pte_val(pte) & PTE_DIRTY)
+#define pte_young(pte)		(pte_val(pte) & PTE_AF)
+#define pte_special(pte)	(pte_val(pte) & PTE_SPECIAL)
+#define pte_write(pte)		(!(pte_val(pte) & PTE_RDONLY))
+#define pte_exec(pte)		(!(pte_val(pte) & PTE_XN))
+
+#define pte_present_exec_user(pte) \
+	((pte_val(pte) & (PTE_VALID | PTE_USER | PTE_XN)) == \
+	 (PTE_VALID | PTE_USER))
+
+#define PTE_BIT_FUNC(fn,op) \
+static inline pte_t pte_##fn(pte_t pte) { pte_val(pte) op; return pte; }
+
+PTE_BIT_FUNC(wrprotect, |= PTE_RDONLY);
+PTE_BIT_FUNC(mkwrite,   &= ~PTE_RDONLY);
+PTE_BIT_FUNC(mkclean,   &= ~PTE_DIRTY);
+PTE_BIT_FUNC(mkdirty,   |= PTE_DIRTY);
+PTE_BIT_FUNC(mkold,     &= ~PTE_AF);
+PTE_BIT_FUNC(mkyoung,   |= PTE_AF);
+PTE_BIT_FUNC(mkspecial, |= PTE_SPECIAL);
+
+static inline void set_pte(pte_t *ptep, pte_t pte)
+{
+	*ptep = pte;
+}
+
+extern void __sync_icache_dcache(pte_t pteval);
+
+static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
+			      pte_t *ptep, pte_t pte)
+{
+	if (pte_present_exec_user(pte))
+		__sync_icache_dcache(pte);
+	set_pte(ptep, pte);
+}
+
+/*
+ * Huge pte definitions.
+ */
+#define pte_huge(pte)		((pte_val(pte) & PTE_TYPE_MASK) == PTE_TYPE_HUGEPAGE)
+#define pte_mkhuge(pte)		(__pte((pte_val(pte) & ~PTE_TYPE_MASK) | PTE_TYPE_HUGEPAGE))
+
+#define __pgprot_modify(prot,mask,bits)		\
+	__pgprot((pgprot_val(prot) & ~(mask)) | (bits))
+
+#define __HAVE_ARCH_PTE_SPECIAL
+
+/*
+ * Mark the prot value as uncacheable and unbufferable.
+ */
+#define pgprot_noncached(prot) \
+	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_nGnRnE))
+#define pgprot_writecombine(prot) \
+	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_DEVICE_GRE))
+#define pgprot_dmacoherent(prot) \
+	__pgprot_modify(prot, PTE_ATTRINDX_MASK, PTE_ATTRINDX(MT_NORMAL_NC))
+#define __HAVE_PHYS_MEM_ACCESS_PROT
+struct file;
+extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+				     unsigned long size, pgprot_t vma_prot);
+
+#define pmd_none(pmd)		(!pmd_val(pmd))
+#define pmd_present(pmd)	(pmd_val(pmd))
+
+#define pmd_bad(pmd)		(!(pmd_val(pmd) & 2))
+
+static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
+{
+	*pmdp = pmd;
+	dsb();
+}
+
+static inline void pmd_clear(pmd_t *pmdp)
+{
+	set_pmd(pmdp, __pmd(0));
+}
+
+static inline pte_t *pmd_page_vaddr(pmd_t pmd)
+{
+	return __va(pmd_val(pmd) & PHYS_MASK & (s32)PAGE_MASK);
+}
+
+#define pmd_page(pmd)		pfn_to_page(__phys_to_pfn(pmd_val(pmd) & PHYS_MASK))
+
+/*
+ * Conversion functions: convert a page and protection to a page entry,
+ * and a page entry and page directory to the page they refer to.
+ */
+#define mk_pte(page,prot)	pfn_pte(page_to_pfn(page),prot)
+
+#ifndef CONFIG_ARM64_64K_PAGES
+
+#define pud_none(pud)		(!pud_val(pud))
+#define pud_bad(pud)		(!(pud_val(pud) & 2))
+#define pud_present(pud)	(pud_val(pud))
+
+static inline void set_pud(pud_t *pudp, pud_t pud)
+{
+	*pudp = pud;
+	dsb();
+}
+
+static inline void pud_clear(pud_t *pudp)
+{
+	set_pud(pudp, __pud(0));
+}
+
+static inline pmd_t *pud_page_vaddr(pud_t pud)
+{
+	return __va(pud_val(pud) & PHYS_MASK & (s32)PAGE_MASK);
+}
+
+#endif	/* CONFIG_ARM64_64K_PAGES */
+
+/* to find an entry in a page-table-directory */
+#define pgd_index(addr)		(((addr) >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1))
+
+#define pgd_offset(mm, addr)	((mm)->pgd+pgd_index(addr))
+
+/* to find an entry in a kernel page-table-directory */
+#define pgd_offset_k(addr)	pgd_offset(&init_mm, addr)
+
+/* Find an entry in the second-level page table.. */
+#ifndef CONFIG_ARM64_64K_PAGES
+#define pmd_index(addr)		(((addr) >> PMD_SHIFT) & (PTRS_PER_PMD - 1))
+static inline pmd_t *pmd_offset(pud_t *pud, unsigned long addr)
+{
+	return (pmd_t *)pud_page_vaddr(*pud) + pmd_index(addr);
+}
+#endif
+
+/* Find an entry in the third-level page table.. */
+#define __pte_index(addr)	(((addr) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1))
+
+static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
+{
+	const pteval_t mask = PTE_USER | PTE_XN | PTE_RDONLY;
+	pte_val(pte) = (pte_val(pte) & ~mask) | (pgprot_val(newprot) & mask);
+	return pte;
+}
+
+extern pgd_t swapper_pg_dir[PTRS_PER_PGD];
+extern pgd_t idmap_pg_dir[PTRS_PER_PGD];
+
+#define SWAPPER_DIR_SIZE	(3 * PAGE_SIZE)
+#define IDMAP_DIR_SIZE		(2 * PAGE_SIZE)
+
+/*
+ * Encode and decode a swap entry:
+ *	bits 0-1:	present (must be zero)
+ *	bit  2:		PTE_FILE
+ *	bits 3-8:	swap type
+ *	bits 9-63:	swap offset
+ */
+#define __SWP_TYPE_SHIFT	3
+#define __SWP_TYPE_BITS		6
+#define __SWP_TYPE_MASK		((1 << __SWP_TYPE_BITS) - 1)
+#define __SWP_OFFSET_SHIFT	(__SWP_TYPE_BITS + __SWP_TYPE_SHIFT)
+
+#define __swp_type(x)		(((x).val >> __SWP_TYPE_SHIFT) & __SWP_TYPE_MASK)
+#define __swp_offset(x)		((x).val >> __SWP_OFFSET_SHIFT)
+#define __swp_entry(type,offset) ((swp_entry_t) { ((type) << __SWP_TYPE_SHIFT) | ((offset) << __SWP_OFFSET_SHIFT) })
+
+#define __pte_to_swp_entry(pte)	((swp_entry_t) { pte_val(pte) })
+#define __swp_entry_to_pte(swp)	((pte_t) { (swp).val })
+
+/*
+ * Ensure that there are not more swap files than can be encoded in the kernel
+ * PTEs.
+ */
+#define MAX_SWAPFILES_CHECK() BUILD_BUG_ON(MAX_SWAPFILES_SHIFT > __SWP_TYPE_BITS)
+
+/*
+ * Encode and decode a file entry:
+ *	bits 0-1:	present (must be zero)
+ *	bit  2:		PTE_FILE
+ *	bits 3-63:	file offset / PAGE_SIZE
+ */
+#define pte_file(pte)		(pte_val(pte) & PTE_FILE)
+#define pte_to_pgoff(x)		(pte_val(x) >> 3)
+#define pgoff_to_pte(x)		__pte(((x) << 3) | PTE_FILE)
+
+#define PTE_FILE_MAX_BITS	61
+
+extern int kern_addr_valid(unsigned long addr);
+
+#include <asm-generic/pgtable.h>
+
+/*
+ * remap a physical page `pfn' of size `size' with page protection `prot'
+ * into virtual address `from'
+ */
+#define io_remap_pfn_range(vma,from,pfn,size,prot) \
+		remap_pfn_range(vma, from, pfn, size, prot)
+
+#define pgtable_cache_init() do { } while (0)
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_PGTABLE_H */
diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h
new file mode 100644
index 0000000..1be62bc
--- /dev/null
+++ b/arch/arm64/include/asm/sparsemem.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SPARSEMEM_H
+#define __ASM_SPARSEMEM_H
+
+#ifdef CONFIG_SPARSEMEM
+#define MAX_PHYSMEM_BITS	40
+#define SECTION_SIZE_BITS	30
+#endif
+
+#endif
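
For reference, the 4KB lookup described in Documentation/arm64/memory.txt
above can be sketched in stand-alone C. The shift and table-size values
are copied from pgtable-3level-hwdef.h in this patch; struct va_fields
and split_va() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT	12
#define PMD_SHIFT	21
#define PGDIR_SHIFT	30
#define PTRS_PER_TABLE	512	/* PTRS_PER_{PGD,PMD,PTE} are all 512 */

struct va_fields {		/* hypothetical type */
	unsigned int ttbr;	/* bit 63: 0 = TTBR0 (user), 1 = TTBR1 (kernel) */
	unsigned int pgd;	/* bits 38:30, L1 index */
	unsigned int pmd;	/* bits 29:21, L2 index */
	unsigned int pte;	/* bits 20:12, L3 index */
	unsigned int offset;	/* bits 11:0, in-page offset */
};

static inline struct va_fields split_va(uint64_t va)
{
	struct va_fields f;
	uint64_t top = va >> 39;

	/* Bits 63:39 must be all zeros (user) or all ones (kernel). */
	assert(top == 0 || top == 0x1ffffff);

	f.ttbr = (unsigned int)(va >> 63);
	f.pgd = (unsigned int)((va >> PGDIR_SHIFT) & (PTRS_PER_TABLE - 1));
	f.pmd = (unsigned int)((va >> PMD_SHIFT) & (PTRS_PER_TABLE - 1));
	f.pte = (unsigned int)((va >> PAGE_SHIFT) & (PTRS_PER_TABLE - 1));
	f.offset = (unsigned int)(va & ((UINT64_C(1) << PAGE_SHIFT) - 1));
	return f;
}
```

For example, PAGE_OFFSET (0xffffffc000000000) selects TTBR1 and pgd
index 256, i.e. the linear mapping starts halfway through the kernel pgd.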


* [PATCH v2 05/31] arm64: MMU initialisation
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (3 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 04/31] arm64: MMU definitions Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 13:45   ` Arnd Bergmann
  2012-08-17 10:06   ` Santosh Shilimkar
  2012-08-14 17:52 ` [PATCH v2 06/31] arm64: MMU fault handling and page table management Catalin Marinas
                   ` (26 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch contains the initialisation of the memory blocks, MMU
attributes and the memory map. Only five memory types are defined:
Device nGnRnE (equivalent to Strongly Ordered), Device nGnRE (classic
Device memory), Device GRE, Normal Non-cacheable and Normal Cacheable.
Cache policies are supported via the memory attributes register
(MAIR_EL1) and only affect the Normal Cacheable mappings.

This patch also adds the SPARSEMEM_VMEMMAP initialisation.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
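
Note on the MAIR_EL1 update: the "cachepolicy" handler in mmu.c rewrites a
single 8-bit attribute field of MAIR_EL1 with a bfi instruction. The same
bit manipulation can be sketched in plain C as below. This is only an
illustration: sketch_mair_set_attr() is a hypothetical helper, and the index
is left as a parameter because the value of MT_NORMAL is defined elsewhere in
the series.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Replace attribute field 'index' (0-7) of a MAIR_EL1 value with 'attr',
 * leaving the other seven fields untouched. This mirrors the effect of
 * "bfi %0, %1, #(index * 8), #8" on the register value.
 */
static inline uint64_t sketch_mair_set_attr(uint64_t mair, unsigned int index,
					    uint8_t attr)
{
	unsigned int shift;

	/* MAIR_EL1 holds eight 8-bit attribute fields. */
	assert(index < 8);

	shift = index * 8;
	mair &= ~(UINT64_C(0xff) << shift);	/* clear the old field */
	mair |= (uint64_t)attr << shift;	/* insert the new attribute */
	return mair;
}
```

For example, passing attr 0xee selects the "writeback" encoding from the
cache_policies table, and 0x44 the "uncached" one.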
 arch/arm64/include/asm/memblock.h |   21 ++
 arch/arm64/mm/init.c              |  416 +++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mmu.c               |  395 +++++++++++++++++++++++++++++++++++
 3 files changed, 832 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/memblock.h
 create mode 100644 arch/arm64/mm/init.c
 create mode 100644 arch/arm64/mm/mmu.c

diff --git a/arch/arm64/include/asm/memblock.h b/arch/arm64/include/asm/memblock.h
new file mode 100644
index 0000000..6afeed2
--- /dev/null
+++ b/arch/arm64/include/asm/memblock.h
@@ -0,0 +1,21 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_MEMBLOCK_H
+#define __ASM_MEMBLOCK_H
+
+extern void arm64_memblock_init(void);
+
+#endif
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
new file mode 100644
index 0000000..c87583d
--- /dev/null
+++ b/arch/arm64/mm/init.c
@@ -0,0 +1,416 @@
+/*
+ * Based on arch/arm/mm/init.c
+ *
+ * Copyright (C) 1995-2005 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/errno.h>
+#include <linux/swap.h>
+#include <linux/init.h>
+#include <linux/bootmem.h>
+#include <linux/mman.h>
+#include <linux/nodemask.h>
+#include <linux/initrd.h>
+#include <linux/gfp.h>
+#include <linux/memblock.h>
+#include <linux/sort.h>
+#include <linux/of_fdt.h>
+
+#include <asm/prom.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/sizes.h>
+#include <asm/tlb.h>
+
+#include "mm.h"
+
+static unsigned long phys_initrd_start __initdata = 0;
+static unsigned long phys_initrd_size __initdata = 0;
+
+phys_addr_t memstart_addr __read_mostly = 0;
+
+void __init early_init_dt_setup_initrd_arch(unsigned long start,
+					    unsigned long end)
+{
+	phys_initrd_start = start;
+	phys_initrd_size = end - start;
+}
+
+static int __init early_initrd(char *p)
+{
+	unsigned long start, size;
+	char *endp;
+
+	start = memparse(p, &endp);
+	if (*endp == ',') {
+		size = memparse(endp + 1, NULL);
+
+		phys_initrd_start = start;
+		phys_initrd_size = size;
+	}
+	return 0;
+}
+early_param("initrd", early_initrd);
+
+#define MAX_DMA32_PFN ((4UL * 1024 * 1024 * 1024) >> PAGE_SHIFT)
+
+static void __init zone_sizes_init(unsigned long min, unsigned long max)
+{
+	unsigned long zone_size[MAX_NR_ZONES];
+	unsigned long max_dma32 = min;
+
+	memset(zone_size, 0, sizeof(zone_size));
+
+	zone_size[0] = max - min;
+#ifdef CONFIG_ZONE_DMA32
+	/* 4GB maximum for 32-bit only capable devices */
+	max_dma32 = min(max, MAX_DMA32_PFN);
+	zone_size[ZONE_DMA32] = max_dma32 - min;
+#endif
+	zone_size[ZONE_NORMAL] = max - max_dma32;
+
+	free_area_init(zone_size);
+}
+
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+int pfn_valid(unsigned long pfn)
+{
+	return memblock_is_memory(pfn << PAGE_SHIFT);
+}
+EXPORT_SYMBOL(pfn_valid);
+#endif
+
+#ifndef CONFIG_SPARSEMEM
+static void arm64_memory_present(void)
+{
+}
+#else
+static void arm64_memory_present(void)
+{
+	struct memblock_region *reg;
+
+	for_each_memblock(memory, reg)
+		memory_present(0, memblock_region_memory_base_pfn(reg),
+			       memblock_region_memory_end_pfn(reg));
+}
+#endif
+
+void __init arm64_memblock_init(void)
+{
+	u64 *reserve_map, base, size;
+
+	/* Register the kernel text, kernel data and initrd with memblock */
+	memblock_reserve(__pa(_text), _end - _text);
+#ifdef CONFIG_BLK_DEV_INITRD
+	if (phys_initrd_size) {
+		memblock_reserve(phys_initrd_start, phys_initrd_size);
+
+		/* Now convert initrd to virtual addresses */
+		initrd_start = __phys_to_virt(phys_initrd_start);
+		initrd_end = initrd_start + phys_initrd_size;
+	}
+#endif
+
+	/*
+	 * Reserve the page tables.  These are already in use,
+	 * and can only be in node 0.
+	 */
+	memblock_reserve(__pa(swapper_pg_dir), SWAPPER_DIR_SIZE);
+	memblock_reserve(__pa(idmap_pg_dir), IDMAP_DIR_SIZE);
+
+	/* Reserve the dtb region */
+	memblock_reserve(virt_to_phys(initial_boot_params),
+			 be32_to_cpu(initial_boot_params->totalsize));
+
+	/*
+	 * Process the reserve map.  This will probably overlap the initrd
+	 * and dtb locations which are already reserved, but overlapping
+	 * doesn't hurt anything
+	 */
+	reserve_map = ((void*)initial_boot_params) +
+			be32_to_cpu(initial_boot_params->off_mem_rsvmap);
+	while (1) {
+		base = be64_to_cpup(reserve_map++);
+		size = be64_to_cpup(reserve_map++);
+		if (!size)
+			break;
+		memblock_reserve(base, size);
+	}
+
+	memblock_allow_resize();
+	memblock_dump_all();
+}
+
+void __init bootmem_init(void)
+{
+	unsigned long min, max;
+
+	min = PFN_UP(memblock_start_of_DRAM());
+	max = PFN_DOWN(memblock_end_of_DRAM());
+
+	/*
+	 * Sparsemem tries to allocate bootmem in memory_present(), so this
+	 * must be done after the fixed reservations.
+	 */
+	arm64_memory_present();
+
+	sparse_init();
+	zone_sizes_init(min, max);
+
+	high_memory = __va((max << PAGE_SHIFT) - 1) + 1;
+	max_pfn = max_low_pfn = max;
+}
+
+static inline int free_area(unsigned long pfn, unsigned long end, char *s)
+{
+	unsigned int pages = 0, size = (end - pfn) << (PAGE_SHIFT - 10);
+
+	for (; pfn < end; pfn++) {
+		struct page *page = pfn_to_page(pfn);
+		ClearPageReserved(page);
+		init_page_count(page);
+		__free_page(page);
+		pages++;
+	}
+
+	if (size && s)
+		pr_info("Freeing %s memory: %dK\n", s, size);
+
+	return pages;
+}
+
+/*
+ * Poison init memory with an undefined instruction (0x0).
+ */
+static inline void poison_init_mem(void *s, size_t count)
+{
+	memset(s, 0, count);
+}
+
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
+static inline void free_memmap(unsigned long start_pfn, unsigned long end_pfn)
+{
+	struct page *start_pg, *end_pg;
+	unsigned long pg, pgend;
+
+	/*
+	 * Convert start_pfn/end_pfn to a struct page pointer.
+	 */
+	start_pg = pfn_to_page(start_pfn - 1) + 1;
+	end_pg = pfn_to_page(end_pfn - 1) + 1;
+
+	/*
+	 * Convert to physical addresses, and round start upwards and end
+	 * downwards.
+	 */
+	pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
+	pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
+
+	/*
+	 * If there are free pages between these, free the section of the
+	 * memmap array.
+	 */
+	if (pg < pgend)
+		free_bootmem(pg, pgend - pg);
+}
+
+/*
+ * The mem_map array can get very big. Free the unused area of the memory map.
+ */
+static void __init free_unused_memmap(void)
+{
+	unsigned long start, prev_end = 0;
+	struct memblock_region *reg;
+
+	for_each_memblock(memory, reg) {
+		start = __phys_to_pfn(reg->base);
+
+#ifdef CONFIG_SPARSEMEM
+		/*
+		 * Take care not to free memmap entries that don't exist due
+		 * to SPARSEMEM sections which aren't present.
+		 */
+		start = min(start, ALIGN(prev_end, PAGES_PER_SECTION));
+#endif
+		/*
+		 * If we had a previous bank, and there is a space between the
+		 * current bank and the previous, free it.
+		 */
+		if (prev_end && prev_end < start)
+			free_memmap(prev_end, start);
+
+		/*
+		 * Align up here since the VM subsystem insists that the
+		 * memmap entries are valid from the bank end aligned to
+		 * MAX_ORDER_NR_PAGES.
+		 */
+		prev_end = ALIGN(start + __phys_to_pfn(reg->size),
+				 MAX_ORDER_NR_PAGES);
+	}
+
+#ifdef CONFIG_SPARSEMEM
+	if (!IS_ALIGNED(prev_end, PAGES_PER_SECTION))
+		free_memmap(prev_end, ALIGN(prev_end, PAGES_PER_SECTION));
+#endif
+}
+#endif	/* !CONFIG_SPARSEMEM_VMEMMAP */
+
+/*
+ * mem_init() marks the free areas in the mem_map and tells us how much memory
+ * is free.  This is done after various parts of the system have claimed their
+ * memory after the kernel image.
+ */
+void __init mem_init(void)
+{
+	unsigned long reserved_pages, free_pages;
+	struct memblock_region *reg;
+
+#ifdef CONFIG_SWIOTLB
+	extern void __init arm64_swiotlb_init(size_t max_size);
+	arm64_swiotlb_init(max_pfn << (PAGE_SHIFT - 1));
+#endif
+
+	max_mapnr   = pfn_to_page(max_pfn + PHYS_PFN_OFFSET) - mem_map;
+
+#ifndef CONFIG_SPARSEMEM_VMEMMAP
+	/* this will put all unused low memory onto the freelists */
+	free_unused_memmap();
+#endif
+
+	totalram_pages += free_all_bootmem();
+
+	reserved_pages = free_pages = 0;
+
+	for_each_memblock(memory, reg) {
+		unsigned int pfn1, pfn2;
+		struct page *page, *end;
+
+		pfn1 = __phys_to_pfn(reg->base);
+		pfn2 = pfn1 + __phys_to_pfn(reg->size);
+
+		page = pfn_to_page(pfn1);
+		end  = pfn_to_page(pfn2 - 1) + 1;
+
+		do {
+			if (PageReserved(page))
+				reserved_pages++;
+			else if (!page_count(page))
+				free_pages++;
+			page++;
+		} while (page < end);
+	}
+
+	/*
+	 * Since our memory may not be contiguous, calculate the real number
+	 * of pages we have in this system.
+	 */
+	pr_info("Memory:");
+	num_physpages = 0;
+	for_each_memblock(memory, reg) {
+		unsigned long pages = memblock_region_memory_end_pfn(reg) -
+			memblock_region_memory_base_pfn(reg);
+		num_physpages += pages;
+		printk(" %luMB", pages >> (20 - PAGE_SHIFT));
+	}
+	printk(" = %luMB total\n", num_physpages >> (20 - PAGE_SHIFT));
+
+	pr_notice("Memory: %luk/%luk available, %luk reserved\n",
+		  nr_free_pages() << (PAGE_SHIFT-10),
+		  free_pages << (PAGE_SHIFT-10),
+		  reserved_pages << (PAGE_SHIFT-10));
+
+#define MLK(b, t) b, t, ((t) - (b)) >> 10
+#define MLM(b, t) b, t, ((t) - (b)) >> 20
+#define MLK_ROUNDUP(b, t) b, t, DIV_ROUND_UP(((t) - (b)), SZ_1K)
+
+	pr_notice("Virtual kernel memory layout:\n"
+		  "    vmalloc : 0x%16lx - 0x%16lx   (%6ld MB)\n"
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+		  "    vmemmap : 0x%16lx - 0x%16lx   (%6ld MB)\n"
+#endif
+		  "    modules : 0x%16lx - 0x%16lx   (%6ld MB)\n"
+		  "    memory  : 0x%16lx - 0x%16lx   (%6ld MB)\n"
+		  "      .init : 0x%p" " - 0x%p" "   (%6ld kB)\n"
+		  "      .text : 0x%p" " - 0x%p" "   (%6ld kB)\n"
+		  "      .data : 0x%p" " - 0x%p" "   (%6ld kB)\n",
+		  MLM(VMALLOC_START, VMALLOC_END),
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+		  MLM((unsigned long)virt_to_page(PAGE_OFFSET),
+		      (unsigned long)virt_to_page(high_memory)),
+#endif
+		  MLM(MODULES_VADDR, MODULES_END),
+		  MLM(PAGE_OFFSET, (unsigned long)high_memory),
+
+		  MLK_ROUNDUP(__init_begin, __init_end),
+		  MLK_ROUNDUP(_text, _etext),
+		  MLK_ROUNDUP(_sdata, _edata));
+
+#undef MLK
+#undef MLM
+#undef MLK_ROUNDUP
+
+	/*
+	 * Check boundaries twice: Some fundamental inconsistencies can be
+	 * detected at build time already.
+	 */
+#ifdef CONFIG_AARCH32_EMULATION
+	BUILD_BUG_ON(TASK_SIZE_32			> TASK_SIZE_64);
+#endif
+	BUILD_BUG_ON(TASK_SIZE_64			> MODULES_VADDR);
+	BUG_ON(TASK_SIZE_64				> MODULES_VADDR);
+
+	if (PAGE_SIZE >= 16384 && num_physpages <= 128) {
+		extern int sysctl_overcommit_memory;
+		/*
+		 * On a machine this small we won't get anywhere without
+		 * overcommit, so turn it on by default.
+		 */
+		sysctl_overcommit_memory = OVERCOMMIT_ALWAYS;
+	}
+}
+
+void free_initmem(void)
+{
+	poison_init_mem(__init_begin, __init_end - __init_begin);
+	totalram_pages += free_area(__phys_to_pfn(__pa(__init_begin)),
+				    __phys_to_pfn(__pa(__init_end)),
+				    "init");
+}
+
+#ifdef CONFIG_BLK_DEV_INITRD
+
+static int keep_initrd;
+
+void free_initrd_mem(unsigned long start, unsigned long end)
+{
+	if (!keep_initrd) {
+		poison_init_mem((void *)start, PAGE_ALIGN(end) - start);
+		totalram_pages += free_area(__phys_to_pfn(__pa(start)),
+					    __phys_to_pfn(__pa(end)),
+					    "initrd");
+	}
+}
+
+static int __init keepinitrd_setup(char *__unused)
+{
+	keep_initrd = 1;
+	return 1;
+}
+
+__setup("keepinitrd", keepinitrd_setup);
+#endif
diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
new file mode 100644
index 0000000..d2dd438
--- /dev/null
+++ b/arch/arm64/mm/mmu.c
@@ -0,0 +1,395 @@
+/*
+ * Based on arch/arm/mm/mmu.c
+ *
+ * Copyright (C) 1995-2005 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/init.h>
+#include <linux/mman.h>
+#include <linux/nodemask.h>
+#include <linux/memblock.h>
+#include <linux/fs.h>
+
+#include <asm/cputype.h>
+#include <asm/sections.h>
+#include <asm/setup.h>
+#include <asm/sizes.h>
+#include <asm/tlb.h>
+#include <asm/mmu_context.h>
+
+#include "mm.h"
+
+/*
+ * Empty_zero_page is a special page that is used for zero-initialized data
+ * and COW.
+ */
+struct page *empty_zero_page;
+EXPORT_SYMBOL(empty_zero_page);
+
+pgprot_t pgprot_default;
+EXPORT_SYMBOL(pgprot_default);
+
+static pmdval_t prot_sect_kernel;
+
+struct cachepolicy {
+	const char	policy[16];
+	u64		mair;
+	u64		tcr;
+};
+
+static struct cachepolicy cache_policies[] __initdata = {
+	{
+		.policy		= "uncached",
+		.mair		= 0x44,			/* inner, outer non-cacheable */
+		.tcr		= TCR_IRGN_NC | TCR_ORGN_NC,
+	}, {
+		.policy		= "writethrough",
+		.mair		= 0xaa,			/* inner, outer write-through, read-allocate */
+		.tcr		= TCR_IRGN_WT | TCR_ORGN_WT,
+	}, {
+		.policy		= "writeback",
+		.mair		= 0xee,			/* inner, outer write-back, read-allocate */
+		.tcr		= TCR_IRGN_WBnWA | TCR_ORGN_WBnWA,
+	}
+};
+
+/*
+ * These policies help identify cache coherency problems by allowing the
+ * cache, or the cache and write buffer, to be turned off. The chosen policy
+ * changes the Normal memory caching attributes in the MAIR_EL1 register.
+ */
+static int __init early_cachepolicy(char *p)
+{
+	int i;
+	u64 tmp;
+
+	for (i = 0; i < ARRAY_SIZE(cache_policies); i++) {
+		int len = strlen(cache_policies[i].policy);
+
+		if (memcmp(p, cache_policies[i].policy, len) == 0)
+			break;
+	}
+	if (i == ARRAY_SIZE(cache_policies)) {
+		pr_err("ERROR: unknown or unsupported cache policy: %s\n", p);
+		return 0;
+	}
+
+	flush_cache_all();
+
+	/*
+	 * Modify MT_NORMAL attributes in MAIR_EL1.
+	 */
+	asm volatile(
+	"	mrs	%0, mair_el1\n"
+	"	bfi	%0, %1, #%2, #8\n"
+	"	msr	mair_el1, %0\n"
+	"	isb\n"
+	: "=&r" (tmp)
+	: "r" (cache_policies[i].mair), "i" (MT_NORMAL * 8));
+
+	/*
+	 * Modify TCR PTW cacheability attributes.
+	 */
+	asm volatile(
+	"	mrs	%0, tcr_el1\n"
+	"	bic	%0, %0, %2\n"
+	"	orr	%0, %0, %1\n"
+	"	msr	tcr_el1, %0\n"
+	"	isb\n"
+	: "=&r" (tmp)
+	: "r" (cache_policies[i].tcr), "r" (TCR_IRGN_MASK | TCR_ORGN_MASK));
+
+	flush_cache_all();
+
+	return 0;
+}
+early_param("cachepolicy", early_cachepolicy);
+
+/*
+ * Adjust the PMD section entries according to the CPU in use.
+ */
+static void __init init_mem_pgprot(void)
+{
+	pteval_t default_pgprot;
+	int i;
+
+	default_pgprot = PTE_ATTRINDX(MT_NORMAL);
+	prot_sect_kernel = PMD_TYPE_SECT | PMD_SECT_AF | PMD_ATTRINDX(MT_NORMAL);
+
+#ifdef CONFIG_SMP
+	/*
+	 * Mark memory with the "shared" attribute for SMP systems
+	 */
+	default_pgprot |= PTE_SHARED;
+	prot_sect_kernel |= PMD_SECT_S;
+#endif
+
+	for (i = 0; i < 16; i++) {
+		unsigned long v = pgprot_val(protection_map[i]);
+		protection_map[i] = __pgprot(v | default_pgprot);
+	}
+
+	pgprot_default = __pgprot(PTE_TYPE_PAGE | PTE_AF | default_pgprot);
+}
+
+pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
+			      unsigned long size, pgprot_t vma_prot)
+{
+	if (!pfn_valid(pfn))
+		return pgprot_noncached(vma_prot);
+	else if (file->f_flags & O_SYNC)
+		return pgprot_writecombine(vma_prot);
+	return vma_prot;
+}
+EXPORT_SYMBOL(phys_mem_access_prot);
+
+static void __init *early_alloc(unsigned long sz)
+{
+	void *ptr = __va(memblock_alloc(sz, sz));
+	memset(ptr, 0, sz);
+	return ptr;
+}
+
+static void __init alloc_init_pte(pmd_t *pmd, unsigned long addr,
+				  unsigned long end, unsigned long pfn)
+{
+	pte_t *pte;
+
+	if (pmd_none(*pmd)) {
+		pte = early_alloc(PTRS_PER_PTE * sizeof(pte_t));
+		__pmd_populate(pmd, __pa(pte), PMD_TYPE_TABLE);
+	}
+	BUG_ON(pmd_bad(*pmd));
+
+	pte = pte_offset_kernel(pmd, addr);
+	do {
+		set_pte(pte, pfn_pte(pfn, PAGE_KERNEL_EXEC));
+		pfn++;
+	} while (pte++, addr += PAGE_SIZE, addr != end);
+}
+
+static void __init alloc_init_pmd(pud_t *pud, unsigned long addr,
+				  unsigned long end, phys_addr_t phys)
+{
+	pmd_t *pmd;
+	unsigned long next;
+
+	/*
+	 * Check for initial section mappings in the pgd/pud and remove them.
+	 */
+	if (pud_none(*pud) || pud_bad(*pud)) {
+		pmd = early_alloc(PTRS_PER_PMD * sizeof(pmd_t));
+		pud_populate(&init_mm, pud, pmd);
+	}
+
+	pmd = pmd_offset(pud, addr);
+	do {
+		next = pmd_addr_end(addr, end);
+		/* try section mapping first */
+		if (((addr | next | phys) & ~SECTION_MASK) == 0)
+			set_pmd(pmd, __pmd(phys | prot_sect_kernel));
+		else
+			alloc_init_pte(pmd, addr, next, __phys_to_pfn(phys));
+		phys += next - addr;
+	} while (pmd++, addr = next, addr != end);
+}
+
+static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
+				  unsigned long end, unsigned long phys)
+{
+	pud_t *pud = pud_offset(pgd, addr);
+	unsigned long next;
+
+	do {
+		next = pud_addr_end(addr, end);
+		alloc_init_pmd(pud, addr, next, phys);
+		phys += next - addr;
+	} while (pud++, addr = next, addr != end);
+}
+
+/*
+ * Create the page directory entries and any necessary page tables for the
+ * mapping of 'size' bytes from physical 'phys' to virtual address 'virt'.
+ */
+static void __init create_mapping(phys_addr_t phys, unsigned long virt,
+				  phys_addr_t size)
+{
+	unsigned long addr, length, end, next;
+	pgd_t *pgd;
+
+	if (virt < VMALLOC_START) {
+		pr_warning("BUG: not creating mapping for 0x%016llx at 0x%016lx - outside kernel range\n",
+			   phys, virt);
+		return;
+	}
+
+	addr = virt & PAGE_MASK;
+	length = PAGE_ALIGN(size + (virt & ~PAGE_MASK));
+
+	pgd = pgd_offset_k(addr);
+	end = addr + length;
+	do {
+		next = pgd_addr_end(addr, end);
+		alloc_init_pud(pgd, addr, next, phys);
+		phys += next - addr;
+	} while (pgd++, addr = next, addr != end);
+}
+
+static void __init map_mem(void)
+{
+	struct memblock_region *reg;
+
+	/* map all the memory banks */
+	for_each_memblock(memory, reg) {
+		phys_addr_t start = reg->base;
+		phys_addr_t end = start + reg->size;
+
+		if (start >= end)
+			break;
+
+		create_mapping(start, __phys_to_virt(start), end - start);
+	}
+}
+
+/*
+ * paging_init() sets up the page tables, initialises the zone memory
+ * maps and sets up the zero page.
+ */
+void __init paging_init(void)
+{
+	void *zero_page;
+
+	/*
+	 * Maximum PGDIR_SIZE addressable via the initial direct kernel
+	 * mapping in swapper_pg_dir.
+	 */
+	memblock_set_current_limit((PHYS_OFFSET & PGDIR_MASK) + PGDIR_SIZE);
+
+	init_mem_pgprot();
+	map_mem();
+
+	/*
+	 * Finally flush the caches and tlb to ensure that we're in a
+	 * consistent state.
+	 */
+	flush_cache_all();
+	flush_tlb_all();
+
+	/* allocate the zero page. */
+	zero_page = early_alloc(PAGE_SIZE);
+
+	bootmem_init();
+
+	empty_zero_page = virt_to_page(zero_page);
+	__flush_dcache_page(NULL, empty_zero_page);
+
+	/*
+	 * TTBR0 is only used for the identity mapping at this stage. Make it
+	 * point to the zero page to avoid speculatively fetching new entries.
+	 */
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+}
+
+/*
+ * Enable the identity mapping to allow the MMU to be disabled.
+ */
+void setup_mm_for_reboot(void)
+{
+	cpu_switch_mm(idmap_pg_dir, &init_mm);
+	flush_tlb_all();
+}
+
+/*
+ * Check whether a kernel address is valid (derived from arch/x86/).
+ */
+int kern_addr_valid(unsigned long addr)
+{
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+
+	if ((((long)addr) >> VA_BITS) != -1UL)
+		return 0;
+
+	pgd = pgd_offset_k(addr);
+	if (pgd_none(*pgd))
+		return 0;
+
+	pud = pud_offset(pgd, addr);
+	if (pud_none(*pud))
+		return 0;
+
+	pmd = pmd_offset(pud, addr);
+	if (pmd_none(*pmd))
+		return 0;
+
+	pte = pte_offset_kernel(pmd, addr);
+	if (pte_none(*pte))
+		return 0;
+
+	return pfn_valid(pte_pfn(*pte));
+}
+#ifdef CONFIG_SPARSEMEM_VMEMMAP
+#ifdef CONFIG_ARM64_64K_PAGES
+int __meminit vmemmap_populate(struct page *start_page,
+			       unsigned long size, int node)
+{
+	return vmemmap_populate_basepages(start_page, size, node);
+}
+#else	/* !CONFIG_ARM64_64K_PAGES */
+int __meminit vmemmap_populate(struct page *start_page,
+			       unsigned long size, int node)
+{
+	unsigned long addr = (unsigned long)start_page;
+	unsigned long end = (unsigned long)(start_page + size);
+	unsigned long next;
+	pgd_t *pgd;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	do {
+		next = pmd_addr_end(addr, end);
+
+		pgd = vmemmap_pgd_populate(addr, node);
+		if (!pgd)
+			return -ENOMEM;
+
+		pud = vmemmap_pud_populate(pgd, addr, node);
+		if (!pud)
+			return -ENOMEM;
+
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd)) {
+			void *p = NULL;
+
+			p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+			if (!p)
+				return -ENOMEM;
+
+			set_pmd(pmd, __pmd(__pa(p) | prot_sect_kernel));
+		} else
+			vmemmap_verify((pte_t *)pmd, node, addr, next);
+	} while (addr = next, addr != end);
+
+	return 0;
+}
+#endif	/* CONFIG_ARM64_64K_PAGES */
+#endif	/* CONFIG_SPARSEMEM_VMEMMAP */
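
For readers following alloc_init_pmd() above: a section (block) mapping is
only used when the virtual start, the virtual end and the physical address
are all section aligned, otherwise the range falls back to individual PTEs.
A standalone sketch of that test is given below. The 2MB section size is an
assumption corresponding to 4KB pages, and the sketch_* names are
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 2MB sections, as with 4KB pages. */
#define SKETCH_SECTION_SIZE	(UINT64_C(1) << 21)
#define SKETCH_SECTION_MASK	(~(SKETCH_SECTION_SIZE - 1))

/*
 * Return true when the virtual range [addr, next) and the physical address
 * 'phys' are all section aligned, so one section entry can cover the range.
 */
static inline bool sketch_can_use_section(uint64_t addr, uint64_t next,
					  uint64_t phys)
{
	/* The caller walks forward, so the range is never empty. */
	assert(addr < next);

	/* OR-ing the values lets one mask test check all three at once. */
	return ((addr | next | phys) & ~SKETCH_SECTION_MASK) == 0;
}
```

A single misaligned value among the three is enough to force the PTE path,
which is why memory banks that do not start on a section boundary cost an
extra page table level at their edges.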



* [PATCH v2 06/31] arm64: MMU fault handling and page table management
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (4 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 05/31] arm64: MMU initialisation Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 13:47   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 07/31] arm64: Process management Catalin Marinas
                   ` (25 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds support for handling MMU faults (the exception entry code
was introduced by a previous patch) and for page table management.

The user translation table is pointed to by TTBR0 and the kernel one
(swapper_pg_dir) by TTBR1. There is no translation information shared or
address space overlapping between user and kernel page tables.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
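
Note on the TTBR0/TTBR1 split described above: the upper bits of a virtual
address decide which translation table base is used. All zeros selects the
user table (TTBR0), all ones selects the kernel table (TTBR1), and anything
in between is not a valid address. The sketch below illustrates the check in
standalone C. The sketch_* helpers are hypothetical, the number of virtual
address bits is passed in rather than taken from the kernel's VA_BITS, and
features such as address tagging are ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when every bit above the low 'va_bits' bits is one (kernel, TTBR1). */
static inline bool sketch_uses_ttbr1(uint64_t addr, unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits < 64);
	return (addr >> va_bits) == (UINT64_MAX >> va_bits);
}

/* True when every bit above the low 'va_bits' bits is zero (user, TTBR0). */
static inline bool sketch_uses_ttbr0(uint64_t addr, unsigned int va_bits)
{
	assert(va_bits > 0 && va_bits < 64);
	return (addr >> va_bits) == 0;
}
```

An address for which both helpers return false lies in the unmapped hole
between the two ranges, which is the property that keeps the user and kernel
address spaces from overlapping.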
 arch/arm64/include/asm/page.h    |   67 +++++
 arch/arm64/include/asm/pgalloc.h |  113 ++++++++
 arch/arm64/mm/copypage.c         |   34 +++
 arch/arm64/mm/extable.c          |   17 ++
 arch/arm64/mm/fault.c            |  534 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/mm/mm.h               |    2 +
 arch/arm64/mm/mmap.c             |  144 ++++++++++
 arch/arm64/mm/pgd.c              |   49 ++++
 8 files changed, 960 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/page.h
 create mode 100644 arch/arm64/include/asm/pgalloc.h
 create mode 100644 arch/arm64/mm/copypage.c
 create mode 100644 arch/arm64/mm/extable.c
 create mode 100644 arch/arm64/mm/fault.c
 create mode 100644 arch/arm64/mm/mm.h
 create mode 100644 arch/arm64/mm/mmap.c
 create mode 100644 arch/arm64/mm/pgd.c

diff --git a/arch/arm64/include/asm/page.h b/arch/arm64/include/asm/page.h
new file mode 100644
index 0000000..46bf666
--- /dev/null
+++ b/arch/arm64/include/asm/page.h
@@ -0,0 +1,67 @@
+/*
+ * Based on arch/arm/include/asm/page.h
+ *
+ * Copyright (C) 1995-2003 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PAGE_H
+#define __ASM_PAGE_H
+
+/* PAGE_SHIFT determines the page size */
+#ifdef CONFIG_ARM64_64K_PAGES
+#define PAGE_SHIFT		16
+#else
+#define PAGE_SHIFT		12
+#endif
+#define PAGE_SIZE		(_AC(1,UL) << PAGE_SHIFT)
+#define PAGE_MASK		(~(PAGE_SIZE-1))
+
+/* We do define AT_SYSINFO_EHDR but don't use the gate mechanism */
+#define __HAVE_ARCH_GATE_AREA		1
+
+#ifndef __ASSEMBLY__
+
+#ifdef CONFIG_ARM64_64K_PAGES
+#include <asm/pgtable-2level-types.h>
+#else
+#include <asm/pgtable-3level-types.h>
+#endif
+
+extern void __cpu_clear_user_page(void *p, unsigned long user);
+extern void __cpu_copy_user_page(void *to, const void *from,
+				 unsigned long user);
+extern void copy_page(void *to, const void *from);
+extern void clear_page(void *to);
+
+#define clear_user_page(addr,vaddr,pg)  __cpu_clear_user_page(addr, vaddr)
+#define copy_user_page(to,from,vaddr,pg) __cpu_copy_user_page(to, from, vaddr)
+
+typedef struct page *pgtable_t;
+
+#ifdef CONFIG_HAVE_ARCH_PFN_VALID
+extern int pfn_valid(unsigned long);
+#endif
+
+#include <asm/memory.h>
+
+#endif /* !__ASSEMBLY__ */
+
+#define VM_DATA_DEFAULT_FLAGS \
+	(((current->personality & READ_IMPLIES_EXEC) ? VM_EXEC : 0) | \
+	 VM_READ | VM_WRITE | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC)
+
+#include <asm-generic/getorder.h>
+
+#endif
diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
new file mode 100644
index 0000000..f214069
--- /dev/null
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -0,0 +1,113 @@
+/*
+ * Based on arch/arm/include/asm/pgalloc.h
+ *
+ * Copyright (C) 2000-2001 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PGALLOC_H
+#define __ASM_PGALLOC_H
+
+#include <asm/pgtable-hwdef.h>
+#include <asm/processor.h>
+#include <asm/cacheflush.h>
+#include <asm/tlbflush.h>
+
+#define check_pgt_cache()		do { } while (0)
+
+#ifndef CONFIG_ARM64_64K_PAGES
+
+static inline pmd_t *pmd_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+	return (pmd_t *)get_zeroed_page(GFP_KERNEL | __GFP_REPEAT);
+}
+
+static inline void pmd_free(struct mm_struct *mm, pmd_t *pmd)
+{
+	BUG_ON((unsigned long)pmd & (PAGE_SIZE-1));
+	free_page((unsigned long)pmd);
+}
+
+static inline void pud_populate(struct mm_struct *mm, pud_t *pud, pmd_t *pmd)
+{
+	set_pud(pud, __pud(__pa(pmd) | PMD_TYPE_TABLE));
+}
+
+#endif	/* CONFIG_ARM64_64K_PAGES */
+
+extern pgd_t *pgd_alloc(struct mm_struct *mm);
+extern void pgd_free(struct mm_struct *mm, pgd_t *pgd);
+
+#define PGALLOC_GFP	(GFP_KERNEL | __GFP_NOTRACK | __GFP_REPEAT | __GFP_ZERO)
+
+static inline pte_t *
+pte_alloc_one_kernel(struct mm_struct *mm, unsigned long addr)
+{
+	return (pte_t *)__get_free_page(PGALLOC_GFP);
+}
+
+static inline pgtable_t
+pte_alloc_one(struct mm_struct *mm, unsigned long addr)
+{
+	struct page *pte;
+
+	pte = alloc_pages(PGALLOC_GFP, 0);
+	if (pte)
+		pgtable_page_ctor(pte);
+
+	return pte;
+}
+
+/*
+ * Free a PTE table.
+ */
+static inline void pte_free_kernel(struct mm_struct *mm, pte_t *pte)
+{
+	if (pte)
+		free_page((unsigned long)pte);
+}
+
+static inline void pte_free(struct mm_struct *mm, pgtable_t pte)
+{
+	pgtable_page_dtor(pte);
+	__free_page(pte);
+}
+
+static inline void __pmd_populate(pmd_t *pmdp, phys_addr_t pte,
+				  pmdval_t prot)
+{
+	set_pmd(pmdp, __pmd(pte | prot));
+}
+
+/*
+ * Populate the pmdp entry with a pointer to the pte.  This pmd is part
+ * of the mm address space.
+ */
+static inline void
+pmd_populate_kernel(struct mm_struct *mm, pmd_t *pmdp, pte_t *ptep)
+{
+	/*
+	 * The pmd must be loaded with the physical address of the PTE table
+	 */
+	__pmd_populate(pmdp, __pa(ptep), PMD_TYPE_TABLE);
+}
+
+static inline void
+pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
+{
+	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE);
+}
+#define pmd_pgtable(pmd) pmd_page(pmd)
+
+#endif
diff --git a/arch/arm64/mm/copypage.c b/arch/arm64/mm/copypage.c
new file mode 100644
index 0000000..9361662
--- /dev/null
+++ b/arch/arm64/mm/copypage.c
@@ -0,0 +1,34 @@
+/*
+ * Based on arch/arm/mm/copypage.c
+ *
+ * Copyright (C) 2002 Deep Blue Solutions Ltd, All Rights Reserved.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/mm.h>
+
+#include <asm/page.h>
+#include <asm/cacheflush.h>
+
+void __cpu_copy_user_page(void *kto, const void *kfrom, unsigned long vaddr)
+{
+	copy_page(kto, kfrom);
+	__cpuc_flush_dcache_area(kto, PAGE_SIZE);
+}
+
+void __cpu_clear_user_page(void *kaddr, unsigned long vaddr)
+{
+	clear_page(kaddr);
+}
diff --git a/arch/arm64/mm/extable.c b/arch/arm64/mm/extable.c
new file mode 100644
index 0000000..7944427
--- /dev/null
+++ b/arch/arm64/mm/extable.c
@@ -0,0 +1,17 @@
+/*
+ * Based on arch/arm/mm/extable.c
+ */
+
+#include <linux/module.h>
+#include <linux/uaccess.h>
+
+int fixup_exception(struct pt_regs *regs)
+{
+	const struct exception_table_entry *fixup;
+
+	fixup = search_exception_tables(instruction_pointer(regs));
+	if (fixup)
+		regs->pc = fixup->fixup;
+
+	return fixup != NULL;
+}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
new file mode 100644
index 0000000..1909a69
--- /dev/null
+++ b/arch/arm64/mm/fault.c
@@ -0,0 +1,534 @@
+/*
+ * Based on arch/arm/mm/fault.c
+ *
+ * Copyright (C) 1995  Linus Torvalds
+ * Copyright (C) 1995-2004 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/module.h>
+#include <linux/signal.h>
+#include <linux/mm.h>
+#include <linux/hardirq.h>
+#include <linux/init.h>
+#include <linux/kprobes.h>
+#include <linux/uaccess.h>
+#include <linux/page-flags.h>
+#include <linux/sched.h>
+#include <linux/highmem.h>
+#include <linux/perf_event.h>
+
+#include <asm/exception.h>
+#include <asm/debug-monitors.h>
+#include <asm/system_misc.h>
+#include <asm/pgtable.h>
+#include <asm/tlbflush.h>
+
+/*
+ * Dump out the page tables associated with 'addr' in mm 'mm'.
+ */
+void show_pte(struct mm_struct *mm, unsigned long addr)
+{
+	pgd_t *pgd;
+
+	if (!mm)
+		mm = &init_mm;
+
+	pr_alert("pgd = %p\n", mm->pgd);
+	pgd = pgd_offset(mm, addr);
+	pr_alert("[%08lx] *pgd=%016llx", addr, pgd_val(*pgd));
+
+	do {
+		pud_t *pud;
+		pmd_t *pmd;
+		pte_t *pte;
+
+		if (pgd_none_or_clear_bad(pgd))
+			break;
+
+		pud = pud_offset(pgd, addr);
+		if (pud_none_or_clear_bad(pud))
+			break;
+
+		pmd = pmd_offset(pud, addr);
+		printk(", *pmd=%016llx", pmd_val(*pmd));
+		if (pmd_none_or_clear_bad(pmd))
+			break;
+
+		pte = pte_offset_map(pmd, addr);
+		printk(", *pte=%016llx", pte_val(*pte));
+		pte_unmap(pte);
+	} while(0);
+
+	printk("\n");
+}
+
+/*
+ * The kernel tried to access some page that wasn't present.
+ */
+static void __do_kernel_fault(struct mm_struct *mm, unsigned long addr,
+			      unsigned int esr, struct pt_regs *regs)
+{
+	/*
+	 * Are we prepared to handle this kernel fault?
+	 */
+	if (fixup_exception(regs))
+		return;
+
+	/*
+	 * No handler, we'll have to terminate things with extreme prejudice.
+	 */
+	bust_spinlocks(1);
+	pr_alert("Unable to handle kernel %s at virtual address %08lx\n",
+		 (addr < PAGE_SIZE) ? "NULL pointer dereference" :
+		 "paging request", addr);
+
+	show_pte(mm, addr);
+	die("Oops", regs, esr);
+	bust_spinlocks(0);
+	do_exit(SIGKILL);
+}
+
+/*
+ * Something tried to access memory that isn't in our memory map. User mode
+ * accesses just cause a SIGSEGV
+ */
+static void __do_user_fault(struct task_struct *tsk, unsigned long addr,
+			    unsigned int esr, unsigned int sig, int code,
+			    struct pt_regs *regs)
+{
+	struct siginfo si;
+
+	if (show_unhandled_signals) {
+		pr_info("%s[%d]: unhandled page fault (%d) at 0x%08lx, code 0x%03x\n",
+			tsk->comm, task_pid_nr(tsk), sig, addr, esr);
+		show_pte(tsk->mm, addr);
+		show_regs(regs);
+	}
+
+	tsk->thread.fault_address = addr;
+	si.si_signo = sig;
+	si.si_errno = 0;
+	si.si_code = code;
+	si.si_addr = (void __user *)addr;
+	force_sig_info(sig, &si, tsk);
+}
+
+void do_bad_area(unsigned long addr, unsigned int esr, struct pt_regs *regs)
+{
+	struct task_struct *tsk = current;
+	struct mm_struct *mm = tsk->active_mm;
+
+	/*
+	 * If we are in kernel mode at this point, we have no context to
+	 * handle this fault with.
+	 */
+	if (user_mode(regs))
+		__do_user_fault(tsk, addr, esr, SIGSEGV, SEGV_MAPERR, regs);
+	else
+		__do_kernel_fault(mm, addr, esr, regs);
+}
+
+#define VM_FAULT_BADMAP		0x010000
+#define VM_FAULT_BADACCESS	0x020000
+
+#define ESR_WRITE		(1 << 6)
+#define ESR_LNX_EXEC		(1 << 24)
+
+/*
+ * Check that the permissions on the VMA allow for the fault which occurred.
+ * A write fault requires write permission and an instruction fault requires
+ * exec permission; otherwise we allow any permission.
+ */
+static inline bool access_error(unsigned int esr, struct vm_area_struct *vma)
+{
+	unsigned int mask = VM_READ | VM_WRITE | VM_EXEC;
+
+	if (esr & ESR_WRITE)
+		mask = VM_WRITE;
+	if (esr & ESR_LNX_EXEC)
+		mask = VM_EXEC;
+
+	return vma->vm_flags & mask ? false : true;
+}
+
+static int __do_page_fault(struct mm_struct *mm, unsigned long addr,
+			   unsigned int esr, unsigned int flags,
+			   struct task_struct *tsk)
+{
+	struct vm_area_struct *vma;
+	int fault;
+
+	vma = find_vma(mm, addr);
+	fault = VM_FAULT_BADMAP;
+	if (unlikely(!vma))
+		goto out;
+	if (unlikely(vma->vm_start > addr))
+		goto check_stack;
+
+	/*
+	 * Ok, we have a good vm_area for this memory access, so we can handle
+	 * it.
+	 */
+good_area:
+	if (access_error(esr, vma)) {
+		fault = VM_FAULT_BADACCESS;
+		goto out;
+	}
+
+	return handle_mm_fault(mm, vma, addr & PAGE_MASK, flags);
+
+check_stack:
+	if (vma->vm_flags & VM_GROWSDOWN && !expand_stack(vma, addr))
+		goto good_area;
+out:
+	return fault;
+}
+
+static int __kprobes do_page_fault(unsigned long addr, unsigned int esr,
+				   struct pt_regs *regs)
+{
+	struct task_struct *tsk;
+	struct mm_struct *mm;
+	int fault, sig, code;
+	int write = esr & ESR_WRITE;
+	unsigned int flags = FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE |
+		(write ? FAULT_FLAG_WRITE : 0);
+
+	tsk = current;
+	mm  = tsk->mm;
+
+	/* Enable interrupts if they were enabled in the parent context. */
+	if (interrupts_enabled(regs))
+		local_irq_enable();
+
+	/*
+	 * If we're in an interrupt or have no user context, we must not take
+	 * the fault.
+	 */
+	if (in_atomic() || !mm)
+		goto no_context;
+
+	/*
+	 * As per x86, we may deadlock here. However, since the kernel only
+	 * validly references user space from well defined areas of the code,
+	 * we can bug out early if this is from code which shouldn't.
+	 */
+	if (!down_read_trylock(&mm->mmap_sem)) {
+		if (!user_mode(regs) && !search_exception_tables(regs->pc))
+			goto no_context;
+retry:
+		down_read(&mm->mmap_sem);
+	} else {
+		/*
+		 * The above down_read_trylock() might have succeeded in which
+		 * case, we'll have missed the might_sleep() from down_read().
+		 */
+		might_sleep();
+#ifdef CONFIG_DEBUG_VM
+		if (!user_mode(regs) && !search_exception_tables(regs->pc))
+			goto no_context;
+#endif
+	}
+
+	fault = __do_page_fault(mm, addr, esr, flags, tsk);
+
+	/*
+	 * If we need to retry but a fatal signal is pending, handle the
+	 * signal first. We do not need to release the mmap_sem because it
+	 * would already be released in __lock_page_or_retry in mm/filemap.c.
+	 */
+	if ((fault & VM_FAULT_RETRY) && fatal_signal_pending(current))
+		return 0;
+
+	/*
+	 * Major/minor page fault accounting is only done on the initial
+	 * attempt. If we go through a retry, it is extremely likely that the
+	 * page will be found in page cache at that point.
+	 */
+
+	perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
+	if (flags & FAULT_FLAG_ALLOW_RETRY) {
+		if (fault & VM_FAULT_MAJOR) {
+			tsk->maj_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MAJ, 1, regs,
+				      addr);
+		} else {
+			tsk->min_flt++;
+			perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS_MIN, 1, regs,
+				      addr);
+		}
+		if (fault & VM_FAULT_RETRY) {
+			/*
+			 * Clear FAULT_FLAG_ALLOW_RETRY to avoid any risk of
+			 * starvation.
+			 */
+			flags &= ~FAULT_FLAG_ALLOW_RETRY;
+			goto retry;
+		}
+	}
+
+	up_read(&mm->mmap_sem);
+
+	/*
+	 * Handle the "normal" case first - VM_FAULT_MAJOR / VM_FAULT_MINOR
+	 */
+	if (likely(!(fault & (VM_FAULT_ERROR | VM_FAULT_BADMAP |
+			      VM_FAULT_BADACCESS))))
+		return 0;
+
+	if (fault & VM_FAULT_OOM) {
+		/*
+		 * We ran out of memory, call the OOM killer, and return to
+		 * userspace (which will retry the fault, or kill us if we got
+		 * oom-killed).
+		 */
+		pagefault_out_of_memory();
+		return 0;
+	}
+
+	/*
+	 * If we are in kernel mode at this point, we have no context to
+	 * handle this fault with.
+	 */
+	if (!user_mode(regs))
+		goto no_context;
+
+	if (fault & VM_FAULT_SIGBUS) {
+		/*
+		 * We had some memory, but were unable to successfully fix up
+		 * this page fault.
+		 */
+		sig = SIGBUS;
+		code = BUS_ADRERR;
+	} else {
+		/*
+		 * Something tried to access memory that isn't in our memory
+		 * map.
+		 */
+		sig = SIGSEGV;
+		code = fault == VM_FAULT_BADACCESS ?
+			SEGV_ACCERR : SEGV_MAPERR;
+	}
+
+	__do_user_fault(tsk, addr, esr, sig, code, regs);
+	return 0;
+
+no_context:
+	__do_kernel_fault(mm, addr, esr, regs);
+	return 0;
+}
+
+/*
+ * First Level Translation Fault Handler
+ *
+ * We enter here because the first level page table doesn't contain a valid
+ * entry for the address.
+ *
+ * If the address is in kernel space (>= TASK_SIZE), then we are probably
+ * faulting in the vmalloc() area.
+ *
+ * If the init_task's first level page tables contain the relevant entry, we
+ * copy it to this task.  If not, we send the process a signal, fixup the
+ * exception, or oops the kernel.
+ *
+ * NOTE! We MUST NOT take any locks for this case. We may be in an interrupt
+ * or a critical region, and should only copy the information from the master
+ * page table, nothing more.
+ */
+static int __kprobes do_translation_fault(unsigned long addr,
+					  unsigned int esr,
+					  struct pt_regs *regs)
+{
+	if (addr < TASK_SIZE)
+		return do_page_fault(addr, esr, regs);
+
+	do_bad_area(addr, esr, regs);
+	return 0;
+}
+
+/*
+ * Some section permission faults need to be handled gracefully.  They can
+ * happen due to a __{get,put}_user during an oops.
+ */
+static int do_sect_fault(unsigned long addr, unsigned int esr,
+			 struct pt_regs *regs)
+{
+	do_bad_area(addr, esr, regs);
+	return 0;
+}
+
+/*
+ * This abort handler always returns "fault".
+ */
+static int do_bad(unsigned long addr, unsigned int esr, struct pt_regs *regs)
+{
+	return 1;
+}
+
+static struct fault_info {
+	int	(*fn)(unsigned long addr, unsigned int esr, struct pt_regs *regs);
+	int	sig;
+	int	code;
+	const char *name;
+} fault_info[] = {
+	{ do_bad,		SIGBUS,  0,		"ttbr address size fault"	},
+	{ do_bad,		SIGBUS,  0,		"level 1 address size fault"	},
+	{ do_bad,		SIGBUS,  0,		"level 2 address size fault"	},
+	{ do_bad,		SIGBUS,  0,		"level 3 address size fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"input address range fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 1 translation fault"	},
+	{ do_translation_fault,	SIGSEGV, SEGV_MAPERR,	"level 2 translation fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_MAPERR,	"level 3 translation fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 access flag fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 2 access flag fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 access flag fault"	},
+	{ do_bad,		SIGBUS,  0,		"reserved permission fault"	},
+	{ do_bad,		SIGSEGV, SEGV_ACCERR,	"level 1 permission fault"	},
+	{ do_sect_fault,	SIGSEGV, SEGV_ACCERR,	"level 2 permission fault"	},
+	{ do_page_fault,	SIGSEGV, SEGV_ACCERR,	"level 3 permission fault"	},
+	{ do_bad,		SIGBUS,  0,		"synchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous external abort"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 18"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 19"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous abort (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"asynchronous parity error"	},
+	{ do_bad,		SIGBUS,  0,		"unknown 26"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 27"			},
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"synchronous parity error (translation table walk)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 32"			},
+	{ do_bad,		SIGBUS,  BUS_ADRALN,	"alignment fault"		},
+	{ do_bad,		SIGBUS,  0,		"debug event"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 35"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 36"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 37"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 38"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 39"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 40"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 41"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 42"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 43"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 44"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 45"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 46"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 47"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 48"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 49"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 50"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 51"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (lockdown abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 53"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 54"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 55"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 56"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 57"			},
+	{ do_bad,		SIGBUS,  0,		"implementation fault (coprocessor abort)" },
+	{ do_bad,		SIGBUS,  0,		"unknown 59"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 60"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 61"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 62"			},
+	{ do_bad,		SIGBUS,  0,		"unknown 63"			},
+};
+
+/*
+ * Dispatch a data abort to the relevant handler.
+ */
+asmlinkage void __exception do_mem_abort(unsigned long addr, unsigned int esr,
+					 struct pt_regs *regs)
+{
+	const struct fault_info *inf = fault_info + (esr & 63);
+	struct siginfo info;
+
+	if (!inf->fn(addr, esr, regs))
+		return;
+
+	pr_alert("Unhandled fault: %s (0x%08x) at 0x%016lx\n",
+		 inf->name, esr, addr);
+
+	info.si_signo = inf->sig;
+	info.si_errno = 0;
+	info.si_code  = inf->code;
+	info.si_addr  = (void __user *)addr;
+	arm64_notify_die("", regs, &info, esr);
+}
+
+/*
+ * Handle stack alignment exceptions.
+ */
+asmlinkage void __exception do_sp_pc_abort(unsigned long addr,
+					   unsigned int esr,
+					   struct pt_regs *regs)
+{
+	struct siginfo info;
+
+	info.si_signo = SIGBUS;
+	info.si_errno = 0;
+	info.si_code  = BUS_ADRALN;
+	info.si_addr  = (void __user *)addr;
+	arm64_notify_die("", regs, &info, esr);
+}
+
+static struct fault_info debug_fault_info[] = {
+	{ do_bad,	SIGTRAP,	TRAP_HWBKPT,	"hardware breakpoint"	},
+	{ do_bad,	SIGTRAP,	TRAP_HWBKPT,	"hardware single-step"	},
+	{ do_bad,	SIGTRAP,	TRAP_HWBKPT,	"hardware watchpoint"	},
+	{ do_bad,	SIGBUS,		0,		"unknown 3"		},
+	{ do_bad,	SIGTRAP,	TRAP_BRKPT,	"aarch32 BKPT"		},
+	{ do_bad,	SIGTRAP,	0,		"aarch32 vector catch"	},
+	{ do_bad,	SIGTRAP,	TRAP_BRKPT,	"aarch64 BRK"		},
+	{ do_bad,	SIGBUS,		0,		"unknown 7"		},
+};
+
+void __init hook_debug_fault_code(int nr,
+				  int (*fn)(unsigned long, unsigned int, struct pt_regs *),
+				  int sig, int code, const char *name)
+{
+	BUG_ON(nr < 0 || nr >= ARRAY_SIZE(debug_fault_info));
+
+	debug_fault_info[nr].fn		= fn;
+	debug_fault_info[nr].sig	= sig;
+	debug_fault_info[nr].code	= code;
+	debug_fault_info[nr].name	= name;
+}
+
+asmlinkage int __exception do_debug_exception(unsigned long addr,
+					      unsigned int esr,
+					      struct pt_regs *regs)
+{
+	const struct fault_info *inf = debug_fault_info + DBG_ESR_EVT(esr);
+	struct siginfo info;
+
+	if (!inf->fn(addr, esr, regs))
+		return 1;
+
+	pr_alert("Unhandled debug exception: %s (0x%08x) at 0x%016lx\n",
+		 inf->name, esr, addr);
+
+	info.si_signo = inf->sig;
+	info.si_errno = 0;
+	info.si_code  = inf->code;
+	info.si_addr  = (void __user *)addr;
+	arm64_notify_die("", regs, &info, esr);
+
+	return 0;
+}
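
Note on the dispatch in fault.c above: do_mem_abort() uses the low six bits
of the ESR as an index into fault_info[], and a handler that returns 0 means
the fault was dealt with. The following is only a user-space sketch of that
pattern, not kernel code. Every demo_* name is hypothetical, and the table is
cut down to the three translation-fault slots (5, 6 and 7 in the table above).

```c
#include <stddef.h>

#define DEMO_FSC_MASK 63u	/* low six bits of the ESR select the entry */

/* Hypothetical stand-in for the patch's struct fault_info. */
struct demo_fault_entry {
	int (*fn)(unsigned long addr, unsigned int esr);
	const char *name;
};

/* Pretend handler: returning 0 means "handled", as in the patch. */
static int demo_fixup(unsigned long addr, unsigned int esr)
{
	(void)addr;
	(void)esr;
	return 0;
}

/* Slots not listed here stay zeroed and are treated as unhandled. */
static const struct demo_fault_entry demo_table[DEMO_FSC_MASK + 1] = {
	[5] = { demo_fixup, "level 1 translation fault" },
	[6] = { demo_fixup, "level 2 translation fault" },
	[7] = { demo_fixup, "level 3 translation fault" },
};

/*
 * Returns 0 when the selected handler dealt with the fault and 1 when the
 * caller would have to report it. *name receives the entry's description.
 */
static int demo_dispatch(unsigned long addr, unsigned int esr,
			 const char **name)
{
	const struct demo_fault_entry *inf = &demo_table[esr & DEMO_FSC_MASK];

	*name = inf->name ? inf->name : "unknown";
	if (inf->fn && !inf->fn(addr, esr))
		return 0;
	return 1;
}
```

In the patch itself every slot is populated (do_bad fills the unhandled
ones), so the NULL check in the sketch has no counterpart there.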
diff --git a/arch/arm64/mm/mm.h b/arch/arm64/mm/mm.h
new file mode 100644
index 0000000..c84f68b
--- /dev/null
+++ b/arch/arm64/mm/mm.h
@@ -0,0 +1,2 @@
+extern void __flush_dcache_page(struct address_space *mapping, struct page *page);
+extern void __init bootmem_init(void);
diff --git a/arch/arm64/mm/mmap.c b/arch/arm64/mm/mmap.c
new file mode 100644
index 0000000..7c7be78
--- /dev/null
+++ b/arch/arm64/mm/mmap.c
@@ -0,0 +1,144 @@
+/*
+ * Based on arch/arm/mm/mmap.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/elf.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/mman.h>
+#include <linux/export.h>
+#include <linux/shm.h>
+#include <linux/sched.h>
+#include <linux/io.h>
+#include <linux/personality.h>
+#include <linux/random.h>
+
+#include <asm/cputype.h>
+
+/*
+ * Leave enough space between the mmap area and the stack to honour ulimit in
+ * the face of randomisation.
+ */
+#define MIN_GAP (SZ_128M + ((STACK_RND_MASK << PAGE_SHIFT) + 1))
+#define MAX_GAP	(STACK_TOP/6*5)
+
+static int mmap_is_legacy(void)
+{
+	if (current->personality & ADDR_COMPAT_LAYOUT)
+		return 1;
+
+	if (rlimit(RLIMIT_STACK) == RLIM_INFINITY)
+		return 1;
+
+	return sysctl_legacy_va_layout;
+}
+
+/*
+ * Since get_random_int() returns the same value within a 1 jiffy window, we
+ * will almost always get the same randomisation for the stack and mmap
+ * region. This will mean the relative distance between stack and mmap will be
+ * the same.
+ *
+ * To avoid this we can shift the randomness by 1 bit.
+ */
+static unsigned long mmap_rnd(void)
+{
+	unsigned long rnd = 0;
+
+	if (current->flags & PF_RANDOMIZE)
+		rnd = (long)get_random_int() & (STACK_RND_MASK >> 1);
+
+	return rnd << (PAGE_SHIFT + 1);
+}
+
+static unsigned long mmap_base(void)
+{
+	unsigned long gap = rlimit(RLIMIT_STACK);
+
+	if (gap < MIN_GAP)
+		gap = MIN_GAP;
+	else if (gap > MAX_GAP)
+		gap = MAX_GAP;
+
+	return PAGE_ALIGN(STACK_TOP - gap - mmap_rnd());
+}
+
+/*
+ * This function, called very early during the creation of a new process VM
+ * image, sets up which VM layout function to use:
+ */
+void arch_pick_mmap_layout(struct mm_struct *mm)
+{
+	/*
+	 * Fall back to the standard layout if the personality bit is set, or
+	 * if the expected stack growth is unlimited:
+	 */
+	if (mmap_is_legacy()) {
+		mm->mmap_base = TASK_UNMAPPED_BASE;
+		mm->get_unmapped_area = arch_get_unmapped_area;
+		mm->unmap_area = arch_unmap_area;
+	} else {
+		mm->mmap_base = mmap_base();
+		mm->get_unmapped_area = arch_get_unmapped_area_topdown;
+		mm->unmap_area = arch_unmap_area_topdown;
+	}
+}
+EXPORT_SYMBOL_GPL(arch_pick_mmap_layout);
+
+
+/*
+ * You really shouldn't be using read() or write() on /dev/mem.  This might go
+ * away in the future.
+ */
+int valid_phys_addr_range(unsigned long addr, size_t size)
+{
+	if (addr < PHYS_OFFSET)
+		return 0;
+	if (addr + size > __pa(high_memory - 1) + 1)
+		return 0;
+
+	return 1;
+}
+
+/*
+ * Do not allow /dev/mem mappings beyond the supported physical range.
+ */
+int valid_mmap_phys_addr_range(unsigned long pfn, size_t size)
+{
+	return !(((pfn << PAGE_SHIFT) + size) & ~PHYS_MASK);
+}
+
+#ifdef CONFIG_STRICT_DEVMEM
+
+#include <linux/ioport.h>
+
+/*
+ * devmem_is_allowed() checks to see if /dev/mem access to a certain address
+ * is valid. The argument is a physical page number.  We mimic x86 here by
+ * disallowing access to system RAM as well as device-exclusive MMIO regions.
+ * This effectively disables read()/write() on /dev/mem.
+ */
+int devmem_is_allowed(unsigned long pfn)
+{
+	if (iomem_is_exclusive(pfn << PAGE_SHIFT))
+		return 0;
+	if (!page_is_ram(pfn))
+		return 1;
+	return 0;
+}
+
+#endif
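
Note on mmap_base() in mmap.c above: the stack rlimit is clamped to the range
[MIN_GAP, MAX_GAP], then the gap and the random offset are subtracted from
STACK_TOP and the result is page aligned. A user-space sketch of that
arithmetic follows. The demo_* names and the 4KB page size are assumptions
made for illustration; the real constants depend on the kernel configuration.

```c
#include <stdint.h>

#define DEMO_PAGE_SIZE UINT64_C(4096)	/* assumed page size */

/* Clamp the requested gap to [min_gap, max_gap], as mmap_base() does. */
static uint64_t demo_clamp_gap(uint64_t gap, uint64_t min_gap,
			       uint64_t max_gap)
{
	if (gap < min_gap)
		return min_gap;
	if (gap > max_gap)
		return max_gap;
	return gap;
}

/* Round up to the next page boundary, like the kernel's PAGE_ALIGN(). */
static uint64_t demo_page_align(uint64_t x)
{
	return (x + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);
}

/*
 * stack_top, min_gap and rnd are passed in rather than taken from
 * STACK_TOP, MIN_GAP and mmap_rnd(), so the sketch runs outside the kernel.
 */
static uint64_t demo_mmap_base(uint64_t stack_top, uint64_t rlimit_stack,
			       uint64_t min_gap, uint64_t rnd)
{
	uint64_t max_gap = stack_top / 6 * 5;	/* same ratio as MAX_GAP */
	uint64_t gap = demo_clamp_gap(rlimit_stack, min_gap, max_gap);

	return demo_page_align(stack_top - gap - rnd);
}
```

With a 39-bit stack top, an 8MB stack limit, a 128MB minimum gap and no
randomisation, the gap is raised to 128MB and the base lands 128MB below the
stack top.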
diff --git a/arch/arm64/mm/pgd.c b/arch/arm64/mm/pgd.c
new file mode 100644
index 0000000..7a7b0e9
--- /dev/null
+++ b/arch/arm64/mm/pgd.c
@@ -0,0 +1,49 @@
+/*
+ * PGD allocation/freeing
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/mm.h>
+#include <linux/gfp.h>
+#include <linux/highmem.h>
+#include <linux/slab.h>
+
+#include <asm/pgalloc.h>
+#include <asm/page.h>
+#include <asm/tlbflush.h>
+
+#include "mm.h"
+
+#define PGD_ORDER	0
+
+pgd_t *pgd_alloc(struct mm_struct *mm)
+{
+	pgd_t *new_pgd;
+
+	new_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, PGD_ORDER);
+	if (!new_pgd)
+		return NULL;
+
+	memset(new_pgd, 0, PAGE_SIZE << PGD_ORDER);
+
+	return new_pgd;
+}
+
+void pgd_free(struct mm_struct *mm, pgd_t *pgd)
+{
+	free_pages((unsigned long)pgd, PGD_ORDER);
+}



* [PATCH v2 07/31] arm64: Process management
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (5 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 06/31] arm64: MMU fault handling and page table management Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 23:50   ` Olof Johansson
                     ` (2 more replies)
  2012-08-14 17:52 ` [PATCH v2 08/31] arm64: CPU support Catalin Marinas
                   ` (24 subsequent siblings)
  31 siblings, 3 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

The patch adds support for thread creation and context switching. The
context switching CPU specific code is introduced with the CPU support
patch (part of the arch/arm64/mm/proc.S file). AArch64 supports
ASID-tagged TLBs and the ASID can be either 8 or 16 bits wide (detectable
via the ID_AA64MMFR0_EL1 register).

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/mmu_context.h |  152 +++++++++++++
 arch/arm64/include/asm/thread_info.h |  124 ++++++++++
 arch/arm64/kernel/process.c          |  416 ++++++++++++++++++++++++++++++++++
 arch/arm64/mm/context.c              |  159 +++++++++++++
 4 files changed, 851 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/mmu_context.h
 create mode 100644 arch/arm64/include/asm/thread_info.h
 create mode 100644 arch/arm64/kernel/process.c
 create mode 100644 arch/arm64/mm/context.c
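
Note on the ASID handling: check_and_switch_context() in mmu_context.h
decides whether an mm's ASID is still current by XOR-ing mm->context.id with
cpu_last_asid and shifting out the low MAX_ASID_BITS. The sketch below is a
user-space illustration of that test only. It assumes the bits above
MAX_ASID_BITS act as a generation counter, and asid_generation_matches() is a
hypothetical helper that does not exist in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_ASID_BITS 16	/* same value as in mmu_context.h */

/*
 * Hypothetical helper: true when context_id and last_asid agree in every
 * bit above MAX_ASID_BITS, i.e. they belong to the same generation. The
 * low MAX_ASID_BITS bits (the hardware ASID) are ignored.
 */
static bool asid_generation_matches(uint32_t context_id, uint32_t last_asid)
{
	return ((context_id ^ last_asid) >> MAX_ASID_BITS) == 0;
}
```

A context id of 0 compared against any non-zero generation fails the test,
which in the patch leads to a new ASID being allocated via __new_context().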

diff --git a/arch/arm64/include/asm/mmu_context.h b/arch/arm64/include/asm/mmu_context.h
new file mode 100644
index 0000000..f68465d
--- /dev/null
+++ b/arch/arm64/include/asm/mmu_context.h
@@ -0,0 +1,152 @@
+/*
+ * Based on arch/arm/include/asm/mmu_context.h
+ *
+ * Copyright (C) 1996 Russell King.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_MMU_CONTEXT_H
+#define __ASM_MMU_CONTEXT_H
+
+#include <linux/compiler.h>
+#include <linux/sched.h>
+
+#include <asm/cacheflush.h>
+#include <asm/proc-fns.h>
+#include <asm-generic/mm_hooks.h>
+#include <asm/cputype.h>
+#include <asm/pgtable.h>
+
+#define MAX_ASID_BITS	16
+
+extern unsigned int cpu_last_asid;
+
+void __init_new_context(struct task_struct *tsk, struct mm_struct *mm);
+void __new_context(struct mm_struct *mm);
+
+/*
+ * Set TTBR0 to empty_zero_page. No translations will be possible via TTBR0.
+ */
+static inline void cpu_set_reserved_ttbr0(void)
+{
+	unsigned long ttbr = page_to_phys(empty_zero_page);
+
+	asm(
+	"	msr	ttbr0_el1, %0			// set TTBR0\n"
+	"	isb"
+	:
+	: "r" (ttbr));
+}
+
+static inline void switch_new_context(struct mm_struct *mm)
+{
+	unsigned long flags;
+
+	__new_context(mm);
+
+	local_irq_save(flags);
+	cpu_switch_mm(mm->pgd, mm);
+	local_irq_restore(flags);
+}
+
+static inline void check_and_switch_context(struct mm_struct *mm,
+					    struct task_struct *tsk)
+{
+	/*
+	 * Required during context switch to avoid speculative page table
+	 * walking with the wrong TTBR.
+	 */
+	cpu_set_reserved_ttbr0();
+
+	if (!((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS))
+		/*
+		 * The ASID is from the current generation, just switch to the
+		 * new pgd. This condition is only true for calls from
+		 * context_switch() and interrupts are already disabled.
+		 */
+		cpu_switch_mm(mm->pgd, mm);
+	else if (irqs_disabled())
+		/*
+		 * Defer the new ASID allocation until after the context
+		 * switch critical region since __new_context() cannot be
+		 * called with interrupts disabled.
+		 */
+		set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
+	else
+		/*
+		 * That is a direct call to switch_mm() or activate_mm() with
+		 * interrupts enabled and a new context.
+		 */
+		switch_new_context(mm);
+}
+
+#define init_new_context(tsk,mm)	(__init_new_context(tsk,mm),0)
+#define destroy_context(mm)		do { } while(0)
+
+#define finish_arch_post_lock_switch \
+	finish_arch_post_lock_switch
+static inline void finish_arch_post_lock_switch(void)
+{
+	if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
+		struct mm_struct *mm = current->mm;
+		unsigned long flags;
+
+		__new_context(mm);
+
+		local_irq_save(flags);
+		cpu_switch_mm(mm->pgd, mm);
+		local_irq_restore(flags);
+	}
+}
+
+/*
+ * This is called when "tsk" is about to enter lazy TLB mode.
+ *
+ * mm:  describes the currently active mm context
+ * tsk: task which is entering lazy tlb
+ * cpu: cpu number which is entering lazy tlb
+ *
+ * tsk->mm will be NULL
+ */
+static inline void
+enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk)
+{
+}
+
+/*
+ * This is the actual mm switch as far as the scheduler
+ * is concerned.  No registers are touched.  We avoid
+ * calling the CPU specific function when the mm hasn't
+ * actually changed.
+ */
+static inline void
+switch_mm(struct mm_struct *prev, struct mm_struct *next,
+	  struct task_struct *tsk)
+{
+	unsigned int cpu = smp_processor_id();
+
+#ifdef CONFIG_SMP
+	/* check for possible thread migration */
+	if (!cpumask_empty(mm_cpumask(next)) &&
+	    !cpumask_test_cpu(cpu, mm_cpumask(next)))
+		__flush_icache_all();
+#endif
+	if (!cpumask_test_and_set_cpu(cpu, mm_cpumask(next)) || prev != next)
+		check_and_switch_context(next, tsk);
+}
+
+#define deactivate_mm(tsk,mm)	do { } while (0)
+#define activate_mm(prev,next)	switch_mm(prev, next, NULL)
+
+#endif
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
new file mode 100644
index 0000000..4f909a3
--- /dev/null
+++ b/arch/arm64/include/asm/thread_info.h
@@ -0,0 +1,124 @@
+/*
+ * Based on arch/arm/include/asm/thread_info.h
+ *
+ * Copyright (C) 2002 Russell King.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_THREAD_INFO_H
+#define __ASM_THREAD_INFO_H
+
+#ifdef __KERNEL__
+
+#include <linux/compiler.h>
+
+#define THREAD_SIZE_ORDER	1
+#define THREAD_SIZE		8192
+#define THREAD_START_SP		(THREAD_SIZE - 16)
+
+#ifndef __ASSEMBLY__
+
+struct task_struct;
+struct exec_domain;
+
+#include <asm/types.h>
+
+typedef unsigned long mm_segment_t;
+
+/*
+ * low level task data that entry.S needs immediate access to.
+ * __switch_to() assumes cpu_context follows immediately after cpu_domain.
+ */
+struct thread_info {
+	unsigned long		flags;		/* low level flags */
+	mm_segment_t		addr_limit;	/* address limit */
+	struct task_struct	*task;		/* main task structure */
+	struct exec_domain	*exec_domain;	/* execution domain */
+	struct restart_block	restart_block;
+	int			preempt_count;	/* 0 => preemptable, <0 => bug */
+	int			cpu;		/* cpu */
+};
+
+#define INIT_THREAD_INFO(tsk)						\
+{									\
+	.task		= &tsk,						\
+	.exec_domain	= &default_exec_domain,				\
+	.flags		= 0,						\
+	.preempt_count	= INIT_PREEMPT_COUNT,				\
+	.addr_limit	= KERNEL_DS,					\
+	.restart_block	= {						\
+		.fn	= do_no_restart_syscall,			\
+	},								\
+}
+
+#define init_thread_info	(init_thread_union.thread_info)
+#define init_stack		(init_thread_union.stack)
+
+/*
+ * how to get the thread information struct from C
+ */
+static inline struct thread_info *current_thread_info(void) __attribute_const__;
+
+static inline struct thread_info *current_thread_info(void)
+{
+	register unsigned long sp asm ("sp");
+	return (struct thread_info *)(sp & ~(THREAD_SIZE - 1));
+}
+
+#define thread_saved_pc(tsk)	\
+	((unsigned long)(tsk->thread.cpu_context.pc))
+#define thread_saved_sp(tsk)	\
+	((unsigned long)(tsk->thread.cpu_context.sp))
+#define thread_saved_fp(tsk)	\
+	((unsigned long)(tsk->thread.cpu_context.fp))
+
+#endif
+
+/*
+ * We use bit 30 of the preempt_count to indicate that kernel
+ * preemption is occurring.  See <asm/hardirq.h>.
+ */
+#define PREEMPT_ACTIVE	0x40000000
+
+/*
+ * thread information flags:
+ *  TIF_SYSCALL_TRACE	- syscall trace active
+ *  TIF_SIGPENDING	- signal pending
+ *  TIF_NEED_RESCHED	- rescheduling necessary
+ *  TIF_NOTIFY_RESUME	- callback before returning to user
+ *  TIF_USEDFPU		- FPU was used by this task this quantum (SMP)
+ *  TIF_POLLING_NRFLAG	- true if poll_idle() is polling TIF_NEED_RESCHED
+ */
+#define TIF_SIGPENDING		0
+#define TIF_NEED_RESCHED	1
+#define TIF_NOTIFY_RESUME	2	/* callback before returning to user */
+#define TIF_SYSCALL_TRACE	8
+#define TIF_POLLING_NRFLAG	16
+#define TIF_MEMDIE		18	/* is terminating due to OOM killer */
+#define TIF_FREEZE		19
+#define TIF_RESTORE_SIGMASK	20
+#define TIF_SINGLESTEP		21
+#define TIF_32BIT		22	/* 32bit process */
+#define TIF_SWITCH_MM		23	/* deferred switch_mm */
+
+#define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
+#define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
+#define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
+#define _TIF_32BIT		(1 << TIF_32BIT)
+
+#define _TIF_WORK_MASK		(_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+				 _TIF_NOTIFY_RESUME)
+
+#endif /* __KERNEL__ */
+#endif /* __ASM_THREAD_INFO_H */
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
new file mode 100644
index 0000000..c4a4e1c
--- /dev/null
+++ b/arch/arm64/kernel/process.c
@@ -0,0 +1,416 @@
+/*
+ * Based on arch/arm/kernel/process.c
+ *
+ * Original Copyright (C) 1995  Linus Torvalds
+ * Copyright (C) 1996-2000 Russell King - Converted to ARM.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <stdarg.h>
+
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/stddef.h>
+#include <linux/unistd.h>
+#include <linux/user.h>
+#include <linux/delay.h>
+#include <linux/reboot.h>
+#include <linux/interrupt.h>
+#include <linux/kallsyms.h>
+#include <linux/init.h>
+#include <linux/cpu.h>
+#include <linux/elfcore.h>
+#include <linux/pm.h>
+#include <linux/tick.h>
+#include <linux/utsname.h>
+#include <linux/uaccess.h>
+#include <linux/random.h>
+#include <linux/hw_breakpoint.h>
+#include <linux/personality.h>
+#include <linux/notifier.h>
+
+#include <asm/cacheflush.h>
+#include <asm/processor.h>
+#include <asm/stacktrace.h>
+#include <asm/fpsimd.h>
+
+extern void setup_mm_for_reboot(void);
+
+static void setup_restart(void)
+{
+	/*
+	 * Tell the mm system that we are going to reboot -
+	 * we may need it to insert some 1:1 mappings so that
+	 * soft boot works.
+	 */
+	setup_mm_for_reboot();
+
+	/* Clean and invalidate caches */
+	flush_cache_all();
+
+	/* Turn off caching */
+	cpu_proc_fin();
+
+	/* Push out any further dirty data, and ensure cache is empty */
+	flush_cache_all();
+}
+
+void soft_restart(unsigned long addr)
+{
+	setup_restart();
+	cpu_reset(addr);
+}
+
+/*
+ * Function pointers to optional machine specific functions
+ */
+void (*pm_power_off)(void);
+EXPORT_SYMBOL(pm_power_off);
+
+void (*pm_restart)(const char *cmd);
+EXPORT_SYMBOL_GPL(pm_restart);
+
+
+/*
+ * This is our default idle handler.
+ */
+static void default_idle(void)
+{
+	/*
+	 * This should do all the clock switching and wait for interrupt
+	 * tricks
+	 */
+	cpu_do_idle();
+	local_irq_enable();
+}
+
+void (*pm_idle)(void) = default_idle;
+EXPORT_SYMBOL(pm_idle);
+
+/*
+ * The idle thread has rather strange semantics for calling pm_idle,
+ * but this is what x86 does and we need to do the same, so that
+ * things like cpuidle get called in the same way.  The only difference
+ * is that we always respect 'hlt_counter' to prevent low power idle.
+ */
+void cpu_idle(void)
+{
+	local_fiq_enable();
+
+	/* endless idle loop with no priority at all */
+	while (1) {
+		tick_nohz_idle_enter();
+		rcu_idle_enter();
+		while (!need_resched()) {
+			/*
+			 * We need to disable interrupts here to ensure
+			 * we don't miss a wakeup call.
+			 */
+			local_irq_disable();
+			if (!need_resched()) {
+				stop_critical_timings();
+				pm_idle();
+				start_critical_timings();
+				/*
+				 * pm_idle functions should always return
+				 * with IRQs enabled.
+				 */
+				WARN_ON(irqs_disabled());
+			} else {
+				local_irq_enable();
+			}
+		}
+		rcu_idle_exit();
+		tick_nohz_idle_exit();
+		preempt_enable_no_resched();
+		schedule();
+		preempt_disable();
+	}
+}
+
+void machine_shutdown(void)
+{
+#ifdef CONFIG_SMP
+	smp_send_stop();
+#endif
+}
+
+void machine_halt(void)
+{
+	machine_shutdown();
+	while (1);
+}
+
+void machine_power_off(void)
+{
+	machine_shutdown();
+	if (pm_power_off)
+		pm_power_off();
+}
+
+void machine_restart(char *cmd)
+{
+	machine_shutdown();
+
+	/* Disable interrupts first */
+	local_irq_disable();
+	local_fiq_disable();
+
+	/* Now call the architecture specific reboot code. */
+	if (pm_restart)
+		pm_restart(cmd);
+
+	/*
+	 * Whoops - the architecture was unable to reboot.
+	 * Tell the user!
+	 */
+	mdelay(1000);
+	printk("Reboot failed -- System halted\n");
+	while (1);
+}
+
+void __show_regs(struct pt_regs *regs)
+{
+	int i;
+
+	printk("CPU: %d    %s  (%s %.*s)\n",
+		raw_smp_processor_id(), print_tainted(),
+		init_utsname()->release,
+		(int)strcspn(init_utsname()->version, " "),
+		init_utsname()->version);
+	print_symbol("PC is at %s\n", instruction_pointer(regs));
+	print_symbol("LR is at %s\n", regs->regs[30]);
+	printk("pc : [<%016llx>] lr : [<%016llx>] pstate: %08llx\n",
+	       regs->pc, regs->regs[30], regs->pstate);
+	printk("sp : %016llx\n", regs->sp);
+	for (i = 29; i >= 0; i--) {
+		printk("x%-2d: %016llx ", i, regs->regs[i]);
+		if (i % 2 == 0)
+			printk("\n");
+	}
+	printk("\n");
+}
+
+void show_regs(struct pt_regs * regs)
+{
+	printk("\n");
+	printk("Pid: %d, comm: %20s\n", task_pid_nr(current), current->comm);
+	__show_regs(regs);
+}
+
+/*
+ * Free current thread data structures etc..
+ */
+void exit_thread(void)
+{
+}
+
+void flush_thread(void)
+{
+	fpsimd_flush_thread();
+	flush_ptrace_hw_breakpoint(current);
+}
+
+void release_thread(struct task_struct *dead_task)
+{
+}
+
+int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
+{
+	fpsimd_save_state(&current->thread.fpsimd_state);
+	*dst = *src;
+	return 0;
+}
+
+asmlinkage void ret_from_fork(void) asm("ret_from_fork");
+
+int copy_thread(unsigned long clone_flags, unsigned long stack_start,
+		unsigned long stk_sz, struct task_struct *p,
+		struct pt_regs *regs)
+{
+	struct pt_regs *childregs = task_pt_regs(p);
+	unsigned long tls = p->thread.tp_value;
+
+	*childregs = *regs;
+	childregs->regs[0] = 0;
+
+#ifdef CONFIG_AARCH32_EMULATION
+	if (test_ti_thread_flag(task_thread_info(p), TIF_32BIT))
+		childregs->compat_sp = stack_start;
+	else
+#endif
+	{
+		/*
+		 * Read the current TLS pointer from tpidr_el0 as it may be
+		 * out-of-sync with the saved value.
+		 */
+		asm("mrs %0, tpidr_el0" : "=r" (tls));
+		childregs->sp = stack_start;
+	}
+
+	memset(&p->thread.cpu_context, 0, sizeof(struct cpu_context));
+	p->thread.cpu_context.sp = (unsigned long)childregs;
+	p->thread.cpu_context.pc = (unsigned long)ret_from_fork;
+
+	/* If a TLS pointer was passed to clone, use that for the new thread. */
+	if (clone_flags & CLONE_SETTLS)
+		tls = regs->regs[3];
+	p->thread.tp_value = tls;
+
+	ptrace_hw_copy_thread(p);
+
+	return 0;
+}
+
+static void tls_thread_switch(struct task_struct *next)
+{
+	unsigned long tpidr, tpidrro;
+
+	if (!test_thread_flag(TIF_32BIT)) {
+		asm("mrs %0, tpidr_el0" : "=r" (tpidr));
+		current->thread.tp_value = tpidr;
+	}
+
+	if (test_ti_thread_flag(task_thread_info(next), TIF_32BIT)) {
+		tpidr = 0;
+		tpidrro = next->thread.tp_value;
+	} else {
+		tpidr = next->thread.tp_value;
+		tpidrro = 0;
+	}
+
+	asm(
+	"	msr	tpidr_el0, %0\n"
+	"	msr	tpidrro_el0, %1"
+	: : "r" (tpidr), "r" (tpidrro));
+}
+
+/*
+ * Thread switching.
+ */
+struct task_struct *__switch_to(struct task_struct *prev,
+				struct task_struct *next)
+{
+	struct task_struct *last;
+
+	fpsimd_thread_switch(next);
+	tls_thread_switch(next);
+	hw_breakpoint_thread_switch(next);
+
+	/* the actual thread switch */
+	last = cpu_switch_to(prev, next);
+
+	return last;
+}
+
+/*
+ * Fill in the task's elfregs structure for a core dump.
+ */
+int dump_task_regs(struct task_struct *t, elf_gregset_t *elfregs)
+{
+	elf_core_copy_regs(elfregs, task_pt_regs(t));
+	return 1;
+}
+
+/*
+ * fill in the fpe structure for a core dump...
+ */
+int dump_fpu (struct pt_regs *regs, struct user_fp *fp)
+{
+	return 0;
+}
+EXPORT_SYMBOL(dump_fpu);
+
+/*
+ * Shuffle the argument into the correct register before calling the
+ * thread function.  x1 is the thread argument, x2 is the pointer to
+ * the thread function, and x3 points to the exit function.
+ */
+extern void kernel_thread_helper(void);
+asm(	".section .text\n"
+"	.align\n"
+"	.type	kernel_thread_helper, #function\n"
+"kernel_thread_helper:\n"
+"	mov	x0, x1\n"
+"	mov	x30, x3\n"
+"	br	x2\n"
+"	.size	kernel_thread_helper, . - kernel_thread_helper\n"
+"	.previous");
+
+#define kernel_thread_exit	do_exit
+
+/*
+ * Create a kernel thread.
+ */
+pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
+{
+	struct pt_regs regs;
+
+	memset(&regs, 0, sizeof(regs));
+
+	regs.regs[1] = (unsigned long)arg;
+	regs.regs[2] = (unsigned long)fn;
+	regs.regs[3] = (unsigned long)kernel_thread_exit;
+	regs.pc = (unsigned long)kernel_thread_helper;
+	regs.pstate = PSR_MODE_EL1h;
+
+	return do_fork(flags|CLONE_VM|CLONE_UNTRACED, 0, &regs, 0, NULL, NULL);
+}
+EXPORT_SYMBOL(kernel_thread);
+
+unsigned long get_wchan(struct task_struct *p)
+{
+	struct stackframe frame;
+	int count = 0;
+	if (!p || p == current || p->state == TASK_RUNNING)
+		return 0;
+
+	frame.fp = thread_saved_fp(p);
+	frame.sp = thread_saved_sp(p);
+	frame.pc = thread_saved_pc(p);
+	do {
+		int ret = unwind_frame(&frame);
+		if (ret < 0)
+			return 0;
+		if (!in_sched_functions(frame.pc))
+			return frame.pc;
+	} while (count ++ < 16);
+	return 0;
+}
+
+unsigned long arch_align_stack(unsigned long sp)
+{
+	if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space)
+		sp -= get_random_int() & ~PAGE_MASK;
+	return sp & ~0xf;
+}
+
+static unsigned long randomize_base(unsigned long base)
+{
+	unsigned long range_end = base + (STACK_RND_MASK << PAGE_SHIFT) + 1;
+	return randomize_range(base, range_end, 0) ? : base;
+}
+
+unsigned long arch_randomize_brk(struct mm_struct *mm)
+{
+	return randomize_base(mm->brk);
+}
+
+unsigned long randomize_et_dyn(unsigned long base)
+{
+	return randomize_base(base);
+}
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
new file mode 100644
index 0000000..e06f47a
--- /dev/null
+++ b/arch/arm64/mm/context.c
@@ -0,0 +1,159 @@
+/*
+ * Based on arch/arm/mm/context.c
+ *
+ * Copyright (C) 2002-2003 Deep Blue Solutions Ltd, all rights reserved.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/percpu.h>
+
+#include <asm/mmu_context.h>
+#include <asm/tlbflush.h>
+#include <asm/cachetype.h>
+
+#define asid_bits(reg) \
+	(((read_cpuid(ID_AA64MMFR0_EL1) & 0xf0) >> 2) + 8)
+
+#define ASID_FIRST_VERSION	(1 << MAX_ASID_BITS)
+
+static DEFINE_SPINLOCK(cpu_asid_lock);
+unsigned int cpu_last_asid = ASID_FIRST_VERSION;
+
+/*
+ * We fork()ed a process, and we need a new context for the child to run in.
+ */
+void __init_new_context(struct task_struct *tsk, struct mm_struct *mm)
+{
+	mm->context.id = 0;
+	spin_lock_init(&mm->context.id_lock);
+}
+
+static void flush_context(void)
+{
+	/* set the reserved TTBR0 before flushing the TLB */
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+	if (icache_is_aivivt())
+		__flush_icache_all();
+}
+
+#ifdef CONFIG_SMP
+
+static void set_mm_context(struct mm_struct *mm, unsigned int asid)
+{
+	unsigned long flags;
+
+	/*
+	 * Locking needed for multi-threaded applications where the same
+	 * mm->context.id could be set from different CPUs during the
+	 * broadcast. This function is also called via IPI so the
+	 * mm->context.id_lock has to be IRQ-safe.
+	 */
+	spin_lock_irqsave(&mm->context.id_lock, flags);
+	if (likely((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS)) {
+		/*
+		 * Old version of ASID found. Set the new one and reset
+		 * mm_cpumask(mm).
+		 */
+		mm->context.id = asid;
+		cpumask_clear(mm_cpumask(mm));
+	}
+	spin_unlock_irqrestore(&mm->context.id_lock, flags);
+
+	/*
+	 * Set the mm_cpumask(mm) bit for the current CPU.
+	 */
+	cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
+}
+
+/*
+ * Reset the ASID on the current CPU. This function call is broadcast from the
+ * CPU handling the ASID rollover and holding cpu_asid_lock.
+ */
+static void reset_context(void *info)
+{
+	unsigned int asid;
+	unsigned int cpu = smp_processor_id();
+	struct mm_struct *mm = current->active_mm;
+
+	smp_rmb();
+	asid = cpu_last_asid + cpu;
+
+	flush_context();
+	set_mm_context(mm, asid);
+
+	/* set the new ASID */
+	cpu_switch_mm(mm->pgd, mm);
+}
+
+#else
+
+static inline void set_mm_context(struct mm_struct *mm, unsigned int asid)
+{
+	mm->context.id = asid;
+	cpumask_copy(mm_cpumask(mm), cpumask_of(smp_processor_id()));
+}
+
+#endif
+
+void __new_context(struct mm_struct *mm)
+{
+	unsigned int asid;
+	unsigned int bits = asid_bits();
+
+	spin_lock(&cpu_asid_lock);
+#ifdef CONFIG_SMP
+	/*
+	 * Check the ASID again, in case the change was broadcast from another
+	 * CPU before we acquired the lock.
+	 */
+	if (!unlikely((mm->context.id ^ cpu_last_asid) >> MAX_ASID_BITS)) {
+		cpumask_set_cpu(smp_processor_id(), mm_cpumask(mm));
+		spin_unlock(&cpu_asid_lock);
+		return;
+	}
+#endif
+	/*
+	 * At this point, it is guaranteed that the current mm (with an old
+	 * ASID) isn't active on any other CPU since the ASIDs are changed
+	 * simultaneously via IPI.
+	 */
+	asid = ++cpu_last_asid;
+
+	/*
+	 * If we've used up all our ASIDs, we need to start a new version and
+	 * flush the TLB.
+	 */
+	if (unlikely((asid & ((1 << bits) - 1)) == 0)) {
+		/* increment the ASID version */
+		cpu_last_asid += (1 << MAX_ASID_BITS) - (1 << bits);
+		if (cpu_last_asid == 0)
+			cpu_last_asid = ASID_FIRST_VERSION;
+		asid = cpu_last_asid + smp_processor_id();
+		flush_context();
+#ifdef CONFIG_SMP
+		smp_wmb();
+		smp_call_function(reset_context, NULL, 1);
+#endif
+		cpu_last_asid += NR_CPUS - 1;
+	}
+
+	set_mm_context(mm, asid);
+	spin_unlock(&cpu_asid_lock);
+}
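
Note on the ASID allocator in context.c above: the generation ("version")
lives in the bits above MAX_ASID_BITS of cpu_last_asid, and a context is
stale when those upper bits differ. The following is only a userspace
sketch of that arithmetic, not the kernel code. MAX_ASID_BITS is assumed
to be 16 (its definition is not part of this excerpt), and the names
asid_is_stale() and next_asid() are made up for illustration. Locking,
TLB flushing and the IPI broadcast are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ASID_BITS		16u	/* assumption: not shown in this excerpt */
#define ASID_FIRST_VERSION	(1u << MAX_ASID_BITS)

/* Nonzero when context_id was allocated in an older generation. */
static int asid_is_stale(uint32_t context_id, uint32_t last_asid)
{
	return ((context_id ^ last_asid) >> MAX_ASID_BITS) != 0;
}

/*
 * Model of one allocation step. 'bits' is the hardware ASID width,
 * 'cpu' the calling CPU and 'nr_cpus' the number of CPUs. On rollover
 * the low 'bits' bits wrap to zero, the generation is bumped, and the
 * first nr_cpus values of the new generation are reserved, one per CPU.
 */
static uint32_t next_asid(uint32_t *last_asid, unsigned int bits,
			  unsigned int cpu, unsigned int nr_cpus)
{
	uint32_t asid;

	assert(bits >= 1 && bits <= MAX_ASID_BITS && cpu < nr_cpus);

	asid = ++*last_asid;
	if ((asid & ((1u << bits) - 1)) == 0) {
		/* skip the unused bits so the carry lands in the version */
		*last_asid += (1u << MAX_ASID_BITS) - (1u << bits);
		if (*last_asid == 0)
			*last_asid = ASID_FIRST_VERSION;
		asid = *last_asid + cpu;
		*last_asid += nr_cpus - 1;
	}
	return asid;
}
```

With 8-bit ASIDs, 4 CPUs and cpu_last_asid at 0x100ff, the next
allocation on CPU 1 rolls over to generation 2: it returns 0x20001 and
leaves the counter at 0x20003, so 0x20000-0x20003 are the per-CPU values.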



* [PATCH v2 08/31] arm64: CPU support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (6 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 07/31] arm64: Process management Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15  0:10   ` Olof Johansson
  2012-08-15 13:56   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 09/31] arm64: Cache maintenance routines Catalin Marinas
                   ` (23 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds AArch64 CPU-specific functionality. It assumes that the
implementation is generic to AArch64 and does not require per-CPU
identification. Different CPU implementations may require various
ACTLR_EL1 bits to be set, but that information is not currently
available, and setting them should ideally be left to firmware.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
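
Note on the "generic to AArch64" assumption: the only table entry in
proc.S is __v8_proc_info, with ID value 0x000f0000 and mask 0x000f0000.
Those bits cover the MIDR_EL1 Architecture field [19:16], where 0xf
means the features are described by the ID registers. The lookup itself
lives in head.S, which is not part of this patch, so the matching rule
below ((midr & mask) == value) is an assumption carried over from the
32-bit ARM port. The struct and function names are hypothetical and the
sketch is plain userspace C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct proc_info_list (no __cpu_flush). */
struct proc_info_sketch {
	uint32_t	cpu_val;
	uint32_t	cpu_mask;
	const char	*cpu_name;
};

static const struct proc_info_sketch proc_table[] = {
	/* values taken from __v8_proc_info in proc.S */
	{ 0x000f0000, 0x000f0000, "AArch64 Processor" },
};

/* Return the first entry whose masked ID matches, or NULL if none does. */
static const struct proc_info_sketch *lookup_proc_info(uint32_t midr)
{
	size_t i;

	for (i = 0; i < sizeof(proc_table) / sizeof(proc_table[0]); i++) {
		/* an ID value with bits outside its mask could never match */
		assert((proc_table[i].cpu_val & ~proc_table[i].cpu_mask) == 0);
		if ((midr & proc_table[i].cpu_mask) == proc_table[i].cpu_val)
			return &proc_table[i];
	}
	return NULL;
}
```

Any MIDR value with 0xf in bits [19:16] matches the single entry, which
is why no per-implementation table is needed at this point.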
 arch/arm64/include/asm/cputype.h   |   49 +++++++++
 arch/arm64/include/asm/proc-fns.h  |   51 ++++++++++
 arch/arm64/include/asm/processor.h |  174 ++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/procinfo.h  |   44 ++++++++
 arch/arm64/mm/proc-syms.c          |   31 ++++++
 arch/arm64/mm/proc.S               |  193 ++++++++++++++++++++++++++++++++++++
 6 files changed, 542 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/cputype.h
 create mode 100644 arch/arm64/include/asm/proc-fns.h
 create mode 100644 arch/arm64/include/asm/processor.h
 create mode 100644 arch/arm64/include/asm/procinfo.h
 create mode 100644 arch/arm64/mm/proc-syms.c
 create mode 100644 arch/arm64/mm/proc.S

diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
new file mode 100644
index 0000000..ef54125
--- /dev/null
+++ b/arch/arm64/include/asm/cputype.h
@@ -0,0 +1,49 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_CPUTYPE_H
+#define __ASM_CPUTYPE_H
+
+#define ID_MIDR_EL1		"midr_el1"
+#define ID_CTR_EL0		"ctr_el0"
+
+#define ID_AA64PFR0_EL1		"id_aa64pfr0_el1"
+#define ID_AA64DFR0_EL1		"id_aa64dfr0_el1"
+#define ID_AA64AFR0_EL1		"id_aa64afr0_el1"
+#define ID_AA64ISAR0_EL1	"id_aa64isar0_el1"
+#define ID_AA64MMFR0_EL1	"id_aa64mmfr0_el1"
+
+#define read_cpuid(reg) ({						\
+	u64 __val;							\
+	asm("mrs	%0, " reg : "=r" (__val));			\
+	__val;								\
+})
+
+/*
+ * The CPU ID never changes at run time, so we might as well tell the
+ * compiler that it's constant.  Use this function to read the CPU ID
+ * rather than reading processor_id or calling read_cpuid() directly.
+ */
+static inline u32 __attribute_const__ read_cpuid_id(void)
+{
+	return read_cpuid(ID_MIDR_EL1);
+}
+
+static inline u32 __attribute_const__ read_cpuid_cachetype(void)
+{
+	return read_cpuid(ID_CTR_EL0);
+}
+
+#endif
diff --git a/arch/arm64/include/asm/proc-fns.h b/arch/arm64/include/asm/proc-fns.h
new file mode 100644
index 0000000..520331b
--- /dev/null
+++ b/arch/arm64/include/asm/proc-fns.h
@@ -0,0 +1,51 @@
+/*
+ * Based on arch/arm/include/asm/proc-fns.h
+ *
+ * Copyright (C) 1997-1999 Russell King
+ * Copyright (C) 2000 Deep Blue Solutions Ltd
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PROCFNS_H
+#define __ASM_PROCFNS_H
+
+#ifdef __KERNEL__
+#ifndef __ASSEMBLY__
+
+#include <asm/page.h>
+
+struct mm_struct;
+
+extern void cpu_proc_init(void);
+extern void cpu_proc_fin(void);
+extern void cpu_do_idle(void);
+extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+extern void cpu_reset(unsigned long addr) __attribute__((noreturn));
+
+#include <asm/memory.h>
+
+#define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
+
+#define cpu_get_pgd()					\
+({							\
+	unsigned long pg;				\
+	asm("mrs	%0, ttbr0_el1\n"		\
+	    : "=r" (pg));				\
+	pg &= ~0xffff000000003ffful;			\
+	(pgd_t *)phys_to_virt(pg);			\
+})
+
+#endif /* __ASSEMBLY__ */
+#endif /* __KERNEL__ */
+#endif /* __ASM_PROCFNS_H */
diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
new file mode 100644
index 0000000..ebf2b22
--- /dev/null
+++ b/arch/arm64/include/asm/processor.h
@@ -0,0 +1,174 @@
+/*
+ * Based on arch/arm/include/asm/processor.h
+ *
+ * Copyright (C) 1995-1999 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PROCESSOR_H
+#define __ASM_PROCESSOR_H
+
+/*
+ * Default implementation of macro that returns current
+ * instruction pointer ("program counter").
+ */
+#define current_text_addr() ({ __label__ _l; _l: &&_l;})
+
+#ifdef __KERNEL__
+
+#include <linux/string.h>
+
+#include <asm/fpsimd.h>
+#include <asm/hw_breakpoint.h>
+#include <asm/ptrace.h>
+#include <asm/types.h>
+
+#ifdef __KERNEL__
+#define STACK_TOP_MAX		TASK_SIZE_64
+#ifdef CONFIG_AARCH32_EMULATION
+#define AARCH32_VECTORS_BASE	0xffff0000
+#define STACK_TOP		(test_thread_flag(TIF_32BIT) ? \
+				AARCH32_VECTORS_BASE : STACK_TOP_MAX)
+#else
+#define STACK_TOP		STACK_TOP_MAX
+#endif /* CONFIG_AARCH32_EMULATION */
+#endif /* __KERNEL__ */
+
+struct debug_info {
+	/* Have we suspended stepping by a debugger? */
+	int			suspended_step;
+	/* Allow breakpoints and watchpoints to be disabled for this thread. */
+	int			bps_disabled;
+	int			wps_disabled;
+	/* Hardware breakpoints pinned to this task. */
+	struct perf_event	*hbp[ARM_MAX_HBP_SLOTS];
+};
+
+struct cpu_context {
+	unsigned long x19;
+	unsigned long x20;
+	unsigned long x21;
+	unsigned long x22;
+	unsigned long x23;
+	unsigned long x24;
+	unsigned long x25;
+	unsigned long x26;
+	unsigned long x27;
+	unsigned long x28;
+	unsigned long fp;
+	unsigned long sp;
+	unsigned long pc;
+};
+
+struct thread_struct {
+	struct cpu_context	cpu_context;	/* cpu context */
+	unsigned long		tp_value;
+	struct fpsimd_state	fpsimd_state;
+	unsigned long		fault_address;	/* fault info */
+	struct debug_info	debug;		/* debugging */
+};
+
+#define INIT_THREAD  {	}
+
+static inline void start_thread_common(struct pt_regs *regs, unsigned long pc)
+{
+	memset(regs, 0, sizeof(*regs));
+	regs->syscallno = ~0UL;
+	regs->pc = pc;
+}
+
+static inline void start_thread(struct pt_regs *regs, unsigned long pc,
+				unsigned long sp)
+{
+	unsigned long *stack = (unsigned long *)sp;
+
+	start_thread_common(regs, pc);
+	regs->pstate = PSR_MODE_EL0t;
+	regs->sp = sp;
+	regs->regs[2] = stack[2];	/* x2 (envp) */
+	regs->regs[1] = stack[1];	/* x1 (argv) */
+	regs->regs[0] = stack[0];	/* x0 (argc) */
+}
+
+#ifdef CONFIG_AARCH32_EMULATION
+static inline void compat_start_thread(struct pt_regs *regs, unsigned long pc,
+				       unsigned long sp)
+{
+	unsigned int *stack = (unsigned int *)sp;
+
+	start_thread_common(regs, pc);
+	regs->pstate = COMPAT_PSR_MODE_USR;
+	if (pc & 1)
+		regs->pstate |= COMPAT_PSR_T_BIT;
+	regs->compat_sp = sp;
+	regs->regs[2] = stack[2];	/* x2 (envp) */
+	regs->regs[1] = stack[1];	/* x1 (argv) */
+	regs->regs[0] = stack[0];	/* x0 (argc) */
+}
+#endif
+
+/* Forward declaration, a strange C thing */
+struct task_struct;
+
+/* Free all resources held by a thread. */
+extern void release_thread(struct task_struct *);
+
+/* Prepare to copy thread state - unlazy all lazy status */
+#define prepare_to_copy(tsk)	do { } while (0)
+
+unsigned long get_wchan(struct task_struct *p);
+
+#define cpu_relax()			barrier()
+
+/* Thread switching */
+extern struct task_struct *cpu_switch_to(struct task_struct *prev,
+					 struct task_struct *next);
+
+/*
+ * Create a new kernel thread
+ */
+extern int kernel_thread(int (*fn)(void *), void *arg, unsigned long flags);
+
+#define task_pt_regs(p) \
+	((struct pt_regs *)(THREAD_START_SP + task_stack_page(p)) - 1)
+
+#define KSTK_EIP(tsk)	task_pt_regs(tsk)->pc
+#define KSTK_ESP(tsk)	task_pt_regs(tsk)->sp
+
+/*
+ * Prefetching support
+ */
+#define ARCH_HAS_PREFETCH
+static inline void prefetch(const void *ptr)
+{
+	asm volatile("prfm pldl1keep, %a0\n" : : "p" (ptr));
+}
+
+#define ARCH_HAS_PREFETCHW
+static inline void prefetchw(const void *ptr)
+{
+	asm volatile("prfm pstl1keep, %a0\n" : : "p" (ptr));
+}
+
+#define ARCH_HAS_SPINLOCK_PREFETCH
+static inline void spin_lock_prefetch(const void *x)
+{
+	prefetchw(x);
+}
+
+#define HAVE_ARCH_PICK_MMAP_LAYOUT
+
+#endif
+
+#endif /* __ASM_PROCESSOR_H */
diff --git a/arch/arm64/include/asm/procinfo.h b/arch/arm64/include/asm/procinfo.h
new file mode 100644
index 0000000..81fece9
--- /dev/null
+++ b/arch/arm64/include/asm/procinfo.h
@@ -0,0 +1,44 @@
+/*
+ * Based on arch/arm/include/asm/procinfo.h
+ *
+ * Copyright (C) 1996-1999 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PROCINFO_H
+#define __ASM_PROCINFO_H
+
+#ifdef __KERNEL__
+
+/*
+ * Note!  struct processor is always defined if we're
+ * using MULTI_CPU, otherwise this entry is unused,
+ * but still exists.
+ *
+ * NOTE! The following structure is defined by assembly
+ * language, NOT C code.  For more information, check:
+ *  arch/arm/mm/proc-*.S and arch/arm/kernel/head.S
+ */
+struct proc_info_list {
+	unsigned int		cpu_val;
+	unsigned int		cpu_mask;
+	unsigned long		__cpu_flush;		/* used by head.S */
+	const char		*cpu_name;
+};
+
+#else	/* __KERNEL__ */
+#include <asm/elf.h>
+#warning "Please include asm/elf.h instead"
+#endif	/* __KERNEL__ */
+#endif
diff --git a/arch/arm64/mm/proc-syms.c b/arch/arm64/mm/proc-syms.c
new file mode 100644
index 0000000..2d99ef9
--- /dev/null
+++ b/arch/arm64/mm/proc-syms.c
@@ -0,0 +1,31 @@
+/*
+ * Based on arch/arm/mm/proc-syms.c
+ *
+ * Copyright (C) 2000-2002 Russell King
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/mm.h>
+
+#include <asm/cacheflush.h>
+#include <asm/proc-fns.h>
+#include <asm/tlbflush.h>
+#include <asm/page.h>
+
+EXPORT_SYMBOL(__cpuc_flush_kern_all);
+EXPORT_SYMBOL(__cpuc_flush_user_all);
+EXPORT_SYMBOL(__cpuc_flush_user_range);
+EXPORT_SYMBOL(__cpuc_coherent_kern_range);
+EXPORT_SYMBOL(__cpuc_flush_dcache_area);
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
new file mode 100644
index 0000000..453f517
--- /dev/null
+++ b/arch/arm64/mm/proc.S
@@ -0,0 +1,193 @@
+/*
+ * Based on arch/arm/mm/proc.S
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/hwcap.h>
+#include <asm/pgtable-hwdef.h>
+#include <asm/pgtable.h>
+
+#include "proc-macros.S"
+
+#ifndef CONFIG_SMP
+/* PTWs cacheable, inner/outer WBWA not shareable */
+#define TCR_FLAGS	TCR_IRGN_WBWA | TCR_ORGN_WBWA
+#else
+/* PTWs cacheable, inner/outer WBWA shareable */
+#define TCR_FLAGS	TCR_IRGN_WBWA | TCR_ORGN_WBWA | TCR_SHARED
+#endif
+
+#define MAIR(attr, mt)	((attr) << ((mt) * 8))
+
+ENTRY(cpu_proc_init)
+	ret
+ENDPROC(cpu_proc_init)
+
+ENTRY(cpu_proc_fin)
+	ret
+ENDPROC(cpu_proc_fin)
+
+/*
+ *	cpu_reset(loc)
+ *
+ *	Perform a soft reset of the system.  Put the CPU into the same state
+ *	as it would be if it had been reset, and branch to what would be the
+ *	reset vector. It must be executed with the flat identity mapping.
+ *
+ *	- loc   - location to jump to for soft reset
+ */
+	.align	5
+ENTRY(cpu_reset)
+	mrs	x1, sctlr_el1
+	bic	x1, x1, #1
+	msr	sctlr_el1, x1			// disable the MMU
+	isb
+	ret	x0
+ENDPROC(cpu_reset)
+
+/*
+ *	cpu_do_idle()
+ *
+ *	Idle the processor (wait for interrupt).
+ */
+ENTRY(cpu_do_idle)
+	dsb	sy				// WFI may enter a low-power mode
+	wfi
+	ret
+ENDPROC(cpu_do_idle)
+
+/*
+ *	cpu_do_switch_mm(pgd_phys, mm)
+ *
+ *	Set the translation table base pointer to be pgd_phys.
+ *
+ *	- pgd_phys - physical address of new TTB
+ */
+ENTRY(cpu_do_switch_mm)
+	mmid	w1, x1				// get mm->context.id
+	bfi	x0, x1, #48, #16		// set the ASID
+	msr	ttbr0_el1, x0			// set TTBR0
+	isb
+	ret
+ENDPROC(cpu_do_switch_mm)
+
+cpu_name:
+	.ascii	"AArch64 Processor"
+	.align
+
+	.section ".text.init", #alloc, #execinstr
+
+/*
+ *	__cpu_setup
+ *
+ *	Initialise the processor for turning the MMU on.  Return in x0 the
+ *	value of the SCTLR_EL1 register.
+ */
+__cpu_setup:
+#ifdef CONFIG_SMP
+	/* TODO: only do this for certain CPUs */
+	/*
+	 * Enable SMP/nAMP mode.
+	 */
+	mrs	x0, actlr_el1
+	tbnz	x0, #6, 1f			// already enabled?
+	orr	x0, x0, #1 << 6
+	msr	actlr_el1, x0
+1:
+#endif
+	/*
+	 * Preserve the link register across the function call.
+	 */
+	mov	x28, lr
+	bl	__cpuc_flush_dcache_all
+	mov	lr, x28
+	ic	iallu				// I+BTB cache invalidate
+	dsb	sy
+
+	mov	x0, #3 << 20
+	msr	cpacr_el1, x0			// Enable FP/ASIMD
+	mov	x0, #1
+	msr	oslar_el1, x0			// Set the debug OS lock
+	tlbi	vmalle1is			// invalidate I + D TLBs
+	/*
+	 * Memory region attributes for LPAE:
+	 *
+	 *   n = AttrIndx[2:0]
+	 *			n	MAIR
+	 *   DEVICE_nGnRnE	000	00000000
+	 *   DEVICE_nGnRE	001	00000100
+	 *   DEVICE_GRE		010	00001100
+	 *   NORMAL_NC		011	01000100
+	 *   NORMAL		100	11111111
+	 */
+	ldr	x5, =MAIR(0x00, MT_DEVICE_nGnRnE) | \
+		     MAIR(0x04, MT_DEVICE_nGnRE) | \
+		     MAIR(0x0c, MT_DEVICE_GRE) | \
+		     MAIR(0x44, MT_NORMAL_NC) | \
+		     MAIR(0xff, MT_NORMAL)
+	msr	mair_el1, x5
+	/*
+	 * Prepare SCTLR
+	 */
+	adr	x5, crval
+	ldp	w5, w6, [x5]
+	mrs	x0, sctlr_el1
+	bic	x0, x0, x5			// clear bits
+	orr	x0, x0, x6			// set bits
+	/*
+	 * Set/prepare TCR and TTBR. We use 512GB (39-bit) address range for
+	 * both user and kernel.
+	 */
+	ldr	x10, =TCR_TxSZ(VA_BITS) | TCR_FLAGS | TCR_IPS_40BIT | \
+		      TCR_ASID16 | (1 << 31)
+#ifdef CONFIG_ARM64_64K_PAGES
+	orr	x10, x10, TCR_TG0_64K
+	orr	x10, x10, TCR_TG1_64K
+#endif
+	msr	tcr_el1, x10
+	ret					// return to head.S
+ENDPROC(__cpu_setup)
+
+	/*
+	 *                 n n            T
+	 *       U E      WT T UD     US IHBS
+	 *       CE0      XWHW CZ     ME TEEA S
+	 * .... .IEE .... NEAI TE.I ..AD DEN0 ACAM
+	 * 0011 0... 1101 ..0. ..0. 10.. .... .... < hardware reserved
+	 * .... .100 .... 01.1 11.1 ..01 0001 1101 < software settings
+	 */
+	.type	crval, #object
+crval:
+	.word	0x030802e2			// clear
+	.word	0x0405d11d			// set
+
+	.section ".proc.info.init", #alloc, #execinstr
+
+	.type	__v8_proc_info, #object
+__v8_proc_info:
+	.long	0x000f0000			// Required ID value
+	.long	0x000f0000			// Mask for ID
+	b	__cpu_setup
+	nop
+	.quad	cpu_name
+	.long	0
+	.size	__v8_proc_info, . - __v8_proc_info
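
A note on the MAIR_EL1 value that __cpu_setup programs above: the
sketch below shows how the five attribute bytes from the comment table
pack into a single 64-bit register value. It assumes that MAIR(attr, mt)
shifts an 8-bit attribute left by 8 * mt and that the MT_* indices match
the AttrIndx column; both are inferred from the table, not taken from a
definition shown here. mair_field(), mair_el1_value() and mair_attr()
are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Attribute indices, assumed to match the AttrIndx column of the table. */
enum mem_type {
	MT_DEVICE_nGnRnE = 0,
	MT_DEVICE_nGnRE  = 1,
	MT_DEVICE_GRE    = 2,
	MT_NORMAL_NC     = 3,
	MT_NORMAL        = 4,
};

/* Assumed shape of MAIR(): one 8-bit attribute per index slot. */
static inline uint64_t mair_field(uint8_t attr, unsigned int idx)
{
	assert(idx < 8);
	return (uint64_t)attr << (idx * 8);
}

/* Pack the five attributes listed in the comment table. */
static inline uint64_t mair_el1_value(void)
{
	return mair_field(0x00, MT_DEVICE_nGnRnE) |
	       mair_field(0x04, MT_DEVICE_nGnRE) |
	       mair_field(0x0c, MT_DEVICE_GRE) |
	       mair_field(0x44, MT_NORMAL_NC) |
	       mair_field(0xff, MT_NORMAL);
}

/* Read back the attribute byte stored at a given index. */
static inline uint8_t mair_attr(uint64_t mair, unsigned int idx)
{
	assert(idx < 8);
	return (uint8_t)(mair >> (idx * 8));
}
```

Under these assumptions the packed value is 0xff440c0400, and the three
unused slots (indices 5 to 7) read back as 0x00.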



* [PATCH v2 09/31] arm64: Cache maintenance routines
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (7 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 08/31] arm64: CPU support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-17  9:57   ` Santosh Shilimkar
  2012-08-14 17:52 ` [PATCH v2 10/31] arm64: TLB maintenance functionality Catalin Marinas
                   ` (22 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

The patch adds functionality required for cache maintenance. The AArch64
architecture mandates non-aliasing VIPT or PIPT D-cache and VIPT (may
have aliases) or ASID-tagged VIVT I-cache. Cache maintenance operations
are automatically broadcast in hardware between CPUs.
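
As an illustration of the I-cache policies mentioned above, the sketch
below decodes the L1Ip field of the cache type register in the same way
as icache_policy() in cachetype.h from this patch. The shift, mask and
policy values are copied from the patch; the register value is passed in
as a parameter so the code can run on any host, and the function names
icache_policy_from_ctr() and icache_may_alias() are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CTR_L1IP_SHIFT	14
#define CTR_L1IP_MASK	3

enum icache_policy {
	ICACHE_POLICY_RESERVED	= 0,
	ICACHE_POLICY_AIVIVT	= 1,	/* ASID-tagged VIVT */
	ICACHE_POLICY_VIPT	= 2,
	ICACHE_POLICY_PIPT	= 3,
};

static_assert(ICACHE_POLICY_PIPT <= CTR_L1IP_MASK,
	      "every policy value must fit in the 2-bit L1Ip field");

/* Extract the L1Ip field from a cache type register value. */
static inline enum icache_policy icache_policy_from_ctr(uint32_t ctr)
{
	return (enum icache_policy)((ctr >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK);
}

/* Anything other than PIPT is treated as potentially aliasing. */
static inline bool icache_may_alias(uint32_t ctr)
{
	return icache_policy_from_ctr(ctr) != ICACHE_POLICY_PIPT;
}
```

The conservative "not PIPT means aliasing" test matches
icache_is_aliasing() in the patch, which flush_ptrace_access() uses to
choose between a full I-cache invalidate and a ranged one.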

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/cache.h      |   32 ++++
 arch/arm64/include/asm/cacheflush.h |  209 ++++++++++++++++++++++++++
 arch/arm64/include/asm/cachetype.h  |   48 ++++++
 arch/arm64/mm/cache.S               |  279 +++++++++++++++++++++++++++++++++++
 arch/arm64/mm/flush.c               |  132 +++++++++++++++++
 5 files changed, 700 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/cache.h
 create mode 100644 arch/arm64/include/asm/cacheflush.h
 create mode 100644 arch/arm64/include/asm/cachetype.h
 create mode 100644 arch/arm64/mm/cache.S
 create mode 100644 arch/arm64/mm/flush.c

diff --git a/arch/arm64/include/asm/cache.h b/arch/arm64/include/asm/cache.h
new file mode 100644
index 0000000..390308a
--- /dev/null
+++ b/arch/arm64/include/asm/cache.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_CACHE_H
+#define __ASM_CACHE_H
+
+#define L1_CACHE_SHIFT		6
+#define L1_CACHE_BYTES		(1 << L1_CACHE_SHIFT)
+
+/*
+ * Memory returned by kmalloc() may be used for DMA, so we must make
+ * sure that all such allocations are cache aligned. Otherwise,
+ * unrelated code may cause parts of the buffer to be read into the
+ * cache before the transfer is done, causing old data to be seen by
+ * the CPU.
+ */
+#define ARCH_DMA_MINALIGN	L1_CACHE_BYTES
+#define ARCH_SLAB_MINALIGN	8
+
+#endif
diff --git a/arch/arm64/include/asm/cacheflush.h b/arch/arm64/include/asm/cacheflush.h
new file mode 100644
index 0000000..93b5590
--- /dev/null
+++ b/arch/arm64/include/asm/cacheflush.h
@@ -0,0 +1,209 @@
+/*
+ * Based on arch/arm/include/asm/cacheflush.h
+ *
+ * Copyright (C) 1999-2002 Russell King.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_CACHEFLUSH_H
+#define __ASM_CACHEFLUSH_H
+
+#include <linux/mm.h>
+
+/*
+ * This flag is used to indicate that the page pointed to by a pte is clean
+ * and does not require cleaning before returning it to the user.
+ */
+#define PG_dcache_clean PG_arch_1
+
+/*
+ *	MM Cache Management
+ *	===================
+ *
+ *	The arch/arm64/mm/cache.S and arch/arm64/mm/flush.c files
+ *	implement these methods.
+ *
+ *	Start addresses are inclusive and end addresses are exclusive;
+ *	start addresses should be rounded down, end addresses up.
+ *
+ *	See Documentation/cachetlb.txt for more information.
+ *	Please note that the implementation of these, and the required
+ *	effects are cache-type (VIVT/VIPT/PIPT) specific.
+ *
+ *	flush_cache_kern_all()
+ *
+ *		Unconditionally clean and invalidate the entire cache.
+ *
+ *	flush_cache_user_mm(mm)
+ *
+ *		Clean and invalidate all user space cache entries
+ *		before a change of page tables.
+ *
+ *	flush_cache_user_range(start, end, flags)
+ *
+ *		Clean and invalidate a range of cache entries in the
+ *		specified address space before a change of page tables.
+ *		- start - user start address (inclusive, page aligned)
+ *		- end   - user end address   (exclusive, page aligned)
+ *		- flags - vma->vm_flags field
+ *
+ *	coherent_kern_range(start, end)
+ *
+ *		Ensure coherency between the Icache and the Dcache in the
+ *		region described by start, end.  If you have non-snooping
+ *		Harvard caches, you need to implement this function.
+ *		- start  - virtual start address
+ *		- end    - virtual end address
+ *
+ *	coherent_user_range(start, end)
+ *
+ *		Ensure coherency between the Icache and the Dcache in the
+ *		region described by start, end.  If you have non-snooping
+ *		Harvard caches, you need to implement this function.
+ *		- start  - virtual start address
+ *		- end    - virtual end address
+ *
+ *	flush_kern_dcache_area(kaddr, size)
+ *
+ *		Ensure that the data held in page is written back.
+ *		- kaddr  - page address
+ *		- size   - region size
+ *
+ *	DMA Cache Coherency
+ *	===================
+ *
+ *	dma_flush_range(start, end)
+ *
+ *		Clean and invalidate the specified virtual address range.
+ *		- start  - virtual start address
+ *		- end    - virtual end address
+ */
+extern void __cpuc_flush_kern_all(void);
+extern void __cpuc_flush_user_all(void);
+extern void __cpuc_flush_user_range(unsigned long, unsigned long, unsigned int);
+extern void __cpuc_coherent_kern_range(unsigned long, unsigned long);
+extern void __cpuc_coherent_user_range(unsigned long, unsigned long);
+extern void __cpuc_flush_dcache_area(void *, size_t);
+
+/*
+ * These are private to the dma-mapping API.  Do not use directly.
+ * Their sole purpose is to ensure that data held in the cache
+ * is visible to DMA, or data written by DMA to system memory is
+ * visible to the CPU.
+ */
+extern void dmac_map_area(const void *, size_t, int);
+extern void dmac_unmap_area(const void *, size_t, int);
+extern void dmac_flush_range(const void *, const void *);
+
+/*
+ * Copy user data from/to a page which is mapped into a different
+ * process's address space.  Really, we want to allow our "user
+ * space" model to handle this.
+ */
+extern void copy_to_user_page(struct vm_area_struct *, struct page *,
+	unsigned long, void *, const void *, unsigned long);
+#define copy_from_user_page(vma, page, vaddr, dst, src, len) \
+	do {							\
+		memcpy(dst, src, len);				\
+	} while (0)
+
+/*
+ * Convert calls to our calling convention.
+ */
+#define flush_cache_all()		__cpuc_flush_kern_all()
+extern void flush_cache_mm(struct mm_struct *mm);
+extern void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end);
+extern void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr, unsigned long pfn);
+
+#define flush_cache_dup_mm(mm) flush_cache_mm(mm)
+
+/*
+ * flush_cache_user_range is used when we want to ensure that the
+ * Harvard caches are synchronised for the user space address range.
+ * This is used for the ARM private sys_cacheflush system call.
+ */
+#define flush_cache_user_range(start, end) \
+	__cpuc_coherent_user_range((start) & PAGE_MASK, PAGE_ALIGN(end))
+
+/*
+ * Perform necessary cache operations to ensure that data previously
+ * stored within this range of addresses can be executed by the CPU.
+ */
+#define flush_icache_range(s,e)		__cpuc_coherent_kern_range(s,e)
+
+/*
+ * flush_dcache_page is used when the kernel has written to the page
+ * cache page at virtual address page->virtual.
+ *
+ * If this page isn't mapped (ie, page_mapping == NULL), or it might
+ * have userspace mappings, then we _must_ always clean + invalidate
+ * the dcache entries associated with the kernel mapping.
+ *
+ * Otherwise we can defer the operation, and clean the cache when we are
+ * about to change to user space.  This is the same method as used on SPARC64.
+ * See update_mmu_cache for the user space part.
+ */
+#define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
+extern void flush_dcache_page(struct page *);
+
+static inline void __flush_icache_all(void)
+{
+	asm("ic	ialluis");
+}
+
+#define ARCH_HAS_FLUSH_ANON_PAGE
+static inline void flush_anon_page(struct vm_area_struct *vma,
+			 struct page *page, unsigned long vmaddr)
+{
+	extern void __flush_anon_page(struct vm_area_struct *vma,
+				struct page *, unsigned long);
+	if (PageAnon(page))
+		__flush_anon_page(vma, page, vmaddr);
+}
+
+#define flush_dcache_mmap_lock(mapping) \
+	spin_lock_irq(&(mapping)->tree_lock)
+#define flush_dcache_mmap_unlock(mapping) \
+	spin_unlock_irq(&(mapping)->tree_lock)
+
+#define flush_icache_user_range(vma,page,addr,len) \
+	flush_dcache_page(page)
+
+/*
+ * We don't appear to need to do anything here.  In fact, if we did, we'd
+ * duplicate cache flushing elsewhere performed by flush_dcache_page().
+ */
+#define flush_icache_page(vma,page)	do { } while (0)
+
+/*
+ * flush_cache_vmap() is used when creating mappings (eg, via vmap,
+ * vmalloc, ioremap etc) in kernel space for pages.  On non-VIPT
+ * caches, since the direct-mappings of these pages may contain cached
+ * data, we need to do a full cache flush to ensure that writebacks
+ * don't corrupt data placed into these pages via the new mappings.
+ */
+static inline void flush_cache_vmap(unsigned long start, unsigned long end)
+{
+	/*
+	 * set_pte_at() called from vmap_pte_range() does not
+	 * have a DSB after cleaning the cache line.
+	 */
+	dsb();
+}
+
+static inline void flush_cache_vunmap(unsigned long start, unsigned long end)
+{
+}
+
+#endif
diff --git a/arch/arm64/include/asm/cachetype.h b/arch/arm64/include/asm/cachetype.h
new file mode 100644
index 0000000..85f5f51
--- /dev/null
+++ b/arch/arm64/include/asm/cachetype.h
@@ -0,0 +1,48 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_CACHETYPE_H
+#define __ASM_CACHETYPE_H
+
+#include <asm/cputype.h>
+
+#define CTR_L1IP_SHIFT		14
+#define CTR_L1IP_MASK		3
+
+#define ICACHE_POLICY_RESERVED	0
+#define ICACHE_POLICY_AIVIVT	1
+#define ICACHE_POLICY_VIPT	2
+#define ICACHE_POLICY_PIPT	3
+
+static inline u32 icache_policy(void)
+{
+	return (read_cpuid_cachetype() >> CTR_L1IP_SHIFT) & CTR_L1IP_MASK;
+}
+
+/*
+ * Whilst the D-side always behaves as PIPT on AArch64, aliasing is
+ * permitted in the I-cache.
+ */
+static inline int icache_is_aliasing(void)
+{
+	return icache_policy() != ICACHE_POLICY_PIPT;
+}
+
+static inline int icache_is_aivivt(void)
+{
+	return icache_policy() == ICACHE_POLICY_AIVIVT;
+}
+
+#endif	/* __ASM_CACHETYPE_H */
diff --git a/arch/arm64/mm/cache.S b/arch/arm64/mm/cache.S
new file mode 100644
index 0000000..f4efa04
--- /dev/null
+++ b/arch/arm64/mm/cache.S
@@ -0,0 +1,279 @@
+/*
+ * Cache maintenance
+ *
+ * Copyright (C) 2001 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/init.h>
+#include <asm/assembler.h>
+
+#include "proc-macros.S"
+
+/*
+ *	__cpuc_flush_dcache_all()
+ *
+ *	Flush the whole D-cache.
+ *
+ *	Corrupted registers: x0-x7, x9-x11
+ */
+ENTRY(__cpuc_flush_dcache_all)
+	dsb	sy				// ensure ordering with previous memory accesses
+	mrs	x0, clidr_el1			// read clidr
+	and	x3, x0, #0x7000000		// extract loc from clidr
+	lsr	x3, x3, #23			// left align loc bit field
+	cbz	x3, finished			// if loc is 0, then no need to clean
+	mov	x10, #0				// start clean at cache level 0
+loop1:
+	add	x2, x10, x10, lsr #1		// work out 3x current cache level
+	lsr	x1, x0, x2			// extract cache type bits from clidr
+	and	x1, x1, #7			// mask of the bits for current cache only
+	cmp	x1, #2				// see what cache we have at this level
+	b.lt	skip				// skip if no cache, or just i-cache
+	save_and_disable_irqs x9		// make CSSELR and CCSIDR access atomic
+	msr	csselr_el1, x10			// select current cache level in csselr
+	isb					// isb to sync the new csselr & ccsidr
+	mrs	x1, ccsidr_el1			// read the new ccsidr
+	restore_irqs x9
+	and	x2, x1, #7			// extract the length of the cache lines
+	add	x2, x2, #4			// add 4 (line length offset)
+	mov	x4, #0x3ff
+	and	x4, x4, x1, lsr #3		// find maximum number on the way size
+	clz	w5, w4				// find bit position of way size increment
+	mov	x7, #0x7fff
+	and	x7, x7, x1, lsr #13		// extract max number of the index size
+loop2:
+	mov	x9, x4				// create working copy of max way size
+loop3:
+	lsl	x6, x9, x5
+	orr	x11, x10, x6			// factor way and cache number into x11
+	lsl	x6, x7, x2
+	orr	x11, x11, x6			// factor index number into x11
+	dc	cisw, x11			// clean & invalidate by set/way
+	subs	x9, x9, #1			// decrement the way
+	b.ge	loop3
+	subs	x7, x7, #1			// decrement the index
+	b.ge	loop2
+skip:
+	add	x10, x10, #2			// increment cache number
+	cmp	x3, x10
+	b.gt	loop1
+finished:
+	mov	x10, #0				// switch back to cache level 0
+	msr	csselr_el1, x10			// select current cache level in csselr
+	dsb	sy
+	isb
+	ret
+ENDPROC(__cpuc_flush_dcache_all)
+
+/*
+ *	__cpuc_flush_kern_all()
+ *
+ *	Flush the entire cache system.  The data cache flush is now achieved
+ *	using atomic clean / invalidates working outwards from L1 cache. This
+ *	is done using Set/Way based cache maintenance instructions.  The
+ *	instruction cache can still be invalidated back to the point of
+ *	unification in a single instruction.
+ */
+ENTRY(__cpuc_flush_kern_all)
+	mov	x12, lr
+	bl	__cpuc_flush_dcache_all
+	mov	x0, #0
+	ic	ialluis				// I+BTB cache invalidate
+	ret	x12
+ENDPROC(__cpuc_flush_kern_all)
+
+/*
+ *	__cpuc_flush_user_all()
+ *
+ *	Flush all cache entries in a particular address space.
+ */
+ENTRY(__cpuc_flush_user_all)
+	/*FALLTHROUGH*/
+
+/*
+ *	__cpuc_flush_user_range(start, end, flags)
+ *
+ *	Flush a range of cache entries in the specified address space.
+ *
+ *	- start - start address (may not be aligned)
+ *	- end   - end address (exclusive, may not be aligned)
+ *	- flags	- vm_area_struct flags describing address space
+ */
+ENTRY(__cpuc_flush_user_range)
+	ret
+ENDPROC(__cpuc_flush_user_all)
+ENDPROC(__cpuc_flush_user_range)
+
+/*
+ *	__cpuc_coherent_kern_range(start,end)
+ *
+ *	Ensure that the I and D caches are coherent within specified region.
+ *	This is typically used when code has been written to a memory region,
+ *	and will be executed.
+ *
+ *	- start   - virtual start address of region
+ *	- end     - virtual end address of region
+ */
+ENTRY(__cpuc_coherent_kern_range)
+	/* FALLTHROUGH */
+
+/*
+ *	__cpuc_coherent_user_range(start,end)
+ *
+ *	Ensure that the I and D caches are coherent within specified region.
+ *	This is typically used when code has been written to a memory region,
+ *	and will be executed.
+ *
+ *	- start   - virtual start address of region
+ *	- end     - virtual end address of region
+ */
+ENTRY(__cpuc_coherent_user_range)
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x0, x3
+1:
+USER(9f, dc	cvau, x4	)		// clean D line to PoU
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	1b
+	dsb	sy
+
+	icache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x4, x0, x3
+1:
+USER(9f, ic	ivau, x4	)		// invalidate I line PoU
+	add	x4, x4, x2
+	cmp	x4, x1
+	b.lo	1b
+9:						// ignore any faulting cache operation
+	dsb	sy
+	isb
+	ret
+ENDPROC(__cpuc_coherent_kern_range)
+ENDPROC(__cpuc_coherent_user_range)
+
+	.section .fixup,"ax"
+	.align	0
+9001:	ret
+	.previous
+
+
+/*
+ *	__cpuc_flush_dcache_area(kaddr, size)
+ *
+ *	Ensure that the data held in the region [kaddr, kaddr + size) is
+ *	written back to memory and invalidated from the D-cache.
+ *
+ *	- kaddr   - kernel address
+ *	- size    - size of the region in bytes
+ */
+ENTRY(__cpuc_flush_dcache_area)
+	dcache_line_size x2, x3
+	add	x1, x0, x1
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+1:	dc	civac, x0			// clean & invalidate D line / unified line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+ENDPROC(__cpuc_flush_dcache_area)
+
+/*
+ *	dmac_inv_range(start,end)
+ *
+ *	Invalidate the data cache within the specified region; we will be
+ *	performing a DMA operation in this region and we want to purge old
+ *	data in the cache.
+ *
+ *	- start   - virtual start address of region
+ *	- end     - virtual end address of region
+ */
+ENTRY(dmac_inv_range)
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+	bic	x1, x1, x3
+1:	dc	ivac, x0			// invalidate D / U line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+ENDPROC(dmac_inv_range)
+
+/*
+ *	dmac_clean_range(start,end)
+ *	- start   - virtual start address of region
+ *	- end     - virtual end address of region
+ */
+ENTRY(dmac_clean_range)
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+1:	dc	cvac, x0			// clean D / U line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+ENDPROC(dmac_clean_range)
+
+/*
+ *	dmac_flush_range(start,end)
+ *	- start   - virtual start address of region
+ *	- end     - virtual end address of region
+ */
+ENTRY(dmac_flush_range)
+	dcache_line_size x2, x3
+	sub	x3, x2, #1
+	bic	x0, x0, x3
+1:	dc	civac, x0			// clean & invalidate D / U line
+	add	x0, x0, x2
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+ENDPROC(dmac_flush_range)
+
+/*
+ *	dmac_map_area(start, size, dir)
+ *	- start	- kernel virtual start address
+ *	- size	- size of region
+ *	- dir	- DMA direction
+ */
+ENTRY(dmac_map_area)
+	add	x1, x1, x0
+	cmp	x2, #DMA_FROM_DEVICE
+	b.eq	dmac_inv_range
+	b	dmac_clean_range
+ENDPROC(dmac_map_area)
+
+/*
+ *	dmac_unmap_area(start, size, dir)
+ *	- start	- kernel virtual start address
+ *	- size	- size of region
+ *	- dir	- DMA direction
+ */
+ENTRY(dmac_unmap_area)
+	add	x1, x1, x0
+	cmp	x2, #DMA_TO_DEVICE
+	b.ne	dmac_inv_range
+	ret
+ENDPROC(dmac_unmap_area)
diff --git a/arch/arm64/mm/flush.c b/arch/arm64/mm/flush.c
new file mode 100644
index 0000000..44f9e5c
--- /dev/null
+++ b/arch/arm64/mm/flush.c
@@ -0,0 +1,132 @@
+/*
+ * Based on arch/arm/mm/flush.c
+ *
+ * Copyright (C) 1995-2002 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <linux/pagemap.h>
+
+#include <asm/cacheflush.h>
+#include <asm/cachetype.h>
+#include <asm/tlbflush.h>
+
+#include "mm.h"
+
+void flush_cache_mm(struct mm_struct *mm)
+{
+}
+
+void flush_cache_range(struct vm_area_struct *vma, unsigned long start,
+		       unsigned long end)
+{
+	if (vma->vm_flags & VM_EXEC)
+		__flush_icache_all();
+}
+
+void flush_cache_page(struct vm_area_struct *vma, unsigned long user_addr,
+		      unsigned long pfn)
+{
+}
+
+static void flush_ptrace_access(struct vm_area_struct *vma, struct page *page,
+				unsigned long uaddr, void *kaddr,
+				unsigned long len)
+{
+	if (vma->vm_flags & VM_EXEC) {
+		unsigned long addr = (unsigned long)kaddr;
+		if (icache_is_aliasing()) {
+			__cpuc_flush_dcache_area(kaddr, len);
+			__flush_icache_all();
+		} else {
+			__cpuc_coherent_kern_range(addr, addr + len);
+		}
+	}
+}
+
+/*
+ * Copy user data from/to a page which is mapped into a different process's
+ * address space.  Really, we want to allow our "user space" model to handle
+ * this.
+ *
+ * Note that this code needs to run on the current CPU.
+ */
+void copy_to_user_page(struct vm_area_struct *vma, struct page *page,
+		       unsigned long uaddr, void *dst, const void *src,
+		       unsigned long len)
+{
+#ifdef CONFIG_SMP
+	preempt_disable();
+#endif
+	memcpy(dst, src, len);
+	flush_ptrace_access(vma, page, uaddr, dst, len);
+#ifdef CONFIG_SMP
+	preempt_enable();
+#endif
+}
+
+void __flush_dcache_page(struct address_space *mapping, struct page *page)
+{
+	__cpuc_flush_dcache_area(page_address(page), PAGE_SIZE);
+}
+
+void __sync_icache_dcache(pte_t pte)
+{
+	unsigned long pfn;
+	struct page *page;
+
+	pfn = pte_pfn(pte);
+	if (!pfn_valid(pfn))
+		return;
+
+	page = pfn_to_page(pfn);
+	if (!test_and_set_bit(PG_dcache_clean, &page->flags))
+		__flush_dcache_page(NULL, page);
+	__flush_icache_all();
+}
+
+/*
+ * Ensure cache coherency between kernel mapping and userspace mapping of this
+ * page.
+ */
+void flush_dcache_page(struct page *page)
+{
+	struct address_space *mapping;
+
+	/*
+	 * The zero page is never written to, so never has any dirty cache
+	 * lines, and therefore never needs to be flushed.
+	 */
+	if (page == ZERO_PAGE(0))
+		return;
+
+	mapping = page_mapping(page);
+
+	if (mapping && !mapping_mapped(mapping))
+		clear_bit(PG_dcache_clean, &page->flags);
+	else {
+		__flush_dcache_page(mapping, page);
+		if (mapping)
+			__flush_icache_all();
+		set_bit(PG_dcache_clean, &page->flags);
+	}
+}
+EXPORT_SYMBOL(flush_dcache_page);
+
+void __flush_anon_page(struct vm_area_struct *vma, struct page *page, unsigned long vmaddr)
+{
+}
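
For readers following the set/way loop in __cpuc_flush_dcache_all, the
sketch below shows how one cache-size ID value is decoded and how the
operand for a single clean-and-invalidate by set/way is built. It
assumes the 32-bit register layout that the assembly masks imply (line
size in bits [2:0], associativity - 1 in bits [12:3], sets - 1 in bits
[27:13]) and a way field aligned to the top of a 32-bit word. The struct
and function names are hypothetical, and the sample register value is
made up for a 32KB, 4-way cache with 64-byte lines.

```c
#include <assert.h>
#include <stdint.h>

struct cache_geometry {
	unsigned int line_shift;	/* log2(line size in bytes) */
	unsigned int max_way;		/* associativity - 1 */
	unsigned int max_set;		/* number of sets - 1 */
};

/* Decode the fields that the assembly extracts with its and/lsr pairs. */
static inline struct cache_geometry decode_ccsidr(uint32_t ccsidr)
{
	struct cache_geometry g;

	g.line_shift = (ccsidr & 7) + 4;
	g.max_way = (ccsidr >> 3) & 0x3ff;
	g.max_set = (ccsidr >> 13) & 0x7fff;
	return g;
}

/* Count leading zeros of a 32-bit value; returns 32 for zero. */
static inline unsigned int clz32(uint32_t v)
{
	unsigned int n = 0;

	if (v == 0)
		return 32;
	while (!(v & 0x80000000u)) {
		v <<= 1;
		n++;
	}
	return n;
}

/* Combine level, way and set into one set/way operand. */
static inline uint64_t setway_operand(const struct cache_geometry *g,
				      unsigned int level,
				      unsigned int way, unsigned int set)
{
	unsigned int way_shift = clz32(g->max_way);

	assert(level < 7 && way <= g->max_way && set <= g->max_set);
	return ((uint64_t)way << way_shift) |
	       ((uint64_t)set << g->line_shift) |
	       ((uint64_t)level << 1);
}

/* Number of set/way operations needed to cover one cache level. */
static inline uint64_t setway_op_count(const struct cache_geometry *g)
{
	return ((uint64_t)g->max_way + 1) * ((uint64_t)g->max_set + 1);
}
```

For the sample value 0x000fe01a this gives a line shift of 6, 4 ways and
128 sets, so one level takes 512 operations. The assembly keeps the
level pre-multiplied by two in x10, which is why it can OR x10 in
directly where the sketch shifts the level by one.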



* [PATCH v2 10/31] arm64: TLB maintenance functionality
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (8 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 09/31] arm64: Cache maintenance routines Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 11/31] arm64: IRQ handling Catalin Marinas
                   ` (21 subsequent siblings)
  31 siblings, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds the TLB maintenance functions. There is no distinction
made between the I and D TLBs. TLB maintenance operations are
automatically broadcast between CPUs in hardware. The inner-shareable
operations are always present, even on UP systems.
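
As a small illustration of how the invalidate-by-ASID and
invalidate-by-address operands are formed in tlbflush.h from this patch,
the sketch below reproduces the bit packing used by flush_tlb_mm() and
flush_tlb_page(): the ASID goes in bits [63:48] and the page number
(address >> 12) in the low bits. The function names are hypothetical,
and the range check on the address is an added assumption that the
patch itself does not make.

```c
#include <assert.h>
#include <stdint.h>

#define TLBI_ASID_SHIFT	48
#define TLBI_VA_SHIFT	12

/* Operand for a per-ASID invalidate: only the ASID field is set. */
static inline uint64_t tlbi_asid_operand(uint16_t asid)
{
	return (uint64_t)asid << TLBI_ASID_SHIFT;
}

/* Operand for a per-page invalidate: page number plus ASID. */
static inline uint64_t tlbi_va_operand(uint64_t uaddr, uint16_t asid)
{
	uint64_t page = uaddr >> TLBI_VA_SHIFT;

	/* Assumption: the page number must not spill into the ASID field. */
	assert(page < (1ULL << TLBI_ASID_SHIFT));
	return page | ((uint64_t)asid << TLBI_ASID_SHIFT);
}
```

For example, user address 0x7f1234567000 with ASID 0x2a packs to
0x002a0007f1234567. The shift by 12 is independent of the configured
page size in the patch, so the sketch keeps it as a fixed constant.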

NOTE: A large part of this patch will be dropped once Peter Z's generic
mmu_gather patches are merged.
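
The range tracking done by tlb_start_vma(), tlb_add_flush() and
tlb_flush() in tlb.h from this patch can be sketched in isolation as
follows. The logic (start at the top of the user range, end at 0, widen
by one page per address) follows the patch; the 4KB page size, the
39-bit user range, and the names struct flush_range, range_reset(),
range_add() and range_pending() are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE	4096ULL		/* assumed 4KB pages */
#define SKETCH_TASK_SIZE	(1ULL << 39)	/* assumed 39-bit user range */

struct flush_range {
	uint64_t start;
	uint64_t end;
};

/* An empty range: start above every user address, end at zero. */
static inline void range_reset(struct flush_range *r)
{
	r->start = SKETCH_TASK_SIZE;
	r->end = 0;
}

/* Widen the range so that it covers the page at addr. */
static inline void range_add(struct flush_range *r, uint64_t addr)
{
	assert(addr < SKETCH_TASK_SIZE);
	if (addr < r->start)
		r->start = addr;
	if (addr + SKETCH_PAGE_SIZE > r->end)
		r->end = addr + SKETCH_PAGE_SIZE;
}

/* A flush is needed only if at least one page was recorded. */
static inline bool range_pending(const struct flush_range *r)
{
	return r->end > 0;
}
```

Adding pages at 0x5000 and 0x2000 in any order yields the range
[0x2000, 0x6000), which is what a single ranged flush would then cover
before the range is reset again.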

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/tlb.h      |  190 +++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/tlbflush.h |  123 ++++++++++++++++++++++++
 arch/arm64/mm/tlb.S               |   71 ++++++++++++++
 3 files changed, 384 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/tlb.h
 create mode 100644 arch/arm64/include/asm/tlbflush.h
 create mode 100644 arch/arm64/mm/tlb.S

diff --git a/arch/arm64/include/asm/tlb.h b/arch/arm64/include/asm/tlb.h
new file mode 100644
index 0000000..654f096
--- /dev/null
+++ b/arch/arm64/include/asm/tlb.h
@@ -0,0 +1,190 @@
+/*
+ * Based on arch/arm/include/asm/tlb.h
+ *
+ * Copyright (C) 2002 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_TLB_H
+#define __ASM_TLB_H
+
+#include <linux/pagemap.h>
+#include <linux/swap.h>
+
+#include <asm/pgalloc.h>
+#include <asm/tlbflush.h>
+
+#define MMU_GATHER_BUNDLE	8
+
+/*
+ * TLB handling.  This allows us to remove pages from the page
+ * tables, and efficiently handle the TLB issues.
+ */
+struct mmu_gather {
+	struct mm_struct	*mm;
+	unsigned int		fullmm;
+	struct vm_area_struct	*vma;
+	unsigned long		range_start;
+	unsigned long		range_end;
+	unsigned int		nr;
+	unsigned int		max;
+	struct page		**pages;
+	struct page		*local[MMU_GATHER_BUNDLE];
+};
+
+/*
+ * This is unnecessarily complex.  There are three ways the TLB shootdown
+ * code is used:
+ *  1. Unmapping a range of vmas.  See zap_page_range(), unmap_region().
+ *     tlb->fullmm = 0, and tlb_start_vma/tlb_end_vma will be called.
+ *     tlb->vma will be non-NULL.
+ *  2. Unmapping all vmas.  See exit_mmap().
+ *     tlb->fullmm = 1, and tlb_start_vma/tlb_end_vma will be called.
+ *     tlb->vma will be non-NULL.  Additionally, page tables will be freed.
+ *  3. Unmapping argument pages.  See shift_arg_pages().
+ *     tlb->fullmm = 0, but tlb_start_vma/tlb_end_vma will not be called.
+ *     tlb->vma will be NULL.
+ */
+static inline void tlb_flush(struct mmu_gather *tlb)
+{
+	if (tlb->fullmm || !tlb->vma)
+		flush_tlb_mm(tlb->mm);
+	else if (tlb->range_end > 0) {
+		flush_tlb_range(tlb->vma, tlb->range_start, tlb->range_end);
+		tlb->range_start = TASK_SIZE;
+		tlb->range_end = 0;
+	}
+}
+
+static inline void tlb_add_flush(struct mmu_gather *tlb, unsigned long addr)
+{
+	if (!tlb->fullmm) {
+		if (addr < tlb->range_start)
+			tlb->range_start = addr;
+		if (addr + PAGE_SIZE > tlb->range_end)
+			tlb->range_end = addr + PAGE_SIZE;
+	}
+}
+
+static inline void __tlb_alloc_page(struct mmu_gather *tlb)
+{
+	unsigned long addr = __get_free_pages(GFP_NOWAIT | __GFP_NOWARN, 0);
+
+	if (addr) {
+		tlb->pages = (void *)addr;
+		tlb->max = PAGE_SIZE / sizeof(struct page *);
+	}
+}
+
+static inline void tlb_flush_mmu(struct mmu_gather *tlb)
+{
+	tlb_flush(tlb);
+	free_pages_and_swap_cache(tlb->pages, tlb->nr);
+	tlb->nr = 0;
+	if (tlb->pages == tlb->local)
+		__tlb_alloc_page(tlb);
+}
+
+static inline void
+tlb_gather_mmu(struct mmu_gather *tlb, struct mm_struct *mm, unsigned int fullmm)
+{
+	tlb->mm = mm;
+	tlb->fullmm = fullmm;
+	tlb->vma = NULL;
+	tlb->max = ARRAY_SIZE(tlb->local);
+	tlb->pages = tlb->local;
+	tlb->nr = 0;
+	__tlb_alloc_page(tlb);
+}
+
+static inline void
+tlb_finish_mmu(struct mmu_gather *tlb, unsigned long start, unsigned long end)
+{
+	tlb_flush_mmu(tlb);
+
+	/* keep the page table cache within bounds */
+	check_pgt_cache();
+
+	if (tlb->pages != tlb->local)
+		free_pages((unsigned long)tlb->pages, 0);
+}
+
+/*
+ * Memorize the range for the TLB flush.
+ */
+static inline void
+tlb_remove_tlb_entry(struct mmu_gather *tlb, pte_t *ptep, unsigned long addr)
+{
+	tlb_add_flush(tlb, addr);
+}
+
+/*
+ * In the case of tlb vma handling, we can optimise these away in the
+ * case where we're doing a full MM flush.  When we're doing a munmap,
+ * the vmas are adjusted to only cover the region to be torn down.
+ */
+static inline void
+tlb_start_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (!tlb->fullmm) {
+		tlb->vma = vma;
+		tlb->range_start = TASK_SIZE;
+		tlb->range_end = 0;
+	}
+}
+
+static inline void
+tlb_end_vma(struct mmu_gather *tlb, struct vm_area_struct *vma)
+{
+	if (!tlb->fullmm)
+		tlb_flush(tlb);
+}
+
+static inline int __tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+{
+	tlb->pages[tlb->nr++] = page;
+	VM_BUG_ON(tlb->nr > tlb->max);
+	return tlb->max - tlb->nr;
+}
+
+static inline void tlb_remove_page(struct mmu_gather *tlb, struct page *page)
+{
+	if (!__tlb_remove_page(tlb, page))
+		tlb_flush_mmu(tlb);
+}
+
+static inline void __pte_free_tlb(struct mmu_gather *tlb, pgtable_t pte,
+	unsigned long addr)
+{
+	pgtable_page_dtor(pte);
+	tlb_add_flush(tlb, addr);
+	tlb_remove_page(tlb, pte);
+}
+
+#ifndef CONFIG_ARM64_64K_PAGES
+static inline void __pmd_free_tlb(struct mmu_gather *tlb, pmd_t *pmdp,
+				  unsigned long addr)
+{
+	tlb_add_flush(tlb, addr);
+	tlb_remove_page(tlb, virt_to_page(pmdp));
+}
+#endif
+
+#define pte_free_tlb(tlb, ptep, addr)	__pte_free_tlb(tlb, ptep, addr)
+#define pmd_free_tlb(tlb, pmdp, addr)	__pmd_free_tlb(tlb, pmdp, addr)
+#define pud_free_tlb(tlb, pudp, addr)	pud_free((tlb)->mm, pudp)
+
+#define tlb_migrate_finish(mm)		do { } while (0)
+
+#endif
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
new file mode 100644
index 0000000..615d131
--- /dev/null
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -0,0 +1,123 @@
+/*
+ * Based on arch/arm/include/asm/tlbflush.h
+ *
+ * Copyright (C) 1999-2003 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_TLBFLUSH_H
+#define __ASM_TLBFLUSH_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/sched.h>
+#include <asm/cputype.h>
+
+extern void __cpu_flush_user_tlb_range(unsigned long, unsigned long, struct vm_area_struct *);
+extern void __cpu_flush_kern_tlb_range(unsigned long, unsigned long);
+
+extern struct cpu_tlb_fns cpu_tlb;
+
+/*
+ *	TLB Management
+ *	==============
+ *
+ *	arch/arm64/mm/tlb.S implements the range variants of these methods.
+ *
+ *	The TLB specific code is expected to perform whatever tests it
+ *	needs to determine if it should invalidate the TLB for each
+ *	call.  Start addresses are inclusive and end addresses are
+ *	exclusive; it is safe to round these addresses down.
+ *
+ *	flush_tlb_all()
+ *
+ *		Invalidate the entire TLB.
+ *
+ *	flush_tlb_mm(mm)
+ *
+ *		Invalidate all TLB entries in a particular address
+ *		space.
+ *		- mm	- mm_struct describing address space
+ *
+ *	flush_tlb_range(vma,start,end)
+ *
+ *		Invalidate a range of TLB entries in the specified
+ *		address space.
+ *		- vma	- vm_area_struct describing address range
+ *		- start - start address (may not be aligned)
+ *		- end	- end address (exclusive, may not be aligned)
+ *
+ *	flush_tlb_page(vma,uaddr)
+ *
+ *		Invalidate the specified page in the specified address range.
+ *		- vma	- vm_area_struct describing address range
+ *		- uaddr - user virtual address (may not be aligned)
+ *
+ *	flush_kern_tlb_page(kaddr)
+ *
+ *		Invalidate the TLB entry for the specified page.  The address
+ *		will be in the kernel's virtual memory space.  Current uses
+ *		only require the D-TLB to be invalidated.
+ *		- kaddr - Kernel virtual memory address
+ */
+static inline void flush_tlb_all(void)
+{
+	dsb();
+	asm("tlbi	vmalle1is");
+	dsb();
+	isb();
+}
+
+static inline void flush_tlb_mm(struct mm_struct *mm)
+{
+	unsigned long asid = (unsigned long)ASID(mm) << 48;
+
+	dsb();
+	asm("tlbi	aside1is, %0" : : "r" (asid));
+	dsb();
+}
+
+static inline void flush_tlb_page(struct vm_area_struct *vma,
+				  unsigned long uaddr)
+{
+	unsigned long addr = uaddr >> 12 |
+		((unsigned long)ASID(vma->vm_mm) << 48);
+
+	dsb();
+	asm("tlbi	vae1is, %0" : : "r" (addr));
+	dsb();
+}
+
+/*
+ * Convert calls to our calling convention.
+ */
+#define flush_tlb_range(vma,start,end)	__cpu_flush_user_tlb_range(start,end,vma)
+#define flush_tlb_kernel_range(s,e)	__cpu_flush_kern_tlb_range(s,e)
+
+/*
+ * On AArch64, the cache coherency is handled via the set_pte_at() function.
+ */
+static inline void update_mmu_cache(struct vm_area_struct *vma,
+				    unsigned long addr, pte_t *ptep)
+{
+	/*
+	 * set_pte() does not have a DSB, so make sure that the page table
+	 * write is visible.
+	 */
+	dsb();
+}
+
+#endif
+
+#endif
diff --git a/arch/arm64/mm/tlb.S b/arch/arm64/mm/tlb.S
new file mode 100644
index 0000000..8ae80a1
--- /dev/null
+++ b/arch/arm64/mm/tlb.S
@@ -0,0 +1,71 @@
+/*
+ * Based on arch/arm/mm/tlb.S
+ *
+ * Copyright (C) 1997-2002 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Written by Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+#include <asm/page.h>
+#include <asm/tlbflush.h>
+#include "proc-macros.S"
+
+/*
+ *	__cpu_flush_user_tlb_range(start, end, vma)
+ *
+ *	Invalidate a range of TLB entries in the specified address space.
+ *
+ *	- start - start address (may not be aligned)
+ *	- end   - end address (exclusive, may not be aligned)
+ *	- vma   - vm_area_struct describing address range
+ */
+ENTRY(__cpu_flush_user_tlb_range)
+	vma_vm_mm x3, x2			// get vma->vm_mm
+	mmid	x3, x3				// get vm_mm->context.id
+	dsb	sy
+	lsr	x0, x0, #12			// align address
+	lsr	x1, x1, #12
+	bfi	x0, x3, #48, #16		// start VA and ASID
+	bfi	x1, x3, #48, #16		// end VA and ASID
+1:	tlbi	vae1is, x0			// TLB invalidate by address and ASID
+	add	x0, x0, #1
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	ret
+ENDPROC(__cpu_flush_user_tlb_range)
+
+/*
+ *	__cpu_flush_kern_tlb_range(start,end)
+ *
+ *	Invalidate a range of kernel TLB entries.
+ *
+ *	- start - start address (may not be aligned)
+ *	- end   - end address (exclusive, may not be aligned)
+ */
+ENTRY(__cpu_flush_kern_tlb_range)
+	dsb	sy
+	lsr	x0, x0, #12			// align address
+	lsr	x1, x1, #12
+1:	tlbi	vaae1is, x0			// TLB invalidate by address
+	add	x0, x0, #1
+	cmp	x0, x1
+	b.lo	1b
+	dsb	sy
+	isb
+	ret
+ENDPROC(__cpu_flush_kern_tlb_range)
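
The operand layout used by the TLBI loops above can be sketched in plain C. This is only an illustration under stated assumptions: the helper and macro names below are hypothetical (they do not appear in the patch), the virtual address is assumed to be a user address whose upper 16 bits are zero, and the shift values simply mirror the "lsr #12" and "bfi ..., #48, #16" instructions in tlb.S.

```c
#include <stdint.h>

/* Hypothetical names; the values mirror the shifts used in the patch. */
#define TLBI_VA_SHIFT	12	/* operand carries VA >> 12 */
#define TLBI_ASID_SHIFT	48	/* ASID sits in operand bits [63:48] */

/* Build the operand for a by-address, by-ASID invalidate (tlbi vae1is). */
static inline uint64_t tlbi_user_operand(uint64_t vaddr, uint16_t asid)
{
	return (vaddr >> TLBI_VA_SHIFT) | ((uint64_t)asid << TLBI_ASID_SHIFT);
}

/* Number of tlbi operations the range loop issues for [start, end). */
static inline uint64_t tlbi_range_ops(uint64_t start, uint64_t end)
{
	uint64_t first = start >> TLBI_VA_SHIFT;
	uint64_t last = end >> TLBI_VA_SHIFT;

	/* The assembly loop tests its condition at the bottom, so it runs once at least. */
	return last > first ? last - first : 1;
}
```

Because both bounds are shifted right, an unaligned end address is rounded down, which matches the "safe to round these addresses down" note in tlbflush.h.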



* [PATCH v2 11/31] arm64: IRQ handling
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (9 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 10/31] arm64: TLB maintenance functionality Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 23:22   ` Aaro Koskinen
  2012-08-14 17:52 ` [PATCH v2 12/31] arm64: Atomic operations Catalin Marinas
                   ` (20 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel
  Cc: linux-kernel, Arnd Bergmann, Marc Zyngier, Will Deacon

From: Marc Zyngier <marc.zyngier@arm.com>

This patch adds support for IRQ handling. The actual interrupt
controller driver will be part of a separate patch (going into
drivers/irqchip/).

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/hardirq.h  |   47 +++++++++++++++++++
 arch/arm64/include/asm/irq.h      |    8 +++
 arch/arm64/include/asm/irqflags.h |   91 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/irq.c           |   84 ++++++++++++++++++++++++++++++++++
 4 files changed, 230 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/hardirq.h
 create mode 100644 arch/arm64/include/asm/irq.h
 create mode 100644 arch/arm64/include/asm/irqflags.h
 create mode 100644 arch/arm64/kernel/irq.c

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
new file mode 100644
index 0000000..c6c9514
--- /dev/null
+++ b/arch/arm64/include/asm/hardirq.h
@@ -0,0 +1,47 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_HARDIRQ_H
+#define __ASM_HARDIRQ_H
+
+#include <linux/cache.h>
+#include <linux/threads.h>
+#include <asm/irq.h>
+
+typedef struct {
+	unsigned int __softirq_pending;
+} ____cacheline_aligned irq_cpustat_t;
+
+#include <linux/irq_cpustat.h>	/* Standard mappings for irq_cpustat_t above */
+
+#define __inc_irq_stat(cpu, member)	__IRQ_STAT(cpu, member)++
+#define __get_irq_stat(cpu, member)	__IRQ_STAT(cpu, member)
+
+#ifdef CONFIG_SMP
+u64 smp_irq_stat_cpu(unsigned int cpu);
+#define arch_irq_stat_cpu	smp_irq_stat_cpu
+#endif
+
+#define __ARCH_IRQ_EXIT_IRQS_DISABLED	1
+
+static inline void ack_bad_irq(unsigned int irq)
+{
+	extern unsigned long irq_err_count;
+	irq_err_count++;
+}
+
+extern void handle_IRQ(unsigned int, struct pt_regs *);
+
+#endif /* __ASM_HARDIRQ_H */
diff --git a/arch/arm64/include/asm/irq.h b/arch/arm64/include/asm/irq.h
new file mode 100644
index 0000000..a4e1cad
--- /dev/null
+++ b/arch/arm64/include/asm/irq.h
@@ -0,0 +1,8 @@
+#ifndef __ASM_IRQ_H
+#define __ASM_IRQ_H
+
+#include <asm-generic/irq.h>
+
+extern void (*handle_arch_irq)(struct pt_regs *);
+
+#endif
diff --git a/arch/arm64/include/asm/irqflags.h b/arch/arm64/include/asm/irqflags.h
new file mode 100644
index 0000000..aa11943
--- /dev/null
+++ b/arch/arm64/include/asm/irqflags.h
@@ -0,0 +1,91 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_IRQFLAGS_H
+#define __ASM_IRQFLAGS_H
+
+#ifdef __KERNEL__
+
+#include <asm/ptrace.h>
+
+/*
+ * CPU interrupt mask handling.
+ */
+static inline unsigned long arch_local_irq_save(void)
+{
+	unsigned long flags;
+	asm volatile(
+		"mrs	%0, daif		// arch_local_irq_save\n"
+		"msr	daifset, #2"
+		: "=r" (flags)
+		:
+		: "memory");
+	return flags;
+}
+
+static inline void arch_local_irq_enable(void)
+{
+	asm volatile(
+		"msr	daifclr, #2		// arch_local_irq_enable"
+		:
+		:
+		: "memory");
+}
+
+static inline void arch_local_irq_disable(void)
+{
+	asm volatile(
+		"msr	daifset, #2		// arch_local_irq_disable"
+		:
+		:
+		: "memory");
+}
+
+#define local_fiq_enable()	asm("msr	daifclr, #1" : : : "memory")
+#define local_fiq_disable()	asm("msr	daifset, #1" : : : "memory")
+
+/*
+ * Save the current interrupt enable state.
+ */
+static inline unsigned long arch_local_save_flags(void)
+{
+	unsigned long flags;
+	asm volatile(
+		"mrs	%0, daif		// arch_local_save_flags"
+		: "=r" (flags)
+		:
+		: "memory");
+	return flags;
+}
+
+/*
+ * restore saved IRQ state
+ */
+static inline void arch_local_irq_restore(unsigned long flags)
+{
+	asm volatile(
+		"msr	daif, %0		// arch_local_irq_restore"
+	:
+	: "r" (flags)
+	: "memory");
+}
+
+static inline int arch_irqs_disabled_flags(unsigned long flags)
+{
+	return flags & PSR_I_BIT;
+}
+
+#endif
+#endif
diff --git a/arch/arm64/kernel/irq.c b/arch/arm64/kernel/irq.c
new file mode 100644
index 0000000..d346241
--- /dev/null
+++ b/arch/arm64/kernel/irq.c
@@ -0,0 +1,84 @@
+/*
+ * Based on arch/arm/kernel/irq.c
+ *
+ * Copyright (C) 1992 Linus Torvalds
+ * Modifications for ARM processor Copyright (C) 1995-2000 Russell King.
+ * Support for Dynamic Tick Timer Copyright (C) 2004-2005 Nokia Corporation.
+ * Dynamic Tick Timer written by Tony Lindgren <tony@atomide.com> and
+ * Tuukka Tikkanen <tuukka.tikkanen@elektrobit.com>.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel_stat.h>
+#include <linux/irq.h>
+#include <linux/smp.h>
+#include <linux/init.h>
+#include <linux/of_irq.h>
+#include <linux/seq_file.h>
+
+unsigned long irq_err_count;
+
+int arch_show_interrupts(struct seq_file *p, int prec)
+{
+#ifdef CONFIG_SMP
+	show_ipi_list(p, prec);
+#endif
+	seq_printf(p, "%*s: %10lu\n", prec, "Err", irq_err_count);
+	return 0;
+}
+
+/*
+ * handle_IRQ handles all hardware IRQs.  Decoded IRQs should
+ * not come via this function.  Instead, they should provide their
+ * own 'handler'.  Used by platform code implementing C-based 1st
+ * level decoding.
+ */
+void handle_IRQ(unsigned int irq, struct pt_regs *regs)
+{
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	irq_enter();
+
+	/*
+	 * Some hardware gives randomly wrong interrupts.  Rather
+	 * than crashing, do something sensible.
+	 */
+	if (unlikely(irq >= nr_irqs)) {
+		if (printk_ratelimit())
+			pr_warning("Bad IRQ%u\n", irq);
+		ack_bad_irq(irq);
+	} else {
+		generic_handle_irq(irq);
+	}
+
+	irq_exit();
+	set_irq_regs(old_regs);
+}
+
+/*
+ * Interrupt controllers supported by the kernel.
+ */
+static const struct of_device_id intctrl_of_match[] __initconst = {
+	/* IRQ controllers { .compatible, .data } info to go here */
+	{}
+};
+
+void __init init_IRQ(void)
+{
+	of_irq_init(intctrl_of_match);
+
+	if (!handle_arch_irq)
+		panic("No interrupt controller found.");
+}
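
The save/restore pattern in irqflags.h can be modelled in user space to show why nested save/restore pairs compose correctly. This is a sketch, not kernel code: the sim_* names and the struct are hypothetical, and a plain variable stands in for the DAIF system register. It assumes the usual AArch64 layout, in which the 4-bit daifset/daifclr immediate maps onto PSTATE bits [9:6], so that immediate #2 corresponds to PSR_I_BIT (0x80).

```c
#include <stdint.h>

#define PSR_F_BIT	0x00000040u
#define PSR_I_BIT	0x00000080u

/* Hypothetical stand-in for the per-CPU DAIF register. */
struct sim_cpu {
	uint64_t daif;
};

/* Model of "msr daifset, #imm": the immediate maps to bits [9:6]. */
static inline void sim_daifset(struct sim_cpu *cpu, unsigned int imm)
{
	cpu->daif |= (uint64_t)(imm & 0xf) << 6;
}

/* Model of "msr daifclr, #imm". */
static inline void sim_daifclr(struct sim_cpu *cpu, unsigned int imm)
{
	cpu->daif &= ~((uint64_t)(imm & 0xf) << 6);
}

/* Mirrors arch_local_irq_save(): read the flags, then mask IRQs. */
static inline uint64_t sim_local_irq_save(struct sim_cpu *cpu)
{
	uint64_t flags = cpu->daif;

	sim_daifset(cpu, 2);
	return flags;
}

/* Mirrors arch_local_irq_restore(): write the saved flags back. */
static inline void sim_local_irq_restore(struct sim_cpu *cpu, uint64_t flags)
{
	cpu->daif = flags;
}

/* Mirrors arch_irqs_disabled_flags(), normalised to 0 or 1. */
static inline int sim_irqs_disabled_flags(uint64_t flags)
{
	return (flags & PSR_I_BIT) != 0;
}
```

Restoring the inner saved flags leaves IRQs masked, and only the outermost restore unmasks them, which is the property callers of local_irq_save()/local_irq_restore() rely on.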



* [PATCH v2 12/31] arm64: Atomic operations
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (10 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 11/31] arm64: IRQ handling Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15  0:21   ` Olof Johansson
  2012-08-14 17:52 ` [PATCH v2 13/31] arm64: Device specific operations Catalin Marinas
                   ` (19 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch introduces the atomic, mutex and futex operations. Many
atomic operations use the load-acquire and store-release instructions,
which imply barriers and avoid the need for an explicit DMB.
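
For readers without an AArch64 toolchain, the shape of these operations can be approximated with C11 atomics. The sketch below is an analogue, not the implementation in this patch: the sketch_* names are hypothetical, and the C11 memory orders are only an approximation of the ordering given by the ldaxr/stlxr pairs. Signed overflow of the returned sum is ignored for brevity.

```c
#include <stdatomic.h>

/* Analogue of atomic_add_return(): an acquire/release read-modify-write. */
static inline int sketch_add_return(atomic_int *v, int i)
{
	return atomic_fetch_add_explicit(v, i, memory_order_acq_rel) + i;
}

/*
 * Analogue of __atomic_add_unless(): add a to *v unless it equals u.
 * Returns the value observed before any update.
 */
static inline int sketch_add_unless(atomic_int *v, int a, int u)
{
	int c = atomic_load_explicit(v, memory_order_relaxed);

	while (c != u &&
	       !atomic_compare_exchange_weak_explicit(v, &c, c + a,
						      memory_order_acq_rel,
						      memory_order_relaxed))
		;	/* on failure, c now holds the current value */

	return c;
}
```

The retry loop plays the role of the "cbnz ..., 1b" branch in the exclusive-access sequences: a failed store (or a failed weak compare-exchange) simply retries with the freshly observed value.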

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/atomic.h |  306 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/futex.h  |  134 +++++++++++++++++
 2 files changed, 440 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/atomic.h
 create mode 100644 arch/arm64/include/asm/futex.h

diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
new file mode 100644
index 0000000..fa60c8b
--- /dev/null
+++ b/arch/arm64/include/asm/atomic.h
@@ -0,0 +1,306 @@
+/*
+ * Based on arch/arm/include/asm/atomic.h
+ *
+ * Copyright (C) 1996 Russell King.
+ * Copyright (C) 2002 Deep Blue Solutions Ltd.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_ATOMIC_H
+#define __ASM_ATOMIC_H
+
+#include <linux/compiler.h>
+#include <linux/types.h>
+
+#include <asm/barrier.h>
+#include <asm/cmpxchg.h>
+
+#define ATOMIC_INIT(i)	{ (i) }
+
+#ifdef __KERNEL__
+
+/*
+ * On ARM, ordinary assignment (str instruction) doesn't clear the local
+ * strex/ldrex monitor on some implementations. The reason we can use it for
+ * atomic_set() is the clrex or dummy strex done on every exception return.
+ */
+#define atomic_read(v)	(*(volatile int *)&(v)->counter)
+#define atomic_set(v,i)	(((v)->counter) = (i))
+
+/*
+ * AArch64 UP and SMP safe atomic ops.  We use load exclusive and
+ * store exclusive to ensure that these are atomic.  We may loop
+ * to ensure that the update happens.
+ */
+static inline void atomic_add(int i, atomic_t *v)
+{
+	unsigned long tmp;
+	int result;
+
+	asm volatile("// atomic_add\n"
+"1:	ldxr	%w0, [%3]\n"
+"	add	%w0, %w0, %w4\n"
+"	stxr	%w1, %w0, [%3]\n"
+"	cbnz	%w1,1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+}
+
+static inline int atomic_add_return(int i, atomic_t *v)
+{
+	unsigned long tmp;
+	int result;
+
+	asm volatile("// atomic_add_return\n"
+"1:	ldaxr	%w0, [%3]\n"
+"	add	%w0, %w0, %w4\n"
+"	stlxr	%w1, %w0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+
+	return result;
+}
+
+static inline void atomic_sub(int i, atomic_t *v)
+{
+	unsigned long tmp;
+	int result;
+
+	asm volatile("// atomic_sub\n"
+"1:	ldxr	%w0, [%3]\n"
+"	sub	%w0, %w0, %w4\n"
+"	stxr	%w1, %w0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+}
+
+static inline int atomic_sub_return(int i, atomic_t *v)
+{
+	unsigned long tmp;
+	int result;
+
+	asm volatile("// atomic_sub_return\n"
+"1:	ldaxr	%w0, [%3]\n"
+"	sub	%w0, %w0, %w4\n"
+"	stlxr	%w1, %w0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+
+	return result;
+}
+
+static inline int atomic_cmpxchg(atomic_t *ptr, int old, int new)
+{
+	unsigned long tmp;
+	int oldval;
+
+	asm volatile("// atomic_cmpxchg\n"
+"1:	ldaxr	%w1, [%3]\n"
+"	cmp	%w1, %w4\n"
+"	b.ne	2f\n"
+"	stlxr	%w0, %w5, [%3]\n"
+"	cbnz	%w0, 1b\n"
+"2:"
+	: "=&r" (tmp), "=&r" (oldval), "+o" (ptr->counter)
+	: "r" (&ptr->counter), "Ir" (old), "r" (new)
+	: "cc");
+
+	return oldval;
+}
+
+static inline void atomic_clear_mask(unsigned long mask, unsigned long *addr)
+{
+	unsigned long tmp, tmp2;
+
+	asm volatile("// atomic_clear_mask\n"
+"1:	ldxr	%0, [%3]\n"
+"	bic	%0, %0, %4\n"
+"	stxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (tmp), "=&r" (tmp2), "+o" (*addr)
+	: "r" (addr), "Ir" (mask)
+	: "cc");
+}
+
+#define atomic_xchg(v, new) (xchg(&((v)->counter), new))
+
+static inline int __atomic_add_unless(atomic_t *v, int a, int u)
+{
+	int c, old;
+
+	c = atomic_read(v);
+	while (c != u && (old = atomic_cmpxchg((v), c, c + a)) != c)
+		c = old;
+	return c;
+}
+
+#define atomic_inc(v)		atomic_add(1, v)
+#define atomic_dec(v)		atomic_sub(1, v)
+
+#define atomic_inc_and_test(v)	(atomic_add_return(1, v) == 0)
+#define atomic_dec_and_test(v)	(atomic_sub_return(1, v) == 0)
+#define atomic_inc_return(v)    (atomic_add_return(1, v))
+#define atomic_dec_return(v)    (atomic_sub_return(1, v))
+#define atomic_sub_and_test(i, v) (atomic_sub_return(i, v) == 0)
+
+#define atomic_add_negative(i,v) (atomic_add_return(i, v) < 0)
+
+#define smp_mb__before_atomic_dec()	smp_mb()
+#define smp_mb__after_atomic_dec()	smp_mb()
+#define smp_mb__before_atomic_inc()	smp_mb()
+#define smp_mb__after_atomic_inc()	smp_mb()
+
+/*
+ * 64-bit atomic operations.
+ */
+#define ATOMIC64_INIT(i) { (i) }
+
+#define atomic64_read(v)	(*(volatile long long *)&(v)->counter)
+#define atomic64_set(v,i)	(((v)->counter) = (i))
+
+static inline void atomic64_add(u64 i, atomic64_t *v)
+{
+	long result;
+	unsigned long tmp;
+
+	asm volatile("// atomic64_add\n"
+"1:	ldxr	%0, [%3]\n"
+"	add	%0, %0, %4\n"
+"	stxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+}
+
+static inline long atomic64_add_return(long i, atomic64_t *v)
+{
+	long result;
+	unsigned long tmp;
+
+	asm volatile("// atomic64_add_return\n"
+"1:	ldaxr	%0, [%3]\n"
+"	add	%0, %0, %4\n"
+"	stlxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+
+	return result;
+}
+
+static inline void atomic64_sub(u64 i, atomic64_t *v)
+{
+	long result;
+	unsigned long tmp;
+
+	asm volatile("// atomic64_sub\n"
+"1:	ldxr	%0, [%3]\n"
+"	sub	%0, %0, %4\n"
+"	stxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+}
+
+static inline long atomic64_sub_return(long i, atomic64_t *v)
+{
+	long result;
+	unsigned long tmp;
+
+	asm volatile("// atomic64_sub_return\n"
+"1:	ldaxr	%0, [%3]\n"
+"	sub	%0, %0, %4\n"
+"	stlxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter), "Ir" (i)
+	: "cc");
+
+	return result;
+}
+
+static inline long atomic64_cmpxchg(atomic64_t *ptr, long old, long new)
+{
+	long oldval;
+	unsigned long res;
+
+	asm volatile("// atomic64_cmpxchg\n"
+"1:	ldaxr	%1, [%3]\n"
+"	cmp	%1, %4\n"
+"	b.ne	2f\n"
+"	stlxr	%w0, %5, [%3]\n"
+"	cbnz	%w0, 1b\n"
+"2:"
+	: "=&r" (res), "=&r" (oldval), "+o" (ptr->counter)
+	: "r" (&ptr->counter), "Ir" (old), "r" (new)
+	: "cc");
+
+	return oldval;
+}
+
+#define atomic64_xchg(v, new) (xchg(&((v)->counter), new))
+
+#define ARCH_HAS_ATOMIC64_DEC_IF_POSITIVE
+static inline long atomic64_dec_if_positive(atomic64_t *v)
+{
+	long result;
+	unsigned long tmp;
+
+	asm volatile("// atomic64_dec_if_positive\n"
+"1:	ldaxr	%0, [%3]\n"
+"	subs	%0, %0, #1\n"
+"	b.mi	2f\n"
+"	stlxr	%w1, %0, [%3]\n"
+"	cbnz	%w1, 1b\n"
+"2:"
+	: "=&r" (result), "=&r" (tmp), "+o" (v->counter)
+	: "r" (&v->counter)
+	: "cc");
+
+	return result;
+}
+
+static inline int atomic64_add_unless(atomic64_t *v, long a, long u)
+{
+	long c, old;
+
+	c = atomic64_read(v);
+	while (c != u && (old = atomic64_cmpxchg((v), c, c + a)) != c)
+		c = old;
+
+	return c != u;
+}
+
+#define atomic64_add_negative(a, v)	(atomic64_add_return((a), (v)) < 0)
+#define atomic64_inc(v)			atomic64_add(1LL, (v))
+#define atomic64_inc_return(v)		atomic64_add_return(1LL, (v))
+#define atomic64_inc_and_test(v)	(atomic64_inc_return(v) == 0)
+#define atomic64_sub_and_test(a, v)	(atomic64_sub_return((a), (v)) == 0)
+#define atomic64_dec(v)			atomic64_sub(1LL, (v))
+#define atomic64_dec_return(v)		atomic64_sub_return(1LL, (v))
+#define atomic64_dec_and_test(v)	(atomic64_dec_return((v)) == 0)
+#define atomic64_inc_not_zero(v)	atomic64_add_unless((v), 1LL, 0LL)
+
+#endif
+#endif
diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
new file mode 100644
index 0000000..0745e82
--- /dev/null
+++ b/arch/arm64/include/asm/futex.h
@@ -0,0 +1,134 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_FUTEX_H
+#define __ASM_FUTEX_H
+
+#ifdef __KERNEL__
+
+#include <linux/futex.h>
+#include <linux/uaccess.h>
+#include <asm/errno.h>
+
+#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg)		\
+	asm volatile(							\
+"1:	ldaxr	%w1, %2\n"						\
+	insn "\n"							\
+"2:	stlxr	%w3, %w0, %2\n"						\
+"	cbnz	%w3, 1b\n"						\
+"3:	.pushsection __ex_table,\"a\"\n"				\
+"	.align	3\n"							\
+"	.quad	1b, 4f, 2b, 4f\n"					\
+"	.popsection\n"							\
+"	.pushsection .fixup,\"ax\"\n"					\
+"4:	mov	%w0, %w5\n"						\
+"	b	3b\n"							\
+"	.popsection"							\
+	: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp)	\
+	: "r" (oparg), "Ir" (-EFAULT)					\
+	: "cc")
+
+static inline int
+futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
+{
+	int op = (encoded_op >> 28) & 7;
+	int cmp = (encoded_op >> 24) & 15;
+	int oparg = (encoded_op << 8) >> 20;
+	int cmparg = (encoded_op << 20) >> 20;
+	int oldval = 0, ret, tmp;
+
+	if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
+		oparg = 1 << oparg;
+
+	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
+		return -EFAULT;
+
+	pagefault_disable();	/* implies preempt_disable() */
+
+	switch (op) {
+	case FUTEX_OP_SET:
+		__futex_atomic_op("mov	%w0, %w4",
+				  ret, oldval, uaddr, tmp, oparg);
+		break;
+	case FUTEX_OP_ADD:
+		__futex_atomic_op("add	%w0, %w1, %w4",
+				  ret, oldval, uaddr, tmp, oparg);
+		break;
+	case FUTEX_OP_OR:
+		__futex_atomic_op("orr	%w0, %w1, %w4",
+				  ret, oldval, uaddr, tmp, oparg);
+		break;
+	case FUTEX_OP_ANDN:
+		__futex_atomic_op("and	%w0, %w1, %w4",
+				  ret, oldval, uaddr, tmp, ~oparg);
+		break;
+	case FUTEX_OP_XOR:
+		__futex_atomic_op("eor	%w0, %w1, %w4",
+				  ret, oldval, uaddr, tmp, oparg);
+		break;
+	default:
+		ret = -ENOSYS;
+	}
+
+	pagefault_enable();	/* subsumes preempt_enable() */
+
+	if (!ret) {
+		switch (cmp) {
+		case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
+		case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
+		case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
+		case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
+		case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
+		case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
+		default: ret = -ENOSYS;
+		}
+	}
+	return ret;
+}
+
+static inline int
+futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
+			      u32 oldval, u32 newval)
+{
+	int ret = 0;
+	u32 val, tmp;
+
+	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
+		return -EFAULT;
+
+	asm volatile("// futex_atomic_cmpxchg_inatomic\n"
+"1:	ldaxr	%w1, %2\n"
+"	sub	%w3, %w1, %w4\n"
+"	cbnz	%w3, 3f\n"
+"2:	stlxr	%w3, %w5, %2\n"
+"	cbnz	%w3, 1b\n"
+"3:	.pushsection __ex_table,\"a\"\n"
+"	.align	3\n"
+"	.quad	1b, 4f, 2b, 4f\n"
+"	.popsection\n"
+"	.pushsection .fixup,\"ax\"\n"
+"4:	mov	%w0, %w6\n"
+"	b	3b\n"
+"	.popsection"
+	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
+	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
+	: "cc", "memory");
+
+	*uval = val;
+	return ret;
+}
+
+#endif /* __KERNEL__ */
+#endif /* __ASM_FUTEX_H */
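
The field extraction at the top of futex_atomic_op_inuser() relies on shifting a signed int to sign-extend the two 12-bit arguments. A portable sketch of the same decoding, with the sign extension written out explicitly, is shown here. The struct and function names are hypothetical, and the handling of the FUTEX_OP_OPARG_SHIFT flag is deliberately left out.

```c
#include <stdint.h>

/* Hypothetical container for the decoded fields. */
struct futex_op_fields {
	int op;		/* bits [30:28] */
	int cmp;	/* bits [27:24] */
	int oparg;	/* bits [23:12], sign-extended */
	int cmparg;	/* bits [11:0], sign-extended */
};

/* Sign-extend the low 12 bits of x without relying on signed shifts. */
static inline int sext12(uint32_t x)
{
	x &= 0xfff;
	return (int)(x ^ 0x800) - 0x800;
}

static inline struct futex_op_fields decode_futex_op(uint32_t encoded_op)
{
	struct futex_op_fields f;

	f.op = (encoded_op >> 28) & 7;
	f.cmp = (encoded_op >> 24) & 15;
	f.oparg = sext12(encoded_op >> 12);
	f.cmparg = sext12(encoded_op);
	return f;
}
```

For example, the encoded value 0x14fff003 decodes to op 1, cmp 4, oparg -1 and cmparg 3.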



* [PATCH v2 13/31] arm64: Device specific operations
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (11 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 12/31] arm64: Atomic operations Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15  0:33   ` Olof Johansson
                     ` (2 more replies)
  2012-08-14 17:52 ` [PATCH v2 14/31] arm64: DMA mapping API Catalin Marinas
                   ` (18 subsequent siblings)
  31 siblings, 3 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds several definitions for device communication, including
I/O accessors and ioremap(). The __raw_* accessors are implemented as
inline asm to stop the compiler from generating post-indexed accesses,
which are less efficient to emulate in a virtualised environment.
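
The layering of the accessors (raw access, then little-endian conversion, then a barrier) can be sketched in user-space C against ordinary memory. This is an illustration under assumptions, not the kernel code: the sketch_* names are hypothetical, a volatile load stands in for the inline asm, and a C11 acquire fence stands in for __iormb(), which is a stronger barrier in the kernel.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

/* Interpret a raw 32-bit value as little-endian on any host. */
static inline uint32_t sketch_le32_to_cpu(uint32_t raw)
{
	unsigned char b[4];

	memcpy(b, &raw, sizeof(b));
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}

/* Stand-in for __raw_readl(): one native-endian 32-bit load. */
static inline uint32_t sketch_raw_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Stand-in for readl_relaxed(): raw load plus endian conversion. */
static inline uint32_t sketch_readl_relaxed(const volatile void *addr)
{
	return sketch_le32_to_cpu(sketch_raw_readl(addr));
}

/* Stand-in for readl(): relaxed read followed by a read barrier. */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t v = sketch_readl_relaxed(addr);

	atomic_thread_fence(memory_order_acquire);	/* approximates __iormb() */
	return v;
}
```

The write side would mirror this, with the barrier issued before the store rather than after the load.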

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/device.h |   26 ++++
 arch/arm64/include/asm/fb.h     |   34 +++++
 arch/arm64/include/asm/io.h     |  263 +++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/io.c          |   64 ++++++++++
 arch/arm64/mm/ioremap.c         |   84 +++++++++++++
 5 files changed, 471 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/device.h
 create mode 100644 arch/arm64/include/asm/fb.h
 create mode 100644 arch/arm64/include/asm/io.h
 create mode 100644 arch/arm64/kernel/io.c
 create mode 100644 arch/arm64/mm/ioremap.c

diff --git a/arch/arm64/include/asm/device.h b/arch/arm64/include/asm/device.h
new file mode 100644
index 0000000..0d8453c
--- /dev/null
+++ b/arch/arm64/include/asm/device.h
@@ -0,0 +1,26 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_DEVICE_H
+#define __ASM_DEVICE_H
+
+struct dev_archdata {
+	struct dma_map_ops *dma_ops;
+};
+
+struct pdev_archdata {
+};
+
+#endif
diff --git a/arch/arm64/include/asm/fb.h b/arch/arm64/include/asm/fb.h
new file mode 100644
index 0000000..adb88a6
--- /dev/null
+++ b/arch/arm64/include/asm/fb.h
@@ -0,0 +1,34 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_FB_H_
+#define __ASM_FB_H_
+
+#include <linux/fb.h>
+#include <linux/fs.h>
+#include <asm/page.h>
+
+static inline void fb_pgprotect(struct file *file, struct vm_area_struct *vma,
+				unsigned long off)
+{
+	vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot);
+}
+
+static inline int fb_is_primary_device(struct fb_info *info)
+{
+	return 0;
+}
+
+#endif /* __ASM_FB_H_ */
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
new file mode 100644
index 0000000..48fa83f
--- /dev/null
+++ b/arch/arm64/include/asm/io.h
@@ -0,0 +1,263 @@
+/*
+ * Based on arch/arm/include/asm/io.h
+ *
+ * Copyright (C) 1996-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_IO_H
+#define __ASM_IO_H
+
+#ifdef __KERNEL__
+
+#include <linux/types.h>
+
+#include <asm/byteorder.h>
+#include <asm/barrier.h>
+#include <asm/pgtable.h>
+
+/*
+ * Generic IO read/write.  These perform native-endian accesses.
+ */
+static inline void __raw_writeb(u8 val, volatile void __iomem *addr)
+{
+	asm volatile("strb %w0, [%1]" : : "r" (val), "r" (addr));
+}
+
+static inline void __raw_writew(u16 val, volatile void __iomem *addr)
+{
+	asm volatile("strh %w0, [%1]" : : "r" (val), "r" (addr));
+}
+
+static inline void __raw_writel(u32 val, volatile void __iomem *addr)
+{
+	asm volatile("str %w0, [%1]" : : "r" (val), "r" (addr));
+}
+
+static inline void __raw_writeq(u64 val, volatile void __iomem *addr)
+{
+	asm volatile("str %0, [%1]" : : "r" (val), "r" (addr));
+}
+
+static inline u8 __raw_readb(const volatile void __iomem *addr)
+{
+	u8 val;
+	asm volatile("ldrb %w0, [%1]" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+static inline u16 __raw_readw(const volatile void __iomem *addr)
+{
+	u16 val;
+	asm volatile("ldrh %w0, [%1]" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+static inline u32 __raw_readl(const volatile void __iomem *addr)
+{
+	u32 val;
+	asm volatile("ldr %w0, [%1]" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+static inline u64 __raw_readq(const volatile void __iomem *addr)
+{
+	u64 val;
+	asm volatile("ldr %0, [%1]" : "=r" (val) : "r" (addr));
+	return val;
+}
+
+/* IO barriers */
+#define __iormb()		rmb()
+#define __iowmb()		wmb()
+
+#define mmiowb()		do { } while (0)
+
+/*
+ * Relaxed I/O memory access primitives. These follow the Device memory
+ * ordering rules but do not guarantee any ordering relative to Normal memory
+ * accesses.
+ */
+#define readb_relaxed(c)	({ u8  __v = __raw_readb(c); __v; })
+#define readw_relaxed(c)	({ u16 __v = le16_to_cpu((__force __le16)__raw_readw(c)); __v; })
+#define readl_relaxed(c)	({ u32 __v = le32_to_cpu((__force __le32)__raw_readl(c)); __v; })
+
+#define writeb_relaxed(v,c)	((void)__raw_writeb((v),(c)))
+#define writew_relaxed(v,c)	((void)__raw_writew((__force u16)cpu_to_le16(v),(c)))
+#define writel_relaxed(v,c)	((void)__raw_writel((__force u32)cpu_to_le32(v),(c)))
+
+/*
+ * I/O memory access primitives. Reads are ordered relative to any
+ * following Normal memory access. Writes are ordered relative to any prior
+ * Normal memory access.
+ */
+#define readb(c)		({ u8  __v = readb_relaxed(c); __iormb(); __v; })
+#define readw(c)		({ u16 __v = readw_relaxed(c); __iormb(); __v; })
+#define readl(c)		({ u32 __v = readl_relaxed(c); __iormb(); __v; })
+
+#define writeb(v,c)		({ __iowmb(); writeb_relaxed((v),(c)); })
+#define writew(v,c)		({ __iowmb(); writew_relaxed((v),(c)); })
+#define writel(v,c)		({ __iowmb(); writel_relaxed((v),(c)); })
+
+/*
+ *  I/O port access primitives.
+ */
+#define IO_SPACE_LIMIT		0xffff
+
+/*
+ * We currently don't have any platform with PCI support, so just leave this
+ * defined to 0 until needed.
+ */
+#define PCI_IOBASE		((void __iomem *)0)
+
+static inline u8 inb(unsigned long addr)
+{
+	return readb(addr + PCI_IOBASE);
+}
+
+static inline u16 inw(unsigned long addr)
+{
+	return readw(addr + PCI_IOBASE);
+}
+
+static inline u32 inl(unsigned long addr)
+{
+	return readl(addr + PCI_IOBASE);
+}
+
+static inline void outb(u8 b, unsigned long addr)
+{
+	writeb(b, addr + PCI_IOBASE);
+}
+
+static inline void outw(u16 b, unsigned long addr)
+{
+	writew(b, addr + PCI_IOBASE);
+}
+
+static inline void outl(u32 b, unsigned long addr)
+{
+	writel(b, addr + PCI_IOBASE);
+}
+
+#define inb_p(addr)	inb(addr)
+#define inw_p(addr)	inw(addr)
+#define inl_p(addr)	inl(addr)
+
+#define outb_p(x, addr)	outb((x), (addr))
+#define outw_p(x, addr)	outw((x), (addr))
+#define outl_p(x, addr)	outl((x), (addr))
+
+static inline void insb(unsigned long addr, void *buffer, int count)
+{
+	u8 *buf = buffer;
+	while (count--)
+		*buf++ = __raw_readb(addr + PCI_IOBASE);
+}
+
+static inline void insw(unsigned long addr, void *buffer, int count)
+{
+	u16 *buf = buffer;
+	while (count--)
+		*buf++ = __raw_readw(addr + PCI_IOBASE);
+}
+
+static inline void insl(unsigned long addr, void *buffer, int count)
+{
+	u32 *buf = buffer;
+	while (count--)
+		*buf++ = __raw_readl(addr + PCI_IOBASE);
+}
+
+static inline void outsb(unsigned long addr, const void *buffer, int count)
+{
+	const u8 *buf = buffer;
+	while (count--)
+		__raw_writeb(*buf++, addr + PCI_IOBASE);
+}
+
+static inline void outsw(unsigned long addr, const void *buffer, int count)
+{
+	const u16 *buf = buffer;
+	while (count--)
+		__raw_writew(*buf++, addr + PCI_IOBASE);
+}
+
+static inline void outsl(unsigned long addr, const void *buffer, int count)
+{
+	const u32 *buf = buffer;
+	while (count--)
+		__raw_writel(*buf++, addr + PCI_IOBASE);
+}
+
+#define insb_p(port,to,len)	insb(port,to,len)
+#define insw_p(port,to,len)	insw(port,to,len)
+#define insl_p(port,to,len)	insl(port,to,len)
+
+#define outsb_p(port,from,len)	outsb(port,from,len)
+#define outsw_p(port,from,len)	outsw(port,from,len)
+#define outsl_p(port,from,len)	outsl(port,from,len)
+
+/*
+ * String version of I/O memory access operations.
+ */
+extern void __memcpy_fromio(void *, const volatile void __iomem *, size_t);
+extern void __memcpy_toio(volatile void __iomem *, const void *, size_t);
+extern void __memset_io(volatile void __iomem *, int, size_t);
+
+#define memset_io(c,v,l)	__memset_io((c),(v),(l))
+#define memcpy_fromio(a,c,l)	__memcpy_fromio((a),(c),(l))
+#define memcpy_toio(c,a,l)	__memcpy_toio((c),(a),(l))
+
+/*
+ * I/O memory mapping functions.
+ */
+extern void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot);
+extern void __iounmap(volatile void __iomem *addr);
+
+#define PROT_DEFAULT		(PTE_TYPE_PAGE | PTE_AF | PTE_DIRTY)
+#define PROT_DEVICE_nGnRE	(PROT_DEFAULT | PTE_XN | PTE_ATTRINDX(MT_DEVICE_nGnRE))
+#define PROT_NORMAL_NC		(PROT_DEFAULT | PTE_ATTRINDX(MT_NORMAL_NC))
+
+#define ioremap(addr, size)		__ioremap((addr), (size), PROT_DEVICE_nGnRE)
+#define ioremap_nocache(addr, size)	__ioremap((addr), (size), PROT_DEVICE_nGnRE)
+#define ioremap_wc(addr, size)		__ioremap((addr), (size), PROT_NORMAL_NC)
+#define iounmap				__iounmap
+
+#define ARCH_HAS_IOREMAP_WC
+#include <asm-generic/iomap.h>
+
+/*
+ * More restrictive address range checking than the default implementation
+ * (PHYS_OFFSET and PHYS_MASK taken into account).
+ */
+#define ARCH_HAS_VALID_PHYS_ADDR_RANGE
+extern int valid_phys_addr_range(unsigned long addr, size_t size);
+extern int valid_mmap_phys_addr_range(unsigned long pfn, size_t size);
+
+extern int devmem_is_allowed(unsigned long pfn);
+
+/*
+ * Convert a physical pointer to a virtual kernel pointer for /dev/mem
+ * access
+ */
+#define xlate_dev_mem_ptr(p)	__va(p)
+
+/*
+ * Convert a virtual cached pointer to an uncached pointer
+ */
+#define xlate_dev_kmem_ptr(p)	p
+
+#endif	/* __KERNEL__ */
+#endif	/* __ASM_IO_H */
diff --git a/arch/arm64/kernel/io.c b/arch/arm64/kernel/io.c
new file mode 100644
index 0000000..7d37ead
--- /dev/null
+++ b/arch/arm64/kernel/io.c
@@ -0,0 +1,64 @@
+/*
+ * Based on arch/arm/kernel/io.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/types.h>
+#include <linux/io.h>
+
+/*
+ * Copy data from IO memory space to "real" memory space.
+ */
+void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
+{
+	unsigned char *t = to;
+	while (count) {
+		count--;
+		*t = readb(from);
+		t++;
+		from++;
+	}
+}
+EXPORT_SYMBOL(__memcpy_fromio);
+
+/*
+ * Copy data from "real" memory space to IO memory space.
+ */
+void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count)
+{
+	const unsigned char *f = from;
+	while (count) {
+		count--;
+		writeb(*f, to);
+		f++;
+		to++;
+	}
+}
+EXPORT_SYMBOL(__memcpy_toio);
+
+/*
+ * "memset" on IO memory space.
+ */
+void __memset_io(volatile void __iomem *dst, int c, size_t count)
+{
+	while (count) {
+		count--;
+		writeb(c, dst);
+		dst++;
+	}
+}
+EXPORT_SYMBOL(__memset_io);
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
new file mode 100644
index 0000000..1725cd6
--- /dev/null
+++ b/arch/arm64/mm/ioremap.c
@@ -0,0 +1,84 @@
+/*
+ * Based on arch/arm/mm/ioremap.c
+ *
+ * (C) Copyright 1995 1996 Linus Torvalds
+ * Hacked for ARM by Phil Blundell <philb@gnu.org>
+ * Hacked to allow all architectures to build, and various cleanups
+ * by Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/mm.h>
+#include <linux/vmalloc.h>
+#include <linux/io.h>
+
+static void __iomem *__ioremap_caller(phys_addr_t phys_addr, size_t size,
+				      pgprot_t prot, void *caller)
+{
+	unsigned long last_addr;
+	unsigned long offset = phys_addr & ~PAGE_MASK;
+	int err;
+	unsigned long addr;
+	struct vm_struct *area;
+
+	/*
+	 * Page align the mapping address and size, taking account of any
+	 * offset.
+	 */
+	phys_addr &= PAGE_MASK;
+	size = PAGE_ALIGN(size + offset);
+
+	/*
+	 * Don't allow wraparound, zero size or outside PHYS_MASK.
+	 */
+	last_addr = phys_addr + size - 1;
+	if (!size || last_addr < phys_addr || (last_addr & ~PHYS_MASK))
+		return NULL;
+
+	/*
+	 * Don't allow RAM to be mapped.
+	 */
+	if (WARN_ON(pfn_valid(__phys_to_pfn(phys_addr))))
+		return NULL;
+
+	area = get_vm_area_caller(size, VM_IOREMAP, caller);
+	if (!area)
+		return NULL;
+	addr = (unsigned long)area->addr;
+
+	err = ioremap_page_range(addr, addr + size, phys_addr, prot);
+	if (err) {
+		vunmap((void *)addr);
+		return NULL;
+	}
+
+	return (void __iomem *)(offset + addr);
+}
+
+void __iomem *__ioremap(phys_addr_t phys_addr, size_t size, pgprot_t prot)
+{
+	return __ioremap_caller(phys_addr, size, prot,
+				__builtin_return_address(0));
+}
+EXPORT_SYMBOL(__ioremap);
+
+void __iounmap(volatile void __iomem *io_addr)
+{
+	void *addr = (void *)(PAGE_MASK & (unsigned long)io_addr);
+
+	vunmap(addr);
+}
+EXPORT_SYMBOL(__iounmap);
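
Illustrative note (not part of the patch): the page-alignment and range
checks in __ioremap_caller() above can be tried out in user space. The
sketch below is only an approximation under stated assumptions: a 4KB
page size and a 40-bit physical address mask are assumed for the
example, and SKETCH_PAGE_SIZE, SKETCH_PHYS_MASK, struct sketch_range and
sketch_ioremap_range() are hypothetical names that do not exist in the
kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: 4KB pages, 40-bit physical addresses. */
#define SKETCH_PAGE_SIZE	UINT64_C(4096)
#define SKETCH_PAGE_MASK	(~(SKETCH_PAGE_SIZE - 1))
#define SKETCH_PHYS_MASK	((UINT64_C(1) << 40) - 1)

/* Hypothetical result type: the page-aligned range that would be mapped. */
struct sketch_range {
	uint64_t phys;		/* page-aligned physical start */
	uint64_t size;		/* page-aligned length covering the request */
	uint64_t offset;	/* offset of the request within the first page */
};

/*
 * Mirrors the arithmetic of __ioremap_caller(): align the start down,
 * round the size up to cover the in-page offset, then reject zero size,
 * wraparound and addresses outside the physical mask.
 */
static bool sketch_ioremap_range(uint64_t phys_addr, uint64_t size,
				 struct sketch_range *out)
{
	uint64_t offset = phys_addr & ~SKETCH_PAGE_MASK;
	uint64_t last_addr;

	phys_addr &= SKETCH_PAGE_MASK;
	/* Equivalent of PAGE_ALIGN(size + offset). */
	size = (size + offset + SKETCH_PAGE_SIZE - 1) & SKETCH_PAGE_MASK;

	last_addr = phys_addr + size - 1;
	if (!size || last_addr < phys_addr || (last_addr & ~SKETCH_PHYS_MASK))
		return false;

	out->phys = phys_addr;
	out->size = size;
	out->offset = offset;
	return true;
}
```

Under these assumptions, a 16-byte request at 0x10000ff8 straddles a
page boundary and therefore needs two pages starting at 0x10000000; the
caller gets back the mapped base plus the 0xff8 offset.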



* [PATCH v2 14/31] arm64: DMA mapping API
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (12 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 13/31] arm64: Device specific operations Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15  0:40   ` Olof Johansson
  2012-08-15 16:16   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 15/31] arm64: SMP support Catalin Marinas
                   ` (17 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann

This patch adds support for the DMA mapping API. It uses dma_map_ops for
flexibility and it currently supports swiotlb. This patch could be
simplified further if the DMA accesses are coherent (not mandated by the
architecture) or if corresponding hooks are placed in the generic
swiotlb code to deal with cache maintenance.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/dma-mapping.h |  124 ++++++++++++++++++++
 arch/arm64/mm/dma-mapping.c          |  208 ++++++++++++++++++++++++++++++++++
 2 files changed, 332 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/dma-mapping.h
 create mode 100644 arch/arm64/mm/dma-mapping.c
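
Illustrative note (not part of the patch): dma_capable() in the header
added by this patch decides whether a device can reach a buffer by
comparing the address of the buffer's last byte with the device's DMA
mask. A minimal user-space sketch of that comparison follows. The names
sketch_dma_bit_mask() and sketch_dma_capable() are hypothetical, and the
mask is passed as a plain pointer rather than through struct device.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: a mask with the low n bits set, for 1 <= n <= 64. */
static uint64_t sketch_dma_bit_mask(unsigned int n)
{
	return n == 64 ? ~UINT64_C(0) : (UINT64_C(1) << n) - 1;
}

/*
 * Same test as dma_capable(): a device without a mask cannot DMA at all;
 * otherwise the last byte of the buffer must not exceed the mask.
 */
static bool sketch_dma_capable(const uint64_t *dma_mask, uint64_t addr,
			       size_t size)
{
	if (!dma_mask)
		return false;

	return addr + size - 1 <= *dma_mask;
}
```

With a 32-bit mask, a 4KB buffer ending exactly at 0xffffffff is
reachable, while one that is a single byte longer is not and would have
to be bounced.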

diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
new file mode 100644
index 0000000..538f4b4
--- /dev/null
+++ b/arch/arm64/include/asm/dma-mapping.h
@@ -0,0 +1,124 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_DMA_MAPPING_H
+#define __ASM_DMA_MAPPING_H
+
+#ifdef __KERNEL__
+
+#include <linux/types.h>
+#include <linux/vmalloc.h>
+
+#include <asm-generic/dma-coherent.h>
+
+#define ARCH_HAS_DMA_GET_REQUIRED_MASK
+
+extern struct dma_map_ops *dma_ops;
+
+static inline struct dma_map_ops *get_dma_ops(struct device *dev)
+{
+	if (unlikely(!dev) || !dev->archdata.dma_ops)
+		return dma_ops;
+	else
+		return dev->archdata.dma_ops;
+}
+
+#include <asm-generic/dma-mapping-common.h>
+
+static inline dma_addr_t phys_to_dma(struct device *dev, phys_addr_t paddr)
+{
+	return (dma_addr_t)paddr;
+}
+
+static inline phys_addr_t dma_to_phys(struct device *dev, dma_addr_t dev_addr)
+{
+	return (phys_addr_t)dev_addr;
+}
+
+static inline int dma_mapping_error(struct device *dev, dma_addr_t dev_addr)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	return ops->mapping_error(dev, dev_addr);
+}
+
+static inline int dma_supported(struct device *dev, u64 mask)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	return ops->dma_supported(dev, mask);
+}
+
+static inline int dma_set_mask(struct device *dev, u64 mask)
+{
+	if (!dev->dma_mask || !dma_supported(dev, mask))
+		return -EIO;
+	*dev->dma_mask = mask;
+
+	return 0;
+}
+
+static inline bool dma_capable(struct device *dev, dma_addr_t addr, size_t size)
+{
+	if (!dev->dma_mask)
+		return 0;
+
+	return addr + size - 1 <= *dev->dma_mask;
+}
+
+static inline void dma_mark_clean(void *addr, size_t size)
+{
+}
+
+static inline void *dma_alloc_coherent(struct device *dev, size_t size,
+				       dma_addr_t *dma_handle, gfp_t flags)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+	void *vaddr;
+
+	if (dma_alloc_from_coherent(dev, size, dma_handle, &vaddr))
+		return vaddr;
+
+	vaddr = ops->alloc(dev, size, dma_handle, flags, NULL);
+	debug_dma_alloc_coherent(dev, size, *dma_handle, vaddr);
+	return vaddr;
+}
+
+static inline void dma_free_coherent(struct device *dev, size_t size,
+				     void *vaddr, dma_addr_t dev_addr)
+{
+	struct dma_map_ops *ops = get_dma_ops(dev);
+
+	if (dma_release_from_coherent(dev, get_order(size), vaddr))
+		return;
+
+	debug_dma_free_coherent(dev, size, vaddr, dev_addr);
+	ops->free(dev, size, vaddr, dev_addr, NULL);
+}
+
+/*
+ * There is no dma_cache_sync() implementation, so just return NULL here.
+ */
+static inline void *dma_alloc_noncoherent(struct device *dev, size_t size,
+					  dma_addr_t *handle, gfp_t flags)
+{
+	return NULL;
+}
+
+static inline void dma_free_noncoherent(struct device *dev, size_t size,
+					void *cpu_addr, dma_addr_t handle)
+{
+}
+
+#endif	/* __KERNEL__ */
+#endif	/* __ASM_DMA_MAPPING_H */
diff --git a/arch/arm64/mm/dma-mapping.c b/arch/arm64/mm/dma-mapping.c
new file mode 100644
index 0000000..4e5871d
--- /dev/null
+++ b/arch/arm64/mm/dma-mapping.c
@@ -0,0 +1,208 @@
+/*
+ * SWIOTLB-based DMA API implementation
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/gfp.h>
+#include <linux/export.h>
+#include <linux/slab.h>
+#include <linux/dma-mapping.h>
+#include <linux/vmalloc.h>
+#include <linux/swiotlb.h>
+
+#include <asm/cacheflush.h>
+
+struct dma_map_ops *dma_ops;
+EXPORT_SYMBOL(dma_ops);
+
+static void *arm64_swiotlb_alloc_coherent(struct device *dev, size_t size,
+					  dma_addr_t *dma_handle, gfp_t flags,
+					  struct dma_attrs *attrs)
+{
+	struct page *page, **map;
+	void *ptr;
+	int order = get_order(size);
+	int i;
+
+	if (dev->coherent_dma_mask != DMA_BIT_MASK(64))
+		flags |= GFP_DMA;
+
+	ptr = swiotlb_alloc_coherent(dev, size, dma_handle, flags);
+	if (!ptr)
+		goto no_mem;
+	map = kmalloc(sizeof(struct page *) << order, flags & ~GFP_DMA);
+	if (!map)
+		goto no_map;
+
+	/* remove any dirty cache lines on the kernel alias */
+	dmac_flush_range(ptr, ptr + size);
+
+	/* create a coherent mapping */
+	page = virt_to_page(ptr);
+	for (i = 0; i < (size >> PAGE_SHIFT); i++)
+		map[i] = page + i;
+	ptr = vmap(map, size >> PAGE_SHIFT, VM_MAP,
+		   pgprot_dmacoherent(pgprot_default)); kfree(map);
+	if (!ptr)
+		goto no_map;
+
+	return ptr;
+
+no_map:
+	swiotlb_free_coherent(dev, size, ptr, *dma_handle);
+no_mem:
+	*dma_handle = ~0;
+	return NULL;
+}
+
+static void arm64_swiotlb_free_coherent(struct device *dev, size_t size,
+					void *vaddr, dma_addr_t dma_handle,
+					struct dma_attrs *attrs)
+{
+	vunmap(vaddr);
+	swiotlb_free_coherent(dev, size, vaddr, dma_handle);
+}
+
+static dma_addr_t arm64_swiotlb_map_page(struct device *dev,
+					 struct page *page,
+					 unsigned long offset, size_t size,
+					 enum dma_data_direction dir,
+					 struct dma_attrs *attrs)
+{
+	dma_addr_t dev_addr;
+
+	dev_addr = swiotlb_map_page(dev, page, offset, size, dir, attrs);
+	dmac_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
+
+	return dev_addr;
+}
+
+
+static void arm64_swiotlb_unmap_page(struct device *dev, dma_addr_t dev_addr,
+				     size_t size, enum dma_data_direction dir,
+				     struct dma_attrs *attrs)
+{
+	dmac_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
+	swiotlb_unmap_page(dev, dev_addr, size, dir, attrs);
+}
+
+static int arm64_swiotlb_map_sg_attrs(struct device *dev,
+				      struct scatterlist *sgl, int nelems,
+				      enum dma_data_direction dir,
+				      struct dma_attrs *attrs)
+{
+	struct scatterlist *sg;
+	int i, ret;
+
+	ret = swiotlb_map_sg_attrs(dev, sgl, nelems, dir, attrs);
+	for_each_sg(sgl, sg, ret, i)
+		dmac_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
+			      sg->length, dir);
+
+	return ret;
+}
+
+static void arm64_swiotlb_unmap_sg_attrs(struct device *dev,
+					 struct scatterlist *sgl, int nelems,
+					 enum dma_data_direction dir,
+					 struct dma_attrs *attrs)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nelems, i)
+		dmac_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
+				sg->length, dir);
+	swiotlb_unmap_sg_attrs(dev, sgl, nelems, dir, attrs);
+}
+
+static void arm64_swiotlb_sync_single_for_cpu(struct device *dev,
+					      dma_addr_t dev_addr,
+					      size_t size,
+					      enum dma_data_direction dir)
+{
+	dmac_unmap_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
+	swiotlb_sync_single_for_cpu(dev, dev_addr, size, dir);
+}
+
+static void arm64_swiotlb_sync_single_for_device(struct device *dev,
+						 dma_addr_t dev_addr,
+						 size_t size,
+						 enum dma_data_direction dir)
+{
+	swiotlb_sync_single_for_device(dev, dev_addr, size, dir);
+	dmac_map_area(phys_to_virt(dma_to_phys(dev, dev_addr)), size, dir);
+}
+
+static void arm64_swiotlb_sync_sg_for_cpu(struct device *dev,
+					  struct scatterlist *sgl, int nelems,
+					  enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	for_each_sg(sgl, sg, nelems, i)
+		dmac_unmap_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
+				sg->length, dir);
+	swiotlb_sync_sg_for_cpu(dev, sgl, nelems, dir);
+}
+
+static void arm64_swiotlb_sync_sg_for_device(struct device *dev,
+					     struct scatterlist *sgl,
+					     int nelems,
+					     enum dma_data_direction dir)
+{
+	struct scatterlist *sg;
+	int i;
+
+	swiotlb_sync_sg_for_device(dev, sgl, nelems, dir);
+	for_each_sg(sgl, sg, nelems, i)
+		dmac_map_area(phys_to_virt(dma_to_phys(dev, sg->dma_address)),
+			      sg->length, dir);
+}
+
+static struct dma_map_ops arm64_swiotlb_dma_ops = {
+	.alloc = arm64_swiotlb_alloc_coherent,
+	.free = arm64_swiotlb_free_coherent,
+	.map_page = arm64_swiotlb_map_page,
+	.unmap_page = arm64_swiotlb_unmap_page,
+	.map_sg = arm64_swiotlb_map_sg_attrs,
+	.unmap_sg = arm64_swiotlb_unmap_sg_attrs,
+	.sync_single_for_cpu = arm64_swiotlb_sync_single_for_cpu,
+	.sync_single_for_device = arm64_swiotlb_sync_single_for_device,
+	.sync_sg_for_cpu = arm64_swiotlb_sync_sg_for_cpu,
+	.sync_sg_for_device = arm64_swiotlb_sync_sg_for_device,
+	.dma_supported = swiotlb_dma_supported,
+	.mapping_error = swiotlb_dma_mapping_error,
+};
+
+void __init swiotlb_init_with_default_size(size_t default_size, int verbose);
+
+void __init arm64_swiotlb_init(size_t max_size)
+{
+	dma_ops = &arm64_swiotlb_dma_ops;
+	swiotlb_init_with_default_size(min((size_t)SZ_64M, max_size), 1);
+}
+
+#define PREALLOC_DMA_DEBUG_ENTRIES	4096
+
+static int __init dma_debug_do_init(void)
+{
+	dma_debug_init(PREALLOC_DMA_DEBUG_ENTRIES);
+	return 0;
+}
+fs_initcall(dma_debug_do_init);
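
Illustrative note (not part of the patch): the arm64_swiotlb_* wrappers
above always call the swiotlb routine first and the cache maintenance
routine second on the map/sync-for-device side, and use the reverse
order on the unmap/sync-for-cpu side. The user-space sketch below only
records that call order in a string so it can be inspected. All
sketch_* names are hypothetical stand-ins and no real DMA or cache
maintenance takes place.

```c
#include <string.h>

/* Comma-separated trace of the steps taken, for illustration only. */
static char sketch_trace[64];

static void sketch_reset(void)
{
	sketch_trace[0] = '\0';
}

static void sketch_step(const char *name)
{
	size_t room;

	if (sketch_trace[0]) {
		room = sizeof(sketch_trace) - strlen(sketch_trace) - 1;
		strncat(sketch_trace, ",", room);
	}
	room = sizeof(sketch_trace) - strlen(sketch_trace) - 1;
	strncat(sketch_trace, name, room);
}

/* Stand-ins for the swiotlb and cache maintenance calls in the patch. */
static void sketch_swiotlb_map(void)	{ sketch_step("swiotlb_map"); }
static void sketch_swiotlb_unmap(void)	{ sketch_step("swiotlb_unmap"); }
static void sketch_dmac_map(void)	{ sketch_step("dmac_map_area"); }
static void sketch_dmac_unmap(void)	{ sketch_step("dmac_unmap_area"); }

/* Map side: set up the (possibly bounced) buffer, then do cache work. */
static void sketch_map_page(void)
{
	sketch_swiotlb_map();
	sketch_dmac_map();
}

/* Unmap side: do cache work first, then let swiotlb finish. */
static void sketch_unmap_page(void)
{
	sketch_dmac_unmap();
	sketch_swiotlb_unmap();
}
```

The ordering matters because the cache maintenance is applied to the
address that the device actually used, which is the bounce buffer
whenever swiotlb had to bounce.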



* [PATCH v2 15/31] arm64: SMP support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (13 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 14/31] arm64: DMA mapping API Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15  0:49   ` Olof Johansson
                     ` (2 more replies)
  2012-08-14 17:52 ` [PATCH v2 16/31] arm64: ELF definitions Catalin Marinas
                   ` (16 subsequent siblings)
  31 siblings, 3 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel
  Cc: linux-kernel, Arnd Bergmann, Will Deacon, Marc Zyngier

This patch adds SMP initialisation and spinlocks implementation for
AArch64. The spinlock support uses the new load-acquire/store-release
instructions to avoid explicit barriers. The architecture also specifies
that an event is automatically generated when clearing the exclusive
monitor state to wake up processors in WFE, so there is no need for an
explicit DSB/SEV instruction sequence. The SEVL instruction is used to
set the local event register so that the first WFE in the lock loop
returns immediately: there is no conditional WFE, and branching over it
would be more expensive.

For the SMP booting protocol, see Documentation/arm64/booting.txt.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/hardirq.h        |    5 +
 arch/arm64/include/asm/smp.h            |   69 +++++
 arch/arm64/include/asm/spinlock.h       |  199 +++++++++++++
 arch/arm64/include/asm/spinlock_types.h |   38 +++
 arch/arm64/kernel/smp.c                 |  469 +++++++++++++++++++++++++++++++
 5 files changed, 780 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/smp.h
 create mode 100644 arch/arm64/include/asm/spinlock.h
 create mode 100644 arch/arm64/include/asm/spinlock_types.h
 create mode 100644 arch/arm64/kernel/smp.c
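
Illustrative note (not part of the patch): the acquire/release pairing
described above can be approximated in portable user-space C11. The
sketch below is an analogue, not the kernel implementation: it uses
<stdatomic.h> and leaves instruction selection to the compiler, it
spins instead of waiting in WFE, and the sketch_* names are
hypothetical. Unlike the LDAXR/STXR trylock in the patch, the strong
compare-exchange used here does not fail spuriously.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical analogue of arch_spinlock_t: 0 is unlocked, 1 is locked. */
struct sketch_spinlock {
	atomic_uint lock;
};

/*
 * Try to move the lock from 0 to 1. Acquire ordering on success keeps
 * the critical section from being reordered before the lock is taken.
 */
static bool sketch_spin_trylock(struct sketch_spinlock *l)
{
	unsigned int expected = 0;

	return atomic_compare_exchange_strong_explicit(&l->lock, &expected, 1,
						       memory_order_acquire,
						       memory_order_relaxed);
}

static void sketch_spin_lock(struct sketch_spinlock *l)
{
	/* Busy-wait; the patch waits for an event in WFE instead. */
	while (!sketch_spin_trylock(l))
		;
}

/*
 * Release ordering makes all writes in the critical section visible
 * before the lock is seen as free, so no separate barrier is needed.
 */
static void sketch_spin_unlock(struct sketch_spinlock *l)
{
	atomic_store_explicit(&l->lock, 0, memory_order_release);
}

static bool sketch_spin_is_locked(struct sketch_spinlock *l)
{
	return atomic_load_explicit(&l->lock, memory_order_relaxed) != 0;
}
```

A second trylock on a held lock fails until the holder calls
sketch_spin_unlock(), which matches the "Unlocked value: 0, Locked
value: 1" convention used by the patch.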

diff --git a/arch/arm64/include/asm/hardirq.h b/arch/arm64/include/asm/hardirq.h
index c6c9514..5075463 100644
--- a/arch/arm64/include/asm/hardirq.h
+++ b/arch/arm64/include/asm/hardirq.h
@@ -20,8 +20,13 @@
 #include <linux/threads.h>
 #include <asm/irq.h>
 
+#define NR_IPI	4
+
 typedef struct {
 	unsigned int __softirq_pending;
+#ifdef CONFIG_SMP
+	unsigned int ipi_irqs[NR_IPI];
+#endif
 } ____cacheline_aligned irq_cpustat_t;
 
 #include <linux/irq_cpustat.h>	/* Standard mappings for irq_cpustat_t above */
diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
new file mode 100644
index 0000000..7e34295
--- /dev/null
+++ b/arch/arm64/include/asm/smp.h
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SMP_H
+#define __ASM_SMP_H
+
+#include <linux/threads.h>
+#include <linux/cpumask.h>
+#include <linux/thread_info.h>
+
+#ifndef CONFIG_SMP
+# error "<asm/smp.h> included in non-SMP build"
+#endif
+
+#define raw_smp_processor_id() (current_thread_info()->cpu)
+
+struct seq_file;
+
+/*
+ * generate IPI list text
+ */
+extern void show_ipi_list(struct seq_file *p, int prec);
+
+/*
+ * Called from C code, this handles an IPI.
+ */
+extern void handle_IPI(int ipinr, struct pt_regs *regs);
+
+/*
+ * Setup the set of possible CPUs (via set_cpu_possible)
+ */
+extern void smp_init_cpus(void);
+
+/*
+ * Provide a function to raise an IPI cross call on CPUs in callmap.
+ */
+extern void set_smp_cross_call(void (*)(const struct cpumask *, unsigned int));
+
+/*
+ * Called from the secondary holding pen, this is the secondary CPU entry point.
+ */
+asmlinkage void secondary_start_kernel(void);
+
+/*
+ * Initial data for bringing up a secondary CPU.
+ */
+struct secondary_data {
+	void *stack;
+};
+extern struct secondary_data secondary_data;
+extern void secondary_holding_pen(void);
+extern volatile unsigned long secondary_holding_pen_release;
+
+extern void arch_send_call_function_single_ipi(int cpu);
+extern void arch_send_call_function_ipi_mask(const struct cpumask *mask);
+
+#endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
new file mode 100644
index 0000000..34a37fb
--- /dev/null
+++ b/arch/arm64/include/asm/spinlock.h
@@ -0,0 +1,199 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SPINLOCK_H
+#define __ASM_SPINLOCK_H
+
+#include <asm/spinlock_types.h>
+#include <asm/processor.h>
+
+/*
+ * AArch64 Spin-locking.
+ *
+ * We exclusively read the old value.  If it is zero, we may have
+ * won the lock, so we try exclusively storing it.  Ordering is provided
+ * by the load-acquire when taking the lock and the store-release when
+ * releasing it, so no explicit memory barriers are needed.
+ *
+ * Unlocked value: 0
+ * Locked value: 1
+ */
+
+#define arch_spin_is_locked(x)		((x)->lock != 0)
+#define arch_spin_unlock_wait(lock) \
+	do { while (arch_spin_is_locked(lock)) cpu_relax(); } while (0)
+
+#define arch_spin_lock_flags(lock, flags) arch_spin_lock(lock)
+
+static inline void arch_spin_lock(arch_spinlock_t *lock)
+{
+	unsigned int tmp;
+
+	asm volatile(
+	"	sevl\n"
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, [%1]\n"
+	"	cbnz	%w0, 1b\n"
+	"	stxr	%w0, %w2, [%1]\n"
+	"	cbnz	%w0, 2b\n"
+	: "=&r" (tmp)
+	: "r" (&lock->lock), "r" (1)
+	: "memory");
+}
+
+static inline int arch_spin_trylock(arch_spinlock_t *lock)
+{
+	unsigned int tmp;
+
+	asm volatile(
+	"	ldaxr	%w0, [%1]\n"
+	"	cbnz	%w0, 1f\n"
+	"	stxr	%w0, %w2, [%1]\n"
+	"1:\n"
+	: "=&r" (tmp)
+	: "r" (&lock->lock), "r" (1)
+	: "memory");
+
+	return !tmp;
+}
+
+static inline void arch_spin_unlock(arch_spinlock_t *lock)
+{
+	asm volatile(
+	"	stlr	%w1, [%0]\n"
+	: : "r" (&lock->lock), "r" (0) : "memory");
+}
+
+/*
+ * RWLOCKS
+ *
+ *
+ * Write locks are easy - we just set bit 31.  When unlocking, we can
+ * just write zero since the lock is exclusively held.
+ */
+
+static inline void arch_write_lock(arch_rwlock_t *rw)
+{
+	unsigned int tmp;
+
+	asm volatile(
+	"	sevl\n"
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, [%1]\n"
+	"	cbnz	%w0, 1b\n"
+	"	stxr	%w0, %w2, [%1]\n"
+	"	cbnz	%w0, 2b\n"
+	: "=&r" (tmp)
+	: "r" (&rw->lock), "r" (0x80000000)
+	: "memory");
+}
+
+static inline int arch_write_trylock(arch_rwlock_t *rw)
+{
+	unsigned int tmp;
+
+	asm volatile(
+	"	ldaxr	%w0, [%1]\n"
+	"	cbnz	%w0, 1f\n"
+	"	stxr	%w0, %w2, [%1]\n"
+	"1:\n"
+	: "=&r" (tmp)
+	: "r" (&rw->lock), "r" (0x80000000)
+	: "memory");
+
+	return !tmp;
+}
+
+static inline void arch_write_unlock(arch_rwlock_t *rw)
+{
+	asm volatile(
+	"	stlr	%w1, [%0]\n"
+	: : "r" (&rw->lock), "r" (0) : "memory");
+}
+
+/* write_can_lock - would write_trylock() succeed? */
+#define arch_write_can_lock(x)		((x)->lock == 0)
+
+/*
+ * Read locks are a bit more hairy:
+ *  - Exclusively load the lock value.
+ *  - Increment it.
+ *  - Store new lock value if positive, and we still own this location.
+ *    If the value is negative, we've already failed.
+ *  - If we failed to store the value, we want a negative result.
+ *  - If we failed, try again.
+ * Unlocking is similarly hairy.  We may have multiple read locks
+ * currently active.  However, we know we won't have any write
+ * locks.
+ */
+static inline void arch_read_lock(arch_rwlock_t *rw)
+{
+	unsigned int tmp, tmp2;
+
+	asm volatile(
+	"	sevl\n"
+	"1:	wfe\n"
+	"2:	ldaxr	%w0, [%2]\n"
+	"	add	%w0, %w0, #1\n"
+	"	tbnz	%w0, #31, 1b\n"
+	"	stxr	%w1, %w0, [%2]\n"
+	"	cbnz	%w1, 2b\n"
+	: "=&r" (tmp), "=&r" (tmp2)
+	: "r" (&rw->lock)
+	: "memory");
+}
+
+static inline void arch_read_unlock(arch_rwlock_t *rw)
+{
+	unsigned int tmp, tmp2;
+
+	asm volatile(
+	"1:	ldxr	%w0, [%2]\n"
+	"	sub	%w0, %w0, #1\n"
+	"	stlxr	%w1, %w0, [%2]\n"
+	"	cbnz	%w1, 1b\n"
+	: "=&r" (tmp), "=&r" (tmp2)
+	: "r" (&rw->lock)
+	: "memory");
+}
+
+static inline int arch_read_trylock(arch_rwlock_t *rw)
+{
+	unsigned int tmp, tmp2 = 1;
+
+	asm volatile(
+	"	ldaxr	%w0, [%2]\n"
+	"	add	%w0, %w0, #1\n"
+	"	tbnz	%w0, #31, 1f\n"
+	"	stxr	%w1, %w0, [%2]\n"
+	"1:\n"
+	: "=&r" (tmp), "+r" (tmp2)
+	: "r" (&rw->lock)
+	: "memory");
+
+	return !tmp2;
+}
+
+/* read_can_lock - would read_trylock() succeed? */
+#define arch_read_can_lock(x)		((x)->lock < 0x80000000)
+
+#define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
+#define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
+
+#define arch_spin_relax(lock)	cpu_relax()
+#define arch_read_relax(lock)	cpu_relax()
+#define arch_write_relax(lock)	cpu_relax()
+
+#endif /* __ASM_SPINLOCK_H */
diff --git a/arch/arm64/include/asm/spinlock_types.h b/arch/arm64/include/asm/spinlock_types.h
new file mode 100644
index 0000000..9a49434
--- /dev/null
+++ b/arch/arm64/include/asm/spinlock_types.h
@@ -0,0 +1,38 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SPINLOCK_TYPES_H
+#define __ASM_SPINLOCK_TYPES_H
+
+#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H)
+# error "please don't include this file directly"
+#endif
+
+/* We only require natural alignment for exclusive accesses. */
+#define __lock_aligned
+
+typedef struct {
+	volatile unsigned int lock;
+} arch_spinlock_t;
+
+#define __ARCH_SPIN_LOCK_UNLOCKED	{ 0 }
+
+typedef struct {
+	volatile unsigned int lock;
+} arch_rwlock_t;
+
+#define __ARCH_RW_LOCK_UNLOCKED		{ 0 }
+
+#endif
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
new file mode 100644
index 0000000..0b03e77
--- /dev/null
+++ b/arch/arm64/kernel/smp.c
@@ -0,0 +1,469 @@
+/*
+ * SMP initialisation and IPI support
+ * Based on arch/arm/kernel/smp.c
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/spinlock.h>
+#include <linux/sched.h>
+#include <linux/interrupt.h>
+#include <linux/cache.h>
+#include <linux/profile.h>
+#include <linux/errno.h>
+#include <linux/mm.h>
+#include <linux/err.h>
+#include <linux/cpu.h>
+#include <linux/smp.h>
+#include <linux/seq_file.h>
+#include <linux/irq.h>
+#include <linux/percpu.h>
+#include <linux/clockchips.h>
+#include <linux/completion.h>
+#include <linux/of.h>
+
+#include <asm/atomic.h>
+#include <asm/cacheflush.h>
+#include <asm/cputype.h>
+#include <asm/mmu_context.h>
+#include <asm/pgtable.h>
+#include <asm/pgalloc.h>
+#include <asm/processor.h>
+#include <asm/sections.h>
+#include <asm/tlbflush.h>
+#include <asm/ptrace.h>
+#include <asm/mmu_context.h>
+
+/*
+ * as from 2.5, kernels no longer have an init_tasks structure
+ * so we need some other way of telling a new secondary core
+ * where to place its SVC stack
+ */
+struct secondary_data secondary_data;
+volatile unsigned long secondary_holding_pen_release;
+
+enum ipi_msg_type {
+	IPI_RESCHEDULE,
+	IPI_CALL_FUNC,
+	IPI_CALL_FUNC_SINGLE,
+	IPI_CPU_STOP,
+};
+
+static DEFINE_SPINLOCK(boot_lock);
+
+/*
+ * Write secondary_holding_pen_release in a way that is guaranteed to be
+ * visible to all observers, irrespective of whether they're taking part
+ * in coherency or not.  This is necessary for the hotplug code to work
+ * reliably.
+ */
+static void __cpuinit write_pen_release(int val)
+{
+	void *start = (void *)&secondary_holding_pen_release;
+	unsigned long size = sizeof(secondary_holding_pen_release);
+
+	secondary_holding_pen_release = val;
+	__cpuc_flush_dcache_area(start, size);
+}
+
+/*
+ * Boot a secondary CPU, and assign it the specified idle task.
+ * This also gives us the initial stack to use for this CPU.
+ */
+static int __cpuinit boot_secondary(unsigned int cpu, struct task_struct *idle)
+{
+	unsigned long timeout;
+
+	/*
+	 * Set synchronisation state between this boot processor
+	 * and the secondary one
+	 */
+	spin_lock(&boot_lock);
+
+	/*
+	 * Update the pen release flag.
+	 */
+	write_pen_release(cpu);
+
+	/*
+	 * Send an event, causing the secondaries to read pen_release.
+	 */
+	sev();
+
+	timeout = jiffies + (1 * HZ);
+	while (time_before(jiffies, timeout)) {
+		if (secondary_holding_pen_release == -1UL)
+			break;
+		udelay(10);
+	}
+
+	/*
+	 * Now the secondary core is starting up let it run its
+	 * calibrations, then wait for it to finish
+	 */
+	spin_unlock(&boot_lock);
+
+	return secondary_holding_pen_release != -1 ? -ENOSYS : 0;
+}
+
+static DECLARE_COMPLETION(cpu_running);
+
+int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
+{
+	int ret;
+
+	/*
+	 * We need to tell the secondary core where to find its stack and the
+	 * page tables.
+	 */
+	secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
+	__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
+
+	/*
+	 * Now bring the CPU into our world.
+	 */
+	ret = boot_secondary(cpu, idle);
+	if (ret == 0) {
+		/*
+		 * CPU was successfully started, wait for it to come online or
+		 * time out.
+		 */
+		wait_for_completion_timeout(&cpu_running,
+					    msecs_to_jiffies(1000));
+
+		if (!cpu_online(cpu)) {
+			pr_crit("CPU%u: failed to come online\n", cpu);
+			ret = -EIO;
+		}
+	} else {
+		pr_err("CPU%u: failed to boot: %d\n", cpu, ret);
+	}
+
+	secondary_data.stack = NULL;
+
+	return ret;
+}
+
+/*
+ * This is the secondary CPU boot entry.  We're using this CPU's
+ * idle thread stack, but a set of temporary page tables.
+ */
+asmlinkage void __cpuinit secondary_start_kernel(void)
+{
+	struct mm_struct *mm = &init_mm;
+	unsigned int cpu = smp_processor_id();
+
+	printk("CPU%u: Booted secondary processor\n", cpu);
+
+	/*
+	 * All kernel threads share the same mm context; grab a
+	 * reference and switch to it.
+	 */
+	atomic_inc(&mm->mm_count);
+	current->active_mm = mm;
+	cpumask_set_cpu(cpu, mm_cpumask(mm));
+
+	/*
+	 * TTBR0 is only used for the identity mapping at this stage. Make it
+	 * point to zero page to avoid speculatively fetching new entries.
+	 */
+	cpu_set_reserved_ttbr0();
+	flush_tlb_all();
+
+	preempt_disable();
+	trace_hardirqs_off();
+
+	/*
+	 * Let the primary processor know we're out of the
+	 * pen, then head off into the C entry point
+	 */
+	write_pen_release(-1);
+
+	/*
+	 * Synchronise with the boot thread.
+	 */
+	spin_lock(&boot_lock);
+	spin_unlock(&boot_lock);
+
+	/*
+	 * Enable local interrupts.
+	 */
+	notify_cpu_starting(cpu);
+	local_irq_enable();
+	local_fiq_enable();
+
+	/*
+	 * OK, now it's safe to let the boot CPU continue.  Wait for
+	 * the CPU migration code to notice that the CPU is online
+	 * before we continue.
+	 */
+	set_cpu_online(cpu, true);
+	while (!cpu_active(cpu))
+		cpu_relax();
+
+	/*
+	 * OK, it's off to the idle thread for us
+	 */
+	cpu_idle();
+}
+
+void __init smp_cpus_done(unsigned int max_cpus)
+{
+	unsigned long bogosum = loops_per_jiffy * num_online_cpus();
+
+	pr_info("SMP: Total of %d processors activated (%lu.%02lu BogoMIPS).\n",
+		num_online_cpus(), bogosum / (500000/HZ),
+		(bogosum / (5000/HZ)) % 100);
+}
+
+void __init smp_prepare_boot_cpu(void)
+{
+}
+
+static void (*smp_cross_call)(const struct cpumask *, unsigned int);
+static phys_addr_t cpu_release_addr[NR_CPUS];
+
+/*
+ * Enumerate the possible CPU set from the device tree.
+ */
+void __init smp_init_cpus(void)
+{
+	const char *enable_method;
+	struct device_node *dn = NULL;
+	int cpu = 0;
+
+	while ((dn = of_find_node_by_type(dn, "cpu"))) {
+		if (cpu >= NR_CPUS)
+			goto next;
+
+		/*
+		 * We currently support only the "spin-table" enable-method.
+		 */
+		enable_method = of_get_property(dn, "enable-method", NULL);
+		if (!enable_method || strcmp(enable_method, "spin-table")) {
+			pr_err("CPU %d: missing or invalid enable-method property: %s\n",
+			       cpu, enable_method);
+			goto next;
+		}
+
+		/*
+		 * Determine the address from which the CPU is polling.
+		 */
+		if (of_property_read_u64(dn, "cpu-release-addr",
+					 &cpu_release_addr[cpu])) {
+			pr_err("CPU %d: missing or invalid cpu-release-addr property\n",
+			       cpu);
+			goto next;
+		}
+
+		set_cpu_possible(cpu, true);
+next:
+		cpu++;
+	}
+
+	/* sanity check */
+	if (cpu > NR_CPUS)
+		pr_warning("no. of cores (%d) greater than configured maximum of %d - clipping\n",
+			   cpu, NR_CPUS);
+}
+
+void __init smp_prepare_cpus(unsigned int max_cpus)
+{
+	int cpu;
+	void **release_addr;
+	unsigned int ncores = num_possible_cpus();
+
+	/*
+	 * are we trying to boot more cores than exist?
+	 */
+	if (max_cpus > ncores)
+		max_cpus = ncores;
+
+	/*
+	 * Initialise the present map (which describes the set of CPUs
+	 * actually populated at the present time) and release the
+	 * secondaries from the bootloader.
+	 */
+	for_each_possible_cpu(cpu) {
+		if (max_cpus == 0)
+			break;
+
+		if (!cpu_release_addr[cpu])
+			continue;
+
+		release_addr = __va(cpu_release_addr[cpu]);
+		release_addr[0] = (void *)__pa(secondary_holding_pen);
+		__cpuc_flush_dcache_area(release_addr, sizeof(release_addr[0]));
+
+		set_cpu_present(cpu, true);
+		max_cpus--;
+	}
+
+	/*
+	 * Send an event to wake up the secondaries.
+	 */
+	sev();
+}
+
+
+void __init set_smp_cross_call(void (*fn)(const struct cpumask *, unsigned int))
+{
+	smp_cross_call = fn;
+}
+
+void arch_send_call_function_ipi_mask(const struct cpumask *mask)
+{
+	smp_cross_call(mask, IPI_CALL_FUNC);
+}
+
+void arch_send_call_function_single_ipi(int cpu)
+{
+	smp_cross_call(cpumask_of(cpu), IPI_CALL_FUNC_SINGLE);
+}
+
+static const char *ipi_types[NR_IPI] = {
+#define S(x,s)	[x - IPI_RESCHEDULE] = s
+	S(IPI_RESCHEDULE, "Rescheduling interrupts"),
+	S(IPI_CALL_FUNC, "Function call interrupts"),
+	S(IPI_CALL_FUNC_SINGLE, "Single function call interrupts"),
+	S(IPI_CPU_STOP, "CPU stop interrupts"),
+};
+
+void show_ipi_list(struct seq_file *p, int prec)
+{
+	unsigned int cpu, i;
+
+	for (i = 0; i < NR_IPI; i++) {
+		seq_printf(p, "%*s%u:%s", prec - 1, "IPI", i + IPI_RESCHEDULE,
+			   prec >= 4 ? " " : "");
+		for_each_present_cpu(cpu)
+			seq_printf(p, "%10u ",
+				   __get_irq_stat(cpu, ipi_irqs[i]));
+		seq_printf(p, "      %s\n", ipi_types[i]);
+	}
+}
+
+u64 smp_irq_stat_cpu(unsigned int cpu)
+{
+	u64 sum = 0;
+	int i;
+
+	for (i = 0; i < NR_IPI; i++)
+		sum += __get_irq_stat(cpu, ipi_irqs[i]);
+
+	return sum;
+}
+
+static DEFINE_SPINLOCK(stop_lock);
+
+/*
+ * ipi_cpu_stop - handle IPI from smp_send_stop()
+ */
+static void ipi_cpu_stop(unsigned int cpu)
+{
+	if (system_state == SYSTEM_BOOTING ||
+	    system_state == SYSTEM_RUNNING) {
+		spin_lock(&stop_lock);
+		pr_crit("CPU%u: stopping\n", cpu);
+		dump_stack();
+		spin_unlock(&stop_lock);
+	}
+
+	set_cpu_online(cpu, false);
+
+	local_fiq_disable();
+	local_irq_disable();
+
+	while (1)
+		cpu_relax();
+}
+
+/*
+ * Main handler for inter-processor interrupts
+ */
+void handle_IPI(int ipinr, struct pt_regs *regs)
+{
+	unsigned int cpu = smp_processor_id();
+	struct pt_regs *old_regs = set_irq_regs(regs);
+
+	if (ipinr >= IPI_RESCHEDULE && ipinr < IPI_RESCHEDULE + NR_IPI)
+		__inc_irq_stat(cpu, ipi_irqs[ipinr - IPI_RESCHEDULE]);
+
+	switch (ipinr) {
+	case IPI_RESCHEDULE:
+		scheduler_ipi();
+		break;
+
+	case IPI_CALL_FUNC:
+		irq_enter();
+		generic_smp_call_function_interrupt();
+		irq_exit();
+		break;
+
+	case IPI_CALL_FUNC_SINGLE:
+		irq_enter();
+		generic_smp_call_function_single_interrupt();
+		irq_exit();
+		break;
+
+	case IPI_CPU_STOP:
+		irq_enter();
+		ipi_cpu_stop(cpu);
+		irq_exit();
+		break;
+
+	default:
+		pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
+		break;
+	}
+	set_irq_regs(old_regs);
+}
+
+void smp_send_reschedule(int cpu)
+{
+	smp_cross_call(cpumask_of(cpu), IPI_RESCHEDULE);
+}
+
+void smp_send_stop(void)
+{
+	unsigned long timeout;
+
+	if (num_online_cpus() > 1) {
+		cpumask_t mask;
+
+		cpumask_copy(&mask, cpu_online_mask);
+		cpu_clear(smp_processor_id(), mask);
+
+		smp_cross_call(&mask, IPI_CPU_STOP);
+	}
+
+	/* Wait up to one second for other CPUs to stop */
+	timeout = USEC_PER_SEC;
+	while (num_online_cpus() > 1 && timeout--)
+		udelay(1);
+
+	if (num_online_cpus() > 1)
+		pr_warning("SMP: failed to stop secondary CPUs\n");
+}
+
+/*
+ * not supported here
+ */
+int setup_profiling_timer(unsigned int multiplier)
+{
+	return -EINVAL;
+}

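Editor's sketch, not part of the patch: the read-trylock logic in
asm/spinlock.h above can be modelled in portable C11 atomics. The
model_* names are hypothetical, and the sketch assumes that a held
writer is represented by bit 31 of the lock word, as the tbnz test and
arch_read_can_lock() suggest. Like the assembly, it makes a single
attempt and reports a lost race as failure instead of retrying.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space stand-in for arch_rwlock_t. */
typedef struct {
	_Atomic uint32_t lock;
} model_rwlock_t;

/* Assumption: bit 31 set means a writer holds the lock. */
#define MODEL_WRITER_BIT 0x80000000u

/*
 * One attempt to take the lock for reading: load with acquire
 * semantics, add one reader, give up if bit 31 is set, otherwise try
 * to publish the new count. There is no retry loop.
 */
static bool model_read_trylock(model_rwlock_t *rw)
{
	uint32_t old = atomic_load_explicit(&rw->lock, memory_order_acquire);
	uint32_t next = old + 1;

	if (next & MODEL_WRITER_BIT)
		return false;

	return atomic_compare_exchange_strong_explicit(&rw->lock, &old, next,
						       memory_order_acquire,
						       memory_order_relaxed);
}

/* Drop one reader; the release ordering mirrors the stlxr in the patch. */
static void model_read_unlock(model_rwlock_t *rw)
{
	atomic_fetch_sub_explicit(&rw->lock, 1, memory_order_release);
}

/* Same comparison as arch_read_can_lock(). */
static bool model_read_can_lock(model_rwlock_t *rw)
{
	return atomic_load_explicit(&rw->lock, memory_order_relaxed) <
	       MODEL_WRITER_BIT;
}
```

A compare-and-swap stands in for the ldaxr/stxr pair here; it is an
approximation, since a real store-exclusive may also fail for reasons
other than a changed value.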


* [PATCH v2 16/31] arm64: ELF definitions
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (14 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 15/31] arm64: SMP support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 14:15   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 17/31] arm64: System calls handling Catalin Marinas
                   ` (15 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds definitions for the ELF format, including personality
setting and EXEC_PAGESIZE. There are only two hwcap definitions for
64-bit applications - HWCAP_FP and HWCAP_ASIMD.
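
Editor's sketch, not part of the patch: a 64-bit application could
decode the two hwcap bits roughly as follows. The bit positions are the
ones defined in asm/hwcap.h below; the EX_* macros and hwcap_has_*()
helpers are hypothetical names chosen for illustration.

```c
#include <stdbool.h>

/* Bit positions copied from asm/hwcap.h in this patch. */
#define EX_HWCAP_FP	(1UL << 0)
#define EX_HWCAP_ASIMD	(1UL << 1)

/* True if the AT_HWCAP word advertises floating point. */
static bool hwcap_has_fp(unsigned long hwcap)
{
	return (hwcap & EX_HWCAP_FP) != 0;
}

/* True if the AT_HWCAP word advertises Advanced SIMD. */
static bool hwcap_has_asimd(unsigned long hwcap)
{
	return (hwcap & EX_HWCAP_ASIMD) != 0;
}
```

On a system whose C library provides getauxval() in <sys/auxv.h>, the
argument would typically be getauxval(AT_HWCAP). The helpers take the
word as a parameter so that they can be exercised on any host.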

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/auxvec.h   |   22 +++++
 arch/arm64/include/asm/elf.h      |  176 +++++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/hwcap.h    |   57 ++++++++++++
 arch/arm64/include/asm/param.h    |   23 +++++
 arch/arm64/include/asm/shmparam.h |   28 ++++++
 arch/arm64/kernel/elf.c           |   41 +++++++++
 6 files changed, 347 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/auxvec.h
 create mode 100644 arch/arm64/include/asm/elf.h
 create mode 100644 arch/arm64/include/asm/hwcap.h
 create mode 100644 arch/arm64/include/asm/param.h
 create mode 100644 arch/arm64/include/asm/shmparam.h
 create mode 100644 arch/arm64/kernel/elf.c

diff --git a/arch/arm64/include/asm/auxvec.h b/arch/arm64/include/asm/auxvec.h
new file mode 100644
index 0000000..22d6d88
--- /dev/null
+++ b/arch/arm64/include/asm/auxvec.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_AUXVEC_H
+#define __ASM_AUXVEC_H
+
+/* vDSO location */
+#define AT_SYSINFO_EHDR	33
+
+#endif
diff --git a/arch/arm64/include/asm/elf.h b/arch/arm64/include/asm/elf.h
new file mode 100644
index 0000000..9d62a7a
--- /dev/null
+++ b/arch/arm64/include/asm/elf.h
@@ -0,0 +1,176 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_ELF_H
+#define __ASM_ELF_H
+
+#include <asm/hwcap.h>
+
+/*
+ * ELF register definitions..
+ */
+#include <asm/ptrace.h>
+#include <asm/user.h>
+
+typedef unsigned long elf_greg_t;
+typedef unsigned long elf_freg_t[3];
+
+#define ELF_NGREG (sizeof (struct pt_regs) / sizeof(elf_greg_t))
+typedef elf_greg_t elf_gregset_t[ELF_NGREG];
+
+typedef struct user_fp elf_fpregset_t;
+
+#define EM_AARCH64		183
+
+/*
+ * AArch64 static relocation types.
+ */
+
+/* Miscellaneous. */
+#define R_ARM_NONE			0
+#define R_AARCH64_NONE			256
+
+/* Data. */
+#define R_AARCH64_ABS64			257
+#define R_AARCH64_ABS32			258
+#define R_AARCH64_ABS16			259
+#define R_AARCH64_PREL64		260
+#define R_AARCH64_PREL32		261
+#define R_AARCH64_PREL16		262
+
+/* Instructions. */
+#define R_AARCH64_MOVW_UABS_G0		263
+#define R_AARCH64_MOVW_UABS_G0_NC	264
+#define R_AARCH64_MOVW_UABS_G1		265
+#define R_AARCH64_MOVW_UABS_G1_NC	266
+#define R_AARCH64_MOVW_UABS_G2		267
+#define R_AARCH64_MOVW_UABS_G2_NC	268
+#define R_AARCH64_MOVW_UABS_G3		269
+
+#define R_AARCH64_MOVW_SABS_G0		270
+#define R_AARCH64_MOVW_SABS_G1		271
+#define R_AARCH64_MOVW_SABS_G2		272
+
+#define R_AARCH64_LD_PREL_LO19		273
+#define R_AARCH64_ADR_PREL_LO21		274
+#define R_AARCH64_ADR_PREL_PG_HI21	275
+#define R_AARCH64_ADR_PREL_PG_HI21_NC	276
+#define R_AARCH64_ADD_ABS_LO12_NC	277
+#define R_AARCH64_LDST8_ABS_LO12_NC	278
+
+#define R_AARCH64_TSTBR14		279
+#define R_AARCH64_CONDBR19		280
+#define R_AARCH64_JUMP26		282
+#define R_AARCH64_CALL26		283
+#define R_AARCH64_LDST16_ABS_LO12_NC	284
+#define R_AARCH64_LDST32_ABS_LO12_NC	285
+#define R_AARCH64_LDST64_ABS_LO12_NC	286
+#define R_AARCH64_LDST128_ABS_LO12_NC	299
+
+#define R_AARCH64_MOVW_PREL_G0		287
+#define R_AARCH64_MOVW_PREL_G0_NC	288
+#define R_AARCH64_MOVW_PREL_G1		289
+#define R_AARCH64_MOVW_PREL_G1_NC	290
+#define R_AARCH64_MOVW_PREL_G2		291
+#define R_AARCH64_MOVW_PREL_G2_NC	292
+#define R_AARCH64_MOVW_PREL_G3		293
+
+
+/*
+ * These are used to set parameters in the core dumps.
+ */
+#define ELF_CLASS	ELFCLASS64
+#define ELF_DATA	ELFDATA2LSB
+#define ELF_ARCH	EM_AARCH64
+
+#define ELF_PLATFORM_SIZE	16
+#define ELF_PLATFORM		("aarch64")
+
+/*
+ * This is used to ensure we don't load something for the wrong architecture.
+ */
+#define elf_check_arch(x)		((x)->e_machine == EM_AARCH64)
+
+#define elf_read_implies_exec(ex,stk)	(stk != EXSTACK_DISABLE_X)
+
+#define CORE_DUMP_USE_REGSET
+#define ELF_EXEC_PAGESIZE	PAGE_SIZE
+
+/*
+ * This is the location that an ET_DYN program is loaded if exec'ed.  Typical
+ * use of this is to invoke "./ld.so someprog" to test out a new version of
+ * the loader.  We need to make sure that it is out of the way of the program
+ * that it will "exec", and that there is sufficient room for the brk.
+ */
+extern unsigned long randomize_et_dyn(unsigned long base);
+#define ELF_ET_DYN_BASE	(randomize_et_dyn(2 * TASK_SIZE_64 / 3))
+
+/*
+ * When the program starts, x0 contains a pointer to a function to be
+ * registered with atexit, as per the SVR4 ABI.  A value of 0 means we have no
+ * such handler.
+ */
+#define ELF_PLAT_INIT(_r, load_addr)	(_r)->regs[0] = 0
+
+extern void				elf_set_personality(int personality);
+#define SET_PERSONALITY(ex)		elf_set_personality(PER_LINUX)
+
+#define ARCH_DLINFO							\
+do {									\
+	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
+		    (elf_addr_t)current->mm->context.vdso);		\
+} while (0)
+
+#define ARCH_HAS_SETUP_ADDITIONAL_PAGES
+struct linux_binprm;
+extern int arch_setup_additional_pages(struct linux_binprm *bprm,
+				       int uses_interp);
+
+/* 1GB of VA */
+#define STACK_RND_MASK			(test_thread_flag(TIF_32BIT) ? \
+						0x7ff >> (PAGE_SHIFT - 12) : \
+						0x3ffff >> (PAGE_SHIFT - 12))
+
+struct mm_struct;
+extern unsigned long arch_randomize_brk(struct mm_struct *mm);
+#define arch_randomize_brk arch_randomize_brk
+
+#ifdef CONFIG_AARCH32_EMULATION
+#define EM_ARM				40
+#define COMPAT_ELF_PLATFORM		("v8l")
+
+#define COMPAT_ELF_ET_DYN_BASE		(randomize_et_dyn(2 * TASK_SIZE_32 / 3))
+
+/* AArch32 registers. */
+#define COMPAT_ELF_NGREG		18
+typedef unsigned int			compat_elf_greg_t;
+typedef compat_elf_greg_t		compat_elf_gregset_t[COMPAT_ELF_NGREG];
+
+/* AArch32 EABI. */
+#define EF_ARM_EABI_MASK		0xff000000
+#define compat_elf_check_arch(x)	(((x)->e_machine == EM_ARM) && \
+					 ((x)->e_flags & EF_ARM_EABI_MASK))
+
+#define compat_start_thread		compat_start_thread
+#define COMPAT_SET_PERSONALITY(ex)	elf_set_personality(PER_LINUX32)
+#define COMPAT_ARCH_DLINFO
+extern int aarch32_setup_vectors_page(struct linux_binprm *bprm,
+				      int uses_interp);
+#define compat_arch_setup_additional_pages \
+					aarch32_setup_vectors_page
+
+#endif /* CONFIG_AARCH32_EMULATION */
+
+#endif
diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
new file mode 100644
index 0000000..0cc7c03
--- /dev/null
+++ b/arch/arm64/include/asm/hwcap.h
@@ -0,0 +1,57 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_HWCAP_H
+#define __ASM_HWCAP_H
+
+/*
+ * HWCAP flags - for elf_hwcap (in kernel) and AT_HWCAP
+ */
+#define HWCAP_FP		(1 << 0)
+#define HWCAP_ASIMD		(1 << 1)
+
+#ifdef CONFIG_AARCH32_EMULATION
+#define COMPAT_HWCAP_HALF	(1 << 1)
+#define COMPAT_HWCAP_THUMB	(1 << 2)
+#define COMPAT_HWCAP_FAST_MULT	(1 << 4)
+#define COMPAT_HWCAP_VFP	(1 << 6)
+#define COMPAT_HWCAP_EDSP	(1 << 7)
+#define COMPAT_HWCAP_NEON	(1 << 12)
+#define COMPAT_HWCAP_VFPv3	(1 << 13)
+#define COMPAT_HWCAP_TLS	(1 << 15)
+#define COMPAT_HWCAP_VFPv4	(1 << 16)
+#define COMPAT_HWCAP_IDIVA	(1 << 17)
+#define COMPAT_HWCAP_IDIVT	(1 << 18)
+#define COMPAT_HWCAP_IDIV	(COMPAT_HWCAP_IDIVA|COMPAT_HWCAP_IDIVT)
+
+#endif /* CONFIG_AARCH32_EMULATION */
+
+#if defined(__KERNEL__) && !defined(__ASSEMBLY__)
+/*
+ * This yields a mask that user programs can use to figure out what
+ * instruction set this cpu supports.
+ */
+#define ELF_HWCAP		(elf_hwcap)
+#ifdef CONFIG_AARCH32_EMULATION
+#define COMPAT_ELF_HWCAP	(COMPAT_HWCAP_HALF|COMPAT_HWCAP_THUMB|\
+				 COMPAT_HWCAP_FAST_MULT|COMPAT_HWCAP_EDSP|\
+				 COMPAT_HWCAP_TLS|COMPAT_HWCAP_VFP|\
+				 COMPAT_HWCAP_VFPv3|COMPAT_HWCAP_VFPv4|\
+				 COMPAT_HWCAP_NEON|COMPAT_HWCAP_IDIV)
+#endif
+extern unsigned int elf_hwcap;
+#endif
+
+#endif
diff --git a/arch/arm64/include/asm/param.h b/arch/arm64/include/asm/param.h
new file mode 100644
index 0000000..8e3a281
--- /dev/null
+++ b/arch/arm64/include/asm/param.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PARAM_H
+#define __ASM_PARAM_H
+
+#define EXEC_PAGESIZE	65536
+
+#include <asm-generic/param.h>
+
+#endif
diff --git a/arch/arm64/include/asm/shmparam.h b/arch/arm64/include/asm/shmparam.h
new file mode 100644
index 0000000..4df608a
--- /dev/null
+++ b/arch/arm64/include/asm/shmparam.h
@@ -0,0 +1,28 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SHMPARAM_H
+#define __ASM_SHMPARAM_H
+
+/*
+ * For IPC syscalls from compat tasks, we need to use the legacy 16k
+ * alignment value. Since we don't have aliasing D-caches, the rest of
+ * the time we can safely use PAGE_SIZE.
+ */
+#define COMPAT_SHMLBA	0x4000
+
+#include <asm-generic/shmparam.h>
+
+#endif /* __ASM_SHMPARAM_H */
diff --git a/arch/arm64/kernel/elf.c b/arch/arm64/kernel/elf.c
new file mode 100644
index 0000000..6f98076
--- /dev/null
+++ b/arch/arm64/kernel/elf.c
@@ -0,0 +1,41 @@
+/*
+ * ELF personality setting
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/personality.h>
+#include <linux/binfmts.h>
+#include <linux/elf.h>
+
+void elf_set_personality(int personality)
+{
+	switch (personality & PER_MASK) {
+	case PER_LINUX:
+		clear_thread_flag(TIF_32BIT);
+		break;
+	case PER_LINUX32:
+		set_thread_flag(TIF_32BIT);
+		break;
+	default:
+		pr_warning("Process %s tried to assume unknown personality %d\n",
+			   current->comm, personality);
+		return;
+	}
+
+	current->personality = personality;
+}
+EXPORT_SYMBOL(elf_set_personality);

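Editor's sketch, not part of the patch: the "1GB of VA" comment next to
STACK_RND_MASK in asm/elf.h can be checked with a little arithmetic.
The sketch assumes that the generic ELF loader applies the mask to a
random page count and then shifts the result by PAGE_SHIFT; the
function names are hypothetical and the task type and page shift are
passed explicitly instead of being read from the kernel.

```c
/* Mask in pages, mirroring the STACK_RND_MASK expression. */
static unsigned long stack_rnd_mask(int is_32bit, unsigned int page_shift)
{
	return is_32bit ? 0x7ffUL >> (page_shift - 12)
			: 0x3ffffUL >> (page_shift - 12);
}

/* Largest byte offset the mask can express for a given page size. */
static unsigned long stack_rnd_span(int is_32bit, unsigned int page_shift)
{
	return stack_rnd_mask(is_32bit, page_shift) << page_shift;
}
```

With 4KB pages (page_shift 12) a 64-bit task gets 0x3ffff000, one page
short of 1GB; with 64KB pages (page_shift 16) it gets 0x3fff0000. A
compat task with 4KB pages gets 0x7ff000, just under 8MB, so the 1GB
figure describes the 64-bit case only.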


* [PATCH v2 17/31] arm64: System calls handling
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (15 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 16/31] arm64: ELF definitions Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 14:22   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 18/31] arm64: VDSO support Catalin Marinas
                   ` (14 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds support for system calls coming from 64-bit
applications. It uses the asm-generic/unistd.h definitions with the
canonical set of system calls. The private system calls are only used
for 32-bit (compat) applications as 64-bit ones can set the TLS and
flush the caches entirely from user space.

The sys_call_table is just an array defined in a C file and it contains
pointers to the syscall functions. The array is 4KB aligned to allow the
use of the ADRP instruction (longer range ADR) in entry.S.
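
Editor's sketch, not part of the patch: the table layout described
above can be illustrated in user-space C. All ex_* names, the table
size and the three-argument signature are hypothetical. The point of
the 4KB alignment is that ADRP yields the 4KB-page address of a symbol,
so for a page-aligned table that address is already the table base.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define EX_NR_SYSCALLS	4	/* hypothetical table size */

typedef long (*ex_syscall_fn)(long, long, long);

/* Hypothetical example call: adds its first two arguments. */
static long ex_add(long a, long b, long c)
{
	(void)c;
	return a + b;
}

/* Array of function pointers, aligned to 4KB like sys_call_table. */
static _Alignas(4096) const ex_syscall_fn ex_call_table[EX_NR_SYSCALLS] = {
	[2] = ex_add,
};

/* Look up and invoke entry nr; empty or out-of-range slots give -ENOSYS. */
static long ex_dispatch(unsigned long nr, long a, long b, long c)
{
	if (nr >= EX_NR_SYSCALLS || !ex_call_table[nr])
		return -ENOSYS;
	return ex_call_table[nr](a, b, c);
}

/* True if the table starts on a 4KB boundary. */
static bool ex_table_is_page_aligned(void)
{
	return ((uintptr_t)ex_call_table % 4096) == 0;
}
```

Unlike this sketch, which checks for empty slots at dispatch time, a
kernel table would normally point unused entries at a "not
implemented" handler.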

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/stat.h     |   63 +++++++++++++++++
 arch/arm64/include/asm/statfs.h   |   23 ++++++
 arch/arm64/include/asm/syscalls.h |   40 +++++++++++
 arch/arm64/include/asm/unistd.h   |   27 +++++++
 arch/arm64/kernel/sys.c           |  138 +++++++++++++++++++++++++++++++++++++
 5 files changed, 291 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/stat.h
 create mode 100644 arch/arm64/include/asm/statfs.h
 create mode 100644 arch/arm64/include/asm/syscalls.h
 create mode 100644 arch/arm64/include/asm/unistd.h
 create mode 100644 arch/arm64/kernel/sys.c

diff --git a/arch/arm64/include/asm/stat.h b/arch/arm64/include/asm/stat.h
new file mode 100644
index 0000000..f63a680
--- /dev/null
+++ b/arch/arm64/include/asm/stat.h
@@ -0,0 +1,63 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_STAT_H
+#define __ASM_STAT_H
+
+#include <asm-generic/stat.h>
+
+#if defined(__KERNEL__) && defined(CONFIG_AARCH32_EMULATION)
+
+#include <asm/compat.h>
+
+/* This matches struct stat64 in glibc2.1, hence the absolutely
+ * insane amounts of padding around dev_t's.
+ * Note: The kernel zeroes the padded region because glibc might read it
+ * in the hope that the kernel has stretched to using larger sizes.
+ */
+struct stat64 {
+	compat_u64	st_dev;
+	unsigned char   __pad0[4];
+
+#define STAT64_HAS_BROKEN_ST_INO	1
+	compat_ulong_t	__st_ino;
+	compat_uint_t	st_mode;
+	compat_uint_t	st_nlink;
+
+	compat_ulong_t	st_uid;
+	compat_ulong_t	st_gid;
+
+	compat_u64	st_rdev;
+	unsigned char   __pad3[4];
+
+	compat_s64	st_size;
+	compat_ulong_t	st_blksize;
+	compat_u64	st_blocks;	/* Number of 512-byte blocks allocated. */
+
+	compat_ulong_t	st_atime;
+	compat_ulong_t	st_atime_nsec;
+
+	compat_ulong_t	st_mtime;
+	compat_ulong_t	st_mtime_nsec;
+
+	compat_ulong_t	st_ctime;
+	compat_ulong_t	st_ctime_nsec;
+
+	compat_u64	st_ino;
+};
+
+#endif
+
+#endif
diff --git a/arch/arm64/include/asm/statfs.h b/arch/arm64/include/asm/statfs.h
new file mode 100644
index 0000000..6f62190
--- /dev/null
+++ b/arch/arm64/include/asm/statfs.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_STATFS_H
+#define __ASM_STATFS_H
+
+#define ARCH_PACK_COMPAT_STATFS64 __attribute__((packed,aligned(4)))
+
+#include <asm-generic/statfs.h>
+
+#endif
diff --git a/arch/arm64/include/asm/syscalls.h b/arch/arm64/include/asm/syscalls.h
new file mode 100644
index 0000000..09ff335
--- /dev/null
+++ b/arch/arm64/include/asm/syscalls.h
@@ -0,0 +1,40 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SYSCALLS_H
+#define __ASM_SYSCALLS_H
+
+#include <linux/linkage.h>
+#include <linux/compiler.h>
+#include <linux/signal.h>
+
+/*
+ * System call wrappers implemented in kernel/entry.S.
+ */
+asmlinkage long sys_execve_wrapper(const char __user *filename,
+				   const char __user *const __user *argv,
+				   const char __user *const __user *envp);
+asmlinkage long sys_clone_wrapper(unsigned long clone_flags,
+				  unsigned long newsp,
+				  void __user *parent_tid,
+				  unsigned long tls_val,
+				  void __user *child_tid);
+asmlinkage long sys_rt_sigreturn_wrapper(void);
+asmlinkage long sys_sigaltstack_wrapper(const stack_t __user *uss,
+					stack_t __user *uoss);
+
+#include <asm-generic/syscalls.h>
+
+#endif	/* __ASM_SYSCALLS_H */
diff --git a/arch/arm64/include/asm/unistd.h b/arch/arm64/include/asm/unistd.h
new file mode 100644
index 0000000..b00718c
--- /dev/null
+++ b/arch/arm64/include/asm/unistd.h
@@ -0,0 +1,27 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#if !defined(__ASM_UNISTD_H) || defined(__SYSCALL)
+#define __ASM_UNISTD_H
+
+#ifndef __SYSCALL_COMPAT
+#include <asm-generic/unistd.h>
+#endif
+
+#if defined(__KERNEL__) && defined(CONFIG_AARCH32_EMULATION)
+#include <asm/unistd32.h>
+#endif
+
+#endif /* __ASM_UNISTD_H */
diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
new file mode 100644
index 0000000..905fcfb
--- /dev/null
+++ b/arch/arm64/kernel/sys.c
@@ -0,0 +1,138 @@
+/*
+ * AArch64-specific system calls implementation
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/fs.h>
+#include <linux/mm.h>
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+
+/*
+ * Clone a task - this clones the calling program thread.
+ */
+asmlinkage long sys_clone(unsigned long clone_flags, unsigned long newsp,
+			  int __user *parent_tidptr, unsigned long tls_val,
+			  int __user *child_tidptr, struct pt_regs *regs)
+{
+	if (!newsp)
+		newsp = regs->sp;
+	/* 16-byte aligned stack mandatory on AArch64 */
+	if (newsp & 15)
+		return -EINVAL;
+	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
+}
+
+/*
+ * sys_execve() executes a new program.
+ */
+asmlinkage long sys_execve(const char __user *filenamei,
+			   const char __user *const __user *argv,
+			   const char __user *const __user *envp,
+			   struct pt_regs *regs)
+{
+	long error;
+	char * filename;
+
+	filename = getname(filenamei);
+	error = PTR_ERR(filename);
+	if (IS_ERR(filename))
+		goto out;
+	error = do_execve(filename, argv, envp, regs);
+	putname(filename);
+out:
+	return error;
+}
+
+int kernel_execve(const char *filename,
+		  const char *const argv[],
+		  const char *const envp[])
+{
+	struct pt_regs regs;
+	int ret;
+
+	memset(&regs, 0, sizeof(struct pt_regs));
+	ret = do_execve(filename,
+			(const char __user *const __user *)argv,
+			(const char __user *const __user *)envp, &regs);
+	if (ret < 0)
+		goto out;
+
+	/*
+	 * Save argc to the register structure for userspace.
+	 */
+	regs.regs[0] = ret;
+
+	/*
+	 * We were successful.  We won't be returning to our caller, but
+	 * instead to user space by manipulating the kernel stack.
+	 */
+	asm(	"add	x0, %0, %1\n\t"
+		"mov	x1, %2\n\t"
+		"mov	x2, %3\n\t"
+		"bl	memmove\n\t"	/* copy regs to top of stack */
+		"mov	x27, #0\n\t"	/* not a syscall */
+		"mov	x28, %0\n\t"	/* thread structure */
+		"mov	sp, x0\n\t"	/* reposition stack pointer */
+		"b	ret_to_user"
+		:
+		: "r" (current_thread_info()),
+		  "Ir" (THREAD_START_SP - sizeof(regs)),
+		  "r" (&regs),
+		  "Ir" (sizeof(regs))
+		: "x0", "x1", "x2", "x27", "x28", "x30", "memory");
+
+ out:
+	return ret;
+}
+EXPORT_SYMBOL(kernel_execve);
+
+asmlinkage long sys_mmap(unsigned long addr, unsigned long len,
+			 unsigned long prot, unsigned long flags,
+			 unsigned long fd, off_t off)
+{
+	if (offset_in_page(off) != 0)
+		return -EINVAL;
+
+	return sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
+}
+
+/*
+ * Wrappers to pass the pt_regs argument.
+ */
+#define sys_execve		sys_execve_wrapper
+#define sys_clone		sys_clone_wrapper
+#define sys_rt_sigreturn	sys_rt_sigreturn_wrapper
+#define sys_sigaltstack		sys_sigaltstack_wrapper
+
+#include <asm/syscalls.h>
+
+#undef __SYSCALL
+#define __SYSCALL(nr, sym)	[nr] = sym,
+
+/*
+ * The sys_call_table array must be 4K aligned to be accessible from
+ * kernel/entry.S.
+ */
+void *sys_call_table[__NR_syscalls] __aligned(4096) = {
+	[0 ... __NR_syscalls - 1] = sys_ni_syscall,
+#include <asm/unistd.h>
+};
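
For readers unfamiliar with the table construction at the end of sys.c,
the following is a minimal user-space sketch of the same idiom: a GNU C
range designator fills every slot with a default handler, and later
designators override individual entries. All demo_* names, the syscall
numbers and the return values are hypothetical; in the patch the
per-entry lines come from including <asm/unistd.h> with __SYSCALL()
defined, not from a hand-written list.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table size and handlers, for illustration only. */
#define DEMO_NR_SYSCALLS 8

typedef long (*demo_syscall_fn)(void);

static long demo_ni_syscall(void) { return -38; /* -ENOSYS */ }
static long demo_getpid(void)     { return 1234; }
static long demo_gettid(void)     { return 5678; }

/* Stand-in for the list that a header would expand via a macro. */
#define DEMO_SYSCALL_LIST \
	DEMO_SYSCALL(2, demo_getpid) \
	DEMO_SYSCALL(5, demo_gettid)

#define DEMO_SYSCALL(nr, sym)	[nr] = sym,

static const demo_syscall_fn demo_call_table[DEMO_NR_SYSCALLS] = {
	/* GNU C range designator: default every slot first ... */
	[0 ... DEMO_NR_SYSCALLS - 1] = demo_ni_syscall,
	/* ... then let later designators override individual slots. */
	DEMO_SYSCALL_LIST
};

/* Look up and call a handler; out-of-range numbers get the default. */
long demo_dispatch(unsigned int nr)
{
	if (nr >= DEMO_NR_SYSCALLS)
		return demo_ni_syscall();
	assert(demo_call_table[nr] != NULL);
	return demo_call_table[nr]();
}
```

With this layout a new entry only needs one more line in the list, and
any number that is never listed keeps returning the "not implemented"
value.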



* [PATCH v2 18/31] arm64: VDSO support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (16 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 17/31] arm64: System calls handling Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 19/31] arm64: Signal handling support Catalin Marinas
                   ` (13 subsequent siblings)
  31 siblings, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch adds VDSO support for 64-bit applications. The VDSO code is
currently used for sys_rt_sigreturn() and optimised gettimeofday()
(using the user-accessible generic counter).
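
As a rough illustration of what the optimised path computes, here is a
C sketch of the arithmetic that __do_get_tspec performs in assembly:
take the counter delta since the last timekeeping update, keep 56 bits,
scale it with the clocksource mult/shift pair and add it to the kernel
time snapshot. The demo_* names and the struct are hypothetical, and the
counter value is passed in as a parameter rather than read from the
hardware.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL
#define DEMO_COUNTER_MASK ((1ULL << 56) - 1)	/* keep the low 56 bits */

struct demo_timespec {
	uint64_t sec;
	uint64_t nsec;
};

/*
 * now:        current counter value (supplied by the caller here)
 * cycle_last: counter value at the last timekeeping update
 * mult/shift: clocksource scaling, ns = (cycles * mult) >> shift
 * base_*:     kernel time recorded at the last update
 */
struct demo_timespec demo_get_tspec(uint64_t now, uint64_t cycle_last,
				    uint32_t mult, uint32_t shift,
				    uint64_t base_sec, uint64_t base_nsec)
{
	struct demo_timespec ts;
	uint64_t delta, ns;

	assert(shift < 64);

	delta = (now - cycle_last) & DEMO_COUNTER_MASK;
	ns = (delta * mult) >> shift;	/* 64-bit multiply, as in the asm */
	ns += base_nsec;

	/* Carry whole seconds out of the nanosecond field. */
	ts.sec = base_sec + ns / DEMO_NSEC_PER_SEC;
	ts.nsec = ns % DEMO_NSEC_PER_SEC;
	return ts;
}
```

For example, with mult = 10 and shift = 0 (10 ns per tick), a delta of
600 ticks adds 6000 ns to the snapshot.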

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/vdso.h              |   41 +++++
 arch/arm64/include/asm/vdso_datapage.h     |   43 +++++
 arch/arm64/kernel/vdso.c                   |  261 ++++++++++++++++++++++++++++
 arch/arm64/kernel/vdso/.gitignore          |    2 +
 arch/arm64/kernel/vdso/Makefile            |   63 +++++++
 arch/arm64/kernel/vdso/gen_vdso_offsets.sh |   15 ++
 arch/arm64/kernel/vdso/gettimeofday.S      |  242 ++++++++++++++++++++++++++
 arch/arm64/kernel/vdso/note.S              |   28 +++
 arch/arm64/kernel/vdso/sigreturn.S         |   37 ++++
 arch/arm64/kernel/vdso/vdso.S              |   33 ++++
 arch/arm64/kernel/vdso/vdso.lds.S          |  100 +++++++++++
 11 files changed, 865 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/vdso.h
 create mode 100644 arch/arm64/include/asm/vdso_datapage.h
 create mode 100644 arch/arm64/kernel/vdso.c
 create mode 100644 arch/arm64/kernel/vdso/.gitignore
 create mode 100644 arch/arm64/kernel/vdso/Makefile
 create mode 100755 arch/arm64/kernel/vdso/gen_vdso_offsets.sh
 create mode 100644 arch/arm64/kernel/vdso/gettimeofday.S
 create mode 100644 arch/arm64/kernel/vdso/note.S
 create mode 100644 arch/arm64/kernel/vdso/sigreturn.S
 create mode 100644 arch/arm64/kernel/vdso/vdso.S
 create mode 100644 arch/arm64/kernel/vdso/vdso.lds.S

diff --git a/arch/arm64/include/asm/vdso.h b/arch/arm64/include/asm/vdso.h
new file mode 100644
index 0000000..839ce00
--- /dev/null
+++ b/arch/arm64/include/asm/vdso.h
@@ -0,0 +1,41 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_VDSO_H
+#define __ASM_VDSO_H
+
+#ifdef __KERNEL__
+
+/*
+ * Default link address for the vDSO.
+ * Since we randomise the VDSO mapping, there's little point in trying
+ * to prelink this.
+ */
+#define VDSO_LBASE	0x0
+
+#ifndef __ASSEMBLY__
+
+#include <generated/vdso-offsets.h>
+
+#define VDSO_SYMBOL(base, name)						   \
+({									   \
+	(void *)(vdso_offset_##name - VDSO_LBASE + (unsigned long)(base)); \
+})
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __KERNEL__ */
+
+#endif /* __ASM_VDSO_H */
diff --git a/arch/arm64/include/asm/vdso_datapage.h b/arch/arm64/include/asm/vdso_datapage.h
new file mode 100644
index 0000000..de66199
--- /dev/null
+++ b/arch/arm64/include/asm/vdso_datapage.h
@@ -0,0 +1,43 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_VDSO_DATAPAGE_H
+#define __ASM_VDSO_DATAPAGE_H
+
+#ifdef __KERNEL__
+
+#ifndef __ASSEMBLY__
+
+struct vdso_data {
+	__u64 cs_cycle_last;	/* Timebase at clocksource init */
+	__u64 xtime_clock_sec;	/* Kernel time */
+	__u64 xtime_clock_nsec;
+	__u64 xtime_coarse_sec;	/* Coarse time */
+	__u64 xtime_coarse_nsec;
+	__u64 wtm_clock_sec;	/* Wall to monotonic time */
+	__u64 wtm_clock_nsec;
+	__u32 tb_seq_count;	/* Timebase sequence counter */
+	__u32 cs_mult;		/* Clocksource multiplier */
+	__u32 cs_shift;		/* Clocksource shift */
+	__u32 tz_minuteswest;	/* Whacky timezone stuff */
+	__u32 tz_dsttime;
+	__u32 use_syscall;
+};
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __KERNEL__ */
+
+#endif /* __ASM_VDSO_DATAPAGE_H */
diff --git a/arch/arm64/kernel/vdso.c b/arch/arm64/kernel/vdso.c
new file mode 100644
index 0000000..8d8a365
--- /dev/null
+++ b/arch/arm64/kernel/vdso.c
@@ -0,0 +1,261 @@
+/*
+ * VDSO implementation for AArch64 and vector page setup for AArch32.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/kernel.h>
+#include <linux/clocksource.h>
+#include <linux/elf.h>
+#include <linux/err.h>
+#include <linux/errno.h>
+#include <linux/gfp.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/signal.h>
+#include <linux/slab.h>
+#include <linux/vmalloc.h>
+
+#include <asm/cacheflush.h>
+#include <asm/signal32.h>
+#include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
+
+extern char vdso_start, vdso_end;
+static unsigned long vdso_pages;
+static struct page **vdso_pagelist;
+
+/*
+ * The vDSO data page.
+ */
+static union {
+	struct vdso_data	data;
+	u8			page[PAGE_SIZE];
+} vdso_data_store __page_aligned_data;
+struct vdso_data *vdso_data = &vdso_data_store.data;
+
+#ifdef CONFIG_AARCH32_EMULATION
+/*
+ * Create and map the vectors page for AArch32 tasks.
+ */
+static struct page *vectors_page[1];
+
+static int alloc_vectors_page(void)
+{
+	extern char __kuser_helper_start[], __kuser_helper_end[];
+	int kuser_sz = __kuser_helper_end - __kuser_helper_start;
+	unsigned long vpage;
+
+	vpage = get_zeroed_page(GFP_ATOMIC);
+
+	if (!vpage)
+		return -ENOMEM;
+
+	/* kuser helpers */
+	memcpy((void *)vpage + 0x1000 - kuser_sz, __kuser_helper_start,
+		kuser_sz);
+
+	/* sigreturn code */
+	memcpy((void *)vpage + AARCH32_KERN_SIGRET_CODE_OFFSET,
+		aarch32_sigret_code, sizeof(aarch32_sigret_code));
+
+	flush_icache_range(vpage, vpage + PAGE_SIZE);
+	vectors_page[0] = virt_to_page(vpage);
+
+	return 0;
+}
+arch_initcall(alloc_vectors_page);
+
+int aarch32_setup_vectors_page(struct linux_binprm *bprm, int uses_interp)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long addr = AARCH32_VECTORS_BASE;
+	int ret;
+
+	down_write(&mm->mmap_sem);
+	current->mm->context.vdso = (void *)addr;
+
+	/* Map vectors page at the high address. */
+	ret = install_special_mapping(mm, addr, PAGE_SIZE,
+				      VM_READ|VM_EXEC|VM_MAYREAD|VM_MAYEXEC,
+				      vectors_page);
+
+	up_write(&mm->mmap_sem);
+
+	return ret;
+}
+#endif /* CONFIG_AARCH32_EMULATION */
+
+static int __init vdso_init(void)
+{
+	struct page *pg;
+	char *vbase;
+	int i, ret = 0;
+
+	vdso_pages = (&vdso_end - &vdso_start) >> PAGE_SHIFT;
+	pr_info("vdso: %ld pages (%ld code, %ld data) at base %p\n",
+		vdso_pages + 1, vdso_pages, 1L, &vdso_start);
+
+	/* Allocate the vDSO pagelist, plus a page for the data. */
+	vdso_pagelist = kzalloc(sizeof(struct page *) * (vdso_pages + 1),
+				GFP_KERNEL);
+	if (vdso_pagelist == NULL) {
+		pr_err("Failed to allocate vDSO pagelist!\n");
+		return -ENOMEM;
+	}
+
+	/* Grab the vDSO code pages. */
+	for (i = 0; i < vdso_pages; i++) {
+		pg = virt_to_page(&vdso_start + i*PAGE_SIZE);
+		ClearPageReserved(pg);
+		get_page(pg);
+		vdso_pagelist[i] = pg;
+	}
+
+	/* Sanity check the shared object header. */
+	vbase = vmap(vdso_pagelist, 1, 0, PAGE_KERNEL);
+	if (vbase == NULL) {
+		pr_err("Failed to map vDSO pagelist!\n");
+		return -ENOMEM;
+	} else if (memcmp(vbase, "\177ELF", 4)) {
+		pr_err("vDSO is not a valid ELF object!\n");
+		ret = -EINVAL;
+		goto unmap;
+	}
+
+	/* Grab the vDSO data page. */
+	pg = virt_to_page(vdso_data);
+	get_page(pg);
+	vdso_pagelist[i] = pg;
+
+unmap:
+	vunmap(vbase);
+	return ret;
+}
+arch_initcall(vdso_init);
+
+int arch_setup_additional_pages(struct linux_binprm *bprm,
+				int uses_interp)
+{
+	struct mm_struct *mm = current->mm;
+	unsigned long vdso_base, vdso_mapping_len;
+	int ret;
+
+	/* Be sure to map the data page */
+	vdso_mapping_len = (vdso_pages + 1) << PAGE_SHIFT;
+
+	down_write(&mm->mmap_sem);
+	vdso_base = get_unmapped_area(NULL, 0, vdso_mapping_len, 0, 0);
+	if (IS_ERR_VALUE(vdso_base)) {
+		ret = vdso_base;
+		goto up_fail;
+	}
+	mm->context.vdso = (void *)vdso_base;
+
+	ret = install_special_mapping(mm, vdso_base, vdso_mapping_len,
+				      VM_READ|VM_EXEC|
+				      VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
+				      vdso_pagelist);
+	if (ret) {
+		mm->context.vdso = NULL;
+		goto up_fail;
+	}
+
+up_fail:
+	up_write(&mm->mmap_sem);
+
+	return ret;
+}
+
+const char *arch_vma_name(struct vm_area_struct *vma)
+{
+	/*
+	 * We can re-use the vdso pointer in mm_context_t for identifying
+	 * the vectors page for compat applications. The vDSO will always
+	 * sit above TASK_UNMAPPED_BASE and so we don't need to worry about
+	 * it conflicting with the vectors base.
+	 */
+	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso) {
+#ifdef CONFIG_AARCH32_EMULATION
+		if (vma->vm_start == AARCH32_VECTORS_BASE)
+			return "[vectors]";
+#endif
+		return "[vdso]";
+	}
+
+	return NULL;
+}
+
+/*
+ * We define AT_SYSINFO_EHDR, so we need these function stubs to keep
+ * Linux happy.
+ */
+int in_gate_area_no_mm(unsigned long addr)
+{
+	return 0;
+}
+
+int in_gate_area(struct mm_struct *mm, unsigned long addr)
+{
+	return 0;
+}
+
+struct vm_area_struct *get_gate_vma(struct mm_struct *mm)
+{
+	return NULL;
+}
+
+/*
+ * Update the vDSO data page to keep in sync with kernel timekeeping.
+ */
+void update_vsyscall(struct timespec *ts, struct timespec *wtm,
+		     struct clocksource *clock, u32 mult)
+{
+	struct timespec xtime_coarse;
+	u32 use_syscall = strcmp(clock->name, "arch_sys_counter");
+
+	++vdso_data->tb_seq_count;
+	smp_wmb();
+
+	xtime_coarse = __current_kernel_time();
+	vdso_data->use_syscall			= use_syscall;
+	vdso_data->xtime_coarse_sec		= xtime_coarse.tv_sec;
+	vdso_data->xtime_coarse_nsec		= xtime_coarse.tv_nsec;
+
+	if (!use_syscall) {
+		vdso_data->cs_cycle_last	= clock->cycle_last;
+		vdso_data->xtime_clock_sec	= ts->tv_sec;
+		vdso_data->xtime_clock_nsec	= ts->tv_nsec;
+		vdso_data->cs_mult		= mult;
+		vdso_data->cs_shift		= clock->shift;
+		vdso_data->wtm_clock_sec	= wtm->tv_sec;
+		vdso_data->wtm_clock_nsec	= wtm->tv_nsec;
+	}
+
+	smp_wmb();
+	++vdso_data->tb_seq_count;
+}
+
+void update_vsyscall_tz(void)
+{
+	++vdso_data->tb_seq_count;
+	smp_wmb();
+	vdso_data->tz_minuteswest	= sys_tz.tz_minuteswest;
+	vdso_data->tz_dsttime		= sys_tz.tz_dsttime;
+	smp_wmb();
+	++vdso_data->tb_seq_count;
+}
diff --git a/arch/arm64/kernel/vdso/.gitignore b/arch/arm64/kernel/vdso/.gitignore
new file mode 100644
index 0000000..b8cc94e
--- /dev/null
+++ b/arch/arm64/kernel/vdso/.gitignore
@@ -0,0 +1,2 @@
+vdso.lds
+vdso-offsets.h
diff --git a/arch/arm64/kernel/vdso/Makefile b/arch/arm64/kernel/vdso/Makefile
new file mode 100644
index 0000000..d8064af
--- /dev/null
+++ b/arch/arm64/kernel/vdso/Makefile
@@ -0,0 +1,63 @@
+#
+# Building a vDSO image for AArch64.
+#
+# Author: Will Deacon <will.deacon@arm.com>
+# Heavily based on the vDSO Makefiles for other archs.
+#
+
+obj-vdso := gettimeofday.o note.o sigreturn.o
+
+# Build rules
+targets := $(obj-vdso) vdso.so vdso.so.dbg
+obj-vdso := $(addprefix $(obj)/, $(obj-vdso))
+
+ccflags-y := -shared -fno-common -fno-builtin
+ccflags-y += -nostdlib -Wl,-soname=linux-vdso.so.1 \
+		$(call cc-ldoption, -Wl$(comma)--hash-style=sysv)
+
+obj-y += vdso.o
+extra-y += vdso.lds vdso-offsets.h
+CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
+
+# Force dependency (incbin is bad)
+$(obj)/vdso.o : $(obj)/vdso.so
+
+# Link rule for the .so file, .lds has to be first
+$(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso)
+	$(call if_changed,vdsold)
+
+# Strip rule for the .so file
+$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
+	$(call if_changed,objcopy)
+
+# Generate VDSO offsets using helper script
+gen-vdsosym := $(srctree)/$(src)/gen_vdso_offsets.sh
+quiet_cmd_vdsosym = VDSOSYM $@
+define cmd_vdsosym
+	$(NM) $< | $(gen-vdsosym) | LC_ALL=C sort > $@ && \
+	cp $@ include/generated/
+endef
+
+$(obj)/vdso-offsets.h: $(obj)/vdso.so.dbg FORCE
+	$(call if_changed,vdsosym)
+
+# Assembly rules for the .S files
+$(obj-vdso): %.o: %.S
+	$(call if_changed_dep,vdsoas)
+
+# Actual build commands
+quiet_cmd_vdsold = VDSOL $@
+      cmd_vdsold = $(CC) $(c_flags) -Wl,-T $^ -o $@
+quiet_cmd_vdsoas = VDSOA $@
+      cmd_vdsoas = $(CC) $(a_flags) -c -o $@ $<
+
+# Install commands for the unstripped file
+quiet_cmd_vdso_install = INSTALL $@
+      cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
+
+vdso.so: $(obj)/vdso.so.dbg
+	@mkdir -p $(MODLIB)/vdso
+	$(call cmd,vdso_install)
+
+vdso_install: vdso.so
diff --git a/arch/arm64/kernel/vdso/gen_vdso_offsets.sh b/arch/arm64/kernel/vdso/gen_vdso_offsets.sh
new file mode 100755
index 0000000..01924ff
--- /dev/null
+++ b/arch/arm64/kernel/vdso/gen_vdso_offsets.sh
@@ -0,0 +1,15 @@
+#!/bin/sh
+
+#
+# Match symbols in the DSO that look like VDSO_*; produce a header file
+# of constant offsets into the shared object.
+#
+# Doing this inside the Makefile will break the $(filter-out) function,
+# causing Kbuild to rebuild the vdso-offsets header file every time.
+#
+# Author: Will Deacon <will.deacon@arm.com>
+#
+
+LC_ALL=C
+sed -n -e 's/^00*/0/' -e \
+'s/^\([0-9a-fA-F]*\) . VDSO_\([a-zA-Z0-9_]*\)$/\#define vdso_offset_\2\t0x\1/p'
diff --git a/arch/arm64/kernel/vdso/gettimeofday.S b/arch/arm64/kernel/vdso/gettimeofday.S
new file mode 100644
index 0000000..dcb8c20
--- /dev/null
+++ b/arch/arm64/kernel/vdso/gettimeofday.S
@@ -0,0 +1,242 @@
+/*
+ * Userspace implementations of gettimeofday() and friends.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+#include <asm/unistd.h>
+
+#define NSEC_PER_SEC_LO16	0xca00
+#define NSEC_PER_SEC_HI16	0x3b9a
+
+vdso_data	.req	x6
+use_syscall	.req	w7
+seqcnt		.req	w8
+
+	.macro	seqcnt_acquire
+9999:	ldr	seqcnt, [vdso_data, #VDSO_TB_SEQ_COUNT]
+	tbnz	seqcnt, #0, 9999b
+	dmb	ishld
+	ldr	use_syscall, [vdso_data, #VDSO_USE_SYSCALL]
+	.endm
+
+	.macro	seqcnt_read, cnt
+	dmb	ishld
+	ldr	\cnt, [vdso_data, #VDSO_TB_SEQ_COUNT]
+	.endm
+
+	.macro	seqcnt_check, cnt, fail
+	cmp	\cnt, seqcnt
+	b.ne	\fail
+	.endm
+
+	.text
+
+/* int __kernel_gettimeofday(struct timeval *tv, struct timezone *tz); */
+ENTRY(__kernel_gettimeofday)
+	.cfi_startproc
+	mov	x2, x30
+	.cfi_register x30, x2
+
+	/* Acquire the sequence counter and get the timespec. */
+	adr	vdso_data, _vdso_data
+1:	seqcnt_acquire
+	cbnz	use_syscall, 4f
+
+	/* If tv is NULL, skip to the timezone code. */
+	cbz	x0, 2f
+	bl	__do_get_tspec
+	seqcnt_check w13, 1b
+
+	/* Convert ns to us. */
+	mov	x11, #1000
+	udiv	x10, x10, x11
+	stp	x9, x10, [x0, #TVAL_TV_SEC]
+2:
+	/* If tz is NULL, return 0. */
+	cbz	x1, 3f
+	ldp	w4, w5, [vdso_data, #VDSO_TZ_MINWEST]
+	seqcnt_read w13
+	seqcnt_check w13, 1b
+	stp	w4, w5, [x1, #TZ_MINWEST]
+3:
+	mov	x0, xzr
+	ret	x2
+4:
+	/* Syscall fallback. */
+	mov	x8, #__NR_gettimeofday
+	svc	#0
+	ret	x2
+	.cfi_endproc
+ENDPROC(__kernel_gettimeofday)
+
+/* int __kernel_clock_gettime(clockid_t clock_id, struct timespec *tp); */
+ENTRY(__kernel_clock_gettime)
+	.cfi_startproc
+	mov	x2, x30
+	.cfi_register x30, x2
+
+	cmp	w0, #CLOCK_REALTIME
+	ccmp	w0, #CLOCK_MONOTONIC, #0x4, ne
+	b.ne	2f
+
+	/* Get kernel timespec. */
+	adr	vdso_data, _vdso_data
+1:	seqcnt_acquire
+	cbnz	use_syscall, 7f
+
+	bl	__do_get_tspec
+	seqcnt_check w13, 1b
+
+	cmp	w0, #CLOCK_MONOTONIC
+	b.ne	6f
+
+	/* Get wtm timespec. */
+	ldp	x14, x15, [vdso_data, #VDSO_WTM_CLK_SEC]
+
+	/* Check the sequence counter. */
+	seqcnt_read w13
+	seqcnt_check w13, 1b
+	b	4f
+2:
+	cmp	w0, #CLOCK_REALTIME_COARSE
+	ccmp	w0, #CLOCK_MONOTONIC_COARSE, #0x4, ne
+	b.ne	8f
+
+	/* Get coarse timespec. */
+	adr	vdso_data, _vdso_data
+3:	seqcnt_acquire
+	ldp	x9, x10, [vdso_data, #VDSO_XTIME_CRS_SEC]
+
+	cmp	w0, #CLOCK_MONOTONIC_COARSE
+	b.ne	6f
+
+	/* Get wtm timespec. */
+	ldp	x14, x15, [vdso_data, #VDSO_WTM_CLK_SEC]
+
+	/* Check the sequence counter. */
+	seqcnt_read w13
+	seqcnt_check w13, 3b
+4:
+	/* Add on wtm timespec. */
+	add	x9, x9, x14
+	add	x10, x10, x15
+
+	/* Normalise the new timespec. */
+	mov	x14, #NSEC_PER_SEC_LO16
+	movk	x14, #NSEC_PER_SEC_HI16, lsl #16
+	cmp	x10, x14
+	b.lt	5f
+	sub	x10, x10, x14
+	add	x9, x9, #1
+5:
+	cmp	x10, #0
+	b.ge	6f
+	add	x10, x10, x14
+	sub	x9, x9, #1
+
+6:	/* Store to the user timespec. */
+	stp	x9, x10, [x1, #TSPEC_TV_SEC]
+	mov	x0, xzr
+	ret	x2
+7:
+	mov	x30, x2
+8:	/* Syscall fallback. */
+	mov	x8, #__NR_clock_gettime
+	svc	#0
+	ret
+	.cfi_endproc
+ENDPROC(__kernel_clock_gettime)
+
+/* int __kernel_clock_getres(clockid_t clock_id, struct timespec *res); */
+ENTRY(__kernel_clock_getres)
+	.cfi_startproc
+	cbz	x1, 3f
+
+	cmp	w0, #CLOCK_REALTIME
+	ccmp	w0, #CLOCK_MONOTONIC, #0x4, ne
+	b.ne	1f
+
+	ldr	x2, 5f
+	b	2f
+1:
+	cmp	w0, #CLOCK_REALTIME_COARSE
+	ccmp	w0, #CLOCK_MONOTONIC_COARSE, #0x4, ne
+	b.ne	4f
+	ldr	x2, 6f
+2:
+	stp	xzr, x2, [x1]
+
+3:	/* res == NULL. */
+	mov	w0, wzr
+	ret
+
+4:	/* Syscall fallback. */
+	mov	x8, #__NR_clock_getres
+	svc	#0
+	ret
+5:
+	.quad	CLOCK_REALTIME_RES
+6:
+	.quad	CLOCK_COARSE_RES
+	.cfi_endproc
+ENDPROC(__kernel_clock_getres)
+
+/*
+ * Read the current time from the architected counter.
+ * Expects vdso_data to be initialised.
+ * Clobbers the temporary registers (x9 - x15).
+ * Returns:
+ *  - (x9, x10) = (ts->tv_sec, ts->tv_nsec)
+ *  - (x11, x12) = (xtime->tv_sec, xtime->tv_nsec)
+ *  - w13 = vDSO sequence counter
+ */
+ENTRY(__do_get_tspec)
+	.cfi_startproc
+
+	/* Read from the vDSO data page. */
+	ldr	x10, [vdso_data, #VDSO_CS_CYCLE_LAST]
+	ldp	x11, x12, [vdso_data, #VDSO_XTIME_CLK_SEC]
+	ldp	w14, w15, [vdso_data, #VDSO_CS_MULT]
+	seqcnt_read w13
+
+	/* Read the physical counter. */
+	isb
+	mrs	x9, cntpct_el0
+
+	/* Calculate cycle delta and convert to ns. */
+	sub	x10, x9, x10
+	/* We can only guarantee 56 bits of precision. */
+	movn	x9, #0xff00, lsl #48
+	and	x10, x9, x10
+	mul	x10, x10, x14
+	lsr	x10, x10, x15
+
+	/* Use the kernel time to calculate the new timespec. */
+	add	x10, x12, x10
+	mov	x14, #NSEC_PER_SEC_LO16
+	movk	x14, #NSEC_PER_SEC_HI16, lsl #16
+	udiv	x15, x10, x14
+	add	x9, x15, x11
+	mul	x14, x14, x15
+	sub	x10, x10, x14
+
+	ret
+	.cfi_endproc
+ENDPROC(__do_get_tspec)
diff --git a/arch/arm64/kernel/vdso/note.S b/arch/arm64/kernel/vdso/note.S
new file mode 100644
index 0000000..b82c85e
--- /dev/null
+++ b/arch/arm64/kernel/vdso/note.S
@@ -0,0 +1,28 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ *
+ * This supplies .note.* sections to go into the PT_NOTE inside the vDSO text.
+ * Here we can supply some information useful to userland.
+ */
+
+#include <linux/uts.h>
+#include <linux/version.h>
+#include <linux/elfnote.h>
+
+ELFNOTE_START(Linux, 0, "a")
+	.long LINUX_VERSION_CODE
+ELFNOTE_END
diff --git a/arch/arm64/kernel/vdso/sigreturn.S b/arch/arm64/kernel/vdso/sigreturn.S
new file mode 100644
index 0000000..20d98ef
--- /dev/null
+++ b/arch/arm64/kernel/vdso/sigreturn.S
@@ -0,0 +1,37 @@
+/*
+ * Sigreturn trampoline for returning from a signal when the SA_RESTORER
+ * flag is not set.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/linkage.h>
+#include <asm/unistd.h>
+
+	.text
+
+	nop
+ENTRY(__kernel_rt_sigreturn)
+	.cfi_startproc
+	.cfi_signal_frame
+	.cfi_def_cfa	x29, 0
+	.cfi_offset	x29, 0 * 8
+	.cfi_offset	x30, 1 * 8
+	mov	x8, #__NR_rt_sigreturn
+	svc	#0
+	.cfi_endproc
+ENDPROC(__kernel_rt_sigreturn)
diff --git a/arch/arm64/kernel/vdso/vdso.S b/arch/arm64/kernel/vdso/vdso.S
new file mode 100644
index 0000000..60c1db5
--- /dev/null
+++ b/arch/arm64/kernel/vdso/vdso.S
@@ -0,0 +1,33 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/page.h>
+
+	__PAGE_ALIGNED_DATA
+
+	.globl vdso_start, vdso_end
+	.balign PAGE_SIZE
+vdso_start:
+	.incbin "arch/arm64/kernel/vdso/vdso.so"
+	.balign PAGE_SIZE
+vdso_end:
+
+	.previous
diff --git a/arch/arm64/kernel/vdso/vdso.lds.S b/arch/arm64/kernel/vdso/vdso.lds.S
new file mode 100644
index 0000000..8154b8d
--- /dev/null
+++ b/arch/arm64/kernel/vdso/vdso.lds.S
@@ -0,0 +1,100 @@
+/*
+ * GNU linker script for the VDSO library.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Heavily based on the vDSO linker scripts for other archs.
+ */
+
+#include <linux/const.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+OUTPUT_FORMAT("elf64-littleaarch64", "elf64-bigaarch64", "elf64-littleaarch64")
+OUTPUT_ARCH(aarch64)
+
+SECTIONS
+{
+	. = VDSO_LBASE + SIZEOF_HEADERS;
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text	:note
+
+	. = ALIGN(16);
+
+	.text		: { *(.text*) }			:text	=0xd503201f
+	PROVIDE (__etext = .);
+	PROVIDE (_etext = .);
+	PROVIDE (etext = .);
+
+	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text	:eh_frame_hdr
+	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.dynamic	: { *(.dynamic) }		:text	:dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	_end = .;
+	PROVIDE(end = .);
+
+	. = ALIGN(PAGE_SIZE);
+	PROVIDE(_vdso_data = .);
+
+	/DISCARD/	: {
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+	eh_frame_hdr	PT_GNU_EH_FRAME;
+}
+
+/*
+ * This controls what symbols we export from the DSO.
+ */
+VERSION
+{
+	LINUX_2.6.39 {
+	global:
+		__kernel_rt_sigreturn;
+		__kernel_gettimeofday;
+		__kernel_clock_gettime;
+		__kernel_clock_getres;
+	local: *;
+	};
+}
+
+/*
+ * Make the sigreturn code visible to the kernel.
+ */
+VDSO_sigtramp		= __kernel_rt_sigreturn;
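
The data page above is protected by a sequence counter: the writer
bumps tb_seq_count to an odd value, updates the fields, then bumps it
again, while readers retry if they see an odd value or a change. A
minimal user-space sketch of that protocol using C11 atomics follows.
The demo_* names are hypothetical, the sketch assumes only one writer
runs at a time, and the C11 fences are a stand-in for the
smp_wmb()/dmb pairing used in the patch rather than a copy of it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, cut-down data page: a counter plus two payload fields. */
struct demo_vdso_data {
	atomic_uint seq;
	_Atomic uint64_t sec;
	_Atomic uint64_t nsec;
};

/* Writer side; assumes callers are already serialised. */
void demo_update(struct demo_vdso_data *d, uint64_t sec, uint64_t nsec)
{
	unsigned int s = atomic_load_explicit(&d->seq, memory_order_relaxed);

	assert((s & 1) == 0);	/* no update already in progress */

	/* Odd value: tell readers an update is in flight. */
	atomic_store_explicit(&d->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&d->sec, sec, memory_order_relaxed);
	atomic_store_explicit(&d->nsec, nsec, memory_order_relaxed);

	/* Even value again: publish the new contents. */
	atomic_store_explicit(&d->seq, s + 2, memory_order_release);
}

/* Reader side: spin while odd, retry if the counter moved. */
void demo_read(struct demo_vdso_data *d, uint64_t *sec, uint64_t *nsec)
{
	unsigned int start, end;

	do {
		do {
			start = atomic_load_explicit(&d->seq,
						     memory_order_acquire);
		} while (start & 1);

		*sec = atomic_load_explicit(&d->sec, memory_order_relaxed);
		*nsec = atomic_load_explicit(&d->nsec, memory_order_relaxed);

		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&d->seq, memory_order_relaxed);
	} while (end != start);
}
```

Readers never block the writer in this scheme; the cost of a concurrent
update is one more pass around the reader loop.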



* [PATCH v2 19/31] arm64: Signal handling support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (17 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 18/31] arm64: VDSO support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 17:52 ` [PATCH v2 20/31] arm64: User access library functions Catalin Marinas
                   ` (12 subsequent siblings)
  31 siblings, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds support for signal handling. The sigreturn is done via
VDSO, introduced by a previous patch. The SA_RESTORER is still defined
as it is required for 32-bit (compat) support but it is not to be used
for 64-bit applications.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
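The extension records in sigcontext.__reserved are a sequence of
(magic, size) headers that ends with a record whose magic and size are
both 0. A minimal user-space sketch of walking such a buffer is below.
It assumes the caller already holds a copy of the __reserved area;
struct aarch64_ctx_hdr and find_ctx_record() are hypothetical names
used only for illustration and are not part of this patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the (magic, size) record header from sigcontext.h. */
struct aarch64_ctx_hdr {
	uint32_t magic;
	uint32_t size;
};

#define FPSIMD_MAGIC 0x46508001u

/*
 * Hypothetical helper: walk the records in a copy of sigcontext.__reserved
 * and return the byte offset of the first record whose magic matches.
 * Returns -1 if the terminator (magic 0, size 0) or a malformed record is
 * reached first, or if the buffer runs out.
 */
static long find_ctx_record(const unsigned char *buf, size_t len,
			    uint32_t magic)
{
	size_t off = 0;

	while (len - off >= sizeof(struct aarch64_ctx_hdr)) {
		struct aarch64_ctx_hdr hdr;

		/* memcpy avoids any alignment assumption on buf */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.magic == 0 && hdr.size == 0)
			return -1;		/* terminator record */
		if (hdr.size < sizeof(hdr) || hdr.size > len - off)
			return -1;		/* malformed record */
		if (hdr.magic == magic)
			return (long)off;
		off += hdr.size;
	}
	return -1;
}
```

A handler would look up FPSIMD_MAGIC this way and treat a -1 result as
"no FP/SIMD record present" rather than assume a fixed layout.
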
 arch/arm64/include/asm/sigcontext.h |   69 ++++++
 arch/arm64/include/asm/siginfo.h    |   23 ++
 arch/arm64/include/asm/signal.h     |   24 ++
 arch/arm64/include/asm/ucontext.h   |   30 +++
 arch/arm64/kernel/signal.c          |  436 +++++++++++++++++++++++++++++++++++
 5 files changed, 582 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/sigcontext.h
 create mode 100644 arch/arm64/include/asm/siginfo.h
 create mode 100644 arch/arm64/include/asm/signal.h
 create mode 100644 arch/arm64/include/asm/ucontext.h
 create mode 100644 arch/arm64/kernel/signal.c

diff --git a/arch/arm64/include/asm/sigcontext.h b/arch/arm64/include/asm/sigcontext.h
new file mode 100644
index 0000000..573cec7
--- /dev/null
+++ b/arch/arm64/include/asm/sigcontext.h
@@ -0,0 +1,69 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGCONTEXT_H
+#define __ASM_SIGCONTEXT_H
+
+#include <linux/types.h>
+
+/*
+ * Signal context structure - contains all info to do with the state
+ * before the signal handler was invoked.
+ */
+struct sigcontext {
+	__u64 fault_address;
+	/* AArch64 registers */
+	__u64 regs[31];
+	__u64 sp;
+	__u64 pc;
+	__u64 pstate;
+	/* 4K reserved for FP/SIMD state and future expansion */
+	__u8 __reserved[4096] __attribute__((__aligned__(16)));
+};
+
+/*
+ * Header to be used at the beginning of structures extending the user
+ * context. Such structures must be placed after the rt_sigframe on the stack
+ * and be 16-byte aligned. The last structure must be a dummy one with the
+ * magic and size set to 0.
+ */
+struct _aarch64_ctx {
+	__u32 magic;
+	__u32 size;
+};
+
+#define FPSIMD_MAGIC	0x46508001
+
+struct fpsimd_context {
+	struct _aarch64_ctx head;
+	__u32 fpsr;
+	__u32 fpcr;
+	__uint128_t vregs[32];
+};
+
+#ifdef __KERNEL__
+/*
+ * Auxiliary context saved in the sigcontext.__reserved array. Not exported to
+ * user space as it will change with the addition of new context. User space
+ * should check the magic/size information.
+ */
+struct aux_context {
+	struct fpsimd_context fpsimd;
+	/* additional context to be added before "end" */
+	struct _aarch64_ctx end;
+};
+#endif
+
+#endif
diff --git a/arch/arm64/include/asm/siginfo.h b/arch/arm64/include/asm/siginfo.h
new file mode 100644
index 0000000..5a74a08
--- /dev/null
+++ b/arch/arm64/include/asm/siginfo.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGINFO_H
+#define __ASM_SIGINFO_H
+
+#define __ARCH_SI_PREAMBLE_SIZE	(4 * sizeof(int))
+
+#include <asm-generic/siginfo.h>
+
+#endif
diff --git a/arch/arm64/include/asm/signal.h b/arch/arm64/include/asm/signal.h
new file mode 100644
index 0000000..8d1e723
--- /dev/null
+++ b/arch/arm64/include/asm/signal.h
@@ -0,0 +1,24 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGNAL_H
+#define __ASM_SIGNAL_H
+
+/* Required for AArch32 compatibility. */
+#define SA_RESTORER	0x04000000
+
+#include <asm-generic/signal.h>
+
+#endif
diff --git a/arch/arm64/include/asm/ucontext.h b/arch/arm64/include/asm/ucontext.h
new file mode 100644
index 0000000..bde9607
--- /dev/null
+++ b/arch/arm64/include/asm/ucontext.h
@@ -0,0 +1,30 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_UCONTEXT_H
+#define __ASM_UCONTEXT_H
+
+struct ucontext {
+	unsigned long	  uc_flags;
+	struct ucontext	 *uc_link;
+	stack_t		  uc_stack;
+	sigset_t	  uc_sigmask;
+	/* glibc uses a 1024-bit sigset_t */
+	__u8		  __unused[(1024 - sizeof(sigset_t)) / 8];
+	/* last for future expansion */
+	struct sigcontext uc_mcontext;
+};
+
+#endif /* __ASM_UCONTEXT_H */
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
new file mode 100644
index 0000000..a8f29d2
--- /dev/null
+++ b/arch/arm64/kernel/signal.c
@@ -0,0 +1,436 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/errno.h>
+#include <linux/signal.h>
+#include <linux/personality.h>
+#include <linux/freezer.h>
+#include <linux/uaccess.h>
+#include <linux/tracehook.h>
+#include <linux/ratelimit.h>
+
+#include <asm/debug-monitors.h>
+#include <asm/elf.h>
+#include <asm/cacheflush.h>
+#include <asm/ucontext.h>
+#include <asm/unistd.h>
+#include <asm/fpsimd.h>
+#include <asm/signal32.h>
+#include <asm/vdso.h>
+
+/*
+ * Do a signal return; undo the signal stack. These are aligned to 128-bit.
+ */
+struct rt_sigframe {
+	struct siginfo info;
+	struct ucontext uc;
+};
+
+static int preserve_fpsimd_context(struct fpsimd_context __user *ctx)
+{
+	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
+	int err;
+
+	/* dump the hardware registers to the fpsimd_state structure */
+	fpsimd_save_state(fpsimd);
+
+	/* copy the FP and status/control registers */
+	err = __copy_to_user(ctx->vregs, fpsimd->vregs, sizeof(fpsimd->vregs));
+	__put_user_error(fpsimd->fpsr, &ctx->fpsr, err);
+	__put_user_error(fpsimd->fpcr, &ctx->fpcr, err);
+
+	/* copy the magic/size information */
+	__put_user_error(FPSIMD_MAGIC, &ctx->head.magic, err);
+	__put_user_error(sizeof(struct fpsimd_context), &ctx->head.size, err);
+
+	return err ? -EFAULT : 0;
+}
+
+static int restore_fpsimd_context(struct fpsimd_context __user *ctx)
+{
+	struct fpsimd_state fpsimd;
+	__u32 magic, size;
+	int err = 0;
+
+	/* check the magic/size information */
+	__get_user_error(magic, &ctx->head.magic, err);
+	__get_user_error(size, &ctx->head.size, err);
+	if (err)
+		return -EFAULT;
+	if (magic != FPSIMD_MAGIC || size != sizeof(struct fpsimd_context))
+		return -EINVAL;
+
+	/* copy the FP and status/control registers */
+	err = __copy_from_user(fpsimd.vregs, ctx->vregs,
+			       sizeof(fpsimd.vregs));
+	__get_user_error(fpsimd.fpsr, &ctx->fpsr, err);
+	__get_user_error(fpsimd.fpcr, &ctx->fpcr, err);
+
+	/* load the hardware registers from the fpsimd_state structure */
+	if (!err) {
+		preempt_disable();
+		fpsimd_load_state(&fpsimd);
+		preempt_enable();
+	}
+
+	return err ? -EFAULT : 0;
+}
+
+static int restore_sigframe(struct pt_regs *regs,
+			    struct rt_sigframe __user *sf)
+{
+	sigset_t set;
+	int i, err;
+	struct aux_context __user *aux =
+		(struct aux_context __user *)sf->uc.uc_mcontext.__reserved;
+
+	err = __copy_from_user(&set, &sf->uc.uc_sigmask, sizeof(set));
+	if (err == 0)
+		set_current_blocked(&set);
+
+	for (i = 0; i < 31; i++)
+		__get_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+				 err);
+	__get_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
+	__get_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
+	__get_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
+
+	/*
+	 * Avoid sys_rt_sigreturn() restarting.
+	 */
+	regs->syscallno = ~0UL;
+
+	err |= !valid_user_regs(&regs->user_regs);
+
+	if (err == 0)
+		err |= restore_fpsimd_context(&aux->fpsimd);
+
+	return err;
+}
+
+asmlinkage long sys_rt_sigreturn(struct pt_regs *regs)
+{
+	struct rt_sigframe __user *frame;
+
+	/* Always make any pending restarted system calls return -EINTR */
+	current_thread_info()->restart_block.fn = do_no_restart_syscall;
+
+	/*
+	 * Since we stacked the signal on a 128-bit boundary, then 'sp' should
+	 * be 16-byte aligned here.
+	 */
+	if (regs->sp & 15)
+		goto badframe;
+
+	frame = (struct rt_sigframe __user *)regs->sp;
+
+	if (!access_ok(VERIFY_READ, frame, sizeof (*frame)))
+		goto badframe;
+
+	if (restore_sigframe(regs, frame))
+		goto badframe;
+
+	if (do_sigaltstack(&frame->uc.uc_stack,
+			   NULL, regs->sp) == -EFAULT)
+		goto badframe;
+
+	return regs->regs[0];
+
+badframe:
+	if (show_unhandled_signals)
+		printk_ratelimited(KERN_INFO "%s[%d]: bad frame in %s: pc=%08llx sp=%08llx\n",
+				   current->comm, task_pid_nr(current), __func__,
+				   regs->pc, regs->sp);
+	force_sig(SIGSEGV, current);
+	return 0;
+}
+
+asmlinkage long sys_sigaltstack(const stack_t __user *uss, stack_t __user *uoss,
+				unsigned long sp)
+{
+	return do_sigaltstack(uss, uoss, sp);
+}
+
+static int setup_sigframe(struct rt_sigframe __user *sf,
+			  struct pt_regs *regs, sigset_t *set)
+{
+	int i, err = 0;
+	struct aux_context __user *aux =
+		(struct aux_context __user *)sf->uc.uc_mcontext.__reserved;
+
+	for (i = 0; i < 31; i++)
+		__put_user_error(regs->regs[i], &sf->uc.uc_mcontext.regs[i],
+				 err);
+	__put_user_error(regs->sp, &sf->uc.uc_mcontext.sp, err);
+	__put_user_error(regs->pc, &sf->uc.uc_mcontext.pc, err);
+	__put_user_error(regs->pstate, &sf->uc.uc_mcontext.pstate, err);
+
+	__put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err);
+
+	err |= __copy_to_user(&sf->uc.uc_sigmask, set, sizeof(*set));
+
+	if (err == 0)
+		err |= preserve_fpsimd_context(&aux->fpsimd);
+
+	/* set the "end" magic */
+	__put_user_error(0, &aux->end.magic, err);
+	__put_user_error(0, &aux->end.size, err);
+
+	return err;
+}
+
+static void __user *get_sigframe(struct k_sigaction *ka, struct pt_regs *regs,
+				 int framesize)
+{
+	unsigned long sp, sp_top;
+	void __user *frame;
+
+	sp = sp_top = regs->sp;
+
+	/*
+	 * This is the X/Open sanctioned signal stack switching.
+	 */
+	if ((ka->sa.sa_flags & SA_ONSTACK) && !sas_ss_flags(sp))
+		sp = sp_top = current->sas_ss_sp + current->sas_ss_size;
+
+	/* room for stack frame (FP, LR) */
+	sp -= 16;
+
+	sp = (sp - framesize) & ~15;
+	frame = (void __user *)sp;
+
+	/*
+	 * Check that we can actually write to the signal frame.
+	 */
+	if (!access_ok(VERIFY_WRITE, frame, sp_top - sp))
+		frame = NULL;
+
+	return frame;
+}
+
+static int setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+			void __user *frame, int usig)
+{
+	int err = 0;
+	__sigrestore_t sigtramp;
+	unsigned long __user *sp = (unsigned long __user *)regs->sp;
+
+	/* set up the stack frame */
+	__put_user_error(regs->regs[29], sp - 2, err);
+	__put_user_error(regs->regs[30], sp - 1, err);
+
+	regs->regs[0] = usig;
+	regs->regs[29] = regs->sp - 16;
+	regs->sp = (unsigned long)frame;
+	regs->pc = (unsigned long)ka->sa.sa_handler;
+
+	if (ka->sa.sa_flags & SA_RESTORER)
+		sigtramp = ka->sa.sa_restorer;
+	else
+		sigtramp = VDSO_SYMBOL(current->mm->context.vdso, sigtramp);
+
+	regs->regs[30] = (unsigned long)sigtramp;
+
+	return err;
+}
+
+static int setup_rt_frame(int usig, struct k_sigaction *ka, siginfo_t *info,
+			  sigset_t *set, struct pt_regs *regs)
+{
+	struct rt_sigframe __user *frame;
+	stack_t stack;
+	int err = 0;
+
+	frame = get_sigframe(ka, regs, sizeof(*frame));
+	if (!frame)
+		return 1;
+
+	__put_user_error(0, &frame->uc.uc_flags, err);
+	__put_user_error(NULL, &frame->uc.uc_link, err);
+
+	memset(&stack, 0, sizeof(stack));
+	stack.ss_sp = (void __user *)current->sas_ss_sp;
+	stack.ss_flags = sas_ss_flags(regs->sp);
+	stack.ss_size = current->sas_ss_size;
+	err |= __copy_to_user(&frame->uc.uc_stack, &stack, sizeof(stack));
+
+	err |= setup_sigframe(frame, regs, set);
+	if (err == 0)
+		err = setup_return(regs, ka, frame, usig);
+
+	if (err == 0 && ka->sa.sa_flags & SA_SIGINFO) {
+		err |= copy_siginfo_to_user(&frame->info, info);
+		regs->regs[1] = (unsigned long)&frame->info;
+		regs->regs[2] = (unsigned long)&frame->uc;
+	}
+
+	return err;
+}
+
+static void setup_restart_syscall(struct pt_regs *regs)
+{
+	if (test_thread_flag(TIF_32BIT))
+		compat_setup_restart_syscall(regs);
+	else
+		regs->regs[8] = __NR_restart_syscall;
+}
+
+/*
+ * OK, we're invoking a handler
+ */
+static void handle_signal(unsigned long sig, struct k_sigaction *ka,
+			  siginfo_t *info, struct pt_regs *regs)
+{
+	struct thread_info *thread = current_thread_info();
+	struct task_struct *tsk = current;
+	sigset_t *oldset = sigmask_to_save();
+	int usig = sig;
+	int ret;
+
+	/*
+	 * translate the signal
+	 */
+	if (usig < 32 && thread->exec_domain && thread->exec_domain->signal_invmap)
+		usig = thread->exec_domain->signal_invmap[usig];
+
+	/*
+	 * Set up the stack frame
+	 */
+	if (test_thread_flag(TIF_32BIT)) {
+		if (ka->sa.sa_flags & SA_SIGINFO)
+			ret = compat_setup_rt_frame(usig, ka, info, oldset,
+						    regs);
+		else
+			ret = compat_setup_frame(usig, ka, oldset, regs);
+	} else {
+		ret = setup_rt_frame(usig, ka, info, oldset, regs);
+	}
+
+	/*
+	 * Check that the resulting registers are actually sane.
+	 */
+	ret |= !valid_user_regs(&regs->user_regs);
+
+	if (ret != 0) {
+		force_sigsegv(sig, tsk);
+		return;
+	}
+
+	/*
+	 * Fast forward the stepping logic so we step into the signal
+	 * handler.
+	 */
+	user_fastforward_single_step(tsk);
+
+	signal_delivered(sig, info, ka, regs, 0);
+}
+
+/*
+ * Note that 'init' is a special process: it doesn't get signals it doesn't
+ * want to handle. Thus you cannot kill init with a SIGKILL, even by
+ * mistake.
+ *
+ * Note that we go through the signals twice: once to check the signals that
+ * the kernel can handle, and then we build all the user-level signal handling
+ * stack-frames in one go after that.
+ */
+static void do_signal(struct pt_regs *regs)
+{
+	unsigned long continue_addr = 0, restart_addr = 0;
+	struct k_sigaction ka;
+	siginfo_t info;
+	int signr, retval = 0;
+	int syscall = (int)regs->syscallno;
+
+	/*
+	 * If we were from a system call, check for system call restarting...
+	 */
+	if (syscall >= 0) {
+		continue_addr = regs->pc;
+		restart_addr = continue_addr - (compat_thumb_mode(regs) ? 2 : 4);
+		retval = regs->regs[0];
+
+		/*
+		 * Avoid additional syscall restarting via ret_to_user.
+		 */
+		regs->syscallno = ~0UL;
+
+		/*
+		 * Prepare for system call restart. We do this here so that a
+		 * debugger will see the already changed PC.
+		 */
+		switch (retval) {
+		case -ERESTARTNOHAND:
+		case -ERESTARTSYS:
+		case -ERESTARTNOINTR:
+		case -ERESTART_RESTARTBLOCK:
+			regs->regs[0] = regs->orig_x0;
+			regs->pc = restart_addr;
+			break;
+		}
+	}
+
+	/*
+	 * Get the signal to deliver. When running under ptrace, at this point
+	 * the debugger may change all of our registers.
+	 */
+	signr = get_signal_to_deliver(&info, &ka, regs, NULL);
+	if (signr > 0) {
+		/*
+		 * Depending on the signal settings, we may need to revert the
+		 * decision to restart the system call, but skip this if a
+		 * debugger has chosen to restart at a different PC.
+		 */
+		if (regs->pc == restart_addr &&
+		    (retval == -ERESTARTNOHAND ||
+		     retval == -ERESTART_RESTARTBLOCK ||
+		     (retval == -ERESTARTSYS &&
+		      !(ka.sa.sa_flags & SA_RESTART)))) {
+			regs->regs[0] = -EINTR;
+			regs->pc = continue_addr;
+		}
+
+		handle_signal(signr, &ka, &info, regs);
+		return;
+	}
+
+	/*
+	 * Handle restarting a different system call. As above, if a debugger
+	 * has chosen to restart at a different PC, ignore the restart.
+	 */
+	if (syscall >= 0 && regs->pc == restart_addr) {
+		if (retval == -ERESTART_RESTARTBLOCK)
+			setup_restart_syscall(regs);
+		user_rewind_single_step(current);
+	}
+
+	restore_saved_sigmask();
+}
+
+asmlinkage void do_notify_resume(struct pt_regs *regs,
+				 unsigned int thread_flags)
+{
+	if (thread_flags & _TIF_SIGPENDING)
+		do_signal(regs);
+
+	if (thread_flags & _TIF_NOTIFY_RESUME) {
+		clear_thread_flag(TIF_NOTIFY_RESUME);
+		tracehook_notify_resume(regs);
+	}
+}

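The restart handling in do_signal() above can be summarised as a small
decision table. The sketch below is a user-space model, not kernel code:
it assumes the task was in a system call and that a debugger has not
moved the PC. enum restart_action and classify_restart() are
hypothetical names; the numeric values are the kernel-internal restart
codes from include/linux/errno.h.

```c
#include <stdbool.h>

/* Kernel-internal restart codes (never seen by user space). */
#define ERESTARTSYS		512
#define ERESTARTNOINTR		513
#define ERESTARTNOHAND		514
#define ERESTART_RESTARTBLOCK	516

enum restart_action {
	RESTART_NONE,		/* return value is not a restart code */
	RESTART_SAME,		/* re-execute the original system call */
	RESTART_VIA_BLOCK,	/* issue restart_syscall instead */
	RESTART_EINTR,		/* run the handler; syscall returns -EINTR */
};

/*
 * handler:    a user handler is about to be invoked for the signal
 * sa_restart: that handler was installed with SA_RESTART
 */
static enum restart_action classify_restart(long retval, bool handler,
					    bool sa_restart)
{
	switch (retval) {
	case -ERESTARTNOINTR:
		return RESTART_SAME;
	case -ERESTARTSYS:
		return (handler && !sa_restart) ? RESTART_EINTR : RESTART_SAME;
	case -ERESTARTNOHAND:
		return handler ? RESTART_EINTR : RESTART_SAME;
	case -ERESTART_RESTARTBLOCK:
		return handler ? RESTART_EINTR : RESTART_VIA_BLOCK;
	default:
		return RESTART_NONE;
	}
}
```
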

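The frame placement done by get_sigframe() above is plain arithmetic and
can be checked in isolation. The sketch below only models the two
statements that compute the address; it leaves out the SA_ONSTACK switch
and the access_ok() check. sigframe_addr() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Model of the address computation in get_sigframe(): reserve 16 bytes
 * for the (FP, LR) frame record, then place the signal frame below it,
 * rounded down to a 16-byte boundary.
 */
static uint64_t sigframe_addr(uint64_t sp, uint64_t framesize)
{
	sp -= 16;				/* room for FP and LR */
	return (sp - framesize) & ~(uint64_t)15;	/* 16-byte align */
}
```

Because the result is always a multiple of 16, the "regs->sp & 15" test
in sys_rt_sigreturn() rejects any frame pointer that did not come from
this computation.
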

* [PATCH v2 20/31] arm64: User access library functions
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (18 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 19/31] arm64: Signal handling support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 14:49   ` [PATCH v2 20/31] arm64: User access library function Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 21/31] arm64: 32-bit (compat) applications support Catalin Marinas
                   ` (11 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel
  Cc: linux-kernel, Arnd Bergmann, Will Deacon, Marc Zyngier

This patch adds support for various user access functions. These
functions use the standard LDR/STR instructions and not the LDRT/STRT
variants in order to allow kernel addresses (after set_fs(KERNEL_DS)).

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
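The __range_ok() macro in uaccess.h is documented as the 65-bit test
(u65)addr + (u65)size < (u65)addr_limit. A portable C model of that
test, written as a sketch for checking the boundary cases, is below.
range_ok() is a hypothetical name, and the limit is passed explicitly
instead of being read from thread_info.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Model of __range_ok(): true if [addr, addr + size) lies strictly below
 * limit. The carry out of the 64-bit addition plays the role of bit 64,
 * so a wrapped sum is always rejected.
 */
static bool range_ok(uint64_t addr, uint64_t size, uint64_t limit)
{
	uint64_t end = addr + size;	/* wraps modulo 2^64 */

	if (end < addr)			/* carry set: sum needs 65 bits */
		return false;
	return end < limit;
}
```

The comparison is strict, as in the asm (cset ..., cc), so a range that
ends exactly at the limit is rejected by this version of the check.
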
 arch/arm64/include/asm/uaccess.h   |  377 ++++++++++++++++++++++++++++++++++++
 arch/arm64/lib/clear_user.S        |   58 ++++++
 arch/arm64/lib/copy_from_user.S    |   66 +++++++
 arch/arm64/lib/copy_in_user.S      |   63 ++++++
 arch/arm64/lib/copy_to_user.S      |   61 ++++++
 arch/arm64/lib/getuser.S           |   75 +++++++
 arch/arm64/lib/putuser.S           |   73 +++++++
 arch/arm64/lib/strncpy_from_user.S |   50 +++++
 arch/arm64/lib/strnlen_user.S      |   47 +++++
 9 files changed, 870 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/uaccess.h
 create mode 100644 arch/arm64/lib/clear_user.S
 create mode 100644 arch/arm64/lib/copy_from_user.S
 create mode 100644 arch/arm64/lib/copy_in_user.S
 create mode 100644 arch/arm64/lib/copy_to_user.S
 create mode 100644 arch/arm64/lib/getuser.S
 create mode 100644 arch/arm64/lib/putuser.S
 create mode 100644 arch/arm64/lib/strncpy_from_user.S
 create mode 100644 arch/arm64/lib/strnlen_user.S

diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
new file mode 100644
index 0000000..09d7b53
--- /dev/null
+++ b/arch/arm64/include/asm/uaccess.h
@@ -0,0 +1,377 @@
+/*
+ * Based on arch/arm/include/asm/uaccess.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_UACCESS_H
+#define __ASM_UACCESS_H
+
+/*
+ * User space memory access functions
+ */
+#include <linux/string.h>
+#include <linux/thread_info.h>
+
+#include <asm/ptrace.h>
+#include <asm/errno.h>
+#include <asm/memory.h>
+#include <asm/compiler.h>
+
+#define VERIFY_READ 0
+#define VERIFY_WRITE 1
+
+/*
+ * The exception table consists of pairs of addresses: the first is the
+ * address of an instruction that is allowed to fault, and the second is
+ * the address at which the program should continue.  No registers are
+ * modified, so it is entirely up to the continuation code to figure out
+ * what to do.
+ *
+ * All the routines below use bits of fixup code that are out of line
+ * with the main instruction path.  This means when everything is well,
+ * we don't even have to jump over them.  Further, they do not intrude
+ * on our cache or tlb entries.
+ */
+
+struct exception_table_entry
+{
+	unsigned long insn, fixup;
+};
+
+extern int fixup_exception(struct pt_regs *regs);
+
+/*
+ * These two are intentionally not defined anywhere - if the kernel
+ * code generates any references to them, that's a bug.
+ */
+extern long __get_user_bad(void);
+extern long __put_user_bad(void);
+
+#define KERNEL_DS	(-1UL)
+#define get_ds()	(KERNEL_DS)
+
+#define USER_DS		TASK_SIZE_64
+#define get_fs()	(current_thread_info()->addr_limit)
+
+static inline void set_fs(mm_segment_t fs)
+{
+	current_thread_info()->addr_limit = fs;
+}
+
+#define segment_eq(a,b)	((a) == (b))
+
+/*
+ * Return 1 if addr < current->addr_limit, 0 otherwise.
+ */
+#define __addr_ok(addr)							\
+({									\
+	unsigned long flag;						\
+	asm("cmp %1, %0; cset %0, lo"				\
+		: "=&r" (flag)						\
+		: "r" (addr), "0" (current_thread_info()->addr_limit)	\
+		: "cc");						\
+	flag;								\
+})
+
+/*
+ * Test whether a block of memory is a valid user space address.
+ * Returns 1 if the range is valid, 0 otherwise.
+ *
+ * This is equivalent to the following test:
+ * (u65)addr + (u65)size < (u65)current->addr_limit
+ *
+ * This needs 65-bit arithmetic.
+ */
+#define __range_ok(addr,size)						\
+({									\
+	unsigned long flag, roksum;					\
+	__chk_user_ptr(addr);						\
+	asm("adds %1, %1, %3; ccmp %1, %4, #2, cc; cset %0, cc"	\
+		: "=&r" (flag), "=&r" (roksum)				\
+		: "1" (addr), "Ir" (size),				\
+		  "r" (current_thread_info()->addr_limit)		\
+		: "cc");						\
+	flag;								\
+})
+
+/*
+ * Single-value transfer routines.  They automatically use the right
+ * size if we just have the right pointer type.  Note that the functions
+ * which read from user space (*get_*) need to take care not to leak
+ * kernel data even if the calling code is buggy and fails to check
+ * the return value.  This means zeroing out the destination variable
+ * or buffer on error.  Normally this is done out of line by the
+ * fixup code, but there are a few places where it intrudes on the
+ * main code path.  When we only write to user space, there is no
+ * problem.
+ */
+extern long __get_user_1(void *);
+extern long __get_user_2(void *);
+extern long __get_user_4(void *);
+extern long __get_user_8(void *);
+
+#define __get_user_x(__r2,__p,__e,__s,__i...)				\
+	   asm volatile(						\
+		__asmeq("%0", "x0") __asmeq("%1", "x2")			\
+		"bl	__get_user_" #__s				\
+		: "=&r" (__e), "=r" (__r2)				\
+		: "0" (__p)						\
+		: __i, "cc")
+
+#define get_user(x,p)							\
+	({								\
+		register const typeof(*(p)) __user *__p asm("x0") = (p);\
+		register unsigned long __r2 asm("x2");			\
+		register long __e asm("x0");				\
+		switch (sizeof(*(__p))) {				\
+		case 1:							\
+			__get_user_x(__r2, __p, __e, 1, "x30");		\
+			break;						\
+		case 2:							\
+			__get_user_x(__r2, __p, __e, 2, "x3", "x30");	\
+			break;						\
+		case 4:							\
+			__get_user_x(__r2, __p, __e, 4, "x30");		\
+			break;						\
+		case 8:							\
+			__get_user_x(__r2, __p, __e, 8, "x30");		\
+			break;						\
+		default: __e = __get_user_bad(); break;			\
+		}							\
+		x = (typeof(*(p))) __r2;				\
+		__e;							\
+	})
+
+#define __get_user_unaligned __get_user
+
+extern long __put_user_1(void *, unsigned long);
+extern long __put_user_2(void *, unsigned long);
+extern long __put_user_4(void *, unsigned long);
+extern long __put_user_8(void *, unsigned long);
+
+#define __put_user_x(__r2,__p,__e,__s)					\
+	   asm volatile(						\
+		__asmeq("%0", "x0") __asmeq("%2", "x2")			\
+		"bl	__put_user_" #__s				\
+		: "=&r" (__e)						\
+		: "0" (__p), "r" (__r2)					\
+		: "x8", "x30", "cc")
+
+#define put_user(x,p)							\
+	({								\
+		register const typeof(*(p)) __r2 asm("x2") = (x);	\
+		register const typeof(*(p)) __user *__p asm("x0") = (p);\
+		register long __e asm("x0");				\
+		switch (sizeof(*(__p))) {				\
+		case 1:							\
+			__put_user_x(__r2, __p, __e, 1);		\
+			break;						\
+		case 2:							\
+			__put_user_x(__r2, __p, __e, 2);		\
+			break;						\
+		case 4:							\
+			__put_user_x(__r2, __p, __e, 4);		\
+			break;						\
+		case 8:							\
+			__put_user_x(__r2, __p, __e, 8);		\
+			break;						\
+		default: __e = __put_user_bad(); break;			\
+		}							\
+		__e;							\
+	})
+
+#define __put_user_unaligned __put_user
+
+#define access_ok(type,addr,size)	__range_ok(addr,size)
+
+/*
+ * The "__xxx" versions of the user access functions do not verify the
+ * address space - it must have been done previously with a separate
+ * "access_ok()" call.
+ *
+ * The "xxx_error" versions set the third argument to -EFAULT if an
+ * error occurs, and leave it unchanged on success.  Note that these
+ * versions are void (ie, don't return a value as such).
+ */
+#define __get_user(x,ptr)						\
+({									\
+	long __gu_err = 0;						\
+	__get_user_err((x),(ptr),__gu_err);				\
+	__gu_err;							\
+})
+
+#define __get_user_error(x,ptr,err)					\
+({									\
+	__get_user_err((x),(ptr),err);					\
+	(void) 0;							\
+})
+
+#define __get_user_err(x,ptr,err)					\
+do {									\
+	unsigned long __gu_addr = (unsigned long)(ptr);			\
+	unsigned long __gu_val;						\
+	__chk_user_ptr(ptr);						\
+	switch (sizeof(*(ptr))) {					\
+	case 1:								\
+		__get_user_asm("ldrb", "%w", __gu_val, __gu_addr, err);	\
+		break;							\
+	case 2:								\
+		__get_user_asm("ldrh", "%w", __gu_val, __gu_addr, err);	\
+		break;							\
+	case 4:								\
+		__get_user_asm("ldr", "%w", __gu_val, __gu_addr, err);	\
+		break;							\
+	case 8:								\
+		__get_user_asm("ldr", "%",  __gu_val, __gu_addr, err);	\
+		break;							\
+	default:							\
+		(__gu_val) = __get_user_bad();				\
+	}								\
+	(x) = (__typeof__(*(ptr)))__gu_val;				\
+} while (0)
+
+#define __get_user_asm(instr, reg, x, addr, err)			\
+	asm volatile(							\
+	"1:	" instr "	" reg "1, [%2]\n"			\
+	"2:\n"								\
+	"	.section .fixup, \"ax\"\n"				\
+	"	.align	2\n"						\
+	"3:	mov	%0, %3\n"					\
+	"	mov	%1, #0\n"					\
+	"	b	2b\n"						\
+	"	.previous\n"						\
+	"	.section __ex_table,\"a\"\n"				\
+	"	.align	3\n"						\
+	"	.quad	1b, 3b\n"					\
+	"	.previous"						\
+	: "+r" (err), "=&r" (x)						\
+	: "r" (addr), "i" (-EFAULT)					\
+	: "cc")
+
+#define __put_user(x,ptr)						\
+({									\
+	long __pu_err = 0;						\
+	__put_user_err((x),(ptr),__pu_err);				\
+	__pu_err;							\
+})
+
+#define __put_user_error(x,ptr,err)					\
+({									\
+	__put_user_err((x),(ptr),err);					\
+	(void) 0;							\
+})
+
+#define __put_user_err(x,ptr,err)					\
+do {									\
+	unsigned long __pu_addr = (unsigned long)(ptr);			\
+	__typeof__(*(ptr)) __pu_val = (x);				\
+	__chk_user_ptr(ptr);						\
+	switch (sizeof(*(ptr))) {					\
+	case 1:								\
+		__put_user_asm("strb", "%w", __pu_val, __pu_addr, err);	\
+		break;							\
+	case 2:								\
+		__put_user_asm("strh", "%w", __pu_val, __pu_addr, err);	\
+		break;							\
+	case 4:								\
+		__put_user_asm("str",  "%w", __pu_val, __pu_addr, err);	\
+		break;							\
+	case 8:								\
+		__put_user_asm("str",  "%",  __pu_val, __pu_addr, err);	\
+		break;							\
+	default:							\
+		__put_user_bad();					\
+	}								\
+} while (0)
+
+#define __put_user_asm(instr, reg, x, __pu_addr, err)			\
+	asm volatile(							\
+	"1:	" instr "	" reg "1, [%2]\n"			\
+	"2:\n"								\
+	"	.section .fixup,\"ax\"\n"				\
+	"	.align	2\n"						\
+	"3:	mov	%0, %3\n"					\
+	"	b	2b\n"						\
+	"	.previous\n"						\
+	"	.section __ex_table,\"a\"\n"				\
+	"	.align	3\n"						\
+	"	.quad	1b, 3b\n"					\
+	"	.previous"						\
+	: "+r" (err)							\
+	: "r" (x), "r" (__pu_addr), "i" (-EFAULT)			\
+	: "cc")
+
+extern unsigned long __must_check __copy_from_user(void *to, const void __user *from, unsigned long n);
+extern unsigned long __must_check __copy_to_user(void __user *to, const void *from, unsigned long n);
+extern unsigned long __must_check __copy_in_user(void __user *to, const void __user *from, unsigned long n);
+extern unsigned long __must_check __clear_user(void __user *addr, unsigned long n);
+
+extern unsigned long __must_check __strncpy_from_user(char *to, const char __user *from, unsigned long count);
+extern unsigned long __must_check __strnlen_user(const char __user *s, long n);
+
+static inline unsigned long __must_check copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+	if (access_ok(VERIFY_READ, from, n))
+		n = __copy_from_user(to, from, n);
+	else /* security hole - plug it */
+		memset(to, 0, n);
+	return n;
+}
+
+static inline unsigned long __must_check copy_to_user(void __user *to, const void *from, unsigned long n)
+{
+	if (access_ok(VERIFY_WRITE, to, n))
+		n = __copy_to_user(to, from, n);
+	return n;
+}
+
+static inline unsigned long __must_check copy_in_user(void __user *to, const void __user *from, unsigned long n)
+{
+	if (access_ok(VERIFY_READ, from, n) && access_ok(VERIFY_WRITE, to, n))
+		n = __copy_in_user(to, from, n);
+	return n;
+}
+
+#define __copy_to_user_inatomic __copy_to_user
+#define __copy_from_user_inatomic __copy_from_user
+
+static inline unsigned long __must_check clear_user(void __user *to, unsigned long n)
+{
+	if (access_ok(VERIFY_WRITE, to, n))
+		n = __clear_user(to, n);
+	return n;
+}
+
+static inline long __must_check strncpy_from_user(char *dst, const char __user *src, long count)
+{
+	long res = -EFAULT;
+	if (access_ok(VERIFY_READ, src, 1))
+		res = __strncpy_from_user(dst, src, count);
+	return res;
+}
+
+#define strlen_user(s)	strnlen_user(s, ~0UL >> 1)
+
+static inline long __must_check strnlen_user(const char __user *s, long n)
+{
+	unsigned long res = 0;
+
+	if (__addr_ok(s))
+		res = __strnlen_user(s, n);
+
+	return res;
+}
+
+#endif /* __ASM_UACCESS_H */
diff --git a/arch/arm64/lib/clear_user.S b/arch/arm64/lib/clear_user.S
new file mode 100644
index 0000000..6e0ed93
--- /dev/null
+++ b/arch/arm64/lib/clear_user.S
@@ -0,0 +1,58 @@
+/*
+ * Based on arch/arm/lib/clear_user.S
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+	.text
+
+/* Prototype: unsigned long __clear_user(void __user *addr, unsigned long sz)
+ * Purpose  : clear some user memory
+ * Params   : addr - user memory address to clear
+ *          : sz   - number of bytes to clear
+ * Returns  : number of bytes NOT cleared
+ *
+ * Alignment fixed up by hardware.
+ */
+ENTRY(__clear_user)
+	mov	x2, x1			// save the size for fixup return
+	subs	x1, x1, #8
+	b.mi	2f
+1:
+USER(9f, str	xzr, [x0], #8	)
+	subs	x1, x1, #8
+	b.pl	1b
+2:	adds	x1, x1, #4
+	b.mi	3f
+USER(9f, str	wzr, [x0], #4	)
+	sub	x1, x1, #4
+3:	adds	x1, x1, #2
+	b.mi	4f
+USER(9f, strh	wzr, [x0], #2	)
+	sub	x1, x1, #2
+4:	adds	x1, x1, #1
+	b.mi	5f
+USER(9f, strb	wzr, [x0]	)
+5:	mov	x0, #0
+	ret
+ENDPROC(__clear_user)
+
+	.section .fixup,"ax"
+	.align	2
+9:	mov	x0, x2			// return the original size
+	ret
+	.previous
diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S
new file mode 100644
index 0000000..5e27add
--- /dev/null
+++ b/arch/arm64/lib/copy_from_user.S
@@ -0,0 +1,66 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+/*
+ * Copy from user space to a kernel buffer (alignment handled by the hardware)
+ *
+ * Parameters:
+ *	x0 - to
+ *	x1 - from
+ *	x2 - n
+ * Returns:
+ *	x0 - bytes not copied
+ */
+ENTRY(__copy_from_user)
+	add	x4, x1, x2			// upper user buffer boundary
+	subs	x2, x2, #8
+	b.mi	2f
+1:
+USER(9f, ldr	x3, [x1], #8	)
+	subs	x2, x2, #8
+	str	x3, [x0], #8
+	b.pl	1b
+2:	adds	x2, x2, #4
+	b.mi	3f
+USER(9f, ldr	w3, [x1], #4	)
+	sub	x2, x2, #4
+	str	w3, [x0], #4
+3:	adds	x2, x2, #2
+	b.mi	4f
+USER(9f, ldrh	w3, [x1], #2	)
+	sub	x2, x2, #2
+	strh	w3, [x0], #2
+4:	adds	x2, x2, #1
+	b.mi	5f
+USER(9f, ldrb	w3, [x1]	)
+	strb	w3, [x0]
+5:	mov	x0, #0
+	ret
+ENDPROC(__copy_from_user)
+
+	.section .fixup,"ax"
+	.align	2
+9:	sub	x2, x4, x1
+	mov	x3, x2
+10:	strb	wzr, [x0], #1			// zero remaining buffer space
+	subs	x3, x3, #1
+	b.ne	10b
+	mov	x0, x2				// bytes not copied
+	ret
+	.previous
diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S
new file mode 100644
index 0000000..84b6c9b
--- /dev/null
+++ b/arch/arm64/lib/copy_in_user.S
@@ -0,0 +1,63 @@
+/*
+ * Copy from user space to user space
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+/*
+ * Copy from user space to user space (alignment handled by the hardware)
+ *
+ * Parameters:
+ *	x0 - to
+ *	x1 - from
+ *	x2 - n
+ * Returns:
+ *	x0 - bytes not copied
+ */
+ENTRY(__copy_in_user)
+	add	x4, x0, x2			// upper user buffer boundary
+	subs	x2, x2, #8
+	b.mi	2f
+1:
+USER(9f, ldr	x3, [x1], #8	)
+	subs	x2, x2, #8
+USER(9f, str	x3, [x0], #8	)
+	b.pl	1b
+2:	adds	x2, x2, #4
+	b.mi	3f
+USER(9f, ldr	w3, [x1], #4	)
+	sub	x2, x2, #4
+USER(9f, str	w3, [x0], #4	)
+3:	adds	x2, x2, #2
+	b.mi	4f
+USER(9f, ldrh	w3, [x1], #2	)
+	sub	x2, x2, #2
+USER(9f, strh	w3, [x0], #2	)
+4:	adds	x2, x2, #1
+	b.mi	5f
+USER(9f, ldrb	w3, [x1]	)
+USER(9f, strb	w3, [x0]	)
+5:	mov	x0, #0
+	ret
+ENDPROC(__copy_in_user)
+
+	.section .fixup,"ax"
+	.align	2
+9:	sub	x0, x4, x0			// bytes not copied
+	ret
+	.previous
diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S
new file mode 100644
index 0000000..a0aeeb9
--- /dev/null
+++ b/arch/arm64/lib/copy_to_user.S
@@ -0,0 +1,61 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+
+/*
+ * Copy to user space from a kernel buffer (alignment handled by the hardware)
+ *
+ * Parameters:
+ *	x0 - to
+ *	x1 - from
+ *	x2 - n
+ * Returns:
+ *	x0 - bytes not copied
+ */
+ENTRY(__copy_to_user)
+	add	x4, x0, x2			// upper user buffer boundary
+	subs	x2, x2, #8
+	b.mi	2f
+1:
+	ldr	x3, [x1], #8
+	subs	x2, x2, #8
+USER(9f, str	x3, [x0], #8	)
+	b.pl	1b
+2:	adds	x2, x2, #4
+	b.mi	3f
+	ldr	w3, [x1], #4
+	sub	x2, x2, #4
+USER(9f, str	w3, [x0], #4	)
+3:	adds	x2, x2, #2
+	b.mi	4f
+	ldrh	w3, [x1], #2
+	sub	x2, x2, #2
+USER(9f, strh	w3, [x0], #2	)
+4:	adds	x2, x2, #1
+	b.mi	5f
+	ldrb	w3, [x1]
+USER(9f, strb	w3, [x0]	)
+5:	mov	x0, #0
+	ret
+ENDPROC(__copy_to_user)
+
+	.section .fixup,"ax"
+	.align	2
+9:	sub	x0, x4, x0			// bytes not copied
+	ret
+	.previous
diff --git a/arch/arm64/lib/getuser.S b/arch/arm64/lib/getuser.S
new file mode 100644
index 0000000..1b4da22
--- /dev/null
+++ b/arch/arm64/lib/getuser.S
@@ -0,0 +1,75 @@
+/*
+ * Based on arch/arm/lib/getuser.S
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Idea from x86 version, (C) Copyright 1998 Linus Torvalds
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ *
+ * These functions have a non-standard call interface to make them more
+ * efficient, especially as they return an error value in addition to
+ * the "real" return value.
+ *
+ * __get_user_X
+ *
+ * Inputs:	x0 contains the address
+ * Outputs:	x0 is the error code
+ *		x2 contains the zero-extended value
+ *		lr corrupted
+ *
+ * No other registers must be altered.  (see <asm/uaccess.h>
+ * for specific ASM register usage).
+ *
+ * Note also that it is intended that __get_user_bad is not global.
+ */
+
+#include <linux/linkage.h>
+#include <asm/errno.h>
+
+ENTRY(__get_user_1)
+1:	ldrb	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__get_user_1)
+
+ENTRY(__get_user_2)
+2:	ldrh	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__get_user_2)
+
+ENTRY(__get_user_4)
+3:	ldr	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__get_user_4)
+
+ENTRY(__get_user_8)
+4:	ldr	x2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__get_user_8)
+
+__get_user_bad:
+	mov	x2, #0
+	mov	x0, #-EFAULT
+	ret
+ENDPROC(__get_user_bad)
+
+.section __ex_table, "a"
+	.quad	1b, __get_user_bad
+	.quad	2b, __get_user_bad
+	.quad	3b, __get_user_bad
+	.quad	4b, __get_user_bad
+.previous
diff --git a/arch/arm64/lib/putuser.S b/arch/arm64/lib/putuser.S
new file mode 100644
index 0000000..62d4a42
--- /dev/null
+++ b/arch/arm64/lib/putuser.S
@@ -0,0 +1,73 @@
+/*
+ * Based on arch/arm/lib/putuser.S
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Idea from x86 version, (C) Copyright 1998 Linus Torvalds
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * These functions have a non-standard call interface to make
+ * them more efficient, especially as they return an error
+ * value in addition to the "real" return value.
+ *
+ * __put_user_X
+ *
+ * Inputs:	x0 contains the address
+ *		x2 contains the value
+ * Outputs:	x0 is the error code
+ *		lr corrupted
+ *
+ * No other registers must be altered.  (see <asm/uaccess.h>
+ * for specific ASM register usage).
+ *
+ * Note that it is intended that __put_user_bad is not global.
+ */
+
+#include <linux/linkage.h>
+#include <asm/errno.h>
+
+ENTRY(__put_user_1)
+1:	strb	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__put_user_1)
+
+ENTRY(__put_user_2)
+2:	strh	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__put_user_2)
+
+ENTRY(__put_user_4)
+3:	str	w2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__put_user_4)
+
+ENTRY(__put_user_8)
+4:	str	x2, [x0]
+	mov	x0, #0
+	ret
+ENDPROC(__put_user_8)
+
+__put_user_bad:
+	mov	x0, #-EFAULT
+	ret
+ENDPROC(__put_user_bad)
+
+.section __ex_table, "a"
+	.quad	1b, __put_user_bad
+	.quad	2b, __put_user_bad
+	.quad	3b, __put_user_bad
+	.quad	4b, __put_user_bad
+.previous
diff --git a/arch/arm64/lib/strncpy_from_user.S b/arch/arm64/lib/strncpy_from_user.S
new file mode 100644
index 0000000..56e448a
--- /dev/null
+++ b/arch/arm64/lib/strncpy_from_user.S
@@ -0,0 +1,50 @@
+/*
+ * Based on arch/arm/lib/strncpy_from_user.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/errno.h>
+
+	.text
+	.align	5
+
+/*
+ * Copy a string from user space to kernel space.
+ *  x0 = dst, x1 = src, x2 = byte length
+ * returns the number of characters copied (strlen of copied string),
+ *  -EFAULT on exception, or "len" if we fill the whole buffer
+ */
+ENTRY(__strncpy_from_user)
+	mov	x4, x1
+1:	subs	x2, x2, #1
+	b.mi	2f
+USER(9f, ldrb	w3, [x1], #1	)
+	strb	w3, [x0], #1
+	cbnz	w3, 1b
+	sub	x1, x1, #1	// take NUL character out of count
+2:	sub	x0, x1, x4
+	ret
+ENDPROC(__strncpy_from_user)
+
+	.section .fixup,"ax"
+	.align	0
+9:	strb	wzr, [x0]	// null terminate
+	mov	x0, #-EFAULT
+	ret
+	.previous
diff --git a/arch/arm64/lib/strnlen_user.S b/arch/arm64/lib/strnlen_user.S
new file mode 100644
index 0000000..7f7b176
--- /dev/null
+++ b/arch/arm64/lib/strnlen_user.S
@@ -0,0 +1,47 @@
+/*
+ * Based on arch/arm/lib/strnlen_user.S
+ *
+ * Copyright (C) 1995-2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <asm/assembler.h>
+#include <asm/errno.h>
+
+	.text
+	.align	5
+
+/* Prototype: unsigned long __strnlen_user(const char *str, long n)
+ * Purpose  : get length of a string in user memory
+ * Params   : str - address of string in user memory
+ * Returns  : length of string *including terminator*
+ *	      or zero on exception, or n if too long
+ */
+ENTRY(__strnlen_user)
+	mov	x2, x0
+1:	subs	x1, x1, #1
+	b.mi	2f
+USER(9f, ldrb	w3, [x0], #1	)
+	cbnz	w3, 1b
+2:	sub	x0, x0, x2
+	ret
+ENDPROC(__strnlen_user)
+
+	.section .fixup,"ax"
+	.align	0
+9:	mov	x0, #0
+	ret
+	.previous



* [PATCH v2 21/31] arm64: 32-bit (compat) applications support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (19 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 20/31] arm64: User access library functions Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 14:34   ` Arnd Bergmann
  2012-08-20 10:53   ` Pavel Machek
  2012-08-14 17:52 ` [PATCH v2 22/31] arm64: Floating point and SIMD Catalin Marinas
                   ` (10 subsequent siblings)
  31 siblings, 2 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch adds support for 32-bit applications. The vectors page is a
binary blob mapped into the application user space at 0xffff0000 (the
AArch64 toolchain does not support compilation of AArch32 code). Full
compatibility with ARMv7 user space is supported. The use of deprecated
ARMv7 functionality (SWP, CP15 barriers) has been disabled by default on
AArch64 kernels and unaligned LDM/STM is not supported.

Please note that only the ARM 32-bit EABI is supported; there is no OABI
compatibility.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
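Reader's note, not part of the patch: the compat_ptr() and ptr_to_compat()
helpers in compat.h below convert between a 32-bit user pointer and a
64-bit kernel-side pointer by plain zero-extension and truncation. The
following is a minimal user-space sketch of that idea only. The demo_*
names are hypothetical stand-ins rather than kernel API, and uint64_t is
used in place of a real __user pointer so the sketch builds on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for compat_uptr_t: a 32-bit user-space address. */
typedef uint32_t demo_compat_uptr_t;

/* Widen a 32-bit user address; the value is zero-extended, never sign-extended. */
static inline uint64_t demo_compat_ptr(demo_compat_uptr_t uptr)
{
	return (uint64_t)uptr;
}

/* Narrow a 64-bit address to its 32-bit form; the upper 32 bits are dropped. */
static inline demo_compat_uptr_t demo_ptr_to_compat(uint64_t ptr)
{
	return (demo_compat_uptr_t)ptr;
}

/* Self-check using the vectors page address quoted in the commit message. */
static inline void demo_self_check(void)
{
	uint64_t wide = demo_compat_ptr(0xffff0000u);

	assert(wide == 0x00000000ffff0000ULL);		/* top bit set, still zero-extended */
	assert(demo_ptr_to_compat(wide) == 0xffff0000u);	/* round trip is lossless */
}
```

The point of the sketch is that an address with bit 31 set, such as the
vectors page at 0xffff0000, must not be sign-extended when widened.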
 arch/arm64/include/asm/compat.h   |  232 ++++++++++
 arch/arm64/include/asm/signal32.h |   54 +++
 arch/arm64/include/asm/unistd32.h |  758 ++++++++++++++++++++++++++++++++
 arch/arm64/kernel/kuser32.S       |   77 ++++
 arch/arm64/kernel/signal32.c      |  876 +++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/sys32.S         |  283 ++++++++++++
 arch/arm64/kernel/sys_compat.c    |  177 ++++++++
 7 files changed, 2457 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/compat.h
 create mode 100644 arch/arm64/include/asm/signal32.h
 create mode 100644 arch/arm64/include/asm/unistd32.h
 create mode 100644 arch/arm64/kernel/kuser32.S
 create mode 100644 arch/arm64/kernel/signal32.c
 create mode 100644 arch/arm64/kernel/sys32.S
 create mode 100644 arch/arm64/kernel/sys_compat.c

diff --git a/arch/arm64/include/asm/compat.h b/arch/arm64/include/asm/compat.h
new file mode 100644
index 0000000..91e72b7
--- /dev/null
+++ b/arch/arm64/include/asm/compat.h
@@ -0,0 +1,232 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_COMPAT_H
+#define __ASM_COMPAT_H
+#ifdef __KERNEL__
+#ifdef CONFIG_COMPAT
+
+/*
+ * Architecture specific compatibility types
+ */
+#include <linux/types.h>
+#include <linux/sched.h>
+
+#define COMPAT_USER_HZ		100
+#define COMPAT_UTS_MACHINE	"armv8l\0\0"
+
+typedef u32		compat_size_t;
+typedef s32		compat_ssize_t;
+typedef s32		compat_time_t;
+typedef s32		compat_clock_t;
+typedef s32		compat_pid_t;
+typedef u32		__compat_uid_t;
+typedef u32		__compat_gid_t;
+typedef u32		__compat_uid32_t;
+typedef u32		__compat_gid32_t;
+typedef u32		compat_mode_t;
+typedef u32		compat_ino_t;
+typedef u32		compat_dev_t;
+typedef s32		compat_off_t;
+typedef s64		compat_loff_t;
+typedef s16		compat_nlink_t;
+typedef u16		compat_ipc_pid_t;
+typedef s32		compat_daddr_t;
+typedef u32		compat_caddr_t;
+typedef __kernel_fsid_t	compat_fsid_t;
+typedef s32		compat_key_t;
+typedef s32		compat_timer_t;
+
+typedef s32		compat_int_t;
+typedef s32		compat_long_t;
+typedef s64		compat_s64;
+typedef u32		compat_uint_t;
+typedef u32		compat_ulong_t;
+typedef u64		compat_u64;
+
+struct compat_timespec {
+	compat_time_t	tv_sec;
+	s32		tv_nsec;
+};
+
+struct compat_timeval {
+	compat_time_t	tv_sec;
+	s32		tv_usec;
+};
+
+struct compat_stat {
+	compat_dev_t	st_dev;
+	compat_ino_t	st_ino;
+	compat_mode_t	st_mode;
+	compat_nlink_t	st_nlink;
+	__compat_uid32_t	st_uid;
+	__compat_gid32_t	st_gid;
+	compat_dev_t	st_rdev;
+	compat_off_t	st_size;
+	compat_off_t	st_blksize;
+	compat_off_t	st_blocks;
+	compat_time_t	st_atime;
+	u32		st_atime_nsec;
+	compat_time_t	st_mtime;
+	u32		st_mtime_nsec;
+	compat_time_t	st_ctime;
+	u32		st_ctime_nsec;
+	u32		__unused4[2];
+};
+
+struct compat_flock {
+	short		l_type;
+	short		l_whence;
+	compat_off_t	l_start;
+	compat_off_t	l_len;
+	compat_pid_t	l_pid;
+};
+
+#define F_GETLK64	12	/*  using 'struct flock64' */
+#define F_SETLK64	13
+#define F_SETLKW64	14
+
+struct compat_flock64 {
+	short		l_type;
+	short		l_whence;
+	compat_loff_t	l_start;
+	compat_loff_t	l_len;
+	compat_pid_t	l_pid;
+};
+
+struct compat_statfs {
+	int		f_type;
+	int		f_bsize;
+	int		f_blocks;
+	int		f_bfree;
+	int		f_bavail;
+	int		f_files;
+	int		f_ffree;
+	compat_fsid_t	f_fsid;
+	int		f_namelen;	/* SunOS ignores this field. */
+	int		f_frsize;
+	int		f_flags;
+	int		f_spare[4];
+};
+
+#define COMPAT_RLIM_INFINITY		0xffffffff
+
+typedef u32		compat_old_sigset_t;
+
+#define _COMPAT_NSIG		64
+#define _COMPAT_NSIG_BPW	32
+
+typedef u32		compat_sigset_word;
+
+#define COMPAT_OFF_T_MAX	0x7fffffff
+#define COMPAT_LOFF_T_MAX	0x7fffffffffffffffL
+
+/*
+ * A pointer passed in from user mode. This should not
+ * be used for syscall parameters, just declare them
+ * as pointers because the syscall entry code will have
+ * appropriately converted them already.
+ */
+typedef	u32		compat_uptr_t;
+
+static inline void __user *compat_ptr(compat_uptr_t uptr)
+{
+	return (void __user *)(unsigned long)uptr;
+}
+
+static inline compat_uptr_t ptr_to_compat(void __user *uptr)
+{
+	return (u32)(unsigned long)uptr;
+}
+
+static inline void __user *arch_compat_alloc_user_space(long len)
+{
+	struct pt_regs *regs = task_pt_regs(current);
+	return (void __user *)regs->compat_sp - len;
+}
+
+struct compat_ipc64_perm {
+	compat_key_t key;
+	__compat_uid32_t uid;
+	__compat_gid32_t gid;
+	__compat_uid32_t cuid;
+	__compat_gid32_t cgid;
+	unsigned short mode;
+	unsigned short __pad1;
+	unsigned short seq;
+	unsigned short __pad2;
+	compat_ulong_t unused1;
+	compat_ulong_t unused2;
+};
+
+struct compat_semid64_ds {
+	struct compat_ipc64_perm sem_perm;
+	compat_time_t  sem_otime;
+	compat_ulong_t __unused1;
+	compat_time_t  sem_ctime;
+	compat_ulong_t __unused2;
+	compat_ulong_t sem_nsems;
+	compat_ulong_t __unused3;
+	compat_ulong_t __unused4;
+};
+
+struct compat_msqid64_ds {
+	struct compat_ipc64_perm msg_perm;
+	compat_time_t  msg_stime;
+	compat_ulong_t __unused1;
+	compat_time_t  msg_rtime;
+	compat_ulong_t __unused2;
+	compat_time_t  msg_ctime;
+	compat_ulong_t __unused3;
+	compat_ulong_t msg_cbytes;
+	compat_ulong_t msg_qnum;
+	compat_ulong_t msg_qbytes;
+	compat_pid_t   msg_lspid;
+	compat_pid_t   msg_lrpid;
+	compat_ulong_t __unused4;
+	compat_ulong_t __unused5;
+};
+
+struct compat_shmid64_ds {
+	struct compat_ipc64_perm shm_perm;
+	compat_size_t  shm_segsz;
+	compat_time_t  shm_atime;
+	compat_ulong_t __unused1;
+	compat_time_t  shm_dtime;
+	compat_ulong_t __unused2;
+	compat_time_t  shm_ctime;
+	compat_ulong_t __unused3;
+	compat_pid_t   shm_cpid;
+	compat_pid_t   shm_lpid;
+	compat_ulong_t shm_nattch;
+	compat_ulong_t __unused4;
+	compat_ulong_t __unused5;
+};
+
+static inline int is_compat_task(void)
+{
+	return test_thread_flag(TIF_32BIT);
+}
+
+#else /* !CONFIG_COMPAT */
+
+static inline int is_compat_task(void)
+{
+	return 0;
+}
+
+#endif /* CONFIG_COMPAT */
+#endif /* __KERNEL__ */
+#endif /* __ASM_COMPAT_H */
diff --git a/arch/arm64/include/asm/signal32.h b/arch/arm64/include/asm/signal32.h
new file mode 100644
index 0000000..f9cf9e1
--- /dev/null
+++ b/arch/arm64/include/asm/signal32.h
@@ -0,0 +1,54 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SIGNAL32_H
+#define __ASM_SIGNAL32_H
+
+#ifdef __KERNEL__
+#ifdef CONFIG_AARCH32_EMULATION
+#include <linux/compat.h>
+
+#define AARCH32_KERN_SIGRET_CODE_OFFSET	0x500
+
+extern const compat_ulong_t aarch32_sigret_code[6];
+
+int compat_setup_frame(int usig, struct k_sigaction *ka, sigset_t *set,
+		       struct pt_regs *regs);
+int compat_setup_rt_frame(int usig, struct k_sigaction *ka, siginfo_t *info,
+			  sigset_t *set, struct pt_regs *regs);
+
+void compat_setup_restart_syscall(struct pt_regs *regs);
+#else
+
+static inline int compat_setup_frame(int usig, struct k_sigaction *ka,
+				     sigset_t *set, struct pt_regs *regs)
+{
+	BUG();
+}
+
+static inline int compat_setup_rt_frame(int usig, struct k_sigaction *ka,
+					siginfo_t *info, sigset_t *set,
+					struct pt_regs *regs)
+{
+	BUG();
+}
+
+static inline void compat_setup_restart_syscall(struct pt_regs *regs)
+{
+	BUG();
+}
+#endif /* CONFIG_AARCH32_EMULATION */
+#endif /* __KERNEL__ */
+#endif /* __ASM_SIGNAL32_H */
diff --git a/arch/arm64/include/asm/unistd32.h b/arch/arm64/include/asm/unistd32.h
new file mode 100644
index 0000000..a50405f
--- /dev/null
+++ b/arch/arm64/include/asm/unistd32.h
@@ -0,0 +1,758 @@
+/*
+ * Based on arch/arm/include/asm/unistd.h
+ *
+ * Copyright (C) 2001-2005 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#if !defined(__ASM_UNISTD32_H) || defined(__SYSCALL)
+#define __ASM_UNISTD32_H
+
+#ifndef __SYSCALL
+#define __SYSCALL(x, y)
+#endif
+
+/*
+ * This file contains the system call numbers.
+ */
+
+#ifdef __SYSCALL_COMPAT
+
+#define __NR_restart_syscall		0
+__SYSCALL(__NR_restart_syscall, sys_restart_syscall)
+#define __NR_exit			1
+__SYSCALL(__NR_exit, sys_exit)
+#define __NR_fork			2
+__SYSCALL(__NR_fork, sys_fork)
+#define __NR_read			3
+__SYSCALL(__NR_read, sys_read)
+#define __NR_write			4
+__SYSCALL(__NR_write, sys_write)
+#define __NR_open			5
+__SYSCALL(__NR_open, sys_open)
+#define __NR_close			6
+__SYSCALL(__NR_close, sys_close)
+__SYSCALL(7, sys_ni_syscall)		/* 7 was sys_waitpid */
+#define __NR_creat			8
+__SYSCALL(__NR_creat, sys_creat)
+#define __NR_link			9
+__SYSCALL(__NR_link, sys_link)
+#define __NR_unlink			10
+__SYSCALL(__NR_unlink, sys_unlink)
+#define __NR_execve			11
+__SYSCALL(__NR_execve, sys_execve)
+#define __NR_chdir			12
+__SYSCALL(__NR_chdir, sys_chdir)
+__SYSCALL(13, sys_ni_syscall)		/* 13 was sys_time */
+#define __NR_mknod			14
+__SYSCALL(__NR_mknod, sys_mknod)
+#define __NR_chmod			15
+__SYSCALL(__NR_chmod, sys_chmod)
+#define __NR_lchown			16
+__SYSCALL(__NR_lchown, sys_lchown16)
+__SYSCALL(17, sys_ni_syscall)		/* 17 was sys_break */
+__SYSCALL(18, sys_ni_syscall)		/* 18 was sys_stat */
+#define __NR_lseek			19
+__SYSCALL(__NR_lseek, sys_lseek)
+#define __NR_getpid			20
+__SYSCALL(__NR_getpid, sys_getpid)
+#define __NR_mount			21
+__SYSCALL(__NR_mount, sys_mount)
+__SYSCALL(22, sys_ni_syscall)		/* 22 was sys_umount */
+#define __NR_setuid			23
+__SYSCALL(__NR_setuid, sys_setuid16)
+#define __NR_getuid			24
+__SYSCALL(__NR_getuid, sys_getuid16)
+__SYSCALL(25, sys_ni_syscall)		/* 25 was sys_stime */
+#define __NR_ptrace			26
+__SYSCALL(__NR_ptrace, sys_ptrace)
+__SYSCALL(27, sys_ni_syscall)		/* 27 was sys_alarm */
+__SYSCALL(28, sys_ni_syscall)		/* 28 was sys_fstat */
+#define __NR_pause			29
+__SYSCALL(__NR_pause, sys_pause)
+__SYSCALL(30, sys_ni_syscall)		/* 30 was sys_utime */
+__SYSCALL(31, sys_ni_syscall)		/* 31 was sys_stty */
+__SYSCALL(32, sys_ni_syscall)		/* 32 was sys_gtty */
+#define __NR_access			33
+__SYSCALL(__NR_access, sys_access)
+#define __NR_nice			34
+__SYSCALL(__NR_nice, sys_nice)
+__SYSCALL(35, sys_ni_syscall)		/* 35 was sys_ftime */
+#define __NR_sync			36
+__SYSCALL(__NR_sync, sys_sync)
+#define __NR_kill			37
+__SYSCALL(__NR_kill, sys_kill)
+#define __NR_rename			38
+__SYSCALL(__NR_rename, sys_rename)
+#define __NR_mkdir			39
+__SYSCALL(__NR_mkdir, sys_mkdir)
+#define __NR_rmdir			40
+__SYSCALL(__NR_rmdir, sys_rmdir)
+#define __NR_dup			41
+__SYSCALL(__NR_dup, sys_dup)
+#define __NR_pipe			42
+__SYSCALL(__NR_pipe, sys_pipe)
+#define __NR_times			43
+__SYSCALL(__NR_times, sys_times)
+__SYSCALL(44, sys_ni_syscall)		/* 44 was sys_prof */
+#define __NR_brk			45
+__SYSCALL(__NR_brk, sys_brk)
+#define __NR_setgid			46
+__SYSCALL(__NR_setgid, sys_setgid16)
+#define __NR_getgid			47
+__SYSCALL(__NR_getgid, sys_getgid16)
+__SYSCALL(48, sys_ni_syscall)		/* 48 was sys_signal */
+#define __NR_geteuid			49
+__SYSCALL(__NR_geteuid, sys_geteuid16)
+#define __NR_getegid			50
+__SYSCALL(__NR_getegid, sys_getegid16)
+#define __NR_acct			51
+__SYSCALL(__NR_acct, sys_acct)
+#define __NR_umount2			52
+__SYSCALL(__NR_umount2, sys_umount)
+__SYSCALL(53, sys_ni_syscall)		/* 53 was sys_lock */
+#define __NR_ioctl			54
+__SYSCALL(__NR_ioctl, sys_ioctl)
+#define __NR_fcntl			55
+__SYSCALL(__NR_fcntl, sys_fcntl)
+__SYSCALL(56, sys_ni_syscall)		/* 56 was sys_mpx */
+#define __NR_setpgid			57
+__SYSCALL(__NR_setpgid, sys_setpgid)
+__SYSCALL(58, sys_ni_syscall)		/* 58 was sys_ulimit */
+__SYSCALL(59, sys_ni_syscall)		/* 59 was sys_olduname */
+#define __NR_umask			60
+__SYSCALL(__NR_umask, sys_umask)
+#define __NR_chroot			61
+__SYSCALL(__NR_chroot, sys_chroot)
+#define __NR_ustat			62
+__SYSCALL(__NR_ustat, sys_ustat)
+#define __NR_dup2			63
+__SYSCALL(__NR_dup2, sys_dup2)
+#define __NR_getppid			64
+__SYSCALL(__NR_getppid, sys_getppid)
+#define __NR_getpgrp			65
+__SYSCALL(__NR_getpgrp, sys_getpgrp)
+#define __NR_setsid			66
+__SYSCALL(__NR_setsid, sys_setsid)
+#define __NR_sigaction			67
+__SYSCALL(__NR_sigaction, sys_sigaction)
+__SYSCALL(68, sys_ni_syscall)		/* 68 was sys_sgetmask */
+__SYSCALL(69, sys_ni_syscall)		/* 69 was sys_ssetmask */
+#define __NR_setreuid			70
+__SYSCALL(__NR_setreuid, sys_setreuid16)
+#define __NR_setregid			71
+__SYSCALL(__NR_setregid, sys_setregid16)
+#define __NR_sigsuspend			72
+__SYSCALL(__NR_sigsuspend, sys_sigsuspend)
+#define __NR_sigpending			73
+__SYSCALL(__NR_sigpending, sys_sigpending)
+#define __NR_sethostname		74
+__SYSCALL(__NR_sethostname, sys_sethostname)
+#define __NR_setrlimit			75
+__SYSCALL(__NR_setrlimit, sys_setrlimit)
+__SYSCALL(76, sys_ni_syscall)		/* 76 was sys_getrlimit */
+#define __NR_getrusage			77
+__SYSCALL(__NR_getrusage, sys_getrusage)
+#define __NR_gettimeofday		78
+__SYSCALL(__NR_gettimeofday, sys_gettimeofday)
+#define __NR_settimeofday		79
+__SYSCALL(__NR_settimeofday, sys_settimeofday)
+#define __NR_getgroups			80
+__SYSCALL(__NR_getgroups, sys_getgroups16)
+#define __NR_setgroups			81
+__SYSCALL(__NR_setgroups, sys_setgroups16)
+__SYSCALL(82, sys_ni_syscall)		/* 82 was sys_select */
+#define __NR_symlink			83
+__SYSCALL(__NR_symlink, sys_symlink)
+__SYSCALL(84, sys_ni_syscall)		/* 84 was sys_lstat */
+#define __NR_readlink			85
+__SYSCALL(__NR_readlink, sys_readlink)
+#define __NR_uselib			86
+__SYSCALL(__NR_uselib, sys_uselib)
+#define __NR_swapon			87
+__SYSCALL(__NR_swapon, sys_swapon)
+#define __NR_reboot			88
+__SYSCALL(__NR_reboot, sys_reboot)
+__SYSCALL(89, sys_ni_syscall)		/* 89 was sys_readdir */
+__SYSCALL(90, sys_ni_syscall)		/* 90 was sys_mmap */
+#define __NR_munmap			91
+__SYSCALL(__NR_munmap, sys_munmap)
+#define __NR_truncate			92
+__SYSCALL(__NR_truncate, sys_truncate)
+#define __NR_ftruncate			93
+__SYSCALL(__NR_ftruncate, sys_ftruncate)
+#define __NR_fchmod			94
+__SYSCALL(__NR_fchmod, sys_fchmod)
+#define __NR_fchown			95
+__SYSCALL(__NR_fchown, sys_fchown16)
+#define __NR_getpriority		96
+__SYSCALL(__NR_getpriority, sys_getpriority)
+#define __NR_setpriority		97
+__SYSCALL(__NR_setpriority, sys_setpriority)
+__SYSCALL(98, sys_ni_syscall)		/* 98 was sys_profil */
+#define __NR_statfs			99
+__SYSCALL(__NR_statfs, sys_statfs)
+#define __NR_fstatfs			100
+__SYSCALL(__NR_fstatfs, sys_fstatfs)
+__SYSCALL(101, sys_ni_syscall)		/* 101 was sys_ioperm */
+__SYSCALL(102, sys_ni_syscall)		/* 102 was sys_socketcall */
+#define __NR_syslog			103
+__SYSCALL(__NR_syslog, sys_syslog)
+#define __NR_setitimer			104
+__SYSCALL(__NR_setitimer, sys_setitimer)
+#define __NR_getitimer			105
+__SYSCALL(__NR_getitimer, sys_getitimer)
+#define __NR_stat			106
+__SYSCALL(__NR_stat, sys_newstat)
+#define __NR_lstat			107
+__SYSCALL(__NR_lstat, sys_newlstat)
+#define __NR_fstat			108
+__SYSCALL(__NR_fstat, sys_newfstat)
+__SYSCALL(109, sys_ni_syscall)		/* 109 was sys_uname */
+__SYSCALL(110, sys_ni_syscall)		/* 110 was sys_iopl */
+#define __NR_vhangup			111
+__SYSCALL(__NR_vhangup, sys_vhangup)
+__SYSCALL(112, sys_ni_syscall)		/* 112 was sys_idle */
+__SYSCALL(113, sys_ni_syscall)		/* 113 was sys_syscall */
+#define __NR_wait4			114
+__SYSCALL(__NR_wait4, sys_wait4)
+#define __NR_swapoff			115
+__SYSCALL(__NR_swapoff, sys_swapoff)
+#define __NR_sysinfo			116
+__SYSCALL(__NR_sysinfo, sys_sysinfo)
+__SYSCALL(117, sys_ni_syscall)		/* 117 was sys_ipc */
+#define __NR_fsync			118
+__SYSCALL(__NR_fsync, sys_fsync)
+#define __NR_sigreturn			119
+__SYSCALL(__NR_sigreturn, sys_sigreturn)
+#define __NR_clone			120
+__SYSCALL(__NR_clone, sys_clone)
+#define __NR_setdomainname		121
+__SYSCALL(__NR_setdomainname, sys_setdomainname)
+#define __NR_uname			122
+__SYSCALL(__NR_uname, sys_newuname)
+__SYSCALL(123, sys_ni_syscall)		/* 123 was sys_modify_ldt */
+#define __NR_adjtimex			124
+__SYSCALL(__NR_adjtimex, sys_adjtimex)
+#define __NR_mprotect			125
+__SYSCALL(__NR_mprotect, sys_mprotect)
+#define __NR_sigprocmask		126
+__SYSCALL(__NR_sigprocmask, sys_sigprocmask)
+__SYSCALL(127, sys_ni_syscall)		/* 127 was sys_create_module */
+#define __NR_init_module		128
+__SYSCALL(__NR_init_module, sys_init_module)
+#define __NR_delete_module		129
+__SYSCALL(__NR_delete_module, sys_delete_module)
+__SYSCALL(130, sys_ni_syscall)		/* 130 was sys_get_kernel_syms */
+#define __NR_quotactl			131
+__SYSCALL(__NR_quotactl, sys_quotactl)
+#define __NR_getpgid			132
+__SYSCALL(__NR_getpgid, sys_getpgid)
+#define __NR_fchdir			133
+__SYSCALL(__NR_fchdir, sys_fchdir)
+#define __NR_bdflush			134
+__SYSCALL(__NR_bdflush, sys_bdflush)
+#define __NR_sysfs			135
+__SYSCALL(__NR_sysfs, sys_sysfs)
+#define __NR_personality		136
+__SYSCALL(__NR_personality, sys_personality)
+__SYSCALL(137, sys_ni_syscall)		/* 137 was sys_afs_syscall */
+#define __NR_setfsuid			138
+__SYSCALL(__NR_setfsuid, sys_setfsuid16)
+#define __NR_setfsgid			139
+__SYSCALL(__NR_setfsgid, sys_setfsgid16)
+#define __NR__llseek			140
+__SYSCALL(__NR__llseek, sys_llseek)
+#define __NR_getdents			141
+__SYSCALL(__NR_getdents, sys_getdents)
+#define __NR__newselect			142
+__SYSCALL(__NR__newselect, sys_select)
+#define __NR_flock			143
+__SYSCALL(__NR_flock, sys_flock)
+#define __NR_msync			144
+__SYSCALL(__NR_msync, sys_msync)
+#define __NR_readv			145
+__SYSCALL(__NR_readv, sys_readv)
+#define __NR_writev			146
+__SYSCALL(__NR_writev, sys_writev)
+#define __NR_getsid			147
+__SYSCALL(__NR_getsid, sys_getsid)
+#define __NR_fdatasync			148
+__SYSCALL(__NR_fdatasync, sys_fdatasync)
+#define __NR__sysctl			149
+__SYSCALL(__NR__sysctl, sys_sysctl)
+#define __NR_mlock			150
+__SYSCALL(__NR_mlock, sys_mlock)
+#define __NR_munlock			151
+__SYSCALL(__NR_munlock, sys_munlock)
+#define __NR_mlockall			152
+__SYSCALL(__NR_mlockall, sys_mlockall)
+#define __NR_munlockall			153
+__SYSCALL(__NR_munlockall, sys_munlockall)
+#define __NR_sched_setparam		154
+__SYSCALL(__NR_sched_setparam, sys_sched_setparam)
+#define __NR_sched_getparam		155
+__SYSCALL(__NR_sched_getparam, sys_sched_getparam)
+#define __NR_sched_setscheduler		156
+__SYSCALL(__NR_sched_setscheduler, sys_sched_setscheduler)
+#define __NR_sched_getscheduler		157
+__SYSCALL(__NR_sched_getscheduler, sys_sched_getscheduler)
+#define __NR_sched_yield		158
+__SYSCALL(__NR_sched_yield, sys_sched_yield)
+#define __NR_sched_get_priority_max	159
+__SYSCALL(__NR_sched_get_priority_max, sys_sched_get_priority_max)
+#define __NR_sched_get_priority_min	160
+__SYSCALL(__NR_sched_get_priority_min, sys_sched_get_priority_min)
+#define __NR_sched_rr_get_interval	161
+__SYSCALL(__NR_sched_rr_get_interval, sys_sched_rr_get_interval)
+#define __NR_nanosleep			162
+__SYSCALL(__NR_nanosleep, sys_nanosleep)
+#define __NR_mremap			163
+__SYSCALL(__NR_mremap, sys_mremap)
+#define __NR_setresuid			164
+__SYSCALL(__NR_setresuid, sys_setresuid16)
+#define __NR_getresuid			165
+__SYSCALL(__NR_getresuid, sys_getresuid16)
+__SYSCALL(166, sys_ni_syscall)		/* 166 was sys_vm86 */
+__SYSCALL(167, sys_ni_syscall)		/* 167 was sys_query_module */
+#define __NR_poll			168
+__SYSCALL(__NR_poll, sys_poll)
+#define __NR_nfsservctl			169
+__SYSCALL(__NR_nfsservctl, sys_ni_syscall)
+#define __NR_setresgid			170
+__SYSCALL(__NR_setresgid, sys_setresgid16)
+#define __NR_getresgid			171
+__SYSCALL(__NR_getresgid, sys_getresgid16)
+#define __NR_prctl			172
+__SYSCALL(__NR_prctl, sys_prctl)
+#define __NR_rt_sigreturn		173
+__SYSCALL(__NR_rt_sigreturn, sys_rt_sigreturn)
+#define __NR_rt_sigaction		174
+__SYSCALL(__NR_rt_sigaction, sys_rt_sigaction)
+#define __NR_rt_sigprocmask		175
+__SYSCALL(__NR_rt_sigprocmask, sys_rt_sigprocmask)
+#define __NR_rt_sigpending		176
+__SYSCALL(__NR_rt_sigpending, sys_rt_sigpending)
+#define __NR_rt_sigtimedwait		177
+__SYSCALL(__NR_rt_sigtimedwait, sys_rt_sigtimedwait)
+#define __NR_rt_sigqueueinfo		178
+__SYSCALL(__NR_rt_sigqueueinfo, sys_rt_sigqueueinfo)
+#define __NR_rt_sigsuspend		179
+__SYSCALL(__NR_rt_sigsuspend, sys_rt_sigsuspend)
+#define __NR_pread64			180
+__SYSCALL(__NR_pread64, sys_pread64)
+#define __NR_pwrite64			181
+__SYSCALL(__NR_pwrite64, sys_pwrite64)
+#define __NR_chown			182
+__SYSCALL(__NR_chown, sys_chown16)
+#define __NR_getcwd			183
+__SYSCALL(__NR_getcwd, sys_getcwd)
+#define __NR_capget			184
+__SYSCALL(__NR_capget, sys_capget)
+#define __NR_capset			185
+__SYSCALL(__NR_capset, sys_capset)
+#define __NR_sigaltstack		186
+__SYSCALL(__NR_sigaltstack, sys_sigaltstack)
+#define __NR_sendfile			187
+__SYSCALL(__NR_sendfile, sys_sendfile)
+__SYSCALL(188, sys_ni_syscall)		/* 188 reserved */
+__SYSCALL(189, sys_ni_syscall)		/* 189 reserved */
+#define __NR_vfork			190
+__SYSCALL(__NR_vfork, sys_vfork)
+#define __NR_ugetrlimit			191	/* SuS compliant getrlimit */
+__SYSCALL(__NR_ugetrlimit, sys_getrlimit)
+#define __NR_mmap2			192
+__SYSCALL(__NR_mmap2, sys_mmap2)
+#define __NR_truncate64			193
+__SYSCALL(__NR_truncate64, sys_truncate64)
+#define __NR_ftruncate64		194
+__SYSCALL(__NR_ftruncate64, sys_ftruncate64)
+#define __NR_stat64			195
+__SYSCALL(__NR_stat64, sys_stat64)
+#define __NR_lstat64			196
+__SYSCALL(__NR_lstat64, sys_lstat64)
+#define __NR_fstat64			197
+__SYSCALL(__NR_fstat64, sys_fstat64)
+#define __NR_lchown32			198
+__SYSCALL(__NR_lchown32, sys_lchown)
+#define __NR_getuid32			199
+__SYSCALL(__NR_getuid32, sys_getuid)
+#define __NR_getgid32			200
+__SYSCALL(__NR_getgid32, sys_getgid)
+#define __NR_geteuid32			201
+__SYSCALL(__NR_geteuid32, sys_geteuid)
+#define __NR_getegid32			202
+__SYSCALL(__NR_getegid32, sys_getegid)
+#define __NR_setreuid32			203
+__SYSCALL(__NR_setreuid32, sys_setreuid)
+#define __NR_setregid32			204
+__SYSCALL(__NR_setregid32, sys_setregid)
+#define __NR_getgroups32		205
+__SYSCALL(__NR_getgroups32, sys_getgroups)
+#define __NR_setgroups32		206
+__SYSCALL(__NR_setgroups32, sys_setgroups)
+#define __NR_fchown32			207
+__SYSCALL(__NR_fchown32, sys_fchown)
+#define __NR_setresuid32		208
+__SYSCALL(__NR_setresuid32, sys_setresuid)
+#define __NR_getresuid32		209
+__SYSCALL(__NR_getresuid32, sys_getresuid)
+#define __NR_setresgid32		210
+__SYSCALL(__NR_setresgid32, sys_setresgid)
+#define __NR_getresgid32		211
+__SYSCALL(__NR_getresgid32, sys_getresgid)
+#define __NR_chown32			212
+__SYSCALL(__NR_chown32, sys_chown)
+#define __NR_setuid32			213
+__SYSCALL(__NR_setuid32, sys_setuid)
+#define __NR_setgid32			214
+__SYSCALL(__NR_setgid32, sys_setgid)
+#define __NR_setfsuid32			215
+__SYSCALL(__NR_setfsuid32, sys_setfsuid)
+#define __NR_setfsgid32			216
+__SYSCALL(__NR_setfsgid32, sys_setfsgid)
+#define __NR_getdents64			217
+__SYSCALL(__NR_getdents64, sys_getdents64)
+#define __NR_pivot_root			218
+__SYSCALL(__NR_pivot_root, sys_pivot_root)
+#define __NR_mincore			219
+__SYSCALL(__NR_mincore, sys_mincore)
+#define __NR_madvise			220
+__SYSCALL(__NR_madvise, sys_madvise)
+#define __NR_fcntl64			221
+__SYSCALL(__NR_fcntl64, sys_fcntl64)
+__SYSCALL(222, sys_ni_syscall)		/* 222 for tux */
+__SYSCALL(223, sys_ni_syscall)		/* 223 is unused */
+#define __NR_gettid			224
+__SYSCALL(__NR_gettid, sys_gettid)
+#define __NR_readahead			225
+__SYSCALL(__NR_readahead, sys_readahead)
+#define __NR_setxattr			226
+__SYSCALL(__NR_setxattr, sys_setxattr)
+#define __NR_lsetxattr			227
+__SYSCALL(__NR_lsetxattr, sys_lsetxattr)
+#define __NR_fsetxattr			228
+__SYSCALL(__NR_fsetxattr, sys_fsetxattr)
+#define __NR_getxattr			229
+__SYSCALL(__NR_getxattr, sys_getxattr)
+#define __NR_lgetxattr			230
+__SYSCALL(__NR_lgetxattr, sys_lgetxattr)
+#define __NR_fgetxattr			231
+__SYSCALL(__NR_fgetxattr, sys_fgetxattr)
+#define __NR_listxattr			232
+__SYSCALL(__NR_listxattr, sys_listxattr)
+#define __NR_llistxattr			233
+__SYSCALL(__NR_llistxattr, sys_llistxattr)
+#define __NR_flistxattr			234
+__SYSCALL(__NR_flistxattr, sys_flistxattr)
+#define __NR_removexattr		235
+__SYSCALL(__NR_removexattr, sys_removexattr)
+#define __NR_lremovexattr		236
+__SYSCALL(__NR_lremovexattr, sys_lremovexattr)
+#define __NR_fremovexattr		237
+__SYSCALL(__NR_fremovexattr, sys_fremovexattr)
+#define __NR_tkill			238
+__SYSCALL(__NR_tkill, sys_tkill)
+#define __NR_sendfile64			239
+__SYSCALL(__NR_sendfile64, sys_sendfile64)
+#define __NR_futex			240
+__SYSCALL(__NR_futex, sys_futex)
+#define __NR_sched_setaffinity		241
+__SYSCALL(__NR_sched_setaffinity, sys_sched_setaffinity)
+#define __NR_sched_getaffinity		242
+__SYSCALL(__NR_sched_getaffinity, sys_sched_getaffinity)
+#define __NR_io_setup			243
+__SYSCALL(__NR_io_setup, sys_io_setup)
+#define __NR_io_destroy			244
+__SYSCALL(__NR_io_destroy, sys_io_destroy)
+#define __NR_io_getevents		245
+__SYSCALL(__NR_io_getevents, sys_io_getevents)
+#define __NR_io_submit			246
+__SYSCALL(__NR_io_submit, sys_io_submit)
+#define __NR_io_cancel			247
+__SYSCALL(__NR_io_cancel, sys_io_cancel)
+#define __NR_exit_group			248
+__SYSCALL(__NR_exit_group, sys_exit_group)
+#define __NR_lookup_dcookie		249
+__SYSCALL(__NR_lookup_dcookie, sys_lookup_dcookie)
+#define __NR_epoll_create		250
+__SYSCALL(__NR_epoll_create, sys_epoll_create)
+#define __NR_epoll_ctl			251
+__SYSCALL(__NR_epoll_ctl, sys_epoll_ctl)
+#define __NR_epoll_wait			252
+__SYSCALL(__NR_epoll_wait, sys_epoll_wait)
+#define __NR_remap_file_pages		253
+__SYSCALL(__NR_remap_file_pages, sys_remap_file_pages)
+__SYSCALL(254, sys_ni_syscall)		/* 254 for set_thread_area */
+__SYSCALL(255, sys_ni_syscall)		/* 255 for get_thread_area */
+#define __NR_set_tid_address		256
+__SYSCALL(__NR_set_tid_address, sys_set_tid_address)
+#define __NR_timer_create		257
+__SYSCALL(__NR_timer_create, sys_timer_create)
+#define __NR_timer_settime		258
+__SYSCALL(__NR_timer_settime, sys_timer_settime)
+#define __NR_timer_gettime		259
+__SYSCALL(__NR_timer_gettime, sys_timer_gettime)
+#define __NR_timer_getoverrun		260
+__SYSCALL(__NR_timer_getoverrun, sys_timer_getoverrun)
+#define __NR_timer_delete		261
+__SYSCALL(__NR_timer_delete, sys_timer_delete)
+#define __NR_clock_settime		262
+__SYSCALL(__NR_clock_settime, sys_clock_settime)
+#define __NR_clock_gettime		263
+__SYSCALL(__NR_clock_gettime, sys_clock_gettime)
+#define __NR_clock_getres		264
+__SYSCALL(__NR_clock_getres, sys_clock_getres)
+#define __NR_clock_nanosleep		265
+__SYSCALL(__NR_clock_nanosleep, sys_clock_nanosleep)
+#define __NR_statfs64			266
+__SYSCALL(__NR_statfs64, sys_statfs64)
+#define __NR_fstatfs64			267
+__SYSCALL(__NR_fstatfs64, sys_fstatfs64)
+#define __NR_tgkill			268
+__SYSCALL(__NR_tgkill, sys_tgkill)
+#define __NR_utimes			269
+__SYSCALL(__NR_utimes, sys_utimes)
+#define __NR_fadvise64			270
+__SYSCALL(__NR_fadvise64, sys_fadvise64_64)
+#define __NR_pciconfig_iobase		271
+__SYSCALL(__NR_pciconfig_iobase, sys_pciconfig_iobase)
+#define __NR_pciconfig_read		272
+__SYSCALL(__NR_pciconfig_read, sys_pciconfig_read)
+#define __NR_pciconfig_write		273
+__SYSCALL(__NR_pciconfig_write, sys_pciconfig_write)
+#define __NR_mq_open			274
+__SYSCALL(__NR_mq_open, sys_mq_open)
+#define __NR_mq_unlink			275
+__SYSCALL(__NR_mq_unlink, sys_mq_unlink)
+#define __NR_mq_timedsend		276
+__SYSCALL(__NR_mq_timedsend, sys_mq_timedsend)
+#define __NR_mq_timedreceive		277
+__SYSCALL(__NR_mq_timedreceive, sys_mq_timedreceive)
+#define __NR_mq_notify			278
+__SYSCALL(__NR_mq_notify, sys_mq_notify)
+#define __NR_mq_getsetattr		279
+__SYSCALL(__NR_mq_getsetattr, sys_mq_getsetattr)
+#define __NR_waitid			280
+__SYSCALL(__NR_waitid, sys_waitid)
+#define __NR_socket			281
+__SYSCALL(__NR_socket, sys_socket)
+#define __NR_bind			282
+__SYSCALL(__NR_bind, sys_bind)
+#define __NR_connect			283
+__SYSCALL(__NR_connect, sys_connect)
+#define __NR_listen			284
+__SYSCALL(__NR_listen, sys_listen)
+#define __NR_accept			285
+__SYSCALL(__NR_accept, sys_accept)
+#define __NR_getsockname		286
+__SYSCALL(__NR_getsockname, sys_getsockname)
+#define __NR_getpeername		287
+__SYSCALL(__NR_getpeername, sys_getpeername)
+#define __NR_socketpair			288
+__SYSCALL(__NR_socketpair, sys_socketpair)
+#define __NR_send			289
+__SYSCALL(__NR_send, sys_send)
+#define __NR_sendto			290
+__SYSCALL(__NR_sendto, sys_sendto)
+#define __NR_recv			291
+__SYSCALL(__NR_recv, sys_recv)
+#define __NR_recvfrom			292
+__SYSCALL(__NR_recvfrom, sys_recvfrom)
+#define __NR_shutdown			293
+__SYSCALL(__NR_shutdown, sys_shutdown)
+#define __NR_setsockopt			294
+__SYSCALL(__NR_setsockopt, sys_setsockopt)
+#define __NR_getsockopt			295
+__SYSCALL(__NR_getsockopt, sys_getsockopt)
+#define __NR_sendmsg			296
+__SYSCALL(__NR_sendmsg, sys_sendmsg)
+#define __NR_recvmsg			297
+__SYSCALL(__NR_recvmsg, sys_recvmsg)
+#define __NR_semop			298
+__SYSCALL(__NR_semop, sys_semop)
+#define __NR_semget			299
+__SYSCALL(__NR_semget, sys_semget)
+#define __NR_semctl			300
+__SYSCALL(__NR_semctl, sys_semctl)
+#define __NR_msgsnd			301
+__SYSCALL(__NR_msgsnd, sys_msgsnd)
+#define __NR_msgrcv			302
+__SYSCALL(__NR_msgrcv, sys_msgrcv)
+#define __NR_msgget			303
+__SYSCALL(__NR_msgget, sys_msgget)
+#define __NR_msgctl			304
+__SYSCALL(__NR_msgctl, sys_msgctl)
+#define __NR_shmat			305
+__SYSCALL(__NR_shmat, sys_shmat)
+#define __NR_shmdt			306
+__SYSCALL(__NR_shmdt, sys_shmdt)
+#define __NR_shmget			307
+__SYSCALL(__NR_shmget, sys_shmget)
+#define __NR_shmctl			308
+__SYSCALL(__NR_shmctl, sys_shmctl)
+#define __NR_add_key			309
+__SYSCALL(__NR_add_key, sys_add_key)
+#define __NR_request_key		310
+__SYSCALL(__NR_request_key, sys_request_key)
+#define __NR_keyctl			311
+__SYSCALL(__NR_keyctl, sys_keyctl)
+#define __NR_semtimedop			312
+__SYSCALL(__NR_semtimedop, sys_semtimedop)
+#define __NR_vserver			313
+__SYSCALL(__NR_vserver, sys_ni_syscall)
+#define __NR_ioprio_set			314
+__SYSCALL(__NR_ioprio_set, sys_ioprio_set)
+#define __NR_ioprio_get			315
+__SYSCALL(__NR_ioprio_get, sys_ioprio_get)
+#define __NR_inotify_init		316
+__SYSCALL(__NR_inotify_init, sys_inotify_init)
+#define __NR_inotify_add_watch		317
+__SYSCALL(__NR_inotify_add_watch, sys_inotify_add_watch)
+#define __NR_inotify_rm_watch		318
+__SYSCALL(__NR_inotify_rm_watch, sys_inotify_rm_watch)
+#define __NR_mbind			319
+__SYSCALL(__NR_mbind, sys_mbind)
+#define __NR_get_mempolicy		320
+__SYSCALL(__NR_get_mempolicy, sys_get_mempolicy)
+#define __NR_set_mempolicy		321
+__SYSCALL(__NR_set_mempolicy, sys_set_mempolicy)
+#define __NR_openat			322
+__SYSCALL(__NR_openat, sys_openat)
+#define __NR_mkdirat			323
+__SYSCALL(__NR_mkdirat, sys_mkdirat)
+#define __NR_mknodat			324
+__SYSCALL(__NR_mknodat, sys_mknodat)
+#define __NR_fchownat			325
+__SYSCALL(__NR_fchownat, sys_fchownat)
+#define __NR_futimesat			326
+__SYSCALL(__NR_futimesat, sys_futimesat)
+#define __NR_fstatat64			327
+__SYSCALL(__NR_fstatat64, sys_fstatat64)
+#define __NR_unlinkat			328
+__SYSCALL(__NR_unlinkat, sys_unlinkat)
+#define __NR_renameat			329
+__SYSCALL(__NR_renameat, sys_renameat)
+#define __NR_linkat			330
+__SYSCALL(__NR_linkat, sys_linkat)
+#define __NR_symlinkat			331
+__SYSCALL(__NR_symlinkat, sys_symlinkat)
+#define __NR_readlinkat			332
+__SYSCALL(__NR_readlinkat, sys_readlinkat)
+#define __NR_fchmodat			333
+__SYSCALL(__NR_fchmodat, sys_fchmodat)
+#define __NR_faccessat			334
+__SYSCALL(__NR_faccessat, sys_faccessat)
+#define __NR_pselect6			335
+__SYSCALL(__NR_pselect6, sys_pselect6)
+#define __NR_ppoll			336
+__SYSCALL(__NR_ppoll, sys_ppoll)
+#define __NR_unshare			337
+__SYSCALL(__NR_unshare, sys_unshare)
+#define __NR_set_robust_list		338
+__SYSCALL(__NR_set_robust_list, sys_set_robust_list)
+#define __NR_get_robust_list		339
+__SYSCALL(__NR_get_robust_list, sys_get_robust_list)
+#define __NR_splice			340
+__SYSCALL(__NR_splice, sys_splice)
+#define __NR_sync_file_range2		341
+__SYSCALL(__NR_sync_file_range2, sys_sync_file_range2)
+#define __NR_tee			342
+__SYSCALL(__NR_tee, sys_tee)
+#define __NR_vmsplice			343
+__SYSCALL(__NR_vmsplice, sys_vmsplice)
+#define __NR_move_pages			344
+__SYSCALL(__NR_move_pages, sys_move_pages)
+#define __NR_getcpu			345
+__SYSCALL(__NR_getcpu, sys_getcpu)
+#define __NR_epoll_pwait		346
+__SYSCALL(__NR_epoll_pwait, sys_epoll_pwait)
+#define __NR_kexec_load			347
+__SYSCALL(__NR_kexec_load, sys_kexec_load)
+#define __NR_utimensat			348
+__SYSCALL(__NR_utimensat, sys_utimensat)
+#define __NR_signalfd			349
+__SYSCALL(__NR_signalfd, sys_signalfd)
+#define __NR_timerfd_create		350
+__SYSCALL(__NR_timerfd_create, sys_timerfd_create)
+#define __NR_eventfd			351
+__SYSCALL(__NR_eventfd, sys_eventfd)
+#define __NR_fallocate			352
+__SYSCALL(__NR_fallocate, sys_fallocate)
+#define __NR_timerfd_settime		353
+__SYSCALL(__NR_timerfd_settime, sys_timerfd_settime)
+#define __NR_timerfd_gettime		354
+__SYSCALL(__NR_timerfd_gettime, sys_timerfd_gettime)
+#define __NR_signalfd4			355
+__SYSCALL(__NR_signalfd4, sys_signalfd4)
+#define __NR_eventfd2			356
+__SYSCALL(__NR_eventfd2, sys_eventfd2)
+#define __NR_epoll_create1		357
+__SYSCALL(__NR_epoll_create1, sys_epoll_create1)
+#define __NR_dup3			358
+__SYSCALL(__NR_dup3, sys_dup3)
+#define __NR_pipe2			359
+__SYSCALL(__NR_pipe2, sys_pipe2)
+#define __NR_inotify_init1		360
+__SYSCALL(__NR_inotify_init1, sys_inotify_init1)
+#define __NR_preadv			361
+__SYSCALL(__NR_preadv, sys_preadv)
+#define __NR_pwritev			362
+__SYSCALL(__NR_pwritev, sys_pwritev)
+#define __NR_rt_tgsigqueueinfo		363
+__SYSCALL(__NR_rt_tgsigqueueinfo, sys_rt_tgsigqueueinfo)
+#define __NR_perf_event_open		364
+__SYSCALL(__NR_perf_event_open, sys_perf_event_open)
+#define __NR_recvmmsg			365
+__SYSCALL(__NR_recvmmsg, sys_recvmmsg)
+#define __NR_accept4			366
+__SYSCALL(__NR_accept4, sys_accept4)
+#define __NR_fanotify_init		367
+__SYSCALL(__NR_fanotify_init, sys_fanotify_init)
+#define __NR_fanotify_mark		368
+__SYSCALL(__NR_fanotify_mark, sys_fanotify_mark)
+#define __NR_prlimit64			369
+__SYSCALL(__NR_prlimit64, sys_prlimit64)
+#define __NR_name_to_handle_at		370
+__SYSCALL(__NR_name_to_handle_at, sys_name_to_handle_at)
+#define __NR_open_by_handle_at		371
+__SYSCALL(__NR_open_by_handle_at, sys_open_by_handle_at)
+#define __NR_clock_adjtime		372
+__SYSCALL(__NR_clock_adjtime, sys_clock_adjtime)
+#define __NR_syncfs			373
+__SYSCALL(__NR_syncfs, sys_syncfs)
+
+/*
+ * The following SVCs are ARM private.
+ */
+#define __ARM_NR_COMPAT_BASE		0x0f0000
+#define __ARM_NR_compat_cacheflush	(__ARM_NR_COMPAT_BASE+2)
+#define __ARM_NR_compat_set_tls		(__ARM_NR_COMPAT_BASE+5)
+
+#endif	/* __SYSCALL_COMPAT */
+
+#define __NR_compat_syscalls		374
+
+#define __ARCH_WANT_COMPAT_IPC_PARSE_VERSION
+#define __ARCH_WANT_COMPAT_STAT64
+#define __ARCH_WANT_SYS_GETHOSTNAME
+#define __ARCH_WANT_SYS_PAUSE
+#define __ARCH_WANT_SYS_GETPGRP
+#define __ARCH_WANT_SYS_LLSEEK
+#define __ARCH_WANT_SYS_NICE
+#define __ARCH_WANT_SYS_SIGPENDING
+#define __ARCH_WANT_SYS_SIGPROCMASK
+#define __ARCH_WANT_COMPAT_SYS_RT_SIGSUSPEND
+
+#endif /* __ASM_UNISTD32_H */
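
Note on the table above: every entry pairs a "#define __NR_*" with a
"__SYSCALL(nr, sym)" line, which suggests the header is meant to be included
with __SYSCALL defined by the including file, for instance to populate a
dispatch table. The C sketch below illustrates that pattern on a three-entry
excerpt (83 to 85). Every name prefixed demo_ or DEMO_ is hypothetical and
stands in for a kernel symbol; none of it is code from this series.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_SYSCALLS 86	/* hypothetical: one past the highest entry used here */

typedef long (*demo_syscall_fn)(void);

/* Hypothetical handlers; each returns a recognisable value. */
static long demo_ni_syscall(void) { return -38; }	/* -ENOSYS on Linux */
static long demo_symlink(void)    { return 83; }
static long demo_readlink(void)   { return 85; }

/* Excerpt in the style of unistd32.h; DEMO_SYSCALL plays the role of __SYSCALL. */
#define DEMO_SYSCALL_LIST			\
	DEMO_SYSCALL(83, demo_symlink)		\
	DEMO_SYSCALL(84, demo_ni_syscall)	\
	DEMO_SYSCALL(85, demo_readlink)

static demo_syscall_fn demo_table[DEMO_NR_SYSCALLS];

/* Default every slot to the "not implemented" handler, then apply the list. */
static void demo_build_table(void)
{
	size_t i;

	for (i = 0; i < DEMO_NR_SYSCALLS; i++)
		demo_table[i] = demo_ni_syscall;

#define DEMO_SYSCALL(nr, sym)	demo_table[nr] = sym;
	DEMO_SYSCALL_LIST
#undef DEMO_SYSCALL
}

/* Dispatch by number; out-of-range numbers behave like unimplemented calls. */
static long demo_dispatch(size_t nr)
{
	assert(demo_table[0] != NULL);	/* demo_build_table() must run first */
	if (nr >= DEMO_NR_SYSCALLS)
		return demo_ni_syscall();
	return demo_table[nr]();
}
```

After demo_build_table() has run, demo_dispatch(84) reaches the
"not implemented" handler, just as the retired slots (82, 84, 89, ...) map to
sys_ni_syscall in the header. The runtime fill is used here only to keep the
sketch portable C; a static initialiser would serve the same purpose.
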
diff --git a/arch/arm64/kernel/kuser32.S b/arch/arm64/kernel/kuser32.S
new file mode 100644
index 0000000..c3bab17
--- /dev/null
+++ b/arch/arm64/kernel/kuser32.S
@@ -0,0 +1,77 @@
+/*
+ * Low-level user helpers placed in the vectors page for AArch32.
+ *
+ * Copyright (C) 1996-2000 Russell King.
+ * Copyright (C) 2012 ARM Ltd
+ * Author: Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ *
+ * AArch32 user helpers.
+ *
+ * Each segment is 32-byte aligned and will be moved to the top of the high
+ * vector page.  New segments (if ever needed) must be added in front of
+ * existing ones.  This mechanism should be used only for things that are
+ * really small and justified, and not be abused freely.
+ *
+ * See Documentation/arm/kernel_user_helpers.txt for formal definitions.
+ */
+	.align	5
+	.globl	__kuser_helper_start
+__kuser_helper_start:
+
+__kuser_cmpxchg64:			// 0xffff0f60
+	.inst	0xe92d00f0		//	push		{r4, r5, r6, r7}
+	.inst	0xe1c040d0		//	ldrd		r4, r5, [r0]
+	.inst	0xe1c160d0		//	ldrd		r6, r7, [r1]
+	.inst	0xf57ff05f		//	dmb		sy
+	.inst	0xe1b20f9f		// 1:	ldrexd		r0, r1, [r2]
+	.inst	0xe0303004		//	eors		r3, r0, r4
+	.inst	0x00313005		//	eoreqs		r3, r1, r5
+	.inst	0x01a23f96		//	strexdeq	r3, r6, [r2]
+	.inst	0x03330001		//	teqeq		r3, #1
+	.inst	0x0afffff9		//	beq		1b
+	.inst	0xf57ff05f		//	dmb		sy
+	.inst	0xe2730000		//	rsbs		r0, r3, #0
+	.inst	0xe8bd00f0		//	pop		{r4, r5, r6, r7}
+	.inst	0xe12fff1e		//	bx		lr
+
+	.align	5
+__kuser_memory_barrier:			// 0xffff0fa0
+	.inst	0xf57ff05f		// dmb	sy
+	.inst	0xe12fff1e		// bx	lr
+
+	.align	5
+__kuser_cmpxchg:			// 0xffff0fc0
+	.inst	0xf57ff05f		//	dmb		sy
+	.inst	0xe1923f9f		// 1:	ldrex		r3, [r2]
+	.inst	0xe0533000		//	subs		r3, r3, r0
+	.inst	0x01823f91		//	strexeq	r3, r1, [r2]
+	.inst	0x03330001		//	teqeq		r3, #1
+	.inst	0x0afffffa		//	beq		1b
+	.inst	0xe2730000		//	rsbs		r0, r3, #0
+	.inst	0xeaffffef		//	b		<__kuser_memory_barrier>
+
+	.align	5
+__kuser_get_tls:			// 0xffff0fe0
+	.inst	0xee1d0f70		// mrc	p15, 0, r0, c13, c0, 3
+	.inst	0xe12fff1e		// bx	lr
+	.rep	5
+	.word	0
+	.endr
+
+__kuser_helper_version:			// 0xffff0ffc
+	.word	((__kuser_helper_end - __kuser_helper_start) >> 5)
+	.globl	__kuser_helper_end
+__kuser_helper_end:
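
Note on the layout above: the address comments and the
__kuser_helper_version expression can be cross-checked with a little
arithmetic. The C sketch below assumes the helper block ends at 0xffff1000,
i.e. immediately after the version word at 0xffff0ffc; that end address is
inferred from the comments, not stated in the patch. The function names are
hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define KUSER_SEGMENT_SHIFT	5	/* ".align 5" gives 32-byte segments */

/* Round a count of 4-byte words up to a whole number of 32-byte segments. */
static uint32_t kuser_segment_bytes(uint32_t insn_words)
{
	uint32_t bytes = insn_words * 4u;	/* each .inst/.word is 4 bytes */
	uint32_t mask = (1u << KUSER_SEGMENT_SHIFT) - 1u;

	return (bytes + mask) & ~mask;
}

/* Same expression as the __kuser_helper_version word. */
static uint32_t kuser_helper_version(uint32_t start, uint32_t end)
{
	assert(end >= start);
	return (end - start) >> KUSER_SEGMENT_SHIFT;
}
```

With start = 0xffff0f60 and the assumed end = 0xffff1000,
kuser_helper_version() evaluates to 5. The segment sizes agree with the
address comments: 14 words for __kuser_cmpxchg64 round up to 64 bytes
(0xffff0f60 to 0xffff0fa0), while the 2-word barrier, the 8-word
__kuser_cmpxchg and the 8 words of __kuser_get_tls (two instructions, five
padding words, one version word) each occupy 32 bytes.
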
diff --git a/arch/arm64/kernel/signal32.c b/arch/arm64/kernel/signal32.c
new file mode 100644
index 0000000..4bb754c
--- /dev/null
+++ b/arch/arm64/kernel/signal32.c
@@ -0,0 +1,876 @@
+/*
+ * Based on arch/arm/kernel/signal.c
+ *
+ * Copyright (C) 1995-2009 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ * Modified by Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __SYSCALL_COMPAT
+
+#include <linux/compat.h>
+#include <linux/signal.h>
+#include <linux/syscalls.h>
+#include <linux/ratelimit.h>
+
+#include <asm/fpsimd.h>
+#include <asm/signal32.h>
+#include <asm/uaccess.h>
+#include <asm/unistd.h>
+
+typedef struct compat_siginfo {
+	int si_signo;
+	int si_errno;
+	int si_code;
+
+	union {
+		/* The padding is the same size as AArch64. */
+		int _pad[SI_PAD_SIZE];
+
+		/* kill() */
+		struct {
+			compat_pid_t _pid;	/* sender's pid */
+			__compat_uid32_t _uid;	/* sender's uid */
+		} _kill;
+
+		/* POSIX.1b timers */
+		struct {
+			compat_timer_t _tid;	/* timer id */
+			int _overrun;		/* overrun count */
+			compat_sigval_t _sigval;	/* same as below */
+			int _sys_private;       /* not to be passed to user */
+		} _timer;
+
+		/* POSIX.1b signals */
+		struct {
+			compat_pid_t _pid;	/* sender's pid */
+			__compat_uid32_t _uid;	/* sender's uid */
+			compat_sigval_t _sigval;
+		} _rt;
+
+		/* SIGCHLD */
+		struct {
+			compat_pid_t _pid;	/* which child */
+			__compat_uid32_t _uid;	/* sender's uid */
+			int _status;		/* exit code */
+			compat_clock_t _utime;
+			compat_clock_t _stime;
+		} _sigchld;
+
+		/* SIGILL, SIGFPE, SIGSEGV, SIGBUS */
+		struct {
+			compat_uptr_t _addr; /* faulting insn/memory ref. */
+			short _addr_lsb; /* LSB of the reported address */
+		} _sigfault;
+
+		/* SIGPOLL */
+		struct {
+			compat_long_t _band;	/* POLL_IN, POLL_OUT, POLL_MSG */
+			int _fd;
+		} _sigpoll;
+	} _sifields;
+} compat_siginfo_t;
+
+struct compat_sigaction {
+	compat_uptr_t			sa_handler;
+	compat_ulong_t			sa_flags;
+	compat_uptr_t			sa_restorer;
+	compat_sigset_t			sa_mask;
+};
+
+struct compat_old_sigaction {
+	compat_uptr_t			sa_handler;
+	compat_old_sigset_t		sa_mask;
+	compat_ulong_t			sa_flags;
+	compat_uptr_t			sa_restorer;
+};
+
+typedef struct compat_sigaltstack {
+	compat_uptr_t			ss_sp;
+	int				ss_flags;
+	compat_size_t			ss_size;
+} compat_stack_t;
+
+struct compat_sigcontext {
+	/* We always set these two fields to 0 */
+	compat_ulong_t			trap_no;
+	compat_ulong_t			error_code;
+
+	compat_ulong_t			oldmask;
+	compat_ulong_t			arm_r0;
+	compat_ulong_t			arm_r1;
+	compat_ulong_t			arm_r2;
+	compat_ulong_t			arm_r3;
+	compat_ulong_t			arm_r4;
+	compat_ulong_t			arm_r5;
+	compat_ulong_t			arm_r6;
+	compat_ulong_t			arm_r7;
+	compat_ulong_t			arm_r8;
+	compat_ulong_t			arm_r9;
+	compat_ulong_t			arm_r10;
+	compat_ulong_t			arm_fp;
+	compat_ulong_t			arm_ip;
+	compat_ulong_t			arm_sp;
+	compat_ulong_t			arm_lr;
+	compat_ulong_t			arm_pc;
+	compat_ulong_t			arm_cpsr;
+	compat_ulong_t			fault_address;
+};
+
+struct compat_ucontext {
+	compat_ulong_t			uc_flags;
+	compat_uptr_t			uc_link;
+	compat_stack_t			uc_stack;
+	struct compat_sigcontext	uc_mcontext;
+	compat_sigset_t			uc_sigmask;
+	int		__unused[32 - (sizeof (compat_sigset_t) / sizeof (int))];
+	compat_ulong_t	uc_regspace[128] __attribute__((__aligned__(8)));
+};
+
+struct compat_vfp_sigframe {
+	compat_ulong_t	magic;
+	compat_ulong_t	size;
+	struct compat_user_vfp {
+		compat_u64	fpregs[32];
+		compat_ulong_t	fpscr;
+	} ufp;
+	struct compat_user_vfp_exc {
+		compat_ulong_t	fpexc;
+		compat_ulong_t	fpinst;
+		compat_ulong_t	fpinst2;
+	} ufp_exc;
+} __attribute__((__aligned__(8)));
+
+#define VFP_MAGIC		0x56465001
+#define VFP_STORAGE_SIZE	sizeof(struct compat_vfp_sigframe)
+
+struct compat_aux_sigframe {
+	struct compat_vfp_sigframe	vfp;
+
+	/* Something that isn't a valid magic number for any coprocessor.  */
+	unsigned long			end_magic;
+} __attribute__((__aligned__(8)));
+
+struct compat_sigframe {
+	struct compat_ucontext	uc;
+	compat_ulong_t		retcode[2];
+};
+
+struct compat_rt_sigframe {
+	struct compat_siginfo info;
+	struct compat_sigframe sig;
+};
+
+#define _BLOCKABLE (~(sigmask(SIGKILL) | sigmask(SIGSTOP)))
+
+/*
+ * For ARM syscalls, the syscall number has to be loaded into r7.
+ * We do not support an OABI userspace.
+ */
+#define MOV_R7_NR_SIGRETURN	(0xe3a07000 | __NR_sigreturn)
+#define SVC_SYS_SIGRETURN	(0xef000000 | __NR_sigreturn)
+#define MOV_R7_NR_RT_SIGRETURN	(0xe3a07000 | __NR_rt_sigreturn)
+#define SVC_SYS_RT_SIGRETURN	(0xef000000 | __NR_rt_sigreturn)
+
+/*
+ * For Thumb syscalls, we also pass the syscall number via r7. We therefore
+ * need two 16-bit instructions.
+ */
+#define SVC_THUMB_SIGRETURN	(((0xdf00 | __NR_sigreturn) << 16) | \
+				   0x2700 | __NR_sigreturn)
+#define SVC_THUMB_RT_SIGRETURN	(((0xdf00 | __NR_rt_sigreturn) << 16) | \
+				   0x2700 | __NR_rt_sigreturn)
+
+const compat_ulong_t aarch32_sigret_code[6] = {
+	/*
+	 * AArch32 sigreturn code.
+	 * We don't construct an OABI SWI - instead we just set the imm24 field
+	 * to the EABI syscall number so that we create a sane disassembly.
+	 */
+	MOV_R7_NR_SIGRETURN,    SVC_SYS_SIGRETURN,    SVC_THUMB_SIGRETURN,
+	MOV_R7_NR_RT_SIGRETURN, SVC_SYS_RT_SIGRETURN, SVC_THUMB_RT_SIGRETURN,
+};
+
+static inline int put_sigset_t(compat_sigset_t __user *uset, sigset_t *set)
+{
+	compat_sigset_t	cset;
+
+	cset.sig[0] = set->sig[0] & 0xffffffffull;
+	cset.sig[1] = set->sig[0] >> 32;
+
+	return copy_to_user(uset, &cset, sizeof(*uset));
+}
+
+static inline int get_sigset_t(sigset_t *set,
+			       const compat_sigset_t __user *uset)
+{
+	compat_sigset_t s32;
+
+	if (copy_from_user(&s32, uset, sizeof(*uset)))
+		return -EFAULT;
+
+	set->sig[0] = s32.sig[0] | (((long)s32.sig[1]) << 32);
+	return 0;
+}
+
+int copy_siginfo_to_user32(compat_siginfo_t __user *to, siginfo_t *from)
+{
+	int err;
+
+	if (!access_ok(VERIFY_WRITE, to, sizeof(*to)))
+		return -EFAULT;
+
+	/* If you change siginfo_t structure, please be sure
+	 * this code is fixed accordingly.
+	 * It should never copy any pad contained in the structure
+	 * to avoid security leaks, but must copy the generic
+	 * 3 ints plus the relevant union member.
+	 * This routine must convert siginfo from 64bit to 32bit as well
+	 * at the same time.
+	 */
+	err = __put_user(from->si_signo, &to->si_signo);
+	err |= __put_user(from->si_errno, &to->si_errno);
+	err |= __put_user((short)from->si_code, &to->si_code);
+	if (from->si_code < 0)
+		err |= __copy_to_user(&to->_sifields._pad, &from->_sifields._pad,
+				      SI_PAD_SIZE);
+	else switch (from->si_code & __SI_MASK) {
+	case __SI_KILL:
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		break;
+	case __SI_TIMER:
+		err |= __put_user(from->si_tid, &to->si_tid);
+		err |= __put_user(from->si_overrun, &to->si_overrun);
+		err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr,
+				  &to->si_ptr);
+		break;
+	case __SI_POLL:
+		err |= __put_user(from->si_band, &to->si_band);
+		err |= __put_user(from->si_fd, &to->si_fd);
+		break;
+	case __SI_FAULT:
+		err |= __put_user((compat_uptr_t)(unsigned long)from->si_addr,
+				  &to->si_addr);
+#ifdef BUS_MCEERR_AO
+		/*
+		 * Other callers might not initialize the si_lsb field,
+		 * so check explicitly for the right codes here.
+		 */
+		if (from->si_code == BUS_MCEERR_AR || from->si_code == BUS_MCEERR_AO)
+			err |= __put_user(from->si_addr_lsb, &to->si_addr_lsb);
+#endif
+		break;
+	case __SI_CHLD:
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		err |= __put_user(from->si_status, &to->si_status);
+		err |= __put_user(from->si_utime, &to->si_utime);
+		err |= __put_user(from->si_stime, &to->si_stime);
+		break;
+	case __SI_RT: /* This is not generated by the kernel as of now. */
+	case __SI_MESGQ: /* But this is */
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		err |= __put_user((compat_uptr_t)(unsigned long)from->si_ptr, &to->si_ptr);
+		break;
+	default: /* this is just in case for now ... */
+		err |= __put_user(from->si_pid, &to->si_pid);
+		err |= __put_user(from->si_uid, &to->si_uid);
+		break;
+	}
+	return err;
+}
+
+int copy_siginfo_from_user32(siginfo_t *to, compat_siginfo_t __user *from)
+{
+	memset(to, 0, sizeof *to);
+
+	if (copy_from_user(to, from, __ARCH_SI_PREAMBLE_SIZE) ||
+	    copy_from_user(to->_sifields._pad,
+			   from->_sifields._pad, SI_PAD_SIZE))
+		return -EFAULT;
+
+	return 0;
+}
+
+/*
+ * VFP save/restore code.
+ */
+static int compat_preserve_vfp_context(struct compat_vfp_sigframe __user *frame)
+{
+	struct fpsimd_state *fpsimd = &current->thread.fpsimd_state;
+	compat_ulong_t magic = VFP_MAGIC;
+	compat_ulong_t size = VFP_STORAGE_SIZE;
+	compat_ulong_t fpscr, fpexc;
+	int err = 0;
+
+	/*
+	 * Save the hardware registers to the fpsimd_state structure.
+	 * Note that this also saves V16-31, which aren't visible
+	 * in AArch32.
+	 */
+	fpsimd_save_state(fpsimd);
+
+	/* Place structure header on the stack */
+	__put_user_error(magic, &frame->magic, err);
+	__put_user_error(size, &frame->size, err);
+
+	/*
+	 * Now copy the FP registers. Since the registers are packed,
+	 * we can copy the prefix we want (V0-V15) as it is.
+	 * FIXME: Won't work if big endian.
+	 */
+	err |= __copy_to_user(&frame->ufp.fpregs, fpsimd->vregs,
+			      sizeof(frame->ufp.fpregs));
+
+	/* Create an AArch32 fpscr from the fpsr and the fpcr. */
+	fpscr = (fpsimd->fpsr & VFP_FPSCR_STAT_MASK) |
+		(fpsimd->fpcr & VFP_FPSCR_CTRL_MASK);
+	__put_user_error(fpscr, &frame->ufp.fpscr, err);
+
+	/*
+	 * The exception registers aren't available so we fake up a
+	 * basic FPEXC and zero everything else.
+	 */
+	fpexc = (1 << 30);
+	__put_user_error(fpexc, &frame->ufp_exc.fpexc, err);
+	__put_user_error(0, &frame->ufp_exc.fpinst, err);
+	__put_user_error(0, &frame->ufp_exc.fpinst2, err);
+
+	return err ? -EFAULT : 0;
+}
+
+static int compat_restore_vfp_context(struct compat_vfp_sigframe __user *frame)
+{
+	struct fpsimd_state fpsimd;
+	compat_ulong_t magic = VFP_MAGIC;
+	compat_ulong_t size = VFP_STORAGE_SIZE;
+	compat_ulong_t fpscr;
+	int err = 0;
+
+	__get_user_error(magic, &frame->magic, err);
+	__get_user_error(size, &frame->size, err);
+
+	if (err)
+		return -EFAULT;
+	if (magic != VFP_MAGIC || size != VFP_STORAGE_SIZE)
+		return -EINVAL;
+
+	/*
+	 * Copy the FP registers into the start of the fpsimd_state.
+	 * FIXME: Won't work if big endian.
+	 */
+	err |= __copy_from_user(fpsimd.vregs, frame->ufp.fpregs,
+				sizeof(frame->ufp.fpregs));
+
+	/* Extract the fpsr and the fpcr from the fpscr */
+	__get_user_error(fpscr, &frame->ufp.fpscr, err);
+	fpsimd.fpsr = fpscr & VFP_FPSCR_STAT_MASK;
+	fpsimd.fpcr = fpscr & VFP_FPSCR_CTRL_MASK;
+
+	/*
+	 * We don't need to touch the exception register, so
+	 * reload the hardware state.
+	 */
+	if (!err) {
+		preempt_disable();
+		fpsimd_load_state(&fpsimd);
+		preempt_enable();
+	}
+
+	return err ? -EFAULT : 0;
+}
+
+/*
+ * atomically swap in the new signal mask, and wait for a signal.
+ */
+asmlinkage int compat_sys_sigsuspend(int restart, compat_ulong_t oldmask,
+				     compat_old_sigset_t mask)
+{
+	sigset_t blocked;
+
+	siginitset(&blocked, mask);
+	return sigsuspend(&blocked);
+}
+
+asmlinkage int compat_sys_sigaction(int sig,
+				    const struct compat_old_sigaction __user *act,
+				    struct compat_old_sigaction __user *oact)
+{
+	struct k_sigaction new_ka, old_ka;
+	int ret;
+	compat_old_sigset_t mask;
+	compat_uptr_t handler, restorer;
+
+	if (act) {
+		if (!access_ok(VERIFY_READ, act, sizeof(*act)) ||
+		    __get_user(handler, &act->sa_handler) ||
+		    __get_user(restorer, &act->sa_restorer) ||
+		    __get_user(new_ka.sa.sa_flags, &act->sa_flags) ||
+		    __get_user(mask, &act->sa_mask))
+			return -EFAULT;
+
+		new_ka.sa.sa_handler = compat_ptr(handler);
+		new_ka.sa.sa_restorer = compat_ptr(restorer);
+		siginitset(&new_ka.sa.sa_mask, mask);
+	}
+
+	ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? &old_ka : NULL);
+
+	if (!ret && oact) {
+		if (!access_ok(VERIFY_WRITE, oact, sizeof(*oact)) ||
+		    __put_user(ptr_to_compat(old_ka.sa.sa_handler),
+			       &oact->sa_handler) ||
+		    __put_user(ptr_to_compat(old_ka.sa.sa_restorer),
+			       &oact->sa_restorer) ||
+		    __put_user(old_ka.sa.sa_flags, &oact->sa_flags) ||
+		    __put_user(old_ka.sa.sa_mask.sig[0], &oact->sa_mask))
+			return -EFAULT;
+	}
+
+	return ret;
+}
+
+asmlinkage int compat_sys_rt_sigaction(int sig,
+				       const struct compat_sigaction __user *act,
+				       struct compat_sigaction __user *oact,
+				       compat_size_t sigsetsize)
+{
+	struct k_sigaction new_ka, old_ka;
+	int ret;
+
+	/* XXX: Don't preclude handling different sized sigset_t's.  */
+	if (sigsetsize != sizeof(compat_sigset_t))
+		return -EINVAL;
+
+	if (act) {
+		compat_uptr_t handler, restorer;
+
+		ret = get_user(handler, &act->sa_handler);
+		new_ka.sa.sa_handler = compat_ptr(handler);
+		ret |= get_user(restorer, &act->sa_restorer);
+		new_ka.sa.sa_restorer = compat_ptr(restorer);
+		ret |= get_sigset_t(&new_ka.sa.sa_mask, &act->sa_mask);
+		ret |= __get_user(new_ka.sa.sa_flags, &act->sa_flags);
+		if (ret)
+			return -EFAULT;
+	}
+
+	ret = do_sigaction(sig, act ? &new_ka : NULL, oact ? &old_ka : NULL);
+	if (!ret && oact) {
+		ret = put_user(ptr_to_compat(old_ka.sa.sa_handler), &oact->sa_handler);
+		ret |= put_sigset_t(&oact->sa_mask, &old_ka.sa.sa_mask);
+		ret |= __put_user(old_ka.sa.sa_flags, &oact->sa_flags);
+	}
+	return ret;
+}
+
+int compat_do_sigaltstack(compat_uptr_t compat_uss, compat_uptr_t compat_uoss,
+			  compat_ulong_t sp)
+{
+	compat_stack_t __user *newstack = compat_ptr(compat_uss);
+	compat_stack_t __user *oldstack = compat_ptr(compat_uoss);
+	compat_uptr_t ss_sp;
+	int ret;
+	mm_segment_t old_fs;
+	stack_t uss, uoss;
+
+	/* Marshall the compat new stack into a stack_t */
+	if (newstack) {
+		if (get_user(ss_sp, &newstack->ss_sp) ||
+		    __get_user(uss.ss_flags, &newstack->ss_flags) ||
+		    __get_user(uss.ss_size, &newstack->ss_size))
+			return -EFAULT;
+		uss.ss_sp = compat_ptr(ss_sp);
+	}
+
+	old_fs = get_fs();
+	set_fs(KERNEL_DS);
+	/* The __user pointer casts are valid because of the set_fs() */
+	ret = do_sigaltstack(
+		newstack ? (stack_t __user *) &uss : NULL,
+		oldstack ? (stack_t __user *) &uoss : NULL,
+		(unsigned long)sp);
+	set_fs(old_fs);
+
+	/* Convert the old stack_t into a compat stack. */
+	if (!ret && oldstack &&
+		(put_user(ptr_to_compat(uoss.ss_sp), &oldstack->ss_sp) ||
+		 __put_user(uoss.ss_flags, &oldstack->ss_flags) ||
+		 __put_user(uoss.ss_size, &oldstack->ss_size)))
+		return -EFAULT;
+	return ret;
+}
+
+static int compat_restore_sigframe(struct pt_regs *regs,
+				   struct compat_sigframe __user *sf)
+{
+	int err;
+	sigset_t set;
+	struct compat_aux_sigframe __user *aux;
+
+	err = get_sigset_t(&set, &sf->uc.uc_sigmask);
+	if (err == 0) {
+		sigdelsetmask(&set, ~_BLOCKABLE);
+		set_current_blocked(&set);
+	}
+
+	__get_user_error(regs->regs[0], &sf->uc.uc_mcontext.arm_r0, err);
+	__get_user_error(regs->regs[1], &sf->uc.uc_mcontext.arm_r1, err);
+	__get_user_error(regs->regs[2], &sf->uc.uc_mcontext.arm_r2, err);
+	__get_user_error(regs->regs[3], &sf->uc.uc_mcontext.arm_r3, err);
+	__get_user_error(regs->regs[4], &sf->uc.uc_mcontext.arm_r4, err);
+	__get_user_error(regs->regs[5], &sf->uc.uc_mcontext.arm_r5, err);
+	__get_user_error(regs->regs[6], &sf->uc.uc_mcontext.arm_r6, err);
+	__get_user_error(regs->regs[7], &sf->uc.uc_mcontext.arm_r7, err);
+	__get_user_error(regs->regs[8], &sf->uc.uc_mcontext.arm_r8, err);
+	__get_user_error(regs->regs[9], &sf->uc.uc_mcontext.arm_r9, err);
+	__get_user_error(regs->regs[10], &sf->uc.uc_mcontext.arm_r10, err);
+	__get_user_error(regs->regs[11], &sf->uc.uc_mcontext.arm_fp, err);
+	__get_user_error(regs->regs[12], &sf->uc.uc_mcontext.arm_ip, err);
+	__get_user_error(regs->compat_sp, &sf->uc.uc_mcontext.arm_sp, err);
+	__get_user_error(regs->compat_lr, &sf->uc.uc_mcontext.arm_lr, err);
+	__get_user_error(regs->pc, &sf->uc.uc_mcontext.arm_pc, err);
+	__get_user_error(regs->pstate, &sf->uc.uc_mcontext.arm_cpsr, err);
+
+	/*
+	 * Avoid compat_sys_sigreturn() restarting.
+	 */
+	regs->syscallno = ~0UL;
+
+	err |= !valid_user_regs(&regs->user_regs);
+
+	aux = (struct compat_aux_sigframe __user *) sf->uc.uc_regspace;
+	if (err == 0)
+		err |= compat_restore_vfp_context(&aux->vfp);
+
+	return err;
+}
+
+asmlinkage int compat_sys_sigreturn(struct pt_regs *regs)
+{
+	struct compat_sigframe __user *frame;
+
+	/* Always make any pending restarted system calls return -EINTR */
+	current_thread_info()->restart_block.fn = do_no_restart_syscall;
+
+	/*
+	 * Since we stacked the signal on a 64-bit boundary,
+	 * 'sp' should be 8-byte aligned here.  If it's
+	 * not, then the user is trying to mess with us.
+	 */
+	if (regs->compat_sp & 7)
+		goto badframe;
+
+	frame = (struct compat_sigframe __user *)regs->compat_sp;
+
+	if (!access_ok(VERIFY_READ, frame, sizeof (*frame)))
+		goto badframe;
+
+	if (compat_restore_sigframe(regs, frame))
+		goto badframe;
+
+	return regs->regs[0];
+
+badframe:
+	if (show_unhandled_signals)
+		printk_ratelimited(KERN_INFO "%s[%d]: bad frame in %s: pc=%08llx sp=%08llx\n",
+				   current->comm, task_pid_nr(current), __func__,
+				   regs->pc, regs->sp);
+	force_sig(SIGSEGV, current);
+	return 0;
+}
+
+asmlinkage int compat_sys_rt_sigreturn(struct pt_regs *regs)
+{
+	struct compat_rt_sigframe __user *frame;
+
+	/* Always make any pending restarted system calls return -EINTR */
+	current_thread_info()->restart_block.fn = do_no_restart_syscall;
+
+	/*
+	 * Since we stacked the signal on a 64-bit boundary,
+	 * 'sp' should be 8-byte aligned here.  If it's
+	 * not, then the user is trying to mess with us.
+	 */
+	if (regs->compat_sp & 7)
+		goto badframe;
+
+	frame = (struct compat_rt_sigframe __user *)regs->compat_sp;
+
+	if (!access_ok(VERIFY_READ, frame, sizeof (*frame)))
+		goto badframe;
+
+	if (compat_restore_sigframe(regs, &frame->sig))
+		goto badframe;
+
+	if (compat_do_sigaltstack(ptr_to_compat(&frame->sig.uc.uc_stack),
+				 ptr_to_compat((void __user *)NULL),
+				 regs->compat_sp) == -EFAULT)
+		goto badframe;
+
+	return regs->regs[0];
+
+badframe:
+	if (show_unhandled_signals)
+		printk_ratelimited(KERN_INFO "%s[%d]: bad frame in %s: pc=%08llx sp=%08llx\n",
+				   current->comm, task_pid_nr(current), __func__,
+				   regs->pc, regs->sp);
+	force_sig(SIGSEGV, current);
+	return 0;
+}
+
+static inline void __user *compat_get_sigframe(struct k_sigaction *ka,
+					       struct pt_regs *regs,
+					       int framesize)
+{
+	compat_ulong_t sp = regs->compat_sp;
+	void __user *frame;
+
+	/*
+	 * This is the X/Open sanctioned signal stack switching.
+	 */
+	if ((ka->sa.sa_flags & SA_ONSTACK) && !sas_ss_flags(sp))
+		sp = current->sas_ss_sp + current->sas_ss_size;
+
+	/*
+	 * ATPCS B01 mandates 8-byte alignment
+	 */
+	frame = compat_ptr((compat_uptr_t)((sp - framesize) & ~7));
+
+	/*
+	 * Check that we can actually write to the signal frame.
+	 */
+	if (!access_ok(VERIFY_WRITE, frame, framesize))
+		frame = NULL;
+
+	return frame;
+}
+
+static int compat_setup_return(struct pt_regs *regs, struct k_sigaction *ka,
+			       compat_ulong_t __user *rc, void __user *frame,
+			       int usig)
+{
+	compat_ulong_t handler = ptr_to_compat(ka->sa.sa_handler);
+	compat_ulong_t retcode;
+	compat_ulong_t spsr = regs->pstate & ~PSR_f;
+	int thumb;
+
+	/* Check if the handler is written for ARM or Thumb */
+	thumb = handler & 1;
+
+	if (thumb) {
+		spsr |= COMPAT_PSR_T_BIT;
+		spsr &= ~COMPAT_PSR_IT_MASK;
+	} else {
+		spsr &= ~COMPAT_PSR_T_BIT;
+	}
+
+	if (ka->sa.sa_flags & SA_RESTORER) {
+		retcode = ptr_to_compat(ka->sa.sa_restorer);
+	} else {
+		/* Set up sigreturn pointer */
+		unsigned int idx = thumb << 1;
+
+		if (ka->sa.sa_flags & SA_SIGINFO)
+			idx += 3;
+
+		retcode = AARCH32_VECTORS_BASE +
+			  AARCH32_KERN_SIGRET_CODE_OFFSET +
+			  (idx << 2) + thumb;
+	}
+
+	regs->regs[0]	= usig;
+	regs->compat_sp	= ptr_to_compat(frame);
+	regs->compat_lr	= retcode;
+	regs->pc	= handler;
+	regs->pstate	= spsr;
+
+	return 0;
+}
+
+static int compat_setup_sigframe(struct compat_sigframe __user *sf,
+				 struct pt_regs *regs, sigset_t *set)
+{
+	struct compat_aux_sigframe __user *aux;
+	int err = 0;
+
+	__put_user_error(regs->regs[0], &sf->uc.uc_mcontext.arm_r0, err);
+	__put_user_error(regs->regs[1], &sf->uc.uc_mcontext.arm_r1, err);
+	__put_user_error(regs->regs[2], &sf->uc.uc_mcontext.arm_r2, err);
+	__put_user_error(regs->regs[3], &sf->uc.uc_mcontext.arm_r3, err);
+	__put_user_error(regs->regs[4], &sf->uc.uc_mcontext.arm_r4, err);
+	__put_user_error(regs->regs[5], &sf->uc.uc_mcontext.arm_r5, err);
+	__put_user_error(regs->regs[6], &sf->uc.uc_mcontext.arm_r6, err);
+	__put_user_error(regs->regs[7], &sf->uc.uc_mcontext.arm_r7, err);
+	__put_user_error(regs->regs[8], &sf->uc.uc_mcontext.arm_r8, err);
+	__put_user_error(regs->regs[9], &sf->uc.uc_mcontext.arm_r9, err);
+	__put_user_error(regs->regs[10], &sf->uc.uc_mcontext.arm_r10, err);
+	__put_user_error(regs->regs[11], &sf->uc.uc_mcontext.arm_fp, err);
+	__put_user_error(regs->regs[12], &sf->uc.uc_mcontext.arm_ip, err);
+	__put_user_error(regs->compat_sp, &sf->uc.uc_mcontext.arm_sp, err);
+	__put_user_error(regs->compat_lr, &sf->uc.uc_mcontext.arm_lr, err);
+	__put_user_error(regs->pc, &sf->uc.uc_mcontext.arm_pc, err);
+	__put_user_error(regs->pstate, &sf->uc.uc_mcontext.arm_cpsr, err);
+
+	__put_user_error((compat_ulong_t)0, &sf->uc.uc_mcontext.trap_no, err);
+	__put_user_error((compat_ulong_t)0, &sf->uc.uc_mcontext.error_code, err);
+	__put_user_error(current->thread.fault_address, &sf->uc.uc_mcontext.fault_address, err);
+	__put_user_error(set->sig[0], &sf->uc.uc_mcontext.oldmask, err);
+
+	err |= put_sigset_t(&sf->uc.uc_sigmask, set);
+
+	aux = (struct compat_aux_sigframe __user *) sf->uc.uc_regspace;
+
+	if (err == 0)
+		err |= compat_preserve_vfp_context(&aux->vfp);
+	__put_user_error(0, &aux->end_magic, err);
+
+	return err;
+}
+
+/*
+ * 32-bit signal handling routines called from signal.c
+ */
+int compat_setup_rt_frame(int usig, struct k_sigaction *ka, siginfo_t *info,
+			  sigset_t *set, struct pt_regs *regs)
+{
+	struct compat_rt_sigframe __user *frame;
+	compat_stack_t stack;
+	int err = 0;
+
+	frame = compat_get_sigframe(ka, regs, sizeof(*frame));
+
+	if (!frame)
+		return 1;
+
+	err |= copy_siginfo_to_user32(&frame->info, info);
+
+	__put_user_error(0, &frame->sig.uc.uc_flags, err);
+	__put_user_error(NULL, &frame->sig.uc.uc_link, err);
+
+	memset(&stack, 0, sizeof(stack));
+	stack.ss_sp = (compat_uptr_t)current->sas_ss_sp;
+	stack.ss_flags = sas_ss_flags(regs->compat_sp);
+	stack.ss_size = current->sas_ss_size;
+	err |= __copy_to_user(&frame->sig.uc.uc_stack, &stack, sizeof(stack));
+
+	err |= compat_setup_sigframe(&frame->sig, regs, set);
+	if (err == 0)
+		err = compat_setup_return(regs, ka, frame->sig.retcode, frame,
+					  usig);
+
+	if (err == 0) {
+		regs->regs[1] = (compat_ulong_t)(unsigned long)&frame->info;
+		regs->regs[2] = (compat_ulong_t)(unsigned long)&frame->sig.uc;
+	}
+
+	return err;
+}
+
+int compat_setup_frame(int usig, struct k_sigaction *ka, sigset_t *set,
+		       struct pt_regs *regs)
+{
+	struct compat_sigframe __user *frame;
+	int err = 0;
+
+	frame = compat_get_sigframe(ka, regs, sizeof(*frame));
+
+	if (!frame)
+		return 1;
+
+	__put_user_error(0x5ac3c35a, &frame->uc.uc_flags, err);
+
+	err |= compat_setup_sigframe(frame, regs, set);
+	if (err == 0)
+		err = compat_setup_return(regs, ka, frame->retcode, frame, usig);
+
+	return err;
+}
+
+/*
+ * RT signals don't have generic compat wrappers.
+ * See arch/powerpc/kernel/signal_32.c
+ */
+asmlinkage int compat_sys_rt_sigprocmask(int how, compat_sigset_t __user *set,
+					 compat_sigset_t __user *oset,
+					 compat_size_t sigsetsize)
+{
+	sigset_t s;
+	sigset_t __user *up;
+	int ret;
+	mm_segment_t old_fs = get_fs();
+
+	if (set) {
+		if (get_sigset_t(&s, set))
+			return -EFAULT;
+	}
+
+	set_fs(KERNEL_DS);
+	/* This is valid because of the set_fs() */
+	up = (sigset_t __user *) &s;
+	ret = sys_rt_sigprocmask(how, set ? up : NULL, oset ? up : NULL,
+				 sigsetsize);
+	set_fs(old_fs);
+	if (ret)
+		return ret;
+	if (oset) {
+		if (put_sigset_t(oset, &s))
+			return -EFAULT;
+	}
+	return 0;
+}
+
+asmlinkage int compat_sys_rt_sigpending(compat_sigset_t __user *set,
+					compat_size_t sigsetsize)
+{
+	sigset_t s;
+	int ret;
+	mm_segment_t old_fs = get_fs();
+
+	set_fs(KERNEL_DS);
+	/* The __user pointer cast is valid because of the set_fs() */
+	ret = sys_rt_sigpending((sigset_t __user *) &s, sigsetsize);
+	set_fs(old_fs);
+	if (!ret) {
+		if (put_sigset_t(set, &s))
+			return -EFAULT;
+	}
+	return ret;
+}
+
+asmlinkage int compat_sys_rt_sigqueueinfo(int pid, int sig,
+					  compat_siginfo_t __user *uinfo)
+{
+	siginfo_t info;
+	int ret;
+	mm_segment_t old_fs = get_fs();
+
+	ret = copy_siginfo_from_user32(&info, uinfo);
+	if (unlikely(ret))
+		return ret;
+
+	set_fs(KERNEL_DS);
+	/* The __user pointer cast is valid because of the set_fs() */
+	ret = sys_rt_sigqueueinfo(pid, sig, (siginfo_t __user *) &info);
+	set_fs(old_fs);
+	return ret;
+}
+
+void compat_setup_restart_syscall(struct pt_regs *regs)
+{
+	regs->regs[7] = __NR_restart_syscall;
+}
diff --git a/arch/arm64/kernel/sys32.S b/arch/arm64/kernel/sys32.S
new file mode 100644
index 0000000..fc764c1
--- /dev/null
+++ b/arch/arm64/kernel/sys32.S
@@ -0,0 +1,283 @@
+/*
+ * Compat system call wrappers
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Authors: Will Deacon <will.deacon@arm.com>
+ *	    Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+#include <asm/asm-offsets.h>
+
+/*
+ * System call wrappers for the AArch32 compatibility layer.
+ */
+compat_sys_fork_wrapper:
+	mov	x0, sp
+	b	compat_sys_fork
+ENDPROC(compat_sys_fork_wrapper)
+
+compat_sys_vfork_wrapper:
+	mov	x0, sp
+	b	compat_sys_vfork
+ENDPROC(compat_sys_vfork_wrapper)
+
+compat_sys_execve_wrapper:
+	mov	x3, sp
+	b	compat_sys_execve
+ENDPROC(compat_sys_execve_wrapper)
+
+compat_sys_clone_wrapper:
+	mov	x5, sp
+	b	compat_sys_clone
+ENDPROC(compat_sys_clone_wrapper)
+
+compat_sys_sigreturn_wrapper:
+	mov	x0, sp
+	mov	x27, #0		// prevent syscall restart handling (why)
+	b	compat_sys_sigreturn
+ENDPROC(compat_sys_sigreturn_wrapper)
+
+compat_sys_rt_sigreturn_wrapper:
+	mov	x0, sp
+	mov	x27, #0		// prevent syscall restart handling (why)
+	b	compat_sys_rt_sigreturn
+ENDPROC(compat_sys_rt_sigreturn_wrapper)
+
+compat_sys_sigaltstack_wrapper:
+	ldr	x2, [sp, #S_COMPAT_SP]
+	b	compat_do_sigaltstack
+ENDPROC(compat_sys_sigaltstack_wrapper)
+
+compat_sys_statfs64_wrapper:
+	mov	w3, #84
+	cmp	w1, #88
+	csel	w1, w3, w1, eq
+	b	compat_sys_statfs64
+ENDPROC(compat_sys_statfs64_wrapper)
+
+compat_sys_fstatfs64_wrapper:
+	mov	w3, #84
+	cmp	w1, #88
+	csel	w1, w3, w1, eq
+	b	compat_sys_fstatfs64
+ENDPROC(compat_sys_fstatfs64_wrapper)
+
+/*
+ * Wrappers for AArch32 syscalls that either take 64-bit parameters
+ * in registers or that take 32-bit parameters which require sign
+ * extension.
+ */
+compat_sys_lseek_wrapper:
+	sxtw	x1, w1
+	b	sys_lseek
+ENDPROC(compat_sys_lseek_wrapper)
+
+compat_sys_pread64_wrapper:
+	orr	x3, x4, x5, lsl #32
+	b	sys_pread64
+ENDPROC(compat_sys_pread64_wrapper)
+
+compat_sys_pwrite64_wrapper:
+	orr	x3, x4, x5, lsl #32
+	b	sys_pwrite64
+ENDPROC(compat_sys_pwrite64_wrapper)
+
+compat_sys_truncate64_wrapper:
+	orr	x1, x2, x3, lsl #32
+	b	sys_truncate
+ENDPROC(compat_sys_truncate64_wrapper)
+
+compat_sys_ftruncate64_wrapper:
+	orr	x1, x2, x3, lsl #32
+	b	sys_ftruncate
+ENDPROC(compat_sys_ftruncate64_wrapper)
+
+compat_sys_readahead_wrapper:
+	orr	x1, x2, x3, lsl #32
+	mov	w2, w4
+	b	sys_readahead
+ENDPROC(compat_sys_readahead_wrapper)
+
+compat_sys_lookup_dcookie:
+	orr	x0, x0, x1, lsl #32
+	mov	w1, w2
+	mov	w2, w3
+	b	sys_lookup_dcookie
+ENDPROC(compat_sys_lookup_dcookie)
+
+compat_sys_fadvise64_64_wrapper:
+	mov	w6, w1
+	orr	x1, x2, x3, lsl #32
+	orr	x2, x4, x5, lsl #32
+	mov	w3, w6
+	b	sys_fadvise64_64
+ENDPROC(compat_sys_fadvise64_64_wrapper)
+
+compat_sys_sync_file_range2_wrapper:
+	orr	x2, x2, x3, lsl #32
+	orr	x3, x4, x5, lsl #32
+	b	sys_sync_file_range2
+ENDPROC(compat_sys_sync_file_range2_wrapper)
+
+compat_sys_fallocate_wrapper:
+	orr	x2, x2, x3, lsl #32
+	orr	x3, x4, x5, lsl #32
+	b	sys_fallocate
+ENDPROC(compat_sys_fallocate_wrapper)
+
+compat_sys_fanotify_mark_wrapper:
+	orr	x2, x2, x3, lsl #32
+	mov	w3, w4
+	mov	w4, w5
+	b	sys_fanotify_mark
+ENDPROC(compat_sys_fanotify_mark_wrapper)
+
+/*
+ * Use the compat system call wrappers.
+ */
+#define sys_fork		compat_sys_fork_wrapper
+#define sys_open		compat_sys_open
+#define sys_execve		compat_sys_execve_wrapper
+#define sys_lseek		compat_sys_lseek_wrapper
+#define sys_mount		compat_sys_mount
+#define sys_ptrace		compat_sys_ptrace
+#define sys_times		compat_sys_times
+#define sys_ioctl		compat_sys_ioctl
+#define sys_fcntl		compat_sys_fcntl
+#define sys_ustat		compat_sys_ustat
+#define sys_sigaction		compat_sys_sigaction
+#define sys_sigsuspend		compat_sys_sigsuspend
+#define sys_sigpending		compat_sys_sigpending
+#define sys_setrlimit		compat_sys_setrlimit
+#define sys_getrusage		compat_sys_getrusage
+#define sys_gettimeofday	compat_sys_gettimeofday
+#define sys_settimeofday	compat_sys_settimeofday
+#define sys_statfs		compat_sys_statfs
+#define sys_fstatfs		compat_sys_fstatfs
+#define sys_setitimer		compat_sys_setitimer
+#define sys_getitimer		compat_sys_getitimer
+#define sys_newstat		compat_sys_newstat
+#define sys_newlstat		compat_sys_newlstat
+#define sys_newfstat		compat_sys_newfstat
+#define sys_wait4		compat_sys_wait4
+#define sys_sysinfo		compat_sys_sysinfo
+#define sys_sigreturn		compat_sys_sigreturn_wrapper
+#define sys_clone		compat_sys_clone_wrapper
+#define sys_adjtimex		compat_sys_adjtimex
+#define sys_sigprocmask		compat_sys_sigprocmask
+#define sys_personality		compat_sys_personality
+#define sys_getdents		compat_sys_getdents
+#define sys_select		compat_sys_select
+#define sys_readv		compat_sys_readv
+#define sys_writev		compat_sys_writev
+#define sys_sysctl		compat_sys_sysctl
+#define sys_sched_rr_get_interval compat_sys_sched_rr_get_interval
+#define sys_nanosleep		compat_sys_nanosleep
+#define sys_rt_sigreturn	compat_sys_rt_sigreturn_wrapper
+#define sys_rt_sigaction	compat_sys_rt_sigaction
+#define sys_rt_sigprocmask	compat_sys_rt_sigprocmask
+#define sys_rt_sigpending	compat_sys_rt_sigpending
+#define sys_rt_sigtimedwait	compat_sys_rt_sigtimedwait
+#define sys_rt_sigqueueinfo	compat_sys_rt_sigqueueinfo
+#define sys_rt_sigsuspend	compat_sys_rt_sigsuspend
+#define sys_pread64		compat_sys_pread64_wrapper
+#define sys_pwrite64		compat_sys_pwrite64_wrapper
+#define sys_sigaltstack		compat_sys_sigaltstack_wrapper
+#define sys_sendfile		compat_sys_sendfile
+#define sys_vfork		compat_sys_vfork_wrapper
+#define sys_getrlimit		compat_sys_getrlimit
+#define sys_mmap2		sys_mmap_pgoff
+#define sys_truncate64		compat_sys_truncate64_wrapper
+#define sys_ftruncate64		compat_sys_ftruncate64_wrapper
+#define sys_getdents64		compat_sys_getdents64
+#define sys_fcntl64		compat_sys_fcntl64
+#define sys_readahead		compat_sys_readahead_wrapper
+#define sys_futex		compat_sys_futex
+#define sys_sched_setaffinity	compat_sys_sched_setaffinity
+#define sys_sched_getaffinity	compat_sys_sched_getaffinity
+#define sys_io_setup		compat_sys_io_setup
+#define sys_io_getevents	compat_sys_io_getevents
+#define sys_io_submit		compat_sys_io_submit
+#define sys_lookup_dcookie	compat_sys_lookup_dcookie
+#define sys_timer_create	compat_sys_timer_create
+#define sys_timer_settime	compat_sys_timer_settime
+#define sys_timer_gettime	compat_sys_timer_gettime
+#define sys_clock_settime	compat_sys_clock_settime
+#define sys_clock_gettime	compat_sys_clock_gettime
+#define sys_clock_getres	compat_sys_clock_getres
+#define sys_clock_nanosleep	compat_sys_clock_nanosleep
+#define sys_statfs64		compat_sys_statfs64_wrapper
+#define sys_fstatfs64		compat_sys_fstatfs64_wrapper
+#define sys_utimes		compat_sys_utimes
+#define sys_fadvise64_64	compat_sys_fadvise64_64_wrapper
+#define sys_mq_open		compat_sys_mq_open
+#define sys_mq_timedsend	compat_sys_mq_timedsend
+#define sys_mq_timedreceive	compat_sys_mq_timedreceive
+#define sys_mq_notify		compat_sys_mq_notify
+#define sys_mq_getsetattr	compat_sys_mq_getsetattr
+#define sys_waitid		compat_sys_waitid
+#define sys_recv		compat_sys_recv
+#define sys_recvfrom		compat_sys_recvfrom
+#define sys_setsockopt		compat_sys_setsockopt
+#define sys_getsockopt		compat_sys_getsockopt
+#define sys_sendmsg		compat_sys_sendmsg
+#define sys_recvmsg		compat_sys_recvmsg
+#define sys_semctl		compat_sys_semctl
+#define sys_msgsnd		compat_sys_msgsnd
+#define sys_msgrcv		compat_sys_msgrcv
+#define sys_msgctl		compat_sys_msgctl
+#define sys_shmat		compat_sys_shmat
+#define sys_shmctl		compat_sys_shmctl
+#define sys_keyctl		compat_sys_keyctl
+#define sys_semtimedop		compat_sys_semtimedop
+#define sys_mbind		compat_sys_mbind
+#define sys_get_mempolicy	compat_sys_get_mempolicy
+#define sys_set_mempolicy	compat_sys_set_mempolicy
+#define sys_openat		compat_sys_openat
+#define sys_futimesat		compat_sys_futimesat
+#define sys_pselect6		compat_sys_pselect6
+#define sys_ppoll		compat_sys_ppoll
+#define sys_set_robust_list	compat_sys_set_robust_list
+#define sys_get_robust_list	compat_sys_get_robust_list
+#define sys_sync_file_range2	compat_sys_sync_file_range2_wrapper
+#define sys_vmsplice		compat_sys_vmsplice
+#define sys_move_pages		compat_sys_move_pages
+#define sys_epoll_pwait		compat_sys_epoll_pwait
+#define sys_kexec_load		compat_sys_kexec_load
+#define sys_utimensat		compat_sys_utimensat
+#define sys_signalfd		compat_sys_signalfd
+#define sys_fallocate		compat_sys_fallocate_wrapper
+#define sys_timerfd_settime	compat_sys_timerfd_settime
+#define sys_timerfd_gettime	compat_sys_timerfd_gettime
+#define sys_signalfd4		compat_sys_signalfd4
+#define sys_preadv		compat_sys_preadv
+#define sys_pwritev		compat_sys_pwritev
+#define sys_rt_tgsigqueueinfo	compat_sys_rt_tgsigqueueinfo
+#define sys_recvmmsg		compat_sys_recvmmsg
+#define sys_fanotify_mark	compat_sys_fanotify_mark_wrapper
+
+#undef __SYSCALL
+#define __SYSCALL(x, y)		.quad	y	// x
+#define __SYSCALL_COMPAT
+
+/*
+ * The system calls table must be 4KB aligned.
+ */
+	.align	12
+ENTRY(compat_sys_call_table)
+#include <asm/unistd.h>
diff --git a/arch/arm64/kernel/sys_compat.c b/arch/arm64/kernel/sys_compat.c
new file mode 100644
index 0000000..025ec0a
--- /dev/null
+++ b/arch/arm64/kernel/sys_compat.c
@@ -0,0 +1,177 @@
+/*
+ * Based on arch/arm/kernel/sys_arm.c
+ *
+ * Copyright (C) People who wrote linux/arch/i386/kernel/sys_i386.c
+ * Copyright (C) 1995, 1996 Russell King.
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define __SYSCALL_COMPAT
+
+#include <linux/compat.h>
+#include <linux/personality.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+
+#include <asm/cacheflush.h>
+#include <asm/unistd.h>
+
+asmlinkage int compat_sys_fork(struct pt_regs *regs)
+{
+	return do_fork(SIGCHLD, regs->compat_sp, regs, 0, NULL, NULL);
+}
+
+asmlinkage int compat_sys_clone(unsigned long clone_flags, unsigned long newsp,
+			  int __user *parent_tidptr, int tls_val,
+			  int __user *child_tidptr, struct pt_regs *regs)
+{
+	if (!newsp)
+		newsp = regs->compat_sp;
+
+	return do_fork(clone_flags, newsp, regs, 0, parent_tidptr, child_tidptr);
+}
+
+asmlinkage int compat_sys_vfork(struct pt_regs *regs)
+{
+	return do_fork(CLONE_VFORK | CLONE_VM | SIGCHLD, regs->compat_sp,
+		       regs, 0, NULL, NULL);
+}
+
+asmlinkage int compat_sys_execve(const char __user *filenamei,
+				 compat_uptr_t argv, compat_uptr_t envp,
+				 struct pt_regs *regs)
+{
+	int error;
+	char * filename;
+
+	filename = getname(filenamei);
+	error = PTR_ERR(filename);
+	if (IS_ERR(filename))
+		goto out;
+	error = compat_do_execve(filename, compat_ptr(argv), compat_ptr(envp),
+				 regs);
+	putname(filename);
+out:
+	return error;
+}
+
+asmlinkage int compat_sys_sched_rr_get_interval(compat_pid_t pid,
+						struct compat_timespec __user *interval)
+{
+	struct timespec t;
+	int ret;
+	mm_segment_t old_fs = get_fs();
+
+	set_fs(KERNEL_DS);
+	ret = sys_sched_rr_get_interval(pid, (struct timespec __user *)&t);
+	set_fs(old_fs);
+	if (put_compat_timespec(&t, interval))
+		return -EFAULT;
+	return ret;
+}
+
+asmlinkage int compat_sys_personality(compat_ulong_t personality)
+{
+	int ret;
+
+	if (personality(current->personality) == PER_LINUX32 &&
+		personality == PER_LINUX)
+		personality = PER_LINUX32;
+	ret = sys_personality(personality);
+	if (ret == PER_LINUX32)
+		ret = PER_LINUX;
+	return ret;
+}
+
+asmlinkage int compat_sys_sendfile(int out_fd, int in_fd,
+				   compat_off_t __user *offset, s32 count)
+{
+	mm_segment_t old_fs = get_fs();
+	int ret;
+	off_t of;
+
+	if (offset && get_user(of, offset))
+		return -EFAULT;
+
+	set_fs(KERNEL_DS);
+	ret = sys_sendfile(out_fd, in_fd, offset ? (off_t __user *)&of : NULL,
+			   count);
+	set_fs(old_fs);
+
+	if (offset && put_user(of, offset))
+		return -EFAULT;
+	return ret;
+}
+
+static inline void
+do_compat_cache_op(unsigned long start, unsigned long end, int flags)
+{
+	struct mm_struct *mm = current->active_mm;
+	struct vm_area_struct *vma;
+
+	if (end < start || flags)
+		return;
+
+	down_read(&mm->mmap_sem);
+	vma = find_vma(mm, start);
+	if (vma && vma->vm_start < end) {
+		if (start < vma->vm_start)
+			start = vma->vm_start;
+		if (end > vma->vm_end)
+			end = vma->vm_end;
+		up_read(&mm->mmap_sem);
+		flush_cache_user_range(start, end);
+		return;
+	}
+	up_read(&mm->mmap_sem);
+}
+
+/*
+ * Handle all unrecognised system calls.
+ */
+long compat_arm_syscall(struct pt_regs *regs)
+{
+	unsigned int no = regs->regs[7];
+
+	switch (no) {
+	/*
+	 * Flush a region from virtual address 'r0' to virtual address 'r1'
+	 * _exclusive_.  There is no alignment requirement on either address;
+	 * user space does not need to know the hardware cache layout.
+	 *
+	 * r2 contains flags.  It should ALWAYS be passed as ZERO until it
+	 * is defined to be something else.  For now we ignore it, but may
+	 * the fires of hell burn in your belly if you break this rule. ;)
+	 *
+	 * (at a later date, we may want to allow this call to not flush
+	 * various aspects of the cache.  Passing '0' will guarantee that
+	 * everything necessary gets flushed to maintain consistency in
+	 * the specified region).
+	 */
+	case __ARM_NR_compat_cacheflush:
+		do_compat_cache_op(regs->regs[0], regs->regs[1], regs->regs[2]);
+		return 0;
+
+	case __ARM_NR_compat_set_tls:
+		current->thread.tp_value = regs->regs[0];
+		asm ("msr tpidrro_el0, %0" : : "r" (regs->regs[0]));
+		return 0;
+
+	default:
+		return -ENOSYS;
+	}
+}
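
Note on the sys32.S wrappers in this patch: an AArch32 task passes each
64-bit syscall argument as a pair of 32-bit registers, and wrappers such
as compat_sys_pread64_wrapper rebuild the value with
"orr x3, x4, x5, lsl #32". A rough C equivalent is sketched below. The
helper names merge_reg_pair() and sign_extend_32() are hypothetical and
exist only for illustration; they are not part of the patch. The sketch
assumes the low word arrives in the first register of the pair, as in the
wrappers above.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: rebuild a 64-bit syscall
 * argument from the two 32-bit registers an AArch32 caller used to pass
 * it. Mirrors "orr xN, xLO, xHI, lsl #32".
 */
static inline uint64_t merge_reg_pair(uint32_t lo, uint32_t hi)
{
	return (uint64_t)lo | ((uint64_t)hi << 32);
}

/*
 * Hypothetical counterpart of "sxtw x1, w1" in compat_sys_lseek_wrapper:
 * widen a 32-bit register value to 64 bits, preserving its sign.
 * Written with explicit arithmetic to avoid an implementation-defined
 * unsigned-to-signed conversion.
 */
static inline int64_t sign_extend_32(uint32_t w)
{
	int64_t v = (int64_t)w;

	if (w & 0x80000000u)
		v -= (int64_t)1 << 32;
	return v;
}
```

The sign extension matters for lseek because a negative 32-bit offset
would otherwise reach sys_lseek as a large positive 64-bit value.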

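Note on the VFP sigframe code in signal32.c: AArch32 exposes one FPSCR
register, while AArch64 splits the same information into FPSR (status)
and FPCR (control). compat_preserve_vfp_context() merges the two with a
pair of masks and compat_restore_vfp_context() splits them again. The
sketch below shows why this round-trips, assuming the two masks are
disjoint. The mask values and the ex_ names are assumptions made for
illustration; the authoritative VFP_FPSCR_STAT_MASK and
VFP_FPSCR_CTRL_MASK definitions are the ones in the series' headers.

```c
#include <stdint.h>

/* Assumed mask values, for illustration only. */
#define EX_FPSCR_STAT_MASK	0xf800009fu	/* status bits -> FPSR */
#define EX_FPSCR_CTRL_MASK	0x07f79f00u	/* control bits -> FPCR */

/* Hypothetical container for the two AArch64 registers. */
struct ex_fp_regs {
	uint32_t fpsr;
	uint32_t fpcr;
};

/* Build an AArch32-style FPSCR from FPSR and FPCR. */
static inline uint32_t ex_fpscr_merge(const struct ex_fp_regs *r)
{
	return (r->fpsr & EX_FPSCR_STAT_MASK) |
	       (r->fpcr & EX_FPSCR_CTRL_MASK);
}

/* Split an AArch32-style FPSCR back into FPSR and FPCR. */
static inline struct ex_fp_regs ex_fpscr_split(uint32_t fpscr)
{
	struct ex_fp_regs r = {
		.fpsr = fpscr & EX_FPSCR_STAT_MASK,
		.fpcr = fpscr & EX_FPSCR_CTRL_MASK,
	};

	return r;
}
```

Because the assumed masks share no bits, splitting a value and merging
it again loses only the bits that neither mask covers.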


* [PATCH v2 22/31] arm64: Floating point and SIMD
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (20 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 21/31] arm64: 32-bit (compat) applications support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 14:35   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 23/31] arm64: Debugging support Catalin Marinas
                   ` (9 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds support for saving and restoring the FP/ASIMD register
bank during context switch, and for FP exception handling, which
generates SIGFPE.
There are 32 128-bit registers and the context switching is currently
done non-lazily. Benchmarks on real hardware are required before
implementing lazy FP state saving/restoring.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
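Reader's note (not part of the patch): a minimal user-space sketch of the
memory layout that entry-fpsimd.S relies on. The last stp/ldp pair uses
pre-index writeback, so x0 ends up pointing at offset 16 * 30; the
following accesses at #16 * 2 and #16 * 2 + 4 therefore land on fpsr and
fpcr. The names fpsimd_state_sketch and fpsimd_layout_matches_asm are
hypothetical, and __uint128_t is assumed to be available (GCC or Clang on
a 64-bit target).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the anonymous struct inside struct fpsimd_state. */
struct fpsimd_state_sketch {
	__uint128_t vregs[32];	/* q0..q31, 16 bytes each */
	uint32_t fpsr;
	uint32_t fpcr;
};

/* Offsets as computed by the assembly after the writeback on q30/q31. */
enum {
	Q30_OFFSET  = 16 * 30,			/* x0 after "[x0, #16 * 30]!" */
	FPSR_OFFSET = Q30_OFFSET + 16 * 2,	/* "str w8, [x0, #16 * 2]" */
	FPCR_OFFSET = Q30_OFFSET + 16 * 2 + 4,	/* "str w8, [x0, #16 * 2 + 4]" */
};

/* Compile-time check that the C layout agrees with the assembly offsets. */
static_assert(offsetof(struct fpsimd_state_sketch, fpsr) == FPSR_OFFSET,
	      "fpsr must follow the 32 vector registers");
static_assert(offsetof(struct fpsimd_state_sketch, fpcr) == FPCR_OFFSET,
	      "fpcr must follow fpsr");

/* Run-time form of the same check; returns 1 when the layout matches. */
int fpsimd_layout_matches_asm(void)
{
	return offsetof(struct fpsimd_state_sketch, vregs) == 0 &&
	       offsetof(struct fpsimd_state_sketch, fpsr) == FPSR_OFFSET &&
	       offsetof(struct fpsimd_state_sketch, fpcr) == FPCR_OFFSET;
}
```

If the struct were ever reordered (for example FPSR/FPCR placed first),
the static assertions would fail at build time instead of the assembly
silently writing to the wrong fields.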
 arch/arm64/include/asm/fpsimd.h  |   64 +++++++++++++++++++++++
 arch/arm64/kernel/entry-fpsimd.S |   80 ++++++++++++++++++++++++++++
 arch/arm64/kernel/fpsimd.c       |  106 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 250 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/fpsimd.h
 create mode 100644 arch/arm64/kernel/entry-fpsimd.S
 create mode 100644 arch/arm64/kernel/fpsimd.c

diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
new file mode 100644
index 0000000..7ea4711
--- /dev/null
+++ b/arch/arm64/include/asm/fpsimd.h
@@ -0,0 +1,64 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_FP_H
+#define __ASM_FP_H
+
+#include <asm/ptrace.h>
+
+#ifndef __ASSEMBLY__
+
+/*
+ * FP/SIMD storage area has:
+ *  - 32 128-bit data registers
+ *  - FPSR and FPCR
+ *
+ * Note that struct user_fpsimd_state must form a prefix of struct
+ * fpsimd_state; the ptrace FP/SIMD accessors rely on this (hence the
+ * union below).
+ */
+struct fpsimd_state {
+	union {
+		struct user_fpsimd_state user_fpsimd;
+		struct {
+			__uint128_t vregs[32];
+			u32 fpsr;
+			u32 fpcr;
+		};
+	};
+};
+
+#if defined(__KERNEL__) && defined(CONFIG_AARCH32_EMULATION)
+/* Masks for extracting the FPSR and FPCR from the FPSCR */
+#define VFP_FPSCR_STAT_MASK	0xf800009f
+#define VFP_FPSCR_CTRL_MASK	0x07f79f00
+/*
+ * The VFP state has 32x64-bit registers and a single 32-bit
+ * control/status register.
+ */
+#define VFP_STATE_SIZE		((32 * 8) + 4)
+#endif
+
+struct task_struct;
+
+extern void fpsimd_save_state(struct fpsimd_state *state);
+extern void fpsimd_load_state(struct fpsimd_state *state);
+
+extern void fpsimd_thread_switch(struct task_struct *next);
+extern void fpsimd_flush_thread(void);
+
+#endif
+
+#endif
diff --git a/arch/arm64/kernel/entry-fpsimd.S b/arch/arm64/kernel/entry-fpsimd.S
new file mode 100644
index 0000000..17988a6
--- /dev/null
+++ b/arch/arm64/kernel/entry-fpsimd.S
@@ -0,0 +1,80 @@
+/*
+ * FP/SIMD state saving and restoring
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+
+#include <asm/assembler.h>
+
+/*
+ * Save the FP registers.
+ *
+ * x0 - pointer to struct fpsimd_state
+ */
+ENTRY(fpsimd_save_state)
+	stp	q0, q1, [x0, #16 * 0]
+	stp	q2, q3, [x0, #16 * 2]
+	stp	q4, q5, [x0, #16 * 4]
+	stp	q6, q7, [x0, #16 * 6]
+	stp	q8, q9, [x0, #16 * 8]
+	stp	q10, q11, [x0, #16 * 10]
+	stp	q12, q13, [x0, #16 * 12]
+	stp	q14, q15, [x0, #16 * 14]
+	stp	q16, q17, [x0, #16 * 16]
+	stp	q18, q19, [x0, #16 * 18]
+	stp	q20, q21, [x0, #16 * 20]
+	stp	q22, q23, [x0, #16 * 22]
+	stp	q24, q25, [x0, #16 * 24]
+	stp	q26, q27, [x0, #16 * 26]
+	stp	q28, q29, [x0, #16 * 28]
+	stp	q30, q31, [x0, #16 * 30]!
+	mrs	x8, fpsr
+	str	w8, [x0, #16 * 2]
+	mrs	x8, fpcr
+	str	w8, [x0, #16 * 2 + 4]
+	ret
+ENDPROC(fpsimd_save_state)
+
+/*
+ * Load the FP registers.
+ *
+ * x0 - pointer to struct fpsimd_state
+ */
+ENTRY(fpsimd_load_state)
+	ldp	q0, q1, [x0, #16 * 0]
+	ldp	q2, q3, [x0, #16 * 2]
+	ldp	q4, q5, [x0, #16 * 4]
+	ldp	q6, q7, [x0, #16 * 6]
+	ldp	q8, q9, [x0, #16 * 8]
+	ldp	q10, q11, [x0, #16 * 10]
+	ldp	q12, q13, [x0, #16 * 12]
+	ldp	q14, q15, [x0, #16 * 14]
+	ldp	q16, q17, [x0, #16 * 16]
+	ldp	q18, q19, [x0, #16 * 18]
+	ldp	q20, q21, [x0, #16 * 20]
+	ldp	q22, q23, [x0, #16 * 22]
+	ldp	q24, q25, [x0, #16 * 24]
+	ldp	q26, q27, [x0, #16 * 26]
+	ldp	q28, q29, [x0, #16 * 28]
+	ldp	q30, q31, [x0, #16 * 30]!
+	ldr	w8, [x0, #16 * 2]
+	ldr	w9, [x0, #16 * 2 + 4]
+	msr	fpsr, x8
+	msr	fpcr, x9
+	ret
+ENDPROC(fpsimd_load_state)
diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
new file mode 100644
index 0000000..e8b8357
--- /dev/null
+++ b/arch/arm64/kernel/fpsimd.c
@@ -0,0 +1,106 @@
+/*
+ * FP/SIMD context switching and fault handling
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas <catalin.marinas@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/signal.h>
+
+#include <asm/fpsimd.h>
+#include <asm/cputype.h>
+
+#define FPEXC_IOF	(1 << 0)
+#define FPEXC_DZF	(1 << 1)
+#define FPEXC_OFF	(1 << 2)
+#define FPEXC_UFF	(1 << 3)
+#define FPEXC_IXF	(1 << 4)
+#define FPEXC_IDF	(1 << 7)
+
+/*
+ * Trapped FP/ASIMD access.
+ */
+void do_fpsimd_acc(unsigned int esr, struct pt_regs *regs)
+{
+	/* TODO: implement lazy context saving/restoring */
+	WARN_ON(1);
+}
+
+/*
+ * Raise a SIGFPE for the current process.
+ */
+void do_fpsimd_exc(unsigned int esr, struct pt_regs *regs)
+{
+	siginfo_t info;
+	unsigned int si_code = 0;
+
+	if (esr & FPEXC_IOF)
+		si_code = FPE_FLTINV;
+	else if (esr & FPEXC_DZF)
+		si_code = FPE_FLTDIV;
+	else if (esr & FPEXC_OFF)
+		si_code = FPE_FLTOVF;
+	else if (esr & FPEXC_UFF)
+		si_code = FPE_FLTUND;
+	else if (esr & FPEXC_IXF)
+		si_code = FPE_FLTRES;
+
+	memset(&info, 0, sizeof(info));
+	info.si_signo = SIGFPE;
+	info.si_code = si_code;
+	info.si_addr = (void __user *)instruction_pointer(regs);
+
+	send_sig_info(SIGFPE, &info, current);
+}
+
+void fpsimd_thread_switch(struct task_struct *next)
+{
+	/* kernel threads (no mm) have no user FP/SIMD state to switch */
+	if (current->mm)
+		fpsimd_save_state(&current->thread.fpsimd_state);
+	if (next->mm)
+		fpsimd_load_state(&next->thread.fpsimd_state);
+}
+
+void fpsimd_flush_thread(void)
+{
+	memset(&current->thread.fpsimd_state, 0, sizeof(struct fpsimd_state));
+	fpsimd_load_state(&current->thread.fpsimd_state);
+}
+
+/*
+ * FP/SIMD support code initialisation.
+ */
+static int __init fpsimd_init(void)
+{
+	u64 pfr = read_cpuid(ID_AA64PFR0_EL1);
+
+	if (pfr & (0xf << 16)) {
+		pr_notice("Floating-point is not implemented\n");
+		return 0;
+	}
+	elf_hwcap |= HWCAP_FP;
+
+	if (pfr & (0xf << 20))
+		pr_notice("Advanced SIMD is not implemented\n");
+	else
+		elf_hwcap |= HWCAP_ASIMD;
+
+	return 0;
+}
+late_initcall(fpsimd_init);
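
Reader's note (not part of the patch): a small user-space sketch of the
flag-to-si_code mapping performed by do_fpsimd_exc() above. It only
restates the if/else chain from the patch so the priority order is easy
to test: when several exception bits are set, the lowest-numbered one
wins, and an input-denormal (IDF) trap on its own leaves si_code at 0.
The function names fpexc_to_si_code and fpexc_self_check are
hypothetical; the FPE_* constants come from the C library's <signal.h>.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>

/* Exception flag bits, copied from the patch. */
#define FPEXC_IOF	(1u << 0)	/* invalid operation */
#define FPEXC_DZF	(1u << 1)	/* divide by zero */
#define FPEXC_OFF	(1u << 2)	/* overflow */
#define FPEXC_UFF	(1u << 3)	/* underflow */
#define FPEXC_IXF	(1u << 4)	/* inexact */
#define FPEXC_IDF	(1u << 7)	/* input denormal */

/* Same decision chain as do_fpsimd_exc(), without the signal delivery. */
int fpexc_to_si_code(unsigned int esr)
{
	if (esr & FPEXC_IOF)
		return FPE_FLTINV;
	if (esr & FPEXC_DZF)
		return FPE_FLTDIV;
	if (esr & FPEXC_OFF)
		return FPE_FLTOVF;
	if (esr & FPEXC_UFF)
		return FPE_FLTUND;
	if (esr & FPEXC_IXF)
		return FPE_FLTRES;
	return 0;	/* e.g. IDF alone: no specific si_code in the patch */
}

/* Usage example: exercises the priority order described above. */
void fpexc_self_check(void)
{
	assert(fpexc_to_si_code(FPEXC_DZF) == FPE_FLTDIV);
	assert(fpexc_to_si_code(FPEXC_IOF | FPEXC_IXF) == FPE_FLTINV);
	assert(fpexc_to_si_code(FPEXC_IDF) == 0);
}
```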



* [PATCH v2 23/31] arm64: Debugging support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (21 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 22/31] arm64: Floating point and SIMD Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:07   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 24/31] arm64: Add support for /proc/sys/debug/exception-trace Catalin Marinas
                   ` (8 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch adds support for ptrace, debug monitors and hardware
breakpoints.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
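Reader's note (not part of the patch): a user-space sketch of the
control-register packing done by encode_ctrl_reg() and decode_ctrl_reg()
in hw_breakpoint.h, using the same shifts (enabled at bit 0, privilege at
bits 2:1, type at bits 4:3, len at bits 12:5). The struct and function
names ending in _sketch are hypothetical, and plain bytes are used in
place of the bitfields from the patch to keep the example portable.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for struct arch_hw_breakpoint_ctrl. */
struct bp_ctrl_sketch {
	uint8_t len;		/* byte-select mask, e.g. 0xf for 4 bytes */
	uint8_t type;		/* 0 execute, 1 load, 2 store, 3 load|store */
	uint8_t privilege;	/* 1 = EL1, 2 = EL0 */
	uint8_t enabled;	/* 0 or 1 */
};

/* The widest legal values must fit in the 13 non-reserved bits. */
static_assert(((0xffu << 5) | (3u << 3) | (3u << 1) | 1u) == 0x1fffu,
	      "len/type/privilege/enabled occupy bits 12:0");

/* Pack the fields exactly as encode_ctrl_reg() does. */
uint32_t encode_ctrl_sketch(struct bp_ctrl_sketch c)
{
	return ((uint32_t)c.len << 5) | ((uint32_t)c.type << 3) |
	       ((uint32_t)c.privilege << 1) | c.enabled;
}

/* Unpack in the same order as decode_ctrl_reg(): low bits first. */
struct bp_ctrl_sketch decode_ctrl_sketch(uint32_t reg)
{
	struct bp_ctrl_sketch c;

	c.enabled = reg & 0x1;
	reg >>= 1;
	c.privilege = reg & 0x3;
	reg >>= 2;
	c.type = reg & 0x3;
	reg >>= 2;
	c.len = reg & 0xff;
	return c;
}
```

For example, an enabled 4-byte store watchpoint at EL0 (len 0xf, type 2,
privilege 2) packs to 0x1f5, and decoding 0x1f5 returns the same fields.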
 arch/arm64/include/asm/debug-monitors.h |   88 +++
 arch/arm64/include/asm/hw_breakpoint.h  |  137 +++++
 arch/arm64/kernel/debug-monitors.c      |  288 ++++++++++
 arch/arm64/kernel/hw_breakpoint.c       |  880 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/ptrace.c              |  834 +++++++++++++++++++++++++++++
 5 files changed, 2227 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/debug-monitors.h
 create mode 100644 arch/arm64/include/asm/hw_breakpoint.h
 create mode 100644 arch/arm64/kernel/debug-monitors.c
 create mode 100644 arch/arm64/kernel/hw_breakpoint.c
 create mode 100644 arch/arm64/kernel/ptrace.c

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
new file mode 100644
index 0000000..7eaa0b3
--- /dev/null
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -0,0 +1,88 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_DEBUG_MONITORS_H
+#define __ASM_DEBUG_MONITORS_H
+
+#ifdef __KERNEL__
+
+#define	DBG_ESR_EVT(x)		(((x) >> 27) & 0x7)
+
+/* AArch64 */
+#define DBG_ESR_EVT_HWBP	0x0
+#define DBG_ESR_EVT_HWSS	0x1
+#define DBG_ESR_EVT_HWWP	0x2
+#define DBG_ESR_EVT_BRK		0x6
+
+enum debug_el {
+	DBG_ACTIVE_EL0 = 0,
+	DBG_ACTIVE_EL1,
+};
+
+/* AArch32 */
+#define DBG_ESR_EVT_BKPT	0x4
+#define DBG_ESR_EVT_VECC	0x5
+
+#define AARCH32_BREAK_ARM	0x07f001f0
+#define AARCH32_BREAK_THUMB	0xde01
+#define AARCH32_BREAK_THUMB2_LO	0xf7f0
+#define AARCH32_BREAK_THUMB2_HI	0xa000
+
+#ifndef __ASSEMBLY__
+struct task_struct;
+
+#define local_dbg_save(flags)							\
+	do {									\
+		typecheck(unsigned long, flags);				\
+		asm volatile(							\
+		"mrs	%0, daif			// local_dbg_save\n"	\
+		"msr	daifset, #8"						\
+		: "=r" (flags) : : "memory");					\
+	} while (0)
+
+#define local_dbg_restore(flags)						\
+	do {									\
+		typecheck(unsigned long, flags);				\
+		asm volatile(							\
+		"msr	daif, %0			// local_dbg_restore\n"	\
+		: : "r" (flags) : "memory");					\
+	} while (0)
+
+#define DBG_ARCH_ID_RESERVED	0	/* In case of ptrace ABI updates. */
+
+u8 debug_monitors_arch(void);
+
+void enable_debug_monitors(enum debug_el el);
+void disable_debug_monitors(enum debug_el el);
+
+void user_rewind_single_step(struct task_struct *task);
+void user_fastforward_single_step(struct task_struct *task);
+
+void kernel_enable_single_step(struct pt_regs *regs);
+void kernel_disable_single_step(void);
+int kernel_active_single_step(void);
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+int reinstall_suspended_bps(struct pt_regs *regs);
+#else
+static inline int reinstall_suspended_bps(struct pt_regs *regs)
+{
+	return -ENODEV;
+}
+#endif
+
+#endif	/* __ASSEMBLY__ */
+#endif	/* __KERNEL__ */
+#endif	/* __ASM_DEBUG_MONITORS_H */
diff --git a/arch/arm64/include/asm/hw_breakpoint.h b/arch/arm64/include/asm/hw_breakpoint.h
new file mode 100644
index 0000000..d064047
--- /dev/null
+++ b/arch/arm64/include/asm/hw_breakpoint.h
@@ -0,0 +1,137 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_HW_BREAKPOINT_H
+#define __ASM_HW_BREAKPOINT_H
+
+#ifdef __KERNEL__
+
+struct arch_hw_breakpoint_ctrl {
+	u32 __reserved	: 19,
+	len		: 8,
+	type		: 2,
+	privilege	: 2,
+	enabled		: 1;
+};
+
+struct arch_hw_breakpoint {
+	u64 address;
+	u64 trigger;
+	struct arch_hw_breakpoint_ctrl ctrl;
+};
+
+static inline u32 encode_ctrl_reg(struct arch_hw_breakpoint_ctrl ctrl)
+{
+	return (ctrl.len << 5) | (ctrl.type << 3) | (ctrl.privilege << 1) |
+		ctrl.enabled;
+}
+
+static inline void decode_ctrl_reg(u32 reg,
+				   struct arch_hw_breakpoint_ctrl *ctrl)
+{
+	ctrl->enabled	= reg & 0x1;
+	reg >>= 1;
+	ctrl->privilege	= reg & 0x3;
+	reg >>= 2;
+	ctrl->type	= reg & 0x3;
+	reg >>= 2;
+	ctrl->len	= reg & 0xff;
+}
+
+/* Breakpoint */
+#define ARM_BREAKPOINT_EXECUTE	0
+
+/* Watchpoints */
+#define ARM_BREAKPOINT_LOAD	1
+#define ARM_BREAKPOINT_STORE	2
+#define AARCH64_ESR_ACCESS_MASK	(1 << 6)
+
+/* Privilege Levels */
+#define AARCH64_BREAKPOINT_EL1	1
+#define AARCH64_BREAKPOINT_EL0	2
+
+/* Lengths */
+#define ARM_BREAKPOINT_LEN_1	0x1
+#define ARM_BREAKPOINT_LEN_2	0x3
+#define ARM_BREAKPOINT_LEN_4	0xf
+#define ARM_BREAKPOINT_LEN_8	0xff
+
+/* Kernel stepping */
+#define ARM_KERNEL_STEP_NONE	0
+#define ARM_KERNEL_STEP_ACTIVE	1
+#define ARM_KERNEL_STEP_SUSPEND	2
+
+/*
+ * Limits.
+ * Changing these will require modifications to the register accessors.
+ */
+#define ARM_MAX_BRP		16
+#define ARM_MAX_WRP		16
+#define ARM_MAX_HBP_SLOTS	(ARM_MAX_BRP + ARM_MAX_WRP)
+
+/* Virtual debug register bases. */
+#define AARCH64_DBG_REG_BVR	0
+#define AARCH64_DBG_REG_BCR	(AARCH64_DBG_REG_BVR + ARM_MAX_BRP)
+#define AARCH64_DBG_REG_WVR	(AARCH64_DBG_REG_BCR + ARM_MAX_BRP)
+#define AARCH64_DBG_REG_WCR	(AARCH64_DBG_REG_WVR + ARM_MAX_WRP)
+
+/* Debug register names. */
+#define AARCH64_DBG_REG_NAME_BVR	"bvr"
+#define AARCH64_DBG_REG_NAME_BCR	"bcr"
+#define AARCH64_DBG_REG_NAME_WVR	"wvr"
+#define AARCH64_DBG_REG_NAME_WCR	"wcr"
+
+/* Accessor macros for the debug registers. */
+#define AARCH64_DBG_READ(N, REG, VAL) do {\
+	asm volatile("mrs %0, dbg" REG #N "_el1" : "=r" (VAL));\
+} while (0)
+
+#define AARCH64_DBG_WRITE(N, REG, VAL) do {\
+	asm volatile("msr dbg" REG #N "_el1, %0" :: "r" (VAL));\
+} while (0)
+
+struct task_struct;
+struct notifier_block;
+struct perf_event;
+struct pmu;
+
+extern int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+				  int *gen_len, int *gen_type);
+extern int arch_check_bp_in_kernelspace(struct perf_event *bp);
+extern int arch_validate_hwbkpt_settings(struct perf_event *bp);
+extern int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+					   unsigned long val, void *data);
+
+extern int arch_install_hw_breakpoint(struct perf_event *bp);
+extern void arch_uninstall_hw_breakpoint(struct perf_event *bp);
+extern void hw_breakpoint_pmu_read(struct perf_event *bp);
+extern int hw_breakpoint_slots(int type);
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+extern void hw_breakpoint_thread_switch(struct task_struct *next);
+extern void ptrace_hw_copy_thread(struct task_struct *task);
+#else
+static inline void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+}
+static inline void ptrace_hw_copy_thread(struct task_struct *task)
+{
+}
+#endif
+
+extern struct pmu perf_ops_bp;
+
+#endif	/* __KERNEL__ */
+#endif	/* __ASM_HW_BREAKPOINT_H */
diff --git a/arch/arm64/kernel/debug-monitors.c b/arch/arm64/kernel/debug-monitors.c
new file mode 100644
index 0000000..0c3ba9f
--- /dev/null
+++ b/arch/arm64/kernel/debug-monitors.c
@@ -0,0 +1,288 @@
+/*
+ * ARMv8 single-step debug support and mdscr context switching.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/cpu.h>
+#include <linux/debugfs.h>
+#include <linux/hardirq.h>
+#include <linux/init.h>
+#include <linux/ptrace.h>
+#include <linux/stat.h>
+
+#include <asm/debug-monitors.h>
+#include <asm/local.h>
+#include <asm/cputype.h>
+#include <asm/system_misc.h>
+
+/* Low-level stepping controls. */
+#define DBG_MDSCR_SS		(1 << 0)
+#define DBG_SPSR_SS		(1 << 21)
+
+/* MDSCR_EL1 enabling bits */
+#define DBG_MDSCR_KDE		(1 << 13)
+#define DBG_MDSCR_MDE		(1 << 15)
+#define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
+
+/* Determine debug architecture. */
+u8 debug_monitors_arch(void)
+{
+	return read_cpuid(ID_AA64DFR0_EL1) & 0xf;
+}
+
+/*
+ * MDSCR access routines.
+ */
+static void mdscr_write(u32 mdscr)
+{
+	unsigned long flags;
+	local_dbg_save(flags);
+	asm volatile("msr mdscr_el1, %0" :: "r" (mdscr));
+	local_dbg_restore(flags);
+}
+
+static u32 mdscr_read(void)
+{
+	u32 mdscr;
+	asm volatile("mrs %0, mdscr_el1" : "=r" (mdscr));
+	return mdscr;
+}
+
+/*
+ * Allow root to disable self-hosted debug from userspace.
+ * This is useful if you want to connect an external JTAG debugger.
+ */
+static u32 debug_enabled = 1;
+
+static int create_debug_debugfs_entry(void)
+{
+	debugfs_create_bool("debug_enabled", 0644, NULL, &debug_enabled);
+	return 0;
+}
+fs_initcall(create_debug_debugfs_entry);
+
+static int __init early_debug_disable(char *buf)
+{
+	debug_enabled = 0;
+	return 0;
+}
+
+early_param("nodebugmon", early_debug_disable);
+
+/*
+ * Keep track of debug users on each core.
+ * The ref counts are per-cpu so we use a local_t type.
+ */
+static DEFINE_PER_CPU(local_t, mde_ref_count);
+static DEFINE_PER_CPU(local_t, kde_ref_count);
+
+void enable_debug_monitors(enum debug_el el)
+{
+	u32 mdscr, enable = 0;
+
+	WARN_ON(preemptible());
+
+	if (local_inc_return(&__get_cpu_var(mde_ref_count)) == 1)
+		enable = DBG_MDSCR_MDE;
+
+	if (el == DBG_ACTIVE_EL1 &&
+	    local_inc_return(&__get_cpu_var(kde_ref_count)) == 1)
+		enable |= DBG_MDSCR_KDE;
+
+	if (enable && debug_enabled) {
+		mdscr = mdscr_read();
+		mdscr |= enable;
+		mdscr_write(mdscr);
+	}
+}
+
+void disable_debug_monitors(enum debug_el el)
+{
+	u32 mdscr, disable = 0;
+
+	WARN_ON(preemptible());
+
+	if (local_dec_and_test(&__get_cpu_var(mde_ref_count)))
+		disable = ~DBG_MDSCR_MDE;
+
+	if (el == DBG_ACTIVE_EL1 &&
+	    local_dec_and_test(&__get_cpu_var(kde_ref_count)))
+		disable &= ~DBG_MDSCR_KDE;
+
+	if (disable) {
+		mdscr = mdscr_read();
+		mdscr &= disable;
+		mdscr_write(mdscr);
+	}
+}
+
+/*
+ * OS lock clearing.
+ */
+static void clear_os_lock(void *unused)
+{
+	asm volatile("msr mdscr_el1, %0" : : "r" (0));
+	isb();
+	asm volatile("msr oslar_el1, %0" : : "r" (0));
+	isb();
+}
+
+static int __cpuinit os_lock_notify(struct notifier_block *self,
+				    unsigned long action, void *data)
+{
+	int cpu = (unsigned long)data;
+	if (action == CPU_ONLINE)
+		smp_call_function_single(cpu, clear_os_lock, NULL, 1);
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata os_lock_nb = {
+	.notifier_call = os_lock_notify,
+};
+
+static int __cpuinit debug_monitors_init(void)
+{
+	/* Clear the OS lock. */
+	smp_call_function(clear_os_lock, NULL, 1);
+	clear_os_lock(NULL);
+
+	/* Register hotplug handler. */
+	register_cpu_notifier(&os_lock_nb);
+	return 0;
+}
+postcore_initcall(debug_monitors_init);
+
+/*
+ * Single step API and exception handling.
+ */
+static void set_regs_spsr_ss(struct pt_regs *regs)
+{
+	unsigned long spsr;
+
+	spsr = regs->pstate;
+	spsr &= ~DBG_SPSR_SS;
+	spsr |= DBG_SPSR_SS;
+	regs->pstate = spsr;
+}
+
+static void clear_regs_spsr_ss(struct pt_regs *regs)
+{
+	unsigned long spsr;
+
+	spsr = regs->pstate;
+	spsr &= ~DBG_SPSR_SS;
+	regs->pstate = spsr;
+}
+
+static int single_step_handler(unsigned long addr, unsigned int esr,
+			       struct pt_regs *regs)
+{
+	siginfo_t info;
+
+	/*
+	 * If we are stepping a pending breakpoint, call the hw_breakpoint
+	 * handler first.
+	 */
+	if (!reinstall_suspended_bps(regs))
+		return 0;
+
+	if (user_mode(regs)) {
+		info.si_signo = SIGTRAP;
+		info.si_errno = 0;
+		info.si_code  = TRAP_HWBKPT;
+		info.si_addr  = (void __user *)instruction_pointer(regs);
+		force_sig_info(SIGTRAP, &info, current);
+
+		/*
+		 * ptrace will disable single step unless explicitly
+		 * asked to re-enable it. For other clients, it makes
+		 * sense to leave it enabled (i.e. rewind the controls
+		 * to the active-not-pending state).
+		 */
+		user_rewind_single_step(current);
+	} else {
+		/* TODO: route to KGDB */
+		pr_warning("Unexpected kernel single-step exception at EL1\n");
+		/*
+		 * Re-enable stepping since we know that we will be
+		 * returning to regs.
+		 */
+		set_regs_spsr_ss(regs);
+	}
+
+	return 0;
+}
+
+static int __init single_step_init(void)
+{
+	hook_debug_fault_code(DBG_ESR_EVT_HWSS, single_step_handler, SIGTRAP,
+			      TRAP_HWBKPT, "single-step handler");
+	return 0;
+}
+arch_initcall(single_step_init);
+
+/* Re-enable single step for syscall restarting. */
+void user_rewind_single_step(struct task_struct *task)
+{
+	/*
+	 * If single step is active for this thread, then set SPSR.SS
+	 * to 1 to avoid returning to the active-pending state.
+	 */
+	if (test_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP))
+		set_regs_spsr_ss(task_pt_regs(task));
+}
+
+void user_fastforward_single_step(struct task_struct *task)
+{
+	if (test_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP))
+		clear_regs_spsr_ss(task_pt_regs(task));
+}
+
+/* Kernel API */
+void kernel_enable_single_step(struct pt_regs *regs)
+{
+	WARN_ON(!irqs_disabled());
+	set_regs_spsr_ss(regs);
+	mdscr_write(mdscr_read() | DBG_MDSCR_SS);
+	enable_debug_monitors(DBG_ACTIVE_EL1);
+}
+
+void kernel_disable_single_step(void)
+{
+	WARN_ON(!irqs_disabled());
+	mdscr_write(mdscr_read() & ~DBG_MDSCR_SS);
+	disable_debug_monitors(DBG_ACTIVE_EL1);
+}
+
+int kernel_active_single_step(void)
+{
+	WARN_ON(!irqs_disabled());
+	return mdscr_read() & DBG_MDSCR_SS;
+}
+
+/* ptrace API */
+void user_enable_single_step(struct task_struct *task)
+{
+	set_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP);
+	set_regs_spsr_ss(task_pt_regs(task));
+}
+
+void user_disable_single_step(struct task_struct *task)
+{
+	clear_ti_thread_flag(task_thread_info(task), TIF_SINGLESTEP);
+}
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
new file mode 100644
index 0000000..5ab825c
--- /dev/null
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -0,0 +1,880 @@
+/*
+ * HW_breakpoint: a unified kernel/user-space hardware breakpoint facility,
+ * using the CPU's debug registers.
+ *
+ * Copyright (C) 2012 ARM Limited
+ * Author: Will Deacon <will.deacon@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#define pr_fmt(fmt) "hw-breakpoint: " fmt
+
+#include <linux/errno.h>
+#include <linux/hw_breakpoint.h>
+#include <linux/perf_event.h>
+#include <linux/ptrace.h>
+#include <linux/smp.h>
+
+#include <asm/compat.h>
+#include <asm/current.h>
+#include <asm/debug-monitors.h>
+#include <asm/hw_breakpoint.h>
+#include <asm/kdebug.h>
+#include <asm/traps.h>
+#include <asm/cputype.h>
+#include <asm/system_misc.h>
+
+/* Breakpoint currently in use for each BRP. */
+static DEFINE_PER_CPU(struct perf_event *, bp_on_reg[ARM_MAX_BRP]);
+
+/* Watchpoint currently in use for each WRP. */
+static DEFINE_PER_CPU(struct perf_event *, wp_on_reg[ARM_MAX_WRP]);
+
+/* Currently stepping a per-CPU kernel breakpoint. */
+static DEFINE_PER_CPU(int, stepping_kernel_bp);
+
+/* Number of BRP/WRP registers on this CPU. */
+static int core_num_brps;
+static int core_num_wrps;
+
+/* Determine number of BRP registers available. */
+static int get_num_brps(void)
+{
+	return ((read_cpuid(ID_AA64DFR0_EL1) >> 12) & 0xf) + 1;
+}
+
+/* Determine number of WRP registers available. */
+static int get_num_wrps(void)
+{
+	return ((read_cpuid(ID_AA64DFR0_EL1) >> 20) & 0xf) + 1;
+}
+
+int hw_breakpoint_slots(int type)
+{
+	/*
+	 * We can be called early, so don't rely on
+	 * our static variables being initialised.
+	 */
+	switch (type) {
+	case TYPE_INST:
+		return get_num_brps();
+	case TYPE_DATA:
+		return get_num_wrps();
+	default:
+		pr_warning("unknown slot type: %d\n", type);
+		return 0;
+	}
+}
+
+#define READ_WB_REG_CASE(OFF, N, REG, VAL)	\
+	case (OFF + N):				\
+		AARCH64_DBG_READ(N, REG, VAL);	\
+		break
+
+#define WRITE_WB_REG_CASE(OFF, N, REG, VAL)	\
+	case (OFF + N):				\
+		AARCH64_DBG_WRITE(N, REG, VAL);	\
+		break
+
+#define GEN_READ_WB_REG_CASES(OFF, REG, VAL)	\
+	READ_WB_REG_CASE(OFF,  0, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  1, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  2, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  3, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  4, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  5, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  6, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  7, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  8, REG, VAL);	\
+	READ_WB_REG_CASE(OFF,  9, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 10, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 11, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 12, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 13, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 14, REG, VAL);	\
+	READ_WB_REG_CASE(OFF, 15, REG, VAL)
+
+#define GEN_WRITE_WB_REG_CASES(OFF, REG, VAL)	\
+	WRITE_WB_REG_CASE(OFF,  0, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  1, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  2, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  3, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  4, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  5, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  6, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  7, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  8, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF,  9, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 10, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 11, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 12, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 13, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 14, REG, VAL);	\
+	WRITE_WB_REG_CASE(OFF, 15, REG, VAL)
+
+static u64 read_wb_reg(int reg, int n)
+{
+	u64 val = 0;
+
+	switch (reg + n) {
+	GEN_READ_WB_REG_CASES(AARCH64_DBG_REG_BVR, AARCH64_DBG_REG_NAME_BVR, val);
+	GEN_READ_WB_REG_CASES(AARCH64_DBG_REG_BCR, AARCH64_DBG_REG_NAME_BCR, val);
+	GEN_READ_WB_REG_CASES(AARCH64_DBG_REG_WVR, AARCH64_DBG_REG_NAME_WVR, val);
+	GEN_READ_WB_REG_CASES(AARCH64_DBG_REG_WCR, AARCH64_DBG_REG_NAME_WCR, val);
+	default:
+		pr_warning("attempt to read from unknown breakpoint register %d\n", n);
+	}
+
+	return val;
+}
+
+static void write_wb_reg(int reg, int n, u64 val)
+{
+	switch (reg + n) {
+	GEN_WRITE_WB_REG_CASES(AARCH64_DBG_REG_BVR, AARCH64_DBG_REG_NAME_BVR, val);
+	GEN_WRITE_WB_REG_CASES(AARCH64_DBG_REG_BCR, AARCH64_DBG_REG_NAME_BCR, val);
+	GEN_WRITE_WB_REG_CASES(AARCH64_DBG_REG_WVR, AARCH64_DBG_REG_NAME_WVR, val);
+	GEN_WRITE_WB_REG_CASES(AARCH64_DBG_REG_WCR, AARCH64_DBG_REG_NAME_WCR, val);
+	default:
+		pr_warning("attempt to write to unknown breakpoint register %d\n", n);
+	}
+	isb();
+}
+
+/*
+ * Convert a breakpoint privilege level to the corresponding exception
+ * level.
+ */
+static enum debug_el debug_exception_level(int privilege)
+{
+	switch (privilege) {
+	case AARCH64_BREAKPOINT_EL0:
+		return DBG_ACTIVE_EL0;
+	case AARCH64_BREAKPOINT_EL1:
+		return DBG_ACTIVE_EL1;
+	default:
+		pr_warning("invalid breakpoint privilege level %d\n", privilege);
+		return -EINVAL;
+	}
+}
+
+/*
+ * Install a perf counter breakpoint.
+ */
+int arch_install_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot, **slots;
+	struct debug_info *debug_info = &current->thread.debug;
+	int i, max_slots, ctrl_reg, val_reg, reg_enable;
+	u32 ctrl;
+
+	if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
+		/* Breakpoint */
+		ctrl_reg = AARCH64_DBG_REG_BCR;
+		val_reg = AARCH64_DBG_REG_BVR;
+		slots = __get_cpu_var(bp_on_reg);
+		max_slots = core_num_brps;
+		reg_enable = !debug_info->bps_disabled;
+	} else {
+		/* Watchpoint */
+		ctrl_reg = AARCH64_DBG_REG_WCR;
+		val_reg = AARCH64_DBG_REG_WVR;
+		slots = __get_cpu_var(wp_on_reg);
+		max_slots = core_num_wrps;
+		reg_enable = !debug_info->wps_disabled;
+	}
+
+	for (i = 0; i < max_slots; ++i) {
+		slot = &slots[i];
+
+		if (!*slot) {
+			*slot = bp;
+			break;
+		}
+	}
+
+	if (WARN_ONCE(i == max_slots, "Can't find any breakpoint slot"))
+		return -ENOSPC;
+
+	/* Ensure debug monitors are enabled at the correct exception level.  */
+	enable_debug_monitors(debug_exception_level(info->ctrl.privilege));
+
+	/* Setup the address register. */
+	write_wb_reg(val_reg, i, info->address);
+
+	/* Setup the control register. */
+	ctrl = encode_ctrl_reg(info->ctrl);
+	write_wb_reg(ctrl_reg, i, reg_enable ? ctrl | 0x1 : ctrl & ~0x1);
+
+	return 0;
+}
+
+void arch_uninstall_hw_breakpoint(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	struct perf_event **slot, **slots;
+	int i, max_slots, base;
+
+	if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
+		/* Breakpoint */
+		base = AARCH64_DBG_REG_BCR;
+		slots = __get_cpu_var(bp_on_reg);
+		max_slots = core_num_brps;
+	} else {
+		/* Watchpoint */
+		base = AARCH64_DBG_REG_WCR;
+		slots = __get_cpu_var(wp_on_reg);
+		max_slots = core_num_wrps;
+	}
+
+	/* Remove the breakpoint. */
+	for (i = 0; i < max_slots; ++i) {
+		slot = &slots[i];
+
+		if (*slot == bp) {
+			*slot = NULL;
+			break;
+		}
+	}
+
+	if (WARN_ONCE(i == max_slots, "Can't find any breakpoint slot"))
+		return;
+
+	/* Reset the control register. */
+	write_wb_reg(base, i, 0);
+
+	/* Release the debug monitors for the correct exception level.  */
+	disable_debug_monitors(debug_exception_level(info->ctrl.privilege));
+}
+
+static int get_hbp_len(u8 hbp_len)
+{
+	unsigned int len_in_bytes = 0;
+
+	switch (hbp_len) {
+	case ARM_BREAKPOINT_LEN_1:
+		len_in_bytes = 1;
+		break;
+	case ARM_BREAKPOINT_LEN_2:
+		len_in_bytes = 2;
+		break;
+	case ARM_BREAKPOINT_LEN_4:
+		len_in_bytes = 4;
+		break;
+	case ARM_BREAKPOINT_LEN_8:
+		len_in_bytes = 8;
+		break;
+	}
+
+	return len_in_bytes;
+}
+
+/*
+ * Check whether bp virtual address is in kernel space.
+ */
+int arch_check_bp_in_kernelspace(struct perf_event *bp)
+{
+	unsigned int len;
+	unsigned long va;
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	va = info->address;
+	len = get_hbp_len(info->ctrl.len);
+
+	return (va >= TASK_SIZE) && ((va + len - 1) >= TASK_SIZE);
+}
+
+/*
+ * Extract generic type and length encodings from an arch_hw_breakpoint_ctrl.
+ * Hopefully this will disappear when ptrace can bypass the conversion
+ * to generic breakpoint descriptions.
+ */
+int arch_bp_generic_fields(struct arch_hw_breakpoint_ctrl ctrl,
+			   int *gen_len, int *gen_type)
+{
+	/* Type */
+	switch (ctrl.type) {
+	case ARM_BREAKPOINT_EXECUTE:
+		*gen_type = HW_BREAKPOINT_X;
+		break;
+	case ARM_BREAKPOINT_LOAD:
+		*gen_type = HW_BREAKPOINT_R;
+		break;
+	case ARM_BREAKPOINT_STORE:
+		*gen_type = HW_BREAKPOINT_W;
+		break;
+	case ARM_BREAKPOINT_LOAD | ARM_BREAKPOINT_STORE:
+		*gen_type = HW_BREAKPOINT_RW;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* Len */
+	switch (ctrl.len) {
+	case ARM_BREAKPOINT_LEN_1:
+		*gen_len = HW_BREAKPOINT_LEN_1;
+		break;
+	case ARM_BREAKPOINT_LEN_2:
+		*gen_len = HW_BREAKPOINT_LEN_2;
+		break;
+	case ARM_BREAKPOINT_LEN_4:
+		*gen_len = HW_BREAKPOINT_LEN_4;
+		break;
+	case ARM_BREAKPOINT_LEN_8:
+		*gen_len = HW_BREAKPOINT_LEN_8;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	return 0;
+}
+
+/*
+ * Construct an arch_hw_breakpoint from a perf_event.
+ */
+static int arch_build_bp_info(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+
+	/* Type */
+	switch (bp->attr.bp_type) {
+	case HW_BREAKPOINT_X:
+		info->ctrl.type = ARM_BREAKPOINT_EXECUTE;
+		break;
+	case HW_BREAKPOINT_R:
+		info->ctrl.type = ARM_BREAKPOINT_LOAD;
+		break;
+	case HW_BREAKPOINT_W:
+		info->ctrl.type = ARM_BREAKPOINT_STORE;
+		break;
+	case HW_BREAKPOINT_RW:
+		info->ctrl.type = ARM_BREAKPOINT_LOAD | ARM_BREAKPOINT_STORE;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/* Len */
+	switch (bp->attr.bp_len) {
+	case HW_BREAKPOINT_LEN_1:
+		info->ctrl.len = ARM_BREAKPOINT_LEN_1;
+		break;
+	case HW_BREAKPOINT_LEN_2:
+		info->ctrl.len = ARM_BREAKPOINT_LEN_2;
+		break;
+	case HW_BREAKPOINT_LEN_4:
+		info->ctrl.len = ARM_BREAKPOINT_LEN_4;
+		break;
+	case HW_BREAKPOINT_LEN_8:
+		info->ctrl.len = ARM_BREAKPOINT_LEN_8;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	/*
+	 * On AArch64, we only permit breakpoints of length 4, whereas
+	 * AArch32 also requires breakpoints of length 2 for Thumb.
+	 * Watchpoints can be of length 1, 2, 4 or 8 bytes.
+	 */
+	if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE) {
+		if (is_compat_task()) {
+			if (info->ctrl.len != ARM_BREAKPOINT_LEN_2 &&
+			    info->ctrl.len != ARM_BREAKPOINT_LEN_4)
+				return -EINVAL;
+		} else if (info->ctrl.len != ARM_BREAKPOINT_LEN_4) {
+			/*
+			 * FIXME: Some tools (I'm looking at you perf) assume
+			 *	  that breakpoints should be sizeof(long). This
+			 *	  is nonsense. For now, we fix up the parameter
+			 *	  but we should probably return -EINVAL instead.
+			 */
+			info->ctrl.len = ARM_BREAKPOINT_LEN_4;
+		}
+	}
+
+	/* Address */
+	info->address = bp->attr.bp_addr;
+
+	/*
+	 * Privilege
+	 * Note that we disallow combined EL0/EL1 breakpoints because
+	 * that would complicate the stepping code.
+	 */
+	if (arch_check_bp_in_kernelspace(bp))
+		info->ctrl.privilege = AARCH64_BREAKPOINT_EL1;
+	else
+		info->ctrl.privilege = AARCH64_BREAKPOINT_EL0;
+
+	/* Enabled? */
+	info->ctrl.enabled = !bp->attr.disabled;
+
+	return 0;
+}
+
+/*
+ * Validate the arch-specific HW Breakpoint register settings.
+ */
+int arch_validate_hwbkpt_settings(struct perf_event *bp)
+{
+	struct arch_hw_breakpoint *info = counter_arch_bp(bp);
+	int ret;
+	u64 alignment_mask, offset;
+
+	/* Build the arch_hw_breakpoint. */
+	ret = arch_build_bp_info(bp);
+	if (ret)
+		return ret;
+
+	/*
+	 * Check address alignment.
+	 * We don't do any clever alignment correction for watchpoints
+	 * because using 64-bit unaligned addresses is deprecated for
+	 * AArch64.
+	 *
+	 * AArch32 tasks expect some simple alignment fixups, so emulate
+	 * that here.
+	 */
+	if (is_compat_task()) {
+		if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
+			alignment_mask = 0x7;
+		else
+			alignment_mask = 0x3;
+		offset = info->address & alignment_mask;
+		switch (offset) {
+		case 0:
+			/* Aligned */
+			break;
+		case 1:
+			/* Allow single byte watchpoint. */
+			if (info->ctrl.len == ARM_BREAKPOINT_LEN_1)
+				break;
+		case 2:
+			/* Allow halfword watchpoints and breakpoints. */
+			if (info->ctrl.len == ARM_BREAKPOINT_LEN_2)
+				break;
+		default:
+			return -EINVAL;
+		}
+
+		info->address &= ~alignment_mask;
+		info->ctrl.len <<= offset;
+	} else {
+		if (info->ctrl.type == ARM_BREAKPOINT_EXECUTE)
+			alignment_mask = 0x3;
+		else
+			alignment_mask = 0x7;
+		if (info->address & alignment_mask)
+			return -EINVAL;
+	}
+
+	/*
+	 * Disallow per-task kernel breakpoints since these would
+	 * complicate the stepping code.
+	 */
+	if (info->ctrl.privilege == AARCH64_BREAKPOINT_EL1 && bp->hw.bp_target)
+		return -EINVAL;
+
+	return 0;
+}
+
+/*
+ * Enable/disable all of the breakpoints active at the specified
+ * exception level at the register level.
+ * This is used when single-stepping after a breakpoint exception.
+ */
+static void toggle_bp_registers(int reg, enum debug_el el, int enable)
+{
+	int i, max_slots, privilege;
+	u32 ctrl;
+	struct perf_event **slots;
+
+	switch (reg) {
+	case AARCH64_DBG_REG_BCR:
+		slots = __get_cpu_var(bp_on_reg);
+		max_slots = core_num_brps;
+		break;
+	case AARCH64_DBG_REG_WCR:
+		slots = __get_cpu_var(wp_on_reg);
+		max_slots = core_num_wrps;
+		break;
+	default:
+		return;
+	}
+
+	for (i = 0; i < max_slots; ++i) {
+		if (!slots[i])
+			continue;
+
+		privilege = counter_arch_bp(slots[i])->ctrl.privilege;
+		if (debug_exception_level(privilege) != el)
+			continue;
+
+		ctrl = read_wb_reg(reg, i);
+		if (enable)
+			ctrl |= 0x1;
+		else
+			ctrl &= ~0x1;
+		write_wb_reg(reg, i, ctrl);
+	}
+}
+
+/*
+ * Debug exception handlers.
+ */
+static int breakpoint_handler(unsigned long unused, unsigned int esr,
+			      struct pt_regs *regs)
+{
+	int i, step = 0, *kernel_step;
+	u32 ctrl_reg;
+	u64 addr, val;
+	struct perf_event *bp, **slots;
+	struct debug_info *debug_info;
+	struct arch_hw_breakpoint_ctrl ctrl;
+
+	slots = (struct perf_event **)__get_cpu_var(bp_on_reg);
+	addr = instruction_pointer(regs);
+	debug_info = &current->thread.debug;
+
+	for (i = 0; i < core_num_brps; ++i) {
+		rcu_read_lock();
+
+		bp = slots[i];
+
+		if (bp == NULL)
+			goto unlock;
+
+		/* Check if the breakpoint value matches. */
+		val = read_wb_reg(AARCH64_DBG_REG_BVR, i);
+		if (val != (addr & ~0x3))
+			goto unlock;
+
+		/* Possible match, check the byte address select to confirm. */
+		ctrl_reg = read_wb_reg(AARCH64_DBG_REG_BCR, i);
+		decode_ctrl_reg(ctrl_reg, &ctrl);
+		if (!((1 << (addr & 0x3)) & ctrl.len))
+			goto unlock;
+
+		counter_arch_bp(bp)->trigger = addr;
+		perf_bp_event(bp, regs);
+
+		/* Do we need to handle the stepping? */
+		if (!bp->overflow_handler)
+			step = 1;
+unlock:
+		rcu_read_unlock();
+	}
+
+	if (!step)
+		return 0;
+
+	if (user_mode(regs)) {
+		debug_info->bps_disabled = 1;
+		toggle_bp_registers(AARCH64_DBG_REG_BCR, DBG_ACTIVE_EL0, 0);
+
+		/* If we're already stepping a watchpoint, just return. */
+		if (debug_info->wps_disabled)
+			return 0;
+
+		if (test_thread_flag(TIF_SINGLESTEP))
+			debug_info->suspended_step = 1;
+		else
+			user_enable_single_step(current);
+	} else {
+		toggle_bp_registers(AARCH64_DBG_REG_BCR, DBG_ACTIVE_EL1, 0);
+		kernel_step = &__get_cpu_var(stepping_kernel_bp);
+
+		if (*kernel_step != ARM_KERNEL_STEP_NONE)
+			return 0;
+
+		if (kernel_active_single_step()) {
+			*kernel_step = ARM_KERNEL_STEP_SUSPEND;
+		} else {
+			*kernel_step = ARM_KERNEL_STEP_ACTIVE;
+			kernel_enable_single_step(regs);
+		}
+	}
+
+	return 0;
+}
+
+static int watchpoint_handler(unsigned long addr, unsigned int esr,
+			      struct pt_regs *regs)
+{
+	int i, step = 0, *kernel_step, access;
+	u32 ctrl_reg;
+	u64 val, alignment_mask;
+	struct perf_event *wp, **slots;
+	struct debug_info *debug_info;
+	struct arch_hw_breakpoint *info;
+	struct arch_hw_breakpoint_ctrl ctrl;
+
+	slots = (struct perf_event **)__get_cpu_var(wp_on_reg);
+	debug_info = &current->thread.debug;
+
+	for (i = 0; i < core_num_wrps; ++i) {
+		rcu_read_lock();
+
+		wp = slots[i];
+
+		if (wp == NULL)
+			goto unlock;
+
+		info = counter_arch_bp(wp);
+		/* AArch32 watchpoints are either 4-byte or 8-byte aligned. */
+		if (is_compat_task()) {
+			if (info->ctrl.len == ARM_BREAKPOINT_LEN_8)
+				alignment_mask = 0x7;
+			else
+				alignment_mask = 0x3;
+		} else {
+			alignment_mask = 0x7;
+		}
+
+		/* Check if the watchpoint value matches. */
+		val = read_wb_reg(AARCH64_DBG_REG_WVR, i);
+		if (val != (addr & ~alignment_mask))
+			goto unlock;
+
+		/* Possible match, check the byte address select to confirm. */
+		ctrl_reg = read_wb_reg(AARCH64_DBG_REG_WCR, i);
+		decode_ctrl_reg(ctrl_reg, &ctrl);
+		if (!((1 << (addr & alignment_mask)) & ctrl.len))
+			goto unlock;
+
+		/*
+		 * Check that the access type matches.
+		 * 0 => load, otherwise => store
+		 */
+		access = (esr & AARCH64_ESR_ACCESS_MASK) ? HW_BREAKPOINT_W :
+			 HW_BREAKPOINT_R;
+		if (!(access & hw_breakpoint_type(wp)))
+			goto unlock;
+
+		info->trigger = addr;
+		perf_bp_event(wp, regs);
+
+		/* Do we need to handle the stepping? */
+		if (!wp->overflow_handler)
+			step = 1;
+
+unlock:
+		rcu_read_unlock();
+	}
+
+	if (!step)
+		return 0;
+
+	/*
+	 * We always disable EL0 watchpoints because the kernel can
+	 * cause these to fire via an unprivileged access.
+	 */
+	toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL0, 0);
+
+	if (user_mode(regs)) {
+		debug_info->wps_disabled = 1;
+
+		/* If we're already stepping a breakpoint, just return. */
+		if (debug_info->bps_disabled)
+			return 0;
+
+		if (test_thread_flag(TIF_SINGLESTEP))
+			debug_info->suspended_step = 1;
+		else
+			user_enable_single_step(current);
+	} else {
+		toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL1, 0);
+		kernel_step = &__get_cpu_var(stepping_kernel_bp);
+
+		if (*kernel_step != ARM_KERNEL_STEP_NONE)
+			return 0;
+
+		if (kernel_active_single_step()) {
+			*kernel_step = ARM_KERNEL_STEP_SUSPEND;
+		} else {
+			*kernel_step = ARM_KERNEL_STEP_ACTIVE;
+			kernel_enable_single_step(regs);
+		}
+	}
+
+	return 0;
+}
+
+/*
+ * Handle single-step exception.
+ */
+int reinstall_suspended_bps(struct pt_regs *regs)
+{
+	struct debug_info *debug_info = &current->thread.debug;
+	int handled_exception = 0, *kernel_step;
+
+	kernel_step = &__get_cpu_var(stepping_kernel_bp);
+
+	/*
+	 * Called from single-step exception handler.
+	 * Return 0 if execution can resume, 1 if a SIGTRAP should be
+	 * reported.
+	 */
+	if (user_mode(regs)) {
+		if (debug_info->bps_disabled) {
+			debug_info->bps_disabled = 0;
+			toggle_bp_registers(AARCH64_DBG_REG_BCR, DBG_ACTIVE_EL0, 1);
+			handled_exception = 1;
+		}
+
+		if (debug_info->wps_disabled) {
+			debug_info->wps_disabled = 0;
+			toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL0, 1);
+			handled_exception = 1;
+		}
+
+		if (handled_exception) {
+			if (debug_info->suspended_step) {
+				debug_info->suspended_step = 0;
+				/* Allow exception handling to fall-through. */
+				handled_exception = 0;
+			} else {
+				user_disable_single_step(current);
+			}
+		}
+	} else if (*kernel_step != ARM_KERNEL_STEP_NONE) {
+		toggle_bp_registers(AARCH64_DBG_REG_BCR, DBG_ACTIVE_EL1, 1);
+		toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL1, 1);
+
+		if (!debug_info->wps_disabled)
+			toggle_bp_registers(AARCH64_DBG_REG_WCR, DBG_ACTIVE_EL0, 1);
+
+		if (*kernel_step != ARM_KERNEL_STEP_SUSPEND) {
+			kernel_disable_single_step();
+			handled_exception = 1;
+		} else {
+			handled_exception = 0;
+		}
+
+		*kernel_step = ARM_KERNEL_STEP_NONE;
+	}
+
+	return !handled_exception;
+}
+
+/*
+ * Context-switcher for restoring suspended breakpoints.
+ */
+void hw_breakpoint_thread_switch(struct task_struct *next)
+{
+	/*
+	 *           current        next
+	 * disabled: 0              0     => The usual case, nothing to do
+	 *           0              1     => Disable the registers
+	 *           1              0     => Enable the registers
+	 *           1              1     => Nothing to do. per-task bps will
+	 *                                   get taken care of by perf.
+	 */
+
+	struct debug_info *current_debug_info, *next_debug_info;
+
+	current_debug_info = &current->thread.debug;
+	next_debug_info = &next->thread.debug;
+
+	/* Update breakpoints. */
+	if (current_debug_info->bps_disabled != next_debug_info->bps_disabled)
+		toggle_bp_registers(AARCH64_DBG_REG_BCR,
+				    DBG_ACTIVE_EL0,
+				    !next_debug_info->bps_disabled);
+
+	/* Update watchpoints. */
+	if (current_debug_info->wps_disabled != next_debug_info->wps_disabled)
+		toggle_bp_registers(AARCH64_DBG_REG_WCR,
+				    DBG_ACTIVE_EL0,
+				    !next_debug_info->wps_disabled);
+}
+
+/*
+ * CPU initialisation.
+ */
+static void reset_ctrl_regs(void *unused)
+{
+	int i;
+
+	for (i = 0; i < core_num_brps; ++i) {
+		write_wb_reg(AARCH64_DBG_REG_BCR, i, 0UL);
+		write_wb_reg(AARCH64_DBG_REG_BVR, i, 0UL);
+	}
+
+	for (i = 0; i < core_num_wrps; ++i) {
+		write_wb_reg(AARCH64_DBG_REG_WCR, i, 0UL);
+		write_wb_reg(AARCH64_DBG_REG_WVR, i, 0UL);
+	}
+}
+
+static int __cpuinit hw_breakpoint_reset_notify(struct notifier_block *self,
+						unsigned long action,
+						void *hcpu)
+{
+	int cpu = (long)hcpu;
+	if (action == CPU_ONLINE)
+		smp_call_function_single(cpu, reset_ctrl_regs, NULL, 1);
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata hw_breakpoint_reset_nb = {
+	.notifier_call = hw_breakpoint_reset_notify,
+};
+
+/*
+ * One-time initialisation.
+ */
+static int __init arch_hw_breakpoint_init(void)
+{
+	core_num_brps = get_num_brps();
+	core_num_wrps = get_num_wrps();
+
+	pr_info("found %d breakpoint and %d watchpoint registers.\n",
+		core_num_brps, core_num_wrps);
+
+	/*
+	 * Reset the breakpoint resources. We assume that a halting
+	 * debugger will leave the world in a nice state for us.
+	 */
+	smp_call_function(reset_ctrl_regs, NULL, 1);
+	reset_ctrl_regs(NULL);
+
+	/* Register debug fault handlers. */
+	hook_debug_fault_code(DBG_ESR_EVT_HWBP, breakpoint_handler, SIGTRAP,
+			      TRAP_HWBKPT, "hw-breakpoint handler");
+	hook_debug_fault_code(DBG_ESR_EVT_HWWP, watchpoint_handler, SIGTRAP,
+			      TRAP_HWBKPT, "hw-watchpoint handler");
+
+	/* Register hotplug notifier. */
+	register_cpu_notifier(&hw_breakpoint_reset_nb);
+
+	return 0;
+}
+arch_initcall(arch_hw_breakpoint_init);
+
+void hw_breakpoint_pmu_read(struct perf_event *bp)
+{
+}
+
+/*
+ * Dummy function to register with die_notifier.
+ */
+int hw_breakpoint_exceptions_notify(struct notifier_block *unused,
+				    unsigned long val, void *data)
+{
+	return NOTIFY_DONE;
+}
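
Note on the compat alignment fixup and the handlers' match test above: the
following is a minimal user-space sketch, not part of the patch. It assumes
the ARM_BREAKPOINT_LEN_* values are byte-address-select masks (0x1, 0x3,
0xf, 0xff), which is not shown in this hunk; struct bp_sketch,
compat_fixup() and hits() are hypothetical names used only for
illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed byte-address-select encodings (not shown in this hunk). */
enum { LEN_1 = 0x1, LEN_2 = 0x3, LEN_4 = 0xf, LEN_8 = 0xff };

struct bp_sketch {
	uint64_t address;	/* value programmed into the value register */
	uint32_t len;		/* byte-address-select mask */
};

/*
 * Mirrors the compat branch of arch_validate_hwbkpt_settings().
 * Returns 0 on success, -1 where the kernel code returns -EINVAL.
 */
static int compat_fixup(struct bp_sketch *bp)
{
	uint64_t mask = (bp->len == LEN_8) ? 0x7 : 0x3;
	uint64_t offset = bp->address & mask;

	switch (offset) {
	case 0:
		break;			/* already aligned */
	case 1:
		if (bp->len == LEN_1)
			break;
		/* fall through, as in the patch */
	case 2:
		if (bp->len == LEN_2)
			break;
		/* fall through */
	default:
		return -1;
	}

	/* Align the address and move the select mask up by the offset. */
	bp->address &= ~mask;
	bp->len <<= offset;
	assert((bp->address & mask) == 0);
	return 0;
}

/* Mirrors the handlers: compare the aligned value, then test the select bit. */
static int hits(const struct bp_sketch *bp, uint64_t addr, uint64_t mask)
{
	if (bp->address != (addr & ~mask))
		return 0;
	return ((1u << (addr & mask)) & bp->len) != 0;
}
```

Under these assumptions, a halfword watchpoint requested at 0x1002 ends up
as address 0x1000 with select mask 0xc, so an access at 0x1002 matches and
an access at 0x1000 does not.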
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
new file mode 100644
index 0000000..816b1b5
--- /dev/null
+++ b/arch/arm64/kernel/ptrace.c
@@ -0,0 +1,834 @@
+/*
+ * Based on arch/arm/kernel/ptrace.c
+ *
+ * By Ross Biro 1/23/92
+ * edited by Linus Torvalds
+ * ARM modifications Copyright (C) 2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/sched.h>
+#include <linux/mm.h>
+#include <linux/smp.h>
+#include <linux/ptrace.h>
+#include <linux/user.h>
+#include <linux/security.h>
+#include <linux/init.h>
+#include <linux/signal.h>
+#include <linux/uaccess.h>
+#include <linux/perf_event.h>
+#include <linux/hw_breakpoint.h>
+#include <linux/regset.h>
+#include <linux/tracehook.h>
+#include <linux/elf.h>
+
+#include <asm/compat.h>
+#include <asm/debug-monitors.h>
+#include <asm/pgtable.h>
+#include <asm/traps.h>
+#include <asm/system_misc.h>
+
+/*
+ * TODO: does not yet catch signals sent when the child dies
+ * (in exit.c or in signal.c).
+ */
+
+/*
+ * Called by kernel/ptrace.c when detaching.
+ */
+void ptrace_disable(struct task_struct *child)
+{
+}
+
+/*
+ * Handle hitting a breakpoint.
+ */
+static int ptrace_break(struct pt_regs *regs)
+{
+	siginfo_t info;
+
+	info.si_signo = SIGTRAP;
+	info.si_errno = 0;
+	info.si_code  = TRAP_BRKPT;
+	info.si_addr  = (void __user *)instruction_pointer(regs);
+
+	force_sig_info(SIGTRAP, &info, current);
+	return 0;
+}
+
+static int arm64_break_trap(unsigned long addr, unsigned int esr,
+			    struct pt_regs *regs)
+{
+	return ptrace_break(regs);
+}
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+/*
+ * Convert a virtual register number into an index for a thread_struct
+ * breakpoint array. Breakpoints are identified using positive numbers
+ * whilst watchpoints are negative. The registers are laid out as pairs
+ * of (address, control), each pair mapping to a unique hw_breakpoint struct.
+ * Register 0 is reserved for describing resource information.
+ */
+static int ptrace_hbp_num_to_idx(long num)
+{
+	if (num < 0)
+		num = (ARM_MAX_BRP << 1) - num;
+	return (num - 1) >> 1;
+}
+
+/*
+ * Returns the virtual register number for the address of the
+ * breakpoint at index idx.
+ */
+static long ptrace_hbp_idx_to_num(int idx)
+{
+	long mid = ARM_MAX_BRP << 1;
+	long num = (idx << 1) + 1;
+	return num > mid ? mid - num : num;
+}
+
+/*
+ * Handle hitting a HW-breakpoint.
+ */
+static void ptrace_hbptriggered(struct perf_event *bp,
+				struct perf_sample_data *data,
+				struct pt_regs *regs)
+{
+	struct arch_hw_breakpoint *bkpt = counter_arch_bp(bp);
+	long num;
+	int i;
+	siginfo_t info;
+
+	for (i = 0; i < ARM_MAX_HBP_SLOTS; ++i)
+		if (current->thread.debug.hbp[i] == bp)
+			break;
+
+	num = (i == ARM_MAX_HBP_SLOTS) ? 0 : ptrace_hbp_idx_to_num(i);
+
+	info.si_signo	= SIGTRAP;
+	info.si_errno	= (int)num;
+	info.si_code	= TRAP_HWBKPT;
+	info.si_addr	= (void __user *)(bkpt->trigger);
+
+	force_sig_info(SIGTRAP, &info, current);
+}
+
+/*
+ * Unregister breakpoints from this task and reset the pointers in
+ * the thread_struct.
+ */
+void flush_ptrace_hw_breakpoint(struct task_struct *tsk)
+{
+	int i;
+	struct thread_struct *t = &tsk->thread;
+
+	for (i = 0; i < ARM_MAX_HBP_SLOTS; i++) {
+		if (t->debug.hbp[i]) {
+			unregister_hw_breakpoint(t->debug.hbp[i]);
+			t->debug.hbp[i] = NULL;
+		}
+	}
+}
+
+void ptrace_hw_copy_thread(struct task_struct *task)
+{
+	memset(&task->thread.debug, 0, sizeof(struct debug_info));
+}
+
+static u32 ptrace_get_hbp_resource_info(void)
+{
+	u8 num_brps, num_wrps, debug_arch, wp_len;
+	u32 reg = 0;
+
+	num_brps	= hw_breakpoint_slots(TYPE_INST);
+	num_wrps	= hw_breakpoint_slots(TYPE_DATA);
+
+	debug_arch	= debug_monitors_arch();
+	wp_len		= 8;		/* Reserved on AArch64 */
+	reg		|= debug_arch;
+	reg		<<= 8;
+	reg		|= wp_len;
+	reg		<<= 8;
+	reg		|= num_wrps;
+	reg		<<= 8;
+	reg		|= num_brps;
+
+	return reg;
+}
+
+static struct perf_event *ptrace_hbp_create(struct task_struct *tsk, int type)
+{
+	struct perf_event_attr attr;
+
+	ptrace_breakpoint_init(&attr);
+
+	/*
+	 * Initialise fields to sane defaults
+	 * (i.e. values that will pass validation).
+	 */
+	attr.bp_addr	= 0;
+	attr.bp_len	= HW_BREAKPOINT_LEN_4;
+	attr.bp_type	= type;
+	attr.disabled	= 1;
+
+	return register_user_hw_breakpoint(&attr, ptrace_hbptriggered, NULL,
+					   tsk);
+}
+
+static int ptrace_gethbpregs(struct task_struct *tsk, long num,
+			     unsigned long  __user *data)
+{
+	u64 addr_reg;
+	u32 ctrl_reg;
+	int idx, ret = 0;
+	struct perf_event *bp;
+	struct arch_hw_breakpoint_ctrl arch_ctrl;
+
+	if (num == 0) {
+		ctrl_reg = ptrace_get_hbp_resource_info();
+		if (put_user(ctrl_reg, (u32 __user *)data))
+			ret = -EFAULT;
+	} else {
+		idx = ptrace_hbp_num_to_idx(num);
+		if (idx < 0 || idx >= ARM_MAX_HBP_SLOTS)
+			return -EINVAL;
+
+		bp = tsk->thread.debug.hbp[idx];
+		if (bp)
+			arch_ctrl = counter_arch_bp(bp)->ctrl;
+		if (bp && is_compat_task()) {
+			/*
+			 * Fix up the len because we may have adjusted
+			 * it to compensate for an unaligned address.
+			 */
+			while (!(arch_ctrl.len & 0x1))
+				arch_ctrl.len >>= 1;
+		}
+
+		if (num & 0x1) {
+			addr_reg = bp ? bp->attr.bp_addr : 0;
+			if (put_user(addr_reg, data))
+				ret = -EFAULT;
+		} else {
+			ctrl_reg = bp ? encode_ctrl_reg(arch_ctrl) : 0;
+			if (put_user(ctrl_reg, (u32 __user *)data))
+				ret = -EFAULT;
+		}
+	}
+
+	return ret;
+}
+
+static int ptrace_sethbpregs(struct task_struct *tsk, long num,
+			     unsigned long __user *data)
+{
+	int idx, gen_len, gen_type, implied_type, ret;
+	u64 user_addr;
+	u32 user_ctrl;
+	struct perf_event *bp;
+	struct arch_hw_breakpoint_ctrl ctrl;
+	struct perf_event_attr attr;
+
+	if (num == 0)
+		return 0;
+	else if (num < 0)
+		implied_type = HW_BREAKPOINT_RW;
+	else
+		implied_type = HW_BREAKPOINT_X;
+
+	idx = ptrace_hbp_num_to_idx(num);
+	if (idx < 0 || idx >= ARM_MAX_HBP_SLOTS)
+		return -EINVAL;
+
+	bp = tsk->thread.debug.hbp[idx];
+	if (!bp) {
+		bp = ptrace_hbp_create(tsk, implied_type);
+		if (IS_ERR(bp))
+			return PTR_ERR(bp);
+		tsk->thread.debug.hbp[idx] = bp;
+	}
+
+	attr = bp->attr;
+
+	if (num & 0x1) {
+		/* Address */
+		if (get_user(user_addr, data))
+			return -EFAULT;
+		attr.bp_addr = user_addr;
+	} else {
+		/* Control */
+		if (get_user(user_ctrl, (u32 __user *)data))
+			return -EFAULT;
+		decode_ctrl_reg(user_ctrl, &ctrl);
+		ret = arch_bp_generic_fields(ctrl, &gen_len, &gen_type);
+		if (ret)
+			return ret;
+
+		if ((gen_type & implied_type) != gen_type)
+			return -EINVAL;
+
+		attr.bp_len	= gen_len;
+		attr.bp_type	= gen_type;
+		attr.disabled	= !ctrl.enabled;
+	}
+
+	return modify_user_hw_breakpoint(bp, &attr);
+}
+#endif	/* CONFIG_HAVE_HW_BREAKPOINT */
+
+static int gpr_get(struct task_struct *target,
+		   const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   void *kbuf, void __user *ubuf)
+{
+	struct user_pt_regs *uregs = &task_pt_regs(target)->user_regs;
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf, uregs, 0, -1);
+}
+
+static int gpr_set(struct task_struct *target, const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	int ret;
+	struct user_pt_regs newregs;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newregs, 0, -1);
+	if (ret)
+		return ret;
+
+	if (!valid_user_regs(&newregs))
+		return -EINVAL;
+
+	task_pt_regs(target)->user_regs = newregs;
+	return 0;
+}
+
+/*
+ * TODO: update fp accessors for lazy context switching (sync/flush hwstate)
+ */
+static int fpr_get(struct task_struct *target, const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   void *kbuf, void __user *ubuf)
+{
+	struct user_fpsimd_state *uregs;
+	uregs = &target->thread.fpsimd_state.user_fpsimd;
+	return user_regset_copyout(&pos, &count, &kbuf, &ubuf, uregs, 0, -1);
+}
+
+static int fpr_set(struct task_struct *target, const struct user_regset *regset,
+		   unsigned int pos, unsigned int count,
+		   const void *kbuf, const void __user *ubuf)
+{
+	int ret;
+	struct user_fpsimd_state newstate;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &newstate, 0, -1);
+	if (ret)
+		return ret;
+
+	target->thread.fpsimd_state.user_fpsimd = newstate;
+	return ret;
+}
+
+enum aarch64_regset {
+	REGSET_GPR,
+	REGSET_FPR,
+};
+
+static const struct user_regset aarch64_regsets[] = {
+	[REGSET_GPR] = {
+		.core_note_type = NT_PRSTATUS,
+		.n = sizeof(struct user_pt_regs) / sizeof(u64),
+		.size = sizeof(u64),
+		.align = sizeof(u64),
+		.get = gpr_get,
+		.set = gpr_set
+	},
+	[REGSET_FPR] = {
+		.core_note_type = NT_PRFPREG,
+		.n = sizeof(struct user_fpsimd_state) / sizeof(u32),
+		/*
+		 * We pretend we have 32-bit registers because the fpsr and
+		 * fpcr are 32-bits wide.
+		 */
+		.size = sizeof(u32),
+		.align = sizeof(u32),
+		.get = fpr_get,
+		.set = fpr_set
+	},
+};
+
+static const struct user_regset_view user_aarch64_view = {
+	.name = "aarch64", .e_machine = EM_AARCH64,
+	.regsets = aarch64_regsets, .n = ARRAY_SIZE(aarch64_regsets)
+};
+
+#ifdef CONFIG_AARCH32_EMULATION
+enum compat_regset {
+	REGSET_COMPAT_GPR,
+	REGSET_COMPAT_VFP,
+};
+
+static int compat_gpr_get(struct task_struct *target,
+			  const struct user_regset *regset,
+			  unsigned int pos, unsigned int count,
+			  void *kbuf, void __user *ubuf)
+{
+	int ret = 0;
+	unsigned int i, start, num_regs;
+
+	/* Calculate the number of AArch32 registers contained in count */
+	num_regs = count / regset->size;
+
+	/* Convert pos into a register number */
+	start = pos / regset->size;
+
+	if (start + num_regs > regset->n)
+		return -EIO;
+
+	for (i = 0; i < num_regs; ++i) {
+		unsigned int idx = start + i;
+		void *reg;
+
+		switch (idx) {
+		case 15:
+			reg = (void *)&task_pt_regs(target)->pc;
+			break;
+		case 16:
+			reg = (void *)&task_pt_regs(target)->pstate;
+			break;
+		case 17:
+			reg = (void *)&task_pt_regs(target)->orig_x0;
+			break;
+		default:
+			reg = (void *)&task_pt_regs(target)->regs[idx];
+		}
+
+		ret = copy_to_user(ubuf, reg, sizeof(compat_ulong_t));
+
+		if (ret)
+			break;
+		else
+			ubuf += sizeof(compat_ulong_t);
+	}
+
+	return ret;
+}
+
+static int compat_gpr_set(struct task_struct *target,
+			  const struct user_regset *regset,
+			  unsigned int pos, unsigned int count,
+			  const void *kbuf, const void __user *ubuf)
+{
+	struct pt_regs newregs;
+	int ret = 0;
+	unsigned int i, start, num_regs;
+
+	/* Calculate the number of AArch32 registers contained in count */
+	num_regs = count / regset->size;
+
+	/* Convert pos into a register number */
+	start = pos / regset->size;
+
+	if (start + num_regs > regset->n)
+		return -EIO;
+
+	newregs = *task_pt_regs(target);
+
+	for (i = 0; i < num_regs; ++i) {
+		unsigned int idx = start + i;
+		void *reg;
+
+		switch (idx) {
+		case 15:
+			reg = (void *)&newregs.pc;
+			break;
+		case 16:
+			reg = (void *)&newregs.pstate;
+			break;
+		case 17:
+			reg = (void *)&newregs.orig_x0;
+			break;
+		default:
+			reg = (void *)&newregs.regs[idx];
+		}
+
+		ret = copy_from_user(reg, ubuf, sizeof(compat_ulong_t));
+
+		if (ret)
+			goto out;
+		else
+			ubuf += sizeof(compat_ulong_t);
+	}
+
+	if (valid_user_regs(&newregs.user_regs))
+		*task_pt_regs(target) = newregs;
+	else
+		ret = -EINVAL;
+
+out:
+	return ret;
+}
+
+static int compat_vfp_get(struct task_struct *target,
+			  const struct user_regset *regset,
+			  unsigned int pos, unsigned int count,
+			  void *kbuf, void __user *ubuf)
+{
+	struct user_fpsimd_state *uregs;
+	compat_ulong_t fpscr;
+	int ret;
+
+	uregs = &target->thread.fpsimd_state.user_fpsimd;
+
+	/*
+	 * The VFP registers are packed into the fpsimd_state, so they all sit
+	 * nicely together for us. We just need to create the fpscr separately.
+	 */
+	ret = user_regset_copyout(&pos, &count, &kbuf, &ubuf, uregs, 0,
+				  VFP_STATE_SIZE - sizeof(compat_ulong_t));
+
+	if (count && !ret) {
+		fpscr = (uregs->fpsr & VFP_FPSCR_STAT_MASK) |
+			(uregs->fpcr & VFP_FPSCR_CTRL_MASK);
+		ret = put_user(fpscr, (compat_ulong_t __user *)ubuf);
+	}
+
+	return ret;
+}
+
+static int compat_vfp_set(struct task_struct *target,
+			  const struct user_regset *regset,
+			  unsigned int pos, unsigned int count,
+			  const void *kbuf, const void __user *ubuf)
+{
+	struct user_fpsimd_state *uregs;
+	compat_ulong_t fpscr;
+	int ret;
+
+	if (pos + count > VFP_STATE_SIZE)
+		return -EIO;
+
+	uregs = &target->thread.fpsimd_state.user_fpsimd;
+
+	ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, uregs, 0,
+				 VFP_STATE_SIZE - sizeof(compat_ulong_t));
+
+	if (count && !ret) {
+		ret = get_user(fpscr, (compat_ulong_t __user *)ubuf);
+		uregs->fpsr = fpscr & VFP_FPSCR_STAT_MASK;
+		uregs->fpcr = fpscr & VFP_FPSCR_CTRL_MASK;
+	}
+
+	return ret;
+}
+
+static const struct user_regset aarch32_regsets[] = {
+	[REGSET_COMPAT_GPR] = {
+		.core_note_type = NT_PRSTATUS,
+		.n = COMPAT_ELF_NGREG,
+		.size = sizeof(compat_elf_greg_t),
+		.align = sizeof(compat_elf_greg_t),
+		.get = compat_gpr_get,
+		.set = compat_gpr_set
+	},
+	[REGSET_COMPAT_VFP] = {
+		.core_note_type = NT_ARM_VFP,
+		.n = VFP_STATE_SIZE / sizeof(compat_ulong_t),
+		.size = sizeof(compat_ulong_t),
+		.align = sizeof(compat_ulong_t),
+		.get = compat_vfp_get,
+		.set = compat_vfp_set
+	},
+};
+
+static const struct user_regset_view user_aarch32_view = {
+	.name = "aarch32", .e_machine = EM_ARM,
+	.regsets = aarch32_regsets, .n = ARRAY_SIZE(aarch32_regsets)
+};
+#endif /* CONFIG_AARCH32_EMULATION */
+
+const struct user_regset_view *task_user_regset_view(struct task_struct *task)
+{
+#ifdef CONFIG_AARCH32_EMULATION
+	if (test_tsk_thread_flag(task, TIF_32BIT))
+		return &user_aarch32_view;
+#endif
+	return &user_aarch64_view;
+}
+
+long arch_ptrace(struct task_struct *child, long request,
+		 unsigned long addr, unsigned long data)
+{
+	int ret;
+	unsigned long __user *datap = (unsigned long __user *)data;
+
+	switch (request) {
+		case PTRACE_GET_THREAD_AREA:
+			ret = put_user(child->thread.tp_value, datap);
+			break;
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+		case PTRACE_GETHBPREGS:
+			ret = ptrace_gethbpregs(child, addr, datap);
+			break;
+
+		case PTRACE_SETHBPREGS:
+			ret = ptrace_sethbpregs(child, addr, datap);
+			break;
+#endif
+
+		default:
+			ret = ptrace_request(child, request, addr, data);
+			break;
+	}
+
+	return ret;
+}
+
+#ifdef CONFIG_AARCH32_EMULATION
+
+#include <linux/compat.h>
+
+int aarch32_break_trap(struct pt_regs *regs)
+{
+	unsigned int instr;
+	bool bp = false;
+	void __user *pc = (void __user *)instruction_pointer(regs);
+
+	if (compat_thumb_mode(regs)) {
+		/* get 16-bit Thumb instruction */
+		get_user(instr, (u16 __user *)pc);
+		if (instr == AARCH32_BREAK_THUMB2_LO) {
+			/* get second half of 32-bit Thumb-2 instruction */
+			get_user(instr, (u16 __user *)(pc + 2));
+			bp = instr == AARCH32_BREAK_THUMB2_HI;
+		} else {
+			bp = instr == AARCH32_BREAK_THUMB;
+		}
+	} else {
+		/* 32-bit ARM instruction */
+		get_user(instr, (u32 __user *)pc);
+		bp = (instr & ~0xf0000000) == AARCH32_BREAK_ARM;
+	}
+
+	if (bp)
+		return ptrace_break(regs);
+	return 1;
+}
+
+static int compat_ptrace_read_user(struct task_struct *tsk, compat_ulong_t off,
+				   compat_ulong_t __user *ret)
+{
+	compat_ulong_t tmp;
+
+	if (off & 3)
+		return -EIO;
+
+	if (off == PT_TEXT_ADDR)
+		tmp = tsk->mm->start_code;
+	else if (off == PT_DATA_ADDR)
+		tmp = tsk->mm->start_data;
+	else if (off == PT_TEXT_END_ADDR)
+		tmp = tsk->mm->end_code;
+	else if (off < sizeof(compat_elf_gregset_t))
+		return copy_regset_to_user(tsk, &user_aarch32_view,
+					   REGSET_COMPAT_GPR, off,
+					   sizeof(compat_ulong_t), ret);
+	else if (off >= COMPAT_USER_SZ)
+		return -EIO;
+	else
+		tmp = 0;
+
+	return put_user(tmp, ret);
+}
+
+static int compat_ptrace_write_user(struct task_struct *tsk, compat_ulong_t off,
+				    compat_ulong_t val)
+{
+	int ret;
+
+	if (off & 3 || off >= COMPAT_USER_SZ)
+		return -EIO;
+
+	if (off >= sizeof(compat_elf_gregset_t))
+		return 0;
+
+	ret = copy_regset_from_user(tsk, &user_aarch32_view,
+				    REGSET_COMPAT_GPR, off,
+				    sizeof(compat_ulong_t),
+				    &val);
+	return ret;
+}
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+static int compat_ptrace_gethbpregs(struct task_struct *tsk, compat_long_t num,
+				    compat_ulong_t __user *data)
+{
+	int ret;
+	unsigned long kdata;
+
+	mm_segment_t old_fs = get_fs();
+	set_fs(KERNEL_DS);
+	ret = ptrace_gethbpregs(tsk, (long)num, &kdata);
+	set_fs(old_fs);
+
+	if (!ret)
+		ret = put_user(kdata, data);
+
+	return ret;
+}
+
+static int compat_ptrace_sethbpregs(struct task_struct *tsk, compat_long_t num,
+				    compat_ulong_t __user *data)
+{
+	int ret;
+	unsigned long kdata = 0;
+	mm_segment_t old_fs = get_fs();
+
+	ret = get_user(kdata, data);
+
+	if (!ret) {
+		set_fs(KERNEL_DS);
+		ret = ptrace_sethbpregs(tsk, (long)num, &kdata);
+		set_fs(old_fs);
+	}
+
+	return ret;
+}
+#endif	/* CONFIG_HAVE_HW_BREAKPOINT */
+
+long compat_arch_ptrace(struct task_struct *child, compat_long_t request,
+			compat_ulong_t caddr, compat_ulong_t cdata)
+{
+	unsigned long addr = caddr;
+	unsigned long data = cdata;
+	void __user *datap = compat_ptr(data);
+	int ret;
+
+	switch (request) {
+		case PTRACE_PEEKUSR:
+			ret = compat_ptrace_read_user(child, addr, datap);
+			break;
+
+		case PTRACE_POKEUSR:
+			ret = compat_ptrace_write_user(child, addr, data);
+			break;
+
+		case PTRACE_GETREGS:
+			ret = copy_regset_to_user(child,
+						  &user_aarch32_view,
+						  REGSET_COMPAT_GPR,
+						  0, sizeof(compat_elf_gregset_t),
+						  datap);
+			break;
+
+		case PTRACE_SETREGS:
+			ret = copy_regset_from_user(child,
+						    &user_aarch32_view,
+						    REGSET_COMPAT_GPR,
+						    0, sizeof(compat_elf_gregset_t),
+						    datap);
+			break;
+
+		case PTRACE_GET_THREAD_AREA:
+			ret = put_user((compat_ulong_t)child->thread.tp_value,
+				       (compat_ulong_t __user *)datap);
+			break;
+
+		case PTRACE_SET_SYSCALL:
+			task_pt_regs(child)->syscallno = data;
+			ret = 0;
+			break;
+
+		case COMPAT_PTRACE_GETVFPREGS:
+			ret = copy_regset_to_user(child,
+						  &user_aarch32_view,
+						  REGSET_COMPAT_VFP,
+						  0, VFP_STATE_SIZE,
+						  datap);
+			break;
+
+		case COMPAT_PTRACE_SETVFPREGS:
+			ret = copy_regset_from_user(child,
+						    &user_aarch32_view,
+						    REGSET_COMPAT_VFP,
+						    0, VFP_STATE_SIZE,
+						    datap);
+			break;
+
+#ifdef CONFIG_HAVE_HW_BREAKPOINT
+		case PTRACE_GETHBPREGS:
+			ret = compat_ptrace_gethbpregs(child, addr, datap);
+			break;
+
+		case PTRACE_SETHBPREGS:
+			ret = compat_ptrace_sethbpregs(child, addr, datap);
+			break;
+#endif
+
+		default:
+			ret = compat_ptrace_request(child, request, addr,
+						    data);
+			break;
+	}
+
+	return ret;
+}
+#endif /* CONFIG_AARCH32_EMULATION */
+
+static int __init ptrace_break_init(void)
+{
+	hook_debug_fault_code(DBG_ESR_EVT_BRK, arm64_break_trap, SIGTRAP,
+			      TRAP_BRKPT, "ptrace BRK handler");
+	return 0;
+}
+core_initcall(ptrace_break_init);
+
+
+asmlinkage int syscall_trace(int dir, struct pt_regs *regs)
+{
+	unsigned long saved_reg;
+
+	if (!test_thread_flag(TIF_SYSCALL_TRACE))
+		return regs->syscallno;
+
+	if (test_thread_flag(TIF_32BIT)) {
+		/* AArch32 uses ip (r12) for scratch */
+		saved_reg = regs->regs[12];
+		regs->regs[12] = dir;
+	} else {
+		/*
+		 * Save X7. X7 is used to denote syscall entry/exit:
+		 *   X7 = 0 -> entry, = 1 -> exit
+		 */
+		saved_reg = regs->regs[7];
+		regs->regs[7] = dir;
+	}
+
+	if (dir)
+		tracehook_report_syscall_exit(regs, 0);
+	else if (tracehook_report_syscall_entry(regs))
+		regs->syscallno = ~0UL;
+
+	if (test_thread_flag(TIF_32BIT))
+		regs->regs[12] = saved_reg;
+	else
+		regs->regs[7] = saved_reg;
+
+	return regs->syscallno;
+}



* [PATCH v2 24/31] arm64: Add support for /proc/sys/debug/exception-trace
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (22 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 23/31] arm64: Debugging support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:08   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 25/31] arm64: Performance counters support Catalin Marinas
                   ` (7 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann

This patch allows the show_unhandled_signals variable to be set via
/proc/sys/debug/exception-trace. The default value is currently 1, which
shows unhandled user faults (undefined instructions, data aborts) and
invalid signal stack frames.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 kernel/sysctl.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)
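
As a rough illustration of how user space might inspect the setting this
patch exposes, here is a minimal sketch in C. It assumes the file holds a
single decimal integer, which is not shown in the hunk below. The helper
names parse_exception_trace() and read_exception_trace() are hypothetical
and are not part of the patch.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse the text read from the sysctl file (assumed to be one decimal
 * integer). Returns the value, or -1 if the text is not a non-negative
 * number.
 */
static int parse_exception_trace(const char *text)
{
	char *end;
	long val = strtol(text, &end, 10);

	if (end == text || val < 0)
		return -1;
	return (int)val;
}

/* Read the current setting; returns -1 if the file cannot be read or parsed. */
static int read_exception_trace(const char *path)
{
	char buf[32];
	FILE *f = fopen(path, "r");
	int val = -1;

	if (!f)
		return -1;
	if (fgets(buf, sizeof(buf), f))
		val = parse_exception_trace(buf);
	fclose(f);
	return val;
}
```

Calling read_exception_trace("/proc/sys/debug/exception-trace") would be
expected to return 1 on a kernel using the default described above, and -1
where the file does not exist.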

diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index 87174ef..79dcb00 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -1544,7 +1544,7 @@ static struct ctl_table fs_table[] = {
 
 static struct ctl_table debug_table[] = {
 #if defined(CONFIG_X86) || defined(CONFIG_PPC) || defined(CONFIG_SPARC) || \
-    defined(CONFIG_S390) || defined(CONFIG_TILE)
+    defined(CONFIG_S390) || defined(CONFIG_TILE) || defined(CONFIG_ARM64)
 	{
 		.procname	= "exception-trace",
 		.data		= &show_unhandled_signals,



* [PATCH v2 25/31] arm64: Performance counters support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (23 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 24/31] arm64: Add support for /proc/sys/debug/exception-trace Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:11   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 26/31] arm64: Miscellaneous library functions Catalin Marinas
                   ` (6 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch adds support for the AArch64 performance counters.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/perf_event.h |   22 +
 arch/arm64/include/asm/pmu.h        |   82 +++
 arch/arm64/kernel/perf_event.c      | 1368 +++++++++++++++++++++++++++++++++++
 tools/perf/perf.h                   |    6 +
 4 files changed, 1478 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/perf_event.h
 create mode 100644 arch/arm64/include/asm/pmu.h
 create mode 100644 arch/arm64/kernel/perf_event.c
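
The counter arithmetic used by armpmu_event_set_period() and
armpmu_event_update() in this patch can be hard to follow at first read.
The stand-alone sketch below reproduces only that arithmetic in user-space
C, assuming 32-bit counters (a max_period of 2^32 - 1). The sketch_* names
are hypothetical and do not appear in the patch.

```c
#include <stdint.h>

/* Assumption: 32-bit counters, so the mask equals a max_period of 2^32 - 1. */
#define SKETCH_MAX_PERIOD 0xffffffffULL

/*
 * Value to program into a counter so that it overflows after 'left' more
 * events: the two's complement of 'left', truncated to 32 bits.
 */
static uint32_t sketch_start_value(int64_t left)
{
	return (uint32_t)((uint64_t)(-left) & 0xffffffffULL);
}

/*
 * Events counted between two raw readings. Masking the 64-bit difference
 * with the counter width gives the right answer even when the hardware
 * counter wrapped once between the readings.
 */
static uint64_t sketch_delta(uint64_t prev_raw, uint64_t new_raw)
{
	return (new_raw - prev_raw) & SKETCH_MAX_PERIOD;
}
```

For example, with 100 events left the counter starts at 0xffffff9c. If it is
later read back as 0x10, it has wrapped, and the masked difference is the 100
events up to the overflow plus the 16 counted after it.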

diff --git a/arch/arm64/include/asm/perf_event.h b/arch/arm64/include/asm/perf_event.h
new file mode 100644
index 0000000..a6fffd5
--- /dev/null
+++ b/arch/arm64/include/asm/perf_event.h
@@ -0,0 +1,22 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#ifndef __ASM_PERF_EVENT_H
+#define __ASM_PERF_EVENT_H
+
+/* It's quiet around here... */
+
+#endif
diff --git a/arch/arm64/include/asm/pmu.h b/arch/arm64/include/asm/pmu.h
new file mode 100644
index 0000000..e6f0878
--- /dev/null
+++ b/arch/arm64/include/asm/pmu.h
@@ -0,0 +1,82 @@
+/*
+ * Based on arch/arm/include/asm/pmu.h
+ *
+ * Copyright (C) 2009 picoChip Designs Ltd, Jamie Iles
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_PMU_H
+#define __ASM_PMU_H
+
+#ifdef CONFIG_HW_PERF_EVENTS
+
+/* The events for a given PMU register set. */
+struct pmu_hw_events {
+	/*
+	 * The events that are active on the PMU for the given index.
+	 */
+	struct perf_event	**events;
+
+	/*
+	 * A 1 bit for an index indicates that the counter is being used for
+	 * an event. A 0 means that the counter can be used.
+	 */
+	unsigned long           *used_mask;
+
+	/*
+	 * Hardware lock to serialize accesses to PMU registers. Needed for the
+	 * read/modify/write sequences.
+	 */
+	raw_spinlock_t		pmu_lock;
+};
+
+struct arm_pmu {
+	struct pmu		pmu;
+	cpumask_t		active_irqs;
+	const char		*name;
+	irqreturn_t		(*handle_irq)(int irq_num, void *dev);
+	void			(*enable)(struct hw_perf_event *evt, int idx);
+	void			(*disable)(struct hw_perf_event *evt, int idx);
+	int			(*get_event_idx)(struct pmu_hw_events *hw_events,
+						 struct hw_perf_event *hwc);
+	int			(*set_event_filter)(struct hw_perf_event *evt,
+						    struct perf_event_attr *attr);
+	u32			(*read_counter)(int idx);
+	void			(*write_counter)(int idx, u32 val);
+	void			(*start)(void);
+	void			(*stop)(void);
+	void			(*reset)(void *);
+	int			(*map_event)(struct perf_event *event);
+	int			num_events;
+	atomic_t		active_events;
+	struct mutex		reserve_mutex;
+	u64			max_period;
+	struct platform_device	*plat_device;
+	struct pmu_hw_events	*(*get_hw_events)(void);
+};
+
+#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
+
+int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type);
+
+u64 armpmu_event_update(struct perf_event *event,
+			struct hw_perf_event *hwc,
+			int idx);
+
+int armpmu_event_set_period(struct perf_event *event,
+			    struct hw_perf_event *hwc,
+			    int idx);
+
+#endif /* CONFIG_HW_PERF_EVENTS */
+#endif /* __ASM_PMU_H */
diff --git a/arch/arm64/kernel/perf_event.c b/arch/arm64/kernel/perf_event.c
new file mode 100644
index 0000000..ecbf2d8
--- /dev/null
+++ b/arch/arm64/kernel/perf_event.c
@@ -0,0 +1,1368 @@
+/*
+ * PMU support
+ *
+ * Copyright (C) 2012 ARM Limited
+ * Author: Will Deacon <will.deacon@arm.com>
+ *
+ * This code is based heavily on the ARMv7 perf event code.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#define pr_fmt(fmt) "hw perfevents: " fmt
+
+#include <linux/bitmap.h>
+#include <linux/interrupt.h>
+#include <linux/kernel.h>
+#include <linux/export.h>
+#include <linux/perf_event.h>
+#include <linux/platform_device.h>
+#include <linux/spinlock.h>
+#include <linux/uaccess.h>
+
+#include <asm/cputype.h>
+#include <asm/irq.h>
+#include <asm/irq_regs.h>
+#include <asm/pmu.h>
+#include <asm/stacktrace.h>
+
+/*
+ * ARMv8 supports a maximum of 32 events.
+ * The cycle counter is included in this total.
+ */
+#define ARMPMU_MAX_HWEVENTS		32
+
+static DEFINE_PER_CPU(struct perf_event * [ARMPMU_MAX_HWEVENTS], hw_events);
+static DEFINE_PER_CPU(unsigned long [BITS_TO_LONGS(ARMPMU_MAX_HWEVENTS)], used_mask);
+static DEFINE_PER_CPU(struct pmu_hw_events, cpu_hw_events);
+
+#define to_arm_pmu(p) (container_of(p, struct arm_pmu, pmu))
+
+/* Set at runtime when we know what CPU type we are. */
+static struct arm_pmu *cpu_pmu;
+
+int
+armpmu_get_max_events(void)
+{
+	int max_events = 0;
+
+	if (cpu_pmu != NULL)
+		max_events = cpu_pmu->num_events;
+
+	return max_events;
+}
+EXPORT_SYMBOL_GPL(armpmu_get_max_events);
+
+int perf_num_counters(void)
+{
+	return armpmu_get_max_events();
+}
+EXPORT_SYMBOL_GPL(perf_num_counters);
+
+#define HW_OP_UNSUPPORTED		0xFFFF
+
+#define C(_x) \
+	PERF_COUNT_HW_CACHE_##_x
+
+#define CACHE_OP_UNSUPPORTED		0xFFFF
+
+static int
+armpmu_map_cache_event(const unsigned (*cache_map)
+				      [PERF_COUNT_HW_CACHE_MAX]
+				      [PERF_COUNT_HW_CACHE_OP_MAX]
+				      [PERF_COUNT_HW_CACHE_RESULT_MAX],
+		       u64 config)
+{
+	unsigned int cache_type, cache_op, cache_result, ret;
+
+	cache_type = (config >>  0) & 0xff;
+	if (cache_type >= PERF_COUNT_HW_CACHE_MAX)
+		return -EINVAL;
+
+	cache_op = (config >>  8) & 0xff;
+	if (cache_op >= PERF_COUNT_HW_CACHE_OP_MAX)
+		return -EINVAL;
+
+	cache_result = (config >> 16) & 0xff;
+	if (cache_result >= PERF_COUNT_HW_CACHE_RESULT_MAX)
+		return -EINVAL;
+
+	ret = (int)(*cache_map)[cache_type][cache_op][cache_result];
+
+	if (ret == CACHE_OP_UNSUPPORTED)
+		return -ENOENT;
+
+	return ret;
+}
+
+static int
+armpmu_map_event(const unsigned (*event_map)[PERF_COUNT_HW_MAX], u64 config)
+{
+	int mapping = config < PERF_COUNT_HW_MAX ? (*event_map)[config] : HW_OP_UNSUPPORTED;
+	return mapping == HW_OP_UNSUPPORTED ? -ENOENT : mapping;
+}
+
+static int
+armpmu_map_raw_event(u32 raw_event_mask, u64 config)
+{
+	return (int)(config & raw_event_mask);
+}
+
+static int map_cpu_event(struct perf_event *event,
+			 const unsigned (*event_map)[PERF_COUNT_HW_MAX],
+			 const unsigned (*cache_map)
+					[PERF_COUNT_HW_CACHE_MAX]
+					[PERF_COUNT_HW_CACHE_OP_MAX]
+					[PERF_COUNT_HW_CACHE_RESULT_MAX],
+			 u32 raw_event_mask)
+{
+	u64 config = event->attr.config;
+
+	switch (event->attr.type) {
+	case PERF_TYPE_HARDWARE:
+		return armpmu_map_event(event_map, config);
+	case PERF_TYPE_HW_CACHE:
+		return armpmu_map_cache_event(cache_map, config);
+	case PERF_TYPE_RAW:
+		return armpmu_map_raw_event(raw_event_mask, config);
+	}
+
+	return -ENOENT;
+}
+
+int
+armpmu_event_set_period(struct perf_event *event,
+			struct hw_perf_event *hwc,
+			int idx)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	s64 left = local64_read(&hwc->period_left);
+	s64 period = hwc->sample_period;
+	int ret = 0;
+
+	if (unlikely(left <= -period)) {
+		left = period;
+		local64_set(&hwc->period_left, left);
+		hwc->last_period = period;
+		ret = 1;
+	}
+
+	if (unlikely(left <= 0)) {
+		left += period;
+		local64_set(&hwc->period_left, left);
+		hwc->last_period = period;
+		ret = 1;
+	}
+
+	if (left > (s64)armpmu->max_period)
+		left = armpmu->max_period;
+
+	local64_set(&hwc->prev_count, (u64)-left);
+
+	armpmu->write_counter(idx, (u64)(-left) & 0xffffffff);
+
+	perf_event_update_userpage(event);
+
+	return ret;
+}
+
+u64
+armpmu_event_update(struct perf_event *event,
+		    struct hw_perf_event *hwc,
+		    int idx)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	u64 delta, prev_raw_count, new_raw_count;
+
+again:
+	prev_raw_count = local64_read(&hwc->prev_count);
+	new_raw_count = armpmu->read_counter(idx);
+
+	if (local64_cmpxchg(&hwc->prev_count, prev_raw_count,
+			     new_raw_count) != prev_raw_count)
+		goto again;
+
+	delta = (new_raw_count - prev_raw_count) & armpmu->max_period;
+
+	local64_add(delta, &event->count);
+	local64_sub(delta, &hwc->period_left);
+
+	return new_raw_count;
+}
+
+static void
+armpmu_read(struct perf_event *event)
+{
+	struct hw_perf_event *hwc = &event->hw;
+
+	/* Don't read disabled counters! */
+	if (hwc->idx < 0)
+		return;
+
+	armpmu_event_update(event, hwc, hwc->idx);
+}
+
+static void
+armpmu_stop(struct perf_event *event, int flags)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/*
+	 * ARM pmu always has to update the counter, so ignore
+	 * PERF_EF_UPDATE, see comments in armpmu_start().
+	 */
+	if (!(hwc->state & PERF_HES_STOPPED)) {
+		armpmu->disable(hwc, hwc->idx);
+		barrier(); /* why? */
+		armpmu_event_update(event, hwc, hwc->idx);
+		hwc->state |= PERF_HES_STOPPED | PERF_HES_UPTODATE;
+	}
+}
+
+static void
+armpmu_start(struct perf_event *event, int flags)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+
+	/*
+	 * ARM pmu always has to reprogram the period, so ignore
+	 * PERF_EF_RELOAD, see the comment below.
+	 */
+	if (flags & PERF_EF_RELOAD)
+		WARN_ON_ONCE(!(hwc->state & PERF_HES_UPTODATE));
+
+	hwc->state = 0;
+	/*
+	 * Set the period again. Some counters can't be stopped, so when we
+	 * were stopped we simply disabled the IRQ source and the counter
+	 * may have been left counting. If we don't do this step then we may
+	 * get an interrupt too soon or *way* too late if the overflow has
+	 * happened since disabling.
+	 */
+	armpmu_event_set_period(event, hwc, hwc->idx);
+	armpmu->enable(hwc, hwc->idx);
+}
+
+static void
+armpmu_del(struct perf_event *event, int flags)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct pmu_hw_events *hw_events = armpmu->get_hw_events();
+	struct hw_perf_event *hwc = &event->hw;
+	int idx = hwc->idx;
+
+	WARN_ON(idx < 0);
+
+	armpmu_stop(event, PERF_EF_UPDATE);
+	hw_events->events[idx] = NULL;
+	clear_bit(idx, hw_events->used_mask);
+
+	perf_event_update_userpage(event);
+}
+
+static int
+armpmu_add(struct perf_event *event, int flags)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct pmu_hw_events *hw_events = armpmu->get_hw_events();
+	struct hw_perf_event *hwc = &event->hw;
+	int idx;
+	int err = 0;
+
+	perf_pmu_disable(event->pmu);
+
+	/* If we don't have a space for the counter then finish early. */
+	idx = armpmu->get_event_idx(hw_events, hwc);
+	if (idx < 0) {
+		err = idx;
+		goto out;
+	}
+
+	/*
+	 * If there is an event in the counter we are going to use then make
+	 * sure it is disabled.
+	 */
+	event->hw.idx = idx;
+	armpmu->disable(hwc, idx);
+	hw_events->events[idx] = event;
+
+	hwc->state = PERF_HES_STOPPED | PERF_HES_UPTODATE;
+	if (flags & PERF_EF_START)
+		armpmu_start(event, PERF_EF_RELOAD);
+
+	/* Propagate our changes to the userspace mapping. */
+	perf_event_update_userpage(event);
+
+out:
+	perf_pmu_enable(event->pmu);
+	return err;
+}
+
+static int
+validate_event(struct pmu_hw_events *hw_events,
+	       struct perf_event *event)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct hw_perf_event fake_event = event->hw;
+	struct pmu *leader_pmu = event->group_leader->pmu;
+
+	if (event->pmu != leader_pmu || event->state <= PERF_EVENT_STATE_OFF)
+		return 1;
+
+	return armpmu->get_event_idx(hw_events, &fake_event) >= 0;
+}
+
+static int
+validate_group(struct perf_event *event)
+{
+	struct perf_event *sibling, *leader = event->group_leader;
+	struct pmu_hw_events fake_pmu;
+	DECLARE_BITMAP(fake_used_mask, ARMPMU_MAX_HWEVENTS);
+
+	/*
+	 * Initialise the fake PMU. We only need to populate the
+	 * used_mask for the purposes of validation.
+	 */
+	memset(fake_used_mask, 0, sizeof(fake_used_mask));
+	fake_pmu.used_mask = fake_used_mask;
+
+	if (!validate_event(&fake_pmu, leader))
+		return -EINVAL;
+
+	list_for_each_entry(sibling, &leader->sibling_list, group_entry) {
+		if (!validate_event(&fake_pmu, sibling))
+			return -EINVAL;
+	}
+
+	if (!validate_event(&fake_pmu, event))
+		return -EINVAL;
+
+	return 0;
+}
+
+static void
+armpmu_release_hardware(struct arm_pmu *armpmu)
+{
+	int i, irq, irqs;
+	struct platform_device *pmu_device = armpmu->plat_device;
+
+	irqs = min(pmu_device->num_resources, num_possible_cpus());
+
+	for (i = 0; i < irqs; ++i) {
+		if (!cpumask_test_and_clear_cpu(i, &armpmu->active_irqs))
+			continue;
+		irq = platform_get_irq(pmu_device, i);
+		if (irq >= 0)
+			free_irq(irq, armpmu);
+	}
+}
+
+static int
+armpmu_reserve_hardware(struct arm_pmu *armpmu)
+{
+	int i, err, irq, irqs;
+	struct platform_device *pmu_device = armpmu->plat_device;
+
+	if (!pmu_device) {
+		pr_err("no PMU device registered\n");
+		return -ENODEV;
+	}
+
+	irqs = min(pmu_device->num_resources, num_possible_cpus());
+	if (irqs < 1) {
+		pr_err("no irqs for PMUs defined\n");
+		return -ENODEV;
+	}
+
+	for (i = 0; i < irqs; ++i) {
+		err = 0;
+		irq = platform_get_irq(pmu_device, i);
+		if (irq < 0)
+			continue;
+
+		/*
+		 * If we have a single PMU interrupt that we can't shift,
+		 * assume that we're running on a uniprocessor machine and
+		 * continue. Otherwise, continue without this interrupt.
+		 */
+		if (irq_set_affinity(irq, cpumask_of(i)) && irqs > 1) {
+			pr_warning("unable to set irq affinity (irq=%d, cpu=%u)\n",
+				    irq, i);
+			continue;
+		}
+
+		err = request_irq(irq, armpmu->handle_irq,
+				  IRQF_NOBALANCING,
+				  "arm-pmu", armpmu);
+		if (err) {
+			pr_err("unable to request IRQ%d for ARM PMU counters\n",
+				irq);
+			armpmu_release_hardware(armpmu);
+			return err;
+		}
+
+		cpumask_set_cpu(i, &armpmu->active_irqs);
+	}
+
+	return 0;
+}
+
+static void
+hw_perf_event_destroy(struct perf_event *event)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	atomic_t *active_events	 = &armpmu->active_events;
+	struct mutex *pmu_reserve_mutex = &armpmu->reserve_mutex;
+
+	if (atomic_dec_and_mutex_lock(active_events, pmu_reserve_mutex)) {
+		armpmu_release_hardware(armpmu);
+		mutex_unlock(pmu_reserve_mutex);
+	}
+}
+
+static int
+event_requires_mode_exclusion(struct perf_event_attr *attr)
+{
+	return attr->exclude_idle || attr->exclude_user ||
+	       attr->exclude_kernel || attr->exclude_hv;
+}
+
+static int
+__hw_perf_event_init(struct perf_event *event)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	struct hw_perf_event *hwc = &event->hw;
+	int mapping, err;
+
+	mapping = armpmu->map_event(event);
+
+	if (mapping < 0) {
+		pr_debug("event %x:%llx not supported\n", event->attr.type,
+			 event->attr.config);
+		return mapping;
+	}
+
+	/*
+	 * We don't assign an index until we actually place the event onto
+	 * hardware. Use -1 to signify that we haven't decided where to put it
+	 * yet. For SMP systems, each core has its own PMU so we can't do any
+	 * clever allocation or constraints checking at this point.
+	 */
+	hwc->idx		= -1;
+	hwc->config_base	= 0;
+	hwc->config		= 0;
+	hwc->event_base		= 0;
+
+	/*
+	 * Check whether we need to exclude the counter from certain modes.
+	 */
+	if ((!armpmu->set_event_filter ||
+	     armpmu->set_event_filter(hwc, &event->attr)) &&
+	     event_requires_mode_exclusion(&event->attr)) {
+		pr_debug("ARM performance counters do not support mode exclusion\n");
+		return -EPERM;
+	}
+
+	/*
+	 * Store the event encoding into the config_base field.
+	 */
+	hwc->config_base	    |= (unsigned long)mapping;
+
+	if (!hwc->sample_period) {
+		/*
+		 * For non-sampling runs, limit the sample_period to half
+		 * of the counter width. That way, the new counter value
+		 * is far less likely to overtake the previous one unless
+		 * you have some serious IRQ latency issues.
+		 */
+		hwc->sample_period  = armpmu->max_period >> 1;
+		hwc->last_period    = hwc->sample_period;
+		local64_set(&hwc->period_left, hwc->sample_period);
+	}
+
+	err = 0;
+	if (event->group_leader != event) {
+		err = validate_group(event);
+		if (err)
+			return -EINVAL;
+	}
+
+	return err;
+}
+
+static int armpmu_event_init(struct perf_event *event)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(event->pmu);
+	int err = 0;
+	atomic_t *active_events = &armpmu->active_events;
+
+	if (armpmu->map_event(event) == -ENOENT)
+		return -ENOENT;
+
+	event->destroy = hw_perf_event_destroy;
+
+	if (!atomic_inc_not_zero(active_events)) {
+		mutex_lock(&armpmu->reserve_mutex);
+		if (atomic_read(active_events) == 0)
+			err = armpmu_reserve_hardware(armpmu);
+
+		if (!err)
+			atomic_inc(active_events);
+		mutex_unlock(&armpmu->reserve_mutex);
+	}
+
+	if (err)
+		return err;
+
+	err = __hw_perf_event_init(event);
+	if (err)
+		hw_perf_event_destroy(event);
+
+	return err;
+}
+
+static void armpmu_enable(struct pmu *pmu)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu);
+	struct pmu_hw_events *hw_events = armpmu->get_hw_events();
+	int enabled = bitmap_weight(hw_events->used_mask, armpmu->num_events);
+
+	if (enabled)
+		armpmu->start();
+}
+
+static void armpmu_disable(struct pmu *pmu)
+{
+	struct arm_pmu *armpmu = to_arm_pmu(pmu);
+	armpmu->stop();
+}
+
+static void __init armpmu_init(struct arm_pmu *armpmu)
+{
+	atomic_set(&armpmu->active_events, 0);
+	mutex_init(&armpmu->reserve_mutex);
+
+	armpmu->pmu = (struct pmu) {
+		.pmu_enable	= armpmu_enable,
+		.pmu_disable	= armpmu_disable,
+		.event_init	= armpmu_event_init,
+		.add		= armpmu_add,
+		.del		= armpmu_del,
+		.start		= armpmu_start,
+		.stop		= armpmu_stop,
+		.read		= armpmu_read,
+	};
+}
+
+int __init armpmu_register(struct arm_pmu *armpmu, char *name, int type)
+{
+	armpmu_init(armpmu);
+	return perf_pmu_register(&armpmu->pmu, name, type);
+}
+
+/*
+ * ARMv8 PMUv3 Performance Events handling code.
+ * Common event types.
+ */
+enum armv8_pmuv3_perf_types {
+	/* Required events. */
+	ARMV8_PMUV3_PERFCTR_PMNC_SW_INCR			= 0x00,
+	ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL			= 0x03,
+	ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS			= 0x04,
+	ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED			= 0x10,
+	ARMV8_PMUV3_PERFCTR_CLOCK_CYCLES			= 0x11,
+	ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED			= 0x12,
+
+	/* At least one of the following is required. */
+	ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED			= 0x08,
+	ARMV8_PMUV3_PERFCTR_OP_SPEC				= 0x1B,
+
+	/* Common architectural events. */
+	ARMV8_PMUV3_PERFCTR_MEM_READ				= 0x06,
+	ARMV8_PMUV3_PERFCTR_MEM_WRITE				= 0x07,
+	ARMV8_PMUV3_PERFCTR_EXC_TAKEN				= 0x09,
+	ARMV8_PMUV3_PERFCTR_EXC_EXECUTED			= 0x0A,
+	ARMV8_PMUV3_PERFCTR_CID_WRITE				= 0x0B,
+	ARMV8_PMUV3_PERFCTR_PC_WRITE				= 0x0C,
+	ARMV8_PMUV3_PERFCTR_PC_IMM_BRANCH			= 0x0D,
+	ARMV8_PMUV3_PERFCTR_PC_PROC_RETURN			= 0x0E,
+	ARMV8_PMUV3_PERFCTR_MEM_UNALIGNED_ACCESS		= 0x0F,
+	ARMV8_PMUV3_PERFCTR_TTBR_WRITE				= 0x1C,
+
+	/* Common microarchitectural events. */
+	ARMV8_PMUV3_PERFCTR_L1_ICACHE_REFILL			= 0x01,
+	ARMV8_PMUV3_PERFCTR_ITLB_REFILL				= 0x02,
+	ARMV8_PMUV3_PERFCTR_DTLB_REFILL				= 0x05,
+	ARMV8_PMUV3_PERFCTR_MEM_ACCESS				= 0x13,
+	ARMV8_PMUV3_PERFCTR_L1_ICACHE_ACCESS			= 0x14,
+	ARMV8_PMUV3_PERFCTR_L1_DCACHE_WB			= 0x15,
+	ARMV8_PMUV3_PERFCTR_L2_CACHE_ACCESS			= 0x16,
+	ARMV8_PMUV3_PERFCTR_L2_CACHE_REFILL			= 0x17,
+	ARMV8_PMUV3_PERFCTR_L2_CACHE_WB				= 0x18,
+	ARMV8_PMUV3_PERFCTR_BUS_ACCESS				= 0x19,
+	ARMV8_PMUV3_PERFCTR_MEM_ERROR				= 0x1A,
+	ARMV8_PMUV3_PERFCTR_BUS_CYCLES				= 0x1D,
+
+	/*
+	 * This isn't an architected event.
+	 * We detect this event number and use the cycle counter instead.
+	 */
+	ARMV8_PMUV3_PERFCTR_CPU_CYCLES				= 0xFF,
+};
+
+/* PMUv3 HW events mapping. */
+static const unsigned armv8_pmuv3_perf_map[PERF_COUNT_HW_MAX] = {
+	[PERF_COUNT_HW_CPU_CYCLES]		= ARMV8_PMUV3_PERFCTR_CPU_CYCLES,
+	[PERF_COUNT_HW_INSTRUCTIONS]		= ARMV8_PMUV3_PERFCTR_INSTR_EXECUTED,
+	[PERF_COUNT_HW_CACHE_REFERENCES]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+	[PERF_COUNT_HW_CACHE_MISSES]		= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+	[PERF_COUNT_HW_BRANCH_INSTRUCTIONS]	= HW_OP_UNSUPPORTED,
+	[PERF_COUNT_HW_BRANCH_MISSES]		= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+	[PERF_COUNT_HW_BUS_CYCLES]		= HW_OP_UNSUPPORTED,
+	[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND]	= HW_OP_UNSUPPORTED,
+	[PERF_COUNT_HW_STALLED_CYCLES_BACKEND]	= HW_OP_UNSUPPORTED,
+};
+
+static const unsigned armv8_pmuv3_perf_cache_map[PERF_COUNT_HW_CACHE_MAX]
+						[PERF_COUNT_HW_CACHE_OP_MAX]
+						[PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+	[C(L1D)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+			[C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_ACCESS,
+			[C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_L1_DCACHE_REFILL,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(L1I)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(LL)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(DTLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(ITLB)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(BPU)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+			[C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_PRED,
+			[C(RESULT_MISS)]	= ARMV8_PMUV3_PERFCTR_PC_BRANCH_MIS_PRED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+	[C(NODE)] = {
+		[C(OP_READ)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_WRITE)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+		[C(OP_PREFETCH)] = {
+			[C(RESULT_ACCESS)]	= CACHE_OP_UNSUPPORTED,
+			[C(RESULT_MISS)]	= CACHE_OP_UNSUPPORTED,
+		},
+	},
+};
+
+/*
+ * Perf Events' indices
+ */
+#define	ARMV8_IDX_CYCLE_COUNTER	0
+#define	ARMV8_IDX_COUNTER0	1
+#define	ARMV8_IDX_COUNTER_LAST	(ARMV8_IDX_CYCLE_COUNTER + cpu_pmu->num_events - 1)
+
+#define	ARMV8_MAX_COUNTERS	32
+#define	ARMV8_COUNTER_MASK	(ARMV8_MAX_COUNTERS - 1)
+
+/*
+ * ARMv8 low level PMU access
+ */
+
+/*
+ * Perf Event to low level counters mapping
+ */
+#define	ARMV8_IDX_TO_COUNTER(x)	\
+	(((x) - ARMV8_IDX_COUNTER0) & ARMV8_COUNTER_MASK)
+
+/*
+ * Per-CPU PMCR: config reg
+ */
+#define ARMV8_PMCR_E		(1 << 0) /* Enable all counters */
+#define ARMV8_PMCR_P		(1 << 1) /* Reset all counters */
+#define ARMV8_PMCR_C		(1 << 2) /* Cycle counter reset */
+#define ARMV8_PMCR_D		(1 << 3) /* CCNT counts every 64th cpu cycle */
+#define ARMV8_PMCR_X		(1 << 4) /* Export to ETM */
+#define ARMV8_PMCR_DP		(1 << 5) /* Disable CCNT if non-invasive debug */
+#define	ARMV8_PMCR_N_SHIFT	11	 /* Number of counters supported */
+#define	ARMV8_PMCR_N_MASK	0x1f
+#define	ARMV8_PMCR_MASK		0x3f	 /* Mask for writable bits */
+
+/*
+ * PMOVSR: counters overflow flag status reg
+ */
+#define	ARMV8_OVSR_MASK		0xffffffff	/* Mask for writable bits */
+#define	ARMV8_OVERFLOWED_MASK	ARMV8_OVSR_MASK
+
+/*
+ * PMXEVTYPER: Event selection reg
+ */
+#define	ARMV8_EVTYPE_MASK	0xc00000ff	/* Mask for writable bits */
+#define	ARMV8_EVTYPE_EVENT	0xff		/* Mask for EVENT bits */
+
+/*
+ * Event filters for PMUv3
+ */
+#define	ARMV8_EXCLUDE_EL1	(1 << 31)
+#define	ARMV8_EXCLUDE_EL0	(1 << 30)
+#define	ARMV8_INCLUDE_EL2	(1 << 27)
+
+static inline u32 armv8pmu_pmcr_read(void)
+{
+	u32 val;
+	asm volatile("mrs %0, pmcr_el0" : "=r" (val));
+	return val;
+}
+
+static inline void armv8pmu_pmcr_write(u32 val)
+{
+	val &= ARMV8_PMCR_MASK;
+	isb();
+	asm volatile("msr pmcr_el0, %0" :: "r" (val));
+}
+
+static inline int armv8pmu_has_overflowed(u32 pmovsr)
+{
+	return pmovsr & ARMV8_OVERFLOWED_MASK;
+}
+
+static inline int armv8pmu_counter_valid(int idx)
+{
+	return idx >= ARMV8_IDX_CYCLE_COUNTER && idx <= ARMV8_IDX_COUNTER_LAST;
+}
+
+static inline int armv8pmu_counter_has_overflowed(u32 pmnc, int idx)
+{
+	int ret = 0;
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u checking wrong counter %d overflow status\n",
+			smp_processor_id(), idx);
+	} else {
+		counter = ARMV8_IDX_TO_COUNTER(idx);
+		ret = pmnc & BIT(counter);
+	}
+
+	return ret;
+}
+
+static inline int armv8pmu_select_counter(int idx)
+{
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u selecting wrong PMNC counter %d\n",
+			smp_processor_id(), idx);
+		return -EINVAL;
+	}
+
+	counter = ARMV8_IDX_TO_COUNTER(idx);
+	asm volatile("msr pmselr_el0, %0" :: "r" (counter));
+	isb();
+
+	return idx;
+}
+
+static inline u32 armv8pmu_read_counter(int idx)
+{
+	u32 value = 0;
+
+	if (!armv8pmu_counter_valid(idx))
+		pr_err("CPU%u reading wrong counter %d\n",
+			smp_processor_id(), idx);
+	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
+		asm volatile("mrs %0, pmccntr_el0" : "=r" (value));
+	else if (armv8pmu_select_counter(idx) == idx)
+		asm volatile("mrs %0, pmxevcntr_el0" : "=r" (value));
+
+	return value;
+}
+
+static inline void armv8pmu_write_counter(int idx, u32 value)
+{
+	if (!armv8pmu_counter_valid(idx))
+		pr_err("CPU%u writing wrong counter %d\n",
+			smp_processor_id(), idx);
+	else if (idx == ARMV8_IDX_CYCLE_COUNTER)
+		asm volatile("msr pmccntr_el0, %0" :: "r" (value));
+	else if (armv8pmu_select_counter(idx) == idx)
+		asm volatile("msr pmxevcntr_el0, %0" :: "r" (value));
+}
+
+static inline void armv8pmu_write_evtype(int idx, u32 val)
+{
+	if (armv8pmu_select_counter(idx) == idx) {
+		val &= ARMV8_EVTYPE_MASK;
+		asm volatile("msr pmxevtyper_el0, %0" :: "r" (val));
+	}
+}
+
+static inline int armv8pmu_enable_counter(int idx)
+{
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u enabling wrong PMNC counter %d\n",
+			smp_processor_id(), idx);
+		return -EINVAL;
+	}
+
+	counter = ARMV8_IDX_TO_COUNTER(idx);
+	asm volatile("msr pmcntenset_el0, %0" :: "r" (BIT(counter)));
+	return idx;
+}
+
+static inline int armv8pmu_disable_counter(int idx)
+{
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u disabling wrong PMNC counter %d\n",
+			smp_processor_id(), idx);
+		return -EINVAL;
+	}
+
+	counter = ARMV8_IDX_TO_COUNTER(idx);
+	asm volatile("msr pmcntenclr_el0, %0" :: "r" (BIT(counter)));
+	return idx;
+}
+
+static inline int armv8pmu_enable_intens(int idx)
+{
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u enabling wrong PMNC counter IRQ enable %d\n",
+			smp_processor_id(), idx);
+		return -EINVAL;
+	}
+
+	counter = ARMV8_IDX_TO_COUNTER(idx);
+	asm volatile("msr pmintenset_el1, %0" :: "r" (BIT(counter)));
+	return idx;
+}
+
+static inline int armv8pmu_disable_intens(int idx)
+{
+	u32 counter;
+
+	if (!armv8pmu_counter_valid(idx)) {
+		pr_err("CPU%u disabling wrong PMNC counter IRQ enable %d\n",
+			smp_processor_id(), idx);
+		return -EINVAL;
+	}
+
+	counter = ARMV8_IDX_TO_COUNTER(idx);
+	asm volatile("msr pmintenclr_el1, %0" :: "r" (BIT(counter)));
+	isb();
+	/* Clear the overflow flag in case an interrupt is pending. */
+	asm volatile("msr pmovsclr_el0, %0" :: "r" (BIT(counter)));
+	isb();
+	return idx;
+}
+
+static inline u32 armv8pmu_getreset_flags(void)
+{
+	u32 value;
+
+	/* Read */
+	asm volatile("mrs %0, pmovsclr_el0" : "=r" (value));
+
+	/* Write to clear flags */
+	value &= ARMV8_OVSR_MASK;
+	asm volatile("msr pmovsclr_el0, %0" :: "r" (value));
+
+	return value;
+}
+
+static void armv8pmu_enable_event(struct hw_perf_event *hwc, int idx)
+{
+	unsigned long flags;
+	struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+	/*
+	 * Enable counter and interrupt, and set the counter to count
+	 * the event that we're interested in.
+	 */
+	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+
+	/*
+	 * Disable counter
+	 */
+	armv8pmu_disable_counter(idx);
+
+	/*
+	 * Set event (if destined for PMNx counters).
+	 */
+	armv8pmu_write_evtype(idx, hwc->config_base);
+
+	/*
+	 * Enable interrupt for this counter
+	 */
+	armv8pmu_enable_intens(idx);
+
+	/*
+	 * Enable counter
+	 */
+	armv8pmu_enable_counter(idx);
+
+	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static void armv8pmu_disable_event(struct hw_perf_event *hwc, int idx)
+{
+	unsigned long flags;
+	struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+	/*
+	 * Disable counter and interrupt
+	 */
+	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+
+	/*
+	 * Disable counter
+	 */
+	armv8pmu_disable_counter(idx);
+
+	/*
+	 * Disable interrupt for this counter
+	 */
+	armv8pmu_disable_intens(idx);
+
+	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static irqreturn_t armv8pmu_handle_irq(int irq_num, void *dev)
+{
+	u32 pmovsr;
+	struct perf_sample_data data;
+	struct pmu_hw_events *cpuc;
+	struct pt_regs *regs;
+	int idx;
+
+	/*
+	 * Get and reset the IRQ flags
+	 */
+	pmovsr = armv8pmu_getreset_flags();
+
+	/*
+	 * Did an overflow occur?
+	 */
+	if (!armv8pmu_has_overflowed(pmovsr))
+		return IRQ_NONE;
+
+	/*
+	 * Handle the counter(s) overflow(s)
+	 */
+	regs = get_irq_regs();
+
+	cpuc = &__get_cpu_var(cpu_hw_events);
+	for (idx = 0; idx < cpu_pmu->num_events; ++idx) {
+		struct perf_event *event = cpuc->events[idx];
+		struct hw_perf_event *hwc;
+
+		/* Ignore if we don't have an event. */
+		if (!event)
+			continue;
+
+		/*
+		 * We have a single interrupt for all counters. Check that
+		 * each counter has overflowed before we process it.
+		 */
+		if (!armv8pmu_counter_has_overflowed(pmovsr, idx))
+			continue;
+
+		hwc = &event->hw;
+		armpmu_event_update(event, hwc, idx);
+		perf_sample_data_init(&data, 0, hwc->last_period);
+		if (!armpmu_event_set_period(event, hwc, idx))
+			continue;
+
+		if (perf_event_overflow(event, &data, regs))
+			cpu_pmu->disable(hwc, idx);
+	}
+
+	/*
+	 * Handle the pending perf events.
+	 *
+	 * Note: this call *must* be run with interrupts disabled. For
+	 * platforms that can have the PMU interrupts raised as an NMI, this
+	 * will not work.
+	 */
+	irq_work_run();
+
+	return IRQ_HANDLED;
+}
+
+static void armv8pmu_start(void)
+{
+	unsigned long flags;
+	struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	/* Enable all counters */
+	armv8pmu_pmcr_write(armv8pmu_pmcr_read() | ARMV8_PMCR_E);
+	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static void armv8pmu_stop(void)
+{
+	unsigned long flags;
+	struct pmu_hw_events *events = cpu_pmu->get_hw_events();
+
+	raw_spin_lock_irqsave(&events->pmu_lock, flags);
+	/* Disable all counters */
+	armv8pmu_pmcr_write(armv8pmu_pmcr_read() & ~ARMV8_PMCR_E);
+	raw_spin_unlock_irqrestore(&events->pmu_lock, flags);
+}
+
+static int armv8pmu_get_event_idx(struct pmu_hw_events *cpuc,
+				  struct hw_perf_event *event)
+{
+	int idx;
+	unsigned long evtype = event->config_base & ARMV8_EVTYPE_EVENT;
+
+	/* Always place a CPU cycles event into the dedicated cycle counter. */
+	if (evtype == ARMV8_PMUV3_PERFCTR_CPU_CYCLES) {
+		if (test_and_set_bit(ARMV8_IDX_CYCLE_COUNTER, cpuc->used_mask))
+			return -EAGAIN;
+
+		return ARMV8_IDX_CYCLE_COUNTER;
+	}
+
+	/*
+	 * For anything other than a cycle counter, try and use
+	 * the events counters
+	 */
+	for (idx = ARMV8_IDX_COUNTER0; idx < cpu_pmu->num_events; ++idx) {
+		if (!test_and_set_bit(idx, cpuc->used_mask))
+			return idx;
+	}
+
+	/* The counters are all in use. */
+	return -EAGAIN;
+}
+
+/*
+ * Add an event filter to a given event. This will only work for PMUv3 PMUs.
+ */
+static int armv8pmu_set_event_filter(struct hw_perf_event *event,
+				     struct perf_event_attr *attr)
+{
+	unsigned long config_base = 0;
+
+	if (attr->exclude_idle)
+		return -EPERM;
+	if (attr->exclude_user)
+		config_base |= ARMV8_EXCLUDE_EL0;
+	if (attr->exclude_kernel)
+		config_base |= ARMV8_EXCLUDE_EL1;
+	if (!attr->exclude_hv)
+		config_base |= ARMV8_INCLUDE_EL2;
+
+	/*
+	 * Install the filter into config_base as this is used to
+	 * construct the event type.
+	 */
+	event->config_base = config_base;
+
+	return 0;
+}
+
+static void armv8pmu_reset(void *info)
+{
+	u32 idx, nb_cnt = cpu_pmu->num_events;
+
+	/* The counter and interrupt enable registers are unknown at reset. */
+	for (idx = ARMV8_IDX_CYCLE_COUNTER; idx < nb_cnt; ++idx)
+		armv8pmu_disable_event(NULL, idx);
+
+	/* Initialize & Reset PMNC: C and P bits. */
+	armv8pmu_pmcr_write(ARMV8_PMCR_P | ARMV8_PMCR_C);
+
+	/* Disable access from userspace. */
+	asm volatile("msr pmuserenr_el0, %0" :: "r" (0));
+}
+
+static int armv8_pmuv3_map_event(struct perf_event *event)
+{
+	return map_cpu_event(event, &armv8_pmuv3_perf_map,
+				&armv8_pmuv3_perf_cache_map, 0xFF);
+}
+
+static struct arm_pmu armv8pmu = {
+	.handle_irq		= armv8pmu_handle_irq,
+	.enable			= armv8pmu_enable_event,
+	.disable		= armv8pmu_disable_event,
+	.read_counter		= armv8pmu_read_counter,
+	.write_counter		= armv8pmu_write_counter,
+	.get_event_idx		= armv8pmu_get_event_idx,
+	.start			= armv8pmu_start,
+	.stop			= armv8pmu_stop,
+	.reset			= armv8pmu_reset,
+	.max_period		= (1LLU << 32) - 1,
+};
+
+static u32 __init armv8pmu_read_num_pmnc_events(void)
+{
+	u32 nb_cnt;
+
+	/* Read the number of CNTx counters supported from PMCR.N */
+	nb_cnt = (armv8pmu_pmcr_read() >> ARMV8_PMCR_N_SHIFT) & ARMV8_PMCR_N_MASK;
+
+	/* Add the CPU cycles counter and return */
+	return nb_cnt + 1;
+}
+
+static struct arm_pmu *__init armv8_pmuv3_pmu_init(void)
+{
+	armv8pmu.name			= "arm/armv8-pmuv3";
+	armv8pmu.map_event		= armv8_pmuv3_map_event;
+	armv8pmu.num_events		= armv8pmu_read_num_pmnc_events();
+	armv8pmu.set_event_filter	= armv8pmu_set_event_filter;
+	return &armv8pmu;
+}
+
+/*
+ * Ensure the PMU has sane values out of reset.
+ * This requires SMP to be available, so exists as a separate initcall.
+ */
+static int __init
+cpu_pmu_reset(void)
+{
+	if (cpu_pmu && cpu_pmu->reset)
+		return on_each_cpu(cpu_pmu->reset, NULL, 1);
+	return 0;
+}
+arch_initcall(cpu_pmu_reset);
+
+/*
+ * PMU platform driver and devicetree bindings.
+ */
+static struct of_device_id armpmu_of_device_ids[] = {
+	{.compatible = "arm,armv8-pmuv3"},
+	{},
+};
+
+static int __devinit armpmu_device_probe(struct platform_device *pdev)
+{
+	if (!cpu_pmu)
+		return -ENODEV;
+
+	cpu_pmu->plat_device = pdev;
+	return 0;
+}
+
+static struct platform_driver armpmu_driver = {
+	.driver		= {
+		.name	= "arm-pmu",
+		.of_match_table = armpmu_of_device_ids,
+	},
+	.probe		= armpmu_device_probe,
+};
+
+static int __init register_pmu_driver(void)
+{
+	return platform_driver_register(&armpmu_driver);
+}
+device_initcall(register_pmu_driver);
+
+static struct pmu_hw_events *armpmu_get_cpu_events(void)
+{
+	return &__get_cpu_var(cpu_hw_events);
+}
+
+static void __init cpu_pmu_init(struct arm_pmu *armpmu)
+{
+	int cpu;
+	for_each_possible_cpu(cpu) {
+		struct pmu_hw_events *events = &per_cpu(cpu_hw_events, cpu);
+		events->events = per_cpu(hw_events, cpu);
+		events->used_mask = per_cpu(used_mask, cpu);
+		raw_spin_lock_init(&events->pmu_lock);
+	}
+	armpmu->get_hw_events = armpmu_get_cpu_events;
+}
+
+static int __init init_hw_perf_events(void)
+{
+	u64 dfr = read_cpuid(ID_AA64DFR0_EL1);
+
+	switch ((dfr >> 8) & 0xf) {
+	case 0x1:	/* PMUv3 */
+		cpu_pmu = armv8_pmuv3_pmu_init();
+		break;
+	}
+
+	if (cpu_pmu) {
+		pr_info("enabled with %s PMU driver, %d counters available\n",
+			cpu_pmu->name, cpu_pmu->num_events);
+		cpu_pmu_init(cpu_pmu);
+		armpmu_register(cpu_pmu, "cpu", PERF_TYPE_RAW);
+	} else {
+		pr_info("no hardware support available\n");
+	}
+
+	return 0;
+}
+early_initcall(init_hw_perf_events);
+
+/*
+ * Callchain handling code.
+ */
+struct frame_tail {
+	struct frame_tail   __user *fp;
+	unsigned long	    lr;
+} __attribute__((packed));
+
+/*
+ * Get the return address for a single stackframe and return a pointer to the
+ * next frame tail.
+ */
+static struct frame_tail __user *
+user_backtrace(struct frame_tail __user *tail,
+	       struct perf_callchain_entry *entry)
+{
+	struct frame_tail buftail;
+	unsigned long err;
+
+	/* Check that a whole struct frame_tail is accessible at tail */
+	if (!access_ok(VERIFY_READ, tail, sizeof(buftail)))
+		return NULL;
+
+	pagefault_disable();
+	err = __copy_from_user_inatomic(&buftail, tail, sizeof(buftail));
+	pagefault_enable();
+
+	if (err)
+		return NULL;
+
+	perf_callchain_store(entry, buftail.lr);
+
+	/*
+	 * Frame pointers should strictly progress back up the stack
+	 * (towards higher addresses).
+	 */
+	if (tail >= buftail.fp)
+		return NULL;
+
+	return buftail.fp;
+}
+
+void perf_callchain_user(struct perf_callchain_entry *entry,
+			 struct pt_regs *regs)
+{
+	struct frame_tail __user *tail;
+
+	tail = (struct frame_tail __user *)regs->regs[29];
+
+	while (entry->nr < PERF_MAX_STACK_DEPTH &&
+	       tail && !((unsigned long)tail & 0xf))
+		tail = user_backtrace(tail, entry);
+}
+
+/*
+ * Gets called by walk_stackframe() for every stackframe. This will be called
+ * whilst unwinding the stackframe and is like a subroutine return so we use
+ * the PC.
+ */
+static int callchain_trace(struct stackframe *frame, void *data)
+{
+	struct perf_callchain_entry *entry = data;
+	perf_callchain_store(entry, frame->pc);
+	return 0;
+}
+
+void perf_callchain_kernel(struct perf_callchain_entry *entry,
+			   struct pt_regs *regs)
+{
+	struct stackframe frame;
+
+	frame.fp = regs->regs[29];
+	frame.sp = regs->sp;
+	frame.pc = regs->pc;
+	walk_stackframe(&frame, callchain_trace, entry);
+}
diff --git a/tools/perf/perf.h b/tools/perf/perf.h
index f960ccb..8c36763 100644
--- a/tools/perf/perf.h
+++ b/tools/perf/perf.h
@@ -88,6 +88,12 @@ void get_term_dimensions(struct winsize *ws);
 #define CPUINFO_PROC	"Processor"
 #endif
 
+#ifdef __aarch64__
+#include "../../arch/arm64/include/asm/unistd.h"
+#define rmb()		asm volatile("dmb ld" ::: "memory")
+#define cpu_relax()	asm volatile("yield" ::: "memory")
+#endif
+
 #ifdef __mips__
 #include "../../arch/mips/include/asm/unistd.h"
 #define rmb()		asm volatile(					\
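
A note on the counter index mapping in perf_event.c above, since it is
easy to misread: perf index 0 is the cycle counter and indices 1..N are
the event counters, so ARMV8_IDX_TO_COUNTER() subtracts one and masks
with 31. For index 0 that wraps around to 31, which is the bit position
PMUv3 uses for the cycle counter in the enable, interrupt-enable and
overflow registers. The following is a minimal user-space sketch of that
arithmetic only; the function names are made up for illustration and it
does not touch any system register.

```c
#include <assert.h>
#include <stdint.h>

#define IDX_CYCLE_COUNTER	0
#define IDX_COUNTER0		1
#define MAX_COUNTERS		32
#define COUNTER_MASK		(MAX_COUNTERS - 1)
#define PMCR_N_SHIFT		11
#define PMCR_N_MASK		0x1f

/* The "& COUNTER_MASK" wrap-around only works for a power of two. */
static_assert((MAX_COUNTERS & (MAX_COUNTERS - 1)) == 0,
	      "MAX_COUNTERS must be a power of two");

/* Hypothetical helper mirroring ARMV8_IDX_TO_COUNTER(). */
static inline uint32_t idx_to_counter(int idx)
{
	/* idx 0 becomes 0xffffffff before masking, i.e. counter 31. */
	return (uint32_t)(idx - IDX_COUNTER0) & COUNTER_MASK;
}

/* Bit to write to a set/clear style register for a perf index. */
static inline uint32_t counter_bit(int idx)
{
	return UINT32_C(1) << idx_to_counter(idx);
}

/* Same test as armv8pmu_counter_has_overflowed(), minus validation. */
static inline int counter_has_overflowed(uint32_t pmovsr, int idx)
{
	return (pmovsr & counter_bit(idx)) != 0;
}

/* PMCR.N event counters plus the cycle counter. */
static inline uint32_t num_events_from_pmcr(uint32_t pmcr)
{
	return ((pmcr >> PMCR_N_SHIFT) & PMCR_N_MASK) + 1;
}
```

With an overflow status of 0x80000001, indices 0 (cycle counter) and 1
(event counter 0) are reported as overflowed and index 2 is not.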



* [PATCH v2 26/31] arm64: Miscellaneous library functions
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (24 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 25/31] arm64: Performance counters support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:21   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 27/31] arm64: Loadable modules Catalin Marinas
                   ` (5 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel
  Cc: linux-kernel, Arnd Bergmann, Marc Zyngier, Will Deacon

From: Marc Zyngier <marc.zyngier@arm.com>

This patch adds udelay, memory and bit operations together with the
ksyms exports.

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/bitops.h  |   74 ++++++++++++++++++++++++++++
 arch/arm64/include/asm/syscall.h |  101 ++++++++++++++++++++++++++++++++++++++
 arch/arm64/kernel/arm64ksyms.c   |   55 ++++++++++++++++++++
 arch/arm64/lib/Makefile          |    5 ++
 arch/arm64/lib/bitops.c          |   25 +++++++++
 arch/arm64/lib/clear_page.S      |   39 +++++++++++++++
 arch/arm64/lib/copy_page.S       |   46 +++++++++++++++++
 arch/arm64/lib/delay.c           |   55 ++++++++++++++++++++
 8 files changed, 400 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/bitops.h
 create mode 100644 arch/arm64/include/asm/syscall.h
 create mode 100644 arch/arm64/kernel/arm64ksyms.c
 create mode 100644 arch/arm64/lib/Makefile
 create mode 100644 arch/arm64/lib/bitops.c
 create mode 100644 arch/arm64/lib/clear_page.S
 create mode 100644 arch/arm64/lib/copy_page.S
 create mode 100644 arch/arm64/lib/delay.c

diff --git a/arch/arm64/include/asm/bitops.h b/arch/arm64/include/asm/bitops.h
new file mode 100644
index 0000000..67df4d2
--- /dev/null
+++ b/arch/arm64/include/asm/bitops.h
@@ -0,0 +1,74 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_BITOPS_H
+#define __ASM_BITOPS_H
+
+#include <linux/compiler.h>
+
+#include <asm/barrier.h>
+
+/*
+ * clear_bit may not imply a memory barrier
+ */
+#ifndef smp_mb__before_clear_bit
+#define smp_mb__before_clear_bit()	smp_mb()
+#define smp_mb__after_clear_bit()	smp_mb()
+#endif
+
+/*
+ * Use compiler builtins for simple inline operations.
+ */
+static inline unsigned long __ffs(unsigned long word)
+{
+	return __builtin_ffsl(word) - 1;
+}
+
+static inline int ffs(int x)
+{
+	return __builtin_ffs(x);
+}
+
+static inline unsigned long __fls(unsigned long word)
+{
+	return BITS_PER_LONG - 1 - __builtin_clzl(word);
+}
+
+static inline int fls(int x)
+{
+	return x ? sizeof(x) * BITS_PER_BYTE - __builtin_clz(x) : 0;
+}
+
+/*
+ * Mainly use the generic routines for now.
+ */
+#ifndef _LINUX_BITOPS_H
+#error only <linux/bitops.h> can be included directly
+#endif
+
+#include <asm-generic/bitops/ffz.h>
+#include <asm-generic/bitops/fls64.h>
+#include <asm-generic/bitops/find.h>
+
+#include <asm-generic/bitops/sched.h>
+#include <asm-generic/bitops/hweight.h>
+#include <asm-generic/bitops/lock.h>
+
+#include <asm-generic/bitops/atomic.h>
+#include <asm-generic/bitops/non-atomic.h>
+#include <asm-generic/bitops/le.h>
+#include <asm-generic/bitops/ext2-atomic.h>
+
+#endif /* __ASM_BITOPS_H */
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
new file mode 100644
index 0000000..89c047f
--- /dev/null
+++ b/arch/arm64/include/asm/syscall.h
@@ -0,0 +1,101 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SYSCALL_H
+#define __ASM_SYSCALL_H
+
+#include <linux/err.h>
+
+
+static inline int syscall_get_nr(struct task_struct *task,
+				 struct pt_regs *regs)
+{
+	return regs->syscallno;
+}
+
+static inline void syscall_rollback(struct task_struct *task,
+				    struct pt_regs *regs)
+{
+	regs->regs[0] = regs->orig_x0;
+}
+
+
+static inline long syscall_get_error(struct task_struct *task,
+				     struct pt_regs *regs)
+{
+	unsigned long error = regs->regs[0];
+	return IS_ERR_VALUE(error) ? error : 0;
+}
+
+static inline long syscall_get_return_value(struct task_struct *task,
+					    struct pt_regs *regs)
+{
+	return regs->regs[0];
+}
+
+static inline void syscall_set_return_value(struct task_struct *task,
+					    struct pt_regs *regs,
+					    int error, long val)
+{
+	regs->regs[0] = (long) error ? error : val;
+}
+
+#define SYSCALL_MAX_ARGS 6
+
+static inline void syscall_get_arguments(struct task_struct *task,
+					 struct pt_regs *regs,
+					 unsigned int i, unsigned int n,
+					 unsigned long *args)
+{
+	if (i + n > SYSCALL_MAX_ARGS) {
+		unsigned long *args_bad = args + SYSCALL_MAX_ARGS - i;
+		unsigned int n_bad = n + i - SYSCALL_MAX_ARGS;
+		pr_warning("%s called with max args %d, handling only %d\n",
+			   __func__, i + n, SYSCALL_MAX_ARGS);
+		memset(args_bad, 0, n_bad * sizeof(args[0]));
+	}
+
+	if (i == 0) {
+		args[0] = regs->orig_x0;
+		args++;
+		i++;
+		n--;
+	}
+
+	memcpy(args, &regs->regs[i], n * sizeof(args[0]));
+}
+
+static inline void syscall_set_arguments(struct task_struct *task,
+					 struct pt_regs *regs,
+					 unsigned int i, unsigned int n,
+					 const unsigned long *args)
+{
+	if (i + n > SYSCALL_MAX_ARGS) {
+		pr_warning("%s called with max args %d, handling only %d\n",
+			   __func__, i + n, SYSCALL_MAX_ARGS);
+		n = SYSCALL_MAX_ARGS - i;
+	}
+
+	if (i == 0) {
+		regs->orig_x0 = args[0];
+		args++;
+		i++;
+		n--;
+	}
+
+	memcpy(&regs->regs[i], args, n * sizeof(args[0]));
+}
+
+#endif	/* __ASM_SYSCALL_H */
diff --git a/arch/arm64/kernel/arm64ksyms.c b/arch/arm64/kernel/arm64ksyms.c
new file mode 100644
index 0000000..4631573
--- /dev/null
+++ b/arch/arm64/kernel/arm64ksyms.c
@@ -0,0 +1,55 @@
+/*
+ * Based on arch/arm/kernel/armksyms.c
+ *
+ * Copyright (C) 2000 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/sched.h>
+#include <linux/string.h>
+#include <linux/cryptohash.h>
+#include <linux/delay.h>
+#include <linux/in6.h>
+#include <linux/syscalls.h>
+#include <linux/uaccess.h>
+#include <linux/io.h>
+
+#include <asm/checksum.h>
+
+	/* user mem (segment) */
+EXPORT_SYMBOL(__strnlen_user);
+EXPORT_SYMBOL(__strncpy_from_user);
+
+EXPORT_SYMBOL(copy_page);
+
+EXPORT_SYMBOL(__copy_from_user);
+EXPORT_SYMBOL(__copy_to_user);
+EXPORT_SYMBOL(__clear_user);
+
+EXPORT_SYMBOL(__get_user_1);
+EXPORT_SYMBOL(__get_user_2);
+EXPORT_SYMBOL(__get_user_4);
+
+EXPORT_SYMBOL(__put_user_1);
+EXPORT_SYMBOL(__put_user_2);
+EXPORT_SYMBOL(__put_user_4);
+EXPORT_SYMBOL(__put_user_8);
+
+	/* bitops */
+EXPORT_SYMBOL(__atomic_hash);
+
+	/* physical memory */
+EXPORT_SYMBOL(memstart_addr);
diff --git a/arch/arm64/lib/Makefile b/arch/arm64/lib/Makefile
new file mode 100644
index 0000000..ae71bae
--- /dev/null
+++ b/arch/arm64/lib/Makefile
@@ -0,0 +1,5 @@
+lib-y		:= bitops.o delay.o					\
+		   strncpy_from_user.o strnlen_user.o			\
+		   clear_user.o getuser.o putuser.o			\
+		   copy_from_user.o copy_to_user.o copy_in_user.o	\
+		   copy_page.o clear_page.o
diff --git a/arch/arm64/lib/bitops.c b/arch/arm64/lib/bitops.c
new file mode 100644
index 0000000..aa4965e
--- /dev/null
+++ b/arch/arm64/lib/bitops.c
@@ -0,0 +1,25 @@
+/*
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/kernel.h>
+#include <linux/spinlock.h>
+#include <linux/atomic.h>
+
+#ifdef CONFIG_SMP
+arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
+       [0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
+};
+#endif
diff --git a/arch/arm64/lib/clear_page.S b/arch/arm64/lib/clear_page.S
new file mode 100644
index 0000000..ef08e90
--- /dev/null
+++ b/arch/arm64/lib/clear_page.S
@@ -0,0 +1,39 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/assembler.h>
+#include <asm/page.h>
+
+/*
+ * Clear page @dest
+ *
+ * Parameters:
+ *	x0 - dest
+ */
+ENTRY(clear_page)
+	mrs	x1, dczid_el0
+	and	w1, w1, #0xf
+	mov	x2, #4
+	lsl	x1, x2, x1
+
+1:	dc	zva, x0
+	add	x0, x0, x1
+	tst	x0, #(PAGE_SIZE - 1)
+	b.ne	1b
+	ret
+ENDPROC(clear_page)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
new file mode 100644
index 0000000..512b9a7
--- /dev/null
+++ b/arch/arm64/lib/copy_page.S
@@ -0,0 +1,46 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/assembler.h>
+#include <asm/page.h>
+
+/*
+ * Copy a page from src to dest (both are page aligned)
+ *
+ * Parameters:
+ *	x0 - dest
+ *	x1 - src
+ */
+ENTRY(copy_page)
+	/* Assume cache line size is 64 bytes. */
+	prfm	pldl1strm, [x1, #64]
+1:	ldp	x2, x3, [x1]
+	ldp	x4, x5, [x1, #16]
+	ldp	x6, x7, [x1, #32]
+	ldp	x8, x9, [x1, #48]
+	add	x1, x1, #64
+	prfm	pldl1strm, [x1, #64]
+	stnp	x2, x3, [x0]
+	stnp	x4, x5, [x0, #16]
+	stnp	x6, x7, [x0, #32]
+	stnp	x8, x9, [x0, #48]
+	add	x0, x0, #64
+	tst	x1, #(PAGE_SIZE - 1)
+	b.ne	1b
+	ret
+ENDPROC(copy_page)
diff --git a/arch/arm64/lib/delay.c b/arch/arm64/lib/delay.c
new file mode 100644
index 0000000..dad4ec9
--- /dev/null
+++ b/arch/arm64/lib/delay.c
@@ -0,0 +1,55 @@
+/*
+ * Delay loops based on the OpenRISC implementation.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/delay.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/timex.h>
+
+void __delay(unsigned long cycles)
+{
+	cycles_t start = get_cycles();
+
+	while ((get_cycles() - start) < cycles)
+		cpu_relax();
+}
+EXPORT_SYMBOL(__delay);
+
+inline void __const_udelay(unsigned long xloops)
+{
+	unsigned long loops;
+
+	loops = xloops * loops_per_jiffy * HZ;
+	__delay(loops >> 32);
+}
+EXPORT_SYMBOL(__const_udelay);
+
+void __udelay(unsigned long usecs)
+{
+	__const_udelay(usecs * 0x10C7UL); /* 2**32 / 1000000 (rounded up) */
+}
+EXPORT_SYMBOL(__udelay);
+
+void __ndelay(unsigned long nsecs)
+{
+	__const_udelay(nsecs * 0x5UL); /* 2**32 / 1000000000 (rounded up) */
+}
+EXPORT_SYMBOL(__ndelay);
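
For readers following the fixed-point arithmetic in delay.c above: the
multipliers 0x10C7 and 0x5 are 2**32 / 10**6 and 2**32 / 10**9 rounded
up, so that (xloops * loops_per_jiffy * HZ) >> 32 yields a cycle count.
Below is a rough user-space sketch of that calculation. The function
names are hypothetical, and the loops_per_jiffy and HZ values in the
example are invented rather than taken from any real platform.

```c
#include <assert.h>
#include <stdint.h>

#define UDELAY_MULT	0x10C7ULL	/* 2**32 / 1000000, rounded up */
#define NDELAY_MULT	0x5ULL		/* 2**32 / 1000000000, rounded up */

static_assert(UDELAY_MULT == ((1ULL << 32) + 999999ULL) / 1000000ULL,
	      "udelay multiplier is ceil(2**32 / 10**6)");
static_assert(NDELAY_MULT == ((1ULL << 32) + 999999999ULL) / 1000000000ULL,
	      "ndelay multiplier is ceil(2**32 / 10**9)");

/*
 * Same arithmetic as __const_udelay(): lpj * hz is the number of
 * cycles per second, and xloops carries the delay scaled by 2**32.
 */
static uint64_t const_delay_cycles(uint64_t xloops, uint64_t lpj, uint64_t hz)
{
	return (xloops * lpj * hz) >> 32;
}

static uint64_t udelay_cycles(uint64_t usecs, uint64_t lpj, uint64_t hz)
{
	return const_delay_cycles(usecs * UDELAY_MULT, lpj, hz);
}

static uint64_t ndelay_cycles(uint64_t nsecs, uint64_t lpj, uint64_t hz)
{
	return const_delay_cycles(nsecs * NDELAY_MULT, lpj, hz);
}
```

Assuming loops_per_jiffy = 1000000 and HZ = 100 (a 100MHz counter),
udelay_cycles(10, ...) gives 1000 cycles. ndelay_cycles(1000, ...) gives
116 rather than 100, because rounding 4.29 up to 5 makes short
nanosecond delays overshoot by roughly 16%.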



* [PATCH v2 27/31] arm64: Loadable modules
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (25 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 26/31] arm64: Miscellaneous library functions Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:23   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 28/31] arm64: Generic timers support Catalin Marinas
                   ` (4 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

From: Will Deacon <will.deacon@arm.com>

This patch adds support for loadable modules. Loadable modules are
loaded 64MB below the kernel image due to branch relocation restrictions
(see Documentation/arm64/memory.txt).
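
To illustrate the branch restriction: A64 B and BL carry a signed 26-bit
word offset, so a direct branch reaches roughly +/-128MB from the
instruction. The sketch below only shows that range check and the
immediate encoding. The helper names are hypothetical and this is not
the code in module.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SZ_128M		(INT64_C(128) << 20)
#define IMM26_MASK	UINT32_C(0x03ffffff)

static_assert(SZ_128M == (INT64_C(1) << 27),
	      "imm26 * 4 bytes covers 2**27 bytes in each direction");

/* True if a B/BL at 'place' can reach 'target' directly. */
static bool branch_in_range(uint64_t place, uint64_t target)
{
	int64_t offset = (int64_t)(target - place);

	/* Offsets are word aligned and must fit in [-128MB, +128MB). */
	return (offset & 3) == 0 && offset >= -SZ_128M && offset < SZ_128M;
}

/* Insert a byte offset into the imm26 field of a B/BL instruction. */
static uint32_t encode_imm26(uint32_t insn, int64_t offset)
{
	/* Shift as unsigned so negative offsets are well defined. */
	uint32_t imm = (uint32_t)((uint64_t)offset >> 2) & IMM26_MASK;

	return (insn & ~IMM26_MASK) | imm;
}
```

For example, encode_imm26(0x94000000, 8) (a BL two instructions ahead)
yields 0x94000002, and an offset of -4 yields 0x97ffffff.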

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/module.h |   23 ++
 arch/arm64/kernel/module.c      |  456 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 479 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/module.h
 create mode 100644 arch/arm64/kernel/module.c

diff --git a/arch/arm64/include/asm/module.h b/arch/arm64/include/asm/module.h
new file mode 100644
index 0000000..e80e232
--- /dev/null
+++ b/arch/arm64/include/asm/module.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_MODULE_H
+#define __ASM_MODULE_H
+
+#include <asm-generic/module.h>
+
+#define MODULE_ARCH_VERMAGIC	"aarch64"
+
+#endif /* __ASM_MODULE_H */
diff --git a/arch/arm64/kernel/module.c b/arch/arm64/kernel/module.c
new file mode 100644
index 0000000..ca0e3d5
--- /dev/null
+++ b/arch/arm64/kernel/module.c
@@ -0,0 +1,456 @@
+/*
+ * AArch64 loadable module support.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/bitops.h>
+#include <linux/elf.h>
+#include <linux/gfp.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/moduleloader.h>
+#include <linux/vmalloc.h>
+
+void *module_alloc(unsigned long size)
+{
+	return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
+				    GFP_KERNEL, PAGE_KERNEL_EXEC, -1,
+				    __builtin_return_address(0));
+}
+
+enum aarch64_reloc_op {
+	RELOC_OP_NONE,
+	RELOC_OP_ABS,
+	RELOC_OP_PREL,
+	RELOC_OP_PAGE,
+};
+
+static u64 do_reloc(enum aarch64_reloc_op reloc_op, void *place, u64 val)
+{
+	switch (reloc_op) {
+	case RELOC_OP_ABS:
+		return val;
+	case RELOC_OP_PREL:
+		return val - (u64)place;
+	case RELOC_OP_PAGE:
+		return (val & ~0xfff) - ((u64)place & ~0xfff);
+	case RELOC_OP_NONE:
+		return 0;
+	}
+
+	pr_err("do_reloc: unknown relocation operation %d\n", reloc_op);
+	return 0;
+}
+
+static int reloc_data(enum aarch64_reloc_op op, void *place, u64 val, int len)
+{
+	u64 imm_mask = len < 64 ? BIT(len) - 1 : ~0ULL;
+	s64 sval = do_reloc(op, place, val);
+
+	switch (len) {
+	case 16:
+		*(s16 *)place = sval;
+		break;
+	case 32:
+		*(s32 *)place = sval;
+		break;
+	case 64:
+		*(s64 *)place = sval;
+		break;
+	default:
+		pr_err("Invalid length (%d) for data relocation\n", len);
+		return 0;
+	}
+
+	/*
+	 * Extract the upper value bits (including the sign bit) and
+	 * shift them to bit 0.
+	 */
+	sval = (s64)(sval & ~(imm_mask >> 1)) >> (len - 1);
+
+	/*
+	 * Overflow has occurred if the value is not representable in
+	 * len bits (i.e. the bottom len bits are not sign-extended and
+	 * the top bits are not all zero).
+	 */
+	if ((u64)(sval + 1) > 2)
+		return -ERANGE;
+
+	return 0;
+}
+
+enum aarch64_imm_type {
+	INSN_IMM_MOVNZ,
+	INSN_IMM_MOVK,
+	INSN_IMM_ADR,
+	INSN_IMM_26,
+	INSN_IMM_19,
+	INSN_IMM_16,
+	INSN_IMM_14,
+	INSN_IMM_12,
+	INSN_IMM_9,
+};
+
+static u32 encode_insn_immediate(enum aarch64_imm_type type, u32 insn, u64 imm)
+{
+	u32 immlo, immhi, lomask, himask, mask;
+	int shift;
+
+	switch (type) {
+	case INSN_IMM_MOVNZ:
+		/*
+		 * For signed MOVW relocations, we have to manipulate the
+		 * instruction encoding depending on whether or not the
+		 * immediate is less than zero.
+		 */
+		insn &= ~(3 << 29);
+		if ((s64)imm >= 0) {
+			/* >=0: Set the instruction to MOVZ (opcode 10b). */
+			insn |= 2 << 29;
+		} else {
+			/*
+			 * <0: Set the instruction to MOVN (opcode 00b).
+			 *     Since we've masked the opcode already, we
+			 *     don't need to do anything other than
+			 *     inverting the new immediate field.
+			 */
+			imm = ~imm;
+		}
+	case INSN_IMM_MOVK:
+		mask = BIT(16) - 1;
+		shift = 5;
+		break;
+	case INSN_IMM_ADR:
+		lomask = 0x3;
+		himask = 0x7ffff;
+		immlo = imm & lomask;
+		imm >>= 2;
+		immhi = imm & himask;
+		imm = (immlo << 24) | (immhi);
+		mask = (lomask << 24) | (himask);
+		shift = 5;
+		break;
+	case INSN_IMM_26:
+		mask = BIT(26) - 1;
+		shift = 0;
+		break;
+	case INSN_IMM_19:
+		mask = BIT(19) - 1;
+		shift = 5;
+		break;
+	case INSN_IMM_16:
+		mask = BIT(16) - 1;
+		shift = 5;
+		break;
+	case INSN_IMM_14:
+		mask = BIT(14) - 1;
+		shift = 5;
+		break;
+	case INSN_IMM_12:
+		mask = BIT(12) - 1;
+		shift = 10;
+		break;
+	case INSN_IMM_9:
+		mask = BIT(9) - 1;
+		shift = 12;
+		break;
+	default:
+		pr_err("encode_insn_immediate: unknown immediate encoding %d\n",
+			type);
+		return 0;
+	}
+
+	/* Update the immediate field. */
+	insn &= ~(mask << shift);
+	insn |= (imm & mask) << shift;
+
+	return insn;
+}
+
+static int reloc_insn_movw(enum aarch64_reloc_op op, void *place, u64 val,
+			   int lsb, enum aarch64_imm_type imm_type)
+{
+	u64 imm, limit = 0;
+	s64 sval;
+	u32 insn = *(u32 *)place;
+
+	sval = do_reloc(op, place, val);
+	sval >>= lsb;
+	imm = sval & 0xffff;
+
+	/* Update the instruction with the new encoding. */
+	*(u32 *)place = encode_insn_immediate(imm_type, insn, imm);
+
+	/* Shift out the immediate field. */
+	sval >>= 16;
+
+	/*
+	 * For unsigned immediates, the overflow check is straightforward.
+	 * For signed immediates, the sign bit is actually the bit past the
+	 * most significant bit of the field.
+	 * The INSN_IMM_16 immediate type is unsigned.
+	 */
+	if (imm_type != INSN_IMM_16) {
+		sval++;
+		limit++;
+	}
+
+	/* Check the upper bits depending on the sign of the immediate. */
+	if ((u64)sval > limit)
+		return -ERANGE;
+
+	return 0;
+}
+
+static int reloc_insn_imm(enum aarch64_reloc_op op, void *place, u64 val,
+			  int lsb, int len, enum aarch64_imm_type imm_type)
+{
+	u64 imm, imm_mask;
+	s64 sval;
+	u32 insn = *(u32 *)place;
+
+	/* Calculate the relocation value. */
+	sval = do_reloc(op, place, val);
+	sval >>= lsb;
+
+	/* Extract the value bits and shift them to bit 0. */
+	imm_mask = (BIT(lsb + len) - 1) >> lsb;
+	imm = sval & imm_mask;
+
+	/* Update the instruction's immediate field. */
+	*(u32 *)place = encode_insn_immediate(imm_type, insn, imm);
+
+	/*
+	 * Extract the upper value bits (including the sign bit) and
+	 * shift them to bit 0.
+	 */
+	sval = (s64)(sval & ~(imm_mask >> 1)) >> (len - 1);
+
+	/*
+	 * Overflow has occurred if the upper bits are not all equal to
+	 * the sign bit of the value.
+	 */
+	if ((u64)(sval + 1) >= 2)
+		return -ERANGE;
+
+	return 0;
+}
+
+int apply_relocate_add(Elf64_Shdr *sechdrs,
+		       const char *strtab,
+		       unsigned int symindex,
+		       unsigned int relsec,
+		       struct module *me)
+{
+	unsigned int i;
+	int ovf;
+	bool overflow_check;
+	Elf64_Sym *sym;
+	void *loc;
+	u64 val;
+	Elf64_Rela *rel = (void *)sechdrs[relsec].sh_addr;
+
+	for (i = 0; i < sechdrs[relsec].sh_size / sizeof(*rel); i++) {
+		/* loc corresponds to P in the AArch64 ELF document. */
+		loc = (void *)sechdrs[sechdrs[relsec].sh_info].sh_addr
+			+ rel[i].r_offset;
+
+		/* sym is the ELF symbol we're referring to. */
+		sym = (Elf64_Sym *)sechdrs[symindex].sh_addr
+			+ ELF64_R_SYM(rel[i].r_info);
+
+		/* val corresponds to (S + A) in the AArch64 ELF document. */
+		val = sym->st_value + rel[i].r_addend;
+
+		/* Check for overflow by default. */
+		overflow_check = true;
+
+		/* Perform the static relocation. */
+		switch (ELF64_R_TYPE(rel[i].r_info)) {
+		/* Null relocations. */
+		case R_ARM_NONE:
+		case R_AARCH64_NONE:
+			ovf = 0;
+			break;
+
+		/* Data relocations. */
+		case R_AARCH64_ABS64:
+			overflow_check = false;
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 64);
+			break;
+		case R_AARCH64_ABS32:
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 32);
+			break;
+		case R_AARCH64_ABS16:
+			ovf = reloc_data(RELOC_OP_ABS, loc, val, 16);
+			break;
+		case R_AARCH64_PREL64:
+			overflow_check = false;
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 64);
+			break;
+		case R_AARCH64_PREL32:
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 32);
+			break;
+		case R_AARCH64_PREL16:
+			ovf = reloc_data(RELOC_OP_PREL, loc, val, 16);
+			break;
+
+		/* MOVW instruction relocations. */
+		case R_AARCH64_MOVW_UABS_G0_NC:
+			overflow_check = false;
+		case R_AARCH64_MOVW_UABS_G0:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
+					      INSN_IMM_16);
+			break;
+		case R_AARCH64_MOVW_UABS_G1_NC:
+			overflow_check = false;
+		case R_AARCH64_MOVW_UABS_G1:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
+					      INSN_IMM_16);
+			break;
+		case R_AARCH64_MOVW_UABS_G2_NC:
+			overflow_check = false;
+		case R_AARCH64_MOVW_UABS_G2:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
+					      INSN_IMM_16);
+			break;
+		case R_AARCH64_MOVW_UABS_G3:
+			/* We're using the top bits so we can't overflow. */
+			overflow_check = false;
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 48,
+					      INSN_IMM_16);
+			break;
+		case R_AARCH64_MOVW_SABS_G0:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 0,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_SABS_G1:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 16,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_SABS_G2:
+			ovf = reloc_insn_movw(RELOC_OP_ABS, loc, val, 32,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_PREL_G0_NC:
+			overflow_check = false;
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
+					      INSN_IMM_MOVK);
+			break;
+		case R_AARCH64_MOVW_PREL_G0:
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 0,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_PREL_G1_NC:
+			overflow_check = false;
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
+					      INSN_IMM_MOVK);
+			break;
+		case R_AARCH64_MOVW_PREL_G1:
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 16,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_PREL_G2_NC:
+			overflow_check = false;
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
+					      INSN_IMM_MOVK);
+			break;
+		case R_AARCH64_MOVW_PREL_G2:
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 32,
+					      INSN_IMM_MOVNZ);
+			break;
+		case R_AARCH64_MOVW_PREL_G3:
+			/* We're using the top bits so we can't overflow. */
+			overflow_check = false;
+			ovf = reloc_insn_movw(RELOC_OP_PREL, loc, val, 48,
+					      INSN_IMM_MOVNZ);
+			break;
+
+		/* Immediate instruction relocations. */
+		case R_AARCH64_LD_PREL_LO19:
+			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
+					     INSN_IMM_19);
+			break;
+		case R_AARCH64_ADR_PREL_LO21:
+			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 0, 21,
+					     INSN_IMM_ADR);
+			break;
+		case R_AARCH64_ADR_PREL_PG_HI21_NC:
+			overflow_check = false;
+		case R_AARCH64_ADR_PREL_PG_HI21:
+			ovf = reloc_insn_imm(RELOC_OP_PAGE, loc, val, 12, 21,
+					     INSN_IMM_ADR);
+			break;
+		case R_AARCH64_ADD_ABS_LO12_NC:
+		case R_AARCH64_LDST8_ABS_LO12_NC:
+			overflow_check = false;
+			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 0, 12,
+					     INSN_IMM_12);
+			break;
+		case R_AARCH64_LDST16_ABS_LO12_NC:
+			overflow_check = false;
+			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 1, 11,
+					     INSN_IMM_12);
+			break;
+		case R_AARCH64_LDST32_ABS_LO12_NC:
+			overflow_check = false;
+			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 2, 10,
+					     INSN_IMM_12);
+			break;
+		case R_AARCH64_LDST64_ABS_LO12_NC:
+			overflow_check = false;
+			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 3, 9,
+					     INSN_IMM_12);
+			break;
+		case R_AARCH64_LDST128_ABS_LO12_NC:
+			overflow_check = false;
+			ovf = reloc_insn_imm(RELOC_OP_ABS, loc, val, 4, 8,
+					     INSN_IMM_12);
+			break;
+		case R_AARCH64_TSTBR14:
+			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 14,
+					     INSN_IMM_14);
+			break;
+		case R_AARCH64_CONDBR19:
+			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 19,
+					     INSN_IMM_19);
+			break;
+		case R_AARCH64_JUMP26:
+		case R_AARCH64_CALL26:
+			ovf = reloc_insn_imm(RELOC_OP_PREL, loc, val, 2, 26,
+					     INSN_IMM_26);
+			break;
+
+		default:
+			pr_err("module %s: unsupported RELA relocation: %llu\n",
+			       me->name, ELF64_R_TYPE(rel[i].r_info));
+			return -ENOEXEC;
+		}
+
+		if (overflow_check && ovf == -ERANGE)
+			goto overflow;
+
+	}
+
+	return 0;
+
+overflow:
+	pr_err("module %s: overflow in relocation type %d val %llx\n",
+	       me->name, (int)ELF64_R_TYPE(rel[i].r_info), val);
+	return -ENOEXEC;
+}
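
For readers following the overflow test in reloc_insn_imm(), here is a
minimal stand-alone sketch of the same idea. It is not code from the
patch: imm_overflows() is a hypothetical helper, it drops the imm_mask
step (those bits are shifted out anyway), and it assumes an arithmetic
right shift on signed values, as GCC and Clang provide.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: returns 1 when
 * (sval >> lsb) does not fit in a signed field of len bits.
 * Assumes >> on a negative int64_t is an arithmetic shift.
 */
static int imm_overflows(int64_t sval, int lsb, int len)
{
	/* Drop the low bits that the instruction encoding implies. */
	sval >>= lsb;

	/* Keep only the field's sign bit and everything above it. */
	sval >>= len - 1;

	/*
	 * In range only if what is left is all zeros (0) or all ones (-1).
	 * Adding 1 maps those two values to 1 and 0; anything else is >= 2
	 * once viewed as unsigned.
	 */
	return ((uint64_t)sval + 1) >= 2;
}
```

With lsb = 2 and len = 26, as used for R_AARCH64_CALL26, this accepts
byte offsets from -0x8000000 up to 0x7fffffc and rejects 0x8000000,
which is the usual +/-128MB branch range.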



* [PATCH v2 28/31] arm64: Generic timers support
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (26 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 27/31] arm64: Loadable modules Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:52   ` Arnd Bergmann
                     ` (4 more replies)
  2012-08-14 17:52 ` [PATCH v2 29/31] arm64: Miscellaneous header files Catalin Marinas
                   ` (3 subsequent siblings)
  31 siblings, 5 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel
  Cc: linux-kernel, Arnd Bergmann, Marc Zyngier, Will Deacon

From: Marc Zyngier <marc.zyngier@arm.com>

This patch adds support for the ARM generic timers with A64 instructions
for accessing the timer registers. It uses the physical counter as the
clock source and the virtual counter as sched_clock.

The timer frequency can be specified via DT or read from the CNTFRQ_EL0
register. The physical counter is also accessible from user space,
allowing a fast gettimeofday() implementation.
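
As a rough illustration of the user-space access setup, the sketch below
computes the CNTKCTL_EL1 value that arch_counter_enable_user_access()
writes back. The bit positions match the raw constants in the patch; the
macro names and the pure helper cntkctl_user_access() are illustrative
assumptions, and the real code reads and writes the register with
mrs/msr.

```c
#include <stdint.h>

/* Bit positions as used by the patch; the macro names are illustrative. */
#define CNTKCTL_EL0PCTEN	(1u << 0)	/* physical counter and frequency */
#define CNTKCTL_EL0VCTEN	(1u << 1)	/* virtual counter */
#define CNTKCTL_EL0VTEN		(1u << 8)	/* virtual timer registers */
#define CNTKCTL_EL0PTEN		(1u << 9)	/* physical timer registers */

/* Hypothetical pure helper: the value that would be written back. */
static uint32_t cntkctl_user_access(uint32_t cntkctl)
{
	/* Deny EL0 access to both timers and to the virtual counter. */
	cntkctl &= ~(CNTKCTL_EL0PTEN | CNTKCTL_EL0VTEN | CNTKCTL_EL0VCTEN);

	/* Allow EL0 reads of the physical counter and CNTFRQ_EL0. */
	cntkctl |= CNTKCTL_EL0PCTEN;

	return cntkctl;
}
```

Starting from 0x303 (all four enables set) the helper returns 0x001, and
bits outside these four are left untouched.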

Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/timex.h    |   32 ++++
 arch/arm64/kernel/time.c          |   65 ++++++++
 drivers/clocksource/Kconfig       |    5 +
 drivers/clocksource/Makefile      |    1 +
 drivers/clocksource/arm_generic.c |  309 +++++++++++++++++++++++++++++++++++++
 include/clocksource/arm_generic.h |   21 +++
 6 files changed, 433 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/timex.h
 create mode 100644 arch/arm64/kernel/time.c
 create mode 100644 drivers/clocksource/arm_generic.c
 create mode 100644 include/clocksource/arm_generic.h

diff --git a/arch/arm64/include/asm/timex.h b/arch/arm64/include/asm/timex.h
new file mode 100644
index 0000000..88f74de
--- /dev/null
+++ b/arch/arm64/include/asm/timex.h
@@ -0,0 +1,32 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_TIMEX_H
+#define __ASM_TIMEX_H
+
+/*
+ * Use the current timer as a cycle counter since this is what we use for
+ * the delay loop.
+ */
+#define get_cycles()	({ cycles_t c; read_current_timer(&c); c; })
+
+#include <asm-generic/timex.h>
+
+#define ARCH_HAS_READ_CURRENT_TIMER
+
+/* This isn't really used any more */
+#define CLOCK_TICK_RATE 1000
+
+#endif
diff --git a/arch/arm64/kernel/time.c b/arch/arm64/kernel/time.c
new file mode 100644
index 0000000..3b4b725
--- /dev/null
+++ b/arch/arm64/kernel/time.c
@@ -0,0 +1,65 @@
+/*
+ * Based on arch/arm/kernel/time.c
+ *
+ * Copyright (C) 1991, 1992, 1995  Linus Torvalds
+ * Modifications for ARM (C) 1994-2001 Russell King
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/interrupt.h>
+#include <linux/time.h>
+#include <linux/init.h>
+#include <linux/sched.h>
+#include <linux/smp.h>
+#include <linux/timex.h>
+#include <linux/errno.h>
+#include <linux/profile.h>
+#include <linux/syscore_ops.h>
+#include <linux/timer.h>
+#include <linux/irq.h>
+
+#include <clocksource/arm_generic.h>
+
+#include <asm/thread_info.h>
+#include <asm/stacktrace.h>
+
+#ifdef CONFIG_SMP
+unsigned long profile_pc(struct pt_regs *regs)
+{
+	struct stackframe frame;
+
+	if (!in_lock_functions(regs->pc))
+		return regs->pc;
+
+	frame.fp = regs->regs[29];
+	frame.sp = regs->sp;
+	frame.pc = regs->pc;
+	do {
+		int ret = unwind_frame(&frame);
+		if (ret < 0)
+			return 0;
+	} while (in_lock_functions(frame.pc));
+
+	return frame.pc;
+}
+EXPORT_SYMBOL(profile_pc);
+#endif
+
+void __init time_init(void)
+{
+	arm_generic_timer_init();
+}
diff --git a/drivers/clocksource/Kconfig b/drivers/clocksource/Kconfig
index d53cd0a..6a78073 100644
--- a/drivers/clocksource/Kconfig
+++ b/drivers/clocksource/Kconfig
@@ -35,3 +35,8 @@ config CLKSRC_DBX500_PRCMU_SCHED_CLOCK
 	default y
 	help
 	  Use the always on PRCMU Timer as sched_clock
+
+config CLKSRC_ARM_GENERIC
+	def_bool y if ARM64
+	help
+	  This option enables support for the ARM generic timer.
diff --git a/drivers/clocksource/Makefile b/drivers/clocksource/Makefile
index b65d0c5..6591990 100644
--- a/drivers/clocksource/Makefile
+++ b/drivers/clocksource/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_DW_APB_TIMER)	+= dw_apb_timer.o
 obj-$(CONFIG_DW_APB_TIMER_OF)	+= dw_apb_timer_of.o
 obj-$(CONFIG_CLKSRC_DBX500_PRCMU)	+= clksrc-dbx500-prcmu.o
 obj-$(CONFIG_ARMADA_370_XP_TIMER)	+= time-armada-370-xp.o
+obj-$(CONFIG_CLKSRC_ARM_GENERIC)	+= arm_generic.o
diff --git a/drivers/clocksource/arm_generic.c b/drivers/clocksource/arm_generic.c
new file mode 100644
index 0000000..05c898c
--- /dev/null
+++ b/drivers/clocksource/arm_generic.c
@@ -0,0 +1,309 @@
+/*
+ * Generic timers support
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Marc Zyngier <marc.zyngier@arm.com>
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/smp.h>
+#include <linux/cpu.h>
+#include <linux/jiffies.h>
+#include <linux/interrupt.h>
+#include <linux/clockchips.h>
+#include <linux/of_irq.h>
+#include <linux/io.h>
+
+#include <clocksource/arm_generic.h>
+
+static u32 arch_timer_rate;
+static u64 sched_clock_mult __read_mostly;
+static DEFINE_PER_CPU(struct clock_event_device, arch_timer_evt);
+static int arch_timer_ppi;
+
+/*
+ * Architected system timer support.
+ */
+
+#define ARCH_TIMER_CTRL_ENABLE		(1 << 0)
+#define ARCH_TIMER_CTRL_IT_MASK		(1 << 1)
+
+#define ARCH_TIMER_REG_CTRL		0
+#define ARCH_TIMER_REG_FREQ		1
+#define ARCH_TIMER_REG_TVAL		2
+
+static void arch_timer_reg_write(int reg, u32 val)
+{
+	switch (reg) {
+	case ARCH_TIMER_REG_CTRL:
+		asm volatile("msr cntp_ctl_el0,  %0" : : "r" (val));
+		break;
+	case ARCH_TIMER_REG_TVAL:
+		asm volatile("msr cntp_tval_el0, %0" : : "r" (val));
+		break;
+	default:
+		BUG();
+	}
+
+	isb();
+}
+
+static u32 arch_timer_reg_read(int reg)
+{
+	u32 val;
+
+	switch (reg) {
+	case ARCH_TIMER_REG_CTRL:
+		asm volatile("mrs %0,  cntp_ctl_el0" : "=r" (val));
+		break;
+	case ARCH_TIMER_REG_FREQ:
+		asm volatile("mrs %0,   cntfrq_el0" : "=r" (val));
+		break;
+	case ARCH_TIMER_REG_TVAL:
+		asm volatile("mrs %0, cntp_tval_el0" : "=r" (val));
+		break;
+	default:
+		BUG();
+	}
+
+	return val;
+}
+
+static irqreturn_t arch_timer_handle_irq(int irq, void *dev_id)
+{
+	struct clock_event_device *evt = dev_id;
+	unsigned long ctrl;
+
+	ctrl = arch_timer_reg_read(ARCH_TIMER_REG_CTRL);
+	if (ctrl & 0x4) {	/* ISTATUS: timer condition met */
+		ctrl |= ARCH_TIMER_CTRL_IT_MASK;
+		arch_timer_reg_write(ARCH_TIMER_REG_CTRL, ctrl);
+		evt->event_handler(evt);
+		return IRQ_HANDLED;
+	}
+
+	return IRQ_NONE;
+}
+
+static void arch_timer_stop(void)
+{
+	unsigned long ctrl;
+
+	ctrl = arch_timer_reg_read(ARCH_TIMER_REG_CTRL);
+	ctrl &= ~ARCH_TIMER_CTRL_ENABLE;
+	arch_timer_reg_write(ARCH_TIMER_REG_CTRL, ctrl);
+}
+
+static void arch_timer_set_mode(enum clock_event_mode mode,
+				struct clock_event_device *clk)
+{
+	switch (mode) {
+	case CLOCK_EVT_MODE_UNUSED:
+	case CLOCK_EVT_MODE_SHUTDOWN:
+		arch_timer_stop();
+		break;
+	default:
+		break;
+	}
+}
+
+static int arch_timer_set_next_event(unsigned long evt,
+				     struct clock_event_device *unused)
+{
+	unsigned long ctrl;
+
+	ctrl = arch_timer_reg_read(ARCH_TIMER_REG_CTRL);
+	ctrl |= ARCH_TIMER_CTRL_ENABLE;
+	ctrl &= ~ARCH_TIMER_CTRL_IT_MASK;
+
+	arch_timer_reg_write(ARCH_TIMER_REG_TVAL, evt);
+	arch_timer_reg_write(ARCH_TIMER_REG_CTRL, ctrl);
+
+	return 0;
+}
+
+static void __cpuinit arch_counter_enable_user_access(void)
+{
+	u32 cntkctl;
+
+	/* Disable user access to the timers and the virtual counter. */
+	asm volatile("mrs	%0, cntkctl_el1" : "=r" (cntkctl));
+	cntkctl &= ~((3 << 8) | (1 << 1));
+
+	/* Enable user access to the physical counter and frequency. */
+	cntkctl |= 1;
+	asm volatile("msr	cntkctl_el1, %0" : : "r" (cntkctl));
+}
+
+static void __cpuinit arch_timer_setup(struct clock_event_device *clk)
+{
+	/* Let's make sure the timer is off before doing anything else */
+	arch_timer_stop();
+
+	clk->features = CLOCK_EVT_FEAT_ONESHOT;
+	clk->name = "arch_sys_timer";
+	clk->rating = 400;
+	clk->set_mode = arch_timer_set_mode;
+	clk->set_next_event = arch_timer_set_next_event;
+	clk->irq = arch_timer_ppi;
+	clk->cpumask = cpumask_of(smp_processor_id());
+
+	clockevents_config_and_register(clk, arch_timer_rate,
+					0xf, 0x7fffffff);
+
+	enable_percpu_irq(clk->irq, 0);
+
+	/* Ensure the physical counter is visible to userspace for the vDSO. */
+	arch_counter_enable_user_access();
+}
+
+static void __init arch_timer_calibrate(void)
+{
+	if (arch_timer_rate == 0) {
+		arch_timer_reg_write(ARCH_TIMER_REG_CTRL, 0);
+		arch_timer_rate = arch_timer_reg_read(ARCH_TIMER_REG_FREQ);
+
+		/* Check the timer frequency. */
+		if (arch_timer_rate == 0)
+			panic("Architected timer frequency is set to zero.\n"
+			      "You must set this in your .dts file\n");
+	}
+
+	/* Cache the sched_clock multiplier to save a divide in the hot path. */
+
+	sched_clock_mult = NSEC_PER_SEC / arch_timer_rate;
+
+	pr_info("Architected local timer running at %u.%02uMHz.\n",
+		 arch_timer_rate / 1000000, (arch_timer_rate / 10000) % 100);
+}
+
+static inline cycle_t arch_counter_get_cntpct(void)
+{
+	cycle_t cval;
+
+	asm volatile("mrs %0, cntpct_el0" : "=r" (cval));
+
+	return cval;
+}
+
+static inline cycle_t arch_counter_get_cntvct(void)
+{
+	cycle_t cval;
+
+	asm volatile("mrs %0, cntvct_el0" : "=r" (cval));
+
+	return cval;
+}
+
+static cycle_t arch_counter_read(struct clocksource *cs)
+{
+	return arch_counter_get_cntpct();
+}
+
+static struct clocksource clocksource_counter = {
+	.name	= "arch_sys_counter",
+	.rating	= 400,
+	.read	= arch_counter_read,
+	.mask	= CLOCKSOURCE_MASK(56),
+	.flags	= (CLOCK_SOURCE_IS_CONTINUOUS | CLOCK_SOURCE_VALID_FOR_HRES),
+};
+
+int read_current_timer(unsigned long *timer_value)
+{
+	*timer_value = arch_counter_get_cntpct();
+	return 0;
+}
+
+unsigned long long notrace sched_clock(void)
+{
+	return arch_counter_get_cntvct() * sched_clock_mult;
+}
+
+static int __cpuinit arch_timer_cpu_notify(struct notifier_block *self,
+					   unsigned long action, void *hcpu)
+{
+	int cpu = (long)hcpu;
+	struct clock_event_device *clk = per_cpu_ptr(&arch_timer_evt, cpu);
+
+	switch (action) {
+	case CPU_STARTING:
+	case CPU_STARTING_FROZEN:
+		arch_timer_setup(clk);
+		break;
+
+	case CPU_DYING:
+	case CPU_DYING_FROZEN:
+		pr_debug("arch_timer_teardown disable IRQ%d cpu #%d\n",
+			 clk->irq, cpu);
+		disable_percpu_irq(clk->irq);
+		arch_timer_set_mode(CLOCK_EVT_MODE_UNUSED, clk);
+		break;
+	}
+
+	return NOTIFY_OK;
+}
+
+static struct notifier_block __cpuinitdata arch_timer_cpu_nb = {
+	.notifier_call = arch_timer_cpu_notify,
+};
+
+static const struct of_device_id arch_timer_of_match[] __initconst = {
+	{ .compatible = "arm,armv8-timer" },
+	{},
+};
+
+int __init arm_generic_timer_init(void)
+{
+	struct device_node *np;
+	int err;
+	u32 freq;
+
+	np = of_find_matching_node(NULL, arch_timer_of_match);
+	if (!np) {
+		pr_err("arch_timer: can't find DT node\n");
+		return -ENODEV;
+	}
+
+	/* Try to determine the frequency from the device tree or CNTFRQ */
+	if (!of_property_read_u32(np, "clock-frequency", &freq))
+		arch_timer_rate = freq;
+	arch_timer_calibrate();
+
+	arch_timer_ppi = irq_of_parse_and_map(np, 0);
+	pr_info("arch_timer: found %s irq %d\n", np->name, arch_timer_ppi);
+
+	err = request_percpu_irq(arch_timer_ppi, arch_timer_handle_irq,
+				 np->name, &arch_timer_evt);
+	if (err) {
+		pr_err("arch_timer: can't register interrupt %d (%d)\n",
+		       arch_timer_ppi, err);
+		return err;
+	}
+
+	clocksource_register_hz(&clocksource_counter, arch_timer_rate);
+
+	/* Calibrate the delay loop directly */
+	lpj_fine = arch_timer_rate / HZ;
+
+	/* Immediately configure the timer on the boot CPU */
+	arch_timer_setup(per_cpu_ptr(&arch_timer_evt, smp_processor_id()));
+
+	register_cpu_notifier(&arch_timer_cpu_nb);
+
+	return 0;
+}
diff --git a/include/clocksource/arm_generic.h b/include/clocksource/arm_generic.h
new file mode 100644
index 0000000..5b41b0d
--- /dev/null
+++ b/include/clocksource/arm_generic.h
@@ -0,0 +1,21 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __CLKSOURCE_ARM_GENERIC_H
+#define __CLKSOURCE_ARM_GENERIC_H
+
+extern int arm_generic_timer_init(void);
+
+#endif
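
The cached sched_clock multiplier in arch_timer_calibrate() can be
sketched in plain C as below. This is an illustration only: the helper
names are hypothetical and the 24MHz and 100MHz rates are assumed example
values, not frequencies taken from the patch. It shows that
NSEC_PER_SEC / rate is an integer division, so the result is exact only
when the rate divides 1e9.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Same arithmetic as the patch: one integer multiplier per tick. */
static uint64_t ticks_to_ns_cached(uint64_t ticks, uint32_t rate_hz)
{
	uint64_t mult = NSEC_PER_SEC / rate_hz;	/* truncates toward zero */

	return ticks * mult;
}

/* Reference conversion for comparison; it divides on every call. */
static uint64_t ticks_to_ns_exact(uint64_t ticks, uint32_t rate_hz)
{
	uint64_t secs = ticks / rate_hz;
	uint64_t rem = ticks % rate_hz;

	return secs * NSEC_PER_SEC + rem * NSEC_PER_SEC / rate_hz;
}
```

At an assumed 100MHz the multiplier is exactly 10 and both helpers agree.
At an assumed 24MHz the multiplier truncates from 41.67 to 41, so one
second of ticks is reported as 984000000ns by the cached version and
1000000000ns by the reference.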



* [PATCH v2 29/31] arm64: Miscellaneous header files
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (27 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 28/31] arm64: Generic timers support Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:56   ` Arnd Bergmann
  2012-08-14 17:52 ` [PATCH v2 30/31] arm64: Build infrastructure Catalin Marinas
                   ` (2 subsequent siblings)
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch introduces a few AArch64-specific header files together with
Kbuild entries for generic headers.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/include/asm/Kbuild        |   51 ++++++++++
 arch/arm64/include/asm/barrier.h     |   52 ++++++++++
 arch/arm64/include/asm/bitsperlong.h |   23 +++++
 arch/arm64/include/asm/byteorder.h   |   21 ++++
 arch/arm64/include/asm/cmpxchg.h     |  180 ++++++++++++++++++++++++++++++++++
 arch/arm64/include/asm/compiler.h    |   30 ++++++
 arch/arm64/include/asm/exception.h   |   23 +++++
 arch/arm64/include/asm/exec.h        |   23 +++++
 arch/arm64/include/asm/fcntl.h       |   29 ++++++
 arch/arm64/include/asm/system_misc.h |   54 ++++++++++
 10 files changed, 486 insertions(+), 0 deletions(-)
 create mode 100644 arch/arm64/include/asm/Kbuild
 create mode 100644 arch/arm64/include/asm/barrier.h
 create mode 100644 arch/arm64/include/asm/bitsperlong.h
 create mode 100644 arch/arm64/include/asm/byteorder.h
 create mode 100644 arch/arm64/include/asm/cmpxchg.h
 create mode 100644 arch/arm64/include/asm/compiler.h
 create mode 100644 arch/arm64/include/asm/exception.h
 create mode 100644 arch/arm64/include/asm/exec.h
 create mode 100644 arch/arm64/include/asm/fcntl.h
 create mode 100644 arch/arm64/include/asm/system_misc.h

diff --git a/arch/arm64/include/asm/Kbuild b/arch/arm64/include/asm/Kbuild
new file mode 100644
index 0000000..35924a5
--- /dev/null
+++ b/arch/arm64/include/asm/Kbuild
@@ -0,0 +1,51 @@
+include include/asm-generic/Kbuild.asm
+
+header-y += hwcap.h
+
+generic-y += bug.h
+generic-y += bugs.h
+generic-y += checksum.h
+generic-y += cputime.h
+generic-y += current.h
+generic-y += delay.h
+generic-y += div64.h
+generic-y += dma.h
+generic-y += emergency-restart.h
+generic-y += errno.h
+generic-y += ftrace.h
+generic-y += hw_irq.h
+generic-y += ioctl.h
+generic-y += ioctls.h
+generic-y += ipcbuf.h
+generic-y += irq_regs.h
+generic-y += kdebug.h
+generic-y += kmap_types.h
+generic-y += linkage.h
+generic-y += local.h
+generic-y += local64.h
+generic-y += mman.h
+generic-y += msgbuf.h
+generic-y += mutex.h
+generic-y += pci.h
+generic-y += percpu.h
+generic-y += poll.h
+generic-y += posix_types.h
+generic-y += resource.h
+generic-y += scatterlist.h
+generic-y += sections.h
+generic-y += segment.h
+generic-y += sembuf.h
+generic-y += serial.h
+generic-y += shmbuf.h
+generic-y += sizes.h
+generic-y += socket.h
+generic-y += sockios.h
+generic-y += string.h
+generic-y += switch_to.h
+generic-y += swab.h
+generic-y += termbits.h
+generic-y += termios.h
+generic-y += topology.h
+generic-y += types.h
+generic-y += unaligned.h
+generic-y += user.h
diff --git a/arch/arm64/include/asm/barrier.h b/arch/arm64/include/asm/barrier.h
new file mode 100644
index 0000000..d4a6333
--- /dev/null
+++ b/arch/arm64/include/asm/barrier.h
@@ -0,0 +1,52 @@
+/*
+ * Based on arch/arm/include/asm/barrier.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_BARRIER_H
+#define __ASM_BARRIER_H
+
+#ifndef __ASSEMBLY__
+
+#define sev()		asm volatile("sev" : : : "memory")
+#define wfe()		asm volatile("wfe" : : : "memory")
+#define wfi()		asm volatile("wfi" : : : "memory")
+
+#define isb()		asm volatile("isb" : : : "memory")
+#define dsb()		asm volatile("dsb sy" : : : "memory")
+
+#define mb()		dsb()
+#define rmb()		asm volatile("dsb ld" : : : "memory")
+#define wmb()		asm volatile("dsb st" : : : "memory")
+
+#ifndef CONFIG_SMP
+#define smp_mb()	barrier()
+#define smp_rmb()	barrier()
+#define smp_wmb()	barrier()
+#else
+#define smp_mb()	asm volatile("dmb ish" : : : "memory")
+#define smp_rmb()	asm volatile("dmb ishld" : : : "memory")
+#define smp_wmb()	asm volatile("dmb ishst" : : : "memory")
+#endif
+
+#define read_barrier_depends()		do { } while (0)
+#define smp_read_barrier_depends()	do { } while (0)
+
+#define set_mb(var, value)	do { var = value; smp_mb(); } while (0)
+#define nop()		asm volatile("nop")
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* __ASM_BARRIER_H */
diff --git a/arch/arm64/include/asm/bitsperlong.h b/arch/arm64/include/asm/bitsperlong.h
new file mode 100644
index 0000000..fce9c29
--- /dev/null
+++ b/arch/arm64/include/asm/bitsperlong.h
@@ -0,0 +1,23 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_BITSPERLONG_H
+#define __ASM_BITSPERLONG_H
+
+#define __BITS_PER_LONG 64
+
+#include <asm-generic/bitsperlong.h>
+
+#endif	/* __ASM_BITSPERLONG_H */
diff --git a/arch/arm64/include/asm/byteorder.h b/arch/arm64/include/asm/byteorder.h
new file mode 100644
index 0000000..2b92046
--- /dev/null
+++ b/arch/arm64/include/asm/byteorder.h
@@ -0,0 +1,21 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_BYTEORDER_H
+#define __ASM_BYTEORDER_H
+
+#include <linux/byteorder/little_endian.h>
+
+#endif	/* __ASM_BYTEORDER_H */
diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
new file mode 100644
index 0000000..dc50de7
--- /dev/null
+++ b/arch/arm64/include/asm/cmpxchg.h
@@ -0,0 +1,180 @@
+/*
+ * Based on arch/arm/include/asm/cmpxchg.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_CMPXCHG_H
+#define __ASM_CMPXCHG_H
+
+#include <linux/irqflags.h>
+#include <asm/barrier.h>
+
+static inline unsigned long __xchg(unsigned long x, volatile void *ptr, int size)
+{
+	extern void __bad_xchg(volatile void *, int);
+	unsigned long ret, tmp;
+
+	switch (size) {
+	case 1:
+		asm volatile("//	__xchg1\n"
+		"1:	ldaxrb	%w0, [%3]\n"
+		"	stlxrb	%w1, %w2, [%3]\n"
+		"	cbnz	%w1, 1b\n"
+			: "=&r" (ret), "=&r" (tmp)
+			: "r" (x), "r" (ptr)
+			: "memory", "cc");
+		break;
+	case 2:
+		asm volatile("//	__xchg2\n"
+		"1:	ldaxrh	%w0, [%3]\n"
+		"	stlxrh	%w1, %w2, [%3]\n"
+		"	cbnz	%w1, 1b\n"
+			: "=&r" (ret), "=&r" (tmp)
+			: "r" (x), "r" (ptr)
+			: "memory", "cc");
+		break;
+	case 4:
+		asm volatile("//	__xchg4\n"
+		"1:	ldaxr	%w0, [%3]\n"
+		"	stlxr	%w1, %w2, [%3]\n"
+		"	cbnz	%w1, 1b\n"
+			: "=&r" (ret), "=&r" (tmp)
+			: "r" (x), "r" (ptr)
+			: "memory", "cc");
+		break;
+	case 8:
+		asm volatile("//	__xchg8\n"
+		"1:	ldaxr	%0, [%3]\n"
+		"	stlxr	%w1, %2, [%3]\n"
+		"	cbnz	%w1, 1b\n"
+			: "=&r" (ret), "=&r" (tmp)
+			: "r" (x), "r" (ptr)
+			: "memory", "cc");
+		break;
+	default:
+		__bad_xchg(ptr, size), ret = 0;
+		break;
+	}
+
+	return ret;
+}
+
+#define xchg(ptr,x) \
+	((__typeof__(*(ptr)))__xchg((unsigned long)(x),(ptr),sizeof(*(ptr))))
+
+/*
+ * cmpxchg operations.
+ */
+extern void __bad_cmpxchg(volatile void *ptr, int size);
+
+static inline unsigned long __cmpxchg(volatile void *ptr, unsigned long old,
+				      unsigned long new, int size)
+{
+	unsigned long oldval, res;
+
+	switch (size) {
+	case 1:
+		do {
+			asm volatile("// __cmpxchg1\n"
+			"	ldxrb	%w1, [%2]\n"
+			"	mov	%w0, #0\n"
+			"	cmp	%w1, %w3\n"
+			"	b.ne	1f\n"
+			"	stxrb	%w0, %w4, [%2]\n"
+			"1:\n"
+				: "=&r" (res), "=&r" (oldval)
+				: "r" (ptr), "Ir" (old), "r" (new)
+				: "cc");
+		} while (res);
+		break;
+
+	case 2:
+		do {
+			asm volatile("// __cmpxchg2\n"
+			"	ldxrh	%w1, [%2]\n"
+			"	mov	%w0, #0\n"
+			"	cmp	%w1, %w3\n"
+			"	b.ne	1f\n"
+			"	stxrh	%w0, %w4, [%2]\n"
+			"1:\n"
+				: "=&r" (res), "=&r" (oldval)
+				: "r" (ptr), "Ir" (old), "r" (new)
+				: "memory", "cc");
+		} while (res);
+		break;
+
+	case 4:
+		do {
+			asm volatile("// __cmpxchg4\n"
+			"	ldxr	%w1, [%2]\n"
+			"	mov	%w0, #0\n"
+			"	cmp	%w1, %w3\n"
+			"	b.ne	1f\n"
+			"	stxr	%w0, %w4, [%2]\n"
+			"1:\n"
+				: "=&r" (res), "=&r" (oldval)
+				: "r" (ptr), "Ir" (old), "r" (new)
+				: "cc");
+		} while (res);
+		break;
+
+	case 8:
+		do {
+			asm volatile("// __cmpxchg8\n"
+			"	ldxr	%1, [%2]\n"
+			"	mov	%w0, #0\n"
+			"	cmp	%1, %3\n"
+			"	b.ne	1f\n"
+			"	stxr	%w0, %4, [%2]\n"
+			"1:\n"
+				: "=&r" (res), "=&r" (oldval)
+				: "r" (ptr), "Ir" (old), "r" (new)
+				: "cc");
+		} while (res);
+		break;
+
+	default:
+		__bad_cmpxchg(ptr, size);
+		oldval = 0;
+	}
+
+	return oldval;
+}
+
+static inline unsigned long __cmpxchg_mb(volatile void *ptr, unsigned long old,
+					 unsigned long new, int size)
+{
+	unsigned long ret;
+
+	smp_mb();
+	ret = __cmpxchg(ptr, old, new, size);
+	smp_mb();
+
+	return ret;
+}
+
+#define cmpxchg(ptr,o,n)						\
+	((__typeof__(*(ptr)))__cmpxchg_mb((ptr),			\
+					  (unsigned long)(o),		\
+					  (unsigned long)(n),		\
+					  sizeof(*(ptr))))
+
+#define cmpxchg_local(ptr,o,n)						\
+	((__typeof__(*(ptr)))__cmpxchg((ptr),				\
+				       (unsigned long)(o),		\
+				       (unsigned long)(n),		\
+				       sizeof(*(ptr))))
+
+#endif	/* __ASM_CMPXCHG_H */
diff --git a/arch/arm64/include/asm/compiler.h b/arch/arm64/include/asm/compiler.h
new file mode 100644
index 0000000..ee35fd0
--- /dev/null
+++ b/arch/arm64/include/asm/compiler.h
@@ -0,0 +1,30 @@
+/*
+ * Based on arch/arm/include/asm/compiler.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_COMPILER_H
+#define __ASM_COMPILER_H
+
+/*
+ * This is used to ensure the compiler did actually allocate the register we
+ * asked it for some inline assembly sequences.  Apparently we can't trust the
+ * compiler from one version to another so a bit of paranoia won't hurt.  This
+ * string is meant to be concatenated with the inline asm string and will
+ * cause compilation to stop on mismatch.  (for details, see gcc PR 15089)
+ */
+#define __asmeq(x, y)  ".ifnc " x "," y " ; .err ; .endif\n\t"
+
+#endif	/* __ASM_COMPILER_H */
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
new file mode 100644
index 0000000..ac63519
--- /dev/null
+++ b/arch/arm64/include/asm/exception.h
@@ -0,0 +1,23 @@
+/*
+ * Based on arch/arm/include/asm/exception.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_EXCEPTION_H
+#define __ASM_EXCEPTION_H
+
+#define __exception	__attribute__((section(".exception.text")))
+
+#endif	/* __ASM_EXCEPTION_H */
diff --git a/arch/arm64/include/asm/exec.h b/arch/arm64/include/asm/exec.h
new file mode 100644
index 0000000..db0563c
--- /dev/null
+++ b/arch/arm64/include/asm/exec.h
@@ -0,0 +1,23 @@
+/*
+ * Based on arch/arm/include/asm/exec.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_EXEC_H
+#define __ASM_EXEC_H
+
+extern unsigned long arch_align_stack(unsigned long sp);
+
+#endif	/* __ASM_EXEC_H */
diff --git a/arch/arm64/include/asm/fcntl.h b/arch/arm64/include/asm/fcntl.h
new file mode 100644
index 0000000..cd2e630
--- /dev/null
+++ b/arch/arm64/include/asm/fcntl.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_FCNTL_H
+#define __ASM_FCNTL_H
+
+/*
+ * Using our own definitions for AArch32 (compat) support.
+ */
+#define O_DIRECTORY	 040000	/* must be a directory */
+#define O_NOFOLLOW	0100000	/* don't follow links */
+#define O_DIRECT	0200000	/* direct disk access hint - currently ignored */
+#define O_LARGEFILE	0400000
+
+#include <asm-generic/fcntl.h>
+
+#endif
diff --git a/arch/arm64/include/asm/system_misc.h b/arch/arm64/include/asm/system_misc.h
new file mode 100644
index 0000000..95e4072
--- /dev/null
+++ b/arch/arm64/include/asm/system_misc.h
@@ -0,0 +1,54 @@
+/*
+ * Based on arch/arm/include/asm/system_misc.h
+ *
+ * Copyright (C) 2012 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_SYSTEM_MISC_H
+#define __ASM_SYSTEM_MISC_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/compiler.h>
+#include <linux/linkage.h>
+#include <linux/irqflags.h>
+
+struct pt_regs;
+
+void die(const char *msg, struct pt_regs *regs, int err);
+
+struct siginfo;
+void arm64_notify_die(const char *str, struct pt_regs *regs,
+		      struct siginfo *info, int err);
+
+void hook_debug_fault_code(int nr, int (*fn)(unsigned long, unsigned int,
+					     struct pt_regs *),
+			   int sig, int code, const char *name);
+
+struct mm_struct;
+extern void show_pte(struct mm_struct *mm, unsigned long addr);
+extern void __show_regs(struct pt_regs *);
+
+void soft_restart(unsigned long);
+extern void (*pm_restart)(const char *cmd);
+
+#define UDBG_UNDEFINED	(1 << 0)
+#define UDBG_SYSCALL	(1 << 1)
+#define UDBG_BADABORT	(1 << 2)
+#define UDBG_SEGV	(1 << 3)
+#define UDBG_BUS	(1 << 4)
+
+#endif	/* __ASSEMBLY__ */
+
+#endif	/* __ASM_SYSTEM_MISC_H */
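
Reader's note on the cmpxchg.h hunk above: the return-value convention of
xchg() and cmpxchg() is easy to misread from the exclusive load/store loops
alone. The following is a minimal userspace sketch of that convention. It
assumes the GCC/Clang __atomic builtins, and the model_* names are
hypothetical; it is not the kernel implementation and says nothing about the
code the A64 sequences compile to.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Hypothetical userspace models, NOT the kernel implementation: they rely
 * on the GCC/Clang __atomic builtins rather than ldxr/stxr loops.
 */

/* Models xchg(): store x into *ptr and return the previous value. */
static inline unsigned long model_xchg(unsigned long *ptr, unsigned long x)
{
	return __atomic_exchange_n(ptr, x, __ATOMIC_SEQ_CST);
}

/*
 * Models cmpxchg(): store new_val only if *ptr == old.  The return value is
 * whatever was found in *ptr, so "return value == old" means success.
 */
static inline unsigned long model_cmpxchg(unsigned long *ptr,
					  unsigned long old,
					  unsigned long new_val)
{
	unsigned long found = old;

	/* On failure the builtin copies the current *ptr into found. */
	__atomic_compare_exchange_n(ptr, &found, new_val, false,
				    __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
	return found;
}

/* Small self-check exercising both helpers; returns 0 when all hold. */
static inline int model_selfcheck(void)
{
	unsigned long v = 5;

	assert(model_cmpxchg(&v, 5, 7) == 5);	/* matched: v becomes 7 */
	assert(v == 7);
	assert(model_cmpxchg(&v, 5, 9) == 7);	/* mismatch: v unchanged */
	assert(v == 7);
	assert(model_xchg(&v, 1) == 7);		/* unconditional swap */
	assert(v == 1);
	return 0;
}
```

Sequentially consistent ordering is used here only as a rough analogue of
cmpxchg(), which the patch brackets with smp_mb() on both sides;
cmpxchg_local() in the patch omits those barriers.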



* [PATCH v2 30/31] arm64: Build infrastructure
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (28 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 29/31] arm64: Miscellaneous header files Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-14 21:01   ` Sam Ravnborg
                     ` (2 more replies)
  2012-08-14 17:52 ` [PATCH v2 31/31] arm64: MAINTAINERS update Catalin Marinas
  2012-08-17  9:36 ` [PATCH v2 00/31] AArch64 Linux kernel port Tony Lindgren
  31 siblings, 3 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann, Will Deacon

This patch adds Makefile and Kconfig files required for building an
AArch64 kernel.

Signed-off-by: Will Deacon <will.deacon@arm.com>
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 arch/arm64/Kconfig                   |  261 ++++++++++++++++++++++++++++++++++
 arch/arm64/Kconfig.debug             |   27 ++++
 arch/arm64/Makefile                  |   71 +++++++++
 arch/arm64/boot/.gitignore           |    2 +
 arch/arm64/boot/Makefile             |   38 +++++
 arch/arm64/boot/install.sh           |   52 +++++++
 arch/arm64/configs/generic_defconfig |   85 +++++++++++
 arch/arm64/include/asm/prom.h        |    1 +
 arch/arm64/kernel/.gitignore         |    1 +
 arch/arm64/kernel/Makefile           |   27 ++++
 arch/arm64/kernel/vmlinux.lds.S      |  146 +++++++++++++++++++
 arch/arm64/mm/Kconfig                |    5 +
 arch/arm64/mm/Makefile               |    6 +
 init/Kconfig                         |    3 +-
 lib/Kconfig.debug                    |    6 +-
 15 files changed, 728 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/Kconfig
 create mode 100644 arch/arm64/Kconfig.debug
 create mode 100644 arch/arm64/Makefile
 create mode 100644 arch/arm64/boot/.gitignore
 create mode 100644 arch/arm64/boot/Makefile
 create mode 100644 arch/arm64/boot/install.sh
 create mode 100644 arch/arm64/configs/generic_defconfig
 create mode 100644 arch/arm64/include/asm/prom.h
 create mode 100644 arch/arm64/kernel/.gitignore
 create mode 100644 arch/arm64/kernel/Makefile
 create mode 100644 arch/arm64/kernel/vmlinux.lds.S
 create mode 100644 arch/arm64/mm/Kconfig
 create mode 100644 arch/arm64/mm/Makefile

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
new file mode 100644
index 0000000..1ce3d04
--- /dev/null
+++ b/arch/arm64/Kconfig
@@ -0,0 +1,261 @@
+config ARM64
+	def_bool y
+	select OF
+	select OF_EARLY_FLATTREE
+	select IRQ_DOMAIN
+	select HAVE_AOUT
+	select HAVE_DMA_ATTRS
+	select HAVE_DMA_API_DEBUG
+	select HAVE_IDE
+	select HAVE_MEMBLOCK
+	select RTC_LIB
+	select SYS_SUPPORTS_APM_EMULATION
+	select HAVE_GENERIC_DMA_COHERENT
+	select GENERIC_IOMAP
+	select HAVE_IRQ_WORK
+	select HAVE_PERF_EVENTS
+	select HAVE_ARCH_TRACEHOOK
+	select PERF_USE_VMALLOC
+	select HAVE_HW_BREAKPOINT if PERF_EVENTS
+	select HAVE_GENERIC_HARDIRQS
+	select GENERIC_HARDIRQS_NO_DEPRECATED
+	select HAVE_SPARSE_IRQ
+	select SPARSE_IRQ
+	select GENERIC_IRQ_SHOW
+	select GENERIC_SMP_IDLE_THREAD
+	select NO_BOOTMEM
+	help
+	  ARM 64-bit (AArch64) Linux support.
+
+config 64BIT
+	def_bool y
+
+config ARCH_PHYS_ADDR_T_64BIT
+	def_bool y
+
+config HAVE_PWM
+	bool
+
+config SYS_SUPPORTS_APM_EMULATION
+	bool
+
+config NO_IOPORT
+	def_bool y
+
+config GENERIC_GPIO
+	bool
+
+config GENERIC_TIME_VSYSCALL
+	def_bool y
+
+config GENERIC_CLOCKEVENTS
+	def_bool y
+
+config STACKTRACE_SUPPORT
+	def_bool y
+
+config LOCKDEP_SUPPORT
+	def_bool y
+
+config TRACE_IRQFLAGS_SUPPORT
+	def_bool y
+
+config HARDIRQS_SW_RESEND
+	def_bool y
+
+config GENERIC_IRQ_PROBE
+	def_bool y
+
+config GENERIC_LOCKBREAK
+	def_bool y
+	depends on SMP && PREEMPT
+
+config RWSEM_GENERIC_SPINLOCK
+	def_bool y
+
+config RWSEM_XCHGADD_ALGORITHM
+	bool
+
+config ARCH_HAS_ILOG2_U32
+	bool
+
+config ARCH_HAS_ILOG2_U64
+	bool
+
+config ARCH_HAS_CPUFREQ
+	bool
+	help
+	  Internal node to signify that the ARCH has CPUFREQ support
+	  and that the relevant menu configurations are displayed for
+	  it.
+
+config GENERIC_HWEIGHT
+	def_bool y
+
+config GENERIC_CSUM
+	def_bool y
+
+config GENERIC_CALIBRATE_DELAY
+	def_bool y
+
+config ZONE_DMA32
+	def_bool y
+
+config ARCH_DMA_ADDR_T_64BIT
+	def_bool y
+
+config NEED_DMA_MAP_STATE
+	def_bool y
+
+config NEED_SG_DMA_LENGTH
+	def_bool y
+
+config SWIOTLB
+	def_bool y
+
+config IOMMU_HELPER
+	def_bool SWIOTLB
+
+source "init/Kconfig"
+
+source "kernel/Kconfig.freezer"
+
+menu "System Type"
+
+source "arch/arm64/mm/Kconfig"
+
+endmenu
+
+menu "Bus support"
+
+config ARM_AMBA
+	bool
+
+endmenu
+
+menu "Kernel Features"
+
+source "kernel/time/Kconfig"
+
+config ARM64_64K_PAGES
+	bool "Enable 64KB pages support"
+	help
+	  This feature enables 64KB pages support (4KB by default)
+	  allowing only two levels of page tables and faster TLB
+	  look-up. AArch32 emulation is not available when this feature
+	  is enabled.
+
+config SMP
+	bool "Symmetric Multi-Processing"
+	depends on GENERIC_CLOCKEVENTS
+	select USE_GENERIC_SMP_HELPERS
+	help
+	  This enables support for systems with more than one CPU.  If
+	  you say N here, the kernel will run on single and
+	  multiprocessor machines, but will use only one CPU of a
+	  multiprocessor machine. If you say Y here, the kernel will run
+	  on many, but not all, single processor machines. On a single
+	  processor machine, the kernel will run faster if you say N
+	  here.
+
+	  If you don't know what to do here, say N.
+
+config NR_CPUS
+	int "Maximum number of CPUs (2-32)"
+	range 2 32
+	depends on SMP
+	default "4"
+
+source "kernel/Kconfig.preempt"
+
+config HZ
+	int
+	default 100
+
+config ARCH_HAS_HOLES_MEMORYMODEL
+	def_bool y if SPARSEMEM
+
+config ARCH_SPARSEMEM_ENABLE
+	def_bool y
+	select SPARSEMEM_VMEMMAP_ENABLE
+
+config ARCH_SPARSEMEM_DEFAULT
+	def_bool ARCH_SPARSEMEM_ENABLE
+
+config ARCH_SELECT_MEMORY_MODEL
+	def_bool ARCH_SPARSEMEM_ENABLE
+
+config HAVE_ARCH_PFN_VALID
+	def_bool ARCH_HAS_HOLES_MEMORYMODEL || !SPARSEMEM
+
+config HW_PERF_EVENTS
+	bool "Enable hardware performance counter support for perf events"
+	depends on PERF_EVENTS
+	default y
+	help
+	  Enable hardware performance counter support for perf events. If
+	  disabled, perf events will use software events only.
+
+source "mm/Kconfig"
+
+endmenu
+
+menu "Boot options"
+
+config CMDLINE
+	string "Default kernel command string"
+	default ""
+	help
+	  Provide a set of default command-line options at build time by
+	  entering them here. As a minimum, you should specify the
+	  root device (e.g. root=/dev/nfs).
+
+config CMDLINE_FORCE
+	bool "Always use the default kernel command string"
+	help
+	  Always use the default kernel command string, even if the boot
+	  loader passes other arguments to the kernel.
+	  This is useful if you cannot or don't want to change the
+	  command-line options your boot loader passes to the kernel.
+
+endmenu
+
+menu "Userspace binary formats"
+
+source "fs/Kconfig.binfmt"
+
+config AARCH32_EMULATION
+	bool "Kernel support for 32-bit EL0"
+	depends on !ARM64_64K_PAGES
+	select COMPAT_BINFMT_ELF
+	help
+	  This option enables support for a 32-bit EL0 running under a 64-bit
+	  kernel at EL1. AArch32-specific components such as system calls,
+	  the user helper functions, VFP support and the ptrace interface are
+	  handled appropriately by the kernel.
+
+	  If you want to execute 32-bit userspace applications, say Y.
+
+config COMPAT
+	def_bool y
+	depends on AARCH32_EMULATION
+
+config SYSVIPC_COMPAT
+	def_bool y
+	depends on COMPAT && SYSVIPC
+
+endmenu
+
+source "net/Kconfig"
+
+source "drivers/Kconfig"
+
+source "fs/Kconfig"
+
+source "arch/arm64/Kconfig.debug"
+
+source "security/Kconfig"
+
+source "crypto/Kconfig"
+
+source "lib/Kconfig"
diff --git a/arch/arm64/Kconfig.debug b/arch/arm64/Kconfig.debug
new file mode 100644
index 0000000..d7553f2
--- /dev/null
+++ b/arch/arm64/Kconfig.debug
@@ -0,0 +1,27 @@
+menu "Kernel hacking"
+
+source "lib/Kconfig.debug"
+
+config FRAME_POINTER
+	bool
+	default y
+
+config DEBUG_ERRORS
+	bool "Verbose kernel error messages"
+	depends on DEBUG_KERNEL
+	help
+	  This option controls verbose debugging information which can be
+	  printed when the kernel detects an internal error. This debugging
+	  information is useful to kernel hackers when tracking down problems,
+	  but mostly meaningless to other people. It's safe to say Y unless
+	  you are concerned with the code size or don't want to see these
+	  messages.
+
+config DEBUG_STACK_USAGE
+	bool "Enable stack utilization instrumentation"
+	depends on DEBUG_KERNEL
+	help
+	  Enables the display of the minimum amount of free stack which each
+	  task has ever had available in the sysrq-T output.
+
+endmenu
diff --git a/arch/arm64/Makefile b/arch/arm64/Makefile
new file mode 100644
index 0000000..831bd41
--- /dev/null
+++ b/arch/arm64/Makefile
@@ -0,0 +1,71 @@
+#
+# arch/arm64/Makefile
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License.  See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+# Copyright (C) 1995-2001 by Russell King
+
+LDFLAGS_vmlinux	:=-p --no-undefined -X
+CPPFLAGS_vmlinux.lds = -DTEXT_OFFSET=$(TEXT_OFFSET)
+OBJCOPYFLAGS	:=-O binary -R .note -R .note.gnu.build-id -R .comment -S
+GZFLAGS		:=-9
+
+LIBGCC 		:= $(shell $(CC) $(KBUILD_CFLAGS) -print-libgcc-file-name)
+
+KBUILD_DEFCONFIG := generic_defconfig
+
+KBUILD_CFLAGS	+= -mgeneral-regs-only
+KBUILD_CPPFLAGS	+= -mlittle-endian
+AS		+= -EL
+LD		+= -EL
+
+comma = ,
+
+CHECKFLAGS	+= -D__aarch64__
+
+# Default value
+head-y		:= arch/arm64/kernel/head.o
+
+# The byte offset of the kernel image in RAM from the start of RAM.
+TEXT_OFFSET := 0x00080000
+
+export	TEXT_OFFSET GZFLAGS
+
+core-y		+= arch/arm64/kernel/ arch/arm64/mm/
+libs-y		:= arch/arm64/lib/ $(libs-y)
+libs-y		+= $(LIBGCC)
+
+# Default target when executing plain make
+KBUILD_IMAGE := Image.gz
+
+all:	$(KBUILD_IMAGE)
+
+boot := arch/arm64/boot
+
+Image Image.gz: vmlinux
+	$(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $(boot)/$@
+
+zinstall install: vmlinux
+	$(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $@
+
+%.dtb:
+	$(Q)$(MAKE) $(build)=$(boot) MACHINE=$(MACHINE) $(boot)/$@
+
+# We use MRPROPER_FILES and CLEAN_FILES now
+archclean:
+	$(Q)$(MAKE) $(clean)=$(boot)
+
+define archhelp
+  echo  '* Image.gz      - Compressed kernel image (arch/$(ARCH)/boot/Image.gz)'
+  echo  '  Image         - Uncompressed kernel image (arch/$(ARCH)/boot/Image)'
+  echo  '  install       - Install uncompressed kernel'
+  echo  '  zinstall      - Install compressed kernel'
+  echo  '                  Install using (your) ~/bin/installkernel or'
+  echo  '                  (distribution) /sbin/installkernel or'
+  echo  '                  install to $$(INSTALL_PATH) and run lilo'
+endef
diff --git a/arch/arm64/boot/.gitignore b/arch/arm64/boot/.gitignore
new file mode 100644
index 0000000..8dab0bb
--- /dev/null
+++ b/arch/arm64/boot/.gitignore
@@ -0,0 +1,2 @@
+Image
+Image.gz
diff --git a/arch/arm64/boot/Makefile b/arch/arm64/boot/Makefile
new file mode 100644
index 0000000..15a58a8
--- /dev/null
+++ b/arch/arm64/boot/Makefile
@@ -0,0 +1,38 @@
+#
+# arch/arm64/boot/Makefile
+#
+# This file is included by the global makefile so that you can add your own
+# architecture-specific flags and dependencies.
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License.  See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+# Copyright (C) 2012, ARM Ltd.
+# Author: Will Deacon <will.deacon@arm.com>
+#
+# Based on the ia64 boot/Makefile.
+#
+
+targets := Image Image.gz
+
+$(obj)/Image: vmlinux FORCE
+	$(call if_changed,objcopy)
+	@echo '  Kernel: $@ is ready'
+
+$(obj)/Image.gz: $(obj)/Image FORCE
+	$(call if_changed,gzip)
+	@echo '  Kernel: $@ is ready'
+
+$(obj)/%.dtb: $(src)/dts/%.dts
+	$(call cmd,dtc)
+
+install: $(obj)/Image
+	$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+	$(obj)/Image System.map "$(INSTALL_PATH)"
+
+zinstall: $(obj)/Image.gz
+	$(CONFIG_SHELL) $(srctree)/$(src)/install.sh $(KERNELRELEASE) \
+	$(obj)/Image.gz System.map "$(INSTALL_PATH)"
+
+clean-files += *.dtb
diff --git a/arch/arm64/boot/install.sh b/arch/arm64/boot/install.sh
new file mode 100644
index 0000000..9151e21
--- /dev/null
+++ b/arch/arm64/boot/install.sh
@@ -0,0 +1,52 @@
+#!/bin/sh
+#
+# arch/arm64/boot/install.sh
+#
+# This file is subject to the terms and conditions of the GNU General Public
+# License.  See the file "COPYING" in the main directory of this archive
+# for more details.
+#
+# Copyright (C) 1995 by Linus Torvalds
+#
+# Adapted from code in arch/i386/boot/Makefile by H. Peter Anvin
+# Adapted from code in arch/i386/boot/install.sh by Russell King
+#
+# "make install" script for the AArch64 Linux port
+#
+# Arguments:
+#   $1 - kernel version
+#   $2 - kernel image file
+#   $3 - kernel map file
+#   $4 - default install path (blank if root directory)
+#
+
+# User may have a custom install script
+if [ -x ~/bin/${INSTALLKERNEL} ]; then exec ~/bin/${INSTALLKERNEL} "$@"; fi
+if [ -x /sbin/${INSTALLKERNEL} ]; then exec /sbin/${INSTALLKERNEL} "$@"; fi
+
+if [ "$(basename $2)" = "Image.gz" ]; then
+# Compressed install
+  echo "Installing compressed kernel"
+  base=vmlinuz
+else
+# Normal install
+  echo "Installing normal kernel"
+  base=vmlinux
+fi
+
+if [ -f $4/$base-$1 ]; then
+  mv $4/$base-$1 $4/$base-$1.old
+fi
+cat $2 > $4/$base-$1
+
+# Install system map file
+if [ -f $4/System.map-$1 ]; then
+  mv $4/System.map-$1 $4/System.map-$1.old
+fi
+cp $3 $4/System.map-$1
+
+if [ -x /sbin/loadmap ]; then
+  /sbin/loadmap
+else
+  echo "You have to install it yourself"
+fi
diff --git a/arch/arm64/configs/generic_defconfig b/arch/arm64/configs/generic_defconfig
new file mode 100644
index 0000000..d9aac95
--- /dev/null
+++ b/arch/arm64/configs/generic_defconfig
@@ -0,0 +1,85 @@
+CONFIG_EXPERIMENTAL=y
+# CONFIG_LOCALVERSION_AUTO is not set
+# CONFIG_SWAP is not set
+CONFIG_SYSVIPC=y
+CONFIG_POSIX_MQUEUE=y
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
+CONFIG_NO_HZ=y
+CONFIG_HIGH_RES_TIMERS=y
+CONFIG_IKCONFIG=y
+CONFIG_IKCONFIG_PROC=y
+CONFIG_LOG_BUF_SHIFT=14
+# CONFIG_UTS_NS is not set
+# CONFIG_IPC_NS is not set
+# CONFIG_PID_NS is not set
+# CONFIG_NET_NS is not set
+CONFIG_SCHED_AUTOGROUP=y
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_KALLSYMS_ALL=y
+# CONFIG_COMPAT_BRK is not set
+CONFIG_PROFILING=y
+CONFIG_MODULES=y
+CONFIG_MODULE_UNLOAD=y
+# CONFIG_BLK_DEV_BSG is not set
+# CONFIG_IOSCHED_DEADLINE is not set
+CONFIG_SMP=y
+CONFIG_PREEMPT_VOLUNTARY=y
+CONFIG_CMDLINE="console=ttyAMA0"
+# CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set
+CONFIG_AARCH32_EMULATION=y
+CONFIG_NET=y
+CONFIG_PACKET=y
+CONFIG_UNIX=y
+CONFIG_INET=y
+CONFIG_IP_PNP=y
+CONFIG_IP_PNP_DHCP=y
+CONFIG_IP_PNP_BOOTP=y
+# CONFIG_INET_LRO is not set
+# CONFIG_IPV6 is not set
+# CONFIG_WIRELESS is not set
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
+CONFIG_DEVTMPFS=y
+# CONFIG_BLK_DEV is not set
+CONFIG_SCSI=y
+# CONFIG_SCSI_PROC_FS is not set
+CONFIG_BLK_DEV_SD=y
+# CONFIG_SCSI_LOWLEVEL is not set
+CONFIG_NETDEVICES=y
+CONFIG_MII=y
+# CONFIG_WLAN is not set
+CONFIG_INPUT_EVDEV=y
+# CONFIG_SERIO_I8042 is not set
+# CONFIG_SERIO_SERPORT is not set
+CONFIG_LEGACY_PTY_COUNT=16
+# CONFIG_HW_RANDOM is not set
+# CONFIG_HWMON is not set
+CONFIG_FB=y
+# CONFIG_VGA_CONSOLE is not set
+CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_LOGO=y
+# CONFIG_LOGO_LINUX_MONO is not set
+# CONFIG_LOGO_LINUX_VGA16 is not set
+# CONFIG_USB_SUPPORT is not set
+# CONFIG_IOMMU_SUPPORT is not set
+CONFIG_EXT2_FS=y
+CONFIG_EXT3_FS=y
+# CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set
+# CONFIG_EXT3_FS_XATTR is not set
+CONFIG_FUSE_FS=y
+CONFIG_CUSE=y
+CONFIG_VFAT_FS=y
+CONFIG_TMPFS=y
+# CONFIG_MISC_FILESYSTEMS is not set
+CONFIG_NFS_FS=y
+CONFIG_ROOT_NFS=y
+CONFIG_NLS_CODEPAGE_437=y
+CONFIG_NLS_ISO8859_1=y
+CONFIG_MAGIC_SYSRQ=y
+CONFIG_DEBUG_FS=y
+CONFIG_DEBUG_KERNEL=y
+# CONFIG_SCHED_DEBUG is not set
+CONFIG_DEBUG_INFO=y
+# CONFIG_FTRACE is not set
+CONFIG_ATOMIC64_SELFTEST=y
+CONFIG_DEBUG_ERRORS=y
diff --git a/arch/arm64/include/asm/prom.h b/arch/arm64/include/asm/prom.h
new file mode 100644
index 0000000..68b90e6
--- /dev/null
+++ b/arch/arm64/include/asm/prom.h
@@ -0,0 +1 @@
+/* Empty for now */
diff --git a/arch/arm64/kernel/.gitignore b/arch/arm64/kernel/.gitignore
new file mode 100644
index 0000000..c5f676c
--- /dev/null
+++ b/arch/arm64/kernel/.gitignore
@@ -0,0 +1 @@
+vmlinux.lds
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
new file mode 100644
index 0000000..59fbdef
--- /dev/null
+++ b/arch/arm64/kernel/Makefile
@@ -0,0 +1,27 @@
+#
+# Makefile for the linux kernel.
+#
+
+CPPFLAGS_vmlinux.lds	:= -DTEXT_OFFSET=$(TEXT_OFFSET)
+AFLAGS_head.o		:= -DTEXT_OFFSET=$(TEXT_OFFSET)
+
+# Object file lists.
+arm64-obj-y		:= debug-monitors.o elf.o entry.o irq.o	fpsimd.o	\
+			   entry-fpsimd.o process.o ptrace.o setup.o signal.o	\
+			   sys.o stacktrace.o time.o traps.o io.o vdso.o
+
+arm64-obj-$(CONFIG_AARCH32_EMULATION)	+= sys32.o kuser32.o signal32.o 	\
+					   sys_compat.o
+arm64-obj-$(CONFIG_MODULES)		+= arm64ksyms.o module.o
+arm64-obj-$(CONFIG_SMP)			+= smp.o
+arm64-obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o
+arm64-obj-$(CONFIG_HAVE_HW_BREAKPOINT)+= hw_breakpoint.o
+
+obj-y					+= $(arm64-obj-y) vdso/
+obj-m					+= $(arm64-obj-m)
+head-y					:= head.o
+extra-y					:= $(head-y) vmlinux.lds
+
+# vDSO - this must be built first to generate the symbol offsets
+$(call objectify,$(arm64-obj-y)): $(obj)/vdso/vdso-offsets.h
+$(obj)/vdso/vdso-offsets.h: $(obj)/vdso
diff --git a/arch/arm64/kernel/vmlinux.lds.S b/arch/arm64/kernel/vmlinux.lds.S
new file mode 100644
index 0000000..5eab87b
--- /dev/null
+++ b/arch/arm64/kernel/vmlinux.lds.S
@@ -0,0 +1,146 @@
+/*
+ * ld script to make ARM Linux kernel
+ * taken from the i386 version by Russell King
+ * Written by Martin Mares <mj@atrey.karlin.mff.cuni.cz>
+ */
+
+#include <asm-generic/vmlinux.lds.h>
+#include <asm/thread_info.h>
+#include <asm/memory.h>
+#include <asm/page.h>
+
+#define PROC_INFO							\
+	VMLINUX_SYMBOL(__proc_info_begin) = .;				\
+	*(.proc.info.init)						\
+	VMLINUX_SYMBOL(__proc_info_end) = .;
+
+#define ARM_CPU_DISCARD(x)	x
+#define ARM_CPU_KEEP(x)
+
+#define ARM_EXIT_KEEP(x)
+#define ARM_EXIT_DISCARD(x)	x
+
+OUTPUT_ARCH(aarch64)
+ENTRY(stext)
+
+jiffies = jiffies_64;
+
+SECTIONS
+{
+	/*
+	 * XXX: The linker does not define how output sections are
+	 * assigned to input sections when there are multiple statements
+	 * matching the same input section name.  There is no documented
+	 * order of matching.
+	 */
+	/DISCARD/ : {
+		ARM_EXIT_DISCARD(EXIT_TEXT)
+		ARM_EXIT_DISCARD(EXIT_DATA)
+		EXIT_CALL
+		*(.discard)
+		*(.discard.*)
+	}
+
+	. = PAGE_OFFSET + TEXT_OFFSET;
+
+	.head.text : {
+		_text = .;
+		HEAD_TEXT
+	}
+	.text : {			/* Real text segment		*/
+		_stext = .;		/* Text and read-only data	*/
+			*(.smp.pen.text)
+			__exception_text_start = .;
+			*(.exception.text)
+			__exception_text_end = .;
+			IRQENTRY_TEXT
+			TEXT_TEXT
+			SCHED_TEXT
+			LOCK_TEXT
+			*(.fixup)
+			*(.gnu.warning)
+		. = ALIGN(16);
+		*(.got)			/* Global offset table		*/
+			ARM_CPU_KEEP(PROC_INFO)
+	}
+
+	RO_DATA(PAGE_SIZE)
+
+	_etext = .;			/* End of text and rodata section */
+
+	. = ALIGN(PAGE_SIZE);
+	__init_begin = .;
+
+	INIT_TEXT_SECTION(8)
+	.exit.text : {
+		ARM_EXIT_KEEP(EXIT_TEXT)
+	}
+	. = ALIGN(16);
+	.init.proc.info : {
+		ARM_CPU_DISCARD(PROC_INFO)
+	}
+	. = ALIGN(16);
+	.init.data : {
+		INIT_DATA
+		INIT_SETUP(16)
+		INIT_CALLS
+		CON_INITCALL
+		SECURITY_INITCALL
+		INIT_RAM_FS
+	}
+	.exit.data : {
+		ARM_EXIT_KEEP(EXIT_DATA)
+	}
+
+	PERCPU_SECTION(64)
+
+	__init_end = .;
+	. = ALIGN(THREAD_SIZE);
+	__data_loc = .;
+
+	.data : AT(__data_loc) {
+		_data = .;		/* address in memory */
+		_sdata = .;
+
+		/*
+		 * first, the init task union, aligned
+		 * to an 8192 byte boundary.
+		 */
+		INIT_TASK_DATA(THREAD_SIZE)
+		NOSAVE_DATA
+		CACHELINE_ALIGNED_DATA(64)
+		READ_MOSTLY_DATA(64)
+
+		/*
+		 * The exception fixup table (might need resorting at runtime)
+		 */
+		. = ALIGN(32);
+		__start___ex_table = .;
+		*(__ex_table)
+		__stop___ex_table = .;
+
+		/*
+		 * and the usual data section
+		 */
+		DATA_DATA
+		CONSTRUCTORS
+
+		_edata = .;
+	}
+	_edata_loc = __data_loc + SIZEOF(.data);
+
+	NOTES
+
+	BSS_SECTION(0, 0, 0)
+	_end = .;
+
+	STABS_DEBUG
+	.comment 0 : { *(.comment) }
+}
+
+/*
+ * This must never be empty.
+ * If you have to comment this assert statement out, your
+ * binutils is too old (for other reasons as well).
+ */
+ASSERT((__proc_info_end - __proc_info_begin), "missing CPU support")
diff --git a/arch/arm64/mm/Kconfig b/arch/arm64/mm/Kconfig
new file mode 100644
index 0000000..8e94e52
--- /dev/null
+++ b/arch/arm64/mm/Kconfig
@@ -0,0 +1,5 @@
+config MMU
+	def_bool y
+
+config CPU_64
+	def_bool y
diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
new file mode 100644
index 0000000..81a9d8b
--- /dev/null
+++ b/arch/arm64/mm/Makefile
@@ -0,0 +1,6 @@
+obj-y				:= dma-mapping.o extable.o fault.o init.o \
+				   cache.o copypage.o flush.o \
+				   ioremap.o mmap.o pgd.o mmu.o \
+				   context.o tlb.o proc.o
+
+obj-$(CONFIG_MODULES)		+= proc-syms.o
diff --git a/init/Kconfig b/init/Kconfig
index af6c7f8..8bfda46 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -1153,7 +1153,8 @@ menuconfig EXPERT
 
 config UID16
 	bool "Enable 16-bit UID system calls" if EXPERT
-	depends on ARM || BLACKFIN || CRIS || FRV || H8300 || X86_32 || M68K || (S390 && !64BIT) || SUPERH || SPARC32 || (SPARC64 && COMPAT) || UML || (X86_64 && IA32_EMULATION)
+	depends on ARM || BLACKFIN || CRIS || FRV || H8300 || X86_32 || M68K || (S390 && !64BIT) || SUPERH || SPARC32 || (SPARC64 && COMPAT) || UML || (X86_64 && IA32_EMULATION) \
+		|| AARCH32_EMULATION
 	default y
 	help
 	  This enables the legacy 16-bit UID syscall wrappers.
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2403a63..cfb4578 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -452,7 +452,8 @@ config SLUB_STATS
 config DEBUG_KMEMLEAK
 	bool "Kernel memory leak detector"
 	depends on DEBUG_KERNEL && EXPERIMENTAL && \
-		(X86 || ARM || PPC || MIPS || S390 || SPARC64 || SUPERH || MICROBLAZE || TILE)
+		(X86 || ARM || PPC || MIPS || S390 || SPARC64 || SUPERH || \
+		 MICROBLAZE || TILE || ARM64)
 
 	select DEBUG_FS
 	select STACKTRACE if STACKTRACE_SUPPORT
@@ -739,7 +740,8 @@ config DEBUG_BUGVERBOSE
 	bool "Verbose BUG() reporting (adds 70K)" if DEBUG_KERNEL && EXPERT
 	depends on BUG
 	depends on ARM || AVR32 || M32R || M68K || SPARC32 || SPARC64 || \
-		   FRV || SUPERH || GENERIC_BUG || BLACKFIN || MN10300 || TILE
+		   FRV || SUPERH || GENERIC_BUG || BLACKFIN || MN10300 || \
+		   TILE || ARM64
 	default y
 	help
 	  Say Y here to make BUG() panics output the file name and line number



* [PATCH v2 31/31] arm64: MAINTAINERS update
  2012-08-14 17:52 [PATCH v2 00/31] AArch64 Linux kernel port Catalin Marinas
                   ` (29 preceding siblings ...)
  2012-08-14 17:52 ` [PATCH v2 30/31] arm64: Build infrastructure Catalin Marinas
@ 2012-08-14 17:52 ` Catalin Marinas
  2012-08-15 15:57   ` Arnd Bergmann
  2012-08-17  9:36 ` [PATCH v2 00/31] AArch64 Linux kernel port Tony Lindgren
  31 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-14 17:52 UTC (permalink / raw)
  To: linux-arch, linux-arm-kernel; +Cc: linux-kernel, Arnd Bergmann

This patch updates the MAINTAINERS file for the AArch64 Linux kernel
port.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
---
 MAINTAINERS |    6 ++++++
 1 files changed, 6 insertions(+), 0 deletions(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 94b823f..6d7c5f4 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -1204,6 +1204,12 @@ S:	Maintained
 F:	arch/arm/mach-pxa/z2.c
 F:	arch/arm/mach-pxa/include/mach/z2.h
 
+ARM64 PORT (AARCH64 ARCHITECTURE)
+M:	Catalin Marinas <catalin.marinas@arm.com>
+L:	linux-arm-kernel@lists.infradead.org (moderated for non-subscribers)
+S:	Maintained
+F:	arch/arm64/
+
 ASC7621 HARDWARE MONITOR DRIVER
 M:	George Joseph <george.joseph@fairview5.com>
 L:	lm-sensors@lm-sensors.org



* Re: [PATCH v2 30/31] arm64: Build infrastructure
  2012-08-14 17:52 ` [PATCH v2 30/31] arm64: Build infrastructure Catalin Marinas
@ 2012-08-14 21:01   ` Sam Ravnborg
  2012-08-15 16:07   ` Arnd Bergmann
  2012-08-17  9:32   ` Tony Lindgren
  2 siblings, 0 replies; 170+ messages in thread
From: Sam Ravnborg @ 2012-08-14 21:01 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

> 
> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> new file mode 100644
> index 0000000..1ce3d04
> --- /dev/null
> +++ b/arch/arm64/Kconfig
> @@ -0,0 +1,261 @@
> +config ARM64
> +	def_bool y
> +	select OF
> +	select OF_EARLY_FLATTREE
> +	select IRQ_DOMAIN
> +	select HAVE_AOUT
> +	select HAVE_DMA_ATTRS
> +	select HAVE_DMA_API_DEBUG
> +	select HAVE_IDE
> +	select HAVE_MEMBLOCK
> +	select RTC_LIB
> +	select SYS_SUPPORTS_APM_EMULATION
> +	select HAVE_GENERIC_DMA_COHERENT
> +	select GENERIC_IOMAP
> +	select HAVE_IRQ_WORK
> +	select HAVE_PERF_EVENTS
> +	select HAVE_ARCH_TRACEHOOK
> +	select PERF_USE_VMALLOC
> +	select HAVE_HW_BREAKPOINT if PERF_EVENTS
> +	select HAVE_GENERIC_HARDIRQS
> +	select GENERIC_HARDIRQS_NO_DEPRECATED
> +	select HAVE_SPARSE_IRQ
> +	select SPARSE_IRQ
> +	select GENERIC_IRQ_SHOW
> +	select GENERIC_SMP_IDLE_THREAD
> +	select NO_BOOTMEM

If you keep this list sorted then merge conflicts are less likely.


> +	help
> +	  ARM 64-bit (AArch64) Linux support.
> +
> +config 64BIT
> +	def_bool y
> +
> +config ARCH_PHYS_ADDR_T_64BIT
> +	def_bool y
> +
> +config HAVE_PWM
> +	bool
> +
> +config SYS_SUPPORTS_APM_EMULATION
> +	bool
> +
> +config NO_IOPORT
> +	def_bool y
> +
> +config GENERIC_GPIO
> +	bool
> +
> +config GENERIC_TIME_VSYSCALL
> +	def_bool y
Please use select like all other archs do.


> +
> +config GENERIC_CLOCKEVENTS
> +	def_bool y
Again - please use select.

> +
> +config STACKTRACE_SUPPORT
> +	def_bool y
> +
> +config LOCKDEP_SUPPORT
> +	def_bool y
> +
> +config TRACE_IRQFLAGS_SUPPORT
> +	def_bool y
> +
> +config HARDIRQS_SW_RESEND
> +	def_bool y
Please use select.

> +
> +config GENERIC_IRQ_PROBE
> +	def_bool y
Please use select.

> +
> +config GENERIC_LOCKBREAK
> +	def_bool y
> +	depends on SMP && PREEMPT
> +
> +config RWSEM_GENERIC_SPINLOCK
> +	def_bool y
> +
> +config RWSEM_XCHGADD_ALGORITHM
> +	bool
> +
> +config ARCH_HAS_ILOG2_U32
> +	bool
> +
> +config ARCH_HAS_ILOG2_U64
> +	bool
> +
> +config ARCH_HAS_CPUFREQ
> +	bool
> +	help
> +	  Internal node to signify that the ARCH has CPUFREQ support
> +	  and that the relevant menu configurations are displayed for
> +	  it.
> +
> +config GENERIC_HWEIGHT
> +	def_bool y
> +
> +config GENERIC_CSUM
> +        def_bool y
> +
> +config GENERIC_CALIBRATE_DELAY
> +	def_bool y
> +
> +config ZONE_DMA32
> +	def_bool y
> +
> +config ARCH_DMA_ADDR_T_64BIT
> +	def_bool y
> +
> +config NEED_DMA_MAP_STATE
> +	def_bool y
> +
> +config NEED_SG_DMA_LENGTH
> +	def_bool y
> +
> +config SWIOTLB
> +	def_bool y
> +
> +config IOMMU_HELPER
> +	def_bool SWIOTLB
> +


	Sam


* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-14 17:52 ` [PATCH v2 02/31] arm64: Kernel booting and initialisation Catalin Marinas
@ 2012-08-14 23:06   ` Olof Johansson
  2012-08-15 17:37     ` Catalin Marinas
  2012-08-15 13:20   ` Arnd Bergmann
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-14 23:06 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi,


On Tue, Aug 14, 2012 at 06:52:03PM +0100, Catalin Marinas wrote:

> +Before jumping into the kernel, the following conditions must be met:
> +
> +- Quiesce all DMA capable devices so that memory does not get
> +  corrupted by bogus network packets or disk data.  This will save
> +  you many hours of debug.
> +
> +- Primary CPU general-purpose register settings
> +  x0 = physical address of device tree blob (dtb) in system RAM.
> +
> +- CPU mode
> +  All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError,
> +  IRQ and FIQ).
> +  The CPU must be in either EL2 (RECOMMENDED in order to have access to
> +  the virtualisation extensions) or non-secure EL1.
> +
> +- Caches, MMUs
> +  The MMU must be off.
> +  Instruction cache may be on or off.
> +  Data cache must be off and invalidated.
> +
> +- Architected timers
> +  CNTFRQ must be programmed with the timer frequency.
> +  If entering the kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0)
> +  set where available.
> +
> +- Coherency
> +  All CPUs to be booted by the kernel must be part of the same coherency
> +  domain on entry to the kernel.  This may require IMPLEMENTATION DEFINED
> +  initialisation to enable the receiving of maintenance operations on
> +  each CPU.
> +
> +- System registers
> +  All writable architected system registers at the exception level where
> +  the kernel image will be entered must be initialised by software at a
> +  higher exception level to prevent execution in an UNKNOWN state.

Given the recent development of ARM platforms, you might want to mandate
the state of IOMMUs as well (they should probably be off, since there
should be no DMA in flight). Graphics would be the exception, since
keeping a splash screen on display means the scanout DMA has to keep
running.

> +- The primary CPU must jump directly to the first instruction of the
> +  kernel image.  The device tree blob passed by this CPU must contain
> +  for each CPU node:
> +
> +    1. An 'enable-method' property. Currently, the only supported value
> +       for this field is the string "spin-table".
> +
> +    2. A 'cpu-release-addr' property identifying a 64-bit,
> +       zero-initialised memory location.

These would be good to have documented in the
Documentation/devicetree/bindings hierarchy as well.

> index 0000000..d766493
> --- /dev/null
> +++ b/arch/arm64/include/asm/setup.h
> @@ -0,0 +1,26 @@
> +/*
> + * Based on arch/arm/include/asm/setup.h
> + *
> + * Copyright (C) 1997-1999 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_SETUP_H
> +#define __ASM_SETUP_H
> +
> +#include <linux/types.h>
> +
> +#define COMMAND_LINE_SIZE 1024

Probably not a huge deal, and other architectures seem to be all over
the map on this, but you might want to go with a larger value now rather
than later. 2048 or 4096 perhaps?

> diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> new file mode 100644
> index 0000000..34ccdc0
> --- /dev/null
> +++ b/arch/arm64/kernel/head.S

[...]

> +/*
> + * Setup common bits before finally enabling the MMU. Essentially this is just
> + * loading the page table pointer and vector base registers.
> + *
> + * On entry to this code, x0 must contain the SCTLR_EL1 value for turning on
> + * the MMU.
> + */
> +__enable_mmu:

ENTRY()?

> +	ldr	x5, =vectors
> +	msr	vbar_el1, x5
> +	msr	ttbr0_el1, x25			// load TTBR0
> +	msr	ttbr1_el1, x26			// load TTBR1
> +	isb
> +	b	__turn_mmu_on
> +ENDPROC(__enable_mmu)

...or just END()? Same for a few of the other functions below.

> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> new file mode 100644
> index 0000000..f25186f
> --- /dev/null
> +++ b/arch/arm64/kernel/setup.c

[...]

> +static void __init setup_processor(void)
> +{
> +	struct proc_info_list *list;
> +
> +	/*
> +	 * locate processor in the list of supported processor
> +	 * types.  The linker builds this table for us from the
> +	 * entries in arch/arm/mm/proc.S
> +	 */

Probably from arch/arm64/... somewhere?


[...]

> +	printk("CPU: %s [%08x] revision %d\n",
> +	       cpu_name, read_cpuid_id(), read_cpuid_id() & 15);
> +
> +	sprintf(init_utsname()->machine, "aarch64");

> +	initial_boot_params = devtree;
> +	dt_root = of_get_flat_dt_root();
> +
> +	machine_name = of_get_flat_dt_prop(dt_root, "model", NULL);
> +	if (!machine_name)
> +		machine_name = of_get_flat_dt_prop(dt_root, "compatible", NULL);
> +	if (!machine_name)
> +		machine_name = "<unknown>";
> +	pr_info("Machine: %s\n", machine_name);

The "compatible" property is a list of strings, not a single string. It
would be more valuable to print the entry that was actually matched for
the platform than whatever the device tree happens to provide.
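
To illustrate the point about "compatible" being a string list, here is
a minimal user-space sketch of walking such a property and returning the
first entry that the kernel side also knows about. It assumes the usual
flattened-device-tree layout (NUL-terminated strings packed back to
back, with a total byte length). The function name and the "supported"
table are hypothetical and not part of the patch.

```c
#include <stddef.h>
#include <string.h>

/*
 * Walk a device-tree style string list: several NUL-terminated strings
 * packed back to back, 'len' being the total size in bytes.
 * Returns the first entry that also appears in 'supported', or NULL.
 * (Hypothetical helper, for illustration only.)
 */
static const char *first_matching_compatible(const char *prop, size_t len,
					     const char *const *supported,
					     size_t nsupported)
{
	const char *p = prop;
	const char *end = prop + len;

	while (p < end) {
		/* Find the terminator without running past the property. */
		const char *nul = memchr(p, '\0', (size_t)(end - p));
		size_t i;

		if (!nul)
			break;	/* malformed: last string is unterminated */

		for (i = 0; i < nsupported; i++)
			if (strcmp(p, supported[i]) == 0)
				return p;

		p = nul + 1;	/* step to the next string in the list */
	}
	return NULL;
}
```

With something along these lines the "Machine:" message could report
the matched entry and fall back to "model" only when nothing matches.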


-Olof


* Re: [PATCH v2 11/31] arm64: IRQ handling
  2012-08-14 17:52 ` [PATCH v2 11/31] arm64: IRQ handling Catalin Marinas
@ 2012-08-14 23:22   ` Aaro Koskinen
  0 siblings, 0 replies; 170+ messages in thread
From: Aaro Koskinen @ 2012-08-14 23:22 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, Marc Zyngier, Will Deacon,
	linux-kernel, Arnd Bergmann

Hi,

On Tue, Aug 14, 2012 at 06:52:12PM +0100, Catalin Marinas wrote:
> +void handle_IRQ(unsigned int irq, struct pt_regs *regs)
> +{
> +	struct pt_regs *old_regs = set_irq_regs(regs);
> +
> +	irq_enter();
> +
> +	/*
> +	 * Some hardware gives randomly wrong interrupts.  Rather
> +	 * than crashing, do something sensible.
> +	 */
> +	if (unlikely(irq >= nr_irqs)) {
> +		if (printk_ratelimit())
> +			pr_warning("Bad IRQ%u\n", irq);

I guess pr_warn_ratelimited() should be used for new code.

(See include/linux/printk.h, "Please don't use printk_ratelimit()...")
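
The reason behind that note in printk.h is that printk_ratelimit()
shares one rate-limit state across all of its callers, while the
pr_*_ratelimited() helpers keep a separate state per call site. A rough
user-space model of per-call-site limiting is sketched below. The
struct and function names are made up for illustration, and time is
passed in explicitly so the behaviour is easy to follow; it is not the
kernel's implementation.

```c
/* Hypothetical per-call-site state: at most 'burst' messages per 'interval'. */
struct rl_state {
	long interval;	/* window length, in arbitrary time units */
	int burst;	/* messages allowed per window */
	long begin;	/* start of the current window */
	int printed;	/* messages emitted in the current window */
};

/* Returns 1 if the caller may print at time 'now', 0 if it should stay quiet. */
static int rl_allow(struct rl_state *rs, long now)
{
	if (now - rs->begin >= rs->interval) {
		/* Window expired: start a new one. */
		rs->begin = now;
		rs->printed = 0;
	}
	if (rs->printed < rs->burst) {
		rs->printed++;
		return 1;
	}
	return 0;
}
```

Because each call site owns its own struct rl_state, a noisy caller
elsewhere cannot use up the budget of the "Bad IRQ" warning.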

A.


* Re: [PATCH v2 03/31] arm64: Exception handling
  2012-08-14 17:52 ` [PATCH v2 03/31] arm64: Exception handling Catalin Marinas
@ 2012-08-14 23:29   ` Olof Johansson
  2012-08-14 23:47     ` Thomas Gleixner
  2012-08-15 13:03   ` Arnd Bergmann
  1 sibling, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-14 23:29 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi,

This one is a bit denser, so just a quick first pass with a couple of minor
comments. I'll revisit the rest.

On Tue, Aug 14, 2012 at 06:52:04PM +0100, Catalin Marinas wrote:

> +el1_sp_pc:
> +	/*
> +	 *Stack or PC alignment exception handling
> +	 */
> +	mrs	x0, far_el1
> +	mov	x1, x25
> +	mov	x2, sp
> +	b	do_sp_pc_abort
> +el1_undef:
> +	/*
> +	 *Undefined instruction
> +	 */

Nit: Missing spaces in the comment here and the one above.

> +el0_undef:
> +	/*
> +	 *Undefined instruction
> +	 */
> +	mov	x0, sp
> +	b	do_undefinstr

Here too.

> diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> new file mode 100644
> index 0000000..8712a8e
> --- /dev/null
> +++ b/arch/arm64/kernel/traps.c
[...]
> +DEFINE_SPINLOCK(die_lock);

Should probably be static.



* Re: [PATCH v2 03/31] arm64: Exception handling
  2012-08-14 23:29   ` Olof Johansson
@ 2012-08-14 23:47     ` Thomas Gleixner
  0 siblings, 0 replies; 170+ messages in thread
From: Thomas Gleixner @ 2012-08-14 23:47 UTC (permalink / raw)
  To: Olof Johansson
  Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel,
	Arnd Bergmann, Will Deacon

On Tue, 14 Aug 2012, Olof Johansson wrote:
> > diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c
> > new file mode 100644
> > index 0000000..8712a8e
> > --- /dev/null
> > +++ b/arch/arm64/kernel/traps.c
> [...]
> > +DEFINE_SPINLOCK(die_lock);
> 
> Should probably be static.

And RAW_, i.e. DEFINE_RAW_SPINLOCK().


* Re: [PATCH v2 07/31] arm64: Process management
  2012-08-14 17:52 ` [PATCH v2 07/31] arm64: Process management Catalin Marinas
@ 2012-08-14 23:50   ` Olof Johansson
  2012-09-14 17:33     ` Catalin Marinas
  2012-08-15 13:53   ` Arnd Bergmann
  2012-08-16 15:09   ` Tobias Klauser
  2 siblings, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-14 23:50 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, Will Deacon, linux-kernel, Arnd Bergmann

Hi,

On Tue, Aug 14, 2012 at 06:52:08PM +0100, Catalin Marinas wrote:

> diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> new file mode 100644
> index 0000000..c4a4e1c
> --- /dev/null
> +++ b/arch/arm64/kernel/process.c
> @@ -0,0 +1,416 @@

[...]
> +/*
> + * Function pointers to optional machine specific functions
> + */
> +void (*pm_power_off)(void);
> +EXPORT_SYMBOL(pm_power_off);
> +
> +void (*pm_restart)(const char *cmd);
> +EXPORT_SYMBOL_GPL(pm_restart);
[...]
> +void (*pm_idle)(void) = default_idle;
> +EXPORT_SYMBOL(pm_idle);

Does it really make sense to export these to modules?

I find the powerpc way of having a machine descriptor structure with these
(and other) function pointers in it a bit cleaner, since it gives you
one place to plug it all in. I'd recommend that you consider doing that
here as well, for these three and potentially other cases in the future.

(See arch/powerpc/include/asm/machdep.h, struct machdep_calls).
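
As a rough sketch of what such a descriptor could look like for the
three hooks above: the struct name, field names and helper below are
hypothetical (they are not taken from the patch or from powerpc's
struct machdep_calls), and the code is a user-space model only.

```c
#include <stddef.h>

/* Hypothetical machine descriptor; one place to plug platform hooks in. */
struct machine_ops {
	void (*power_off)(void);
	void (*restart)(const char *cmd);
	void (*idle)(void);
};

/* Single instance; all hooks are NULL until platform code fills them in. */
static struct machine_ops machine_ops;

/* Returns 1 if a platform hook was called, 0 if none was registered. */
static int machine_call_power_off(void)
{
	if (!machine_ops.power_off)
		return 0;
	machine_ops.power_off();
	return 1;
}

/* Stand-in for real board code, so the sketch can be exercised. */
static int demo_power_off_calls;

static void demo_power_off(void)
{
	demo_power_off_calls++;
}
```

Platform code would then assign machine_ops.power_off once at init, and
only the descriptor (if anything) would need to be visible to modules.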

> +void machine_halt(void)
> +{
> +	machine_shutdown();
> +	while (1);
> +}
> +
> +void machine_power_off(void)
> +{
> +	machine_shutdown();
> +	if (pm_power_off)
> +		pm_power_off();
> +}

Printing something here along the lines of "System halted, OK to power off"
is useful.


-Olof


* Re: [PATCH v2 08/31] arm64: CPU support
  2012-08-14 17:52 ` [PATCH v2 08/31] arm64: CPU support Catalin Marinas
@ 2012-08-15  0:10   ` Olof Johansson
  2012-08-20 15:57     ` Catalin Marinas
  2012-09-14 17:38     ` Catalin Marinas
  2012-08-15 13:56   ` Arnd Bergmann
  1 sibling, 2 replies; 170+ messages in thread
From: Olof Johansson @ 2012-08-15  0:10 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi,

On Tue, Aug 14, 2012 at 06:52:09PM +0100, Catalin Marinas wrote:

> diff --git a/arch/arm64/include/asm/cputype.h b/arch/arm64/include/asm/cputype.h
> new file mode 100644
> index 0000000..ef54125
> --- /dev/null
> +++ b/arch/arm64/include/asm/cputype.h
> @@ -0,0 +1,49 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_CPUTYPE_H
> +#define __ASM_CPUTYPE_H
> +
> +#define ID_MIDR_EL1		"midr_el1"
> +#define ID_CTR_EL0		"ctr_el0"
> +
> +#define ID_AA64PFR0_EL1		"id_aa64pfr0_el1"
> +#define ID_AA64DFR0_EL1		"id_aa64dfr0_el1"
> +#define ID_AA64AFR0_EL1		"id_aa64afr0_el1"
> +#define ID_AA64ISAR0_EL1	"id_aa64isar0_el1"
> +#define ID_AA64MMFR0_EL1	"id_aa64mmfr0_el1"
> +
> +#define read_cpuid(reg) ({						\
> +	u64 __val;							\
> +	asm("mrs	%0, " reg : "=r" (__val));			\
> +	__val;								\
> +})
> +
> +/*
> + * The CPU ID never changes at run time, so we might as well tell the
> + * compiler that it's constant.  Use this function to read the CPU ID
> + * rather than directly reading processor_id or read_cpuid() directly.
> + */
> +static inline u32 __attribute_const__ read_cpuid_id(void)
> +{
> +	return read_cpuid(ID_MIDR_EL1);
> +}
> +
> +static inline u32 __attribute_const__ read_cpuid_cachetype(void)
> +{
> +	return read_cpuid(ID_CTR_EL0);
> +}

Is this perhaps a carry-over from arch/arm? Abstracting out read_cpuid()
doesn't seem to buy anything here; just open-code the one-line assembly
in each.

Might as well clean up the naming a little too while you're at it, i.e.
read_cpu_id() and read_cpu_cachetype().


> diff --git a/arch/arm64/include/asm/procinfo.h b/arch/arm64/include/asm/procinfo.h
> new file mode 100644
> index 0000000..81fece9
> --- /dev/null
> +++ b/arch/arm64/include/asm/procinfo.h
> @@ -0,0 +1,44 @@
> +/*
> + * Based on arch/arm/include/asm/procinfo.h
> + *
> + * Copyright (C) 1996-1999 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_PROCINFO_H
> +#define __ASM_PROCINFO_H
> +
> +#ifdef __KERNEL__
> +
> +/*
> + * Note!  struct processor is always defined if we're
> + * using MULTI_CPU, otherwise this entry is unused,
> + * but still exists.

Stale comment?

> + *
> + * NOTE! The following structure is defined by assembly
> + * language, NOT C code.  For more information, check:
> + *  arch/arm/mm/proc-*.S and arch/arm/kernel/head.S

Stale references. Also, no current arm64 implementation uses this. Premature
abstraction perhaps?

> +struct proc_info_list {
> +	unsigned int		cpu_val;
> +	unsigned int		cpu_mask;
> +	unsigned long		__cpu_flush;		/* used by head.S */
> +	const char		*cpu_name;
> +};
> +
> +#else	/* __KERNEL__ */
> +#include <asm/elf.h>
> +#warning "Please include asm/elf.h instead"
> +#endif	/* __KERNEL__ */
> +#endif
> diff --git a/arch/arm64/mm/proc-syms.c b/arch/arm64/mm/proc-syms.c
> new file mode 100644
> index 0000000..2d99ef9
> --- /dev/null
> +++ b/arch/arm64/mm/proc-syms.c
> @@ -0,0 +1,31 @@
> +/*
> + * Based on arch/arm/mm/proc-syms.c
> + *
> + * Copyright (C) 2000-2002 Russell King
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/export.h>
> +#include <linux/mm.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/proc-fns.h>
> +#include <asm/tlbflush.h>
> +#include <asm/page.h>
> +
> +EXPORT_SYMBOL(__cpuc_flush_kern_all);
> +EXPORT_SYMBOL(__cpuc_flush_user_all);
> +EXPORT_SYMBOL(__cpuc_flush_user_range);
> +EXPORT_SYMBOL(__cpuc_coherent_kern_range);
> +EXPORT_SYMBOL(__cpuc_flush_dcache_area);

See comment on other email about putting function pointers in a struct
instead.

> diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
> new file mode 100644
> index 0000000..453f517
> --- /dev/null
> +++ b/arch/arm64/mm/proc.S
> @@ -0,0 +1,193 @@
> +	.section ".proc.info.init", #alloc, #execinstr
> +
> +	.type	__v8_proc_info, #object
> +__v8_proc_info:
> +	.long	0x000f0000			// Required ID value
> +	.long	0x000f0000			// Mask for ID
> +	b	__cpu_setup
> +	nop
> +	.quad	cpu_name
> +	.long	0
> +	.size	__v8_proc_info, . - __v8_proc_info

I know this is a carry-over from arch/arm, but how about moving this
to more of a C construct similar to arch/powerpc/kernel/cputable.c
instead? It's considerably easier to read that way, and it's convenient
to have the definitions all in one place, making it easier to share some
of the functions, etc.
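
A minimal sketch of the C-table approach, assuming the same value/mask
matching that the assembly record above encodes. The struct layout, the
lookup helper and the name string are illustrative placeholders, not a
copy of powerpc's cputable.c or of anything in this series.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical C counterpart of the assembly proc_info record. */
struct cpu_table_entry {
	uint32_t cpu_val;	/* required ID value after masking */
	uint32_t cpu_mask;	/* ID bits that take part in the match */
	const char *cpu_name;	/* placeholder name string */
};

static const struct cpu_table_entry cpu_table[] = {
	/* Same value/mask pair as __v8_proc_info in the quoted proc.S. */
	{ 0x000f0000, 0x000f0000, "example-v8-cpu" },
};

/* Return the first table entry matching 'midr', or NULL if unsupported. */
static const struct cpu_table_entry *lookup_cpu(uint32_t midr)
{
	size_t i;

	for (i = 0; i < sizeof(cpu_table) / sizeof(cpu_table[0]); i++)
		if ((midr & cpu_table[i].cpu_mask) == cpu_table[i].cpu_val)
			return &cpu_table[i];
	return NULL;
}
```

Per-CPU setup functions could be added as further function-pointer
members, which is where sharing between entries becomes easy.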


-Olof


* Re: [PATCH v2 12/31] arm64: Atomic operations
  2012-08-14 17:52 ` [PATCH v2 12/31] arm64: Atomic operations Catalin Marinas
@ 2012-08-15  0:21   ` Olof Johansson
  0 siblings, 0 replies; 170+ messages in thread
From: Olof Johansson @ 2012-08-15  0:21 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi,

On Tue, Aug 14, 2012 at 06:52:13PM +0100, Catalin Marinas wrote:
> This patch introduces the atomic, mutex and futex operations. Many
> atomic operations use the load-acquire and store-release operations
> which imply barriers, avoiding the need for explicit DMB.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/atomic.h |  306 +++++++++++++++++++++++++++++++++++++++
>  arch/arm64/include/asm/futex.h  |  134 +++++++++++++++++
>  2 files changed, 440 insertions(+), 0 deletions(-)
>  create mode 100644 arch/arm64/include/asm/atomic.h
>  create mode 100644 arch/arm64/include/asm/futex.h
> 
> diff --git a/arch/arm64/include/asm/atomic.h b/arch/arm64/include/asm/atomic.h
> new file mode 100644
> index 0000000..fa60c8b
> --- /dev/null
> +++ b/arch/arm64/include/asm/atomic.h
> @@ -0,0 +1,306 @@
> +/*
> + * Based on arch/arm/include/asm/atomic.h
> + *
> + * Copyright (C) 1996 Russell King.
> + * Copyright (C) 2002 Deep Blue Solutions Ltd.
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_ATOMIC_H
> +#define __ASM_ATOMIC_H
> +
> +#include <linux/compiler.h>
> +#include <linux/types.h>
> +
> +#include <asm/barrier.h>
> +#include <asm/cmpxchg.h>
> +
> +#define ATOMIC_INIT(i)	{ (i) }
> +
> +#ifdef __KERNEL__
> +
> +/*
> + * On ARM, ordinary assignment (str instruction) doesn't clear the local
> + * strex/ldrex monitor on some implementations. The reason we can use it for
> + * atomic_set() is the clrex or dummy strex done on every exception return.
> + */
> +#define atomic_read(v)	(*(volatile int *)&(v)->counter)
> +#define atomic_set(v,i)	(((v)->counter) = (i))
> +
> +/*
> + * AArch64 UP and SMP safe atomic ops.  We use load exclusive and
> + * store exclusive to ensure that these are atomic.  We may loop
> + * to ensure that the update happens.
> + */
> +static inline void atomic_add(int i, atomic_t *v)
> +{
> +	unsigned long tmp;
> +	int result;
> +
> +	asm volatile("// atomic_add\n"
> +"1:	ldxr	%w0, [%3]\n"
> +"	add	%w0, %w0, %w4\n"
> +"	stxr	%w1, %w0, [%3]\n"
> +"	cbnz	%w1,1b"

Nit: space before 1b
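
For readers less familiar with the load-exclusive/store-exclusive retry
loop quoted above, a portable analogue can be written with C11 atomics.
This is only a sketch of the same "retry until the update lands"
pattern: it uses a compare-and-swap loop rather than ldxr/stxr, the
function name is made up, and relaxed ordering is assumed to mirror the
absence of barriers in the quoted atomic_add().

```c
#include <stdatomic.h>

/* Hypothetical user-space analogue of the quoted atomic_add() loop. */
static void sketch_atomic_add(int i, atomic_int *v)
{
	int old = atomic_load_explicit(v, memory_order_relaxed);

	/*
	 * Try to replace 'old' with 'old + i'.  If another thread changed
	 * the value in the meantime (or the weak CAS failed spuriously),
	 * 'old' is refreshed with the current value and we loop, much like
	 * the cbnz branching back to label 1 when stxr reports failure.
	 */
	while (!atomic_compare_exchange_weak_explicit(v, &old, old + i,
						      memory_order_relaxed,
						      memory_order_relaxed))
		;
}
```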

[...]

> diff --git a/arch/arm64/include/asm/futex.h b/arch/arm64/include/asm/futex.h
> new file mode 100644
> index 0000000..0745e82
> --- /dev/null
> +++ b/arch/arm64/include/asm/futex.h
> @@ -0,0 +1,134 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_FUTEX_H
> +#define __ASM_FUTEX_H
> +
> +#ifdef __KERNEL__
> +
> +#include <linux/futex.h>
> +#include <linux/uaccess.h>
> +#include <asm/errno.h>
> +
> +#define __futex_atomic_op(insn, ret, oldval, uaddr, tmp, oparg)		\
> +	asm volatile(							\
> +"1:	ldaxr	%w1, %2\n"						\
> +	insn "\n"							\
> +"2:	stlxr	%w3, %w0, %2\n"						\
> +"	cbnz	%w3, 1b\n"						\
> +"3:	.pushsection __ex_table,\"a\"\n"				\
> +"	.align	3\n"							\
> +"	.quad	1b, 4f, 2b, 4f\n"					\
> +"	.popsection\n"							\
> +"	.pushsection .fixup,\"ax\"\n"					\

Moving the exception table below the body of the code makes the flow easier to
read; please do that.

Also, don't you need a barrier here?

> +"4:	mov	%w0, %w5\n"						\
> +"	b	3b\n"							\
> +"	.popsection"							\
> +	: "=&r" (ret), "=&r" (oldval), "+Q" (*uaddr), "=&r" (tmp)	\
> +	: "r" (oparg), "Ir" (-EFAULT)					\
> +	: "cc")
> +
> +static inline int
> +futex_atomic_op_inuser (int encoded_op, u32 __user *uaddr)
> +{
> +	int op = (encoded_op >> 28) & 7;
> +	int cmp = (encoded_op >> 24) & 15;
> +	int oparg = (encoded_op << 8) >> 20;
> +	int cmparg = (encoded_op << 20) >> 20;
> +	int oldval = 0, ret, tmp;
> +
> +	if (encoded_op & (FUTEX_OP_OPARG_SHIFT << 28))
> +		oparg = 1 << oparg;
> +
> +	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
> +		return -EFAULT;
> +
> +	pagefault_disable();	/* implies preempt_disable() */
> +
> +	switch (op) {
> +	case FUTEX_OP_SET:
> +		__futex_atomic_op("mov	%w0, %w4",
> +				  ret, oldval, uaddr, tmp, oparg);
> +		break;
> +	case FUTEX_OP_ADD:
> +		__futex_atomic_op("add	%w0, %w1, %w4",
> +				  ret, oldval, uaddr, tmp, oparg);
> +		break;
> +	case FUTEX_OP_OR:
> +		__futex_atomic_op("orr	%w0, %w1, %w4",
> +				  ret, oldval, uaddr, tmp, oparg);
> +		break;
> +	case FUTEX_OP_ANDN:
> +		__futex_atomic_op("and	%w0, %w1, %w4",
> +				  ret, oldval, uaddr, tmp, ~oparg);
> +		break;
> +	case FUTEX_OP_XOR:
> +		__futex_atomic_op("eor	%w0, %w1, %w4",
> +				  ret, oldval, uaddr, tmp, oparg);
> +		break;
> +	default:
> +		ret = -ENOSYS;
> +	}
> +
> +	pagefault_enable();	/* subsumes preempt_enable() */
> +
> +	if (!ret) {
> +		switch (cmp) {
> +		case FUTEX_OP_CMP_EQ: ret = (oldval == cmparg); break;
> +		case FUTEX_OP_CMP_NE: ret = (oldval != cmparg); break;
> +		case FUTEX_OP_CMP_LT: ret = (oldval < cmparg); break;
> +		case FUTEX_OP_CMP_GE: ret = (oldval >= cmparg); break;
> +		case FUTEX_OP_CMP_LE: ret = (oldval <= cmparg); break;
> +		case FUTEX_OP_CMP_GT: ret = (oldval > cmparg); break;
> +		default: ret = -ENOSYS;
> +		}
> +	}
> +	return ret;
> +}
> +
> +static inline int
> +futex_atomic_cmpxchg_inatomic(u32 *uval, u32 __user *uaddr,
> +			      u32 oldval, u32 newval)
> +{
> +	int ret = 0;
> +	u32 val, tmp;
> +
> +	if (!access_ok(VERIFY_WRITE, uaddr, sizeof(u32)))
> +		return -EFAULT;
> +
> +	asm volatile("// futex_atomic_cmpxchg_inatomic\n"
> +"1:	ldaxr	%w1, %2\n"
> +"	sub	%w3, %w1, %w4\n"
> +"	cbnz	%w3, 3f\n"
> +"2:	stlxr	%w3, %w5, %2\n"
> +"	cbnz	%w3, 1b\n"
> +"3:	.pushsection __ex_table,\"a\"\n"
> +"	.align	3\n"
> +"	.quad	1b, 4f, 2b, 4f\n"
> +"	.popsection\n"
> +"	.pushsection .fixup,\"ax\"\n"

Same here w.r.t. exception table location and barrier.

> +"4:	mov	%w0, %w6\n"
> +"	b	3b\n"
> +"	.popsection"
> +	: "+r" (ret), "=&r" (val), "+Q" (*uaddr), "=&r" (tmp)
> +	: "r" (oldval), "r" (newval), "Ir" (-EFAULT)
> +	: "cc", "memory");
> +
> +	*uval = val;
> +	return ret;
> +}
> +
> +#endif /* __KERNEL__ */
> +#endif /* __ASM_FUTEX_H */


-Olof

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 13/31] arm64: Device specific operations
  2012-08-14 17:52 ` [PATCH v2 13/31] arm64: Device specific operations Catalin Marinas
@ 2012-08-15  0:33   ` Olof Johansson
  2012-09-14 17:29     ` Catalin Marinas
  2012-08-15 16:13   ` Arnd Bergmann
  2012-08-17  9:19   ` Tony Lindgren
  2 siblings, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-15  0:33 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

On Tue, Aug 14, 2012 at 06:52:14PM +0100, Catalin Marinas wrote:
> This patch adds several definitions for device communication, including
> I/O accessors and ioremap(). The __raw_* accessors are implemented as
> inline asm to avoid compiler generation of post-indexed accesses (less
> efficient to emulate in a virtualised environment).
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/device.h |   26 ++++
>  arch/arm64/include/asm/fb.h     |   34 +++++
>  arch/arm64/include/asm/io.h     |  263 +++++++++++++++++++++++++++++++++++++++
>  arch/arm64/kernel/io.c          |   64 ++++++++++
>  arch/arm64/mm/ioremap.c         |   84 +++++++++++++
>  5 files changed, 471 insertions(+), 0 deletions(-)
>  create mode 100644 arch/arm64/include/asm/device.h
>  create mode 100644 arch/arm64/include/asm/fb.h
>  create mode 100644 arch/arm64/include/asm/io.h
>  create mode 100644 arch/arm64/kernel/io.c
>  create mode 100644 arch/arm64/mm/ioremap.c
> 
> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> new file mode 100644
> index 0000000..48fa83f
> --- /dev/null
> +++ b/arch/arm64/include/asm/io.h

[...]

> +/*
> + *  I/O port access primitives.
> + */
> +#define IO_SPACE_LIMIT		0xffff
> +
> +/*
> + * We currently don't have any platform with PCI support, so just leave this
> + * defined to 0 until needed.
> + */
> +#define PCI_IOBASE		((void __iomem *)0)

You could just leave out the PCI / I/O code altogether instead.

> diff --git a/arch/arm64/kernel/io.c b/arch/arm64/kernel/io.c
> new file mode 100644
> index 0000000..7d37ead
> --- /dev/null
> +++ b/arch/arm64/kernel/io.c
> @@ -0,0 +1,64 @@
> +/*
> + * Based on arch/arm/kernel/io.c
> + *
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/export.h>
> +#include <linux/types.h>
> +#include <linux/io.h>
> +
> +/*
> + * Copy data from IO memory space to "real" memory space.
> + */
> +void __memcpy_fromio(void *to, const volatile void __iomem *from, size_t count)
> +{
> +	unsigned char *t = to;
> +	while (count) {
> +		count--;
> +		*t = readb(from);
> +		t++;
> +		from++;
> +	}
> +}
> +EXPORT_SYMBOL(__memcpy_fromio);
> +
> +/*
> + * Copy data from "real" memory space to IO memory space.
> + */
> +void __memcpy_toio(volatile void __iomem *to, const void *from, size_t count)
> +{
> +	const unsigned char *f = from;
> +	while (count) {
> +		count--;
> +		writeb(*f, to);
> +		f++;
> +		to++;
> +	}
> +}
> +EXPORT_SYMBOL(__memcpy_toio);
> +
> +/*
> + * "memset" on IO memory space.
> + */
> +void __memset_io(volatile void __iomem *dst, int c, size_t count)
> +{
> +	while (count) {
> +		count--;
> +		writeb(c, dst);
> +		dst++;
> +	}
> +}
> +EXPORT_SYMBOL(__memset_io);

Doing all of the above a byte at a time is horribly inefficient. Feel
free to borrow the implementations from arch/powerpc/kernel/io.c instead
of from ARM.
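
For illustration only, a userspace sketch of what a wider copy loop could
look like. This is not the powerpc implementation; mock_readb(),
mock_readq() and sketch_memcpy_fromio() are made-up names, and the mock
accessors are assumed to stand in for the real __iomem accessors. The idea
is to copy single bytes until the source is 8-byte aligned, move the bulk
with 64-bit reads, then finish with single bytes:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Userspace stand-ins for the MMIO accessors (assumption: in the kernel
 * these would be readb()/readq() style accessors on __iomem pointers).
 */
static inline uint8_t mock_readb(const volatile void *addr)
{
	return *(const volatile uint8_t *)addr;
}

static inline uint64_t mock_readq(const volatile void *addr)
{
	return *(const volatile uint64_t *)addr;
}

/* Copy from "I/O" memory, using 64-bit reads where alignment allows. */
void sketch_memcpy_fromio(void *to, const volatile void *from, size_t count)
{
	unsigned char *t = to;
	const volatile unsigned char *f = from;

	/* Leading bytes until the source is 8-byte aligned. */
	while (count && ((uintptr_t)f & 7)) {
		*t++ = mock_readb(f++);
		count--;
	}

	/* Bulk of the copy, one 64-bit read per iteration. */
	while (count >= 8) {
		uint64_t v = mock_readq(f);

		memcpy(t, &v, sizeof(v));	/* destination may be unaligned */
		t += 8;
		f += 8;
		count -= 8;
	}

	/* Trailing bytes. */
	while (count) {
		*t++ = mock_readb(f++);
		count--;
	}
}
```

Only the source alignment is handled here, since that is the side where
the access width matters for a device; a real version would have to decide
the same for __memcpy_toio() and __memset_io().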


-Olof

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 14/31] arm64: DMA mapping API
  2012-08-14 17:52 ` [PATCH v2 14/31] arm64: DMA mapping API Catalin Marinas
@ 2012-08-15  0:40   ` Olof Johansson
  2012-08-21 13:05     ` Catalin Marinas
  2012-08-15 16:16   ` Arnd Bergmann
  1 sibling, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-15  0:40 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann

Hi,


On Tue, Aug 14, 2012 at 06:52:15PM +0100, Catalin Marinas wrote:
> This patch adds support for the DMA mapping API. It uses dma_map_ops for
> flexibility and it currently supports swiotlb. This patch could be
> simplified further if the DMA accesses are coherent (not mandated by the
> architecture) or if corresponding hooks are placed in the generic
> swiotlb code to deal with cache maintenance.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/dma-mapping.h |  124 ++++++++++++++++++++
>  arch/arm64/mm/dma-mapping.c          |  208 ++++++++++++++++++++++++++++++++++
>  2 files changed, 332 insertions(+), 0 deletions(-)
>  create mode 100644 arch/arm64/include/asm/dma-mapping.h
>  create mode 100644 arch/arm64/mm/dma-mapping.c
> 
> diff --git a/arch/arm64/include/asm/dma-mapping.h b/arch/arm64/include/asm/dma-mapping.h
> new file mode 100644
> index 0000000..538f4b4
> --- /dev/null
> +++ b/arch/arm64/include/asm/dma-mapping.h
> @@ -0,0 +1,124 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_DMA_MAPPING_H
> +#define __ASM_DMA_MAPPING_H
> +
> +#ifdef __KERNEL__
> +
> +#include <linux/types.h>
> +#include <linux/vmalloc.h>
> +
> +#include <asm-generic/dma-coherent.h>
> +
> +#define ARCH_HAS_DMA_GET_REQUIRED_MASK
> +
> +extern struct dma_map_ops *dma_ops;
> 
> +static inline struct dma_map_ops *get_dma_ops(struct device *dev)
> +{
> +	if (unlikely(!dev) || !dev->archdata.dma_ops)
> +		return dma_ops;
> +	else
> +		return dev->archdata.dma_ops;
> +}

Does it make sense to add the concept of a global dma ops on arm64,
instead of requiring the dma ops pointer per device similar to how
some other platforms do it (including powerpc)? For devices that lack
archdata.dma_ops, dma_supported() should return 0 (and the other ops
should return error).
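
A minimal userspace sketch of the per-device-only variant, to make the
question concrete. The structures are cut-down, hypothetical stand-ins for
the kernel ones, and sketch_dma32_supported() is an invented example op;
the point is only that there is no global fallback and that a device
without archdata.dma_ops reports DMA as unsupported:

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-ins for the kernel structures. */
struct device;

struct dma_map_ops {
	int (*dma_supported)(struct device *dev, unsigned long long mask);
};

struct dev_archdata {
	struct dma_map_ops *dma_ops;
};

struct device {
	struct dev_archdata archdata;
};

/* No global fallback: a device without ops simply has none. */
static inline struct dma_map_ops *get_dma_ops(struct device *dev)
{
	if (!dev)
		return NULL;
	return dev->archdata.dma_ops;
}

/* Devices lacking archdata.dma_ops report that DMA is unsupported. */
int dma_supported(struct device *dev, unsigned long long mask)
{
	struct dma_map_ops *ops = get_dma_ops(dev);

	/* Sketch simplification: a missing op is also treated as "no". */
	if (!ops || !ops->dma_supported)
		return 0;
	return ops->dma_supported(dev, mask);
}

/* Invented example op: accept any mask of up to 32 bits. */
int sketch_dma32_supported(struct device *dev, unsigned long long mask)
{
	(void)dev;
	return mask <= 0xffffffffULL;
}
```

With this shape, whatever code creates the device (bus or platform code)
would be responsible for filling in archdata.dma_ops before a driver
binds.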



-Olof

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 15/31] arm64: SMP support
  2012-08-14 17:52 ` [PATCH v2 15/31] arm64: SMP support Catalin Marinas
@ 2012-08-15  0:49   ` Olof Johansson
  2012-08-15 13:04   ` Arnd Bergmann
  2012-08-17  9:21   ` Tony Lindgren
  2 siblings, 0 replies; 170+ messages in thread
From: Olof Johansson @ 2012-08-15  0:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann,
	Will Deacon, Marc Zyngier

Hi,

On Tue, Aug 14, 2012 at 06:52:16PM +0100, Catalin Marinas wrote:
> This patch adds SMP initialisation and spinlocks implementation for
> AArch64. The spinlock support uses the new load-acquire/store-release
> instructions to avoid explicit barriers. The architecture also specifies
> that an event is automatically generated when clearing the exclusive
> monitor state to wake up processors in WFE, so there is no need for an
> explicit DSB/SEV instruction sequence. The SEVL instruction is used to
> set the exclusive monitor locally as there is no conditional WFE and a
> branch is more expensive.
> 
> For the SMP booting protocol, see Documentation/arm64/booting.txt.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---

> diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
> new file mode 100644
> index 0000000..34a37fb
> --- /dev/null
> +++ b/arch/arm64/include/asm/spinlock.h
> @@ -0,0 +1,199 @@
> +/*
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_SPINLOCK_H
> +#define __ASM_SPINLOCK_H
> +
> +#include <asm/spinlock_types.h>
> +#include <asm/processor.h>
> +
> +/*
> + * AArch64 Spin-locking.
> + *
> + * We exclusively read the old value.  If it is zero, we may have
> + * won the lock, so we try exclusively storing it.  A memory barrier
> + * is required after we get a lock, and before we release it, because
> + * V6 CPUs are assumed to have weakly ordered memory.

This comment should be updated, to mention the implicit locking and remove the
reference to V6?

Also, ignore previous questions on another reply about need for barriers,
obviously not needed given the load-acquire/store-release semantics.



-Olof

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 01/31] arm64: Assembly macros and definitions
  2012-08-14 17:52 ` [PATCH v2 01/31] arm64: Assembly macros and definitions Catalin Marinas
@ 2012-08-15 12:57   ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 12:57 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> This patch introduces several assembly macros and definitions used in
> the .S files across arch/arm64/ like IRQ disabling/enabling, together
> with asm-offsets.c.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 03/31] arm64: Exception handling
  2012-08-14 17:52 ` [PATCH v2 03/31] arm64: Exception handling Catalin Marinas
  2012-08-14 23:29   ` Olof Johansson
@ 2012-08-15 13:03   ` Arnd Bergmann
  2012-08-16 10:05     ` Will Deacon
  1 sibling, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:03 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +#ifdef CONFIG_AARCH32_EMULATION
> +#define compat_thumb_mode(regs) \
> +	(((regs)->pstate & COMPAT_PSR_T_BIT))
> +#else
> +#define compat_thumb_mode(regs) (0)
> +#endif

The symbol we use on other platforms is CONFIG_COMPAT. I don't think you
need to have a separate CONFIG_AARCH32_EMULATION.

> +void __bad_xchg(volatile void *ptr, int size)
> +{
> +	printk("xchg: bad data size: pc 0x%p, ptr 0x%p, size %d\n",
> +		__builtin_return_address(0), ptr, size);
> +	BUG();
> +}
> +EXPORT_SYMBOL(__bad_xchg);
> +

I think we're better off not defining this function. My guess is that
initially the idea on ARM was that it was meant as a BUILD_BUG_ON
replacement, but then someone added this function, and you copied it.

Microblaze has the same declaration, but (correctly) misses the
definition, which produces a much more helpful link failure than
a run-time BUG(). Using BUILD_BUG_ON would be even better.
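
As a rough userspace illustration of the compile-time variant (a sketch,
not the kernel's BUILD_BUG_ON() or xchg(); SKETCH_BUILD_BUG_ON,
sketch_xchg and sketch_swap_u32 are invented names, and the exchange
itself is deliberately non-atomic because only the size check matters
here). It relies on the GCC/Clang statement-expression and __typeof__
extensions, as kernel code does:

```c
#include <stdint.h>

/* Userspace analogue of BUILD_BUG_ON(): the build fails if cond is true. */
#define SKETCH_BUILD_BUG_ON(cond) \
	_Static_assert(!(cond), "BUILD_BUG_ON failed")

/* Non-atomic stand-in for xchg(); an unsupported size breaks the build. */
#define sketch_xchg(ptr, val)						\
({									\
	SKETCH_BUILD_BUG_ON(sizeof(*(ptr)) != 1 &&			\
			    sizeof(*(ptr)) != 2 &&			\
			    sizeof(*(ptr)) != 4 &&			\
			    sizeof(*(ptr)) != 8);			\
	__typeof__(*(ptr)) old_ = *(ptr);				\
	*(ptr) = (val);							\
	old_;								\
})

/* A supported size compiles and behaves like a plain exchange. */
uint32_t sketch_swap_u32(uint32_t *p, uint32_t v)
{
	return sketch_xchg(p, v);
}
```

Calling sketch_xchg() on, say, a 3-byte struct would then be rejected by
the compiler instead of hitting BUG() at run time.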

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 15/31] arm64: SMP support
  2012-08-14 17:52 ` [PATCH v2 15/31] arm64: SMP support Catalin Marinas
  2012-08-15  0:49   ` Olof Johansson
@ 2012-08-15 13:04   ` Arnd Bergmann
  2012-08-17  9:21   ` Tony Lindgren
  2 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:04 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon, Marc Zyngier

On Tuesday 14 August 2012, Catalin Marinas wrote:
> This patch adds SMP initialisation and spinlocks implementation for
> AArch64. The spinlock support uses the new load-acquire/store-release
> instructions to avoid explicit barriers. The architecture also specifies
> that an event is automatically generated when clearing the exclusive
> monitor state to wake up processors in WFE, so there is no need for an
> explicit DSB/SEV instruction sequence. The SEVL instruction is used to
> set the exclusive monitor locally as there is no conditional WFE and a
> branch is more expensive.
> 
> For the SMP booting protocol, see Documentation/arm64/booting.txt.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Marc Zyngier <marc.zyngier@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-14 17:52 ` [PATCH v2 02/31] arm64: Kernel booting and initialisation Catalin Marinas
  2012-08-14 23:06   ` Olof Johansson
@ 2012-08-15 13:20   ` Arnd Bergmann
  2012-08-15 17:06     ` Olof Johansson
  2012-08-16 12:53     ` Catalin Marinas
  2012-08-16 18:59   ` Nicolas Pitre
                     ` (2 subsequent siblings)
  4 siblings, 2 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:20 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +The AArch64 exception model is made up of a number of exception levels
> +(EL0 - EL3), with EL0 and EL1 having a secure and a non-secure
> +counterpart.  EL2 is the hypervisor level and exists only in non-secure
> +mode. EL3 is the highest priority level and exists only in secure mode.

I'm always confused by a description like this. It sounds like you cannot
have a hypervisor if you have code running in secure mode in EL3. What
I instead understand is that you enter non-secure mode by going from
EL3 into EL2.

> +2. Setup the device tree
> +-------------------------
> +
> +Requirement: MANDATORY
> +
> +The device tree blob (dtb) must be no bigger than 2 megabytes in size
> +and placed at a 2-megabyte boundary within the first 512 megabytes from
> +the start of the kernel image. This is to allow the kernel to map the
> +blob using a single section mapping in the initial page tables.

I've seen people put firmware for some peripherals into the device tree,
so that a device driver can grab a blob from there and load it into the
device, rather than calling request_firmware() which would fail if the
OS running on the system does not contain the blob. If such firmware is
too large, you end up violating the 2 MB limit you impose here.

Should we keep that limit and declare those use cases as invalid, or
should we try to make the boot protocol more flexible?

> diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h
> new file mode 100644
> index 0000000..d766493
> --- /dev/null
> +++ b/arch/arm64/include/asm/setup.h
> @@ -0,0 +1,26 @@
> +#ifndef __ASM_SETUP_H
> +#define __ASM_SETUP_H
> +
> +#include <linux/types.h>
> +
> +#define COMMAND_LINE_SIZE 1024
> +
> +#endif

Is this necessary? The asm-generic version of this file allows 512 bytes,
which seems plenty.

> +unsigned int processor_id;
> +EXPORT_SYMBOL(processor_id);
> +
> +unsigned int elf_hwcap __read_mostly;
> +EXPORT_SYMBOL(elf_hwcap);

EXPORT_SYMBOL_GPL?

Neither of these looks like they should be used in drivers.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 04/31] arm64: MMU definitions
  2012-08-14 17:52 ` [PATCH v2 04/31] arm64: MMU definitions Catalin Marinas
@ 2012-08-15 13:30   ` Arnd Bergmann
  2012-08-15 13:39     ` Catalin Marinas
  2012-08-15 16:34     ` Geert Uytterhoeven
  2012-08-17  9:04   ` Tony Lindgren
  1 sibling, 2 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:30 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
>
> +/*
> + * TCR flags.
> + */
> +#define TCR_TxSZ(x)		(((64 - (x)) << 16) | ((64 - (x)) << 0))
> +#define TCR_IRGN_NC		((0 << 8) | (0 << 24))
> +#define TCR_IRGN_WBWA		((1 << 8) | (1 << 24))
> +#define TCR_IRGN_WT		((2 << 8) | (2 << 24))
> +#define TCR_IRGN_WBnWA		((3 << 8) | (3 << 24))
> +#define TCR_IRGN_MASK		((3 << 8) | (3 << 24))
> +#define TCR_ORGN_NC		((0 << 10) | (0 << 26))
> +#define TCR_ORGN_WBWA		((1 << 10) | (1 << 26))
> +#define TCR_ORGN_WT		((2 << 10) | (2 << 26))
> +#define TCR_ORGN_WBnWA		((3 << 10) | (3 << 26))
> +#define TCR_ORGN_MASK		((3 << 10) | (3 << 26))
> +#define TCR_SHARED		((3 << 12) | (3 << 28))
> +#define TCR_TG0_64K		(1 << 14)
> +#define TCR_TG1_64K		(1 << 30)
> +#define TCR_IPS_40BIT		(2 << 32)
> +#define TCR_ASID16		(1 << 36)
> +

As a matter of coding style, I would much prefer tables like this to be
written as

#define TCR_IRGN_MASK		0x0000000003000300
#define TCR_IRGN_WBnWA		0x0000000003000300
#define TCR_IRGN_WT		0x0000000002000200
#define TCR_IRGN_WBWA		0x0000000001000100
#define TCR_IRGN_NC		0x0000000000000000

#define TCR_ORGN_MASK		0x000000000c000c00
#define TCR_ORGN_WBnWA		0x000000000c000c00
#define TCR_ORGN_WT		0x0000000008000800
#define TCR_ORGN_WBWA		0x0000000004000400
#define TCR_ORGN_NC		0x0000000000000000

The advantage of this is that you can visually compare the bitmasks
to a hex dump, and if you are suffering from endian-confused documentation
authors, there is no ambiguity about which end of the word is bit zero.
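
The two notations can be checked against each other at compile time. The
sketch below uses invented SK_ prefixed names so it does not clash with
the real header, and it writes the shifts with 64-bit constants: that is
an assumption added for the sketch, since with a 32-bit int a plain
(2 << 32) or (1 << 36) shifts by more than the width of the type, which C
leaves undefined.

```c
#include <stdint.h>

/* Shift form, as in the patch but with explicit 64-bit operands. */
#define SK_TCR_IRGN_WBWA	((UINT64_C(1) << 8) | (UINT64_C(1) << 24))
#define SK_TCR_IRGN_WT		((UINT64_C(2) << 8) | (UINT64_C(2) << 24))
#define SK_TCR_IRGN_WBnWA	((UINT64_C(3) << 8) | (UINT64_C(3) << 24))
#define SK_TCR_ORGN_WBWA	((UINT64_C(1) << 10) | (UINT64_C(1) << 26))
#define SK_TCR_ORGN_WT		((UINT64_C(2) << 10) | (UINT64_C(2) << 26))
#define SK_TCR_ORGN_WBnWA	((UINT64_C(3) << 10) | (UINT64_C(3) << 26))
#define SK_TCR_IPS_40BIT	(UINT64_C(2) << 32)
#define SK_TCR_ASID16		(UINT64_C(1) << 36)

/* Hex form from the review; each line must match its shift form. */
_Static_assert(SK_TCR_IRGN_WBWA  == 0x0000000001000100, "IRGN_WBWA");
_Static_assert(SK_TCR_IRGN_WT    == 0x0000000002000200, "IRGN_WT");
_Static_assert(SK_TCR_IRGN_WBnWA == 0x0000000003000300, "IRGN_WBnWA");
_Static_assert(SK_TCR_ORGN_WBWA  == 0x0000000004000400, "ORGN_WBWA");
_Static_assert(SK_TCR_ORGN_WT    == 0x0000000008000800, "ORGN_WT");
_Static_assert(SK_TCR_ORGN_WBnWA == 0x000000000c000c00, "ORGN_WBnWA");
_Static_assert(SK_TCR_IPS_40BIT  == 0x0000000200000000, "IPS_40BIT");
_Static_assert(SK_TCR_ASID16     == 0x0000001000000000, "ASID16");

/* Inner and outer write-back write-allocate, for both TTBR0 and TTBR1. */
uint64_t sketch_tcr_cacheable(void)
{
	return SK_TCR_IRGN_WBWA | SK_TCR_ORGN_WBWA;
}
```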

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 04/31] arm64: MMU definitions
  2012-08-15 13:30   ` Arnd Bergmann
@ 2012-08-15 13:39     ` Catalin Marinas
  2012-08-15 16:34     ` Geert Uytterhoeven
  1 sibling, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-15 13:39 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

Hi Arnd,

On Wed, Aug 15, 2012 at 02:30:01PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> > +/*
> > + * TCR flags.
> > + */
> > +#define TCR_TxSZ(x)		(((64 - (x)) << 16) | ((64 - (x)) << 0))
> > +#define TCR_IRGN_NC		((0 << 8) | (0 << 24))
> > +#define TCR_IRGN_WBWA		((1 << 8) | (1 << 24))
> > +#define TCR_IRGN_WT		((2 << 8) | (2 << 24))
> > +#define TCR_IRGN_WBnWA		((3 << 8) | (3 << 24))
> > +#define TCR_IRGN_MASK		((3 << 8) | (3 << 24))
> > +#define TCR_ORGN_NC		((0 << 10) | (0 << 26))
> > +#define TCR_ORGN_WBWA		((1 << 10) | (1 << 26))
> > +#define TCR_ORGN_WT		((2 << 10) | (2 << 26))
> > +#define TCR_ORGN_WBnWA		((3 << 10) | (3 << 26))
> > +#define TCR_ORGN_MASK		((3 << 10) | (3 << 26))
> > +#define TCR_SHARED		((3 << 12) | (3 << 28))
> > +#define TCR_TG0_64K		(1 << 14)
> > +#define TCR_TG1_64K		(1 << 30)
> > +#define TCR_IPS_40BIT		(2 << 32)
> > +#define TCR_ASID16		(1 << 36)
> > +
> 
> As a matter of coding style, I would much prefer tables like this to be
> written as
> 
> #define TCR_IRGN_MASK		0x0000000003000300
> #define TCR_IRGN_WBnWA		0x0000000003000300
> #define TCR_IRGN_WT		0x0000000002000200
> #define TCR_IRGN_WBWA		0x0000000001000100
> #define TCR_IRGN_NC		0x0000000000000000
> 
> #define TCR_ORGN_MASK		0x000000000c000c00
> #define TCR_ORGN_WBnWA		0x000000000c000c00
> #define TCR_ORGN_WT		0x0000000008000800
> #define TCR_ORGN_WBWA		0x0000000004000400
> #define TCR_ORGN_NC		0x0000000000000000
> 
> The advantage of this is that you can visually compare the bitmasks
> to a hex dump, and if you are suffering from endian-confused documentation
> authors, there is no ambiguity about which end of the word is bit zero.

That depends on the case, in some places it's more readable like this.
In the above case, I find it easier to compare against the documentation
which, for example, has groups of 2 bits at position 8 and 24 or 10 and
26 (for TTBR0 and TTBR1). The meaning of a group of 2 bits is described
separately as 0b00 (NC), 0b01(WBWA) etc. Same goes for the shareability
bits (12 and 28).

So I think at least for code writing it's less error-prone to write the
explicit bit position than a magic long hex.

-- 
Catalin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 05/31] arm64: MMU initialisation
  2012-08-14 17:52 ` [PATCH v2 05/31] arm64: MMU initialisation Catalin Marinas
@ 2012-08-15 13:45   ` Arnd Bergmann
  2012-08-17 10:06   ` Santosh Shilimkar
  1 sibling, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:45 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> This patch contains the initialisation of the memory blocks, MMU
> attributes and the memory map. Only five memory types are defined:
> Device nGnRnE (equivalent to Strongly Ordered), Device nGnRE (classic
> Device memory), Device GRE, Normal Non-cacheable and Normal Cacheable.
> Cache policies are supported via the memory attributes register
> (MAIR_EL1) and only affect the Normal Cacheable mappings.

It looks like you've managed to eliminate bootmem as I suggested earlier,
very nice!

Acked-by: Arnd Bergmann <arnd@arndb.de>

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 06/31] arm64: MMU fault handling and page table management
  2012-08-14 17:52 ` [PATCH v2 06/31] arm64: MMU fault handling and page table management Catalin Marinas
@ 2012-08-15 13:47   ` Arnd Bergmann
  2012-08-17 16:07     ` Catalin Marinas
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:47 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> +
> +pgd_t *pgd_alloc(struct mm_struct *mm)
> +{
> +	pgd_t *new_pgd;
> +
> +	new_pgd = (pgd_t *)__get_free_pages(GFP_KERNEL, PGD_ORDER);
> +	if (!new_pgd)
> +		return NULL;
> +
> +	memset(new_pgd, 0, PAGE_SIZE << PGD_ORDER);
> +
> +	return new_pgd;
> +}
> +
> +void pgd_free(struct mm_struct *mm, pgd_t *pgd)
> +{
> +	free_pages((unsigned long)pgd, PGD_ORDER);
> +}
 
According to the documentation, you should only need 8kb for the pgd on
a 64kb page system. Is it required that you use up a full page here?

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 07/31] arm64: Process management
  2012-08-14 17:52 ` [PATCH v2 07/31] arm64: Process management Catalin Marinas
  2012-08-14 23:50   ` Olof Johansson
@ 2012-08-15 13:53   ` Arnd Bergmann
  2012-08-17 16:15     ` Catalin Marinas
  2012-08-16 15:09   ` Tobias Klauser
  2 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:53 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +#define THREAD_SIZE_ORDER	1
> +#define THREAD_SIZE		8192
> +#define THREAD_START_SP		(THREAD_SIZE - 16)

THREAD_SIZE_ORDER looks wrong for 64kb-page kernels. It also doesn't seem to
be used, so better remove it.
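
The arithmetic behind this, as a small sketch (function and macro names
are made up; the only assumption is that an order-N allocation covers
PAGE_SIZE << N bytes): order 1 matches the 8192-byte THREAD_SIZE only for
4KB pages, while for 64KB pages order 1 would describe 128KB.

```c
#define SK_THREAD_SIZE	8192UL

/* Bytes covered by an allocation of 2^order pages of the given size. */
unsigned long sketch_alloc_bytes(unsigned long page_size, unsigned int order)
{
	return page_size << order;
}

/* Smallest order whose allocation holds SK_THREAD_SIZE bytes. */
unsigned int sketch_thread_order(unsigned long page_size)
{
	unsigned int order = 0;

	while (sketch_alloc_bytes(page_size, order) < SK_THREAD_SIZE)
		order++;
	return order;
}
```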

	Arnd


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 08/31] arm64: CPU support
  2012-08-14 17:52 ` [PATCH v2 08/31] arm64: CPU support Catalin Marinas
  2012-08-15  0:10   ` Olof Johansson
@ 2012-08-15 13:56   ` Arnd Bergmann
  2012-08-20 16:00     ` Catalin Marinas
  1 sibling, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 13:56 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> diff --git a/arch/arm64/include/asm/procinfo.h b/arch/arm64/include/asm/procinfo.h
> new file mode 100644
> index 0000000..81fece9
> --- /dev/null
> +++ b/arch/arm64/include/asm/procinfo.h
> @@ -0,0 +1,44 @@
> +/*
> + * Based on arch/arm/include/asm/procinfo.h
> + *
> + * Copyright (C) 1996-1999 Russell King
> + * Copyright (C) 2012 ARM Ltd.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +#ifndef __ASM_PROCINFO_H
> +#define __ASM_PROCINFO_H
> +
> +#ifdef __KERNEL__
> +
> +/*
> + * Note!  struct processor is always defined if we're
> + * using MULTI_CPU, otherwise this entry is unused,
> + * but still exists.
> + *
> + * NOTE! The following structure is defined by assembly
> + * language, NOT C code.  For more information, check:
> + *  arch/arm/mm/proc-*.S and arch/arm/kernel/head.S
> + */
> +struct proc_info_list {
> +	unsigned int		cpu_val;
> +	unsigned int		cpu_mask;
> +	unsigned long		__cpu_flush;		/* used by head.S */
> +	const char		*cpu_name;
> +};
> +
> +#else	/* __KERNEL__ */
> +#include <asm/elf.h>
> +#warning "Please include asm/elf.h instead"
> +#endif	/* __KERNEL__ */
> +#endif

I think you forgot to remove this file when you removed MULTI_CPU.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 16/31] arm64: ELF definitions
  2012-08-14 17:52 ` [PATCH v2 16/31] arm64: ELF definitions Catalin Marinas
@ 2012-08-15 14:15   ` Arnd Bergmann
  2012-08-16 10:23     ` Will Deacon
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 14:15 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> +
> +void elf_set_personality(int personality)
> +{
> +       switch (personality & PER_MASK) {
> +       case PER_LINUX:
> +               clear_thread_flag(TIF_32BIT);
> +               break;
> +       case PER_LINUX32:
> +               set_thread_flag(TIF_32BIT);
> +               break;
> +       default:
> +               pr_warning("Process %s tried to assume unknown personality %d\n",
> +                          current->comm, personality);
> +               return;
> +       }
> +
> +       current->personality = personality;
> +}
> +EXPORT_SYMBOL(elf_set_personality);

This looks wrong: PER_LINUX/PER_LINUX32 determines the output of the
uname system call, while TIF_32BIT determines the instruction set used
when returning to user space. You definitely should not set the personality
to the value you pass from the elf loader. Instead, just do

#define SET_PERSONALITY(ex) clear_thread_flag(TIF_32BIT);
#define COMPAT_SET_PERSONALITY(ex) set_thread_flag(TIF_32BIT);

I also don't see a reason to export this. You'd have trouble loading
the elf interpreter module from user space without the elf interpreter.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 17/31] arm64: System calls handling
  2012-08-14 17:52 ` [PATCH v2 17/31] arm64: System calls handling Catalin Marinas
@ 2012-08-15 14:22   ` Arnd Bergmann
  2012-08-21 17:51     ` Catalin Marinas
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 14:22 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +
> +/* This matches struct stat64 in glibc2.1, hence the absolutely
> + * insane amounts of padding around dev_t's.
> + * Note: The kernel zero's the padded region because glibc might read them
> + * in the hope that the kernel has stretched to using larger sizes.
> + */
> +struct stat64 {
> +	compat_u64	st_dev;
> +	unsigned char   __pad0[4];
> +
> +#define STAT64_HAS_BROKEN_ST_INO	1
> +	compat_ulong_t	__st_ino;
> +	compat_uint_t	st_mode;
> +	compat_uint_t	st_nlink;
> +
> +	compat_ulong_t	st_uid;
> +	compat_ulong_t	st_gid;
> +
> +	compat_u64	st_rdev;
> +	unsigned char   __pad3[4];
> +
> +	compat_s64	st_size;
> +	compat_ulong_t	st_blksize;
> +	compat_u64	st_blocks;	/* Number 512-byte blocks allocated. */
> +
> +	compat_ulong_t	st_atime;
> +	compat_ulong_t	st_atime_nsec;
> +
> +	compat_ulong_t	st_mtime;
> +	compat_ulong_t	st_mtime_nsec;
> +
> +	compat_ulong_t	st_ctime;
> +	compat_ulong_t	st_ctime_nsec;
> +
> +	compat_u64	st_ino;
> +};

The comment above struct stat64 is completely irrelevant here. I would instead
explain why you need your own stat64 in the first place.

> +int kernel_execve(const char *filename,
> +		  const char *const argv[],
> +		  const char *const envp[])
> +{
> +	struct pt_regs regs;
> +	int ret;
> +
> +	memset(&regs, 0, sizeof(struct pt_regs));
> +	ret = do_execve(filename,
> +			(const char __user *const __user *)argv,
> +			(const char __user *const __user *)envp, &regs);
> +	if (ret < 0)
> +		goto out;
> +
> +	/*
> +	 * Save argc to the register structure for userspace.
> +	 */
> +	regs.regs[0] = ret;
> +
> +	/*
> +	 * We were successful.  We won't be returning to our caller, but
> +	 * instead to user space by manipulating the kernel stack.
> +	 */
> +	asm(	"add	x0, %0, %1\n\t"
> +		"mov	x1, %2\n\t"
> +		"mov	x2, %3\n\t"
> +		"bl	memmove\n\t"	/* copy regs to top of stack */
> +		"mov	x27, #0\n\t"	/* not a syscall */
> +		"mov	x28, %0\n\t"	/* thread structure */
> +		"mov	sp, x0\n\t"	/* reposition stack pointer */
> +		"b	ret_to_user"
> +		:
> +		: "r" (current_thread_info()),
> +		  "Ir" (THREAD_START_SP - sizeof(regs)),
> +		  "r" (&regs),
> +		  "Ir" (sizeof(regs))
> +		: "x0", "x1", "x2", "x27", "x28", "x30", "memory");
> +
> + out:
> +	return ret;
> +}
> +EXPORT_SYMBOL(kernel_execve);

Al Viro was recently talking about a generic implementation of execve.
I can't find that now, but I think you should use that.

> +
> +asmlinkage long sys_mmap(unsigned long addr, unsigned long len,
> +			 unsigned long prot, unsigned long flags,
> +			 unsigned long fd, off_t off)
> +{
> +	if (offset_in_page(off) != 0)
> +		return -EINVAL;
> +
> +	return sys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
> +}
> +
> +/*
> + * Wrappers to pass the pt_regs argument.
> + */
> +#define sys_execve		sys_execve_wrapper
> +#define sys_clone		sys_clone_wrapper
> +#define sys_rt_sigreturn	sys_rt_sigreturn_wrapper
> +#define sys_sigaltstack		sys_sigaltstack_wrapper

I think

#define sys_mmap sys_mmap_pgoff 

would be more appropriate than defining your own sys_mmap function here.
We should probably make that the default in asm-generic/unistd.h and
change the architectures that have their own implementation to override
it.
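
For reference, the offset handling that the quoted sys_mmap() wrapper
performs can be sketched in user-space C as below. This is only an
illustration: the sketch_* names are hypothetical, and the 4KB page size
is an assumption rather than something the patch mandates.

```c
#include <assert.h>
#include <errno.h>

/* Assumption: 4KB pages. The real PAGE_SHIFT depends on the kernel configuration. */
#define SKETCH_PAGE_SHIFT	12
#define SKETCH_PAGE_SIZE	(1UL << SKETCH_PAGE_SHIFT)

/* Hypothetical stand-in for offset_in_page(): bytes past the page start. */
static inline unsigned long sketch_offset_in_page(unsigned long off)
{
	return off & (SKETCH_PAGE_SIZE - 1);
}

/*
 * Hypothetical helper mirroring the quoted wrapper: reject byte offsets
 * that are not page aligned, otherwise produce the page offset that a
 * pgoff-based interface expects.
 */
static inline int sketch_off_to_pgoff(unsigned long off, unsigned long *pgoff)
{
	if (sketch_offset_in_page(off) != 0)
		return -EINVAL;
	*pgoff = off >> SKETCH_PAGE_SHIFT;
	return 0;
}

/* Usage: 0x3000 is page aligned (page 3), 0x3001 is not. */
static inline void sketch_pgoff_selftest(void)
{
	unsigned long pgoff = 0;

	assert(sketch_off_to_pgoff(0x3000UL, &pgoff) == 0);
	assert(pgoff == 3);
	assert(sketch_off_to_pgoff(0x3001UL, &pgoff) == -EINVAL);
}
```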

	Arnd


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 21/31] arm64: 32-bit (compat) applications support
  2012-08-14 17:52 ` [PATCH v2 21/31] arm64: 32-bit (compat) applications support Catalin Marinas
@ 2012-08-15 14:34   ` Arnd Bergmann
  2012-08-16 10:28     ` Will Deacon
  2012-08-24 10:43     ` Catalin Marinas
  2012-08-20 10:53   ` Pavel Machek
  1 sibling, 2 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 14:34 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +#ifdef CONFIG_AARCH32_EMULATION
> +#include <linux/compat.h>
> +
> +#define AARCH32_KERN_SIGRET_CODE_OFFSET	0x500
> +
> +extern const compat_ulong_t aarch32_sigret_code[6];
> +
> +int compat_setup_frame(int usig, struct k_sigaction *ka, sigset_t *set,
> +		       struct pt_regs *regs);
> +int compat_setup_rt_frame(int usig, struct k_sigaction *ka, siginfo_t *info,
> +			  sigset_t *set, struct pt_regs *regs);
> +
> +void compat_setup_restart_syscall(struct pt_regs *regs);
> +#else
> +
> +static inline int compat_setup_frame(int usid, struct k_sigaction *ka,
> +				     sigset_t *set, struct pt_regs *regs)
> +{
> +	BUG();
> +}

What good is the run-time BUG() here? Nothing should be calling these
when CONFIG_COMPAT is disabled, so I think you should just remove
the #ifdef around the declarations, and the entire #else case.


> +asmlinkage int compat_sys_sched_rr_get_interval(compat_pid_t pid,
> +						struct compat_timespec __user *interval)
> +{
> +	struct timespec t;
> +	int ret;
> +	mm_segment_t old_fs = get_fs();
> +
> +	set_fs(KERNEL_DS);
> +	ret = sys_sched_rr_get_interval(pid, (struct timespec __user *)&t);
> +	set_fs(old_fs);
> +	if (put_compat_timespec(&t, interval))
> +		return -EFAULT;
> +	return ret;
> +}
> +
> +asmlinkage int compat_sys_sendfile(int out_fd, int in_fd,
> +				   compat_off_t __user *offset, s32 count)
> +{
> +	mm_segment_t old_fs = get_fs();
> +	int ret;
> +	off_t of;
> +
> +	if (offset && get_user(of, offset))
> +		return -EFAULT;
> +
> +	set_fs(KERNEL_DS);
> +	ret = sys_sendfile(out_fd, in_fd, offset ? (off_t __user *)&of : NULL,
> +			   count);
> +	set_fs(old_fs);
> +
> +	if (offset && put_user(of, offset))
> +		return -EFAULT;
> +	return ret;
> +}

I guess it's time to move these two into common code. They look like they should
be shared across most architectures that have compat support.

> +asmlinkage int compat_sys_personality(compat_ulong_t personality)
> +{
> +	int ret;
> +
> +	if (personality(current->personality) == PER_LINUX32 &&
> +		personality == PER_LINUX)
> +		personality = PER_LINUX32;
> +	ret = sys_personality(personality);
> +	if (ret == PER_LINUX32)
> +		ret = PER_LINUX;
> +	return ret;
> +}

Where did you get this from?

You should not need compat_sys_personality, just call the native function.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 22/31] arm64: Floating point and SIMD
  2012-08-14 17:52 ` [PATCH v2 22/31] arm64: Floating point and SIMD Catalin Marinas
@ 2012-08-15 14:35   ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 14:35 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> This patch adds support for FP/ASIMD register bank saving and restoring
> during context switch and FP exception handling to generate SIGFPE.
> There are 32 128-bit registers and the context switching is currently
> done non-lazily. Benchmarks on real hardware are required before
> implementing lazy FP state saving/restoring.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 20/31] arm64: User access library functions
  2012-08-14 17:52 ` [PATCH v2 20/31] arm64: User access library functions Catalin Marinas
@ 2012-08-15 14:49   ` Arnd Bergmann
  2012-09-03 12:58     ` Catalin Marinas
  2012-09-05 19:13     ` Russell King - ARM Linux
  0 siblings, 2 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 14:49 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon, Marc Zyngier

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +/*
> + * Single-value transfer routines.  They automatically use the right
> + * size if we just have the right pointer type.  Note that the functions
> + * which read from user space (*get_*) need to take care not to leak
> + * kernel data even if the calling code is buggy and fails to check
> + * the return value.  This means zeroing out the destination variable
> + * or buffer on error.  Normally this is done out of line by the
> + * fixup code, but there are a few places where it intrudes on the
> + * main code path.  When we only write to user space, there is no
> + * problem.
> + */
> +extern long __get_user_1(void *);
> +extern long __get_user_2(void *);
> +extern long __get_user_4(void *);
> +extern long __get_user_8(void *);
> +
> +#define __get_user_x(__r2,__p,__e,__s,__i...)				\
> +	   asm volatile(						\
> +		__asmeq("%0", "x0") __asmeq("%1", "x2")			\
> +		"bl	__get_user_" #__s				\
> +		: "=&r" (__e), "=r" (__r2)				\
> +		: "0" (__p)						\
> +		: __i, "cc")
> +
> +#define get_user(x,p)							\
> +	({								\
> +		register const typeof(*(p)) __user *__p asm("x0") = (p);\
> +		register unsigned long __r2 asm("x2");			\
> +		register long __e asm("x0");				\
> +		switch (sizeof(*(__p))) {				\
> +		case 1:							\
> +			__get_user_x(__r2, __p, __e, 1, "x30");		\
> +			break;						\
> +		case 2:							\
> +			__get_user_x(__r2, __p, __e, 2, "x3", "x30");	\
> +			break;						\
> +		case 4:							\
> +			__get_user_x(__r2, __p, __e, 4, "x30");		\
> +			break;						\
> +		case 8:							\
> +			__get_user_x(__r2, __p, __e, 8, "x30");		\
> +			break;						\
> +		default: __e = __get_user_bad(); break;			\
> +		}							\
> +		x = (typeof(*(p))) __r2;				\
> +		__e;							\
> +	})

It's fairly unusual to have out of line get_user/put_user functions.
What is the reason for this, other than copying from ARM?

> +
> +__get_user_bad:
> +	mov	x2, #0
> +	mov	x0, #-EFAULT
> +	ret
> +ENDPROC(__get_user_bad)

> +__put_user_bad:
> +	mov	x0, #-EFAULT
> +	ret
> +ENDPROC(__put_user_bad)
> +

The purpose of these symbols is to provoke a link error when you
pass the wrong data into get_user/put_user. Actually defining them
completely breaks this logic, so you should remove these!

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 23/31] arm64: Debugging support
  2012-08-14 17:52 ` [PATCH v2 23/31] arm64: Debugging support Catalin Marinas
@ 2012-08-15 15:07   ` Arnd Bergmann
  2012-08-16 10:47     ` Will Deacon
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:07 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +const struct user_regset_view *task_user_regset_view(struct task_struct *task)
> +{
> +#ifdef CONFIG_AARCH32_EMULATION
> +	if (test_tsk_thread_flag(task, TIF_32BIT))
> +		return &user_aarch32_view;
> +#endif
> +	return &user_aarch64_view;
> +}

Ah, nice. So you support 64 bit debuggers debugging 32 bit processes, right?

From what I can tell, there is no support for 32 bit processes debugging
64 bit ones. Is that something you plan to add in the future, or do you
consider that out of scope? In either case, a comment would be helpful.

> +long arch_ptrace(struct task_struct *child, long request,
> +		 unsigned long addr, unsigned long data)
> +{
> +	int ret;
> +	unsigned long *datap = (unsigned long __user *)data;
> +
> +	switch (request) {
> +		case PTRACE_GET_THREAD_AREA:
> +			ret = put_user(child->thread.tp_value, datap);
> +			break;
> +
> +#ifdef CONFIG_HAVE_HW_BREAKPOINT
> +		case PTRACE_GETHBPREGS:
> +			ret = ptrace_gethbpregs(child, addr, datap);
> +			break;
> +
> +		case PTRACE_SETHBPREGS:
> +			ret = ptrace_sethbpregs(child, addr, datap);
> +			break;
> +#endif
> +
> +		default:
> +			ret = ptrace_request(child, request, addr, data);
> +			break;
> +	}
> +
> +	return ret;
> +}

Is there a reason why these are not regsets but have their own ptrace
commands? I believe new architectures should generally not add ptrace
commands any more.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 24/31] arm64: Add support for /proc/sys/debug/exception-trace
  2012-08-14 17:52 ` [PATCH v2 24/31] arm64: Add support for /proc/sys/debug/exception-trace Catalin Marinas
@ 2012-08-15 15:08   ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:08 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel

On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> This patch allows setting of the show_unhandled_signals variable via
> /proc/sys/debug/exception-trace. The default value is currently 1
> showing unhandled user faults (undefined instructions, data aborts) and
> invalid signal stack frames.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 25/31] arm64: Performance counters support
  2012-08-14 17:52 ` [PATCH v2 25/31] arm64: Performance counters support Catalin Marinas
@ 2012-08-15 15:11   ` Arnd Bergmann
  2012-08-16 10:51     ` Will Deacon
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:11 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> From: Will Deacon <will.deacon@arm.com>
> 
> This patch adds support for the AArch64 performance counters.
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> ---
>  arch/arm64/include/asm/perf_event.h |   22 +
>  arch/arm64/include/asm/pmu.h        |   82 +++
>  arch/arm64/kernel/perf_event.c      | 1368 +++++++++++++++++++++++++++++++++++
>  tools/perf/perf.h                   |    6 +

Can you explain how AArch64 performance counters differ from the 32
bit ones? Do they work for AArch32 user space under AArch64 kernels?
Is it possible to share parts of the implementation with arch/arm?

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 26/31] arm64: Miscellaneous library functions
  2012-08-14 17:52 ` [PATCH v2 26/31] arm64: Miscellaneous library functions Catalin Marinas
@ 2012-08-15 15:21   ` Arnd Bergmann
  2012-08-16 10:57     ` Will Deacon
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:21 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Marc Zyngier, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> +
> +/*
> + * Use compiler builtins for simple inline operations.
> + */
> +static inline unsigned long __ffs(unsigned long word)
> +{
> +	return __builtin_ffsl(word) - 1;
> +}
> +
> +static inline int ffs(int x)
> +{
> +	return __builtin_ffs(x);
> +}
> +
> +static inline unsigned long __fls(unsigned long word)
> +{
> +	return BITS_PER_LONG - 1 - __builtin_clzl(word);
> +}
> +
> +static inline int fls(int x)
> +{
> +	return x ? sizeof(x) * BITS_PER_BYTE - __builtin_clz(x) : 0;
> +}

These are all great, but I think whether to use them or not should
depend on the compiler version rather than the architecture in
general. Do we know a minimum gcc version that supports all of the
above? Then we could put that code into the generic files.

If that's not possible, we could still make the implementation
available for other architectures by moving it to

asm-generic/bitops/builtin-__ffs.h
asm-generic/bitops/builtin-ffs.h
asm-generic/bitops/builtin-__fls.h
asm-generic/bitops/builtin-fls.h
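
The semantics such builtin-based headers would need to preserve can be
sketched in user-space C as follows. This assumes a GCC-compatible
compiler that provides __builtin_ffs(), __builtin_ffsl(), __builtin_clz()
and __builtin_clzl(); the sketch_* names are hypothetical and only mirror
the four helpers quoted above.

```c
#include <assert.h>
#include <limits.h>

#define SKETCH_BITS_PER_LONG	((int)(sizeof(unsigned long) * CHAR_BIT))

/* Like __ffs(): 0-based index of the lowest set bit. Undefined for word == 0. */
static inline unsigned long sketch_lsb_index(unsigned long word)
{
	return (unsigned long)(__builtin_ffsl((long)word) - 1);
}

/* Like ffs(): 1-based index of the lowest set bit, 0 when x == 0. */
static inline int sketch_ffs(int x)
{
	return __builtin_ffs(x);
}

/* Like __fls(): 0-based index of the highest set bit. Undefined for word == 0. */
static inline unsigned long sketch_msb_index(unsigned long word)
{
	return (unsigned long)(SKETCH_BITS_PER_LONG - 1 - __builtin_clzl(word));
}

/* Like fls(): 1-based index of the highest set bit, 0 when x == 0. */
static inline int sketch_fls(int x)
{
	return x ? (int)(sizeof(x) * CHAR_BIT) - __builtin_clz((unsigned int)x) : 0;
}

/* Usage: note the 0-based versus 1-based results for the same bit. */
static inline void sketch_bitops_selftest(void)
{
	assert(sketch_lsb_index(8UL) == 3);
	assert(sketch_ffs(8) == 4);
	assert(sketch_ffs(0) == 0);
	assert(sketch_msb_index(0x10UL) == 4);
	assert(sketch_fls(0x10) == 5);
	assert(sketch_fls(0) == 0);
}
```

The zero check in sketch_fls() matters because __builtin_clz() is
undefined for a zero argument.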

> --- /dev/null
> +++ b/arch/arm64/lib/bitops.c
> @@ -0,0 +1,25 @@
> +/*
> + * Copyright (C) 2012 ARM Limited
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License version 2 as
> + * published by the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/spinlock.h>
> +#include <linux/atomic.h>
> +
> +#ifdef CONFIG_SMP
> +arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
> +       [0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
> +};
> +#endif

What?

I suppose this is a leftover from an earlier version using the
generic bitops, right?

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 27/31] arm64: Loadable modules
  2012-08-14 17:52 ` [PATCH v2 27/31] arm64: Loadable modules Catalin Marinas
@ 2012-08-15 15:23   ` Arnd Bergmann
  2012-08-15 15:35     ` Catalin Marinas
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:23 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> +
> +void *module_alloc(unsigned long size)
> +{
> +       return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> +                                   GFP_KERNEL, PAGE_KERNEL_EXEC, -1,
> +                                   __builtin_return_address(0));
> +}
> +

What is the reason for using a separate virtual address range for the
modules instead of falling back to the default module_alloc function
that uses vmalloc_exec()?

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 27/31] arm64: Loadable modules
  2012-08-15 15:23   ` Arnd Bergmann
@ 2012-08-15 15:35     ` Catalin Marinas
  2012-08-15 16:16       ` Arnd Bergmann
  0 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-15 15:35 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Wed, Aug 15, 2012 at 04:23:21PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> > +
> > +void *module_alloc(unsigned long size)
> > +{
> > +       return __vmalloc_node_range(size, 1, MODULES_VADDR, MODULES_END,
> > +                                   GFP_KERNEL, PAGE_KERNEL_EXEC, -1,
> > +                                   __builtin_return_address(0));
> > +}
> > +
> 
> What is the reason for using a separate virtual address range for the
> modules instead of falling back to the default module_alloc function
> that uses vmalloc_exec()?

Primarily branch relocation, we have a limitation to 128MB branch range.
The alternative would be to always compile the modules with a large
memory model but we may lose some performance and could make the
relocation handling even harder. What we do now is pretty much similar
to static linking but at module load time.
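
As an illustration of the 128MB limit: the A64 B/BL instructions encode
a signed 26-bit word offset, so a direct branch can reach from -128MB to
+128MB minus 4 bytes relative to the branch itself. A minimal sketch of
the range check a module loader would have to apply (the sketch_* names
are hypothetical, not the code in this patch):

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* 128MB: a signed 26-bit immediate scaled by the 4-byte instruction size. */
#define SKETCH_BRANCH_RANGE	(INT64_C(1) << 27)

/*
 * Hypothetical check: can a direct branch located at 'from' reach 'to'?
 * Both addresses are assumed to be 4-byte aligned.
 */
static inline bool sketch_branch_in_range(uint64_t from, uint64_t to)
{
	int64_t offset = (int64_t)(to - from);

	return offset >= -SKETCH_BRANCH_RANGE && offset < SKETCH_BRANCH_RANGE;
}

/* Usage: the forward limit is 4 bytes short of 128MB, the backward limit is exact. */
static inline void sketch_branch_selftest(void)
{
	assert(sketch_branch_in_range(0x10000000, 0x10000000 + 0x7FFFFFC));
	assert(!sketch_branch_in_range(0, 0x8000000));
	assert(sketch_branch_in_range(0x8000000, 0));
	assert(!sketch_branch_in_range(0x8000004, 0));
}
```

Keeping modules in a dedicated window near the kernel text is what lets
every such check succeed without veneers or a large memory model.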

-- 
Catalin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 28/31] arm64: Generic timers support
  2012-08-14 17:52 ` [PATCH v2 28/31] arm64: Generic timers support Catalin Marinas
@ 2012-08-15 15:52   ` Arnd Bergmann
  2012-08-16 12:40   ` Linus Walleij
                     ` (3 subsequent siblings)
  4 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:52 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Marc Zyngier, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> +static void arch_timer_reg_write(int reg, u32 val)
> +{
> +       switch (reg) {
> +       case ARCH_TIMER_REG_CTRL:
> +               asm volatile("msr cntp_ctl_el0,  %0" : : "r" (val));
> +               break;
> +       case ARCH_TIMER_REG_TVAL:
> +               asm volatile("msr cntp_tval_el0, %0" : : "r" (val));
> +               break;
> +       default:
> +               BUG();
> +       }
> +
> +       isb();
> +}
> +
> +static u32 arch_timer_reg_read(int reg)
> +{
> +       u32 val;
> +
> +       switch (reg) {
> +       case ARCH_TIMER_REG_CTRL:
> +               asm volatile("mrs %0,  cntp_ctl_el0" : "=r" (val));
> +               break;
> +       case ARCH_TIMER_REG_FREQ:
> +               asm volatile("mrs %0,   cntfrq_el0" : "=r" (val));
> +               break;
> +       case ARCH_TIMER_REG_TVAL:
> +               asm volatile("mrs %0, cntp_tval_el0" : "=r" (val));
> +               break;
> +       default:
> +               BUG();
> +       }
> +
> +       return val;
> +}

Are the inline assemblies the only things in this driver that are
specific to AArch64?
Are you planning to use the same file for 32 bit ARM as well, e.g.
when running a 32 bit guest kernel on a 64 bit host?

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 29/31] arm64: Miscellaneous header files
  2012-08-14 17:52 ` [PATCH v2 29/31] arm64: Miscellaneous header files Catalin Marinas
@ 2012-08-15 15:56   ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:56 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> diff --git a/arch/arm64/include/asm/cmpxchg.h b/arch/arm64/include/asm/cmpxchg.h
> new file mode 100644
> index 0000000..dc50de7
> --- /dev/null
> +++ b/arch/arm64/include/asm/cmpxchg.h

> +	default:
> +		__bad_cmpxchg(ptr, size);
> +		oldval = 0;
> +	}

I did not see a definition for __bad_cmpxchg but I may have missed that.
Please make sure that none exists, or just use BUILD_BUG_ON().
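
A rough user-space sketch of the compile-time alternative follows. It is
not the kernel's BUILD_BUG_ON() definition; the SKETCH_* and sketch_*
names are hypothetical, and the negative-array-size trick is just one
common way to turn a constant condition into a build failure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A true condition yields char[-1], which the compiler must reject. */
#define SKETCH_BUILD_BUG_ON(cond)	((void)sizeof(char[1 - 2 * !!(cond)]))

#define SKETCH_SIZE_OK(sz)	((sz) == 1 || (sz) == 2 || (sz) == 4 || (sz) == 8)

/* Size-dispatched load, widened to 64 bits. */
static inline uint64_t sketch_load_sized(const void *p, size_t size)
{
	uint8_t v8;
	uint16_t v16;
	uint32_t v32;
	uint64_t v64;

	switch (size) {
	case 1: memcpy(&v8, p, sizeof(v8)); return v8;
	case 2: memcpy(&v16, p, sizeof(v16)); return v16;
	case 4: memcpy(&v32, p, sizeof(v32)); return v32;
	case 8: memcpy(&v64, p, sizeof(v64)); return v64;
	}
	return 0;	/* not reachable through sketch_load() */
}

/* Unsupported operand sizes fail the build instead of reaching a runtime default. */
#define sketch_load(ptr)						\
	(SKETCH_BUILD_BUG_ON(!SKETCH_SIZE_OK(sizeof(*(ptr)))),		\
	 sketch_load_sized((ptr), sizeof(*(ptr))))

/* Usage: supported sizes compile and load as expected. */
static inline void sketch_load_selftest(void)
{
	uint16_t half = 0xBEEF;
	uint64_t quad = 0x1122334455667788ULL;

	assert(sketch_load(&half) == 0xBEEFu);
	assert(sketch_load(&quad) == 0x1122334455667788ULL);
}
```

Passing a pointer to, say, a 3-byte structure to sketch_load() stops the
build at the macro, which is the property a defined __bad_cmpxchg()
would lose.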

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 31/31] arm64: MAINTAINERS update
  2012-08-14 17:52 ` [PATCH v2 31/31] arm64: MAINTAINERS update Catalin Marinas
@ 2012-08-15 15:57   ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 15:57 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel

On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> This patch updates the MAINTAINERS file for the AArch64 Linux kernel
> port.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 30/31] arm64: Build infrastructure
  2012-08-14 17:52 ` [PATCH v2 30/31] arm64: Build infrastructure Catalin Marinas
  2012-08-14 21:01   ` Sam Ravnborg
@ 2012-08-15 16:07   ` Arnd Bergmann
  2012-08-17  9:32   ` Tony Lindgren
  2 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 16:07 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:

> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
> new file mode 100644
> index 0000000..1ce3d04
> --- /dev/null
> +++ b/arch/arm64/Kconfig
> @@ -0,0 +1,261 @@
> +config ARM64
> +	def_bool y
> +	select OF
> +	select OF_EARLY_FLATTREE
> +	select IRQ_DOMAIN
> +	select HAVE_AOUT
> +	select HAVE_DMA_ATTRS
> +	select HAVE_DMA_API_DEBUG
> +	select HAVE_IDE

Please remove HAVE_AOUT and HAVE_IDE

> +	select HAVE_MEMBLOCK
> +	select RTC_LIB
> +	select SYS_SUPPORTS_APM_EMULATION

APM_EMULATION can probably go too

> +
> +config ARCH_PHYS_ADDR_T_64BIT
> +	def_bool y
> +
> +config HAVE_PWM
> +	bool

HAVE_PWM is going away soon.

> +config AARCH32_EMULATION
> +	bool "Kernel support for 32-bit EL0"
> +	depends on !ARM64_64K_PAGES
> +	select COMPAT_BINFMT_ELF
> +	help
> +	  This option enables support for a 32-bit EL0 running under a 64-bit
> +	  kernel at EL1. AArch32-specific components such as system calls,
> +	  the user helper functions, VFP support and the ptrace interface are
> +	  handled appropriately by the kernel.
> +
> +	  If you want to execute 32-bit userspace applications, say Y.
> +
> +config COMPAT
> +	def_bool y
> +	depends on AARCH32_EMULATION

As mentioned, you can just merge the two into CONFIG_COMPAT.

> +targets := Image Image.gz
> +
> +$(obj)/Image: vmlinux FORCE
> +	$(call if_changed,objcopy)
> +	@echo '  Kernel: $@ is ready'
> +
> +$(obj)/Image.gz: $(obj)/Image FORCE
> +	$(call if_changed,gzip)
> +	@echo '  Kernel: $@ is ready'

Drop the useless output, at least when building with make -s.

> +if [ -x /sbin/loadmap ]; then
> +  /sbin/loadmap
> +else
> +  echo "You have to install it yourself"
> +fi

What is loadmap?

> diff --git a/arch/arm64/configs/generic_defconfig b/arch/arm64/configs/generic_defconfig
> new file mode 100644
> index 0000000..d9aac95
> --- /dev/null
> +++ b/arch/arm64/configs/generic_defconfig

I think it can just be called "defconfig".

> diff --git a/arch/arm64/mm/Kconfig b/arch/arm64/mm/Kconfig
> new file mode 100644
> index 0000000..8e94e52
> --- /dev/null
> +++ b/arch/arm64/mm/Kconfig
> @@ -0,0 +1,5 @@
> +config MMU
> +	def_bool y
> +
> +config CPU_64
> +	def_bool y

This file can be dropped. You can unconditionally enable CONFIG_MMU,
and the CPU_64 symbol is pointless, just use CONFIG_64BIT.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 13/31] arm64: Device specific operations
  2012-08-14 17:52 ` [PATCH v2 13/31] arm64: Device specific operations Catalin Marinas
  2012-08-15  0:33   ` Olof Johansson
@ 2012-08-15 16:13   ` Arnd Bergmann
  2012-08-17  9:19   ` Tony Lindgren
  2 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 16:13 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> This patch adds several definitions for device communication, including
> I/O accessors and ioremap(). The __raw_* accessors are implemented as
> inline asm to avoid compiler generation of post-indexed accesses (less
> efficient to emulate in a virtualised environment).
> 
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>


Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 14/31] arm64: DMA mapping API
  2012-08-14 17:52 ` [PATCH v2 14/31] arm64: DMA mapping API Catalin Marinas
  2012-08-15  0:40   ` Olof Johansson
@ 2012-08-15 16:16   ` Arnd Bergmann
  2012-08-21 12:59     ` Catalin Marinas
  1 sibling, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 16:16 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel

On Tuesday 14 August 2012, Catalin Marinas wrote:
> +static struct dma_map_ops arm64_swiotlb_dma_ops = {
> +       .alloc = arm64_swiotlb_alloc_coherent,
> +       .free = arm64_swiotlb_free_coherent,
> +       .map_page = arm64_swiotlb_map_page,
> +       .unmap_page = arm64_swiotlb_unmap_page,
> +       .map_sg = arm64_swiotlb_map_sg_attrs,
> +       .unmap_sg = arm64_swiotlb_unmap_sg_attrs,
> +       .sync_single_for_cpu = arm64_swiotlb_sync_single_for_cpu,
> +       .sync_single_for_device = arm64_swiotlb_sync_single_for_device,
> +       .sync_sg_for_cpu = arm64_swiotlb_sync_sg_for_cpu,
> +       .sync_sg_for_device = arm64_swiotlb_sync_sg_for_device,
> +       .dma_supported = swiotlb_dma_supported,
> +       .mapping_error = swiotlb_dma_mapping_error,
> +};
> +
> +void __init swiotlb_init_with_default_size(size_t default_size, int verbose);
> +
> +void __init arm64_swiotlb_init(size_t max_size)
> +{
> +       dma_ops = &arm64_swiotlb_dma_ops;
> +       swiotlb_init_with_default_size(min((size_t)SZ_64M, max_size), 1);
> +}

Why is swiotlb the default? I would expect that most devices can in fact
use the entire 64 bit address space, so you can use a simple linear
implementation for those.
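
To illustrate the distinction: with a linear mapping the bus address
equals the physical address, and bouncing is only needed when a buffer
lies outside what the device's DMA mask can reach. A minimal sketch,
assuming that identity mapping; the sketch_* names are hypothetical and
not part of the DMA API:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical: can a device with this DMA mask reach [addr, addr + size)? */
static inline bool sketch_dma_capable(uint64_t dma_mask, uint64_t addr,
				      uint64_t size)
{
	if (size == 0 || addr > dma_mask)
		return false;
	return size - 1 <= dma_mask - addr;
}

/* Assumption: bus address == physical address (simple linear mapping). */
static inline bool sketch_needs_bounce(uint64_t dma_mask, uint64_t phys,
				       uint64_t size)
{
	return !sketch_dma_capable(dma_mask, phys, size);
}

/* Usage: a 32-bit device must bounce above 4GB, a 64-bit device never does. */
static inline void sketch_dma_selftest(void)
{
	const uint64_t mask32 = 0xFFFFFFFFULL;
	const uint64_t mask64 = ~0ULL;

	assert(sketch_needs_bounce(mask32, 0x100000000ULL, 0x1000));
	assert(!sketch_needs_bounce(mask64, 0x100000000ULL, 0x1000));
	assert(sketch_dma_capable(mask32, 0xFFFFF000ULL, 0x1000));
	assert(!sketch_dma_capable(mask32, 0xFFFFF000ULL, 0x1001));
}
```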

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 27/31] arm64: Loadable modules
  2012-08-15 15:35     ` Catalin Marinas
@ 2012-08-15 16:16       ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-15 16:16 UTC (permalink / raw)
  To: Catalin Marinas; +Cc: linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Wednesday 15 August 2012, Catalin Marinas wrote:
> Primarily branch relocation, we have a limitation to 128MB branch range.
> The alternative would be to always compile the modules with a large
> memory model but we may lose some performance and could make the
> relocation handling even harder. What we do now is pretty much similar
> to static linking but at module load time.

Ok, makes sense.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 04/31] arm64: MMU definitions
  2012-08-15 13:30   ` Arnd Bergmann
  2012-08-15 13:39     ` Catalin Marinas
@ 2012-08-15 16:34     ` Geert Uytterhoeven
  2012-08-15 16:45       ` Catalin Marinas
  1 sibling, 1 reply; 170+ messages in thread
From: Geert Uytterhoeven @ 2012-08-15 16:34 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Wed, Aug 15, 2012 at 3:30 PM, Arnd Bergmann <arnd@arndb.de> wrote:
>> +#define TCR_IPS_40BIT                (2 << 32)

By default, constants are int, i.e. 32-bit. So you must write

2ULL << 32

>> +#define TCR_ASID16           (1 << 36)

1ULL

> As a matter of coding style, I would much prefer tables like this to be
> written as
>
> #define TCR_IRGN_MASK           0x0000000003000300

0x0000000003000300ULL, to be safe
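
A small C sketch of the point: an unsuffixed 2 or 1 has type int, so
shifting it by 32 or more is undefined, while the ULL-suffixed forms
produce the intended 64-bit values. The SKETCH_ prefix marks these as
illustrative copies, not the definitions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Suffixed constants: the shift is carried out in unsigned long long. */
#define SKETCH_TCR_IPS_40BIT	(2ULL << 32)
#define SKETCH_TCR_ASID16	(1ULL << 36)
#define SKETCH_TCR_IRGN_MASK	0x0000000003000300ULL

/* Usage: the values land in bits 33 and 36, and the mask is 64 bits wide. */
static inline void sketch_tcr_selftest(void)
{
	assert(SKETCH_TCR_IPS_40BIT == 0x200000000ULL);
	assert(SKETCH_TCR_ASID16 == 0x1000000000ULL);
	assert(sizeof(SKETCH_TCR_IRGN_MASK) == sizeof(uint64_t));
}
```

Without the suffix, 0x0000000003000300 would still fit in an int, which
is why the mask only needs ULL "to be safe" when it is combined with
wider values.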

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 04/31] arm64: MMU definitions
  2012-08-15 16:34     ` Geert Uytterhoeven
@ 2012-08-15 16:45       ` Catalin Marinas
  0 siblings, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-15 16:45 UTC (permalink / raw)
  To: Geert Uytterhoeven
  Cc: Arnd Bergmann, linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Wed, Aug 15, 2012 at 05:34:46PM +0100, Geert Uytterhoeven wrote:
> On Wed, Aug 15, 2012 at 3:30 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> >> +#define TCR_IPS_40BIT                (2 << 32)
> 
> By default, constants are int, i.e. 32-bit. So you must write
> 
> 2ULL << 32
> 
> >> +#define TCR_ASID16           (1 << 36)
> 
> 1ULL

Those higher constants are only used in assembly currently, so no
side-effects. But I agree that I should use something like:

	(_AC(1, UL) << 36)

(UL is sufficient on a 64-bit system)
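
The idea behind _AC() is that the suffix is pasted onto the constant
when compiling C and dropped when the header is included from assembly,
where "1UL" would not parse. A sketch of that pattern, using SKETCH_
names so it is not mistaken for the kernel header itself, and assuming
an LP64 target where unsigned long is 64 bits:

```c
#include <assert.h>

#ifdef __ASSEMBLY__
/* Assembler view: drop the suffix, the assembler has no C integer types. */
#define SKETCH_AC(X, Y)		X
#else
/* C view: paste the suffix so the constant gets the intended type. */
#define SKETCH__AC(X, Y)	(X##Y)
#define SKETCH_AC(X, Y)		SKETCH__AC(X, Y)
#endif

#define SKETCH_TCR_ASID16	(SKETCH_AC(1, UL) << 36)

/* Usage (LP64 assumed): the constant is an unsigned long with bit 36 set. */
static inline void sketch_ac_selftest(void)
{
	assert(sizeof(SKETCH_TCR_ASID16) == sizeof(unsigned long));
	assert(SKETCH_TCR_ASID16 == 0x1000000000UL);
}
```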

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-15 13:20   ` Arnd Bergmann
@ 2012-08-15 17:06     ` Olof Johansson
  2012-08-16 12:53     ` Catalin Marinas
  1 sibling, 0 replies; 170+ messages in thread
From: Olof Johansson @ 2012-08-15 17:06 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel, Will Deacon

On Wed, Aug 15, 2012 at 01:20:02PM +0000, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> > +The AArch64 exception model is made up of a number of exception levels
> > +(EL0 - EL3), with EL0 and EL1 having a secure and a non-secure
> > +counterpart.  EL2 is the hypervisor level and exists only in non-secure
> > +mode. EL3 is the highest priority level and exists only in secure mode.
> 
> I'm always confused by a description like this. It sounds like you cannot
> have a hypervisor if you have code running in secure mode in EL3. What
> I instead understand is that you enter non-secure mode by going from
> EL3 into EL2.
> 
> > +2. Setup the device tree
> > +-------------------------
> > +
> > +Requirement: MANDATORY
> > +
> > +The device tree blob (dtb) must be no bigger than 2 megabytes in size
> > +and placed at a 2-megabyte boundary within the first 512 megabytes from
> > +the start of the kernel image. This is to allow the kernel to map the
> > +blob using a single section mapping in the initial page tables.
> 
> I've seen people put firmware for some peripherals into the device tree,
> so that a device driver can grab a blob from there and load it into the
> device, rather than calling request_firmware() which would fail if the
> OS running on the system does not contain the blob. If such firmware is
> too large, you end up violating the 2 MB limit you impose here.
> 
> Should we keep that limit and declare those use cases as invalid, or
> should we try to make the boot protocol more flexible?
> 
> > diff --git a/arch/arm64/include/asm/setup.h b/arch/arm64/include/asm/setup.h
> > new file mode 100644
> > index 0000000..d766493
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/setup.h
> > @@ -0,0 +1,26 @@
> > +#ifndef __ASM_SETUP_H
> > +#define __ASM_SETUP_H
> > +
> > +#include <linux/types.h>
> > +
> > +#define COMMAND_LINE_SIZE 1024
> > +
> > +#endif
> 
> Is this necessary? The asm-generic version of this file allows 512 bytes,
> which seems plenty.

Chrome OS on my system today uses a 553 character cmdline, in particular
because some of the device mapper arguments are in there (since we boot without
ramdisk). It adds up quickly.

I suggest keeping it common with x86 since those limits are what people
will be used to (2048).


-Olof

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-14 23:06   ` Olof Johansson
@ 2012-08-15 17:37     ` Catalin Marinas
  2012-08-15 19:03       ` Olof Johansson
  0 siblings, 1 reply; 170+ messages in thread
From: Catalin Marinas @ 2012-08-15 17:37 UTC (permalink / raw)
  To: Olof Johansson
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi Olof,

On Wed, Aug 15, 2012 at 12:06:45AM +0100, Olof Johansson wrote:
> On Tue, Aug 14, 2012 at 06:52:03PM +0100, Catalin Marinas wrote:
> > +Before jumping into the kernel, the following conditions must be met:
> > +
> > +- Quiesce all DMA capable devices so that memory does not get
> > +  corrupted by bogus network packets or disk data.  This will save
> > +  you many hours of debug.
> > +
> > +- Primary CPU general-purpose register settings
> > +  x0 = physical address of device tree blob (dtb) in system RAM.
> > +
> > +- CPU mode
> > +  All forms of interrupts must be masked in PSTATE.DAIF (Debug, SError,
> > +  IRQ and FIQ).
> > +  The CPU must be in either EL2 (RECOMMENDED in order to have access to
> > +  the virtualisation extensions) or non-secure EL1.
> > +
> > +- Caches, MMUs
> > +  The MMU must be off.
> > +  Instruction cache may be on or off.
> > +  Data cache must be off and invalidated.
> > +
> > +- Architected timers
> > +  CNTFRQ must be programmed with the timer frequency.
> > +  If entering the kernel at EL1, CNTHCTL_EL2 must have EL1PCTEN (bit 0)
> > +  set where available.
> > +
> > +- Coherency
> > +  All CPUs to be booted by the kernel must be part of the same coherency
> > +  domain on entry to the kernel.  This may require IMPLEMENTATION DEFINED
> > +  initialisation to enable the receiving of maintenance operations on
> > +  each CPU.
> > +
> > +- System registers
> > +  All writable architected system registers at the exception level where
> > +  the kernel image will be entered must be initialised by software at a
> > +  higher exception level to prevent execution in an UNKNOWN state.
> 
> Given the recent development of ARM platforms, you might want to mandate
> the state of IOMMUs as well (they should probably be off, since there
> should be no active DMA activity). Graphics would be the exception to
> this, since if you want to keep scanning out a splash screen, you'll
> have to keep doing DMA...

We'll enhance this document as we get hardware as it's not clear whether
we can simply mandate it to be off. We may have situations with some
simple IOMMU that is previously set up by the firmware and the kernel
doesn't get access to it. One example is the System MMU from ARM that
supports stage 2 (hypervisor) translations and you just run a guest
kernel without any control of the IOMMU.

> > +- The primary CPU must jump directly to the first instruction of the
> > +  kernel image.  The device tree blob passed by this CPU must contain
> > +  for each CPU node:
> > +
> > +    1. An 'enable-method' property. Currently, the only supported value
> > +       for this field is the string "spin-table".
> > +
> > +    2. A 'cpu-release-addr' property identifying a 64-bit,
> > +       zero-initialised memory location.
> 
> These would be good to have documented in the
> Documentation/devicetree/bindings hierarchy as well.

OK.

> > index 0000000..d766493
> > --- /dev/null
> > +++ b/arch/arm64/include/asm/setup.h
> > @@ -0,0 +1,26 @@
> > +/*
> > + * Based on arch/arm/include/asm/setup.h
> > + *
> > + * Copyright (C) 1997-1999 Russell King
> > + * Copyright (C) 2012 ARM Ltd.
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +#ifndef __ASM_SETUP_H
> > +#define __ASM_SETUP_H
> > +
> > +#include <linux/types.h>
> > +
> > +#define COMMAND_LINE_SIZE 1024
> 
> Probably not a huge deal, and other architectures seem to be all over
> the map on this, but you might want to go with a larger value now rather
> than later. 2048 or 4096 perhaps?

It looks like there are many different values, including the asm-generic
one which is 512. I'm happy to follow the x86 example and change it to
2048, it doesn't really matter.

> > diff --git a/arch/arm64/kernel/head.S b/arch/arm64/kernel/head.S
> > new file mode 100644
> > index 0000000..34ccdc0
> > --- /dev/null
> > +++ b/arch/arm64/kernel/head.S
> 
> [...]
> 
> > +/*
> > + * Setup common bits before finally enabling the MMU. Essentially this is just
> > + * loading the page table pointer and vector base registers.
> > + *
> > + * On entry to this code, x0 must contain the SCTLR_EL1 value for turning on
> > + * the MMU.
> > + */
> > +__enable_mmu:
> 
> ENTRY()?

__enable_mmu is not used outside this file, so no need for ENTRY().

> > +	ldr	x5, =vectors
> > +	msr	vbar_el1, x5
> > +	msr	ttbr0_el1, x25			// load TTBR0
> > +	msr	ttbr1_el1, x26			// load TTBR1
> > +	isb
> > +	b	__turn_mmu_on
> > +ENDPROC(__enable_mmu)
> 
> ...or just END()? Same for a few of the other functions below.

ENDPROC() gives us ".type @function" in addition to END(). This proved
to be useful in the past for debugging symbols, unwind table (though we
don't have the latter on AArch64).

> > diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
> > new file mode 100644
> > index 0000000..f25186f
> > --- /dev/null
> > +++ b/arch/arm64/kernel/setup.c
> 
> [...]
> 
> > +static void __init setup_processor(void)
> > +{
> > +	struct proc_info_list *list;
> > +
> > +	/*
> > +	 * locate processor in the list of supported processor
> > +	 * types.  The linker builds this table for us from the
> > +	 * entries in arch/arm/mm/proc.S
> > +	 */
> 
> Probably from arch/arm64/... somewhere?

Yes, I did a grep and found a few more.

> > +	printk("CPU: %s [%08x] revision %d\n",
> > +	       cpu_name, read_cpuid_id(), read_cpuid_id() & 15);
> > +
> > +	sprintf(init_utsname()->machine, "aarch64");
> 
> > +	initial_boot_params = devtree;
> > +	dt_root = of_get_flat_dt_root();
> > +
> > +	machine_name = of_get_flat_dt_prop(dt_root, "model", NULL);
> > +	if (!machine_name)
> > +		machine_name = of_get_flat_dt_prop(dt_root, "compatible", NULL);
> > +	if (!machine_name)
> > +		machine_name = "<unknown>";
> > +	pr_info("Machine: %s\n", machine_name);
> 
> This property is an array of strings. It would be more valuable to print out
> the entry that was matched for a platform instead of the provided one from the
> device tree.

If we add machine_desc structure back, we could print which machine was
matched. But so far I try to keep the SoC code to a minimum and just do
the probing later in the SoC code (of_find_matching_node). Ideally we
shouldn't have any SoC code and just keep code in drivers but we'll see
how far we can get. We can discuss more details at the KS as I would
like the arm-soc team to get involved here.

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-15 17:37     ` Catalin Marinas
@ 2012-08-15 19:03       ` Olof Johansson
  2012-08-15 19:53         ` Catalin Marinas
  0 siblings, 1 reply; 170+ messages in thread
From: Olof Johansson @ 2012-08-15 19:03 UTC (permalink / raw)
  To: Catalin Marinas
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

Hi,

On Wed, Aug 15, 2012 at 06:37:11PM +0100, Catalin Marinas wrote:
> Hi Olof,
> 
> > Given the recent development of ARM platforms, you might want to mandate
> > the state of IOMMUs as well (they should probably be off, since there
> > should be no active DMA activity). Graphics would be the exception to
> > this, since if you want to keep scanning out a splash screen, you'll
> > have to keep doing DMA...
> 
> We'll enhance this document as we get hardware as it's not clear whether
> we can simply mandate it to be off. We may have situations with some
> simple IOMMU that is previously set up by the firmware and the kernel
> doesn't get access to it. One example is the System MMU from ARM that
> supports stage 2 (hypervisor) translations and you just run a guest
> kernel without any control of the IOMMU.

Ok, fair enough.

> > > +/*
> > > + * Setup common bits before finally enabling the MMU. Essentially this is just
> > > + * loading the page table pointer and vector base registers.
> > > + *
> > > + * On entry to this code, x0 must contain the SCTLR_EL1 value for turning on
> > > + * the MMU.
> > > + */
> > > +__enable_mmu:
> > 
> > ENTRY()?
> 
> __enable_mmu is not used outside this file, so no need for ENTRY().
> 
> > > +	ldr	x5, =vectors
> > > +	msr	vbar_el1, x5
> > > +	msr	ttbr0_el1, x25			// load TTBR0
> > > +	msr	ttbr1_el1, x26			// load TTBR1
> > > +	isb
> > > +	b	__turn_mmu_on
> > > +ENDPROC(__enable_mmu)
> > 
> > ...or just END()? Same for a few of the other functions below.
> 
> ENDPROC() gives us ".type @function" in addition to END(). This proved
> to be useful in the past for debugging symbols, unwind table (though we
> don't have the latter on AArch64).

As good a reason as any, sounds good.

> > > +static void __init setup_processor(void)
> > > +{
> > > +	struct proc_info_list *list;
> > > +
> > > +	/*
> > > +	 * locate processor in the list of supported processor
> > > +	 * types.  The linker builds this table for us from the
> > > +	 * entries in arch/arm/mm/proc.S
> > > +	 */
> > 
> > Probably from arch/arm64/... somewhere?
> 
> Yes, I did a grep and found a few more.

Yeah, I pointed out some other stale ARM-derived comments in other patches.

> > > +	printk("CPU: %s [%08x] revision %d\n",
> > > +	       cpu_name, read_cpuid_id(), read_cpuid_id() & 15);
> > > +
> > > +	sprintf(init_utsname()->machine, "aarch64");
> > 
> > > +	initial_boot_params = devtree;
> > > +	dt_root = of_get_flat_dt_root();
> > > +
> > > +	machine_name = of_get_flat_dt_prop(dt_root, "model", NULL);
> > > +	if (!machine_name)
> > > +		machine_name = of_get_flat_dt_prop(dt_root, "compatible", NULL);
> > > +	if (!machine_name)
> > > +		machine_name = "<unknown>";
> > > +	pr_info("Machine: %s\n", machine_name);
> > 
> > This property is an array of strings. It would be more valuable to print out
> > the entry that was matched for a platform instead of the provided one from the
> > device tree.
> 
> If we add machine_desc structure back, we could print which machine was
> matched. But so far I try to keep the SoC code to a minimum and just do
> the probing later in the SoC code (of_find_matching_node). Ideally we
> shouldn't have any SoC code and just keep code in drivers but we'll see
> how far we can get. We can discuss more details at the KS as I would
> like the arm-soc team to get involved here.

Interesting approach, I wonder if it'll scale, in particular if it comes
to needing to do early setup and init. For device-level setup, generic
will probably work just fine. And if it doesn't, things can be changed
later. So it sounds like a good start.

Definitely something we should discuss. I suggest not doing it at KS
though, since only half of the arm-soc team is invited there. So the
ARM mini-summit or hallway around LPC is a better venue.


-Olof


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 02/31] arm64: Kernel booting and initialisation
  2012-08-15 19:03       ` Olof Johansson
@ 2012-08-15 19:53         ` Catalin Marinas
  0 siblings, 0 replies; 170+ messages in thread
From: Catalin Marinas @ 2012-08-15 19:53 UTC (permalink / raw)
  To: Olof Johansson
  Cc: linux-arch, linux-arm-kernel, linux-kernel, Arnd Bergmann, Will Deacon

On 15 August 2012 20:03, Olof Johansson <olof@lixom.net> wrote:
> On Wed, Aug 15, 2012 at 06:37:11PM +0100, Catalin Marinas wrote:
>> If we add machine_desc structure back, we could print which machine was
>> matched. But so far I try to keep the SoC code to a minimum and just do
>> the probing later in the SoC code (of_find_matching_node). Ideally we
>> shouldn't have any SoC code and just keep code in drivers but we'll see
>> how far we can get. We can discuss more details at the KS as I would
>> like the arm-soc team to get involved here.
>
> Interesting approach, I wonder if it'll scale, in particular if it comes
> to needing to do early setup and init. For device-level setup, generic
> will probably work just fine. And if it doesn't, things can be changed
> later. So it sounds like a good start.
>
> Definitely something we should discuss. I suggest not doing it at KS
> though, since only half of the arm-soc team is invited there. So the
> ARM mini-summit or hallway around LPC is a better venue.

I was indeed thinking of the ARM mini-summit or hallway discussions.
The KS has different topics and it wouldn't have been of wide interest
anyway.

-- 
Catalin

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 03/31] arm64: Exception handling
  2012-08-15 13:03   ` Arnd Bergmann
@ 2012-08-16 10:05     ` Will Deacon
  2012-08-16 11:54       ` Arnd Bergmann
  0 siblings, 1 reply; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:05 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Wed, Aug 15, 2012 at 02:03:47PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> > +#ifdef CONFIG_AARCH32_EMULATION
> > +#define compat_thumb_mode(regs) \
> > +	(((regs)->pstate & COMPAT_PSR_T_BIT))
> > +#else
> > +#define compat_thumb_mode(regs) (0)
> > +#endif
> 
> The symbol we use on other platforms is CONFIG_COMPAT. I don't think you
> need to have a separate CONFIG_AARCH32_EMULATION

Using COMPAT does preclude the possibility of doing something like the x32
ABI later on though. Some other architectures seem to do something similar
(MIPS32_COMPAT, IA32_EMULATION).

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 16/31] arm64: ELF definitions
  2012-08-15 14:15   ` Arnd Bergmann
@ 2012-08-16 10:23     ` Will Deacon
  2012-08-16 12:37       ` Arnd Bergmann
  0 siblings, 1 reply; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:23 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Wed, Aug 15, 2012 at 03:15:39PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> > +
> > +void elf_set_personality(int personality)
> > +{
> > +       switch (personality & PER_MASK) {
> > +       case PER_LINUX:
> > +               clear_thread_flag(TIF_32BIT);
> > +               break;
> > +       case PER_LINUX32:
> > +               set_thread_flag(TIF_32BIT);
> > +               break;
> > +       default:
> > +               pr_warning("Process %s tried to assume unknown personality %d\n",
> > +                          current->comm, personality);
> > +               return;
> > +       }
> > +
> > +       current->personality = personality;
> > +}
> > +EXPORT_SYMBOL(elf_set_personality);
> 
> This looks wrong: PER_LINUX/PER_LINUX32 decides over the output of the
> uname system call, while TIF_32BIT decides over the instruction set
> when returning to user space. You definitely should not set the personality
> to the value you pass from the elf loader. Instead, just do
> 
> #define SET_PERSONALITY(ex) clear_thread_flag(TIF_32BIT);
> #define COMPAT_SET_PERSONALITY(ex) set_thread_flag(TIF_32BIT);

In this case, won't uname be incorrect (aarch64l) for aarch32 tasks (which
expect something like armv8l)?

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 21/31] arm64: 32-bit (compat) applications support
  2012-08-15 14:34   ` Arnd Bergmann
@ 2012-08-16 10:28     ` Will Deacon
  2012-08-16 12:39       ` Arnd Bergmann
  2012-08-23  6:46       ` PER_LINUX32, Was: " Arnd Bergmann
  2012-08-24 10:43     ` Catalin Marinas
  1 sibling, 2 replies; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:28 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Wed, Aug 15, 2012 at 03:34:04PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> > +asmlinkage int compat_sys_personality(compat_ulong_t personality)
> > +{
> > +	int ret;
> > +
> > +	if (personality(current->personality) == PER_LINUX32 &&
> > +		personality == PER_LINUX)
> > +		personality = PER_LINUX32;
> > +	ret = sys_personality(personality);
> > +	if (ret == PER_LINUX32)
> > +		ret = PER_LINUX;
> > +	return ret;
> > +}
> 
> Where did you get this from?
> 
> You should not need compat_sys_personality, just call the native function.

Hmm, but in that case an aarch32 application doing a personality(PER_LINUX)
syscall will start seeing the wrong uname.

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 23/31] arm64: Debugging support
  2012-08-15 15:07   ` Arnd Bergmann
@ 2012-08-16 10:47     ` Will Deacon
  2012-08-16 12:49       ` Arnd Bergmann
  0 siblings, 1 reply; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:47 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Wed, Aug 15, 2012 at 04:07:36PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> > +const struct user_regset_view *task_user_regset_view(struct task_struct *task)
> > +{
> > +#ifdef CONFIG_AARCH32_EMULATION
> > +	if (test_tsk_thread_flag(task, TIF_32BIT))
> > +		return &user_aarch32_view;
> > +#endif
> > +	return &user_aarch64_view;
> > +}
> 
> Ah, nice. So you support 64 bit debuggers debugging 32 bit processes, right?

That should work if the debugger can deal with it, yes.

> From what I can tell, there is no support for 32 bit processes debugging
> 64 bit ones. Is that something you plan to add in the future, or do you
> consider that out of scope? In either case, a comment would be helpful.

That can't really work because the debugger won't be able to manipulate
child pointers properly without us adding a new ptrace interface (and then,
I still wonder about how feasible it really is). I can add a comment.

> > +long arch_ptrace(struct task_struct *child, long request,
> > +		 unsigned long addr, unsigned long data)
> > +{
> > +	int ret;
> > +	unsigned long *datap = (unsigned long __user *)data;
> > +
> > +	switch (request) {
> > +		case PTRACE_GET_THREAD_AREA:
> > +			ret = put_user(child->thread.tp_value, datap);
> > +			break;
> > +
> > +#ifdef CONFIG_HAVE_HW_BREAKPOINT
> > +		case PTRACE_GETHBPREGS:
> > +			ret = ptrace_gethbpregs(child, addr, datap);
> > +			break;
> > +
> > +		case PTRACE_SETHBPREGS:
> > +			ret = ptrace_sethbpregs(child, addr, datap);
> > +			break;
> > +#endif
> > +
> > +		default:
> > +			ret = ptrace_request(child, request, addr, data);
> > +			break;
> > +	}
> > +
> > +	return ret;
> > +}
> 
> Is there a reason why these are not regsets but have their own ptrace
> commands? I believe new architectures should generally not add ptrace
> commands any more.

I could probably add some regset wrappers around the hbp accessors (which we
have to keep for the compat ptrace interface). I'll have a think as it might
even make sense to have different regsets for breakpoints and watchpoints.

As for the TLS, is it worth having a regset with only one register?

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 25/31] arm64: Performance counters support
  2012-08-15 15:11   ` Arnd Bergmann
@ 2012-08-16 10:51     ` Will Deacon
  0 siblings, 0 replies; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:51 UTC (permalink / raw)
  To: Arnd Bergmann; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Wed, Aug 15, 2012 at 04:11:11PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> > From: Will Deacon <will.deacon@arm.com>
> > 
> > This patch adds support for the AArch64 performance counters.
> > 
> > Signed-off-by: Will Deacon <will.deacon@arm.com>
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > ---
> >  arch/arm64/include/asm/perf_event.h |   22 +
> >  arch/arm64/include/asm/pmu.h        |   82 +++
> >  arch/arm64/kernel/perf_event.c      | 1368 +++++++++++++++++++++++++++++++++++
> >  tools/perf/perf.h                   |    6 +
> 
> Can you explain how AArch64 performance counters differ from the 32
> bit ones? Do they work for AArch32 user space under AArch64 kernels?
> Is it possible to share parts of the implementation with arch/arm?

Perf should work for compat tasks, yes. I'd like to share some of the code
with arch/arm/ and I've started reworking the arch/arm/ stuff to accommodate
this better:

  git://git.kernel.org/pub/scm/linux/kernel/git/will/linux.git perf/updates

I'm not sure how well it will fit in drivers/ but I'm certainly willing to
give it a try.

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 26/31] arm64: Miscellaneous library functions
  2012-08-15 15:21   ` Arnd Bergmann
@ 2012-08-16 10:57     ` Will Deacon
  2012-08-16 13:00       ` Arnd Bergmann
  0 siblings, 1 reply; 170+ messages in thread
From: Will Deacon @ 2012-08-16 10:57 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel,
	Marc Zyngier

On Wed, Aug 15, 2012 at 04:21:14PM +0100, Arnd Bergmann wrote:
> On Tuesday 14 August 2012, Catalin Marinas wrote:
> 
> > +
> > +/*
> > + * Use compiler builtins for simple inline operations.
> > + */
> > +static inline unsigned long __ffs(unsigned long word)
> > +{
> > +	return __builtin_ffsl(word) - 1;
> > +}
> > +
> > +static inline int ffs(int x)
> > +{
> > +	return __builtin_ffs(x);
> > +}
> > +
> > +static inline unsigned long __fls(unsigned long word)
> > +{
> > +	return BITS_PER_LONG - 1 - __builtin_clzl(word);
> > +}
> > +
> > +static inline int fls(int x)
> > +{
> > +	return x ? sizeof(x) * BITS_PER_BYTE - __builtin_clz(x) : 0;
> > +}
> 
> These are all great, but I think whether to use them or not should
> depend on the compiler version rather than the architecture in
> general. Do we know a minimum gcc version that supports all of the
> above? Then we could put that code into the generic files.

I imagine that the version of GCC that supports these builtins varies for
each architecture. For aarch64, the compiler will always support these
builtins and these particular ones are guaranteed to be inlined.

> If that's not possible, we could still make the implementation
> available for other architectures by moving it to
> 
> asm-generic/bitops/builtin-__ffs.h
> asm-generic/bitops/builtin-ffs.h
> asm-generic/bitops/builtin-__fls.h
> asm-generic/bitops/builtin-fls.h

Yeah, that might be an idea. The architecture can then decide to use them if
it knows they are available and usable.

> > --- /dev/null
> > +++ b/arch/arm64/lib/bitops.c
> > @@ -0,0 +1,25 @@
> > +/*
> > + * Copyright (C) 2012 ARM Limited
> > + *
> > + * This program is free software; you can redistribute it and/or modify
> > + * it under the terms of the GNU General Public License version 2 as
> > + * published by the Free Software Foundation.
> > + *
> > + * This program is distributed in the hope that it will be useful,
> > + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> > + * GNU General Public License for more details.
> > + *
> > + * You should have received a copy of the GNU General Public License
> > + * along with this program.  If not, see <http://www.gnu.org/licenses/>.
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/spinlock.h>
> > +#include <linux/atomic.h>
> > +
> > +#ifdef CONFIG_SMP
> > +arch_spinlock_t __atomic_hash[ATOMIC_HASH_SIZE] __lock_aligned = {
> > +       [0 ... (ATOMIC_HASH_SIZE-1)]  = __ARCH_SPIN_LOCK_UNLOCKED
> > +};
> > +#endif
> 
> What?
> 
> I suppose this is a leftover from an earlier version using the
> generic bitops, right?

We currently use the generic atomic bitops (asm-generic/bitops/atomic.h)
which contains:

#  define ATOMIC_HASH(a) (&(__atomic_hash[ (((unsigned long) a)/L1_CACHE_BYTES) & (ATOMIC_HASH_SIZE-1) ]))

so we have to provide a definition for the array. We have additional patches
containing optimised assembly implementations of the atomic bitops which we
will push later, once we've got some hardware to benchmark with.

Will

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 03/31] arm64: Exception handling
  2012-08-16 10:05     ` Will Deacon
@ 2012-08-16 11:54       ` Arnd Bergmann
  0 siblings, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-16 11:54 UTC (permalink / raw)
  To: Will Deacon; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Thursday 16 August 2012, Will Deacon wrote:
> On Wed, Aug 15, 2012 at 02:03:47PM +0100, Arnd Bergmann wrote:
> > On Tuesday 14 August 2012, Catalin Marinas wrote:
> > 
> > > +#ifdef CONFIG_AARCH32_EMULATION
> > > +#define compat_thumb_mode(regs) \
> > > +   (((regs)->pstate & COMPAT_PSR_T_BIT))
> > > +#else
> > > +#define compat_thumb_mode(regs) (0)
> > > +#endif
> > 
> > The symbol we use on other platforms is CONFIG_COMPAT. I don't think you
> > need to have a separate CONFIG_AARCH32_EMULATION
> 
> Using COMPAT does preclude the possibility of doing something like the x32
> ABI later on though. Some other architectures seem to do something similar
> (MIPS32_COMPAT, IA32_EMULATION).

Ok, fair enough.

	Arnd


^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 16/31] arm64: ELF definitions
  2012-08-16 10:23     ` Will Deacon
@ 2012-08-16 12:37       ` Arnd Bergmann
  2012-08-21 16:06         ` Catalin Marinas
  0 siblings, 1 reply; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-16 12:37 UTC (permalink / raw)
  To: Will Deacon; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Thursday 16 August 2012, Will Deacon wrote:
> > This looks wrong: PER_LINUX/PER_LINUX32 decides over the output of the
> > uname system call, while TIF_32BIT decides over the instruction set
> > when returning to user space. You definitely should not set the personality
> > to the value you pass from the elf loader. Instead, just do
> > 
> > #define SET_PERSONALITY(ex) clear_thread_flag(TIF_32BIT);
> > #define COMPAT_SET_PERSONALITY(ex) set_thread_flag(TIF_32BIT);
> 
> In this case, won't uname be incorrect (aarch64l) for aarch32 tasks (which
> expect something like armv8l)?

No, the uname output is meant to tell you about the system, not the
instruction set that you are using (you already know that in compiled
code).

The main use case is to fool stuff like autoconf into assuming your
architecture is the other one, e.g. when building a 32 bit package
using a 64 bit /bin/bash to run ./configure or vice versa.

	Arnd

^ permalink raw reply	[flat|nested] 170+ messages in thread

* Re: [PATCH v2 21/31] arm64: 32-bit (compat) applications support
  2012-08-16 10:28     ` Will Deacon
@ 2012-08-16 12:39       ` Arnd Bergmann
  2012-08-23  6:46       ` PER_LINUX32, Was: " Arnd Bergmann
  1 sibling, 0 replies; 170+ messages in thread
From: Arnd Bergmann @ 2012-08-16 12:39 UTC (permalink / raw)
  To: Will Deacon; +Cc: Catalin Marinas, linux-arch, linux-arm-kernel, linux-kernel

On Thursday 16 August 2012, Will Deacon wrote:
> 
> On Wed, Aug 15, 2012 at 03:34:04PM +0100, Arnd Bergmann wrote:
> > On Tuesday 14 August 2012, Catalin Marinas wrote:
> > > +asmlinkage int compat_sys_personality(compat_ulong_t personality)
> > > +{
> > > +   int ret;
> > > +
> > > +   if (personality(current->personality) == PER_LINUX32 &&
> > > +           personality == PER_LINUX)
> > > +           personality = PER_LINUX32;
> > > +   ret = sys_personality(personality);
> > > +   if (ret == PER_LINUX32)
> > > +           ret = PER_LINUX;
> > > +   return ret;
> > > +}
> > 
> > Where did you get this from?
> > 
> > You should not need compat_sys_personality, just call the native function.
> 
> Hmm, but in that case an aarch32 application doing a personality(PER_LINUX)
> syscall will start seeing the wrong uname.

Not the wrong uname, just the default one, which is correct.

	Arnd

^ permalink raw reply	[flat|