linux-arm-kernel.lists.infradead.org archive mirror
* [PATCH v9 0/6] ARM: VDSO
@ 2014-08-22 21:52 Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 1/6] ARM: use _install_special_mapping for sigpage Nathan Lynch
                   ` (7 more replies)
  0 siblings, 8 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC
  To: linux-arm-kernel

Provide fast userspace implementations of gettimeofday and
clock_gettime on systems that implement the generic timers extension
defined in ARMv7.  This follows the arm64 VDSO in concept but differs
significantly in implementation, chiefly in being written in C rather
than assembly.

Clocks supported:
- CLOCK_REALTIME
- CLOCK_MONOTONIC
- CLOCK_REALTIME_COARSE
- CLOCK_MONOTONIC_COARSE

This also provides clock_getres (as arm64 does).
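
For reference, a minimal userspace sketch that exercises each of the
clock ids listed above through the ordinary libc entry points.  Whether
these calls actually reach the VDSO depends on the libc in use; the
demo_* names are illustrative only and are not part of this series.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* The four clock ids the cover letter lists as supported. */
static const clockid_t demo_clocks[] = {
	CLOCK_REALTIME,
	CLOCK_MONOTONIC,
	CLOCK_REALTIME_COARSE,
	CLOCK_MONOTONIC_COARSE,
};

/*
 * Read each clock and its resolution once, printing the results.
 * Returns 0 if every call succeeded, -1 otherwise.
 */
static int demo_query_clocks(void)
{
	size_t i;

	for (i = 0; i < sizeof(demo_clocks) / sizeof(demo_clocks[0]); i++) {
		struct timespec now, res;

		if (clock_gettime(demo_clocks[i], &now) != 0 ||
		    clock_getres(demo_clocks[i], &res) != 0)
			return -1;

		/* A valid timespec keeps tv_nsec within one second. */
		assert(now.tv_nsec >= 0 && now.tv_nsec < 1000000000L);

		printf("clock %d: %lld.%09ld (resolution %lld.%09ld)\n",
		       (int)demo_clocks[i],
		       (long long)now.tv_sec, now.tv_nsec,
		       (long long)res.tv_sec, res.tv_nsec);
	}

	return 0;
}
```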

getcpu support is planned but not included at this time.

For applications to transparently benefit from this change,
ARM-specific support code needs to be added to glibc.  I have such a
patch, and have verified that glibc's self tests do not detect any
regressions.  I hope to have that code added to glibc for the 2.21
release.

The VDSO symbols are available for lookup via dlsym even with an
unpatched glibc.
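
A rough sketch of such a lookup follows.  It assumes glibc's dynamic
loader, which registers the VDSO under the soname linux-vdso.so.1 (the
soname this series sets); the demo_* names are hypothetical, and older
glibc versions need -ldl at link time.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <dlfcn.h>
#include <stddef.h>
#include <time.h>

typedef int (*demo_clock_gettime_fn)(clockid_t, struct timespec *);

/*
 * Look up __vdso_clock_gettime in the already-mapped VDSO.
 * RTLD_NOLOAD asks for a handle only if the object is already loaded.
 * Returns NULL if the VDSO or the symbol is not available.
 */
static demo_clock_gettime_fn demo_lookup_vdso_clock_gettime(void)
{
	void *handle;

	handle = dlopen("linux-vdso.so.1",
			RTLD_LAZY | RTLD_LOCAL | RTLD_NOLOAD);
	if (!handle)
		return NULL;

	return (demo_clock_gettime_fn)dlsym(handle, "__vdso_clock_gettime");
}

/* Call the VDSO routine directly, or fall back to libc if not found. */
static int demo_vdso_clock_gettime(clockid_t id, struct timespec *ts)
{
	demo_clock_gettime_fn fn;

	assert(ts != NULL);

	fn = demo_lookup_vdso_clock_gettime();
	if (!fn)
		return clock_gettime(id, ts);

	return fn(id, ts);
}
```

A real caller would cache the function pointer rather than repeat the
lookup on every call.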

Note that while the high-precision realtime and monotonic clock
support depends on the generic timers extension, support for
clock_getres and coarse clocks is independent of the timer
implementation and is provided unconditionally.  High-resolution clock
support requires changes to the arch timer code, posted here:

http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/281280.html

The VDSO will function correctly without those changes, but
gettimeofday and clock_gettime with CLOCK_REALTIME/CLOCK_MONOTONIC
will not be accelerated.

Tested on OMAP5 and i.MX6, verifying that results obtained with the
vdso are consistent with those obtained from the kernel.  On OMAP5 I
observe a 3- to 4-fold speedup for gettimeofday / CLOCK_REALTIME, with
even better (if less interesting) speedups for the coarse clock ids
and clock_getres.
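
As an illustration only (this is not the vdsotest code), one way to
compare the two paths is to time a loop of libc clock_gettime calls
against a loop that always enters the kernel through syscall(2).  The
demo_* names are hypothetical, and the libc path is only accelerated
if the libc in use knows about the VDSO.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

typedef int (*demo_gettime_fn)(clockid_t, struct timespec *);

/* Always enter the kernel, bypassing any VDSO fast path in libc. */
static int demo_sys_clock_gettime(clockid_t id, struct timespec *ts)
{
	return (int)syscall(SYS_clock_gettime, id, ts);
}

static int64_t demo_ns(const struct timespec *ts)
{
	return (int64_t)ts->tv_sec * 1000000000LL + ts->tv_nsec;
}

/* Average cost in nanoseconds of one call to fn, over iters calls. */
static double demo_ns_per_call(demo_gettime_fn fn, unsigned long iters)
{
	struct timespec start, end, scratch;
	unsigned long i;

	assert(iters > 0);

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < iters; i++)
		fn(CLOCK_REALTIME, &scratch);
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(demo_ns(&end) - demo_ns(&start)) / (double)iters;
}
```

Dividing demo_ns_per_call(demo_sys_clock_gettime, n) by
demo_ns_per_call(clock_gettime, n) gives an approximate speedup
factor for a given n.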

I've been testing and benchmarking this with some custom test code
which I have hosted here:

https://github.com/nlynch-mentor/vdsotest

Unpatched FSF GDB may complain "warning: Could not load shared library
symbols for linux-vdso.so.1."  This is not an ARM-specific issue.
Current Fedora and Debian patch their GDB packages to prevent this
warning.

Changes since v8:
- Update to 3.17-rc1.
- Split out arch timer changes into separate series.
- Ensure that VDSO will not attempt to read the counter if access is
  not enabled; this check can be removed after the arch timer changes
  are merged.  See update_vsyscall and vdso_can_use_arch_timer in
  patch #5.

Changes since v7:
- Update to next-20140801.
- In arch_setup_additional_pages, fix call to get_unmapped_area - use
  bytes (not pages) for length argument.
- As x86 does, separate data and text into two VMAs, [vvar] and [vdso]
  respectively.  These have different permissions; [vdso] will allow
  debuggers to set breakpoints, but [vvar] is read-only and cannot be
  modified even via ptrace.
- Use _install_special_mapping for signal page, vvar, and vdso
  mappings.
- Add -DDISABLE_BRANCH_PROFILING.
- Add --no-undefined -Bsymbolic to link options to cause linker to
  error out on unresolved references, making checkundef script
  unnecessary.
- Specify max-page-size, common-page-size in linker options so the
  true alignment is reflected in program header; otherwise gdb gets
  confused.
- Fix incremental build vs. generated auxvec.h.
- Use appropriate unwind directives in __get_datapage.
- Added vdso_install target and help text.  Install build-id symlinks
  as x86 does.
- Adjust update_vsyscall for changes in struct timekeeper.

Changes since v6:
- Update to 3.16-rc1.
- Remove -lgcc from link step - need to support GCC installations
  without libgcc.
- Force -O2 compilation to prevent GCC from emitting calls to libgcc
  math routines.
- Use custom post-processing to clear the EF_ARM_ABI_FLOAT_SOFT flag
  if set in the ELF header to produce a shared object which is
  architecturally allowed to be used by both soft- and hard-float
  code.
- Consolidate common arch timer code instead of duplicating it.
- Prevent the VDSO from attempting CP15 access on memory-only
  architected timer implementations by renaming the clocksource.

Changes since v5:
- Update to 3.15-rc1.
- Place vdso at a randomized offset above the stack along with the
  sigpage.
- Properly export asm/auxvec.h.
- Split patch into series for ease of review.

Changes since v4:
- Map data page at the beginning of the VMA to prevent orphan
  sections at the end of output invalidating the calculated offset.
- Move checkundef into cmd_vdsold to avoid spurious rebuilds.
- Change vdso_init message to pr_debug.
- Add -fno-stack-protector to cflags.

Changes since v3:
- Update to 3.14-rc6.
- Record vdso base in mm context before installing mapping (for the
  sake of perf_mmap_event).
- Use a more seqcount-like API for critical sections.  Using seqcount
  API directly, however, would leak kernel pointers to userspace when
  lockdep is enabled.
- Trap instead of looping forever in division-by-zero stubs.

Changes since v2:
- Update to 3.14-rc4.
- Make vDSO configurable, depending on AEABI and MMU.
- Defer shifting of nanosecond component of timespec: fixes observed
  1ns inconsistencies for CLOCK_REALTIME, CLOCK_MONOTONIC (see
  45a7905fc48f for arm64 equivalent).
- Force reload of seq_count when spinning: without a memory clobber
  after the load of vdata->seq_count, GCC can generate code like this:
    2f8:   e59c9020        ldr     r9, [ip, #32]
    2fc:   e3190001        tst     r9, #1
    300:   1a000033        bne     3d4 <do_realtime+0x104>
    304:   f57ff05b        dmb     ish
    308:   e59c3034        ldr     r3, [ip, #52]   ; 0x34
    ...
    3d4:   eafffffe        b       3d4 <do_realtime+0x104>
- Build vdso.so with -lgcc: calls to __lshrdi3, __divsi3 sometimes
  emitted (especially with -Os).  Override certain libgcc functions to
  prevent undefined symbols.
- Do not clear PG_reserved on vdso pages.
- Remove unnecessary get_page calls.
- Simplify ELF signature check during init.
- Use volatile for asm syscall fallbacks.
- Check whether vdso_pagelist is initialized in arm_install_vdso.
- Record clocksource mask in data page.
- Reduce code duplication in do_realtime, do_monotonic.
- Reduce calculations performed in critical sections.
- Simplify coarse clock handling.
- Move datapage load to its own assembly routine.
- Tune vdso_data layout and tweak field names.
- Check vdso shared object for undefined symbols during build.
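
To illustrate the seq_count reload issue noted in the v2 changes
above, here is a self-contained sketch of a sequence-counter reader
and writer.  It uses GCC's __atomic builtins in place of the kernel's
barrier primitives and the memory clobber mentioned above, and the
demo_* names and the reduced structure are hypothetical.  The point is
that every spin of the reader performs a fresh load of the counter, so
the compiler cannot turn the wait into a branch-to-self.

```c
#include <assert.h>
#include <stdint.h>

struct demo_data {
	uint32_t seq_count;	/* odd while an update is in progress */
	uint32_t sec;
	uint32_t nsec;
};

/* Single writer: bump to odd, publish the fields, bump to even. */
static void demo_write(struct demo_data *d, uint32_t sec, uint32_t nsec)
{
	uint32_t seq = __atomic_load_n(&d->seq_count, __ATOMIC_RELAXED);

	assert((seq & 1) == 0);	/* no update may already be in flight */

	__atomic_store_n(&d->seq_count, seq + 1, __ATOMIC_RELAXED);
	__atomic_thread_fence(__ATOMIC_RELEASE);
	__atomic_store_n(&d->sec, sec, __ATOMIC_RELAXED);
	__atomic_store_n(&d->nsec, nsec, __ATOMIC_RELAXED);
	__atomic_store_n(&d->seq_count, seq + 2, __ATOMIC_RELEASE);
}

/* Reader: retry until a consistent snapshot has been observed. */
static void demo_read(struct demo_data *d, uint32_t *sec, uint32_t *nsec)
{
	uint32_t seq;

	do {
		/* Atomic load: seq_count is re-read on every spin. */
		while ((seq = __atomic_load_n(&d->seq_count,
					      __ATOMIC_ACQUIRE)) & 1)
			;

		*sec = __atomic_load_n(&d->sec, __ATOMIC_RELAXED);
		*nsec = __atomic_load_n(&d->nsec, __ATOMIC_RELAXED);

		/* Order the data loads before the re-check below. */
		__atomic_thread_fence(__ATOMIC_ACQUIRE);
	} while (__atomic_load_n(&d->seq_count, __ATOMIC_RELAXED) != seq);
}
```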

Changes since v1:
- Update to 3.14-rc1.
- Ensure cache coherency for data page.
- Document the kernel-to-userspace protocol for vdso data page updates,
  and note that the timekeeping core prevents concurrent updates.
- Update wall-to-monotonic fields unconditionally.
- Move vdso_start, vdso_end declarations to vdso.h.
- Correctly build and run when CONFIG_ARM_ARCH_TIMER=n.
- Rearrange linker script to avoid overlapping sections when
  CONFIG_DEBUG_INFO=n.
- Remove use_syscall checks from coarse clock paths.
- Crib BUG_INSTR (0xe7f001f2) from asm/bug.h for text fill.


Nathan Lynch (6):
  ARM: use _install_special_mapping for sigpage
  ARM: place sigpage at a random offset above stack
  ARM: miscellaneous vdso infrastructure, preparation
  ARM: add vdso user-space code
  ARM: vdso initialization, mapping, and synchronization
  ARM: add CONFIG_VDSO Kconfig and Makefile bits

 arch/arm/Makefile                    |   8 +
 arch/arm/include/asm/Kbuild          |   1 -
 arch/arm/include/asm/auxvec.h        |   1 +
 arch/arm/include/asm/elf.h           |   9 +
 arch/arm/include/asm/mmu.h           |   3 +
 arch/arm/include/asm/vdso.h          |  34 ++++
 arch/arm/include/asm/vdso_datapage.h |  60 +++++++
 arch/arm/include/uapi/asm/Kbuild     |   1 +
 arch/arm/include/uapi/asm/auxvec.h   |   7 +
 arch/arm/kernel/Makefile             |   1 +
 arch/arm/kernel/asm-offsets.c        |   5 +
 arch/arm/kernel/process.c            |  71 +++++++-
 arch/arm/kernel/vdso.c               | 207 ++++++++++++++++++++++
 arch/arm/mm/Kconfig                  |  15 ++
 arch/arm/vdso/.gitignore             |   1 +
 arch/arm/vdso/Makefile               |  74 ++++++++
 arch/arm/vdso/datapage.S             |  15 ++
 arch/arm/vdso/vdso.S                 |  35 ++++
 arch/arm/vdso/vdso.lds.S             |  88 ++++++++++
 arch/arm/vdso/vdsomunge.c            | 208 +++++++++++++++++++++++
 arch/arm/vdso/vgettimeofday.c        | 320 +++++++++++++++++++++++++++++++++++
 21 files changed, 1154 insertions(+), 10 deletions(-)
 create mode 100644 arch/arm/include/asm/auxvec.h
 create mode 100644 arch/arm/include/asm/vdso.h
 create mode 100644 arch/arm/include/asm/vdso_datapage.h
 create mode 100644 arch/arm/include/uapi/asm/auxvec.h
 create mode 100644 arch/arm/kernel/vdso.c
 create mode 100644 arch/arm/vdso/.gitignore
 create mode 100644 arch/arm/vdso/Makefile
 create mode 100644 arch/arm/vdso/datapage.S
 create mode 100644 arch/arm/vdso/vdso.S
 create mode 100644 arch/arm/vdso/vdso.lds.S
 create mode 100644 arch/arm/vdso/vdsomunge.c
 create mode 100644 arch/arm/vdso/vgettimeofday.c

-- 
1.9.3


* [PATCH v9 1/6] ARM: use _install_special_mapping for sigpage
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 2/6] ARM: place sigpage at a random offset above stack Nathan Lynch
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC
  To: linux-arm-kernel

_install_special_mapping allows the VMA to be identified in
/proc/pid/maps without the use of arch_vma_name, providing a
slight net reduction in object size:

  text    data     bss     dec     hex filename
  2996      96     144    3236     ca4 arch/arm/kernel/process.o (before)
  2956     104     144    3204     c84 arch/arm/kernel/process.o (after)
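
From userspace, the named mapping can be checked with a small helper
along these lines.  This is only a sketch: the demo_* names are
hypothetical, and it assumes the usual /proc/<pid>/maps layout in
which the mapping name is the last whitespace-separated field.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 1 if a maps line ends with the given mapping name, else 0. */
static int demo_maps_line_names(const char *line, const char *name)
{
	size_t len = strlen(line);
	size_t nlen = strlen(name);

	assert(nlen > 0);

	/* Ignore the trailing newline left by fgets(). */
	while (len > 0 && line[len - 1] == '\n')
		len--;

	/* Need room for the name plus a separating space before it. */
	if (len <= nlen)
		return 0;
	if (memcmp(line + len - nlen, name, nlen) != 0)
		return 0;

	/* The name must be the whole last field, not a path suffix. */
	return line[len - nlen - 1] == ' ';
}

/*
 * Count mappings with the given name in the current process.
 * Returns -1 if /proc/self/maps cannot be opened.
 */
static int demo_count_mappings(const char *name)
{
	char line[8192];
	FILE *f = fopen("/proc/self/maps", "r");
	int count = 0;

	if (!f)
		return -1;

	while (fgets(line, sizeof(line), f))
		count += demo_maps_line_names(line, name);

	fclose(f);
	return count;
}
```

With this patch applied, demo_count_mappings("[sigpage]") would be
expected to report one mapping for an ordinary ARM process.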

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/process.c | 24 ++++++++++++++++--------
 1 file changed, 16 insertions(+), 8 deletions(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 81ef686a91ca..46fbbb3701a0 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -472,19 +472,23 @@ int in_gate_area_no_mm(unsigned long addr)
 
 const char *arch_vma_name(struct vm_area_struct *vma)
 {
-	return is_gate_vma(vma) ? "[vectors]" :
-		(vma->vm_mm && vma->vm_start == vma->vm_mm->context.sigpage) ?
-		 "[sigpage]" : NULL;
+	return is_gate_vma(vma) ? "[vectors]" : NULL;
 }
 
 static struct page *signal_page;
 extern struct page *get_signal_page(void);
 
+static const struct vm_special_mapping sigpage_mapping = {
+	.name = "[sigpage]",
+	.pages = &signal_page,
+};
+
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
 	struct mm_struct *mm = current->mm;
+	struct vm_area_struct *vma;
 	unsigned long addr;
-	int ret;
+	int ret = 0;
 
 	if (!signal_page)
 		signal_page = get_signal_page();
@@ -498,12 +502,16 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		goto up_fail;
 	}
 
-	ret = install_special_mapping(mm, addr, PAGE_SIZE,
+	vma = _install_special_mapping(mm, addr, PAGE_SIZE,
 		VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
-		&signal_page);
+		&sigpage_mapping);
+
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto up_fail;
+	}
 
-	if (ret == 0)
-		mm->context.sigpage = addr;
+	mm->context.sigpage = addr;
 
  up_fail:
 	up_write(&mm->mmap_sem);
-- 
1.9.3


* [PATCH v9 2/6] ARM: place sigpage at a random offset above stack
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 1/6] ARM: use _install_special_mapping for sigpage Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 3/6] ARM: miscellaneous vdso infrastructure, preparation Nathan Lynch
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC
  To: linux-arm-kernel

The sigpage is currently placed alongside shared libraries etc in the
address space.  Similar to what x86_64 does for its VDSO, place the
sigpage at a randomized offset above the stack so that learning the
base address of the sigpage doesn't help expose where shared libraries
are loaded in the address space (and vice versa).
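
The slot selection can be modelled in userspace as below.  This
mirrors the arithmetic of sigpage_addr in this patch, but the 4 KiB
page size, the caller-supplied random value, and the demo_* names are
assumptions made so that the example is self-contained.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT	12UL
#define DEMO_PAGE_SIZE	(1UL << DEMO_PAGE_SHIFT)

/*
 * Pick a page-aligned hint between the top of the stack and
 * task_size.  Returns 0 when there is no room, i.e. no hint.
 */
static unsigned long demo_sigpage_hint(unsigned long start_stack,
				       unsigned long task_size,
				       unsigned int npages,
				       unsigned int rnd)
{
	unsigned long first, last, slots;

	assert(task_size >= ((unsigned long)npages << DEMO_PAGE_SHIFT));

	/* First page boundary at or above the stack start. */
	first = (start_stack + DEMO_PAGE_SIZE - 1) & ~(DEMO_PAGE_SIZE - 1);

	/* Highest address at which npages still fit below task_size. */
	last = task_size - ((unsigned long)npages << DEMO_PAGE_SHIFT);

	if (first > last)	/* no room after stack */
		return 0;

	if (first == last)	/* just enough room */
		return first;

	/* Number of candidate page slots, both ends inclusive. */
	slots = ((last - first) >> DEMO_PAGE_SHIFT) + 1;

	return first + ((rnd % slots) << DEMO_PAGE_SHIFT);
}
```

The result is only a hint: get_unmapped_area may still place the
mapping elsewhere if the hinted range is unavailable.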

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 arch/arm/kernel/process.c | 36 +++++++++++++++++++++++++++++++++++-
 1 file changed, 35 insertions(+), 1 deletion(-)

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 46fbbb3701a0..9e0d931dd475 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -475,6 +475,38 @@ const char *arch_vma_name(struct vm_area_struct *vma)
 	return is_gate_vma(vma) ? "[vectors]" : NULL;
 }
 
+/* If possible, provide a placement hint at a random offset from the
+ * stack for the signal page.
+ */
+static unsigned long sigpage_addr(const struct mm_struct *mm, unsigned int npages)
+{
+	unsigned long offset;
+	unsigned long first;
+	unsigned long last;
+	unsigned long addr;
+	unsigned int slots;
+
+	first = PAGE_ALIGN(mm->start_stack);
+
+	last = TASK_SIZE - (npages << PAGE_SHIFT);
+
+	/* No room after stack? */
+	if (first > last)
+		return 0;
+
+	/* Just enough room? */
+	if (first == last)
+		return first;
+
+	slots = ((last - first) >> PAGE_SHIFT) + 1;
+
+	offset = get_random_int() % slots;
+
+	addr = first + (offset << PAGE_SHIFT);
+
+	return addr;
+}
+
 static struct page *signal_page;
 extern struct page *get_signal_page(void);
 
@@ -488,6 +520,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
 	unsigned long addr;
+	unsigned long hint;
 	int ret = 0;
 
 	if (!signal_page)
@@ -496,7 +529,8 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		return -ENOMEM;
 
 	down_write(&mm->mmap_sem);
-	addr = get_unmapped_area(NULL, 0, PAGE_SIZE, 0, 0);
+	hint = sigpage_addr(mm, 1);
+	addr = get_unmapped_area(NULL, hint, PAGE_SIZE, 0, 0);
 	if (IS_ERR_VALUE(addr)) {
 		ret = addr;
 		goto up_fail;
-- 
1.9.3


* [PATCH v9 3/6] ARM: miscellaneous vdso infrastructure, preparation
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 1/6] ARM: use _install_special_mapping for sigpage Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 2/6] ARM: place sigpage at a random offset above stack Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 4/6] ARM: add vdso user-space code Nathan Lynch
                   ` (4 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC
  To: linux-arm-kernel

Define the layout of the data structure shared between kernel and
userspace.
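
As a rough illustration of how a reader might consume the clocksource
fields of this structure, the sketch below applies the usual
timekeeping conversion from counter cycles to nanoseconds.  It assumes
that xtime_clock_snsec holds nanoseconds pre-shifted left by cs_shift;
the demo_* types are hypothetical stand-ins for struct vdso_data and
this is not the vgettimeofday.c implementation.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Hypothetical subset of the clocksource fields in struct vdso_data. */
struct demo_clock_data {
	uint32_t sec;		/* like xtime_clock_sec */
	uint64_t snsec;		/* like xtime_clock_snsec: ns << shift */
	uint64_t cycle_last;	/* like cs_cycle_last */
	uint64_t mask;		/* like cs_mask */
	uint32_t mult;		/* like cs_mult */
	uint16_t shift;		/* like cs_shift */
};

struct demo_time {
	uint64_t sec;
	uint32_t nsec;
};

/* Convert a raw counter value to seconds and nanoseconds. */
static struct demo_time demo_cycles_to_time(const struct demo_clock_data *d,
					    uint64_t cycles)
{
	struct demo_time t;
	uint64_t delta, ns;

	assert(d->shift < 64);

	/* Cycles elapsed since the last update, modulo counter width. */
	delta = (cycles - d->cycle_last) & d->mask;

	/* Accumulate in shifted units; shift down only once at the end. */
	ns = (d->snsec + delta * d->mult) >> d->shift;

	t.sec = d->sec + ns / DEMO_NSEC_PER_SEC;
	t.nsec = (uint32_t)(ns % DEMO_NSEC_PER_SEC);
	return t;
}
```

Shifting the base and the delta separately would discard low-order
bits from each term, which is one way 1ns inconsistencies can arise.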

Track the vdso address in the mm_context; needed for communicating
AT_SYSINFO_EHDR to the ELF loader.

Add declarations for arm_install_vdso; implementation is in a
following patch.

Define AT_SYSINFO_EHDR, and, if CONFIG_VDSO=y, report the vdso shared
object address via the ELF auxiliary vector.

Note: this adds the AT_SYSINFO_EHDR definition in a new user-visible
header, asm/auxvec.h; this is consistent with other architectures.
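
A userspace program can read this auxiliary vector entry with
getauxval(3).  The sketch below, with hypothetical demo_* names,
checks that the reported address begins with the ELF magic bytes.

```c
#include <assert.h>
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Returns 1 if p points at the "\177ELF" magic, else 0. */
static int demo_is_elf_header(const void *p)
{
	assert(p != NULL);
	return memcmp(p, ELFMAG, SELFMAG) == 0;
}

/*
 * Returns 1 if AT_SYSINFO_EHDR points at an ELF header, 0 if the
 * entry is absent (no VDSO), and -1 if the magic does not match.
 */
static int demo_vdso_present(void)
{
	unsigned long base = getauxval(AT_SYSINFO_EHDR);

	if (base == 0)
		return 0;

	return demo_is_elf_header((const void *)base) ? 1 : -1;
}
```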

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
---
 arch/arm/include/asm/Kbuild          |  1 -
 arch/arm/include/asm/auxvec.h        |  1 +
 arch/arm/include/asm/elf.h           |  9 ++++++
 arch/arm/include/asm/mmu.h           |  3 ++
 arch/arm/include/asm/vdso.h          | 34 ++++++++++++++++++++
 arch/arm/include/asm/vdso_datapage.h | 60 ++++++++++++++++++++++++++++++++++++
 arch/arm/include/uapi/asm/Kbuild     |  1 +
 arch/arm/include/uapi/asm/auxvec.h   |  7 +++++
 8 files changed, 115 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm/include/asm/auxvec.h
 create mode 100644 arch/arm/include/asm/vdso.h
 create mode 100644 arch/arm/include/asm/vdso_datapage.h
 create mode 100644 arch/arm/include/uapi/asm/auxvec.h

diff --git a/arch/arm/include/asm/Kbuild b/arch/arm/include/asm/Kbuild
index 70cd84eb7fda..9fb902425606 100644
--- a/arch/arm/include/asm/Kbuild
+++ b/arch/arm/include/asm/Kbuild
@@ -1,6 +1,5 @@
 
 
-generic-y += auxvec.h
 generic-y += bitsperlong.h
 generic-y += cputime.h
 generic-y += current.h
diff --git a/arch/arm/include/asm/auxvec.h b/arch/arm/include/asm/auxvec.h
new file mode 100644
index 000000000000..fbd388c46299
--- /dev/null
+++ b/arch/arm/include/asm/auxvec.h
@@ -0,0 +1 @@
+#include <uapi/asm/auxvec.h>
diff --git a/arch/arm/include/asm/elf.h b/arch/arm/include/asm/elf.h
index f4b46d39b9cf..e9822ac77efe 100644
--- a/arch/arm/include/asm/elf.h
+++ b/arch/arm/include/asm/elf.h
@@ -1,7 +1,9 @@
 #ifndef __ASMARM_ELF_H
 #define __ASMARM_ELF_H
 
+#include <asm/auxvec.h>
 #include <asm/hwcap.h>
+#include <asm/vdso_datapage.h>
 
 /*
  * ELF register definitions..
@@ -129,6 +131,13 @@ extern unsigned long arch_randomize_brk(struct mm_struct *mm);
 #define arch_randomize_brk arch_randomize_brk
 
 #ifdef CONFIG_MMU
+#ifdef CONFIG_VDSO
+#define ARCH_DLINFO						\
+do {								\
+	NEW_AUX_ENT(AT_SYSINFO_EHDR,				\
+		    (elf_addr_t)current->mm->context.vdso);	\
+} while (0)
+#endif
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
 struct linux_binprm;
 int arch_setup_additional_pages(struct linux_binprm *, int);
diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
index 64fd15159b7d..a5b47421059d 100644
--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
@@ -11,6 +11,9 @@ typedef struct {
 #endif
 	unsigned int	vmalloc_seq;
 	unsigned long	sigpage;
+#ifdef CONFIG_VDSO
+	unsigned long	vdso;
+#endif
 } mm_context_t;
 
 #ifdef CONFIG_CPU_HAS_ASID
diff --git a/arch/arm/include/asm/vdso.h b/arch/arm/include/asm/vdso.h
new file mode 100644
index 000000000000..08d06e73ccf2
--- /dev/null
+++ b/arch/arm/include/asm/vdso.h
@@ -0,0 +1,34 @@
+#ifndef __ASM_VDSO_H
+#define __ASM_VDSO_H
+
+#ifdef __KERNEL__
+
+#ifndef __ASSEMBLY__
+
+struct mm_struct;
+
+#ifdef CONFIG_VDSO
+
+void arm_install_vdso(struct mm_struct *mm, unsigned long addr);
+
+extern char vdso_start, vdso_end;
+
+extern unsigned int vdso_total_pages;
+
+#else /* CONFIG_VDSO */
+
+static inline void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
+{
+}
+
+static const unsigned int vdso_total_pages = 0;
+
+#endif /* CONFIG_VDSO */
+
+#endif /* __ASSEMBLY__ */
+
+#define VDSO_LBASE	0x0
+
+#endif /* __KERNEL__ */
+
+#endif /* __ASM_VDSO_H */
diff --git a/arch/arm/include/asm/vdso_datapage.h b/arch/arm/include/asm/vdso_datapage.h
new file mode 100644
index 000000000000..f08bdb73d3f4
--- /dev/null
+++ b/arch/arm/include/asm/vdso_datapage.h
@@ -0,0 +1,60 @@
+/*
+ * Adapted from arm64 version.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+#ifndef __ASM_VDSO_DATAPAGE_H
+#define __ASM_VDSO_DATAPAGE_H
+
+#ifdef __KERNEL__
+
+#ifndef __ASSEMBLY__
+
+#include <asm/page.h>
+
+/* Try to be cache-friendly on systems that don't implement the
+ * generic timer: fit the unconditionally updated fields in the first
+ * 32 bytes.
+ */
+struct vdso_data {
+	u32 seq_count;		/* sequence count - odd during updates */
+	u16 use_syscall;	/* whether to fall back to syscalls */
+	u16 cs_shift;		/* clocksource shift */
+	u32 xtime_coarse_sec;	/* coarse time */
+	u32 xtime_coarse_nsec;
+
+	u32 wtm_clock_sec;	/* wall to monotonic offset */
+	u32 wtm_clock_nsec;
+	u32 xtime_clock_sec;	/* CLOCK_REALTIME - seconds */
+	u32 cs_mult;		/* clocksource multiplier */
+
+	u64 cs_cycle_last;	/* last cycle value */
+	u64 cs_mask;		/* clocksource mask */
+
+	u64 xtime_clock_snsec;	/* CLOCK_REALTIME sub-ns base */
+	u32 tz_minuteswest;	/* timezone info for gettimeofday(2) */
+	u32 tz_dsttime;
+};
+
+union vdso_data_store {
+	struct vdso_data data;
+	u8 page[PAGE_SIZE];
+};
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __KERNEL__ */
+
+#endif /* __ASM_VDSO_DATAPAGE_H */
diff --git a/arch/arm/include/uapi/asm/Kbuild b/arch/arm/include/uapi/asm/Kbuild
index 70a1c9da30ca..a1c05f93d920 100644
--- a/arch/arm/include/uapi/asm/Kbuild
+++ b/arch/arm/include/uapi/asm/Kbuild
@@ -1,6 +1,7 @@
 # UAPI Header export list
 include include/uapi/asm-generic/Kbuild.asm
 
+header-y += auxvec.h
 header-y += byteorder.h
 header-y += fcntl.h
 header-y += hwcap.h
diff --git a/arch/arm/include/uapi/asm/auxvec.h b/arch/arm/include/uapi/asm/auxvec.h
new file mode 100644
index 000000000000..f56936b97ec2
--- /dev/null
+++ b/arch/arm/include/uapi/asm/auxvec.h
@@ -0,0 +1,7 @@
+#ifndef __ASM_AUXVEC_H
+#define __ASM_AUXVEC_H
+
+/* vDSO location */
+#define AT_SYSINFO_EHDR	33
+
+#endif
-- 
1.9.3


* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
                   ` (2 preceding siblings ...)
  2014-08-22 21:52 ` [PATCH v9 3/6] ARM: miscellaneous vdso infrastructure, preparation Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-09-10 16:47   ` Will Deacon
  2014-08-22 21:52 ` [PATCH v9 5/6] ARM: vdso initialization, mapping, and synchronization Nathan Lynch
                   ` (3 subsequent siblings)
  7 siblings, 1 reply; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC
  To: linux-arm-kernel

Place vdso-related user-space code in arch/arm/vdso/.

It is almost completely written in C with some assembly helpers to
load the data page address, sample the counter, and fall back to
system calls when necessary.

If CONFIG_ARM_ARCH_TIMER is not enabled, the vdso cannot service
high-resolution clocks and falls back to syscalls.  Low-resolution
clocks e.g. CLOCK_REALTIME_COARSE can be serviced regardless.

Of particular note is that a post-processing step ("vdsomunge") is
necessary to produce a shared object which is architecturally allowed
to be used by both soft- and hard-float EABI programs.

The 2012 edition of the ARM ABI defines Tag_ABI_VFP_args = 3 "Code is
compatible with both the base and VFP variants; the user did not
permit non-variadic functions to pass FP parameters/results."
Unfortunately current toolchains do not support this tag, which is
ideally what we would use.

The best available option is to ensure that both EF_ARM_ABI_FLOAT_SOFT
and EF_ARM_ABI_FLOAT_HARD are unset in the ELF header's e_flags,
indicating that the shared object is "old" and should be accepted for
backward compatibility's sake.  While binutils < 2.24 appear to
produce a vdso.so with both flags clear, 2.24 always sets
EF_ARM_ABI_FLOAT_SOFT, with no way to inhibit this behavior.  So we
have to fix things up with a custom post-processing step.

In fact, the VDSO code in glibc does much less validation than the
code for handling conventional file-backed shared libraries (it does
not check these flags, for one), so this is a bit moot unless glibc's
VDSO code becomes more strict.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
---
 arch/arm/kernel/asm-offsets.c |   5 +
 arch/arm/vdso/.gitignore      |   1 +
 arch/arm/vdso/Makefile        |  74 ++++++++++
 arch/arm/vdso/datapage.S      |  15 ++
 arch/arm/vdso/vdso.S          |  35 +++++
 arch/arm/vdso/vdso.lds.S      |  88 ++++++++++++
 arch/arm/vdso/vdsomunge.c     | 208 +++++++++++++++++++++++++++
 arch/arm/vdso/vgettimeofday.c | 320 ++++++++++++++++++++++++++++++++++++++++++
 8 files changed, 746 insertions(+)
 create mode 100644 arch/arm/vdso/.gitignore
 create mode 100644 arch/arm/vdso/Makefile
 create mode 100644 arch/arm/vdso/datapage.S
 create mode 100644 arch/arm/vdso/vdso.S
 create mode 100644 arch/arm/vdso/vdso.lds.S
 create mode 100644 arch/arm/vdso/vdsomunge.c
 create mode 100644 arch/arm/vdso/vgettimeofday.c

diff --git a/arch/arm/kernel/asm-offsets.c b/arch/arm/kernel/asm-offsets.c
index 713e807621d2..72b4c3873dc8 100644
--- a/arch/arm/kernel/asm-offsets.c
+++ b/arch/arm/kernel/asm-offsets.c
@@ -24,6 +24,7 @@
 #include <asm/memory.h>
 #include <asm/procinfo.h>
 #include <asm/suspend.h>
+#include <asm/vdso_datapage.h>
 #include <asm/hardware/cache-l2x0.h>
 #include <linux/kbuild.h>
 
@@ -200,5 +201,9 @@ int main(void)
 #endif
   DEFINE(KVM_VTTBR,		offsetof(struct kvm, arch.vttbr));
 #endif
+  BLANK();
+#ifdef CONFIG_VDSO
+  DEFINE(VDSO_DATA_SIZE,	sizeof(union vdso_data_store));
+#endif
   return 0; 
 }
diff --git a/arch/arm/vdso/.gitignore b/arch/arm/vdso/.gitignore
new file mode 100644
index 000000000000..f8b69d84238e
--- /dev/null
+++ b/arch/arm/vdso/.gitignore
@@ -0,0 +1 @@
+vdso.lds
diff --git a/arch/arm/vdso/Makefile b/arch/arm/vdso/Makefile
new file mode 100644
index 000000000000..bab0a8be7924
--- /dev/null
+++ b/arch/arm/vdso/Makefile
@@ -0,0 +1,74 @@
+hostprogs-y := vdsomunge
+
+obj-vdso := vgettimeofday.o datapage.o
+
+# Build rules
+targets := $(obj-vdso) vdso.so vdso.so.dbg vdso.so.raw vdso.lds
+obj-vdso := $(addprefix $(obj)/, $(obj-vdso))
+
+ccflags-y := -shared -fPIC -fno-common -fno-builtin -fno-stack-protector
+ccflags-y += -nostdlib -Wl,-soname=linux-vdso.so.1 -DDISABLE_BRANCH_PROFILING
+ccflags-y += -Wl,--no-undefined $(call cc-ldoption, -Wl$(comma)--hash-style=sysv)
+
+obj-y += vdso.o
+extra-y += vdso.lds
+CPPFLAGS_vdso.lds += -P -C -U$(ARCH)
+
+CFLAGS_REMOVE_vdso.o = -pg
+
+# Force -O2 to avoid libgcc dependencies
+CFLAGS_REMOVE_vgettimeofday.o = -pg -Os
+CFLAGS_vgettimeofday.o = -O2
+
+# Disable gcov profiling for VDSO code
+GCOV_PROFILE := n
+
+# Force dependency
+$(obj)/vdso.o : $(obj)/vdso.so
+
+# Link rule for the .so file
+$(obj)/vdso.so.raw: $(src)/vdso.lds $(obj-vdso) FORCE
+	$(call if_changed,vdsold)
+
+$(obj)/vdso.so.dbg: $(obj)/vdso.so.raw $(obj)/vdsomunge FORCE
+	$(call if_changed,vdsomunge)
+
+# Strip rule for the .so file
+$(obj)/%.so: OBJCOPYFLAGS := -S
+$(obj)/%.so: $(obj)/%.so.dbg FORCE
+	$(call if_changed,objcopy)
+
+# Actual build commands
+quiet_cmd_vdsold = VDSO    $@
+      cmd_vdsold = $(CC) $(c_flags) -Wl,-T $(filter %.lds,$^) $(filter %.o,$^) \
+                   $(call cc-ldoption, -Wl$(comma)--build-id) \
+                   -Wl,-Bsymbolic -Wl,-z,max-page-size=4096 \
+                   -Wl,-z,common-page-size=4096 -o $@
+
+quiet_cmd_vdsomunge = MUNGE   $@
+      cmd_vdsomunge = $(objtree)/$(obj)/vdsomunge $< $@
+
+#
+# Install the unstripped copy of vdso.so.dbg.  If our toolchain
+# supports build-id, install .build-id links as well.
+#
+# Cribbed from arch/x86/vdso/Makefile.
+#
+quiet_cmd_vdso_install = INSTALL $<
+define cmd_vdso_install
+	cp $< "$(MODLIB)/vdso/vdso.so"; \
+	if readelf -n $< | grep -q 'Build ID'; then \
+	  buildid=`readelf -n $< |grep 'Build ID' |sed -e 's/^.*Build ID: \(.*\)$$/\1/'`; \
+	  first=`echo $$buildid | cut -b-2`; \
+	  last=`echo $$buildid | cut -b3-`; \
+	  mkdir -p "$(MODLIB)/vdso/.build-id/$$first"; \
+	  ln -sf "../../vdso.so" "$(MODLIB)/vdso/.build-id/$$first/$$last.debug"; \
+	fi
+endef
+
+$(MODLIB)/vdso: FORCE
+	@mkdir -p $(MODLIB)/vdso
+
+PHONY += vdso_install
+vdso_install: $(obj)/vdso.so.dbg $(MODLIB)/vdso FORCE
+	$(call cmd,vdso_install)
diff --git a/arch/arm/vdso/datapage.S b/arch/arm/vdso/datapage.S
new file mode 100644
index 000000000000..a2e60367931b
--- /dev/null
+++ b/arch/arm/vdso/datapage.S
@@ -0,0 +1,15 @@
+#include <linux/linkage.h>
+#include <asm/asm-offsets.h>
+
+	.align 2
+.L_vdso_data_ptr:
+	.long	_start - . - VDSO_DATA_SIZE
+
+ENTRY(__get_datapage)
+	.fnstart
+	adr	r0, .L_vdso_data_ptr
+	ldr	r1, [r0]
+	add	r0, r0, r1
+	bx	lr
+	.fnend
+ENDPROC(__get_datapage)
diff --git a/arch/arm/vdso/vdso.S b/arch/arm/vdso/vdso.S
new file mode 100644
index 000000000000..b2b97e3e7bab
--- /dev/null
+++ b/arch/arm/vdso/vdso.S
@@ -0,0 +1,35 @@
+/*
+ * Adapted from arm64 version.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ */
+
+#include <linux/init.h>
+#include <linux/linkage.h>
+#include <linux/const.h>
+#include <asm/page.h>
+
+	__PAGE_ALIGNED_DATA
+
+	.globl vdso_start, vdso_end
+	.balign PAGE_SIZE
+vdso_start:
+	.incbin "arch/arm/vdso/vdso.so"
+	.balign PAGE_SIZE
+vdso_end:
+
+	.previous
diff --git a/arch/arm/vdso/vdso.lds.S b/arch/arm/vdso/vdso.lds.S
new file mode 100644
index 000000000000..dabdebcab17e
--- /dev/null
+++ b/arch/arm/vdso/vdso.lds.S
@@ -0,0 +1,88 @@
+/*
+ * Adapted from arm64 version.
+ *
+ * GNU linker script for the VDSO library.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ * Author: Will Deacon <will.deacon@arm.com>
+ * Heavily based on the vDSO linker scripts for other archs.
+ */
+
+#include <linux/const.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+OUTPUT_FORMAT("elf32-littlearm", "elf32-bigarm", "elf32-littlearm")
+OUTPUT_ARCH(arm)
+
+SECTIONS
+{
+	PROVIDE(_start = .);
+
+	. = VDSO_LBASE + SIZEOF_HEADERS;
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text	:note
+
+
+	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text	:eh_frame_hdr
+	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.dynamic	: { *(.dynamic) }		:text	:dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	.text		: { *(.text*) }			:text	=0xe7f001f2
+
+	.got		: { *(.got) }
+	.rel.plt	: { *(.rel.plt) }
+
+	/DISCARD/	: {
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+/*
+ * We must supply the ELF program headers explicitly to get just one
+ * PT_LOAD segment, and set the flags explicitly to make segments read-only.
+ */
+PHDRS
+{
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+	eh_frame_hdr	PT_GNU_EH_FRAME;
+}
+
+VERSION
+{
+	LINUX_2.6 {
+	global:
+		__vdso_clock_getres;
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
+	local: *;
+	};
+}
diff --git a/arch/arm/vdso/vdsomunge.c b/arch/arm/vdso/vdsomunge.c
new file mode 100644
index 000000000000..b586ef699fb8
--- /dev/null
+++ b/arch/arm/vdso/vdsomunge.c
@@ -0,0 +1,208 @@
+/*
+ * Copyright 2014 Mentor Graphics Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ *
+ *
+ * vdsomunge - Host program which produces a shared object
+ * architecturally specified to be usable by both soft- and hard-float
+ * programs.
+ *
+ * The Procedure Call Standard for the ARM Architecture (ARM IHI
+ * 0042E) says:
+ *
+ *	6.4.1 VFP and Base Standard Compatibility
+ *
+ *	Code compiled for the VFP calling standard is compatible with
+ *	the base standard (and vice-versa) if no floating-point or
+ *	containerized vector arguments or results are used.
+ *
+ * And ELF for the ARM Architecture (ARM IHI 0044E) (Table 4-2) says:
+ *
+ *	If both EF_ARM_ABI_FLOAT_XXXX bits are clear, conformance to the
+ *	base procedure-call standard is implied.
+ *
+ * The VDSO is built with -msoft-float, as with the rest of the ARM
+ * kernel, and uses no floating point arguments or results.  The build
+ * process will produce a shared object that may or may not have the
+ * EF_ARM_ABI_FLOAT_SOFT flag set (it seems to depend on the binutils
+ * version; binutils starting with 2.24 appears to set it).  The
+ * EF_ARM_ABI_FLOAT_HARD flag should definitely not be set, and this
+ * program will error out if it is.
+ *
+ * If the soft-float flag is set, this program clears it.  That's all
+ * it does.
+ */
+
+#define _GNU_SOURCE
+
+#include <byteswap.h>
+#include <elf.h>
+#include <errno.h>
+#include <error.h>
+#include <fcntl.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+#include <unistd.h>
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#define HOST_ORDER ELFDATA2LSB
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+#define HOST_ORDER ELFDATA2MSB
+#endif
+
+/* Some of the ELF constants we'd like to use were added to <elf.h>
+ * relatively recently.
+ */
+#ifndef EF_ARM_EABI_VER5
+#define EF_ARM_EABI_VER5 0x05000000
+#endif
+
+#ifndef EF_ARM_ABI_FLOAT_SOFT
+#define EF_ARM_ABI_FLOAT_SOFT 0x200
+#endif
+
+#ifndef EF_ARM_ABI_FLOAT_HARD
+#define EF_ARM_ABI_FLOAT_HARD 0x400
+#endif
+
+static const char *outfile;
+
+static void cleanup(void)
+{
+	if (error_message_count > 0 && outfile != NULL)
+		unlink(outfile);
+}
+
+static Elf32_Word read_elf_word(Elf32_Word word, bool swap)
+{
+	return swap ? bswap_32(word) : word;
+}
+
+static Elf32_Half read_elf_half(Elf32_Half half, bool swap)
+{
+	return swap ? bswap_16(half) : half;
+}
+
+static void write_elf_word(Elf32_Word val, Elf32_Word *dst, bool swap)
+{
+	*dst = swap ? bswap_32(val) : val;
+}
+
+int main(int argc, char **argv)
+{
+	const Elf32_Ehdr *inhdr;
+	bool clear_soft_float;
+	const char *infile;
+	Elf32_Word e_flags;
+	const void *inbuf;
+	struct stat stat;
+	void *outbuf;
+	bool swap;
+	int outfd;
+	int infd;
+
+	atexit(cleanup);
+
+	if (argc != 3)
+		error(EXIT_FAILURE, 0, "Usage: %s <infile> <outfile>", argv[0]);
+
+	infile = argv[1];
+	outfile = argv[2];
+
+	infd = open(infile, O_RDONLY);
+	if (infd < 0)
+		error(EXIT_FAILURE, errno, "Cannot open %s", infile);
+
+	if (fstat(infd, &stat) != 0)
+		error(EXIT_FAILURE, errno, "Failed stat for %s", infile);
+
+	inbuf = mmap(NULL, stat.st_size, PROT_READ, MAP_PRIVATE, infd, 0);
+	if (inbuf == MAP_FAILED)
+		error(EXIT_FAILURE, errno, "Failed to map %s", infile);
+
+	close(infd);
+
+	inhdr = inbuf;
+
+	if (memcmp(&inhdr->e_ident, ELFMAG, SELFMAG) != 0)
+		error(EXIT_FAILURE, 0, "Not an ELF file");
+
+	if (inhdr->e_ident[EI_CLASS] != ELFCLASS32)
+		error(EXIT_FAILURE, 0, "Unsupported ELF class");
+
+	swap = inhdr->e_ident[EI_DATA] != HOST_ORDER;
+
+	if (read_elf_half(inhdr->e_type, swap) != ET_DYN)
+		error(EXIT_FAILURE, 0, "Not a shared object");
+
+	if (read_elf_half(inhdr->e_machine, swap) != EM_ARM) {
+		error(EXIT_FAILURE, 0, "Unsupported architecture %#x",
+		      read_elf_half(inhdr->e_machine, swap));
+	}
+
+	e_flags = read_elf_word(inhdr->e_flags, swap);
+
+	if (EF_ARM_EABI_VERSION(e_flags) != EF_ARM_EABI_VER5) {
+		error(EXIT_FAILURE, 0, "Unsupported EABI version %#x",
+		      EF_ARM_EABI_VERSION(e_flags));
+	}
+
+	if (e_flags & EF_ARM_ABI_FLOAT_HARD)
+		error(EXIT_FAILURE, 0, "Unexpected hard-float flag set in "
+		      "e_flags");
+
+	clear_soft_float = !!(e_flags & EF_ARM_ABI_FLOAT_SOFT);
+
+	outfd = open(outfile, O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
+	if (outfd < 0)
+		error(EXIT_FAILURE, errno, "Cannot open %s", outfile);
+
+	if (ftruncate(outfd, stat.st_size) != 0)
+		error(EXIT_FAILURE, errno, "Cannot truncate %s", outfile);
+
+	outbuf = mmap(NULL, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED,
+		      outfd, 0);
+	if (outbuf == MAP_FAILED)
+		error(EXIT_FAILURE, errno, "Failed to map %s", outfile);
+
+	close(outfd);
+
+	memcpy(outbuf, inbuf, stat.st_size);
+
+	if (clear_soft_float) {
+		Elf32_Ehdr *outhdr;
+
+		outhdr = outbuf;
+		e_flags &= ~EF_ARM_ABI_FLOAT_SOFT;
+		write_elf_word(e_flags, &outhdr->e_flags, swap);
+
+#ifdef DEBUG
+		printf("%s: cleared soft-float bit in ELF header for %s "
+		       "(%#x => %#x)\n", program_invocation_short_name,
+		       outfile, inhdr->e_flags, outhdr->e_flags);
+#endif
+
+	}
+
+	if (msync(outbuf, stat.st_size, MS_SYNC) != 0)
+		error(EXIT_FAILURE, errno, "Failed to sync %s", outfile);
+
+	return EXIT_SUCCESS;
+}
diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
new file mode 100644
index 000000000000..cb3201b6cd71
--- /dev/null
+++ b/arch/arm/vdso/vgettimeofday.c
@@ -0,0 +1,320 @@
+/*
+ * Copyright 2014 Mentor Graphics Corporation.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public License
+ * as published by the Free Software Foundation; version 2 of the
+ * License.
+ *
+ * This program is distributed in the hope that it will be useful, but
+ * WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/compiler.h>
+#include <linux/hrtimer.h>
+#include <linux/time.h>
+#include <asm/arch_timer.h>
+#include <asm/barrier.h>
+#include <asm/bug.h>
+#include <asm/page.h>
+#include <asm/unistd.h>
+#include <asm/vdso_datapage.h>
+
+#ifndef CONFIG_AEABI
+#error This code depends on AEABI system call conventions
+#endif
+
+extern struct vdso_data *__get_datapage(void);
+
+static u32 __vdso_read_begin(const struct vdso_data *vdata)
+{
+	u32 seq;
+repeat:
+	seq = ACCESS_ONCE(vdata->seq_count);
+	if (seq & 1) {
+		cpu_relax();
+		goto repeat;
+	}
+	return seq;
+}
+
+static u32 vdso_read_begin(const struct vdso_data *vdata)
+{
+	u32 seq = __vdso_read_begin(vdata);
+	smp_rmb();
+	return seq;
+}
+
+static int vdso_read_retry(const struct vdso_data *vdata, u32 start)
+{
+	smp_rmb();
+	return vdata->seq_count != start;
+}
+
+static long clock_gettime_fallback(clockid_t _clkid, struct timespec *_ts)
+{
+	register struct timespec *ts asm("r1") = _ts;
+	register clockid_t clkid asm("r0") = _clkid;
+	register long ret asm ("r0");
+	register long nr asm("r7") = __NR_clock_gettime;
+
+	asm volatile(
+	"	swi #0\n"
+	: "=r" (ret)
+	: "r" (clkid), "r" (ts), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+static int do_realtime_coarse(struct timespec *ts, struct vdso_data *vdata)
+{
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		ts->tv_sec = vdata->xtime_coarse_sec;
+		ts->tv_nsec = vdata->xtime_coarse_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	return 0;
+}
+
+static int do_monotonic_coarse(struct timespec *ts, struct vdso_data *vdata)
+{
+	struct timespec tomono;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		ts->tv_sec = vdata->xtime_coarse_sec;
+		ts->tv_nsec = vdata->xtime_coarse_nsec;
+
+		tomono.tv_sec = vdata->wtm_clock_sec;
+		tomono.tv_nsec = vdata->wtm_clock_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_sec += tomono.tv_sec;
+	timespec_add_ns(ts, tomono.tv_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_ARM_ARCH_TIMER
+
+static u64 get_ns(struct vdso_data *vdata)
+{
+	u64 cycle_delta;
+	u64 cycle_now;
+	u64 nsec;
+
+	cycle_now = arch_counter_get_cntvct();
+
+	cycle_delta = (cycle_now - vdata->cs_cycle_last) & vdata->cs_mask;
+
+	nsec = (cycle_delta * vdata->cs_mult) + vdata->xtime_clock_snsec;
+	nsec >>= vdata->cs_shift;
+
+	return nsec;
+}
+
+static int do_realtime(struct timespec *ts, struct vdso_data *vdata)
+{
+	u64 nsecs;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		if (vdata->use_syscall)
+			return -1;
+
+		ts->tv_sec = vdata->xtime_clock_sec;
+		nsecs = get_ns(vdata);
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, nsecs);
+
+	return 0;
+}
+
+static int do_monotonic(struct timespec *ts, struct vdso_data *vdata)
+{
+	struct timespec tomono;
+	u64 nsecs;
+	u32 seq;
+
+	do {
+		seq = vdso_read_begin(vdata);
+
+		if (vdata->use_syscall)
+			return -1;
+
+		ts->tv_sec = vdata->xtime_clock_sec;
+		nsecs = get_ns(vdata);
+
+		tomono.tv_sec = vdata->wtm_clock_sec;
+		tomono.tv_nsec = vdata->wtm_clock_nsec;
+
+	} while (vdso_read_retry(vdata, seq));
+
+	ts->tv_sec += tomono.tv_sec;
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, nsecs + tomono.tv_nsec);
+
+	return 0;
+}
+
+#else /* CONFIG_ARM_ARCH_TIMER */
+
+static int do_realtime(struct timespec *ts, struct vdso_data *vdata)
+{
+	return -1;
+}
+
+static int do_monotonic(struct timespec *ts, struct vdso_data *vdata)
+{
+	return -1;
+}
+
+#endif /* CONFIG_ARM_ARCH_TIMER */
+
+int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
+{
+	struct vdso_data *vdata;
+	int ret = -1;
+
+	vdata = __get_datapage();
+
+	switch (clkid) {
+	case CLOCK_REALTIME_COARSE:
+		ret = do_realtime_coarse(ts, vdata);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = do_monotonic_coarse(ts, vdata);
+		break;
+	case CLOCK_REALTIME:
+		ret = do_realtime(ts, vdata);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = do_monotonic(ts, vdata);
+		break;
+	default:
+		break;
+	}
+
+	if (ret)
+		ret = clock_gettime_fallback(clkid, ts);
+
+	return ret;
+}
+
+static long clock_getres_fallback(clockid_t _clkid, struct timespec *_ts)
+{
+	register struct timespec *ts asm("r1") = _ts;
+	register clockid_t clkid asm("r0") = _clkid;
+	register long ret asm ("r0");
+	register long nr asm("r7") = __NR_clock_getres;
+
+	asm volatile(
+	"	swi #0\n"
+	: "=r" (ret)
+	: "r" (clkid), "r" (ts), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+int __vdso_clock_getres(clockid_t clkid, struct timespec *ts)
+{
+	int ret;
+
+	switch (clkid) {
+	case CLOCK_REALTIME:
+	case CLOCK_MONOTONIC:
+		if (ts) {
+			ts->tv_sec = 0;
+			ts->tv_nsec = MONOTONIC_RES_NSEC;
+		}
+		ret = 0;
+		break;
+	case CLOCK_REALTIME_COARSE:
+	case CLOCK_MONOTONIC_COARSE:
+		if (ts) {
+			ts->tv_sec = 0;
+			ts->tv_nsec = LOW_RES_NSEC;
+		}
+		ret = 0;
+		break;
+	default:
+		ret = clock_getres_fallback(clkid, ts);
+		break;
+	}
+
+	return ret;
+}
+
+static long gettimeofday_fallback(struct timeval *_tv, struct timezone *_tz)
+{
+	register struct timezone *tz asm("r1") = _tz;
+	register struct timeval *tv asm("r0") = _tv;
+	register long ret asm ("r0");
+	register long nr asm("r7") = __NR_gettimeofday;
+
+	asm volatile(
+	"	swi #0\n"
+	: "=r" (ret)
+	: "r" (tv), "r" (tz), "r" (nr)
+	: "memory");
+
+	return ret;
+}
+
+int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
+{
+	struct timespec ts;
+	struct vdso_data *vdata;
+	int ret;
+
+	vdata = __get_datapage();
+
+	ret = do_realtime(&ts, vdata);
+	if (ret)
+		return gettimeofday_fallback(tv, tz);
+
+	if (tv) {
+		tv->tv_sec = ts.tv_sec;
+		tv->tv_usec = ts.tv_nsec / 1000;
+	}
+	if (tz) {
+		tz->tz_minuteswest = vdata->tz_minuteswest;
+		tz->tz_dsttime = vdata->tz_dsttime;
+	}
+
+	return ret;
+}
+
+/* Avoid unresolved references emitted by GCC */
+
+void __aeabi_unwind_cpp_pr0(void)
+{
+}
+
+void __aeabi_unwind_cpp_pr1(void)
+{
+}
+
+void __aeabi_unwind_cpp_pr2(void)
+{
+}
-- 
1.9.3
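
For readers following get_ns() and do_realtime() above, the arithmetic
can be sketched in standalone userspace C.  This is only an
illustration under stated assumptions, not the patch's code: struct
demo_clock, struct demo_time and demo_cycles_to_time() are hypothetical
names, and the values in the note below are made up.  It shows the
shifted-nanosecond remainder being added before the right shift, so
that sub-nanosecond precision is not discarded early.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Hypothetical snapshot of the clocksource fields a reader would copy. */
struct demo_clock {
	uint64_t cycle_last;	/* counter value at the last update */
	uint64_t mask;		/* clocksource mask */
	uint32_t mult;		/* cycles to shifted-nanoseconds multiplier */
	uint32_t shift;		/* scaling shift */
	uint64_t snsec;		/* shifted nanoseconds at the last update */
	uint64_t sec;		/* seconds at the last update */
};

struct demo_time {
	uint64_t sec;
	uint64_t nsec;
};

/* Convert a raw counter reading into seconds and nanoseconds. */
static struct demo_time demo_cycles_to_time(const struct demo_clock *c,
					     uint64_t cycle_now)
{
	uint64_t delta, nsec;
	struct demo_time t;

	assert(c->shift < 64);

	delta = (cycle_now - c->cycle_last) & c->mask;

	/* Add the shifted remainder first, then shift once. */
	nsec = (delta * c->mult + c->snsec) >> c->shift;

	/* Fold whole seconds out of the nanosecond count. */
	t.sec = c->sec + nsec / DEMO_NSEC_PER_SEC;
	t.nsec = nsec % DEMO_NSEC_PER_SEC;
	return t;
}
```

With mult = 3, shift = 1 and a shifted remainder of 1, one elapsed
cycle gives (3 + 1) >> 1 = 2 ns, whereas shifting each term separately
would give 1 + 0 = 1 ns.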

* [PATCH v9 5/6] ARM: vdso initialization, mapping, and synchronization
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
                   ` (3 preceding siblings ...)
  2014-08-22 21:52 ` [PATCH v9 4/6] ARM: add vdso user-space code Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-08-22 21:52 ` [PATCH v9 6/6] ARM: add CONFIG_VDSO Kconfig and Makefile bits Nathan Lynch
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC (permalink / raw)
  To: linux-arm-kernel

Initialize the vdso page list at boot, install the vdso mapping at
exec time, and update the data page during timer ticks.  This code is
not built if CONFIG_VDSO is not enabled.

Account for the vdso length when randomizing the offset from the
stack.  The vdso is placed immediately following the sigpage, using
separate _install_special_mapping calls made from arm_install_vdso for
the [vvar] data page and the [vdso] text.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
---
 arch/arm/kernel/process.c |  17 +++-
 arch/arm/kernel/vdso.c    | 207 ++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 221 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm/kernel/vdso.c

diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index 9e0d931dd475..7aa1025aff1f 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -41,6 +41,7 @@
 #include <asm/system_misc.h>
 #include <asm/mach/time.h>
 #include <asm/tls.h>
+#include <asm/vdso.h>
 
 #ifdef CONFIG_CC_STACKPROTECTOR
 #include <linux/stackprotector.h>
@@ -476,7 +477,7 @@ const char *arch_vma_name(struct vm_area_struct *vma)
 }
 
 /* If possible, provide a placement hint at a random offset from the
- * stack for the signal page.
+ * stack for the sigpage and vdso pages.
  */
 static unsigned long sigpage_addr(const struct mm_struct *mm, unsigned int npages)
 {
@@ -519,6 +520,7 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma;
+	unsigned long npages;
 	unsigned long addr;
 	unsigned long hint;
 	int ret = 0;
@@ -528,9 +530,12 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 	if (!signal_page)
 		return -ENOMEM;
 
+	npages = 1; /* for sigpage */
+	npages += vdso_total_pages;
+
 	down_write(&mm->mmap_sem);
-	hint = sigpage_addr(mm, 1);
-	addr = get_unmapped_area(NULL, hint, PAGE_SIZE, 0, 0);
+	hint = sigpage_addr(mm, npages);
+	addr = get_unmapped_area(NULL, hint, npages << PAGE_SHIFT, 0, 0);
 	if (IS_ERR_VALUE(addr)) {
 		ret = addr;
 		goto up_fail;
@@ -547,6 +552,12 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 
 	mm->context.sigpage = addr;
 
+	/* Unlike the sigpage, failure to install the vdso is unlikely
+	 * to be fatal to the process, so no error check needed
+	 * here.
+	 */
+	arm_install_vdso(mm, addr + PAGE_SIZE);
+
  up_fail:
 	up_write(&mm->mmap_sem);
 	return ret;
diff --git a/arch/arm/kernel/vdso.c b/arch/arm/kernel/vdso.c
new file mode 100644
index 000000000000..7bd393f588ea
--- /dev/null
+++ b/arch/arm/kernel/vdso.c
@@ -0,0 +1,207 @@
+/*
+ * Adapted from arm64 version.
+ *
+ * Copyright (C) 2012 ARM Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program.  If not, see <http://www.gnu.org/licenses/>.
+ */
+
+#include <linux/err.h>
+#include <linux/kernel.h>
+#include <linux/mm.h>
+#include <linux/slab.h>
+#include <linux/timekeeper_internal.h>
+#include <linux/vmalloc.h>
+
+#include <asm/barrier.h>
+#include <asm/cacheflush.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
+
+#include <asm/arch_timer.h>
+
+static struct page **vdso_text_pagelist;
+
+/* Total number of pages needed for the data and text portions of the VDSO. */
+unsigned int vdso_total_pages __read_mostly;
+
+/*
+ * The vDSO data page.
+ */
+static union vdso_data_store vdso_data_store __page_aligned_data;
+static struct vdso_data *vdso_data = &vdso_data_store.data;
+
+static struct page *vdso_data_page;
+static struct vm_special_mapping vdso_data_mapping = {
+	.name = "[vvar]",
+	.pages = &vdso_data_page,
+};
+
+static struct vm_special_mapping vdso_text_mapping = {
+	.name = "[vdso]",
+};
+
+static int __init vdso_init(void)
+{
+	unsigned int text_pages;
+	int i;
+
+	if (memcmp(&vdso_start, "\177ELF", 4)) {
+		pr_err("vDSO is not a valid ELF object!\n");
+		return -ENOEXEC;
+	}
+
+	text_pages = (&vdso_end - &vdso_start) >> PAGE_SHIFT;
+	pr_debug("vdso: %i text pages at base %p\n", text_pages, &vdso_start);
+
+	/* Allocate the vDSO text pagelist */
+	vdso_text_pagelist = kcalloc(text_pages, sizeof(struct page *),
+				     GFP_KERNEL);
+	if (vdso_text_pagelist == NULL)
+		return -ENOMEM;
+
+	/* Grab the vDSO data page. */
+	vdso_data_page = virt_to_page(vdso_data);
+
+	/* Grab the vDSO text pages. */
+	for (i = 0; i < text_pages; i++)
+		vdso_text_pagelist[i] = virt_to_page(&vdso_start + i * PAGE_SIZE);
+
+	vdso_text_mapping.pages = vdso_text_pagelist;
+
+	vdso_total_pages = 1; /* for the data/vvar page */
+	vdso_total_pages += text_pages;
+
+	return 0;
+}
+arch_initcall(vdso_init);
+
+static int install_vvar(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma;
+
+	vma = _install_special_mapping(mm, addr, PAGE_SIZE,
+				       VM_READ | VM_MAYREAD,
+				       &vdso_data_mapping);
+
+	return IS_ERR(vma) ? PTR_ERR(vma) : 0;
+}
+
+/* assumes mmap_sem is write-locked */
+void arm_install_vdso(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma;
+	unsigned long len;
+
+	mm->context.vdso = 0;
+
+	if (vdso_text_pagelist == NULL)
+		return;
+
+	if (install_vvar(mm, addr))
+		return;
+
+	/* Account for vvar page. */
+	addr += PAGE_SIZE;
+	len = (vdso_total_pages - 1) << PAGE_SHIFT;
+
+	vma = _install_special_mapping(mm, addr, len,
+		VM_READ | VM_EXEC | VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
+		&vdso_text_mapping);
+
+	if (!IS_ERR(vma))
+		mm->context.vdso = addr;
+}
+
+static void vdso_write_begin(struct vdso_data *vdata)
+{
+	++vdata->seq_count;
+	smp_wmb();
+}
+
+static void vdso_write_end(struct vdso_data *vdata)
+{
+	smp_wmb();
+	++vdata->seq_count;
+}
+
+static bool vdso_can_use_arch_timer(const struct timekeeper *tk)
+{
+#ifdef CONFIG_ARM_ARCH_TIMER
+	u32 cntkctl;
+
+	if (strcmp(tk->tkr.clock->name, "arch_sys_counter") != 0)
+		return false;
+
+	cntkctl = arch_timer_get_cntkctl();
+
+	return cntkctl & ARCH_TIMER_USR_VCT_ACCESS_EN;
+#else
+	return false;
+#endif
+}
+
+/**
+ * update_vsyscall - update the vdso data page
+ *
+ * Increment the sequence counter, making it odd, indicating to
+ * userspace that an update is in progress.  Update the fields used
+ * for coarse clocks and, if the architected system timer is in use,
+ * the fields used for high precision clocks.  Increment the sequence
+ * counter again, making it even, indicating to userspace that the
+ * update is finished.
+ *
+ * Userspace is expected to sample seq_count before reading any other
+ * fields from the data page.  If seq_count is odd, userspace is
+ * expected to wait until it becomes even.  After copying data from
+ * the page, userspace must sample seq_count again; if it has changed
+ * from its previous value, userspace must retry the whole sequence.
+ *
+ * Calls to update_vsyscall are serialized by the timekeeping core.
+ */
+void update_vsyscall(struct timekeeper *tk)
+{
+	struct timespec xtime_coarse;
+	struct timespec64 *wtm = &tk->wall_to_monotonic;
+	bool use_syscall = !vdso_can_use_arch_timer(tk);
+
+	vdso_write_begin(vdso_data);
+
+	xtime_coarse = __current_kernel_time();
+	vdso_data->use_syscall			= use_syscall;
+	vdso_data->xtime_coarse_sec		= xtime_coarse.tv_sec;
+	vdso_data->xtime_coarse_nsec		= xtime_coarse.tv_nsec;
+	vdso_data->wtm_clock_sec		= wtm->tv_sec;
+	vdso_data->wtm_clock_nsec		= wtm->tv_nsec;
+
+	if (!use_syscall) {
+		vdso_data->cs_cycle_last	= tk->tkr.cycle_last;
+		vdso_data->xtime_clock_sec	= tk->xtime_sec;
+		vdso_data->xtime_clock_snsec	= tk->tkr.xtime_nsec;
+		vdso_data->cs_mult		= tk->tkr.mult;
+		vdso_data->cs_shift		= tk->tkr.shift;
+		vdso_data->cs_mask		= tk->tkr.mask;
+	}
+
+	vdso_write_end(vdso_data);
+
+	flush_dcache_page(virt_to_page(vdso_data));
+}
+
+void update_vsyscall_tz(void)
+{
+	vdso_data->tz_minuteswest	= sys_tz.tz_minuteswest;
+	vdso_data->tz_dsttime		= sys_tz.tz_dsttime;
+	flush_dcache_page(virt_to_page(vdso_data));
+}
-- 
1.9.3
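
The update protocol documented above update_vsyscall() can be sketched
in portable userspace C.  This is a rough illustration, not the
kernel's implementation: the patch uses smp_wmb()/smp_rmb() and plain
loads, while this sketch swaps in C11 atomics and fences.  struct
demo_page and the demo_* helpers are hypothetical names, and the reader
makes a single attempt and reports failure instead of spinning.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the data page: a sequence counter and a payload. */
struct demo_page {
	atomic_uint seq;
	_Atomic uint32_t sec;
	_Atomic uint32_t nsec;
};

/* Writer side: make the count odd before touching the payload. */
static void demo_write_begin(struct demo_page *p)
{
	unsigned int seq = atomic_load_explicit(&p->seq, memory_order_relaxed);

	atomic_store_explicit(&p->seq, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
}

/* Writer side: make the count even again once the payload is complete. */
static void demo_write_end(struct demo_page *p)
{
	unsigned int seq = atomic_load_explicit(&p->seq, memory_order_relaxed);

	atomic_store_explicit(&p->seq, seq + 1, memory_order_release);
}

/*
 * Reader side, one attempt.  Returns false if an update was in progress
 * or completed while the payload was being copied; the caller retries.
 */
static bool demo_try_read(struct demo_page *p, uint32_t *sec, uint32_t *nsec)
{
	unsigned int start = atomic_load_explicit(&p->seq, memory_order_acquire);
	uint32_t s, n;

	if (start & 1)
		return false;

	s = atomic_load_explicit(&p->sec, memory_order_relaxed);
	n = atomic_load_explicit(&p->nsec, memory_order_relaxed);

	atomic_thread_fence(memory_order_acquire);
	if (atomic_load_explicit(&p->seq, memory_order_relaxed) != start)
		return false;

	*sec = s;
	*nsec = n;
	return true;
}
```

A caller would loop on demo_try_read() until it returns true.  As in
the patch, only a single writer is assumed; update_vsyscall relies on
the timekeeping core for that serialization.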

* [PATCH v9 6/6] ARM: add CONFIG_VDSO Kconfig and Makefile bits
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
                   ` (4 preceding siblings ...)
  2014-08-22 21:52 ` [PATCH v9 5/6] ARM: vdso initialization, mapping, and synchronization Nathan Lynch
@ 2014-08-22 21:52 ` Nathan Lynch
  2014-08-27 20:49 ` [PATCH v9 0/6] ARM: VDSO Christopher Covington
  2014-09-06  2:32 ` Nathan Lynch
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-08-22 21:52 UTC (permalink / raw)
  To: linux-arm-kernel

Allow users to enable the vdso in Kconfig; include the vdso in the
build if CONFIG_VDSO is enabled.  Add 'vdso_install' target.

Signed-off-by: Nathan Lynch <nathan_lynch@mentor.com>
---
 arch/arm/Makefile        |  8 ++++++++
 arch/arm/kernel/Makefile |  1 +
 arch/arm/mm/Kconfig      | 15 +++++++++++++++
 3 files changed, 24 insertions(+)

diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 0ce9d0f71f2a..109c011ef4ed 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -264,6 +264,7 @@ core-$(CONFIG_FPE_FASTFPE)	+= $(FASTFPE_OBJ)
 core-$(CONFIG_VFP)		+= arch/arm/vfp/
 core-$(CONFIG_XEN)		+= arch/arm/xen/
 core-$(CONFIG_KVM_ARM_HOST) 	+= arch/arm/kvm/
+core-$(CONFIG_VDSO)		+= arch/arm/vdso/
 
 # If we have a machine-specific directory, then include it in the build.
 core-y				+= arch/arm/kernel/ arch/arm/mm/ arch/arm/common/
@@ -316,6 +317,12 @@ PHONY += dtbs dtbs_install
 dtbs dtbs_install: prepare scripts
 	$(Q)$(MAKE) $(build)=$(boot)/dts MACHINE=$(MACHINE) $@
 
+PHONY += vdso_install
+vdso_install:
+ifeq ($(CONFIG_VDSO),y)
+	$(Q)$(MAKE) $(build)=arch/arm/vdso $@
+endif
+
 # We use MRPROPER_FILES and CLEAN_FILES now
 archclean:
 	$(Q)$(MAKE) $(clean)=$(boot)
@@ -340,4 +347,5 @@ define archhelp
   echo  '                  Install using (your) ~/bin/$(INSTALLKERNEL) or'
   echo  '                  (distribution) /sbin/$(INSTALLKERNEL) or'
   echo  '                  install to $$(INSTALL_PATH) and run lilo'
+  echo  '  vdso_install  - Install unstripped vdso.so to $$(INSTALL_MOD_PATH)/vdso'
 endef
diff --git a/arch/arm/kernel/Makefile b/arch/arm/kernel/Makefile
index 38ddd9f83d0e..3e2b80658344 100644
--- a/arch/arm/kernel/Makefile
+++ b/arch/arm/kernel/Makefile
@@ -86,6 +86,7 @@ obj-$(CONFIG_PERF_EVENTS)	+= perf_regs.o
 obj-$(CONFIG_HW_PERF_EVENTS)	+= perf_event.o perf_event_cpu.o
 AFLAGS_iwmmxt.o			:= -Wa,-mcpu=iwmmxt
 obj-$(CONFIG_ARM_CPU_TOPOLOGY)  += topology.o
+obj-$(CONFIG_VDSO)		+= vdso.o
 
 ifneq ($(CONFIG_ARCH_EBSA110),y)
   obj-y		+= io.o
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index ae69809a9e47..c256323040d6 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -824,6 +824,21 @@ config KUSER_HELPERS
 	  Say N here only if you are absolutely certain that you do not
 	  need these helpers; otherwise, the safe option is to say Y.
 
+config VDSO
+	bool "Enable vDSO for acceleration of some system calls"
+	depends on AEABI && MMU
+	default y if ARM_ARCH_TIMER
+	select GENERIC_TIME_VSYSCALL
+	help
+	  Place in the process address space an ELF shared object
+	  providing fast implementations of several system calls,
+	  including gettimeofday and clock_gettime.  Systems that
+	  implement the ARM architected timer will receive maximum
+	  benefit.
+
+	  You must have glibc 2.21 or later for programs to seamlessly
+	  take advantage of this.
+
 config DMA_CACHE_RWFO
 	bool "Enable read/write for ownership DMA cache maintenance"
 	depends on CPU_V6K && SMP
-- 
1.9.3
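
One way to check from userspace whether a kernel built with
CONFIG_VDSO=y actually mapped the object is sketched below.  It
assumes the kernel advertises the vdso base through the
AT_SYSINFO_EHDR auxiliary vector entry, as other architectures do (the
auxvec.h change belongs to an earlier patch in this series and is not
shown here), and that the C library provides getauxval() (glibc 2.16
or later).  demo_vdso_base() and demo_is_elf() are hypothetical helper
names.

```c
#include <elf.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/auxv.h>

/* Address of the vdso ELF header, or NULL if the kernel advertises none. */
static const void *demo_vdso_base(void)
{
	return (const void *)(uintptr_t)getauxval(AT_SYSINFO_EHDR);
}

/* Same signature check that vdso_init performs on the kernel side. */
static bool demo_is_elf(const void *base)
{
	return memcmp(base, ELFMAG, SELFMAG) == 0;
}
```

A NULL result from demo_vdso_base() means no vdso was advertised, for
example because CONFIG_VDSO is disabled; programs then keep using the
ordinary system calls.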

* [PATCH v9 0/6] ARM: VDSO
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
                   ` (5 preceding siblings ...)
  2014-08-22 21:52 ` [PATCH v9 6/6] ARM: add CONFIG_VDSO Kconfig and Makefile bits Nathan Lynch
@ 2014-08-27 20:49 ` Christopher Covington
  2014-08-27 21:42   ` Andy Lutomirski
  2014-09-03  5:41   ` Nathan Lynch
  2014-09-06  2:32 ` Nathan Lynch
  7 siblings, 2 replies; 20+ messages in thread
From: Christopher Covington @ 2014-08-27 20:49 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/22/2014 05:52 PM, Nathan Lynch wrote:
> Provide fast userspace implementations of gettimeofday and
> clock_gettime on systems that implement the generic timers extension
> defined in ARMv7.  This follows the example of arm64 in conception but
> significantly differs in some aspects of the implementation (C vs
> assembly, mainly).
> 
> Clocks supported:
> - CLOCK_REALTIME
> - CLOCK_MONOTONIC
> - CLOCK_REALTIME_COARSE
> - CLOCK_MONOTONIC_COARSE
> 
> This also provides clock_getres (as arm64 does).
> 
> getcpu support is planned but not included at this time.
> 
> For applications to transparently benefit from this change,
> ARM-specific support code needs to be added to glibc.  I have such a
> patch, and have verified that glibc's self tests do not detect any
> regressions.  I hope to have that code added to glibc for the 2.21
> release.
> 
> The VDSO symbols are available for lookup via dlsym even with an
> unpatched glibc.
> 
> Note that while the high-precision realtime and monotonic clock
> support depends on the generic timers extension, support for
> clock_getres and coarse clocks is independent of the timer
> implementation and is provided unconditionally.  High-resolution clock
> support requires changes to the arch timer code, posted here:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/281280.html
> 
> The VDSO will function correctly without those changes, but
> gettimeofday and clock_gettime with CLOCK_REALTIME/CLOCK_MONOTONIC
> will not be accelerated.
> 
> Tested on OMAP5 and i.MX6, verifying that results obtained with the
> vdso are consistent with those obtained from the kernel.  On OMAP5 I
> observe a 3- to 4-fold speedup for gettimeofday / CLOCK_REALTIME, with
> even better (if less interesting) speedups for the coarse clock ids
> and clock_getres.
> 
> I've been testing and benchmarking this with some custom test code
> which I have hosted here:
> 
> https://github.com/nlynch-mentor/vdsotest
> 
> Unpatched FSF GDB may complain "warning: Could not load shared library
> symbols for linux-vdso.so.1."  This is not an ARM-specific issue.
> Current Fedora and Debian patch their GDB packages to prevent this
> warning.
> 
> Changes since v8:
> - Update to 3.17-rc1.
> - Split out arch timer changes into separate series.
> - Ensure that VDSO will not attempt to read the counter if access is
>   not enabled; this check can be removed after the arch timer changes
>   are merged.  See update_vsyscall and vdso_can_use_arch_timer in
>   patch #5.
> 
> Changes since v7:
> - Update to next-20140801.
> - In arch_setup_additional_pages, fix call to get_unmapped_area - use
>   bytes (not pages) for length argument.
> - As x86 does, separate data and text into two VMAs, [vvar] and [vdso]
>   respectively.  These have different permissions; [vdso] will allow
>   debuggers to set breakpoints, but [vvar] is read-only and cannot be
>   modified even via ptrace.
> - Use _install_special_mapping for signal page, vvar, and vdso
>   mappings.
> - Add -DDISABLE_BRANCH_PROFILING.
> - Add --no-undefined -Bsymbolic to link options to cause linker to
>   error out on unresolved references, making checkundef script
>   unnecessary.
> - Specify max-page-size, common-page-size in linker options so the
>   true alignment is reflected in program header; otherwise gdb gets
>   confused.
> - Fix incremental build vs. generated auxvec.h.
> - Use appropriate unwind directives in __get_datapage.
> - Added vdso_install target and help text.  Install build-id symlinks
>   as x86 does.
> - Adjust update_vsyscall for changes in struct timekeeper.
> 
> Changes since v6:
> - Update to 3.16-rc1.
> - Remove -lgcc from link step - need to support GCC installations
>   without libgcc.
> - Force -O2 compilation to prevent GCC from emitting calls to libgcc
>   math routines.
> - Use custom post-processing to clear the EF_ARM_ABI_FLOAT_SOFT flag
>   if set in the ELF header to produce a shared object which is
>   architecturally allowed to be used by both soft- and hard-float
>   code.
> - Consolidate common arch timer code instead of duplicating it.
> - Prevent the VDSO from attempting CP15 access on memory-only
>   architected timer implementations by renaming the clocksource.
> 
> Changes since v5:
> - Update to 3.15-rc1.
> - Place vdso at a randomized offset above the stack along with the
>   sigpage.
> - Properly export asm/auxvec.h.
> - Split patch into series for ease of review.
> 
> Changes since v4:
> - Map data page at the beginning of the VMA to prevent orphan
>   sections at the end of output invalidating the calculated offset.
> - Move checkundef into cmd_vdsold to avoid spurious rebuilds.
> - Change vdso_init message to pr_debug.
> - Add -fno-stack-protector to cflags.
> 
> Changes since v3:
> - Update to 3.14-rc6.
> - Record vdso base in mm context before installing mapping (for the
>   sake of perf_mmap_event).
> - Use a more seqcount-like API for critical sections.  Using seqcount
>   API directly, however, would leak kernel pointers to userspace when
>   lockdep is enabled.
> - Trap instead of looping forever in division-by-zero stubs.
> 
> Changes since v2:
> - Update to 3.14-rc4.
> - Make vDSO configurable, depending on AEABI and MMU.
> - Defer shifting of nanosecond component of timespec: fixes observed
>   1ns inconsistencies for CLOCK_REALTIME, CLOCK_MONOTONIC (see
>   45a7905fc48f for arm64 equivalent).
> - Force reload of seq_count when spinning: without a memory clobber
>   after the load of vdata->seq_count, GCC can generate code like this:
>     2f8:   e59c9020        ldr     r9, [ip, #32]
>     2fc:   e3190001        tst     r9, #1
>     300:   1a000033        bne     3d4 <do_realtime+0x104>
>     304:   f57ff05b        dmb     ish
>     308:   e59c3034        ldr     r3, [ip, #52]   ; 0x34
>     ...
>     3d4:   eafffffe        b       3d4 <do_realtime+0x104>
> - Build vdso.so with -lgcc: calls to __lshrdi3, __divsi3 sometimes
>   emitted (especially with -Os).  Override certain libgcc functions to
>   prevent undefined symbols.
> - Do not clear PG_reserved on vdso pages.
> - Remove unnecessary get_page calls.
> - Simplify ELF signature check during init.
> - Use volatile for asm syscall fallbacks.
> - Check whether vdso_pagelist is initialized in arm_install_vdso.
> - Record clocksource mask in data page.
> - Reduce code duplication in do_realtime, do_monotonic.
> - Reduce calculations performed in critical sections.
> - Simplify coarse clock handling.
> - Move datapage load to its own assembly routine.
> - Tune vdso_data layout and tweak field names.
> - Check vdso shared object for undefined symbols during build.
> 
> Changes since v1:
> - update to 3.14-rc1
> - ensure cache coherency for data page
> - Document the kernel-to-userspace protocol for vdso data page updates,
>   and note that the timekeeping core prevents concurrent updates.
> - update wall-to-monotonic fields unconditionally
> - move vdso_start, vdso_end declarations to vdso.h
> - correctly build and run when CONFIG_ARM_ARCH_TIMER=n
> - rearrange linker script to avoid overlapping sections when CONFIG_DEBUGINFO=n
> - remove use_syscall checks from coarse clock paths
> - crib BUG_INSTR (0xe7f001f2) from asm/bug.h for text fill
> 
> 
> Nathan Lynch (6):
>   ARM: use _install_special_mapping for sigpage
>   ARM: place sigpage at a random offset above stack
>   ARM: miscellaneous vdso infrastructure, preparation
>   ARM: add vdso user-space code
>   ARM: vdso initialization, mapping, and synchronization
>   ARM: add CONFIG_VDSO Kconfig and Makefile bits
> 
>  arch/arm/Makefile                    |   8 +
>  arch/arm/include/asm/Kbuild          |   1 -
>  arch/arm/include/asm/auxvec.h        |   1 +
>  arch/arm/include/asm/elf.h           |   9 +
>  arch/arm/include/asm/mmu.h           |   3 +
>  arch/arm/include/asm/vdso.h          |  34 ++++
>  arch/arm/include/asm/vdso_datapage.h |  60 +++++++
>  arch/arm/include/uapi/asm/Kbuild     |   1 +
>  arch/arm/include/uapi/asm/auxvec.h   |   7 +
>  arch/arm/kernel/Makefile             |   1 +
>  arch/arm/kernel/asm-offsets.c        |   5 +
>  arch/arm/kernel/process.c            |  71 +++++++-
>  arch/arm/kernel/vdso.c               | 207 ++++++++++++++++++++++
>  arch/arm/mm/Kconfig                  |  15 ++
>  arch/arm/vdso/.gitignore             |   1 +
>  arch/arm/vdso/Makefile               |  74 ++++++++
>  arch/arm/vdso/datapage.S             |  15 ++
>  arch/arm/vdso/vdso.S                 |  35 ++++
>  arch/arm/vdso/vdso.lds.S             |  88 ++++++++++
>  arch/arm/vdso/vdsomunge.c            | 208 +++++++++++++++++++++++
>  arch/arm/vdso/vgettimeofday.c        | 320 +++++++++++++++++++++++++++++++++++
>  21 files changed, 1154 insertions(+), 10 deletions(-)
>  create mode 100644 arch/arm/include/asm/auxvec.h
>  create mode 100644 arch/arm/include/asm/vdso.h
>  create mode 100644 arch/arm/include/asm/vdso_datapage.h
>  create mode 100644 arch/arm/include/uapi/asm/auxvec.h
>  create mode 100644 arch/arm/kernel/vdso.c
>  create mode 100644 arch/arm/vdso/.gitignore
>  create mode 100644 arch/arm/vdso/Makefile
>  create mode 100644 arch/arm/vdso/datapage.S
>  create mode 100644 arch/arm/vdso/vdso.S
>  create mode 100644 arch/arm/vdso/vdso.lds.S
>  create mode 100644 arch/arm/vdso/vdsomunge.c
>  create mode 100644 arch/arm/vdso/vgettimeofday.c

It appears to me that there is code in several architecture subdirectories
(I'm aware of x86, arm64, and with these patches arm[32] and I would be
surprised if there weren't more) doing largely the same setup of special
mappings at randomized offsets, checking ELF magic etc. Not that these patches
should necessarily do it, but is there a reasonable amount of consolidation
that could be done, or am I underestimating how much of this really does vary
per architecture?

Thanks,
Christopher

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-08-27 20:49 ` [PATCH v9 0/6] ARM: VDSO Christopher Covington
@ 2014-08-27 21:42   ` Andy Lutomirski
  2014-09-03  5:41   ` Nathan Lynch
  1 sibling, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2014-08-27 21:42 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Aug 27, 2014 at 1:49 PM, Christopher Covington
<cov@codeaurora.org> wrote:
> It appears to me that there is code in several architecture subdirectories
> (I'm aware of x86, arm64, and with these patches arm[32] and I would be
> surprised if there weren't more) doing largely the same setup of special
> mappings at randomized offsets, checking ELF magic etc. Not that these patches
> should necessarily do it, but is there a reasonable amount of consolidation
> that could be done, or am I underestimating how much of this really does vary
> per architecture?
>

There's certainly code that could be consolidated, but it would be a
decently large project.

Some day, I'd like to see everyone use vdso2c, but it needs to stop
knowing that it's targeting x86 for that to happen.

--Andy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-08-27 20:49 ` [PATCH v9 0/6] ARM: VDSO Christopher Covington
  2014-08-27 21:42   ` Andy Lutomirski
@ 2014-09-03  5:41   ` Nathan Lynch
  2014-09-03 13:13     ` Christopher Covington
  2014-09-03 16:59     ` Andy Lutomirski
  1 sibling, 2 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-09-03  5:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/27/2014 03:49 PM, Christopher Covington wrote:
> 
> It appears to me that there is code in several architecture subdirectories
> (I'm aware of x86, arm64, and with these patches arm[32] and I would be
> surprised if there weren't more) doing largely the same setup of special
> mappings at randomized offsets, checking ELF magic etc. Not that these patches
> should necessarily do it, but is there a reasonable amount of consolidation
> that could be done, or am I underestimating how much of this really does vary
> per architecture?

Sorry to not respond to this promptly, was distracted by some other work.

As Andy said, the possibility for consolidating some aspects of VDSO support
is there, but it would be a fair bit of work.

For example, arch_setup_additional_pages tends to have the general form of:

lock mmap_sem
get_unmapped_area
install_special_mapping (or _install_special_mapping, preferably)
stash vdso address in mmu context
release mmap_sem

But there are a lot of implementation details that differ:

         +----------------------------------------------------------------
         | Number of VMAs installed
         |   +------------------------------------------------------------
         |   | Considers uses_interp
         |   |     +------------------------------------------------------
         |   |     | Uses _install_special_mapping
         |   |     |     +------------------------------------------------
         |   |     |     | Performs additional work (e.g. remap_pfn_range)
         |   |     |     |     +------------------------------------------
         |   |     |     |     | Randomizes VDSO offset vs stack and libs
         |   |     |     |     |     +------------------------------------
         |   |     |     |     |     | Records VDSO address in mmu context
         |   |     |     |     |     |     +------------------------------
         |   |     |     |     |     |     | Supports compat VDSO
         |   |     |     |     |     |     |     +------------------------
         |   |     |     |     |     |     |     | Supports disabling VDSO
         |   |     |     |     |     |     |     | at boot (e.g. vdso=off)
         |   |     |     |     |     |     |     |     +------------------
         |   |     |     |     |     |     |     |     | Can disable VDSO
 arch    |   |     |     |     |     |     |     |     | via Kconfig
---------+---+-----+-----+-----+-----+-----+-----+-----+------------------
 arm*    | 3 | no  | yes | no  | yes | yes | no  | no  | yes
 arm64   | 2 | no  | yes | no  | no  | yes | no  | no  | no
 hexagon | 1 | no  | no  | no  | no  | yes | no  | no  | no
 mips    | 1 | no  | no  | no  | no  | yes | no  | no  | no
 powerpc | 1 | no  | no  | no  | no  | yes | yes | no  | no
 s390    | 1 | yes | no  | no  | no  | yes | yes | yes | no
 sh      | 1 | no  | no  | no  | no  | yes | no  | yes | yes
 tile    | 1 | no  | no  | yes | no  | yes | no  | yes | no
 x86     | 2 | no  | yes | yes | yes | yes | yes | yes | no

* With VDSO patches from this thread, of course.

I think pushing the mmap_sem lock/unlock up into the ELF loader might be
of some benefit (slightly reduced complexity in the arch code).  But
any generic replacement for arch_setup_additional_pages will have to
account for all the differences above, and probably a few more I've
missed.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-09-03  5:41   ` Nathan Lynch
@ 2014-09-03 13:13     ` Christopher Covington
  2014-09-03 16:59     ` Andy Lutomirski
  1 sibling, 0 replies; 20+ messages in thread
From: Christopher Covington @ 2014-09-03 13:13 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Nathan,

On 09/03/2014 01:41 AM, Nathan Lynch wrote:
> On 08/27/2014 03:49 PM, Christopher Covington wrote:
>>
>> It appears to me that there is code in several architecture subdirectories
>> (I'm aware of x86, arm64, and with these patches arm[32] and I would be
>> surprised if there weren't more) doing largely the same setup of special
>> mappings at randomized offsets, checking ELF magic etc. Not that these patches
>> should necessarily do it, but is there a reasonable amount of consolidation
>> that could be done, or am I underestimating how much of this really does vary
>> per architecture?
> 
> Sorry to not respond to this promptly, was distracted by some other work.
> 
> As Andy said, the possibility for consolidating some aspects of VDSO support
> is there, but it would be a fair bit of work.
> 
> For example, arch_setup_additional_pages tends to have the general form of:
> 
> lock mmap_sem
> get_unmapped_area
> install_special_mapping (or _install_special_mapping, preferably)
> stash vdso address in mmu context
> release mmap_sem
> 
> But there are a lot of implementation details that differ:
> 
>          +----------------------------------------------------------------
>          | Number of VMAs installed
>          |   +------------------------------------------------------------
>          |   | Considers uses_interp
>          |   |     +------------------------------------------------------
>          |   |     | Uses _install_special_mapping
>          |   |     |     +------------------------------------------------
>          |   |     |     | Performs additional work (e.g. remap_pfn_range)
>          |   |     |     |     +------------------------------------------
>          |   |     |     |     | Randomizes VDSO offset vs stack and libs
>          |   |     |     |     |     +------------------------------------
>          |   |     |     |     |     | Records VDSO address in mmu context
>          |   |     |     |     |     |     +------------------------------
>          |   |     |     |     |     |     | Supports compat VDSO
>          |   |     |     |     |     |     |     +------------------------
>          |   |     |     |     |     |     |     | Supports disabling VDSO
>          |   |     |     |     |     |     |     | at boot (e.g. vdso=off)
>          |   |     |     |     |     |     |     |     +------------------
>          |   |     |     |     |     |     |     |     | Can disable VDSO
>  arch    |   |     |     |     |     |     |     |     | via Kconfig
> ---------+---+-----+-----+-----+-----+-----+-----+-----+------------------
>  arm*    | 3 | no  | yes | no  | yes | yes | no  | no  | yes
>  arm64   | 2 | no  | yes | no  | no  | yes | no  | no  | no
>  hexagon | 1 | no  | no  | no  | no  | yes | no  | no  | no
>  mips    | 1 | no  | no  | no  | no  | yes | no  | no  | no
>  powerpc | 1 | no  | no  | no  | no  | yes | yes | no  | no
>  s390    | 1 | yes | no  | no  | no  | yes | yes | yes | no
>  sh      | 1 | no  | no  | no  | no  | yes | no  | yes | yes
>  tile    | 1 | no  | no  | yes | no  | yes | no  | yes | no
>  x86     | 2 | no  | yes | yes | yes | yes | yes | yes | no
> 
> * With VDSO patches from this thread, of course.
> 
> I think pushing the mmap_sem lock/unlock up into the ELF loader might be
> of some benefit (slightly reduced complexity in the arch code).  But
> any generic replacement for arch_setup_additional_pages will have to
> account for all the differences above, and probably a few more I've
> missed.

I really appreciate the detailed response. I'll try to find time to explore
this further, hopefully using QEMU to run kernels for most of those architectures.

Christopher

-- 
Employee of Qualcomm Innovation Center, Inc.
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum,
hosted by the Linux Foundation.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-09-03  5:41   ` Nathan Lynch
  2014-09-03 13:13     ` Christopher Covington
@ 2014-09-03 16:59     ` Andy Lutomirski
  2014-09-03 20:03       ` Nathan Lynch
  1 sibling, 1 reply; 20+ messages in thread
From: Andy Lutomirski @ 2014-09-03 16:59 UTC (permalink / raw)
  To: linux-arm-kernel

On Sep 2, 2014 10:44 PM, "Nathan Lynch" <Nathan_Lynch@mentor.com> wrote:
>
> On 08/27/2014 03:49 PM, Christopher Covington wrote:
> >
> > It appears to me that there is code in several architecture subdirectories
> > (I'm aware of x86, arm64, and with these patches arm[32] and I would be
> > surprised if there weren't more) doing largely the same setup of special
> > mappings at randomized offsets, checking ELF magic etc. Not that these patches
> > should necessarily do it, but is there a reasonable amount of consolidation
> > that could be done, or am I underestimating how much of this really does vary
> > per architecture?
>
> Sorry to not respond to this promptly, was distracted by some other work.
>
> As Andy said, the possibility for consolidating some aspects of VDSO support
> is there, but it would be a fair bit of work.
>
> For example, arch_setup_additional_pages tends to have the general form of:
>
> lock mmap_sem
> get_unmapped_area
> install_special_mapping (or _install_special_mapping, preferably)
> stash vdso address in mmu context
> release mmap_sem
>
> But there are a lot of implementation details that differ:
>
>          +----------------------------------------------------------------
>          | Number of VMAs installed
>          |   +------------------------------------------------------------
>          |   | Considers uses_interp
>          |   |     +------------------------------------------------------
>          |   |     | Uses _install_special_mapping
>          |   |     |     +------------------------------------------------
>          |   |     |     | Performs additional work (e.g. remap_pfn_range)
>          |   |     |     |     +------------------------------------------
>          |   |     |     |     | Randomizes VDSO offset vs stack and libs
>          |   |     |     |     |     +------------------------------------
>          |   |     |     |     |     | Records VDSO address in mmu context
>          |   |     |     |     |     |     +------------------------------
>          |   |     |     |     |     |     | Supports compat VDSO
>          |   |     |     |     |     |     |     +------------------------
>          |   |     |     |     |     |     |     | Supports disabling VDSO
>          |   |     |     |     |     |     |     | at boot (e.g. vdso=off)
>          |   |     |     |     |     |     |     |     +------------------
>          |   |     |     |     |     |     |     |     | Can disable VDSO
>  arch    |   |     |     |     |     |     |     |     | via Kconfig
> ---------+---+-----+-----+-----+-----+-----+-----+-----+------------------
>  arm*    | 3 | no  | yes | no  | yes | yes | no  | no  | yes
>  arm64   | 2 | no  | yes | no  | no  | yes | no  | no  | no
>  hexagon | 1 | no  | no  | no  | no  | yes | no  | no  | no
>  mips    | 1 | no  | no  | no  | no  | yes | no  | no  | no
>  powerpc | 1 | no  | no  | no  | no  | yes | yes | no  | no
>  s390    | 1 | yes | no  | no  | no  | yes | yes | yes | no
>  sh      | 1 | no  | no  | no  | no  | yes | no  | yes | yes
>  tile    | 1 | no  | no  | yes | no  | yes | no  | yes | no
>  x86     | 2 | no  | yes | yes | yes | yes | yes | yes | no
>
> * With VDSO patches from this thread, of course.
>
> I think pushing the mmap_sem lock/unlock up into the ELF loader might be
> of some benefit (slightly reduced complexity in the arch code).  But
> any generic replacement for arch_setup_additional_pages will have to
> account for all the differences above, and probably a few more I've
> missed.
>

Wow, nice table!  I think that we should eventually get rid of most of
these differences.

Christopher, since you seem to be interested in CRIU, one thing to
note is that any architecture that shoves a pointer to the vdso into
the mmu context is likely to fail if the vdso is mremapped.  CRIU
needs to mremap the vdso, so this is a problem.

x86_64 is an exception: it doesn't use that pointer for anything.
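
To illustrate the failure mode in userspace (this is only a model, not
kernel code): the sketch below maps an anonymous page as a stand-in for
the vdso, caches its address the way an arch might stash it in the mmu
context, and then moves the page with mremap().  The struct and function
names are hypothetical and exist only for this example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for mm->context: remembers where the mapping went. */
struct fake_context {
	void *cached_base;
};

/*
 * Map one anonymous page, cache its address, then move it elsewhere.
 * Returns the new address, or MAP_FAILED on error.
 */
static void *install_and_move(struct fake_context *ctx)
{
	size_t len = (size_t)sysconf(_SC_PAGESIZE);
	void *old, *dest, *moved;

	old = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (old == MAP_FAILED)
		return MAP_FAILED;
	memcpy(old, "vdso", 5);
	ctx->cached_base = old;	/* analogous to stashing the vdso address */

	/* Reserve a destination range; MREMAP_FIXED will replace it. */
	dest = mmap(NULL, len, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (dest == MAP_FAILED) {
		munmap(old, len);
		return MAP_FAILED;
	}

	moved = mremap(old, len, len, MREMAP_MAYMOVE | MREMAP_FIXED, dest);
	if (moved == MAP_FAILED) {
		munmap(old, len);
		munmap(dest, len);
	}
	return moved;
}
```

After install_and_move() succeeds, the contents are intact at the new
address but ctx->cached_base still holds the old one, so anything that
later trusts the cached value is looking at an unmapped range.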

--Andy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-09-03 16:59     ` Andy Lutomirski
@ 2014-09-03 20:03       ` Nathan Lynch
  2014-09-03 20:12         ` Andy Lutomirski
  0 siblings, 1 reply; 20+ messages in thread
From: Nathan Lynch @ 2014-09-03 20:03 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/03/2014 11:59 AM, Andy Lutomirski wrote:
> On Sep 2, 2014 10:44 PM, "Nathan Lynch" <Nathan_Lynch@mentor.com> wrote:
>>
>> On 08/27/2014 03:49 PM, Christopher Covington wrote:
>>>
>>> It appears to me that there is code in several architecture subdirectories
>>> (I'm aware of x86, arm64, and with these patches arm[32] and I would be
>>> surprised if there weren't more) doing largely the same setup of special
>>> mappings at randomized offsets, checking ELF magic etc. Not that these patches
>>> should necessarily do it, but is there a reasonable amount of consolidation
>>> that could be done, or am I underestimating how much of this really does vary
>>> per architecture?
>>
>> Sorry to not respond to this promptly, was distracted by some other work.
>>
>> As Andy said, the possibility for consolidating some aspects of VDSO support
>> is there, but it would be a fair bit of work.
>>
>> For example, arch_setup_additional_pages tends to have the general form of:
>>
>> lock mmap_sem
>> get_unmapped_area
>> install_special_mapping (or _install_special_mapping, preferably)
>> stash vdso address in mmu context
>> release mmap_sem
>>
>> But there are a lot of implementation details that differ:
>>
>>          +----------------------------------------------------------------
>>          | Number of VMAs installed
>>          |   +------------------------------------------------------------
>>          |   | Considers uses_interp
>>          |   |     +------------------------------------------------------
>>          |   |     | Uses _install_special_mapping
>>          |   |     |     +------------------------------------------------
>>          |   |     |     | Performs additional work (e.g. remap_pfn_range)
>>          |   |     |     |     +------------------------------------------
>>          |   |     |     |     | Randomizes VDSO offset vs stack and libs
>>          |   |     |     |     |     +------------------------------------
>>          |   |     |     |     |     | Records VDSO address in mmu context
>>          |   |     |     |     |     |     +------------------------------
>>          |   |     |     |     |     |     | Supports compat VDSO
>>          |   |     |     |     |     |     |     +------------------------
>>          |   |     |     |     |     |     |     | Supports disabling VDSO
>>          |   |     |     |     |     |     |     | at boot (e.g. vdso=off)
>>          |   |     |     |     |     |     |     |     +------------------
>>          |   |     |     |     |     |     |     |     | Can disable VDSO
>>  arch    |   |     |     |     |     |     |     |     | via Kconfig
>> ---------+---+-----+-----+-----+-----+-----+-----+-----+------------------
>>  arm*    | 3 | no  | yes | no  | yes | yes | no  | no  | yes
>>  arm64   | 2 | no  | yes | no  | no  | yes | no  | no  | no
>>  hexagon | 1 | no  | no  | no  | no  | yes | no  | no  | no
>>  mips    | 1 | no  | no  | no  | no  | yes | no  | no  | no
>>  powerpc | 1 | no  | no  | no  | no  | yes | yes | no  | no
>>  s390    | 1 | yes | no  | no  | no  | yes | yes | yes | no
>>  sh      | 1 | no  | no  | no  | no  | yes | no  | yes | yes
>>  tile    | 1 | no  | no  | yes | no  | yes | no  | yes | no
>>  x86     | 2 | no  | yes | yes | yes | yes | yes | yes | no
>>
>> * With VDSO patches from this thread, of course.
>>
>> I think pushing the mmap_sem lock/unlock up into the ELF loader might be
>> of some benefit (slightly reduced complexity in the arch code).  But
>> any generic replacement for arch_setup_additional_pages will have to
>> account for all the differences above, and probably a few more I've
>> missed.
>>
> 
> Wow, nice table!  I think that we should eventually get rid of most of
> these differences.

Thanks, and agreed.


> Christopher, since you seem to be interested in CRIU, one thing to
> note is that any architecture that shoves a pointer to the vdso into
> the mmu context is likely to fail if the vdso is mremapped.  CRIU
> needs to mremap the vdso, so this is a problem.
> 
> x86_64 is an exception: it doesn't use that pointer for anything.

Hmm, I would expect architectures that implement arch_vma_name like so
to experience problems with CRIU:

const char *arch_vma_name(struct vm_area_struct *vma)
{
	if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base)
		return "[vdso]";
	return NULL;
}

Is this what you're referring to?
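
For reference, the name returned by arch_vma_name is what shows up in
/proc/<pid>/maps, and I assume that is where a tool like CRIU would look
for the mapping.  A minimal userspace sketch of that lookup follows; the
helper name is made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return the start address of the mapping labelled "[vdso]" in
 * /proc/self/maps, or 0 if no such line is found.
 */
static unsigned long vdso_base_from_maps(void)
{
	FILE *f = fopen("/proc/self/maps", "r");
	char line[512];
	unsigned long start, end, found = 0;

	if (!f)
		return 0;

	while (fgets(line, sizeof(line), f)) {
		/* Each line begins with "start-end" in hexadecimal. */
		if (strstr(line, "[vdso]") &&
		    sscanf(line, "%lx-%lx", &start, &end) == 2) {
			found = start;
			break;
		}
	}
	fclose(f);
	return found;
}
```

With an arch_vma_name like the one above, the label depends on vm_start
still matching context.vdso_base, so a lookup like this would presumably
return 0 once the mapping has been moved.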

Looking at 3.17-rc3, every arch uses mm_context_t->vdso_base or
similar to provide a value for AT_SYSINFO_EHDR at exec time.
Is this also problematic?
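
For context, this is how userspace would consume that value.  A minimal
sketch, assuming glibc's getauxval() (2.16 or later); the two helper
names are hypothetical:

```c
#include <assert.h>
#include <elf.h>
#include <string.h>
#include <sys/auxv.h>

/* Address of the vdso ELF header as passed by the kernel at exec, or 0. */
static unsigned long vdso_base_from_auxv(void)
{
	return getauxval(AT_SYSINFO_EHDR);
}

/* Return 1 if the bytes at base begin with the ELF magic, 0 otherwise. */
static int has_elf_magic(unsigned long base)
{
	return base != 0 &&
	       memcmp((const void *)base, ELFMAG, SELFMAG) == 0;
}
```

The auxv entry is a snapshot taken at exec time, so it is only valid for
as long as the mapping stays where the kernel put it.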

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-09-03 20:03       ` Nathan Lynch
@ 2014-09-03 20:12         ` Andy Lutomirski
  0 siblings, 0 replies; 20+ messages in thread
From: Andy Lutomirski @ 2014-09-03 20:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 3, 2014 at 1:03 PM, Nathan Lynch <Nathan_Lynch@mentor.com> wrote:
> On 09/03/2014 11:59 AM, Andy Lutomirski wrote:
>> On Sep 2, 2014 10:44 PM, "Nathan Lynch" <Nathan_Lynch@mentor.com> wrote:
>>>
>>> On 08/27/2014 03:49 PM, Christopher Covington wrote:
>>>>
>>>> It appears to me that there is code in several architecture subdirectories
>>>> (I'm aware of x86, arm64, and with these patches arm[32] and I would be
>>>> surprised if there weren't more) doing largely the same setup of special
>>>> mappings at randomized offsets, checking ELF magic etc. Not that these patches
>>>> should necessarily do it, but is there a reasonable amount of consolidation
>>>> that could be done, or am I underestimating how much of this really does vary
>>>> per architecture?
>>>
>>> Sorry to not respond to this promptly, was distracted by some other work.
>>>
>>> As Andy said, the possibility for consolidating some aspects of VDSO support
>>> is there, but it would be a fair bit of work.
>>>
>>> For example, arch_setup_additional_pages tends to have the general form of:
>>>
>>> lock mmap_sem
>>> get_unmapped_area
>>> install_special_mapping (or _install_special_mapping, preferably)
>>> stash vdso address in mmu context
>>> release mmap_sem
>>>
>>> But there are a lot of implementation details that differ:
>>>
>>>          +----------------------------------------------------------------
>>>          | Number of VMAs installed
>>>          |   +------------------------------------------------------------
>>>          |   | Considers uses_interp
>>>          |   |     +------------------------------------------------------
>>>          |   |     | Uses _install_special_mapping
>>>          |   |     |     +------------------------------------------------
>>>          |   |     |     | Performs additional work (e.g. remap_pfn_range)
>>>          |   |     |     |     +------------------------------------------
>>>          |   |     |     |     | Randomizes VDSO offset vs stack and libs
>>>          |   |     |     |     |     +------------------------------------
>>>          |   |     |     |     |     | Records VDSO address in mmu context
>>>          |   |     |     |     |     |     +------------------------------
>>>          |   |     |     |     |     |     | Supports compat VDSO
>>>          |   |     |     |     |     |     |     +------------------------
>>>          |   |     |     |     |     |     |     | Supports disabling VDSO
>>>          |   |     |     |     |     |     |     | at boot (e.g. vdso=off)
>>>          |   |     |     |     |     |     |     |     +------------------
>>>          |   |     |     |     |     |     |     |     | Can disable VDSO
>>>  arch    |   |     |     |     |     |     |     |     | via Kconfig
>>> ---------+---+-----+-----+-----+-----+-----+-----+-----+------------------
>>>  arm*    | 3 | no  | yes | no  | yes | yes | no  | no  | yes
>>>  arm64   | 2 | no  | yes | no  | no  | yes | no  | no  | no
>>>  hexagon | 1 | no  | no  | no  | no  | yes | no  | no  | no
>>>  mips    | 1 | no  | no  | no  | no  | yes | no  | no  | no
>>>  powerpc | 1 | no  | no  | no  | no  | yes | yes | no  | no
>>>  s390    | 1 | yes | no  | no  | no  | yes | yes | yes | no
>>>  sh      | 1 | no  | no  | no  | no  | yes | no  | yes | yes
>>>  tile    | 1 | no  | no  | yes | no  | yes | no  | yes | no
>>>  x86     | 2 | no  | yes | yes | yes | yes | yes | yes | no
>>>
>>> * With VDSO patches from this thread, of course.
>>>
>>> I think pushing the mmap_sem lock/unlock up into the ELF loader might be
>>> of some benefit (slightly reduced complexity in the arch code).  But
>>> any generic replacement for arch_setup_additional_pages will have to
>>> account for all the differences above, and probably a few more I've
>>> missed.
>>>
>>
>> Wow, nice table!  I think that we should eventually get rid of most of
>> these differences.
>
> Thanks, and agreed.
>
>
>> Christopher, since you seem to be interested in CRIU, one thing to
>> note is that any architecture that shoves a pointer to the vdso into
>> the mmu context is likely to fail if the vdso is mremapped.  CRIU
>> needs to mremap the vdso, so this is a problem.
>>
>> x86_64 is an exception: it doesn't use that pointer for anything.
>
> Hmm, I would expect architectures that implement arch_vma_name like so
> to experience problems with CRIU:
>
> const char *arch_vma_name(struct vm_area_struct *vma)
> {
>         if (vma->vm_mm && vma->vm_start == vma->vm_mm->context.vdso_base)
>                 return "[vdso]";
>         return NULL;
> }
>
> Is this what you're referring to?

I never entirely understood why this wasn't a bigger problem.  I think
it only really caused problems when checkpointing, restoring,
checkpointing *again*, and getting unlucky.

>
> Looking at 3.17-rc3, every arch uses mm_context_t->vdso_base or
> similar to provide a value for AT_SYSINFO_EHDR at exec time.
> Is this also problematic?
>

This one's fine, since it's very hard to mremap between mapping the
vdso and having exec return.

The ones that are serious problems (on x86 32-bit userspace, at least)
are the vdso sigreturn trampoline and, even worse, the vdso sysexit
trampoline.  The latter will cause every syscall on any native 32-bit
system or on any 32-bit compat code running on an Intel CPU to
segfault immediately upon mremapping the vdso.

--Andy

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 0/6] ARM: VDSO
  2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
                   ` (6 preceding siblings ...)
  2014-08-27 20:49 ` [PATCH v9 0/6] ARM: VDSO Christopher Covington
@ 2014-09-06  2:32 ` Nathan Lynch
  7 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-09-06  2:32 UTC (permalink / raw)
  To: linux-arm-kernel

On 08/22/2014 04:52 PM, Nathan Lynch wrote:
> Provide fast userspace implementations of gettimeofday and
> clock_gettime on systems that implement the generic timers extension
> defined in ARMv7.  This follows the example of arm64 in conception but
> significantly differs in some aspects of the implementation (C vs
> assembly, mainly).
> 
> Clocks supported:
> - CLOCK_REALTIME
> - CLOCK_MONOTONIC
> - CLOCK_REALTIME_COARSE
> - CLOCK_MONOTONIC_COARSE
> 
> This also provides clock_getres (as arm64 does).
> 
> getcpu support is planned but not included at this time.
> 
> For applications to transparently benefit from this change,
> ARM-specific support code needs to be added to glibc.  I have such a
> patch, and have verified that glibc's self tests do not detect any
> regressions.  I hope to have that code added to glibc for the 2.21
> release.
> 
> The VDSO symbols are available for lookup via dlsym even with an
> unpatched glibc.
> 
> Note that while the high-precision realtime and monotonic clock
> support depends on the generic timers extension, support for
> clock_getres and coarse clocks is independent of the timer
> implementation and is provided unconditionally.  High-resolution clock
> support requires changes to the arch timer code, posted here:
> 
> http://lists.infradead.org/pipermail/linux-arm-kernel/2014-August/281280.html
> 
> The VDSO will function correctly without those changes, but
> gettimeofday and clock_gettime with CLOCK_REALTIME/CLOCK_MONOTONIC
> will not be accelerated.

Russell, what is your current thinking on taking the VDSO patches?  I
believe all feedback has been addressed and I was hoping to get it into
3.18...

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-08-22 21:52 ` [PATCH v9 4/6] ARM: add vdso user-space code Nathan Lynch
@ 2014-09-10 16:47   ` Will Deacon
  2014-09-10 16:52     ` Andy Lutomirski
  2014-09-12  6:50     ` Nathan Lynch
  0 siblings, 2 replies; 20+ messages in thread
From: Will Deacon @ 2014-09-10 16:47 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Nathan,

On Fri, Aug 22, 2014 at 10:52:29PM +0100, Nathan Lynch wrote:
> Place vdso-related user-space code in arch/arm/kernel/vdso/.
> 
> It is almost completely written in C with some assembly helpers to
> load the data page address, sample the counter, and fall back to
> system calls when necessary.

I'm still a bit puzzled as to how we can implement a compat version of this
for a 32-bit userspace running under a 64-bit kernel. Maybe the answer is
that we don't care enough (programs will still work fine without it), but if
we did want to then we're going to need to build the kernel with two
toolchains and it gets really horrible.

Do you have any ideas?

Will

^ permalink raw reply	[flat|nested] 20+ messages in thread

* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-09-10 16:47   ` Will Deacon
@ 2014-09-10 16:52     ` Andy Lutomirski
  2014-09-10 17:10       ` Will Deacon
  2014-09-12  6:50     ` Nathan Lynch
  1 sibling, 1 reply; 20+ messages in thread
From: Andy Lutomirski @ 2014-09-10 16:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 10, 2014 at 9:47 AM, Will Deacon <will.deacon@arm.com> wrote:
> Hi Nathan,
>
> On Fri, Aug 22, 2014 at 10:52:29PM +0100, Nathan Lynch wrote:
>> Place vdso-related user-space code in arch/arm/kernel/vdso/.
>>
>> It is almost completely written in C with some assembly helpers to
>> load the data page address, sample the counter, and fall back to
>> system calls when necessary.
>
> I'm still a bit puzzled as to how we can implement a compat version of this
> for a 32-bit userspace running under a 64-bit kernel. Maybe the answer is
> that we don't care enough (programs will still work fine without it), but if
> we did want to then we're going to need to build the kernel with two
> toolchains and it gets really horrible.
>
> Do you have any ideas?

Convince the gcc and binutils people to add a -m32 option to aarch64?
That's how x86_64 pulls this off :)  Or you could require a
cross-compiler to be available to enable this particular feature.

I have no further bright ideas, unless aarch64 and arm assembly are
miraculously nearly compatible, in which case you could do something
like what x32 does (build 64-bit and then use objcopy to turn the
result into an x32 object).

--Andy

* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-09-10 16:52     ` Andy Lutomirski
@ 2014-09-10 17:10       ` Will Deacon
  2014-09-10 17:25         ` Nathan Lynch
  0 siblings, 1 reply; 20+ messages in thread
From: Will Deacon @ 2014-09-10 17:10 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Sep 10, 2014 at 05:52:39PM +0100, Andy Lutomirski wrote:
> On Wed, Sep 10, 2014 at 9:47 AM, Will Deacon <will.deacon@arm.com> wrote:
> > On Fri, Aug 22, 2014 at 10:52:29PM +0100, Nathan Lynch wrote:
> >> Place vdso-related user-space code in arch/arm/kernel/vdso/.
> >>
> >> It is almost completely written in C with some assembly helpers to
> >> load the data page address, sample the counter, and fall back to
> >> system calls when necessary.
> >
> > I'm still a bit puzzled as to how we can implement a compat version of this
> > for a 32-bit userspace running under a 64-bit kernel. Maybe the answer is
> > that we don't care enough (programs will still work fine without it), but if
> > we did want to then we're going to need to build the kernel with two
> > toolchains and it gets really horrible.
> >
> > Do you have any ideas?
> 
> Convince the gcc and binutils people to add a -m32 option to aarch64?
> That's how x86_64 pulls this off :)  Or you could require a
> cross-compiler to be available to enable this particular feature.

The compilers have two separate backends, so I think I know what they'll
say. I guess it's either overhauling kbuild to support two cross compilers,
or having some shell script accept an option we make up.

> I have no further bright ideas, unless aarch64 and arm assembly are
> miraculously nearly compatible, in which case you could do something
> like what x32 does (build 64-bit and then use objcopy to turn the
> result into an x32 object).

The assembler is pretty different, so I'd be pretty uneasy about trying to
do that.

Will

* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-09-10 17:10       ` Will Deacon
@ 2014-09-10 17:25         ` Nathan Lynch
  0 siblings, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-09-10 17:25 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/10/2014 12:10 PM, Will Deacon wrote:
> On Wed, Sep 10, 2014 at 05:52:39PM +0100, Andy Lutomirski wrote:
>> On Wed, Sep 10, 2014 at 9:47 AM, Will Deacon <will.deacon@arm.com> wrote:
>>> On Fri, Aug 22, 2014 at 10:52:29PM +0100, Nathan Lynch wrote:
>>>> Place vdso-related user-space code in arch/arm/kernel/vdso/.
>>>>
>>>> It is almost completely written in C with some assembly helpers to
>>>> load the data page address, sample the counter, and fall back to
>>>> system calls when necessary.
>>>
>>> I'm still a bit puzzled as to how we can implement a compat version of this
>>> for a 32-bit userspace running under a 64-bit kernel. Maybe the answer is
>>> that we don't care enough (programs will still work fine without it), but if
>>> we did want to then we're going to need to build the kernel with two
>>> toolchains and it gets really horrible.
>>>
>>> Do you have any ideas?
>>
>> Convince the gcc and binutils people to add a -m32 option to aarch64?
>> That's how x86_64 pulls this off :)  Or you could require a
>> cross-compiler to be available to enable this particular feature.
> 
> The compilers have two separate backends, so I think I know what they'll
> say. I guess it's either overhauling kbuild to support two cross compilers,
> or having some shell script accept an option we make up.

arch/powerpc has addressed this with a CROSS32_COMPILE= variable for
producing 32-bit outputs (compat vdso, bootwrapper) during a 64-bit
kernel build.  I think that was used before powerpc toolchains commonly
honored -m32.

* [PATCH v9 4/6] ARM: add vdso user-space code
  2014-09-10 16:47   ` Will Deacon
  2014-09-10 16:52     ` Andy Lutomirski
@ 2014-09-12  6:50     ` Nathan Lynch
  1 sibling, 0 replies; 20+ messages in thread
From: Nathan Lynch @ 2014-09-12  6:50 UTC (permalink / raw)
  To: linux-arm-kernel

On 09/10/2014 11:47 AM, Will Deacon wrote:
> Hi Nathan,
> 
> On Fri, Aug 22, 2014 at 10:52:29PM +0100, Nathan Lynch wrote:
>> Place vdso-related user-space code in arch/arm/kernel/vdso/.
>>
>> It is almost completely written in C with some assembly helpers to
>> load the data page address, sample the counter, and fall back to
>> system calls when necessary.
> 
> I'm still a bit puzzled as to how we can implement a compat version of this
> for a 32-bit userspace running under a 64-bit kernel. Maybe the answer is
> that we don't care enough (programs will still work fine without it), but if
> we did want to then we're going to need to build the kernel with two
> toolchains and it gets really horrible.
> 
> Do you have any ideas?

Assuming a GCC+binutils toolchain, I don't have any workable ideas for
generating an ARMv7 shared object during an arm64 kernel build without
relying on a second compiler.  I think theoretically you could do
something like what kuser32.S does, but that would be untenable for
something as complex as a vdso.

I recognize that this might present an awkward situation where an ARMv7
program could incur more system call overhead on arm64 than it does on
arm, but I don't see any way around it.
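
The overhead in question can be estimated from user space. The following
is a minimal sketch, not part of the patch series: it assumes a Linux
system whose libc exposes syscall() and SYS_clock_gettime, and the helper
names (ts_diff_ns, avg_ns_libc, avg_ns_syscall) are made up for
illustration. It times clock_gettime() through libc, which may or may not
be routed through a vDSO depending on the kernel and libc in use, against
the same request forced through the system call path.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Nanoseconds elapsed from *a to *b. */
static int64_t ts_diff_ns(const struct timespec *a, const struct timespec *b)
{
	return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
		+ (int64_t)(b->tv_nsec - a->tv_nsec);
}

/*
 * Average cost, in ns, of one clock_gettime() call made through libc.
 * Whether this uses a vDSO is up to the kernel and libc (assumption).
 */
static int64_t avg_ns_libc(clockid_t clk, int iters)
{
	struct timespec start, end, tmp;

	assert(iters > 0);
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iters; i++)
		clock_gettime(clk, &tmp);
	clock_gettime(CLOCK_MONOTONIC, &end);
	return ts_diff_ns(&start, &end) / iters;
}

/*
 * Average cost, in ns, of one clock_gettime request that always enters
 * the kernel.  The result in tmp is discarded; only the timing matters.
 */
static int64_t avg_ns_syscall(clockid_t clk, int iters)
{
	struct timespec start, end, tmp;

	assert(iters > 0);
	clock_gettime(CLOCK_MONOTONIC, &start);
	for (int i = 0; i < iters; i++)
		syscall(SYS_clock_gettime, clk, &tmp);
	clock_gettime(CLOCK_MONOTONIC, &end);
	return ts_diff_ns(&start, &end) / iters;
}
```

Calling both helpers from main() with the same clock id (for example
CLOCK_MONOTONIC) and a large iteration count, then printing the two
averages, gives a rough per-call comparison. The numbers depend heavily on
the CPU, kernel configuration, and libc, so treat them as indicative only.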

Thread overview: 20+ messages
2014-08-22 21:52 [PATCH v9 0/6] ARM: VDSO Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 1/6] ARM: use _install_special_mapping for sigpage Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 2/6] ARM: place sigpage at a random offset above stack Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 3/6] ARM: miscellaneous vdso infrastructure, preparation Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 4/6] ARM: add vdso user-space code Nathan Lynch
2014-09-10 16:47   ` Will Deacon
2014-09-10 16:52     ` Andy Lutomirski
2014-09-10 17:10       ` Will Deacon
2014-09-10 17:25         ` Nathan Lynch
2014-09-12  6:50     ` Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 5/6] ARM: vdso initialization, mapping, and synchronization Nathan Lynch
2014-08-22 21:52 ` [PATCH v9 6/6] ARM: add CONFIG_VDSO Kconfig and Makefile bits Nathan Lynch
2014-08-27 20:49 ` [PATCH v9 0/6] ARM: VDSO Christopher Covington
2014-08-27 21:42   ` Andy Lutomirski
2014-09-03  5:41   ` Nathan Lynch
2014-09-03 13:13     ` Christopher Covington
2014-09-03 16:59     ` Andy Lutomirski
2014-09-03 20:03       ` Nathan Lynch
2014-09-03 20:12         ` Andy Lutomirski
2014-09-06  2:32 ` Nathan Lynch
