linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/3] MIPS VDSO support
@ 2015-09-28 10:03 Markos Chandras
  2015-09-28 10:10 ` [PATCH 1/3] MIPS: Initial implementation of a VDSO Markos Chandras
                   ` (2 more replies)
  0 siblings, 3 replies; 35+ messages in thread
From: Markos Chandras @ 2015-09-28 10:03 UTC (permalink / raw)
  To: linux-mips; +Cc: alex, Markos Chandras, linux-kernel

Hi,

This series adds a proper VDSO to the kernel on MIPS. The first commit
adds the basic VDSO, replacing the current signal return trampoline
page. The following commits add user implementations of gettimeofday() and
clock_gettime() which can make use of either the CP0 count or the GIC
user-mode visible section.
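
As a rough illustration of how user space can check that the VDSO from the
first commit is mapped, the sketch below reads AT_SYSINFO_EHDR (which patch 1
adds to the MIPS auxiliary vector) and verifies that the address holds an ELF
shared object. The helper names vdso_base() and is_elf_shared_object() are
made up for this example; getauxval() is the glibc interface and is assumed to
be available.

```c
#include <assert.h>
#include <elf.h>
#include <stdbool.h>
#include <string.h>
#include <sys/auxv.h>

/* Hypothetical helper: VDSO base from the auxiliary vector, NULL if absent. */
static const void *vdso_base(void)
{
	return (const void *)getauxval(AT_SYSINFO_EHDR);
}

/*
 * Hypothetical helper: check for the ELF magic and the ET_DYN type.
 * e_ident and e_type sit at the same offsets in ELF32 and ELF64 headers,
 * so an Elf32_Ehdr view is enough for both. The image is read in-process,
 * so no byte swapping is needed here.
 */
static bool is_elf_shared_object(const void *base)
{
	Elf32_Ehdr ehdr;

	assert(base != NULL);
	memcpy(&ehdr, base, sizeof(ehdr));

	return memcmp(ehdr.e_ident, ELFMAG, SELFMAG) == 0 &&
	       ehdr.e_type == ET_DYN;
}
```

A caller would first test vdso_base() for NULL, since kernels without a VDSO
do not provide the entry.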

A tree with these changes can be found at [1]. It is based on v4.3-rc3.

Use of the time functions relies on glibc modifications. A patch for
this can be found in my repository at [2] and I will soon post it to the glibc
mailing list.

[1]: http://git.linux-mips.org/cgit/mchandras/linux.git/log/?h=4.3-vdso
[2]: https://github.com/hwoarang/glibc/tree/2.22-vdso

Alex Smith (3):
  MIPS: Initial implementation of a VDSO
  irqchip: irq-mips-gic: Provide function to map GIC user section
  MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()

 arch/mips/Kbuild                     |   1 +
 arch/mips/Kconfig                    |   5 +
 arch/mips/include/asm/abi.h          |   5 +-
 arch/mips/include/asm/clocksource.h  |  29 ++++
 arch/mips/include/asm/elf.h          |   7 +
 arch/mips/include/asm/processor.h    |   8 +-
 arch/mips/include/asm/vdso.h         | 139 +++++++++++++++--
 arch/mips/include/uapi/asm/Kbuild    |   2 +-
 arch/mips/include/uapi/asm/auxvec.h  |  17 ++
 arch/mips/kernel/csrc-r4k.c          |  44 ++++++
 arch/mips/kernel/signal.c            |  12 +-
 arch/mips/kernel/signal32.c          |   7 +-
 arch/mips/kernel/signal_n32.c        |   5 +-
 arch/mips/kernel/vdso.c              | 198 ++++++++++++++---------
 arch/mips/vdso/.gitignore            |   4 +
 arch/mips/vdso/Makefile              | 142 +++++++++++++++++
 arch/mips/vdso/elf.S                 |  68 ++++++++
 arch/mips/vdso/genvdso.c             | 294 +++++++++++++++++++++++++++++++++++
 arch/mips/vdso/genvdso.h             | 188 ++++++++++++++++++++++
 arch/mips/vdso/gettimeofday.c        | 232 +++++++++++++++++++++++++++
 arch/mips/vdso/sigreturn.S           |  49 ++++++
 arch/mips/vdso/vdso.h                |  84 ++++++++++
 arch/mips/vdso/vdso.lds.S            | 103 ++++++++++++
 drivers/clocksource/mips-gic-timer.c |   7 +-
 drivers/irqchip/irq-mips-gic.c       |  27 +++-
 include/linux/irqchip/mips-gic.h     |  24 ++-
 26 files changed, 1572 insertions(+), 129 deletions(-)
 create mode 100644 arch/mips/include/asm/clocksource.h
 create mode 100644 arch/mips/include/uapi/asm/auxvec.h
 create mode 100644 arch/mips/vdso/.gitignore
 create mode 100644 arch/mips/vdso/Makefile
 create mode 100644 arch/mips/vdso/elf.S
 create mode 100644 arch/mips/vdso/genvdso.c
 create mode 100644 arch/mips/vdso/genvdso.h
 create mode 100644 arch/mips/vdso/gettimeofday.c
 create mode 100644 arch/mips/vdso/sigreturn.S
 create mode 100644 arch/mips/vdso/vdso.h
 create mode 100644 arch/mips/vdso/vdso.lds.S

-- 
2.5.3



* [PATCH 1/3] MIPS: Initial implementation of a VDSO
  2015-09-28 10:03 [PATCH 0/3] MIPS VDSO support Markos Chandras
@ 2015-09-28 10:10 ` Markos Chandras
  2015-09-28 10:54   ` Alex Smith
  2015-09-28 10:11 ` [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section Markos Chandras
  2015-09-28 10:12 ` [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime() Markos Chandras
  2 siblings, 1 reply; 35+ messages in thread
From: Markos Chandras @ 2015-09-28 10:10 UTC (permalink / raw)
  To: linux-mips
  Cc: alex, Alex Smith, linux-kernel, Matthew Fortune, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

Add an initial implementation of a proper (i.e. an ELF shared library)
VDSO. With this commit it does not export any symbols; it only replaces
the current signal return trampoline page. A later commit will add user
implementations of gettimeofday()/clock_gettime().

To support both new toolchains and older ones that do not generate the ABI
flags section, we define its contents manually and then use a tool
(genvdso) to patch up the section to have the correct name and type.
genvdso also extracts symbol offsets ({,rt_}sigreturn) needed by the
kernel, and generates a C file with a "struct mips_vdso_image" that
holds both the VDSO data and these offsets. This C file is
compiled into the kernel.
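
The part of genvdso that writes the C file is not visible in this excerpt, so
the following is only a hypothetical sketch of the general technique: dumping
a binary image as a C array that can then be compiled into another program.
The function name emit_image_array() and the array name vdso_data are
assumptions made for the example, not names taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: write `size` bytes of `data` to `out` as a C array
 * named vdso_data. Returns 0 on success, -1 if the stream reports an error.
 */
static int emit_image_array(FILE *out, const unsigned char *data, size_t size)
{
	size_t i;

	assert(out != NULL && data != NULL);

	fprintf(out, "static unsigned char vdso_data[%zu] = {\n", size);
	for (i = 0; i < size; i++) {
		/* Twelve bytes per line keeps the generated file readable. */
		if (i % 12 == 0)
			fputc('\t', out);
		fprintf(out, "0x%02x,", (unsigned int)data[i]);
		fputc((i % 12 == 11 || i + 1 == size) ? '\n' : ' ', out);
	}
	fputs("};\n", out);

	return ferror(out) ? -1 : 0;
}
```

A real generator would also emit the structure that refers to the array and
the extracted symbol offsets.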

On 64-bit kernels we require a different VDSO for each supported ABI,
so we may build up to 3 different VDSOs. The VDSO to use is selected by
the mips_abi structure.
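
The selection can be pictured with a small user-space model: each ABI
descriptor points at its own image, and the signal code adds the image's
symbol offset to the address at which the VDSO is mapped. The types
vdso_image_model and abi_model are simplified stand-ins for struct
mips_vdso_image and struct mips_abi, and sigreturn_trampoline() is a
hypothetical name; only the offset arithmetic mirrors what handle_signal()
does in this patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-in for struct mips_vdso_image (offsets only). */
struct vdso_image_model {
	unsigned long off_sigreturn;
	unsigned long off_rt_sigreturn;
};

/* Simplified stand-in for struct mips_abi (VDSO pointer only). */
struct abi_model {
	const struct vdso_image_model *vdso;
};

/*
 * Hypothetical helper: address of the trampoline a signal frame should
 * return through, given where this task's VDSO image is mapped.
 */
static uintptr_t sigreturn_trampoline(uintptr_t vdso_base,
				      const struct abi_model *abi,
				      int uses_siginfo)
{
	assert(abi != NULL && abi->vdso != NULL);

	return vdso_base + (uses_siginfo ? abi->vdso->off_rt_sigreturn
					 : abi->vdso->off_sigreturn);
}
```

Because the offsets live in the per-ABI image, an O32 task on a 64-bit kernel
gets O32 trampolines without any further special casing.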

A kernel/user shared data page is created and mapped below the VDSO
image. This is currently empty, but will be used by the user time
function implementations which are added later.
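
The resulting address layout is simple arithmetic: one data page at the start
of the reserved area, followed directly by the VDSO image. The sketch below
models it in user space; struct vdso_layout and vdso_layout_from_base() are
hypothetical names, and the page size is passed in rather than taken from
PAGE_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of the area reserved for the data page and VDSO. */
struct vdso_layout {
	uintptr_t data_page;	/* kernel/user shared data page */
	uintptr_t image;	/* start of the VDSO image */
	uintptr_t end;		/* first address past the image */
};

/*
 * Hypothetical helper mirroring arch_setup_additional_pages(): the area
 * spans page_size + image_size bytes and the image starts one page above
 * the base.
 */
static struct vdso_layout vdso_layout_from_base(uintptr_t base,
						size_t page_size,
						size_t image_size)
{
	struct vdso_layout l;

	assert(page_size != 0);
	assert(base % page_size == 0 && image_size % page_size == 0);

	l.data_page = base;
	l.image = base + page_size;
	l.end = l.image + image_size;

	return l;
}
```

Code inside the VDSO can therefore locate the data page at a fixed negative
offset from its own base address.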

[markos.chandras@imgtec.com:
- Add more comments
- Move ABI detection into genvdso.h, since it is the get_symbol function
that needs it.
- Add an R6-specific way to calculate the base address of the VDSO in order
to avoid the branch instruction, which affects performance.
- Do not patch .gnu.attributes since it's not needed for dynamic linking.
- Simplify Makefile a little bit.
- checkpatch fixes]

Cc: linux-kernel@vger.kernel.org
Cc: Matthew Fortune <matthew.fortune@imgtec.com>
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
 arch/mips/Kbuild                    |   1 +
 arch/mips/include/asm/abi.h         |   5 +-
 arch/mips/include/asm/elf.h         |   7 +
 arch/mips/include/asm/processor.h   |   8 +-
 arch/mips/include/asm/vdso.h        |  73 +++++++--
 arch/mips/include/uapi/asm/Kbuild   |   2 +-
 arch/mips/include/uapi/asm/auxvec.h |  17 +++
 arch/mips/kernel/signal.c           |  12 +-
 arch/mips/kernel/signal32.c         |   7 +-
 arch/mips/kernel/signal_n32.c       |   5 +-
 arch/mips/kernel/vdso.c             | 160 ++++++++++----------
 arch/mips/vdso/.gitignore           |   4 +
 arch/mips/vdso/Makefile             | 142 +++++++++++++++++
 arch/mips/vdso/elf.S                |  68 +++++++++
 arch/mips/vdso/genvdso.c            | 293 ++++++++++++++++++++++++++++++++++++
 arch/mips/vdso/genvdso.h            | 187 +++++++++++++++++++++++
 arch/mips/vdso/sigreturn.S          |  49 ++++++
 arch/mips/vdso/vdso.h               |  79 ++++++++++
 arch/mips/vdso/vdso.lds.S           | 100 ++++++++++++
 19 files changed, 1095 insertions(+), 124 deletions(-)
 create mode 100644 arch/mips/include/uapi/asm/auxvec.h
 create mode 100644 arch/mips/vdso/.gitignore
 create mode 100644 arch/mips/vdso/Makefile
 create mode 100644 arch/mips/vdso/elf.S
 create mode 100644 arch/mips/vdso/genvdso.c
 create mode 100644 arch/mips/vdso/genvdso.h
 create mode 100644 arch/mips/vdso/sigreturn.S
 create mode 100644 arch/mips/vdso/vdso.h
 create mode 100644 arch/mips/vdso/vdso.lds.S

diff --git a/arch/mips/Kbuild b/arch/mips/Kbuild
index dd295335891a..5c3f688a5232 100644
--- a/arch/mips/Kbuild
+++ b/arch/mips/Kbuild
@@ -17,6 +17,7 @@ obj- := $(platform-)
 obj-y += kernel/
 obj-y += mm/
 obj-y += net/
+obj-y += vdso/
 
 ifdef CONFIG_KVM
 obj-y += kvm/
diff --git a/arch/mips/include/asm/abi.h b/arch/mips/include/asm/abi.h
index 37f84078e78a..940760844e2f 100644
--- a/arch/mips/include/asm/abi.h
+++ b/arch/mips/include/asm/abi.h
@@ -11,19 +11,20 @@
 
 #include <asm/signal.h>
 #include <asm/siginfo.h>
+#include <asm/vdso.h>
 
 struct mips_abi {
 	int (* const setup_frame)(void *sig_return, struct ksignal *ksig,
 				  struct pt_regs *regs, sigset_t *set);
-	const unsigned long	signal_return_offset;
 	int (* const setup_rt_frame)(void *sig_return, struct ksignal *ksig,
 				     struct pt_regs *regs, sigset_t *set);
-	const unsigned long	rt_signal_return_offset;
 	const unsigned long	restart;
 
 	unsigned	off_sc_fpregs;
 	unsigned	off_sc_fpc_csr;
 	unsigned	off_sc_used_math;
+
+	struct mips_vdso_image *vdso;
 };
 
 #endif /* _ASM_ABI_H */
diff --git a/arch/mips/include/asm/elf.h b/arch/mips/include/asm/elf.h
index 53b26933b12c..b01a6ff468e0 100644
--- a/arch/mips/include/asm/elf.h
+++ b/arch/mips/include/asm/elf.h
@@ -8,6 +8,7 @@
 #ifndef _ASM_ELF_H
 #define _ASM_ELF_H
 
+#include <linux/auxvec.h>
 #include <linux/fs.h>
 #include <uapi/linux/elf.h>
 
@@ -419,6 +420,12 @@ extern const char *__elf_platform;
 #define ELF_ET_DYN_BASE		(TASK_SIZE / 3 * 2)
 #endif
 
+#define ARCH_DLINFO							\
+do {									\
+	NEW_AUX_ENT(AT_SYSINFO_EHDR,					\
+		    (unsigned long)current->mm->context.vdso);		\
+} while (0)
+
 #define ARCH_HAS_SETUP_ADDITIONAL_PAGES 1
 struct linux_binprm;
 extern int arch_setup_additional_pages(struct linux_binprm *bprm,
diff --git a/arch/mips/include/asm/processor.h b/arch/mips/include/asm/processor.h
index 59ee6dcf6eed..3f832c3dd8f5 100644
--- a/arch/mips/include/asm/processor.h
+++ b/arch/mips/include/asm/processor.h
@@ -36,12 +36,6 @@ extern unsigned int vced_count, vcei_count;
  */
 #define HAVE_ARCH_PICK_MMAP_LAYOUT 1
 
-/*
- * A special page (the vdso) is mapped into all processes at the very
- * top of the virtual memory space.
- */
-#define SPECIAL_PAGES_SIZE PAGE_SIZE
-
 #ifdef CONFIG_32BIT
 #ifdef CONFIG_KVM_GUEST
 /* User space process size is limited to 1GB in KVM Guest Mode */
@@ -80,7 +74,7 @@ extern unsigned int vced_count, vcei_count;
 
 #endif
 
-#define STACK_TOP	((TASK_SIZE & PAGE_MASK) - SPECIAL_PAGES_SIZE)
+#define STACK_TOP	(TASK_SIZE & PAGE_MASK)
 
 /*
  * This decides where the kernel will search for a free chunk of vm
diff --git a/arch/mips/include/asm/vdso.h b/arch/mips/include/asm/vdso.h
index cca56aa40ff4..db2d45be8f2e 100644
--- a/arch/mips/include/asm/vdso.h
+++ b/arch/mips/include/asm/vdso.h
@@ -1,29 +1,70 @@
 /*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
  *
- * Copyright (C) 2009 Cavium Networks
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
  */
 
 #ifndef __ASM_VDSO_H
 #define __ASM_VDSO_H
 
-#include <linux/types.h>
+#include <linux/mm_types.h>
 
+/**
+ * struct mips_vdso_image - Details of a VDSO image.
+ * @data: Pointer to VDSO image data (page-aligned).
+ * @size: Size of the VDSO image data (page-aligned).
+ * @off_sigreturn: Offset of the sigreturn() trampoline.
+ * @off_rt_sigreturn: Offset of the rt_sigreturn() trampoline.
+ * @mapping: Special mapping structure.
+ *
+ * This structure contains details of a VDSO image, including the image data
+ * and offsets of certain symbols required by the kernel. It is generated as
+ * part of the VDSO build process, aside from the mapping page array, which is
+ * populated at runtime.
+ */
+struct mips_vdso_image {
+	void *data;
+	unsigned long size;
 
-#ifdef CONFIG_32BIT
-struct mips_vdso {
-	u32 signal_trampoline[2];
-	u32 rt_signal_trampoline[2];
+	unsigned long off_sigreturn;
+	unsigned long off_rt_sigreturn;
+
+	struct vm_special_mapping mapping;
 };
-#else  /* !CONFIG_32BIT */
-struct mips_vdso {
-	u32 o32_signal_trampoline[2];
-	u32 o32_rt_signal_trampoline[2];
-	u32 rt_signal_trampoline[2];
-	u32 n32_rt_signal_trampoline[2];
+
+/*
+ * The following structures are auto-generated as part of the build for each
+ * ABI by genvdso, see arch/mips/vdso/Makefile.
+ */
+
+extern struct mips_vdso_image vdso_image;
+
+#ifdef CONFIG_MIPS32_O32
+extern struct mips_vdso_image vdso_image_o32;
+#endif
+
+#ifdef CONFIG_MIPS32_N32
+extern struct mips_vdso_image vdso_image_n32;
+#endif
+
+/**
+ * union mips_vdso_data - Data provided by the kernel for the VDSO.
+ *
+ * This structure contains data needed by functions within the VDSO. It is
+ * populated by the kernel and mapped read-only into user memory.
+ *
+ * Note: Care should be taken when modifying as the layout must remain the same
+ * for both 64- and 32-bit (for 32-bit userland on 64-bit kernel).
+ */
+union mips_vdso_data {
+	struct {
+	};
+
+	u8 page[PAGE_SIZE];
 };
-#endif /* CONFIG_32BIT */
 
 #endif /* __ASM_VDSO_H */
diff --git a/arch/mips/include/uapi/asm/Kbuild b/arch/mips/include/uapi/asm/Kbuild
index 96fe7395ed8d..f2cf41461146 100644
--- a/arch/mips/include/uapi/asm/Kbuild
+++ b/arch/mips/include/uapi/asm/Kbuild
@@ -1,9 +1,9 @@
 # UAPI Header export list
 include include/uapi/asm-generic/Kbuild.asm
 
-generic-y += auxvec.h
 generic-y += ipcbuf.h
 
+header-y += auxvec.h
 header-y += bitfield.h
 header-y += bitsperlong.h
 header-y += break.h
diff --git a/arch/mips/include/uapi/asm/auxvec.h b/arch/mips/include/uapi/asm/auxvec.h
new file mode 100644
index 000000000000..c9c7195272c4
--- /dev/null
+++ b/arch/mips/include/uapi/asm/auxvec.h
@@ -0,0 +1,17 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifndef __ASM_AUXVEC_H
+#define __ASM_AUXVEC_H
+
+/* Location of VDSO image. */
+#define AT_SYSINFO_EHDR		33
+
+#endif /* __ASM_AUXVEC_H */
diff --git a/arch/mips/kernel/signal.c b/arch/mips/kernel/signal.c
index 2fec67bfc457..bf792e2839a6 100644
--- a/arch/mips/kernel/signal.c
+++ b/arch/mips/kernel/signal.c
@@ -36,7 +36,6 @@
 #include <asm/ucontext.h>
 #include <asm/cpu-features.h>
 #include <asm/war.h>
-#include <asm/vdso.h>
 #include <asm/dsp.h>
 #include <asm/inst.h>
 #include <asm/msa.h>
@@ -752,16 +751,15 @@ static int setup_rt_frame(void *sig_return, struct ksignal *ksig,
 struct mips_abi mips_abi = {
 #ifdef CONFIG_TRAD_SIGNALS
 	.setup_frame	= setup_frame,
-	.signal_return_offset = offsetof(struct mips_vdso, signal_trampoline),
 #endif
 	.setup_rt_frame = setup_rt_frame,
-	.rt_signal_return_offset =
-		offsetof(struct mips_vdso, rt_signal_trampoline),
 	.restart	= __NR_restart_syscall,
 
 	.off_sc_fpregs = offsetof(struct sigcontext, sc_fpregs),
 	.off_sc_fpc_csr = offsetof(struct sigcontext, sc_fpc_csr),
 	.off_sc_used_math = offsetof(struct sigcontext, sc_used_math),
+
+	.vdso		= &vdso_image,
 };
 
 static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
@@ -801,11 +799,11 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
 	}
 
 	if (sig_uses_siginfo(&ksig->ka))
-		ret = abi->setup_rt_frame(vdso + abi->rt_signal_return_offset,
+		ret = abi->setup_rt_frame(vdso + abi->vdso->off_rt_sigreturn,
 					  ksig, regs, oldset);
 	else
-		ret = abi->setup_frame(vdso + abi->signal_return_offset, ksig,
-				       regs, oldset);
+		ret = abi->setup_frame(vdso + abi->vdso->off_sigreturn,
+				       ksig, regs, oldset);
 
 	signal_setup_done(ret, ksig, 0);
 }
diff --git a/arch/mips/kernel/signal32.c b/arch/mips/kernel/signal32.c
index f7e89524e316..4909639aa35b 100644
--- a/arch/mips/kernel/signal32.c
+++ b/arch/mips/kernel/signal32.c
@@ -31,7 +31,6 @@
 #include <asm/ucontext.h>
 #include <asm/fpu.h>
 #include <asm/war.h>
-#include <asm/vdso.h>
 #include <asm/dsp.h>
 
 #include "signal-common.h"
@@ -406,14 +405,12 @@ static int setup_rt_frame_32(void *sig_return, struct ksignal *ksig,
  */
 struct mips_abi mips_abi_32 = {
 	.setup_frame	= setup_frame_32,
-	.signal_return_offset =
-		offsetof(struct mips_vdso, o32_signal_trampoline),
 	.setup_rt_frame = setup_rt_frame_32,
-	.rt_signal_return_offset =
-		offsetof(struct mips_vdso, o32_rt_signal_trampoline),
 	.restart	= __NR_O32_restart_syscall,
 
 	.off_sc_fpregs = offsetof(struct sigcontext32, sc_fpregs),
 	.off_sc_fpc_csr = offsetof(struct sigcontext32, sc_fpc_csr),
 	.off_sc_used_math = offsetof(struct sigcontext32, sc_used_math),
+
+	.vdso		= &vdso_image_o32,
 };
diff --git a/arch/mips/kernel/signal_n32.c b/arch/mips/kernel/signal_n32.c
index 0d017fdcaf07..a7bc38430500 100644
--- a/arch/mips/kernel/signal_n32.c
+++ b/arch/mips/kernel/signal_n32.c
@@ -38,7 +38,6 @@
 #include <asm/fpu.h>
 #include <asm/cpu-features.h>
 #include <asm/war.h>
-#include <asm/vdso.h>
 
 #include "signal-common.h"
 
@@ -151,11 +150,11 @@ static int setup_rt_frame_n32(void *sig_return, struct ksignal *ksig,
 
 struct mips_abi mips_abi_n32 = {
 	.setup_rt_frame = setup_rt_frame_n32,
-	.rt_signal_return_offset =
-		offsetof(struct mips_vdso, n32_rt_signal_trampoline),
 	.restart	= __NR_N32_restart_syscall,
 
 	.off_sc_fpregs = offsetof(struct sigcontext, sc_fpregs),
 	.off_sc_fpc_csr = offsetof(struct sigcontext, sc_fpc_csr),
 	.off_sc_used_math = offsetof(struct sigcontext, sc_used_math),
+
+	.vdso		= &vdso_image_n32,
 };
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index ed2a278722a9..56cc3c4377fb 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -1,122 +1,116 @@
 /*
- * This file is subject to the terms and conditions of the GNU General Public
- * License.  See the file "COPYING" in the main directory of this archive
- * for more details.
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
  *
- * Copyright (C) 2009, 2010 Cavium Networks, Inc.
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
  */
 
-
-#include <linux/kernel.h>
-#include <linux/err.h>
-#include <linux/sched.h>
-#include <linux/mm.h>
-#include <linux/init.h>
 #include <linux/binfmts.h>
 #include <linux/elf.h>
-#include <linux/vmalloc.h>
-#include <linux/unistd.h>
-#include <linux/random.h>
+#include <linux/err.h>
+#include <linux/init.h>
+#include <linux/mm.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
 
+#include <asm/abi.h>
 #include <asm/vdso.h>
-#include <asm/uasm.h>
-#include <asm/processor.h>
+
+/* Kernel-provided data used by the VDSO. */
+static union mips_vdso_data vdso_data __page_aligned_data;
 
 /*
- * Including <asm/unistd.h> would give use the 64-bit syscall numbers ...
+ * Mapping for the VDSO data pages. The real pages are mapped manually, as
+ * what we map and where within the area they are mapped is determined at
+ * runtime.
  */
-#define __NR_O32_sigreturn		4119
-#define __NR_O32_rt_sigreturn		4193
-#define __NR_N32_rt_sigreturn		6211
+static struct page *no_pages[] = { NULL };
+static struct vm_special_mapping vdso_vvar_mapping = {
+	.name = "[vvar]",
+	.pages = no_pages,
+};
 
-static struct page *vdso_page;
-
-static void __init install_trampoline(u32 *tramp, unsigned int sigreturn)
+static void __init init_vdso_image(struct mips_vdso_image *image)
 {
-	uasm_i_addiu(&tramp, 2, 0, sigreturn);	/* li v0, sigreturn */
-	uasm_i_syscall(&tramp, 0);
+	unsigned long num_pages, i;
+
+	BUG_ON(!PAGE_ALIGNED(image->data));
+	BUG_ON(!PAGE_ALIGNED(image->size));
+
+	num_pages = image->size / PAGE_SIZE;
+
+	for (i = 0; i < num_pages; i++) {
+		image->mapping.pages[i] =
+			virt_to_page(image->data + (i * PAGE_SIZE));
+	}
 }
 
 static int __init init_vdso(void)
 {
-	struct mips_vdso *vdso;
-
-	vdso_page = alloc_page(GFP_KERNEL);
-	if (!vdso_page)
-		panic("Cannot allocate vdso");
-
-	vdso = vmap(&vdso_page, 1, 0, PAGE_KERNEL);
-	if (!vdso)
-		panic("Cannot map vdso");
-	clear_page(vdso);
-
-	install_trampoline(vdso->rt_signal_trampoline, __NR_rt_sigreturn);
-#ifdef CONFIG_32BIT
-	install_trampoline(vdso->signal_trampoline, __NR_sigreturn);
-#else
-	install_trampoline(vdso->n32_rt_signal_trampoline,
-			   __NR_N32_rt_sigreturn);
-	install_trampoline(vdso->o32_signal_trampoline, __NR_O32_sigreturn);
-	install_trampoline(vdso->o32_rt_signal_trampoline,
-			   __NR_O32_rt_sigreturn);
+	init_vdso_image(&vdso_image);
+
+#ifdef CONFIG_MIPS32_O32
+	init_vdso_image(&vdso_image_o32);
 #endif
 
-	vunmap(vdso);
+#ifdef CONFIG_MIPS32_N32
+	init_vdso_image(&vdso_image_n32);
+#endif
 
 	return 0;
 }
 subsys_initcall(init_vdso);
 
-static unsigned long vdso_addr(unsigned long start)
-{
-	unsigned long offset = 0UL;
-
-	if (current->flags & PF_RANDOMIZE) {
-		offset = get_random_int();
-		offset <<= PAGE_SHIFT;
-		if (TASK_IS_32BIT_ADDR)
-			offset &= 0xfffffful;
-		else
-			offset &= 0xffffffful;
-	}
-
-	return STACK_TOP + offset;
-}
-
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
-	int ret;
-	unsigned long addr;
+	struct mips_vdso_image *image = current->thread.abi->vdso;
 	struct mm_struct *mm = current->mm;
+	unsigned long base, vdso_addr;
+	struct vm_area_struct *vma;
+	int ret;
 
 	down_write(&mm->mmap_sem);
 
-	addr = vdso_addr(mm->start_stack);
-
-	addr = get_unmapped_area(NULL, addr, PAGE_SIZE, 0, 0);
-	if (IS_ERR_VALUE(addr)) {
-		ret = addr;
-		goto up_fail;
+	base = get_unmapped_area(NULL, 0, PAGE_SIZE + image->size, 0, 0);
+	if (IS_ERR_VALUE(base)) {
+		ret = base;
+		goto out;
 	}
 
-	ret = install_special_mapping(mm, addr, PAGE_SIZE,
-				      VM_READ|VM_EXEC|
-				      VM_MAYREAD|VM_MAYWRITE|VM_MAYEXEC,
-				      &vdso_page);
+	vdso_addr = base + PAGE_SIZE;
+
+	vma = _install_special_mapping(mm, base, PAGE_SIZE,
+				       VM_READ | VM_MAYREAD,
+				       &vdso_vvar_mapping);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out;
+	}
 
+	/* Map data page. */
+	ret = remap_pfn_range(vma, base,
+			      virt_to_phys(&vdso_data) >> PAGE_SHIFT,
+			      PAGE_SIZE, PAGE_READONLY);
 	if (ret)
-		goto up_fail;
+		goto out;
+
+	/* Map VDSO image. */
+	vma = _install_special_mapping(mm, vdso_addr, image->size,
+				       VM_READ | VM_EXEC |
+				       VM_MAYREAD | VM_MAYWRITE | VM_MAYEXEC,
+				       &image->mapping);
+	if (IS_ERR(vma)) {
+		ret = PTR_ERR(vma);
+		goto out;
+	}
 
-	mm->context.vdso = (void *)addr;
+	mm->context.vdso = (void *)vdso_addr;
+	ret = 0;
 
-up_fail:
+out:
 	up_write(&mm->mmap_sem);
 	return ret;
 }
-
-const char *arch_vma_name(struct vm_area_struct *vma)
-{
-	if (vma->vm_mm && vma->vm_start == (long)vma->vm_mm->context.vdso)
-		return "[vdso]";
-	return NULL;
-}
diff --git a/arch/mips/vdso/.gitignore b/arch/mips/vdso/.gitignore
new file mode 100644
index 000000000000..5286a7d73d79
--- /dev/null
+++ b/arch/mips/vdso/.gitignore
@@ -0,0 +1,4 @@
+*.so*
+vdso-*image.c
+genvdso
+vdso*.lds
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
new file mode 100644
index 000000000000..9a8a6b373eb0
--- /dev/null
+++ b/arch/mips/vdso/Makefile
@@ -0,0 +1,142 @@
+# Objects to go into the VDSO.
+obj-vdso-y := elf.o sigreturn.o
+
+# Common compiler flags between ABIs.
+ccflags-vdso := \
+	$(filter -I%,$(KBUILD_CFLAGS)) \
+	$(filter -E%,$(KBUILD_CFLAGS)) \
+	$(filter -march=%,$(KBUILD_CFLAGS))
+cflags-vdso := $(ccflags-vdso) \
+	$(filter -W%,$(filter-out -Wa$(comma)%,$(KBUILD_CFLAGS))) \
+	-O2 -g -fPIC -fno-common -fno-builtin -G 0 -DDISABLE_BRANCH_PROFILING \
+	$(call cc-option, -fno-stack-protector)
+aflags-vdso := $(ccflags-vdso) \
+	$(filter -I%,$(KBUILD_CFLAGS)) \
+	$(filter -E%,$(KBUILD_CFLAGS)) \
+	-D__ASSEMBLY__ -Wa,-gdwarf-2
+
+# VDSO linker flags.
+VDSO_LDFLAGS := \
+	-Wl,-Bsymbolic -Wl,--no-undefined -Wl,-soname=linux-vdso.so.1 \
+	-nostdlib -shared \
+	$(call cc-ldoption, -Wl$(comma)--hash-style=sysv) \
+	$(call cc-ldoption, -Wl$(comma)--build-id)
+
+GCOV_PROFILE := n
+
+#
+# Shared build commands.
+#
+
+quiet_cmd_vdsold = VDSO    $@
+      cmd_vdsold = $(CC) $(c_flags) $(VDSO_LDFLAGS) \
+                   -Wl,-T $(filter %.lds,$^) $(filter %.o,$^) -o $@
+
+hostprogs-y := genvdso
+
+quiet_cmd_genvdso = GENVDSO $@
+define cmd_genvdso
+	cp $< $(<:%.dbg=%) && \
+	$(OBJCOPY) -S $< $(<:%.dbg=%) && \
+	$(obj)/genvdso $< $(<:%.dbg=%) $@ $(VDSO_NAME)
+endef
+
+#
+# Build native VDSO.
+#
+
+native-abi := $(filter -mabi=%,$(KBUILD_CFLAGS))
+
+targets += $(obj-vdso-y)
+targets += vdso.lds vdso.so.dbg vdso.so vdso-image.c
+
+obj-vdso := $(obj-vdso-y:%.o=$(obj)/%.o)
+
+$(obj-vdso): KBUILD_CFLAGS := $(cflags-vdso) $(native-abi)
+$(obj-vdso): KBUILD_AFLAGS := $(aflags-vdso) $(native-abi)
+
+$(obj)/vdso.lds: KBUILD_CPPFLAGS := $(native-abi)
+
+$(obj)/vdso.so.dbg: $(obj)/vdso.lds $(obj-vdso) FORCE
+	$(call if_changed,vdsold)
+
+$(obj)/vdso-image.c: $(obj)/vdso.so.dbg $(obj)/genvdso FORCE
+	$(call if_changed,genvdso)
+
+obj-y += vdso-image.o
+
+#
+# Build O32 VDSO.
+#
+
+# Define these outside the ifdef to ensure they are picked up by clean.
+targets += $(obj-vdso-y:%.o=%-o32.o)
+targets += vdso-o32.lds vdso-o32.so.dbg vdso-o32.so vdso-o32-image.c
+
+ifdef CONFIG_MIPS32_O32
+
+obj-vdso-o32 := $(obj-vdso-y:%.o=$(obj)/%-o32.o)
+
+$(obj-vdso-o32): KBUILD_CFLAGS := $(cflags-vdso) -mabi=32
+$(obj-vdso-o32): KBUILD_AFLAGS := $(aflags-vdso) -mabi=32
+
+$(obj)/%-o32.o: $(src)/%.S FORCE
+	$(call if_changed_dep,as_o_S)
+
+$(obj)/%-o32.o: $(src)/%.c FORCE
+	$(call cmd,force_checksrc)
+	$(call if_changed_rule,cc_o_c)
+
+$(obj)/vdso-o32.lds: KBUILD_CPPFLAGS := -mabi=32
+$(obj)/vdso-o32.lds: $(src)/vdso.lds.S FORCE
+	$(call if_changed_dep,cpp_lds_S)
+
+$(obj)/vdso-o32.so.dbg: $(obj)/vdso-o32.lds $(obj-vdso-o32) FORCE
+	$(call if_changed,vdsold)
+
+$(obj)/vdso-o32-image.c: VDSO_NAME := o32
+$(obj)/vdso-o32-image.c: $(obj)/vdso-o32.so.dbg $(obj)/genvdso FORCE
+	$(call if_changed,genvdso)
+
+obj-y += vdso-o32-image.o
+
+endif
+
+#
+# Build N32 VDSO.
+#
+
+targets += $(obj-vdso-y:%.o=%-n32.o)
+targets += vdso-n32.lds vdso-n32.so.dbg vdso-n32.so vdso-n32-image.c
+
+ifdef CONFIG_MIPS32_N32
+
+obj-vdso-n32 := $(obj-vdso-y:%.o=$(obj)/%-n32.o)
+
+$(obj-vdso-n32): KBUILD_CFLAGS := $(cflags-vdso) -mabi=n32
+$(obj-vdso-n32): KBUILD_AFLAGS := $(aflags-vdso) -mabi=n32
+
+$(obj)/%-n32.o: $(src)/%.S FORCE
+	$(call if_changed_dep,as_o_S)
+
+$(obj)/%-n32.o: $(src)/%.c FORCE
+	$(call cmd,force_checksrc)
+	$(call if_changed_rule,cc_o_c)
+
+$(obj)/vdso-n32.lds: KBUILD_CPPFLAGS := -mabi=n32
+$(obj)/vdso-n32.lds: $(src)/vdso.lds.S FORCE
+	$(call if_changed_dep,cpp_lds_S)
+
+$(obj)/vdso-n32.so.dbg: $(obj)/vdso-n32.lds $(obj-vdso-n32) FORCE
+	$(call if_changed,vdsold)
+
+$(obj)/vdso-n32-image.c: VDSO_NAME := n32
+$(obj)/vdso-n32-image.c: $(obj)/vdso-n32.so.dbg $(obj)/genvdso FORCE
+	$(call if_changed,genvdso)
+
+obj-y += vdso-n32-image.o
+
+endif
+
+# FIXME: Need install rule for debug.
+# Needs to deal with dependency for generation of dbg by cmd_genvdso...
diff --git a/arch/mips/vdso/elf.S b/arch/mips/vdso/elf.S
new file mode 100644
index 000000000000..60c23d0d452c
--- /dev/null
+++ b/arch/mips/vdso/elf.S
@@ -0,0 +1,68 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include "vdso.h"
+
+#include <linux/elfnote.h>
+#include <linux/version.h>
+
+ELFNOTE_START(Linux, 0, "a")
+	.long LINUX_VERSION_CODE
+ELFNOTE_END
+
+/*
+ * The .MIPS.abiflags section must be defined with the FP ABI flags set
+ * to 'any' to be able to link with both old and new libraries.
+ * Newer toolchains are capable of automatically generating this, but we want
+ * to work with older toolchains as well. Therefore, we define the contents of
+ * this section here (under different names), and then genvdso will patch
+ * it to have the correct name and type.
+ *
+ * We base the .MIPS.abiflags section on preprocessor definitions rather than
+ * CONFIG_* because we need to match the particular ABI we are building the
+ * VDSO for.
+ *
+ * See https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
+ * for the .MIPS.abiflags and .gnu.attributes section description.
+ */
+
+	.section .mips_abiflags, "a"
+	.align 3
+__mips_abiflags:
+	.hword	0		/* version */
+	.byte	__mips		/* isa_level */
+
+	/* isa_rev */
+#ifdef __mips_isa_rev
+	.byte	__mips_isa_rev
+#else
+	.byte	0
+#endif
+
+	/* gpr_size */
+#ifdef __mips64
+	.byte	2		/* AFL_REG_64 */
+#else
+	.byte	1		/* AFL_REG_32 */
+#endif
+
+	/* cpr1_size */
+#if (defined(__mips_isa_rev) && __mips_isa_rev >= 6) || defined(__mips64)
+	.byte	2		/* AFL_REG_64 */
+#else
+	.byte	1		/* AFL_REG_32 */
+#endif
+
+	.byte	0		/* cpr2_size (AFL_REG_NONE) */
+	.byte	0		/* fp_abi (Val_GNU_MIPS_ABI_FP_ANY) */
+	.word	0		/* isa_ext */
+	.word	0		/* ases */
+	.word	0		/* flags1 */
+	.word	0		/* flags2 */
diff --git a/arch/mips/vdso/genvdso.c b/arch/mips/vdso/genvdso.c
new file mode 100644
index 000000000000..530a36f465ce
--- /dev/null
+++ b/arch/mips/vdso/genvdso.c
@@ -0,0 +1,293 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+/*
+ * This tool is used to generate the real VDSO images from the raw image. It
+ * first patches up the MIPS ABI flags and GNU attributes sections defined in
+ * elf.S to have the correct name and type. It then generates a C source file
+ * to be compiled into the kernel containing the VDSO image data and a
+ * mips_vdso_image struct for it, including symbol offsets extracted from the
+ * image.
+ *
+ * We need to be passed both a stripped and unstripped VDSO image. The stripped
+ * image is compiled into the kernel, but we must also patch up the unstripped
+ * image's ABI flags sections so that it can be installed and used for
+ * debugging.
+ */
+
+#include <sys/mman.h>
+#include <sys/stat.h>
+#include <sys/types.h>
+
+#include <byteswap.h>
+#include <elf.h>
+#include <errno.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <stdarg.h>
+#include <stdbool.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+/* Define these in case the system elf.h is not new enough to have them. */
+#ifndef SHT_GNU_ATTRIBUTES
+# define SHT_GNU_ATTRIBUTES	0x6ffffff5
+#endif
+#ifndef SHT_MIPS_ABIFLAGS
+# define SHT_MIPS_ABIFLAGS	0x7000002a
+#endif
+
+enum {
+	ABI_O32 = (1 << 0),
+	ABI_N32 = (1 << 1),
+	ABI_N64 = (1 << 2),
+
+	ABI_ALL = ABI_O32 | ABI_N32 | ABI_N64,
+};
+
+/* Symbols the kernel requires offsets for. */
+static struct {
+	const char *name;
+	const char *offset_name;
+	unsigned int abis;
+} vdso_symbols[] = {
+	{ "__vdso_sigreturn", "off_sigreturn", ABI_O32 },
+	{ "__vdso_rt_sigreturn", "off_rt_sigreturn", ABI_ALL },
+	{}
+};
+
+static const char *program_name;
+static const char *vdso_name;
+static unsigned char elf_class;
+static unsigned int elf_abi;
+static bool need_swap;
+static FILE *out_file;
+
+#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define HOST_ORDER		ELFDATA2LSB
+#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
+# define HOST_ORDER		ELFDATA2MSB
+#endif
+
+#define BUILD_SWAP(bits)						\
+	static uint##bits##_t swap_uint##bits(uint##bits##_t val)	\
+	{								\
+		return need_swap ? bswap_##bits(val) : val;		\
+	}
+
+BUILD_SWAP(16)
+BUILD_SWAP(32)
+BUILD_SWAP(64)
+
+#define __FUNC(name, bits) name##bits
+#define _FUNC(name, bits) __FUNC(name, bits)
+#define FUNC(name) _FUNC(name, ELF_BITS)
+
+#define __ELF(x, bits) Elf##bits##_##x
+#define _ELF(x, bits) __ELF(x, bits)
+#define ELF(x) _ELF(x, ELF_BITS)
+
+/*
+ * Include genvdso.h twice with ELF_BITS defined differently to get functions
+ * for both ELF32 and ELF64.
+ */
+
+#define ELF_BITS 64
+#include "genvdso.h"
+#undef ELF_BITS
+
+#define ELF_BITS 32
+#include "genvdso.h"
+#undef ELF_BITS
+
+static void *map_vdso(const char *path, size_t *_size)
+{
+	int fd;
+	struct stat stat;
+	void *addr;
+	const Elf32_Ehdr *ehdr;
+
+	fd = open(path, O_RDWR);
+	if (fd < 0) {
+		fprintf(stderr, "%s: Failed to open '%s': %s\n", program_name,
+			path, strerror(errno));
+		return NULL;
+	}
+
+	if (fstat(fd, &stat) != 0) {
+		fprintf(stderr, "%s: Failed to stat '%s': %s\n", program_name,
+			path, strerror(errno));
+		return NULL;
+	}
+
+	addr = mmap(NULL, stat.st_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd,
+		    0);
+	if (addr == MAP_FAILED) {
+		fprintf(stderr, "%s: Failed to map '%s': %s\n", program_name,
+			path, strerror(errno));
+		return NULL;
+	}
+
+	/* ELF32/64 header formats are the same for the bits we're checking. */
+	ehdr = addr;
+
+	if (memcmp(ehdr->e_ident, ELFMAG, SELFMAG) != 0) {
+		fprintf(stderr, "%s: '%s' is not an ELF file\n", program_name,
+			path);
+		return NULL;
+	}
+
+	elf_class = ehdr->e_ident[EI_CLASS];
+	switch (elf_class) {
+	case ELFCLASS32:
+	case ELFCLASS64:
+		break;
+	default:
+		fprintf(stderr, "%s: '%s' has invalid ELF class\n",
+			program_name, path);
+		return NULL;
+	}
+
+	switch (ehdr->e_ident[EI_DATA]) {
+	case ELFDATA2LSB:
+	case ELFDATA2MSB:
+		need_swap = ehdr->e_ident[EI_DATA] != HOST_ORDER;
+		break;
+	default:
+		fprintf(stderr, "%s: '%s' has invalid ELF data order\n",
+			program_name, path);
+		return NULL;
+	}
+
+	if (swap_uint16(ehdr->e_machine) != EM_MIPS) {
+		fprintf(stderr,
+			"%s: '%s' has invalid ELF machine (expected EM_MIPS)\n",
+			program_name, path);
+		return NULL;
+	} else if (swap_uint16(ehdr->e_type) != ET_DYN) {
+		fprintf(stderr,
+			"%s: '%s' has invalid ELF type (expected ET_DYN)\n",
+			program_name, path);
+		return NULL;
+	}
+
+	*_size = stat.st_size;
+	return addr;
+}
+
+static bool patch_vdso(const char *path, void *vdso)
+{
+	if (elf_class == ELFCLASS64)
+		return patch_vdso64(path, vdso);
+	else
+		return patch_vdso32(path, vdso);
+}
+
+static bool get_symbols(const char *path, void *vdso)
+{
+	if (elf_class == ELFCLASS64)
+		return get_symbols64(path, vdso);
+	else
+		return get_symbols32(path, vdso);
+}
+
+int main(int argc, char **argv)
+{
+	const char *dbg_vdso_path, *vdso_path, *out_path;
+	void *dbg_vdso, *vdso;
+	size_t dbg_vdso_size, vdso_size, i;
+
+	program_name = argv[0];
+
+	if (argc < 4 || argc > 5) {
+		fprintf(stderr,
+			"Usage: %s <debug VDSO> <stripped VDSO> <output file> [<name>]\n",
+			program_name);
+		return EXIT_FAILURE;
+	}
+
+	dbg_vdso_path = argv[1];
+	vdso_path = argv[2];
+	out_path = argv[3];
+	vdso_name = (argc > 4) ? argv[4] : "";
+
+	dbg_vdso = map_vdso(dbg_vdso_path, &dbg_vdso_size);
+	if (!dbg_vdso)
+		return EXIT_FAILURE;
+
+	vdso = map_vdso(vdso_path, &vdso_size);
+	if (!vdso)
+		return EXIT_FAILURE;
+
+	/* Patch both the VDSOs' ABI flags sections. */
+	if (!patch_vdso(dbg_vdso_path, dbg_vdso))
+		return EXIT_FAILURE;
+	if (!patch_vdso(vdso_path, vdso))
+		return EXIT_FAILURE;
+
+	if (msync(dbg_vdso, dbg_vdso_size, MS_SYNC) != 0) {
+		fprintf(stderr, "%s: Failed to sync '%s': %s\n", program_name,
+			dbg_vdso_path, strerror(errno));
+		return EXIT_FAILURE;
+	} else if (msync(vdso, vdso_size, MS_SYNC) != 0) {
+		fprintf(stderr, "%s: Failed to sync '%s': %s\n", program_name,
+			vdso_path, strerror(errno));
+		return EXIT_FAILURE;
+	}
+
+	out_file = fopen(out_path, "w");
+	if (!out_file) {
+		fprintf(stderr, "%s: Failed to open '%s': %s\n", program_name,
+			out_path, strerror(errno));
+		return EXIT_FAILURE;
+	}
+
+	fprintf(out_file, "/* Automatically generated - do not edit */\n");
+	fprintf(out_file, "#include <linux/linkage.h>\n");
+	fprintf(out_file, "#include <linux/mm.h>\n");
+	fprintf(out_file, "#include <asm/vdso.h>\n");
+
+	/* Write out the stripped VDSO data. */
+	fprintf(out_file,
+		"static unsigned char vdso_data[PAGE_ALIGN(%zu)] __page_aligned_data = {\n\t",
+		vdso_size);
+	for (i = 0; i < vdso_size; i++) {
+		if (!(i % 10))
+			fprintf(out_file, "\n\t");
+		fprintf(out_file, "0x%02x, ", ((unsigned char *)vdso)[i]);
+	}
+	fprintf(out_file, "\n};\n");
+
+	/* Preallocate a page array. */
+	fprintf(out_file,
+		"static struct page *vdso_pages[PAGE_ALIGN(%zu) / PAGE_SIZE];\n",
+		vdso_size);
+
+	fprintf(out_file, "struct mips_vdso_image vdso_image%s%s = {\n",
+		(vdso_name[0]) ? "_" : "", vdso_name);
+	fprintf(out_file, "\t.data = vdso_data,\n");
+	fprintf(out_file, "\t.size = PAGE_ALIGN(%zu),\n", vdso_size);
+	fprintf(out_file, "\t.mapping = {\n");
+	fprintf(out_file, "\t\t.name = \"[vdso]\",\n");
+	fprintf(out_file, "\t\t.pages = vdso_pages,\n");
+	fprintf(out_file, "\t},\n");
+
+	/* Calculate and write symbol offsets to <output file> */
+	if (!get_symbols(dbg_vdso_path, dbg_vdso)) {
+		unlink(out_path);
+		return EXIT_FAILURE;
+	}
+
+	fprintf(out_file, "};\n");
+
+	return EXIT_SUCCESS;
+}
diff --git a/arch/mips/vdso/genvdso.h b/arch/mips/vdso/genvdso.h
new file mode 100644
index 000000000000..94334727059a
--- /dev/null
+++ b/arch/mips/vdso/genvdso.h
@@ -0,0 +1,187 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+static inline bool FUNC(patch_vdso)(const char *path, void *vdso)
+{
+	const ELF(Ehdr) *ehdr = vdso;
+	void *shdrs;
+	ELF(Shdr) *shdr;
+	char *shstrtab, *name;
+	uint16_t sh_count, sh_entsize, i;
+	unsigned int local_gotno, symtabno, gotsym;
+	ELF(Dyn) *dyn = NULL;
+
+	shdrs = vdso + FUNC(swap_uint)(ehdr->e_shoff);
+	sh_count = swap_uint16(ehdr->e_shnum);
+	sh_entsize = swap_uint16(ehdr->e_shentsize);
+
+	shdr = shdrs + (sh_entsize * swap_uint16(ehdr->e_shstrndx));
+	shstrtab = vdso + FUNC(swap_uint)(shdr->sh_offset);
+
+	for (i = 0; i < sh_count; i++) {
+		shdr = shdrs + (i * sh_entsize);
+		name = shstrtab + swap_uint32(shdr->sh_name);
+
+		/*
+		 * Ensure there are no relocation sections - ld.so does not
+		 * relocate the VDSO so if there are relocations things will
+		 * break.
+		 */
+		switch (swap_uint32(shdr->sh_type)) {
+		case SHT_REL:
+		case SHT_RELA:
+			fprintf(stderr,
+				"%s: '%s' contains relocation sections\n",
+				program_name, path);
+			return false;
+		case SHT_DYNAMIC:
+			dyn = vdso + FUNC(swap_uint)(shdr->sh_offset);
+			break;
+		}
+
+		/* Check for existing sections. */
+		if (strcmp(name, ".MIPS.abiflags") == 0) {
+			fprintf(stderr,
+				"%s: '%s' already contains a '.MIPS.abiflags' section\n",
+				program_name, path);
+			return false;
+		}
+
+		if (strcmp(name, ".mips_abiflags") == 0) {
+			strcpy(name, ".MIPS.abiflags");
+			shdr->sh_type = swap_uint32(SHT_MIPS_ABIFLAGS);
+			shdr->sh_entsize = shdr->sh_size;
+		}
+	}
+
+	/*
+	 * Ensure the GOT has no entries other than the standard 2, for the same
+	 * reason we check that there are no relocation sections above.
+	 * The standard two entries are:
+	 * - Lazy resolver
+	 * - Module pointer
+	 */
+	if (dyn) {
+		local_gotno = symtabno = gotsym = 0;
+
+		while (FUNC(swap_uint)(dyn->d_tag) != DT_NULL) {
+			switch (FUNC(swap_uint)(dyn->d_tag)) {
+			/*
+			 * This member holds the number of local GOT entries.
+			 */
+			case DT_MIPS_LOCAL_GOTNO:
+				local_gotno = FUNC(swap_uint)(dyn->d_un.d_val);
+				break;
+			/*
+			 * This member holds the number of entries in the
+			 * .dynsym section.
+			 */
+			case DT_MIPS_SYMTABNO:
+				symtabno = FUNC(swap_uint)(dyn->d_un.d_val);
+				break;
+			/*
+			 * This member holds the index of the first dynamic
+			 * symbol table entry that corresponds to an entry in
+			 * the GOT.
+			 */
+			case DT_MIPS_GOTSYM:
+				gotsym = FUNC(swap_uint)(dyn->d_un.d_val);
+				break;
+			}
+
+			dyn++;
+		}
+
+		if (local_gotno > 2 || symtabno - gotsym) {
+			fprintf(stderr,
+				"%s: '%s' contains unexpected GOT entries\n",
+				program_name, path);
+			return false;
+		}
+	}
+
+	return true;
+}
+
+static inline bool FUNC(get_symbols)(const char *path, void *vdso)
+{
+	const ELF(Ehdr) *ehdr = vdso;
+	void *shdrs, *symtab;
+	ELF(Shdr) *shdr;
+	const ELF(Sym) *sym;
+	char *strtab, *name;
+	uint16_t sh_count, sh_entsize, st_count, st_entsize, i, j;
+	uint64_t offset;
+	uint32_t flags;
+
+	shdrs = vdso + FUNC(swap_uint)(ehdr->e_shoff);
+	sh_count = swap_uint16(ehdr->e_shnum);
+	sh_entsize = swap_uint16(ehdr->e_shentsize);
+
+	for (i = 0; i < sh_count; i++) {
+		shdr = shdrs + (i * sh_entsize);
+
+		if (swap_uint32(shdr->sh_type) == SHT_SYMTAB)
+			break;
+	}
+
+	if (i == sh_count) {
+		fprintf(stderr, "%s: '%s' has no symbol table\n", program_name,
+			path);
+		return false;
+	}
+
+	/* Get flags */
+	flags = swap_uint32(ehdr->e_flags);
+	if (elf_class == ELFCLASS64)
+		elf_abi = ABI_N64;
+	else if (flags & EF_MIPS_ABI2)
+		elf_abi = ABI_N32;
+	else
+		elf_abi = ABI_O32;
+
+	/* Get symbol table. */
+	symtab = vdso + FUNC(swap_uint)(shdr->sh_offset);
+	st_entsize = FUNC(swap_uint)(shdr->sh_entsize);
+	st_count = FUNC(swap_uint)(shdr->sh_size) / st_entsize;
+
+	/* Get string table. */
+	shdr = shdrs + (swap_uint32(shdr->sh_link) * sh_entsize);
+	strtab = vdso + FUNC(swap_uint)(shdr->sh_offset);
+
+	/* Write offsets for symbols needed by the kernel. */
+	for (i = 0; vdso_symbols[i].name; i++) {
+		if (!(vdso_symbols[i].abis & elf_abi))
+			continue;
+
+		for (j = 0; j < st_count; j++) {
+			sym = symtab + (j * st_entsize);
+			name = strtab + swap_uint32(sym->st_name);
+
+			if (!strcmp(name, vdso_symbols[i].name)) {
+				offset = FUNC(swap_uint)(sym->st_value);
+
+				fprintf(out_file,
+					"\t.%s = 0x%" PRIx64 ",\n",
+					vdso_symbols[i].offset_name, offset);
+				break;
+			}
+		}
+
+		if (j == st_count) {
+			fprintf(stderr,
+				"%s: '%s' is missing required symbol '%s'\n",
+				program_name, path, vdso_symbols[i].name);
+			return false;
+		}
+	}
+
+	return true;
+}
diff --git a/arch/mips/vdso/sigreturn.S b/arch/mips/vdso/sigreturn.S
new file mode 100644
index 000000000000..715bf5993529
--- /dev/null
+++ b/arch/mips/vdso/sigreturn.S
@@ -0,0 +1,49 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include "vdso.h"
+
+#include <uapi/asm/unistd.h>
+
+#include <asm/regdef.h>
+#include <asm/asm.h>
+
+	.section	.text
+	.cfi_sections	.debug_frame
+
+LEAF(__vdso_rt_sigreturn)
+	.cfi_startproc
+	.frame	sp, 0, ra
+	.mask	0x00000000, 0
+	.fmask	0x00000000, 0
+	.cfi_signal_frame
+
+	li	v0, __NR_rt_sigreturn
+	syscall
+
+	.cfi_endproc
+	END(__vdso_rt_sigreturn)
+
+#if _MIPS_SIM == _MIPS_SIM_ABI32
+
+LEAF(__vdso_sigreturn)
+	.cfi_startproc
+	.frame	sp, 0, ra
+	.mask	0x00000000, 0
+	.fmask	0x00000000, 0
+	.cfi_signal_frame
+
+	li	v0, __NR_sigreturn
+	syscall
+
+	.cfi_endproc
+	END(__vdso_sigreturn)
+
+#endif
diff --git a/arch/mips/vdso/vdso.h b/arch/mips/vdso/vdso.h
new file mode 100644
index 000000000000..64b98967e245
--- /dev/null
+++ b/arch/mips/vdso/vdso.h
@@ -0,0 +1,79 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <asm/sgidefs.h>
+
+#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_64BIT)
+
+/* Building 32-bit VDSO for the 64-bit kernel. Fake a 32-bit Kconfig. */
+#undef CONFIG_64BIT
+#define CONFIG_32BIT 1
+
+#endif
+
+#ifndef __ASSEMBLY__
+
+#include <asm/asm.h>
+#include <asm/page.h>
+#include <asm/vdso.h>
+
+static inline unsigned long get_vdso_base(void)
+{
+	unsigned long addr;
+
+	/*
+	 * Get the base load address of the VDSO. We have to avoid generating
+	 * relocations and references to the GOT because ld.so does not perform
+	 * relocations on the VDSO. We use the current offset from the VDSO base
+	 * and perform a PC-relative branch which gives the absolute address in
+	 * ra, and take the difference. The assembler chokes on
+	 * "li %0, _start - .", so embed the offset as a word and branch over
+	 * it.
+	 *
+	 * TODO: Is there a better way to do this?
+	 */
+
+#ifdef CONFIG_CPU_MIPSR6
+	/*
+	 * We can't use cpu_has_mips_r6 since it will create a relocation
+	 * in the VDSO because of the global cpu_data[] variable.
+	 */
+
+	/* lapc <symbol> is an alias to addiupc reg, <symbol> - .
+	 *
+	 * We can't use addiupc because there is no label-label
+	 * support for the addiupc reloc
+	 */
+	__asm__("lapc	%0, _start			\n"
+		: "=r" (addr) : :);
+#else
+	__asm__(
+	"	.set push				\n"
+	"	.set noreorder				\n"
+	"	bal	1f				\n"
+	"	 nop					\n"
+	"	.word	_start - .			\n"
+	"1:	lw	%0, 0($31)			\n"
+	"	" STR(PTR_ADDU) " %0, $31, %0		\n"
+	"	.set pop				\n"
+	: "=r" (addr)
+	:
+	: "$31");
+#endif /* CONFIG_CPU_MIPSR6 */
+
+	return addr;
+}
+
+static inline const union mips_vdso_data *get_vdso_data(void)
+{
+	return (const union mips_vdso_data *)(get_vdso_base() - PAGE_SIZE);
+}
+
+#endif /* __ASSEMBLY__ */
diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
new file mode 100644
index 000000000000..21655b6fefc5
--- /dev/null
+++ b/arch/mips/vdso/vdso.lds.S
@@ -0,0 +1,100 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include <asm/sgidefs.h>
+
+#if _MIPS_SIM == _MIPS_SIM_ABI64
+OUTPUT_FORMAT("elf64-tradlittlemips", "elf64-tradbigmips", "elf64-tradlittlemips")
+#elif _MIPS_SIM == _MIPS_SIM_NABI32
+OUTPUT_FORMAT("elf32-ntradlittlemips", "elf32-ntradbigmips", "elf32-ntradlittlemips")
+#else
+OUTPUT_FORMAT("elf32-tradlittlemips", "elf32-tradbigmips", "elf32-tradlittlemips")
+#endif
+
+OUTPUT_ARCH(mips)
+
+SECTIONS
+{
+	PROVIDE(_start = .);
+	. = SIZEOF_HEADERS;
+
+	/*
+	 * In order to retain compatibility with older toolchains we provide the
+	 * ABI flags section ourselves. Newer assemblers will automatically
+	 * generate .MIPS.abiflags sections so we discard such input sections,
+	 * and then manually define our own section here. genvdso will patch
+	 * this section to have the correct name/type.
+	 */
+	.mips_abiflags	: { *(.mips_abiflags) } 	:text :abiflags
+
+	.reginfo	: { *(.reginfo) }		:text :reginfo
+
+	.hash		: { *(.hash) }			:text
+	.gnu.hash	: { *(.gnu.hash) }
+	.dynsym		: { *(.dynsym) }
+	.dynstr		: { *(.dynstr) }
+	.gnu.version	: { *(.gnu.version) }
+	.gnu.version_d	: { *(.gnu.version_d) }
+	.gnu.version_r	: { *(.gnu.version_r) }
+
+	.note		: { *(.note.*) }		:text :note
+
+	.text		: { *(.text*) }			:text
+	PROVIDE (__etext = .);
+	PROVIDE (_etext = .);
+	PROVIDE (etext = .);
+
+	.eh_frame_hdr	: { *(.eh_frame_hdr) }		:text :eh_frame_hdr
+	.eh_frame	: { KEEP (*(.eh_frame)) }	:text
+
+	.dynamic	: { *(.dynamic) }		:text :dynamic
+
+	.rodata		: { *(.rodata*) }		:text
+
+	_end = .;
+	PROVIDE(end = .);
+
+	/DISCARD/	: {
+		*(.MIPS.abiflags)
+		*(.gnu.attributes)
+		*(.note.GNU-stack)
+		*(.data .data.* .gnu.linkonce.d.* .sdata*)
+		*(.bss .sbss .dynbss .dynsbss)
+	}
+}
+
+PHDRS
+{
+	/*
+	 * Provide a PT_MIPS_ABIFLAGS header to assign the ABI flags section
+	 * to. We can specify the header type directly here so no modification
+	 * is needed later on.
+	 */
+	abiflags	0x70000003;
+
+	/*
+	 * The ABI flags header must exist directly after the PT_INTERP header,
+	 * so we must explicitly place the PT_MIPS_REGINFO header after it to
+	 * stop the linker putting one in at the start.
+	 */
+	reginfo		0x70000000;
+
+	text		PT_LOAD		FLAGS(5) FILEHDR PHDRS; /* PF_R|PF_X */
+	dynamic		PT_DYNAMIC	FLAGS(4);		/* PF_R */
+	note		PT_NOTE		FLAGS(4);		/* PF_R */
+	eh_frame_hdr	PT_GNU_EH_FRAME;
+}
+
+VERSION
+{
+	LINUX_2.6 {
+	local: *;
+	};
+}
-- 
2.5.3



* [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 10:03 [PATCH 0/3] MIPS VDSO support Markos Chandras
  2015-09-28 10:10 ` [PATCH 1/3] MIPS: Initial implementation of a VDSO Markos Chandras
@ 2015-09-28 10:11 ` Markos Chandras
  2015-09-28 10:55   ` Marc Zyngier
  2015-10-12  9:40   ` [PATCH v2 " Markos Chandras
  2015-09-28 10:12 ` [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime() Markos Chandras
  2 siblings, 2 replies; 35+ messages in thread
From: Markos Chandras @ 2015-09-28 10:11 UTC (permalink / raw)
  To: linux-mips
  Cc: alex, Alex Smith, Thomas Gleixner, Jason Cooper, Marc Zyngier,
	linux-kernel, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

The GIC provides a "user-mode visible" section containing a mirror of
the counter registers which can be mapped into user memory. This will
be used by the VDSO time function implementations, so provide a
function to map it in.

When the GIC is not enabled in Kconfig a dummy inline version of this
function is provided, along with "#define gic_present 0", so that we
don't have to litter the VDSO code with ifdefs.
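
As a rough illustration of that pattern, here is a standalone userspace
sketch rather than the kernel code. CONFIG_MIPS_GIC is left undefined, and
map_time_pages() is a made-up caller that exists only for this example.
Because gic_present expands to a constant 0, the compiler can discard the
whole branch and the call site needs no ifdef:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel type; only a pointer to it is passed around. */
struct vm_area_struct;

#ifdef CONFIG_MIPS_GIC
extern unsigned int gic_present;
extern int gic_map_user_section(struct vm_area_struct *vma, unsigned long base,
				unsigned long size);
#else
/* GIC support compiled out: a constant zero plus a stub that is never reached. */
#define gic_present	0

static inline int gic_map_user_section(struct vm_area_struct *vma,
				       unsigned long base, unsigned long size)
{
	(void)vma; (void)base; (void)size;
	return -1;
}
#endif

/*
 * Hypothetical caller: returns the number of extra pages mapped (0 or 1),
 * or a non-zero error from the mapping function.
 */
static int map_time_pages(struct vm_area_struct *vma, unsigned long base,
			  unsigned long page_size)
{
	assert(page_size != 0);

	if (gic_present) {
		int ret = gic_map_user_section(vma, base, page_size);

		if (ret)
			return ret;
		return 1;
	}

	return 0;
}
```

With the GIC compiled out, map_time_pages(NULL, 0x1000, 4096) returns 0
without ever touching the stub.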

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
 drivers/irqchip/irq-mips-gic.c   | 27 +++++++++++++++++++++------
 include/linux/irqchip/mips-gic.h | 24 ++++++++++++++++++++++--
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index af2f16bb8a94..c995b199ca32 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -13,6 +13,7 @@
 #include <linux/irq.h>
 #include <linux/irqchip.h>
 #include <linux/irqchip/mips-gic.h>
+#include <linux/mm.h>
 #include <linux/of_address.h>
 #include <linux/sched.h>
 #include <linux/smp.h>
@@ -29,6 +30,7 @@ struct gic_pcpu_mask {
 	DECLARE_BITMAP(pcpu_mask, GIC_MAX_INTRS);
 };
 
+static unsigned long gic_base_addr;
 static void __iomem *gic_base;
 static struct gic_pcpu_mask pcpu_masks[NR_CPUS];
 static DEFINE_SPINLOCK(gic_lock);
@@ -301,6 +303,19 @@ int gic_get_c0_fdc_int(void)
 				  GIC_LOCAL_TO_HWIRQ(GIC_LOCAL_INT_FDC));
 }
 
+int gic_map_user_section(struct vm_area_struct *vma, unsigned long base,
+			 unsigned long size)
+{
+	unsigned long pfn;
+
+	BUG_ON(!gic_present);
+	BUG_ON(size > USM_VISIBLE_SECTION_SIZE);
+
+	pfn = (gic_base_addr + USM_VISIBLE_SECTION_OFS) >> PAGE_SHIFT;
+	return io_remap_pfn_range(vma, base, pfn, size,
+				  pgprot_noncached(PAGE_READONLY));
+}
+
 static void gic_handle_shared_int(bool chained)
 {
 	unsigned int i, intr, virq, gic_reg_step = mips_cm_is64 ? 8 : 4;
@@ -783,14 +798,15 @@ static const struct irq_domain_ops gic_irq_domain_ops = {
 	.xlate = gic_irq_domain_xlate,
 };
 
-static void __init __gic_init(unsigned long gic_base_addr,
-			      unsigned long gic_addrspace_size,
+static void __init __gic_init(unsigned long base_addr,
+			      unsigned long addrspace_size,
 			      unsigned int cpu_vec, unsigned int irqbase,
 			      struct device_node *node)
 {
 	unsigned int gicconfig;
 
-	gic_base = ioremap_nocache(gic_base_addr, gic_addrspace_size);
+	gic_base_addr = base_addr;
+	gic_base = ioremap_nocache(base_addr, addrspace_size);
 
 	gicconfig = gic_read(GIC_REG(SHARED, GIC_SH_CONFIG));
 	gic_shared_intrs = (gicconfig & GIC_SH_CONFIG_NUMINTRS_MSK) >>
@@ -847,11 +863,10 @@ static void __init __gic_init(unsigned long gic_base_addr,
 	gic_ipi_init();
 }
 
-void __init gic_init(unsigned long gic_base_addr,
-		     unsigned long gic_addrspace_size,
+void __init gic_init(unsigned long base_addr, unsigned long addrspace_size,
 		     unsigned int cpu_vec, unsigned int irqbase)
 {
-	__gic_init(gic_base_addr, gic_addrspace_size, cpu_vec, irqbase, NULL);
+	__gic_init(base_addr, addrspace_size, cpu_vec, irqbase, NULL);
 }
 
 static int __init gic_of_init(struct device_node *node,
diff --git a/include/linux/irqchip/mips-gic.h b/include/linux/irqchip/mips-gic.h
index 4e6861605050..68f2e9539204 100644
--- a/include/linux/irqchip/mips-gic.h
+++ b/include/linux/irqchip/mips-gic.h
@@ -245,10 +245,14 @@
 #define GIC_SHARED_TO_HWIRQ(x)	(GIC_SHARED_HWIRQ_BASE + (x))
 #define GIC_HWIRQ_TO_SHARED(x)	((x) - GIC_SHARED_HWIRQ_BASE)
 
+struct vm_area_struct;
+
+#ifdef CONFIG_MIPS_GIC
+
 extern unsigned int gic_present;
 
-extern void gic_init(unsigned long gic_base_addr,
-	unsigned long gic_addrspace_size, unsigned int cpu_vec,
+extern void gic_init(unsigned long base_addr,
+	unsigned long addrspace_size, unsigned int cpu_vec,
 	unsigned int irqbase);
 extern void gic_clocksource_init(unsigned int);
 extern cycle_t gic_read_count(void);
@@ -264,4 +268,20 @@ extern unsigned int plat_ipi_resched_int_xlate(unsigned int);
 extern int gic_get_c0_compare_int(void);
 extern int gic_get_c0_perfcount_int(void);
 extern int gic_get_c0_fdc_int(void);
+extern int gic_map_user_section(struct vm_area_struct *vma, unsigned long base,
+				unsigned long size);
+
+#else /* CONFIG_MIPS_GIC */
+
+#define gic_present	0
+
+static inline int gic_map_user_section(struct vm_area_struct *vma,
+				       unsigned long base, unsigned long size)
+{
+	/* Shouldn't be called. */
+	return -1;
+}
+
+#endif /* CONFIG_MIPS_GIC */
+
 #endif /* __LINUX_IRQCHIP_MIPS_GIC_H */
-- 
2.5.3



* [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-09-28 10:03 [PATCH 0/3] MIPS VDSO support Markos Chandras
  2015-09-28 10:10 ` [PATCH 1/3] MIPS: Initial implementation of a VDSO Markos Chandras
  2015-09-28 10:11 ` [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section Markos Chandras
@ 2015-09-28 10:12 ` Markos Chandras
  2015-09-28 13:15   ` kbuild test robot
  2015-10-12 10:24   ` [PATCH v2 " Markos Chandras
  2 siblings, 2 replies; 35+ messages in thread
From: Markos Chandras @ 2015-09-28 10:12 UTC (permalink / raw)
  To: linux-mips; +Cc: alex, Alex Smith, linux-kernel, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

Add user-mode implementations of gettimeofday() and clock_gettime() to
the VDSO. This is currently usable with 2 clocksources: the CP0 count
register, which is accessible to user-mode via RDHWR on R2 and later
cores, or the MIPS Global Interrupt Controller (GIC) timer, which
provides a "user-mode visible" section containing a mirror of its
counter registers. This section must be mapped into user memory, which
is done below the VDSO data page.
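
Whichever counter is read, the raw cycle value has to be scaled to
nanoseconds. The following is a minimal sketch of the conventional
clocksource arithmetic (masked delta, multiply, shift). It assumes
data-page fields like the cs_cycle_last, cs_mask, cs_mult and cs_shift
members added by this patch; the struct and function names are
hypothetical and it is not a copy of the VDSO implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical mirror of the clocksource fields in the VDSO data page. */
struct cs_params {
	uint64_t cycle_last;	/* counter value at the last kernel update */
	uint64_t mask;		/* valid bits of the counter */
	uint32_t mult;		/* cycles to shifted-nanoseconds multiplier */
	uint32_t shift;		/* right shift applied after the multiply */
};

/* Nanoseconds elapsed since the last update, given the current count. */
static uint64_t cycles_to_ns_delta(const struct cs_params *cs, uint64_t now)
{
	/* Masking makes the subtraction safe across a counter wrap. */
	uint64_t delta = (now - cs->cycle_last) & cs->mask;

	return (delta * cs->mult) >> cs->shift;
}
```

For a 32-bit counter ticking every 50 ns (mult = 50 << 10, shift = 10), a
count that wraps from 0xfffffff0 to 0x10 is a delta of 32 cycles, i.e.
1600 ns.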

When a supported clocksource is not in use, the VDSO functions will
return -ENOSYS, which causes libc to fall back on the standard syscall
path.

When support for neither of these clocksources is compiled into the
kernel at all, the VDSO still provides clock_gettime(), as the coarse
realtime/monotonic clocks can still be implemented. However,
gettimeofday() is not provided in this case as nothing can be done
without a suitable clocksource. This causes the symbol lookup to fail
in libc and it will then always use the standard syscall path.
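
The two fallback cases above can be sketched from the caller's side as
follows. This is only an illustration of the control flow, not the glibc
change referenced in the cover letter: all names are hypothetical, the
"VDSO" function is a fake that always reports -ENOSYS, and the syscall
path is stood in for by the ordinary libc gettimeofday():

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/time.h>

/* Signature shared by the VDSO entry point and the syscall wrapper. */
typedef int (*gettimeofday_fn)(struct timeval *tv, void *tz);

/* Fake VDSO function behaving as if no supported clocksource is in use. */
static int fake_vdso_no_clocksource(struct timeval *tv, void *tz)
{
	(void)tv; (void)tz;
	return -ENOSYS;
}

/* Stand-in for the real syscall path; here it simply calls libc. */
static int syscall_path(struct timeval *tv, void *tz)
{
	return gettimeofday(tv, tz);
}

/*
 * vdso_fn == NULL models a failed symbol lookup; a -ENOSYS return models
 * a VDSO that cannot serve the request with the current clocksource.
 */
static int gettimeofday_with_fallback(gettimeofday_fn vdso_fn,
				      struct timeval *tv, void *tz)
{
	if (vdso_fn) {
		int ret = vdso_fn(tv, tz);

		if (ret != -ENOSYS)
			return ret;
	}

	return syscall_path(tv, tz);
}
```

Either way the caller ends up with a valid time; only the cost of getting
it differs.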

This patch includes a workaround for a bug in QEMU which results in
RDHWR on the CP0 count register always returning a constant (incorrect)
value. A fix for this has been submitted, and the workaround can be
removed after the fix has been in stable releases for a reasonable
amount of time.

A simple performance test which calls gettimeofday() 1000 times in a
loop and calculates the average execution time gives the following
results on a Malta + I6400 (running at 20MHz):

 - Syscall:    ~31000 ns
 - VDSO (GIC): ~15000 ns
 - VDSO (CP0): ~9500 ns
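
The test program itself is not part of this series. A measurement of this
kind could look roughly like the sketch below, which is an assumption
about the method rather than the code that produced the numbers above. It
times the loop with gettimeofday() itself, so its resolution is limited to
microseconds:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/time.h>

static int64_t timeval_to_us(const struct timeval *tv)
{
	return (int64_t)tv->tv_sec * 1000000 + tv->tv_usec;
}

/* Average cost of one gettimeofday() call in ns, or -1 on failure. */
static int64_t average_gettimeofday_ns(unsigned int iterations)
{
	struct timeval start, end, scratch;
	unsigned int i;

	if (iterations == 0 || gettimeofday(&start, NULL) != 0)
		return -1;

	for (i = 0; i < iterations; i++) {
		if (gettimeofday(&scratch, NULL) != 0)
			return -1;
	}

	if (gettimeofday(&end, NULL) != 0)
		return -1;

	/* Total elapsed microseconds, converted to ns and averaged. */
	return (timeval_to_us(&end) - timeval_to_us(&start)) * 1000 / iterations;
}
```

Calling average_gettimeofday_ns(1000) matches the iteration count quoted
above. The result includes loop overhead, so it is only useful for
comparing the syscall and VDSO paths on the same machine.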

[markos.chandras@imgtec.com:
- Minor code rearrangements so that mappings are made in the order
they appear in the process's address space.
- Move do_{monotonic, realtime} outside of the MIPS_CLOCK_VSYSCALL ifdef]

Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
 arch/mips/Kconfig                    |   5 +
 arch/mips/include/asm/clocksource.h  |  29 +++++
 arch/mips/include/asm/vdso.h         |  68 +++++++++-
 arch/mips/kernel/csrc-r4k.c          |  44 +++++++
 arch/mips/kernel/vdso.c              |  62 +++++++++-
 arch/mips/vdso/Makefile              |   2 +-
 arch/mips/vdso/gettimeofday.c        | 232 +++++++++++++++++++++++++++++++++++
 arch/mips/vdso/vdso.h                |   9 ++
 arch/mips/vdso/vdso.lds.S            |   3 +
 drivers/clocksource/mips-gic-timer.c |   7 +-
 10 files changed, 450 insertions(+), 11 deletions(-)
 create mode 100644 arch/mips/include/asm/clocksource.h
 create mode 100644 arch/mips/vdso/gettimeofday.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e3aa5b0b4ef1..68f4f246887c 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -60,6 +60,8 @@ config MIPS
 	select SYSCTL_EXCEPTION_TRACE
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select GENERIC_TIME_VSYSCALL
+	select ARCH_CLOCKSOURCE_DATA
 
 menu "Machine selection"
 
@@ -1036,6 +1038,9 @@ config CSRC_R4K
 config CSRC_SB1250
 	bool
 
+config MIPS_CLOCK_VSYSCALL
+	def_bool CSRC_R4K || CLKSRC_MIPS_GIC
+
 config GPIO_TXX9
 	select ARCH_REQUIRE_GPIOLIB
 	bool
diff --git a/arch/mips/include/asm/clocksource.h b/arch/mips/include/asm/clocksource.h
new file mode 100644
index 000000000000..3deb1d0c1a94
--- /dev/null
+++ b/arch/mips/include/asm/clocksource.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifndef __ASM_CLOCKSOURCE_H
+#define __ASM_CLOCKSOURCE_H
+
+#include <linux/types.h>
+
+/* VDSO clocksources. */
+#define VDSO_CLOCK_NONE		0	/* No suitable clocksource. */
+#define VDSO_CLOCK_R4K		1	/* Use the coprocessor 0 count. */
+#define VDSO_CLOCK_GIC		2	/* Use the GIC. */
+
+/**
+ * struct arch_clocksource_data - Architecture-specific clocksource information.
+ * @vdso_clock_mode: Method the VDSO should use to access the clocksource.
+ */
+struct arch_clocksource_data {
+	u8 vdso_clock_mode;
+};
+
+#endif /* __ASM_CLOCKSOURCE_H */
diff --git a/arch/mips/include/asm/vdso.h b/arch/mips/include/asm/vdso.h
index db2d45be8f2e..8f4ca5dd992b 100644
--- a/arch/mips/include/asm/vdso.h
+++ b/arch/mips/include/asm/vdso.h
@@ -13,6 +13,8 @@
 
 #include <linux/mm_types.h>
 
+#include <asm/barrier.h>
+
 /**
  * struct mips_vdso_image - Details of a VDSO image.
  * @data: Pointer to VDSO image data (page-aligned).
@@ -53,18 +55,82 @@ extern struct mips_vdso_image vdso_image_n32;
 
 /**
  * union mips_vdso_data - Data provided by the kernel for the VDSO.
+ * @xtime_sec:		Current real time (seconds part).
+ * @xtime_nsec:		Current real time (nanoseconds part, shifted).
+ * @wall_to_mono_sec:	Wall-to-monotonic offset (seconds part).
+ * @wall_to_mono_nsec:	Wall-to-monotonic offset (nanoseconds part).
+ * @seq_count:		Counter to synchronise updates (odd = updating).
+ * @cs_shift:		Clocksource shift value.
+ * @clock_mode:		Clocksource to use for time functions.
+ * @cs_mult:		Clocksource multiplier value.
+ * @cs_cycle_last:	Clock cycle value at last update.
+ * @cs_mask:		Clocksource mask value.
+ * @tz_minuteswest:	Minutes west of Greenwich (from timezone).
+ * @tz_dsttime:		Type of DST correction (from timezone).
  *
  * This structure contains data needed by functions within the VDSO. It is
- * populated by the kernel and mapped read-only into user memory.
+ * populated by the kernel and mapped read-only into user memory. The time
+ * fields are mirrors of internal data from the timekeeping infrastructure.
  *
  * Note: Care should be taken when modifying as the layout must remain the same
  * for both 64- and 32-bit (for 32-bit userland on 64-bit kernel).
  */
 union mips_vdso_data {
 	struct {
+		u64 xtime_sec;
+		u64 xtime_nsec;
+		u32 wall_to_mono_sec;
+		u32 wall_to_mono_nsec;
+		u32 seq_count;
+		u32 cs_shift;
+		u8 clock_mode;
+		u32 cs_mult;
+		u64 cs_cycle_last;
+		u64 cs_mask;
+		s32 tz_minuteswest;
+		s32 tz_dsttime;
 	};
 
 	u8 page[PAGE_SIZE];
 };
 
+static inline u32 vdso_data_read_begin(const union mips_vdso_data *data)
+{
+	u32 seq;
+
+	while (true) {
+		seq = ACCESS_ONCE(data->seq_count);
+		if (likely(!(seq & 1))) {
+			/* Paired with smp_wmb() in vdso_data_write_*(). */
+			smp_rmb();
+			return seq;
+		}
+
+		cpu_relax();
+	}
+}
+
+static inline bool vdso_data_read_retry(const union mips_vdso_data *data,
+					u32 start_seq)
+{
+	/* Paired with smp_wmb() in vdso_data_write_*(). */
+	smp_rmb();
+	return unlikely(data->seq_count != start_seq);
+}
+
+static inline void vdso_data_write_begin(union mips_vdso_data *data)
+{
+	++data->seq_count;
+
+	/* Ensure sequence update is written before other data page values. */
+	smp_wmb();
+}
+
+static inline void vdso_data_write_end(union mips_vdso_data *data)
+{
+	/* Ensure data values are written before updating sequence again. */
+	smp_wmb();
+	++data->seq_count;
+}
+
 #endif /* __ASM_VDSO_H */
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index e5ed7ada1433..1f910563fdf6 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -28,6 +28,43 @@ static u64 notrace r4k_read_sched_clock(void)
 	return read_c0_count();
 }
 
+static inline unsigned int rdhwr_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+static bool rdhwr_count_usable(void)
+{
+	unsigned int prev, curr, i;
+
+	/*
+	 * Older QEMUs have a broken implementation of RDHWR for the CP0 count
+	 * which always returns a constant value. Try to identify this and don't
+	 * use it in the VDSO if it is broken. This workaround can be removed
+	 * once the fix has been in QEMU stable for a reasonable amount of time.
+	 */
+	for (i = 0, prev = rdhwr_count(); i < 100; i++) {
+		curr = rdhwr_count();
+
+		if (curr != prev)
+			return true;
+
+		prev = curr;
+	}
+
+	pr_warn("Not using R4K clocksource in VDSO due to broken RDHWR\n");
+	return false;
+}
+
 int __init init_r4k_clocksource(void)
 {
 	if (!cpu_has_counter || !mips_hpt_frequency)
@@ -36,6 +73,13 @@ int __init init_r4k_clocksource(void)
 	/* Calculate a somewhat reasonable rating value */
 	clocksource_mips.rating = 200 + mips_hpt_frequency / 10000000;
 
+	/*
+	 * R2 onwards makes the count accessible to user mode so it can be used
+	 * by the VDSO (HWREna is configured by configure_hwrena()).
+	 */
+	if (cpu_has_mips_r2_r6 && rdhwr_count_usable())
+		clocksource_mips.archdata.vdso_clock_mode = VDSO_CLOCK_R4K;
+
 	clocksource_register_hz(&clocksource_mips, mips_hpt_frequency);
 
 	sched_clock_register(r4k_read_sched_clock, 32, mips_hpt_frequency);
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 56cc3c4377fb..7894db0c7922 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -12,9 +12,11 @@
 #include <linux/elf.h>
 #include <linux/err.h>
 #include <linux/init.h>
+#include <linux/irqchip/mips-gic.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/timekeeper_internal.h>
 
 #include <asm/abi.h>
 #include <asm/vdso.h>
@@ -23,7 +25,7 @@
 static union mips_vdso_data vdso_data __page_aligned_data;
 
 /*
- * Mapping for the VDSO data pages. The real pages are mapped manually, as
+ * Mapping for the VDSO data/GIC pages. The real pages are mapped manually, as
  * what we map and where within the area they are mapped is determined at
  * runtime.
  */
@@ -64,25 +66,66 @@ static int __init init_vdso(void)
 }
 subsys_initcall(init_vdso);
 
+void update_vsyscall(struct timekeeper *tk)
+{
+	vdso_data_write_begin(&vdso_data);
+
+	vdso_data.xtime_sec = tk->xtime_sec;
+	vdso_data.xtime_nsec = tk->tkr_mono.xtime_nsec;
+	vdso_data.wall_to_mono_sec = tk->wall_to_monotonic.tv_sec;
+	vdso_data.wall_to_mono_nsec = tk->wall_to_monotonic.tv_nsec;
+	vdso_data.cs_shift = tk->tkr_mono.shift;
+
+	vdso_data.clock_mode = tk->tkr_mono.clock->archdata.vdso_clock_mode;
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.cs_mult = tk->tkr_mono.mult;
+		vdso_data.cs_cycle_last = tk->tkr_mono.cycle_last;
+		vdso_data.cs_mask = tk->tkr_mono.mask;
+	}
+
+	vdso_data_write_end(&vdso_data);
+}
+
+void update_vsyscall_tz(void)
+{
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.tz_minuteswest = sys_tz.tz_minuteswest;
+		vdso_data.tz_dsttime = sys_tz.tz_dsttime;
+	}
+}
+
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
 	struct mips_vdso_image *image = current->thread.abi->vdso;
 	struct mm_struct *mm = current->mm;
-	unsigned long base, vdso_addr;
+	unsigned long gic_size, vvar_size, size, base, data_addr, vdso_addr;
 	struct vm_area_struct *vma;
 	int ret;
 
 	down_write(&mm->mmap_sem);
 
-	base = get_unmapped_area(NULL, 0, PAGE_SIZE + image->size, 0, 0);
+	/*
+	 * Determine total area size. This includes the VDSO image itself, the
+	 * data page, and the GIC user page if present. Always create a mapping
+	 * for the GIC user area if the GIC is present regardless of whether it
+	 * is the current clocksource, in case it comes into use later on. We
+	 * only map a page even though the total area is 64K, as we only need
+	 * the counter registers at the start.
+	 */
+	gic_size = gic_present ? PAGE_SIZE : 0;
+	vvar_size = gic_size + PAGE_SIZE;
+	size = vvar_size + image->size;
+
+	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
-	vdso_addr = base + PAGE_SIZE;
+	data_addr = base + gic_size;
+	vdso_addr = data_addr + PAGE_SIZE;
 
-	vma = _install_special_mapping(mm, base, PAGE_SIZE,
+	vma = _install_special_mapping(mm, base, vvar_size,
 				       VM_READ | VM_MAYREAD,
 				       &vdso_vvar_mapping);
 	if (IS_ERR(vma)) {
@@ -90,8 +133,15 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		goto out;
 	}
 
+	/* Map GIC user page. */
+	if (gic_size) {
+		ret = gic_map_user_section(vma, base, gic_size);
+		if (ret)
+			goto out;
+	}
+
 	/* Map data page. */
-	ret = remap_pfn_range(vma, base,
+	ret = remap_pfn_range(vma, data_addr,
 			      virt_to_phys(&vdso_data) >> PAGE_SHIFT,
 			      PAGE_SIZE, PAGE_READONLY);
 	if (ret)
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 9a8a6b373eb0..c2820997ea9b 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -1,5 +1,5 @@
 # Objects to go into the VDSO.
-obj-vdso-y := elf.o sigreturn.o
+obj-vdso-y := elf.o gettimeofday.o sigreturn.o
 
 # Common compiler flags between ABIs.
 ccflags-vdso := \
diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
new file mode 100644
index 000000000000..ce89c9e294f9
--- /dev/null
+++ b/arch/mips/vdso/gettimeofday.c
@@ -0,0 +1,232 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include "vdso.h"
+
+#include <linux/compiler.h>
+#include <linux/irqchip/mips-gic.h>
+#include <linux/time.h>
+
+#include <asm/clocksource.h>
+#include <asm/io.h>
+#include <asm/mips-cm.h>
+#include <asm/unistd.h>
+#include <asm/vdso.h>
+
+static __always_inline int do_realtime_coarse(struct timespec *ts,
+					      const union mips_vdso_data *data)
+{
+	u32 start_seq;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	return 0;
+}
+
+static __always_inline int do_monotonic_coarse(struct timespec *ts,
+					       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	timespec_add_ns(ts, to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_CSRC_R4K
+
+static __always_inline u64 read_r4k_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+#endif
+
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static __always_inline u64 read_gic_count(const union mips_vdso_data *data)
+{
+	void __iomem *gic = get_gic(data);
+	u32 hi, hi2, lo;
+
+	do {
+		hi = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+		lo = __raw_readl(gic + GIC_UMV_SH_COUNTER_31_00_OFS);
+		hi2 = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+	} while (hi2 != hi);
+
+	return (((u64)hi) << 32) + lo;
+}
+
+#endif
+
+static __always_inline u64 get_ns(const union mips_vdso_data *data)
+{
+	u64 cycle_now, delta, nsec;
+
+	switch (data->clock_mode) {
+#ifdef CONFIG_CSRC_R4K
+	case VDSO_CLOCK_R4K:
+		cycle_now = read_r4k_count();
+		break;
+#endif
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+	case VDSO_CLOCK_GIC:
+		cycle_now = read_gic_count(data);
+		break;
+#endif
+	default:
+		return 0;
+	}
+
+	delta = (cycle_now - data->cs_cycle_last) & data->cs_mask;
+
+	nsec = (delta * data->cs_mult) + data->xtime_nsec;
+	nsec >>= data->cs_shift;
+
+	return nsec;
+}
+
+static __always_inline int do_realtime(struct timespec *ts,
+				       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns);
+
+	return 0;
+}
+
+static __always_inline int do_monotonic(struct timespec *ts,
+					const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns + to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
+
+/*
+ * This is behind the ifdef so that we don't provide the symbol when there's no
+ * possibility of there being a usable clocksource, because there's nothing we
+ * can do without it. When libc fails the symbol lookup it should fall back on
+ * the standard syscall path.
+ */
+int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	struct timespec ts;
+	int ret;
+
+	ret = do_realtime(&ts, data);
+	if (ret)
+		return ret;
+
+	if (tv) {
+		tv->tv_sec = ts.tv_sec;
+		tv->tv_usec = ts.tv_nsec / 1000;
+	}
+
+	if (tz) {
+		tz->tz_minuteswest = data->tz_minuteswest;
+		tz->tz_dsttime = data->tz_dsttime;
+	}
+
+	return 0;
+}
+
+#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */
+
+int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	int ret;
+
+	switch (clkid) {
+	case CLOCK_REALTIME_COARSE:
+		ret = do_realtime_coarse(ts, data);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = do_monotonic_coarse(ts, data);
+		break;
+	case CLOCK_REALTIME:
+		ret = do_realtime(ts, data);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = do_monotonic(ts, data);
+		break;
+	default:
+		ret = -ENOSYS;
+		break;
+	}
+
+	/* If we return -ENOSYS libc should fall back to a syscall. */
+	return ret;
+}
diff --git a/arch/mips/vdso/vdso.h b/arch/mips/vdso/vdso.h
index 64b98967e245..1072f8634417 100644
--- a/arch/mips/vdso/vdso.h
+++ b/arch/mips/vdso/vdso.h
@@ -76,4 +76,13 @@ static inline const union mips_vdso_data *get_vdso_data(void)
 	return (const union mips_vdso_data *)(get_vdso_base() - PAGE_SIZE);
 }
 
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static inline void __iomem *get_gic(const union mips_vdso_data *data)
+{
+	return (void __iomem *)data - PAGE_SIZE;
+}
+
+#endif /* CONFIG_CLKSRC_MIPS_GIC */
+
 #endif /* __ASSEMBLY__ */
diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
index 21655b6fefc5..0bda37c5a1e6 100644
--- a/arch/mips/vdso/vdso.lds.S
+++ b/arch/mips/vdso/vdso.lds.S
@@ -95,6 +95,9 @@ PHDRS
 VERSION
 {
 	LINUX_2.6 {
+	global:
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
 	local: *;
 	};
 }
diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index 02a1945e5093..89d3e4d7900c 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -140,9 +140,10 @@ static cycle_t gic_hpt_read(struct clocksource *cs)
 }
 
 static struct clocksource gic_clocksource = {
-	.name	= "GIC",
-	.read	= gic_hpt_read,
-	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.name		= "GIC",
+	.read		= gic_hpt_read,
+	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
+	.archdata	= { .vdso_clock_mode = VDSO_CLOCK_GIC },
 };
 
 static void __init __gic_clocksource_init(void)
-- 
2.5.3
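
[Annotation, not part of the posted patch] The time functions above combine two ideas: a sequence-counted snapshot of the kernel-published data, and the mult/shift conversion from counter cycles to nanoseconds. The following is a minimal userspace sketch of that flow, not the kernel code. All demo_* names are hypothetical, the sequence counter is assumed to follow the usual "odd while the writer is updating" convention, and the cycle value is passed in as a parameter instead of being read from hardware.

```c
#include <stdatomic.h>
#include <stdint.h>

#define DEMO_NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-in for the fields update_vsyscall() publishes. */
struct demo_vdso_data {
	atomic_uint seq;	/* assumed: odd while the writer is mid-update */
	uint64_t xtime_sec;
	uint64_t xtime_nsec;	/* nanoseconds, pre-shifted left by cs_shift */
	uint64_t cs_cycle_last;	/* counter value at the last update */
	uint64_t cs_mask;	/* width of the hardware counter */
	uint32_t cs_mult;
	uint32_t cs_shift;
};

/* Wait until no update is in flight, then return the sequence number. */
static uint32_t demo_read_begin(struct demo_vdso_data *d)
{
	uint32_t seq;

	do {
		seq = atomic_load_explicit(&d->seq, memory_order_acquire);
	} while (seq & 1);

	return seq;
}

/* Non-zero if the writer touched the data since demo_read_begin(). */
static int demo_read_retry(struct demo_vdso_data *d, uint32_t start)
{
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&d->seq, memory_order_relaxed) != start;
}

/* Same arithmetic as get_ns(): masked delta, then mult/shift scaling. */
static uint64_t demo_get_ns(const struct demo_vdso_data *d, uint64_t cycle_now)
{
	uint64_t delta = (cycle_now - d->cs_cycle_last) & d->cs_mask;

	return (delta * d->cs_mult + d->xtime_nsec) >> d->cs_shift;
}

/* Mirrors do_realtime(): snapshot under the sequence count, then normalise. */
static void demo_realtime(struct demo_vdso_data *d, uint64_t cycle_now,
			  uint64_t *sec, uint64_t *nsec)
{
	uint32_t start;
	uint64_t s, ns;

	do {
		start = demo_read_begin(d);
		s = d->xtime_sec;
		ns = demo_get_ns(d, cycle_now);
	} while (demo_read_retry(d, start));

	/* Equivalent of timespec_add_ns(): carry whole seconds out of ns. */
	while (ns >= DEMO_NSEC_PER_SEC) {
		ns -= DEMO_NSEC_PER_SEC;
		s++;
	}

	*sec = s;
	*nsec = ns;
}
```

The mask is what makes a wrapped 32-bit counter harmless: with cs_cycle_last = 0xFFFFFFF0 and a current count of 0x10, the masked delta is 0x20 cycles. With cs_mult = 10 << 8 and cs_shift = 8 (10 ns per cycle) those 32 cycles contribute 320 ns. A real multi-threaded reader would also need the plain field loads to be safe against a concurrent writer, which this single-threaded sketch does not attempt.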


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH 1/3] MIPS: Initial implementation of a VDSO
  2015-09-28 10:10 ` [PATCH 1/3] MIPS: Initial implementation of a VDSO Markos Chandras
@ 2015-09-28 10:54   ` Alex Smith
  2015-09-28 13:07     ` Matthew Fortune
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Smith @ 2015-09-28 10:54 UTC (permalink / raw)
  To: Markos Chandras; +Cc: linux-mips, Alex Smith, linux-kernel, Matthew Fortune

Hi Markos,

Thanks for finishing this off. Just got a few minor comments.

On 28 September 2015 at 11:10, Markos Chandras
<markos.chandras@imgtec.com> wrote:
> diff --git a/arch/mips/vdso/elf.S b/arch/mips/vdso/elf.S
> new file mode 100644
> index 000000000000..60c23d0d452c
> --- /dev/null
> +++ b/arch/mips/vdso/elf.S
> @@ -0,0 +1,68 @@
> +/*
> + * Copyright (C) 2015 Imagination Technologies
> + * Author: Alex Smith <alex.smith@imgtec.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include "vdso.h"
> +
> +#include <linux/elfnote.h>
> +#include <linux/version.h>
> +
> +ELFNOTE_START(Linux, 0, "a")
> +       .long LINUX_VERSION_CODE
> +ELFNOTE_END
> +
> +/*
> + * The .MIPS.abiflags section must be defined with the FP ABI flags set
> + * to 'any' to be able to link with both old and new libraries.
> + * Newer toolchains are capable of automatically generating this, but we want
> + * to work with older toolchains as well. Therefore, we define the contents of
> + * this section here (under different names), and then genvdso will patch
> + * it to have the correct name and type.
> + *
> + * We base the .MIPS.abiflags section on preprocessor definitions rather than
> + * CONFIG_* because we need to match the particular ABI we are building the
> + * VDSO for.
> + *
> + * See https://dmz-portal.mips.com/wiki/MIPS_O32_ABI_-_FR0_and_FR1_Interlinking
> + * for the .MIPS.abitflags and .gnu.attributes section description.
> + */

s/abitflags/abiflags/

> diff --git a/arch/mips/vdso/vdso.h b/arch/mips/vdso/vdso.h
> new file mode 100644
> index 000000000000..64b98967e245
> --- /dev/null
> +++ b/arch/mips/vdso/vdso.h
> @@ -0,0 +1,79 @@
> +/*
> + * Copyright (C) 2015 Imagination Technologies
> + * Author: Alex Smith <alex.smith@imgtec.com>
> + *
> + * This program is free software; you can redistribute it and/or modify it
> + * under the terms of the GNU General Public License as published by the
> + * Free Software Foundation;  either version 2 of the  License, or (at your
> + * option) any later version.
> + */
> +
> +#include <asm/sgidefs.h>
> +
> +#if _MIPS_SIM != _MIPS_SIM_ABI64 && defined(CONFIG_64BIT)
> +
> +/* Building 32-bit VDSO for the 64-bit kernel. Fake a 32-bit Kconfig. */
> +#undef CONFIG_64BIT
> +#define CONFIG_32BIT 1
> +
> +#endif
> +
> +#ifndef __ASSEMBLY__
> +
> +#include <asm/asm.h>
> +#include <asm/page.h>
> +#include <asm/vdso.h>
> +
> +static inline unsigned long get_vdso_base(void)
> +{
> +       unsigned long addr;
> +
> +       /*
> +        * Get the base load address of the VDSO. We have to avoid generating
> +        * relocations and references to the GOT because ld.so does not perform
> +        * relocations on the VDSO. We use the current offset from the VDSO base
> +        * and perform a PC-relative branch which gives the absolute address in
> +        * ra, and take the difference. The assembler chokes on
> +        * "li %0, _start - .", so embed the offset as a word and branch over
> +        * it.
> +        *
> +        * TODO: Is there a better way to do this?

Unless somebody else can come up with a better way to do this I'd say
this TODO can go :)

Also perhaps move the description of what the code is doing (from "We
use the current offset from the VDSO base" onwards) down to after the
#else since it applies  to that code rather than the R6 code which
comes first.

> +        */
> +
> +#ifdef CONFIG_CPU_MIPSR6
> +       /*
> +        * We can't use cpu_has_mips_r6 since it will create a relocation
> +        * in the VDSO because of the global cpu_data[] variable.
> +        */

I think it would be more correct to say here that cpu_data doesn't
even exist to the VDSO because it's a kernel symbol.

> +
> +       /* lapc <symbol> is an alias to addiupc reg, <symbol> - .
> +        *
> +        * We can't use addiupc because there is no label-label
> +        * support for the addiupc reloc
> +        */
> +       __asm__("lapc   %0, _start                      \n"
> +               : "=r" (addr) : :);

Just curious - if lapc is just an alias to addiupc, why does that work
but not addiupc? IIRC I did try addiupc previously but removed it
because it wasn't working, didn't know about lapc!

Thanks,
Alex

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 10:11 ` [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section Markos Chandras
@ 2015-09-28 10:55   ` Marc Zyngier
  2015-09-28 14:16     ` Qais Yousef
  2015-10-05  8:22     ` Markos Chandras
  2015-10-12  9:40   ` [PATCH v2 " Markos Chandras
  1 sibling, 2 replies; 35+ messages in thread
From: Marc Zyngier @ 2015-09-28 10:55 UTC (permalink / raw)
  To: Markos Chandras, linux-mips
  Cc: alex, Alex Smith, Thomas Gleixner, Jason Cooper, linux-kernel

On 28/09/15 11:11, Markos Chandras wrote:
> From: Alex Smith <alex.smith@imgtec.com>
> 
> The GIC provides a "user-mode visible" section containing a mirror of
> the counter registers which can be mapped into user memory. This will
> be used by the VDSO time function implementations, so provide a
> function to map it in.
> 
> When the GIC is not enabled in Kconfig a dummy inline version of this
> function is provided, along with "#define gic_present 0", so that we
> don't have to litter the VDSO code with ifdefs.
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Alex Smith <alex.smith@imgtec.com>
> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
> ---
>  drivers/irqchip/irq-mips-gic.c   | 27 +++++++++++++++++++++------
>  include/linux/irqchip/mips-gic.h | 24 ++++++++++++++++++++++--
>  2 files changed, 43 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
> index af2f16bb8a94..c995b199ca32 100644
> --- a/drivers/irqchip/irq-mips-gic.c
> +++ b/drivers/irqchip/irq-mips-gic.c
> @@ -13,6 +13,7 @@
>  #include <linux/irq.h>
>  #include <linux/irqchip.h>
>  #include <linux/irqchip/mips-gic.h>
> +#include <linux/mm.h>
>  #include <linux/of_address.h>
>  #include <linux/sched.h>
>  #include <linux/smp.h>
> @@ -29,6 +30,7 @@ struct gic_pcpu_mask {
>  	DECLARE_BITMAP(pcpu_mask, GIC_MAX_INTRS);
>  };
>  
> +static unsigned long gic_base_addr;
>  static void __iomem *gic_base;
>  static struct gic_pcpu_mask pcpu_masks[NR_CPUS];
>  static DEFINE_SPINLOCK(gic_lock);
> @@ -301,6 +303,19 @@ int gic_get_c0_fdc_int(void)
>  				  GIC_LOCAL_TO_HWIRQ(GIC_LOCAL_INT_FDC));
>  }
>  
> +int gic_map_user_section(struct vm_area_struct *vma, unsigned long base,
> +			 unsigned long size)
> +{
> +	unsigned long pfn;
> +
> +	BUG_ON(!gic_present);

Why do you have a BUG() here, while you're just returning -1 in the case
where CONFIG_MIPS_GIC is not defined? This feels overly harsh to me.

> +	BUG_ON(size > USM_VISIBLE_SECTION_SIZE);

Same here.

> +
> +	pfn = (gic_base_addr + USM_VISIBLE_SECTION_OFS) >> PAGE_SHIFT;
> +	return io_remap_pfn_range(vma, base, pfn, size,
> +				  pgprot_noncached(PAGE_READONLY));

Two things:

- I suppose you are comfortable with making this region accessible to
userspace (obviously). Not knowing anything about it, is it guaranteed
not to trigger any unpleasant event even if the luser tries to play
dirty tricks on it (like doing unaligned or exclusive access)?

- Does this code have to be in the irqchip driver? It really feels out
of place, and I'd rather see a function that returns the mappable range
to the VDSO code, where the mapping would occur.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 1/3] MIPS: Initial implementation of a VDSO
  2015-09-28 10:54   ` Alex Smith
@ 2015-09-28 13:07     ` Matthew Fortune
  2015-11-20 18:15       ` Maciej W. Rozycki
  0 siblings, 1 reply; 35+ messages in thread
From: Matthew Fortune @ 2015-09-28 13:07 UTC (permalink / raw)
  To: Alex Smith, Markos Chandras; +Cc: linux-mips, Alex Smith, linux-kernel

[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #1: Type: text/plain; charset="utf-8", Size: 1332 bytes --]

Alex Smith <alex@alex-smith.me.uk> writes:
> > +
> > +       /* lapc <symbol> is an alias to addiupc reg, <symbol> - .
> > +        *
> > +        * We can't use addiupc because there is no label-label
> > +        * support for the addiupc reloc
> > +        */
> > +       __asm__("lapc   %0, _start                      \n"
> > +               : "=r" (addr) : :);
> 
> Just curious - if lapc is just an alias to addiupc, why does that work
> but not addiupc? IIRC I did try addiupc previously but removed it
> because it wasn't working, didn't know about lapc!

This is just an unfortunate quirk of how the implementation is done in
binutils. We don't recognise the special case that:

addiupc <reg>, <sym> - .

is the same as

lapc <reg>, <sym>

And therefore don't know that we can just use the MIPS_PC19_S2 reloc
(name of that reloc may not be perfectly correct). It is a special
case as the RHS of the expression in ADDIUPC above can be theoretically
anything so we only support assembly time constants with addiupc.

Apart from the need to document the LAPC alias somewhere I'm not sure
we need do anything to improve addiupc itself particularly.

Thanks,
Matthew

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-09-28 10:12 ` [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime() Markos Chandras
@ 2015-09-28 13:15   ` kbuild test robot
  2015-10-12 10:24   ` [PATCH v2 " Markos Chandras
  1 sibling, 0 replies; 35+ messages in thread
From: kbuild test robot @ 2015-09-28 13:15 UTC (permalink / raw)
  To: Markos Chandras
  Cc: kbuild-all, linux-mips, alex, Alex Smith, linux-kernel, Markos Chandras

[-- Attachment #1: Type: text/plain, Size: 3691 bytes --]

Hi Alex,

[auto build test results on v4.3-rc3 -- if it's inappropriate base, please ignore]

config: mips-allyesconfig (attached as .config)
reproduce:
  wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
  chmod +x ~/bin/make.cross
  git checkout 0253e397c40cca438922d8ee26cf716a29b1cd77
  # save the attached .config to linux build tree
  make.cross ARCH=mips 

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/srcu.h:34:0,
                    from include/linux/notifier.h:15,
                    from arch/mips/include/asm/uprobes.h:9,
                    from include/linux/uprobes.h:61,
                    from include/linux/mm_types.h:13,
                    from arch/mips/include/asm/vdso.h:14,
                    from arch/mips/vdso/vdso.h:25,
                    from arch/mips/vdso/gettimeofday.c:11:
   include/linux/workqueue.h: In function 'work_static':
>> include/linux/workqueue.h:186:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing]
     return *work_data_bits(work) & WORK_STRUCT_STATIC;
     ^
   cc1: all warnings being treated as errors

vim +186 include/linux/workqueue.h

dd6414b5 Phil Carmody    2010-10-20  170  
65f27f38 David Howells   2006-11-22  171  #define DECLARE_WORK(n, f)						\
65f27f38 David Howells   2006-11-22  172  	struct work_struct n = __WORK_INITIALIZER(n, f)
65f27f38 David Howells   2006-11-22  173  
65f27f38 David Howells   2006-11-22  174  #define DECLARE_DELAYED_WORK(n, f)					\
f991b318 Tejun Heo       2012-08-21  175  	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, 0)
65f27f38 David Howells   2006-11-22  176  
203b42f7 Tejun Heo       2012-08-21  177  #define DECLARE_DEFERRABLE_WORK(n, f)					\
f991b318 Tejun Heo       2012-08-21  178  	struct delayed_work n = __DELAYED_WORK_INITIALIZER(n, f, TIMER_DEFERRABLE)
dd6414b5 Phil Carmody    2010-10-20  179  
dc186ad7 Thomas Gleixner 2009-11-16  180  #ifdef CONFIG_DEBUG_OBJECTS_WORK
dc186ad7 Thomas Gleixner 2009-11-16  181  extern void __init_work(struct work_struct *work, int onstack);
dc186ad7 Thomas Gleixner 2009-11-16  182  extern void destroy_work_on_stack(struct work_struct *work);
ea2e64f2 Thomas Gleixner 2014-03-23  183  extern void destroy_delayed_work_on_stack(struct delayed_work *work);
4690c4ab Tejun Heo       2010-06-29  184  static inline unsigned int work_static(struct work_struct *work)
4690c4ab Tejun Heo       2010-06-29  185  {
22df02bb Tejun Heo       2010-06-29 @186  	return *work_data_bits(work) & WORK_STRUCT_STATIC;
4690c4ab Tejun Heo       2010-06-29  187  }
dc186ad7 Thomas Gleixner 2009-11-16  188  #else
dc186ad7 Thomas Gleixner 2009-11-16  189  static inline void __init_work(struct work_struct *work, int onstack) { }
dc186ad7 Thomas Gleixner 2009-11-16  190  static inline void destroy_work_on_stack(struct work_struct *work) { }
ea2e64f2 Thomas Gleixner 2014-03-23  191  static inline void destroy_delayed_work_on_stack(struct delayed_work *work) { }
4690c4ab Tejun Heo       2010-06-29  192  static inline unsigned int work_static(struct work_struct *work) { return 0; }
dc186ad7 Thomas Gleixner 2009-11-16  193  #endif
dc186ad7 Thomas Gleixner 2009-11-16  194  

:::::: The code at line 186 was first introduced by commit
:::::: 22df02bb3fab24af97bff4c69cc6fd8529fc66fe workqueue: define masks for work flags and conditionalize STATIC flags

:::::: TO: Tejun Heo <tj@kernel.org>
:::::: CC: Tejun Heo <tj@kernel.org>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/octet-stream, Size: 39251 bytes --]

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 10:55   ` Marc Zyngier
@ 2015-09-28 14:16     ` Qais Yousef
  2015-09-28 15:03       ` Marc Zyngier
  2015-10-05  8:22     ` Markos Chandras
  1 sibling, 1 reply; 35+ messages in thread
From: Qais Yousef @ 2015-09-28 14:16 UTC (permalink / raw)
  To: Marc Zyngier, Markos Chandras
  Cc: linux-mips, alex, Alex Smith, Thomas Gleixner, Jason Cooper,
	linux-kernel

On 09/28/2015 11:55 AM, Marc Zyngier wrote:
> On 28/09/15 11:11, Markos Chandras wrote:
>
>> +
>> +	pfn = (gic_base_addr + USM_VISIBLE_SECTION_OFS) >> PAGE_SHIFT;
>> +	return io_remap_pfn_range(vma, base, pfn, size,
>> +				  pgprot_noncached(PAGE_READONLY));
>
> - Does this code have to be in the irqchip driver? It really feels out
> of place, and I'd rather see a function that returns the mappable range
> to the VDSO code, where the mapping would occur.
>


I don't think it's a good idea either for the VDSO code to know about 
gic_base_addr. Maybe this function could be split to return the pfn and 
let the caller do io_remap_pfn_range(). Though I think it's nice to have 
it all there. USM stands for USer Mode - GIC wants to make some stuff 
visible to user mode and it puts them in that special section. So it 
makes sense to do it all there IMO.

Qais

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 14:16     ` Qais Yousef
@ 2015-09-28 15:03       ` Marc Zyngier
  0 siblings, 0 replies; 35+ messages in thread
From: Marc Zyngier @ 2015-09-28 15:03 UTC (permalink / raw)
  To: Qais Yousef, Markos Chandras
  Cc: linux-mips, alex, Alex Smith, Thomas Gleixner, Jason Cooper,
	linux-kernel

On 28/09/15 15:16, Qais Yousef wrote:
> On 09/28/2015 11:55 AM, Marc Zyngier wrote:
>> On 28/09/15 11:11, Markos Chandras wrote:
>>
>>> +
>>> +	pfn = (gic_base_addr + USM_VISIBLE_SECTION_OFS) >> PAGE_SHIFT;
>>> +	return io_remap_pfn_range(vma, base, pfn, size,
>>> +				  pgprot_noncached(PAGE_READONLY));
>>
>> - Does this code have to be in the irqchip driver? It really feels out
>> of place, and I'd rather see a function that returns the mappable range
>> to the VDSO code, where the mapping would occur.
>>
> 
> 
> I don't think it's a good idea either for the VDSO code to know about 
> gic_base_addr. Maybe this function could be split to return the pfn and 
> let the caller do io_remap_pfn_range(). Though I think it's nice to have 
> it all there. USM stands for USer Mode - GIC wants to make some stuff 
> visible to user mode and it puts them in that special section. So it 
> makes sense to do it all there IMO.

Maybe I wasn't clear enough. My suggestion was to expose this in
the VDSO setup code:

@@ -90,8 +133,15 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		goto out;
 	}
 
+	/* Map GIC user page. */
+	if (gic_size) {
+		ret = gic_map_user_section(vma, base, gic_size);
+		if (ret)
+			goto out;
+	}
+

This could easily be written as:

	if (gic_size) {
		struct resource gic_res;
		ret = gic_get_usm_range(&gic_res);
		if (ret)
			goto out;
		... and perform the mapping here...
	}

You can also rewrite the hunks above to actually get the present/size
information from the GIC. And if you have DT, you should be able to
directly find the memory region there, without involving the GIC
driver at all.

I don't really fancy having some userspace visible stuff in an
interrupt controller driver, and I tend to find it nicer to split
the responsibilities: the VDSO code deals with the userspace mapping,
and the interrupt controller deals with interrupts.

Thanks,

	M.
-- 
Jazz is not dead. It just smells funny...
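
[Annotation, not part of the original message] The split suggested above can be sketched in plain C as follows: the interrupt controller side only reports where the user-mode section lives, and the caller validates the size and derives the page frame number itself. This is a userspace mock under stated assumptions. gic_get_usm_range() is only named in the message, so its behaviour here is a guess; the demo_* names, the offset value, and the use of errno codes in place of BUG_ON() or -1 are all hypothetical. The 64K section size comes from the comment in the posted patch.

```c
#include <errno.h>

#define DEMO_PAGE_SHIFT	12
#define DEMO_USM_OFS	0x10000UL	/* assumed offset of the user section */
#define DEMO_USM_SIZE	0x10000UL	/* 64K, per the comment in the patch */

/* Simplified stand-in for struct resource: an inclusive address range. */
struct demo_resource {
	unsigned long start;
	unsigned long end;
};

/* Physical base of the GIC; 0 models "no GIC present" in this sketch. */
static unsigned long demo_gic_base_addr;

/* Interrupt controller side: report the mappable range, nothing more. */
static int demo_gic_get_usm_range(struct demo_resource *res)
{
	if (!demo_gic_base_addr)
		return -ENODEV;

	res->start = demo_gic_base_addr + DEMO_USM_OFS;
	res->end = res->start + DEMO_USM_SIZE - 1;
	return 0;
}

/* VDSO side: check the requested size and work out the pfn to map. */
static int demo_vdso_usm_pfn(unsigned long map_size, unsigned long *pfn)
{
	struct demo_resource res;
	int ret;

	ret = demo_gic_get_usm_range(&res);
	if (ret)
		return ret;

	if (map_size > res.end - res.start + 1)
		return -EINVAL;

	*pfn = res.start >> DEMO_PAGE_SHIFT;
	return 0;
}
```

In this arrangement an oversized request or a missing GIC becomes an error the caller can propagate through its existing "goto out" path, which is one way to address the BUG_ON() concern raised earlier in the thread.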

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 10:55   ` Marc Zyngier
  2015-09-28 14:16     ` Qais Yousef
@ 2015-10-05  8:22     ` Markos Chandras
  1 sibling, 0 replies; 35+ messages in thread
From: Markos Chandras @ 2015-10-05  8:22 UTC (permalink / raw)
  To: Marc Zyngier, linux-mips
  Cc: alex, Alex Smith, Thomas Gleixner, Jason Cooper, linux-kernel

Hi,

On 09/28/2015 11:55 AM, Marc Zyngier wrote:
> On 28/09/15 11:11, Markos Chandras wrote:
>> From: Alex Smith <alex.smith@imgtec.com>
>>
>> The GIC provides a "user-mode visible" section containing a mirror of
>> the counter registers which can be mapped into user memory. This will
>> be used by the VDSO time function implementations, so provide a
>> function to map it in.
>>
>> When the GIC is not enabled in Kconfig a dummy inline version of this
>> function is provided, along with "#define gic_present 0", so that we
>> don't have to litter the VDSO code with ifdefs.
>>
>> Cc: Thomas Gleixner <tglx@linutronix.de>
>> Cc: Jason Cooper <jason@lakedaemon.net>
>> Cc: Marc Zyngier <marc.zyngier@arm.com>
>> Cc: linux-kernel@vger.kernel.org
>> Signed-off-by: Alex Smith <alex.smith@imgtec.com>
>> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
>> ---
>>  drivers/irqchip/irq-mips-gic.c   | 27 +++++++++++++++++++++------
>>  include/linux/irqchip/mips-gic.h | 24 ++++++++++++++++++++++--
>>  2 files changed, 43 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
>> index af2f16bb8a94..c995b199ca32 100644
>> --- a/drivers/irqchip/irq-mips-gic.c
>> +++ b/drivers/irqchip/irq-mips-gic.c
>> @@ -13,6 +13,7 @@
>>  #include <linux/irq.h>
>>  #include <linux/irqchip.h>
>>  #include <linux/irqchip/mips-gic.h>
>> +#include <linux/mm.h>
>>  #include <linux/of_address.h>
>>  #include <linux/sched.h>
>>  #include <linux/smp.h>
>> @@ -29,6 +30,7 @@ struct gic_pcpu_mask {
>>  	DECLARE_BITMAP(pcpu_mask, GIC_MAX_INTRS);
>>  };
>>  
>> +static unsigned long gic_base_addr;
>>  static void __iomem *gic_base;
>>  static struct gic_pcpu_mask pcpu_masks[NR_CPUS];
>>  static DEFINE_SPINLOCK(gic_lock);
>> @@ -301,6 +303,19 @@ int gic_get_c0_fdc_int(void)
>>  				  GIC_LOCAL_TO_HWIRQ(GIC_LOCAL_INT_FDC));
>>  }
>>  
>> +int gic_map_user_section(struct vm_area_struct *vma, unsigned long base,
>> +			 unsigned long size)
>> +{
>> +	unsigned long pfn;
>> +
>> +	BUG_ON(!gic_present);
> 
> Why do you have a BUG() here, while you're just returning -1 in the case
> where CONFIG_MIPS_GIC is not defined? This feels overly harsh to me.

I suppose I could change that to return -1 if gic_present is not true.

> 
>> +	BUG_ON(size > USM_VISIBLE_SECTION_SIZE);
> 
> Same here.

But I think this is different. The size of mapping has to be less than
USM_VISIBLE_SECTION_SIZE because that's the maximum data size exposed by
the GIC chip for userspace use. So if that's not true, then BUG_ON seems
like a sensible thing to do.

> 
> - Does this code have to be in the irqchip driver? It really feels out
> of place, and I'd rather see a function that returns the mappable range
> to the VDSO code, where the mapping would occur.
> 
> Thanks,
> 

That does seem like a good idea. I will have a look.


-- 
markos

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-09-28 10:11 ` [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section Markos Chandras
  2015-09-28 10:55   ` Marc Zyngier
@ 2015-10-12  9:40   ` Markos Chandras
  2015-10-12  9:51     ` Marc Zyngier
  1 sibling, 1 reply; 35+ messages in thread
From: Markos Chandras @ 2015-10-12  9:40 UTC (permalink / raw)
  To: linux-mips
  Cc: Alex Smith, Thomas Gleixner, Jason Cooper, Marc Zyngier,
	linux-kernel, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

The GIC provides a "user-mode visible" section containing a mirror of
the counter registers which can be mapped into user memory. This will
be used by the VDSO time function implementations, so provide a
function to map it in.

When the GIC is not enabled in Kconfig a dummy inline version of this
function is provided, along with "#define gic_present 0", so that we
don't have to litter the VDSO code with ifdefs.
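
As a rough, standalone illustration of that stub pattern (not the kernel
code itself): DEMO_HAVE_GIC, struct demo_resource and the demo_* names
below are all made up for the sketch and stand in for CONFIG_MIPS_GIC,
struct resource and the real GIC symbols. The point is only that the
caller compiles unchanged in both configurations.

```c
/* DEMO_HAVE_GIC is a hypothetical switch playing the role of CONFIG_MIPS_GIC. */

/* Hypothetical stand-in for the kernel's struct resource. */
struct demo_resource {
	unsigned long start;
	unsigned long end;
};

#ifdef DEMO_HAVE_GIC

/* Real definitions would live in the driver. */
extern unsigned int demo_gic_present;
extern int demo_get_usm_range(struct demo_resource *res);

#else /* !DEMO_HAVE_GIC */

/* A constant 0 lets the compiler drop GIC-only branches in callers. */
#define demo_gic_present	0

static inline int demo_get_usm_range(struct demo_resource *res)
{
	/* Shouldn't be called when the GIC is compiled out. */
	(void)res;
	return -1;
}

#endif /* DEMO_HAVE_GIC */

/* Caller without any #ifdef: returns how many bytes to reserve for the GIC. */
static inline unsigned long demo_gic_map_size(unsigned long page_size)
{
	struct demo_resource res;

	if (!demo_gic_present)
		return 0;

	if (demo_get_usm_range(&res))
		return 0;

	return page_size;
}
```

With DEMO_HAVE_GIC undefined, demo_gic_map_size() always returns 0 and
the call to the stub is dead code.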

[markos.chandras@imgtec.com:
- Move mapping code to arch/mips/kernel/vdso.c and use a resource
type to get the GIC usermode information
- Avoid renaming function arguments and use __gic_base_addr to hold
the base GIC address prior to ioremap.]

Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Jason Cooper <jason@lakedaemon.net>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
Changes since v1:
- Move mapping code to arch/mips/kernel/vdso.c and use a resource
type to get the GIC usermode information
- Avoid renaming function arguments and use __gic_base_addr to hold
the base GIC address prior to ioremap.

http://www.linux-mips.org/archives/linux-mips/2015-09/msg00316.html
---
 drivers/irqchip/irq-mips-gic.c   | 14 ++++++++++++++
 include/linux/irqchip/mips-gic.h | 17 +++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
index af2f16bb8a94..392beebb81ee 100644
--- a/drivers/irqchip/irq-mips-gic.c
+++ b/drivers/irqchip/irq-mips-gic.c
@@ -29,6 +29,7 @@ struct gic_pcpu_mask {
 	DECLARE_BITMAP(pcpu_mask, GIC_MAX_INTRS);
 };
 
+static unsigned long __gic_base_addr;
 static void __iomem *gic_base;
 static struct gic_pcpu_mask pcpu_masks[NR_CPUS];
 static DEFINE_SPINLOCK(gic_lock);
@@ -301,6 +302,17 @@ int gic_get_c0_fdc_int(void)
 				  GIC_LOCAL_TO_HWIRQ(GIC_LOCAL_INT_FDC));
 }
 
+int gic_get_usm_range(struct resource *gic_usm_res)
+{
+	if (!gic_present)
+		return -1;
+
+	gic_usm_res->start = __gic_base_addr + USM_VISIBLE_SECTION_OFS;
+	gic_usm_res->end = gic_usm_res->start + (USM_VISIBLE_SECTION_SIZE - 1);
+
+	return 0;
+}
+
 static void gic_handle_shared_int(bool chained)
 {
 	unsigned int i, intr, virq, gic_reg_step = mips_cm_is64 ? 8 : 4;
@@ -790,6 +802,8 @@ static void __init __gic_init(unsigned long gic_base_addr,
 {
 	unsigned int gicconfig;
 
+	__gic_base_addr = gic_base_addr;
+
 	gic_base = ioremap_nocache(gic_base_addr, gic_addrspace_size);
 
 	gicconfig = gic_read(GIC_REG(SHARED, GIC_SH_CONFIG));
diff --git a/include/linux/irqchip/mips-gic.h b/include/linux/irqchip/mips-gic.h
index 4e6861605050..71ab7c548550 100644
--- a/include/linux/irqchip/mips-gic.h
+++ b/include/linux/irqchip/mips-gic.h
@@ -9,6 +9,7 @@
 #define __LINUX_IRQCHIP_MIPS_GIC_H
 
 #include <linux/clocksource.h>
+#include <linux/ioport.h>
 
 #define GIC_MAX_INTRS			256
 
@@ -245,6 +246,8 @@
 #define GIC_SHARED_TO_HWIRQ(x)	(GIC_SHARED_HWIRQ_BASE + (x))
 #define GIC_HWIRQ_TO_SHARED(x)	((x) - GIC_SHARED_HWIRQ_BASE)
 
+#ifdef CONFIG_MIPS_GIC
+
 extern unsigned int gic_present;
 
 extern void gic_init(unsigned long gic_base_addr,
@@ -264,4 +267,18 @@ extern unsigned int plat_ipi_resched_int_xlate(unsigned int);
 extern int gic_get_c0_compare_int(void);
 extern int gic_get_c0_perfcount_int(void);
 extern int gic_get_c0_fdc_int(void);
+extern int gic_get_usm_range(struct resource *gic_usm_res);
+
+#else /* CONFIG_MIPS_GIC */
+
+#define gic_present	0
+
+static inline int gic_get_usm_range(struct resource *gic_usm_res)
+{
+	/* Shouldn't be called. */
+	return -1;
+}
+
+#endif /* CONFIG_MIPS_GIC */
+
 #endif /* __LINUX_IRQCHIP_MIPS_GIC_H */
-- 
2.6.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-10-12  9:40   ` [PATCH v2 " Markos Chandras
@ 2015-10-12  9:51     ` Marc Zyngier
  2015-10-12 10:16       ` Thomas Gleixner
  0 siblings, 1 reply; 35+ messages in thread
From: Marc Zyngier @ 2015-10-12  9:51 UTC (permalink / raw)
  To: Markos Chandras, linux-mips
  Cc: Alex Smith, Thomas Gleixner, Jason Cooper, linux-kernel

On 12/10/15 10:40, Markos Chandras wrote:
> From: Alex Smith <alex.smith@imgtec.com>
> 
> The GIC provides a "user-mode visible" section containing a mirror of
> the counter registers which can be mapped into user memory. This will
> be used by the VDSO time function implementations, so provide a
> function to map it in.
> 
> When the GIC is not enabled in Kconfig a dummy inline version of this
> function is provided, along with "#define gic_present 0", so that we
> don't have to litter the VDSO code with ifdefs.
> 
> [markos.chandras@imgtec.com:
> - Move mapping code to arch/mips/kernel/vdso.c and use a resource
> type to get the GIC usermode information
> - Avoid renaming function arguments and use __gic_base_addr to hold
> the base GIC address prior to ioremap.]
> 
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Jason Cooper <jason@lakedaemon.net>
> Cc: Marc Zyngier <marc.zyngier@arm.com>
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Alex Smith <alex.smith@imgtec.com>
> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
> ---
> Changes since v1:
> - Move mapping code to arch/mips/kernel/vdso.c and use a resource
> type to get the GIC usermode information
> - Avoid renaming function arguments and use __gic_base_addr to hold
> the base GIC address prior to ioremap.
> 
> http://www.linux-mips.org/archives/linux-mips/2015-09/msg00316.html
> ---
>  drivers/irqchip/irq-mips-gic.c   | 14 ++++++++++++++
>  include/linux/irqchip/mips-gic.h | 17 +++++++++++++++++
>  2 files changed, 31 insertions(+)
> 
> diff --git a/drivers/irqchip/irq-mips-gic.c b/drivers/irqchip/irq-mips-gic.c
> index af2f16bb8a94..392beebb81ee 100644
> --- a/drivers/irqchip/irq-mips-gic.c
> +++ b/drivers/irqchip/irq-mips-gic.c
> @@ -29,6 +29,7 @@ struct gic_pcpu_mask {
>  	DECLARE_BITMAP(pcpu_mask, GIC_MAX_INTRS);
>  };
>  
> +static unsigned long __gic_base_addr;
>  static void __iomem *gic_base;
>  static struct gic_pcpu_mask pcpu_masks[NR_CPUS];
>  static DEFINE_SPINLOCK(gic_lock);
> @@ -301,6 +302,17 @@ int gic_get_c0_fdc_int(void)
>  				  GIC_LOCAL_TO_HWIRQ(GIC_LOCAL_INT_FDC));
>  }
>  
> +int gic_get_usm_range(struct resource *gic_usm_res)
> +{
> +	if (!gic_present)
> +		return -1;
> +
> +	gic_usm_res->start = __gic_base_addr + USM_VISIBLE_SECTION_OFS;
> +	gic_usm_res->end = gic_usm_res->start + (USM_VISIBLE_SECTION_SIZE - 1);
> +
> +	return 0;
> +}
> +
>  static void gic_handle_shared_int(bool chained)
>  {
>  	unsigned int i, intr, virq, gic_reg_step = mips_cm_is64 ? 8 : 4;
> @@ -790,6 +802,8 @@ static void __init __gic_init(unsigned long gic_base_addr,
>  {
>  	unsigned int gicconfig;
>  
> +	__gic_base_addr = gic_base_addr;
> +
>  	gic_base = ioremap_nocache(gic_base_addr, gic_addrspace_size);
>  
>  	gicconfig = gic_read(GIC_REG(SHARED, GIC_SH_CONFIG));
> diff --git a/include/linux/irqchip/mips-gic.h b/include/linux/irqchip/mips-gic.h
> index 4e6861605050..71ab7c548550 100644
> --- a/include/linux/irqchip/mips-gic.h
> +++ b/include/linux/irqchip/mips-gic.h
> @@ -9,6 +9,7 @@
>  #define __LINUX_IRQCHIP_MIPS_GIC_H
>  
>  #include <linux/clocksource.h>
> +#include <linux/ioport.h>
>  
>  #define GIC_MAX_INTRS			256
>  
> @@ -245,6 +246,8 @@
>  #define GIC_SHARED_TO_HWIRQ(x)	(GIC_SHARED_HWIRQ_BASE + (x))
>  #define GIC_HWIRQ_TO_SHARED(x)	((x) - GIC_SHARED_HWIRQ_BASE)
>  
> +#ifdef CONFIG_MIPS_GIC
> +
>  extern unsigned int gic_present;
>  
>  extern void gic_init(unsigned long gic_base_addr,
> @@ -264,4 +267,18 @@ extern unsigned int plat_ipi_resched_int_xlate(unsigned int);
>  extern int gic_get_c0_compare_int(void);
>  extern int gic_get_c0_perfcount_int(void);
>  extern int gic_get_c0_fdc_int(void);
> +extern int gic_get_usm_range(struct resource *gic_usm_res);
> +
> +#else /* CONFIG_MIPS_GIC */
> +
> +#define gic_present	0
> +
> +static inline int gic_get_usm_range(struct resource *gic_usm_res)
> +{
> +	/* Shouldn't be called. */
> +	return -1;
> +}
> +
> +#endif /* CONFIG_MIPS_GIC */
> +
>  #endif /* __LINUX_IRQCHIP_MIPS_GIC_H */
> 

This looks much better than the previous version (though I cannot find
the two other patches on LKML just yet).

FWIW:

Reviewed-by: Marc Zyngier <marc.zyngier@arm.com>

	M.
-- 
Jazz is not dead. It just smells funny...

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-10-12  9:51     ` Marc Zyngier
@ 2015-10-12 10:16       ` Thomas Gleixner
  2015-10-15  9:37         ` Qais Yousef
  0 siblings, 1 reply; 35+ messages in thread
From: Thomas Gleixner @ 2015-10-12 10:16 UTC (permalink / raw)
  To: Marc Zyngier
  Cc: Markos Chandras, linux-mips, Alex Smith, Jason Cooper, linux-kernel

On Mon, 12 Oct 2015, Marc Zyngier wrote:
> On 12/10/15 10:40, Markos Chandras wrote:
> > From: Alex Smith <alex.smith@imgtec.com>
> > 
> > The GIC provides a "user-mode visible" section containing a mirror of
> > the counter registers which can be mapped into user memory. This will
> > be used by the VDSO time function implementations, so provide a
> > function to map it in.

<SNIP>
 
> 
> This looks much better than the previous version (though I cannot find
> the two other patches on LKML just yet).

Yes, it looks better. But I really have to ask the question why we are
trying to pack the world and some more into an irq chip driver. We
already have the completely misplaced gic_read_count() there.

While I understand that all of this is in the GIC block at least
according to the documentation, technically it's different hardware
blocks. And logically it's different as well.

So why not describe the various blocks (interrupt controller, timer,
shadow timer) as separate entities in the device tree and let each
subsystem look them up on their own. This cross subsystem hackery is
just horrible and does not buy anything except merge dependencies and
other avoidable hassle.

Thoughts?

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v2 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-09-28 10:12 ` [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime() Markos Chandras
  2015-09-28 13:15   ` kbuild test robot
@ 2015-10-12 10:24   ` Markos Chandras
  2015-10-21  8:57     ` [PATCH v3 " Markos Chandras
  1 sibling, 1 reply; 35+ messages in thread
From: Markos Chandras @ 2015-10-12 10:24 UTC (permalink / raw)
  To: linux-mips; +Cc: Alex Smith, linux-kernel, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

Add user-mode implementations of gettimeofday() and clock_gettime() to
the VDSO. This is currently usable with 2 clocksources: the CP0 count
register, which is accessible to user-mode via RDHWR on R2 and later
cores, or the MIPS Global Interrupt Controller (GIC) timer, which
provides a "user-mode visible" section containing a mirror of its
counter registers. This section must be mapped into user memory, which
is done below the VDSO data page.
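
The resulting address layout can be sketched as plain arithmetic. This
is an illustrative userspace model only: DEMO_PAGE_SIZE is an assumed
page size and struct demo_vdso_layout / demo_layout() are hypothetical
names, but the ordering (GIC page, then data page, then VDSO image)
follows arch_setup_additional_pages() in this patch.

```c
#include <stdbool.h>

#define DEMO_PAGE_SIZE	4096UL	/* assumed page size for the example */

/* Hypothetical summary of where each piece ends up. */
struct demo_vdso_layout {
	unsigned long gic_addr;		/* 0 when no GIC page is mapped */
	unsigned long data_addr;	/* VDSO data page */
	unsigned long vdso_addr;	/* start of the VDSO image */
	unsigned long total_size;	/* size requested for the whole area */
};

static struct demo_vdso_layout demo_layout(unsigned long base,
					   unsigned long image_size,
					   bool gic_present)
{
	unsigned long gic_size = gic_present ? DEMO_PAGE_SIZE : 0;
	struct demo_vdso_layout l;

	/* The GIC user page, if any, sits at the very start of the area. */
	l.gic_addr = gic_present ? base : 0;

	/* The data page follows, and the image follows the data page. */
	l.data_addr = base + gic_size;
	l.vdso_addr = l.data_addr + DEMO_PAGE_SIZE;

	l.total_size = gic_size + DEMO_PAGE_SIZE + image_size;

	return l;
}
```

This is why the VDSO can find the data page at (image base - PAGE_SIZE)
and the GIC page at (data page - PAGE_SIZE) without any extra
bookkeeping.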

When a supported clocksource is not in use, the VDSO functions will
return -ENOSYS, which causes libc to fall back on the standard syscall
path.

When support for neither of these clocksources is compiled into the
kernel at all, the VDSO still provides clock_gettime(), as the coarse
realtime/monotonic clocks can still be implemented. However,
gettimeofday() is not provided in this case as nothing can be done
without a suitable clocksource. This causes the symbol lookup to fail
in libc and it will then always use the standard syscall path.
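
For illustration, the fallback behaviour expected from libc could look
roughly like the sketch below. This is not the glibc patch; the
demo_* names and struct demo_timespec are hypothetical, and the "VDSO"
and "syscall" entry points are passed in as function pointers so the
logic can be exercised in isolation. A NULL VDSO pointer models the
case where the symbol lookup failed.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct timespec. */
struct demo_timespec {
	long long tv_sec;
	long tv_nsec;
};

typedef int (*demo_gettime_fn)(int clkid, struct demo_timespec *ts);

/*
 * Try the VDSO entry point first. Fall back to the syscall path when the
 * symbol is missing (NULL) or when the VDSO reports -ENOSYS.
 */
static int demo_clock_gettime(demo_gettime_fn vdso_fn,
			      demo_gettime_fn syscall_fn,
			      int clkid, struct demo_timespec *ts)
{
	if (vdso_fn) {
		int ret = vdso_fn(clkid, ts);

		if (ret != -ENOSYS)
			return ret;
	}

	return syscall_fn(clkid, ts);
}

/* Stub "VDSO" with no usable clocksource. */
static int demo_vdso_unsupported(int clkid, struct demo_timespec *ts)
{
	(void)clkid;
	(void)ts;
	return -ENOSYS;
}

/* Stub "VDSO" that can answer by itself. */
static int demo_vdso_ok(int clkid, struct demo_timespec *ts)
{
	(void)clkid;
	ts->tv_sec = 1;
	ts->tv_nsec = 0;
	return 0;
}

/* Stub for the standard syscall path. */
static int demo_syscall(int clkid, struct demo_timespec *ts)
{
	(void)clkid;
	ts->tv_sec = 2;
	ts->tv_nsec = 0;
	return 0;
}
```

The stubs write distinct tv_sec values purely so a caller can tell
which path produced the result.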

This patch includes a workaround for a bug in QEMU which results in
RDHWR on the CP0 count register always returning a constant (incorrect)
value. A fix for this has been submitted, and the workaround can be
removed after the fix has been in stable releases for a reasonable
amount of time.
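
The detection idea is simply "read the counter repeatedly and see
whether it ever changes". A minimal standalone sketch, with the counter
read abstracted behind a function pointer (demo_read_count_fn and the
demo_* helpers are hypothetical; the real code uses RDHWR directly):

```c
#include <stdbool.h>

typedef unsigned int (*demo_read_count_fn)(void);

/*
 * Returns true as soon as two consecutive reads differ, false if the
 * counter stayed constant for all 'tries' reads.
 */
static bool demo_count_usable(demo_read_count_fn read_count,
			      unsigned int tries)
{
	unsigned int prev = read_count();
	unsigned int i;

	for (i = 0; i < tries; i++) {
		unsigned int curr = read_count();

		if (curr != prev)
			return true;

		prev = curr;
	}

	return false;
}

/* Models the broken case: the read always yields the same value. */
static unsigned int demo_stuck_count(void)
{
	return 0x1234;
}

/* Models a working counter that advances on every read. */
static unsigned int demo_ticking_count(void)
{
	static unsigned int count;

	return count++;
}
```

On real hardware a very slow counter could in principle look stuck over
a short loop, so the number of tries is a judgment call.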

A simple performance test which calls gettimeofday() 1000 times in a
loop and calculates the average execution time gives the following
results on a Malta + I6400 (running at 20MHz):

 - Syscall:    ~31000 ns
 - VDSO (GIC): ~15000 ns
 - VDSO (CP0): ~9500 ns
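
The test program itself is not part of this series. A rough sketch of
such a measurement, assuming a POSIX userland (demo_avg_gettimeofday_ns
is a hypothetical helper, and timing with gettimeofday() itself limits
resolution to microseconds), could be:

```c
#include <stddef.h>
#include <sys/time.h>

/*
 * Call gettimeofday() 'iterations' times and return the average cost of
 * one call in nanoseconds, measured with gettimeofday() itself.
 */
static double demo_avg_gettimeofday_ns(unsigned int iterations)
{
	struct timeval start, end, scratch;
	double elapsed_us;
	unsigned int i;

	if (iterations == 0)
		return 0.0;

	gettimeofday(&start, NULL);

	for (i = 0; i < iterations; i++)
		gettimeofday(&scratch, NULL);

	gettimeofday(&end, NULL);

	elapsed_us = (double)(end.tv_sec - start.tv_sec) * 1e6 +
		     (double)(end.tv_usec - start.tv_usec);

	return elapsed_us * 1000.0 / iterations;
}
```

Calling demo_avg_gettimeofday_ns(1000) matches the iteration count
described above; absolute numbers will of course depend on the platform
and on which path libc ends up using.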

[markos.chandras@imgtec.com:
- Minor code re-arrangements in order for mappings to be made
in the order they appear to the process' address space.
- Move do_{monotonic, realtime} outside of the MIPS_CLOCK_VSYSCALL ifdef
- Use gic_get_usm_range so we can do the GIC mapping in the
arch/mips/kernel/vdso instead of the GIC irqchip driver]

Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
Changes since v1:
- Use gic_get_usm_range so we can do the GIC mapping in the
arch/mips/kernel/vdso instead of the GIC irqchip driver
---
 arch/mips/Kconfig                    |   5 +
 arch/mips/include/asm/clocksource.h  |  29 +++++
 arch/mips/include/asm/vdso.h         |  68 +++++++++-
 arch/mips/kernel/csrc-r4k.c          |  44 +++++++
 arch/mips/kernel/vdso.c              |  71 ++++++++++-
 arch/mips/vdso/Makefile              |   2 +-
 arch/mips/vdso/gettimeofday.c        | 232 +++++++++++++++++++++++++++++++++++
 arch/mips/vdso/vdso.h                |   9 ++
 arch/mips/vdso/vdso.lds.S            |   3 +
 drivers/clocksource/mips-gic-timer.c |   7 +-
 10 files changed, 459 insertions(+), 11 deletions(-)
 create mode 100644 arch/mips/include/asm/clocksource.h
 create mode 100644 arch/mips/vdso/gettimeofday.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e3aa5b0b4ef1..68f4f246887c 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -60,6 +60,8 @@ config MIPS
 	select SYSCTL_EXCEPTION_TRACE
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select GENERIC_TIME_VSYSCALL
+	select ARCH_CLOCKSOURCE_DATA
 
 menu "Machine selection"
 
@@ -1036,6 +1038,9 @@ config CSRC_R4K
 config CSRC_SB1250
 	bool
 
+config MIPS_CLOCK_VSYSCALL
+	def_bool CSRC_R4K || CLKSRC_MIPS_GIC
+
 config GPIO_TXX9
 	select ARCH_REQUIRE_GPIOLIB
 	bool
diff --git a/arch/mips/include/asm/clocksource.h b/arch/mips/include/asm/clocksource.h
new file mode 100644
index 000000000000..3deb1d0c1a94
--- /dev/null
+++ b/arch/mips/include/asm/clocksource.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifndef __ASM_CLOCKSOURCE_H
+#define __ASM_CLOCKSOURCE_H
+
+#include <linux/types.h>
+
+/* VDSO clocksources. */
+#define VDSO_CLOCK_NONE		0	/* No suitable clocksource. */
+#define VDSO_CLOCK_R4K		1	/* Use the coprocessor 0 count. */
+#define VDSO_CLOCK_GIC		2	/* Use the GIC. */
+
+/**
+ * struct arch_clocksource_data - Architecture-specific clocksource information.
+ * @vdso_clock_mode: Method the VDSO should use to access the clocksource.
+ */
+struct arch_clocksource_data {
+	u8 vdso_clock_mode;
+};
+
+#endif /* __ASM_CLOCKSOURCE_H */
diff --git a/arch/mips/include/asm/vdso.h b/arch/mips/include/asm/vdso.h
index db2d45be8f2e..8f4ca5dd992b 100644
--- a/arch/mips/include/asm/vdso.h
+++ b/arch/mips/include/asm/vdso.h
@@ -13,6 +13,8 @@
 
 #include <linux/mm_types.h>
 
+#include <asm/barrier.h>
+
 /**
  * struct mips_vdso_image - Details of a VDSO image.
  * @data: Pointer to VDSO image data (page-aligned).
@@ -53,18 +55,82 @@ extern struct mips_vdso_image vdso_image_n32;
 
 /**
  * union mips_vdso_data - Data provided by the kernel for the VDSO.
+ * @xtime_sec:		Current real time (seconds part).
+ * @xtime_nsec:		Current real time (nanoseconds part, shifted).
+ * @wall_to_mono_sec:	Wall-to-monotonic offset (seconds part).
+ * @wall_to_mono_nsec:	Wall-to-monotonic offset (nanoseconds part).
+ * @seq_count:		Counter to synchronise updates (odd = updating).
+ * @cs_shift:		Clocksource shift value.
+ * @clock_mode:		Clocksource to use for time functions.
+ * @cs_mult:		Clocksource multiplier value.
+ * @cs_cycle_last:	Clock cycle value at last update.
+ * @cs_mask:		Clocksource mask value.
+ * @tz_minuteswest:	Minutes west of Greenwich (from timezone).
+ * @tz_dsttime:		Type of DST correction (from timezone).
  *
  * This structure contains data needed by functions within the VDSO. It is
- * populated by the kernel and mapped read-only into user memory.
+ * populated by the kernel and mapped read-only into user memory. The time
+ * fields are mirrors of internal data from the timekeeping infrastructure.
  *
  * Note: Care should be taken when modifying as the layout must remain the same
  * for both 64- and 32-bit (for 32-bit userland on 64-bit kernel).
  */
 union mips_vdso_data {
 	struct {
+		u64 xtime_sec;
+		u64 xtime_nsec;
+		u32 wall_to_mono_sec;
+		u32 wall_to_mono_nsec;
+		u32 seq_count;
+		u32 cs_shift;
+		u8 clock_mode;
+		u32 cs_mult;
+		u64 cs_cycle_last;
+		u64 cs_mask;
+		s32 tz_minuteswest;
+		s32 tz_dsttime;
 	};
 
 	u8 page[PAGE_SIZE];
 };
 
+static inline u32 vdso_data_read_begin(const union mips_vdso_data *data)
+{
+	u32 seq;
+
+	while (true) {
+		seq = ACCESS_ONCE(data->seq_count);
+		if (likely(!(seq & 1))) {
+			/* Paired with smp_wmb() in vdso_data_write_*(). */
+			smp_rmb();
+			return seq;
+		}
+
+		cpu_relax();
+	}
+}
+
+static inline bool vdso_data_read_retry(const union mips_vdso_data *data,
+					u32 start_seq)
+{
+	/* Paired with smp_wmb() in vdso_data_write_*(). */
+	smp_rmb();
+	return unlikely(data->seq_count != start_seq);
+}
+
+static inline void vdso_data_write_begin(union mips_vdso_data *data)
+{
+	++data->seq_count;
+
+	/* Ensure sequence update is written before other data page values. */
+	smp_wmb();
+}
+
+static inline void vdso_data_write_end(union mips_vdso_data *data)
+{
+	/* Ensure data values are written before updating sequence again. */
+	smp_wmb();
+	++data->seq_count;
+}
+
 #endif /* __ASM_VDSO_H */
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index e5ed7ada1433..1f910563fdf6 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -28,6 +28,43 @@ static u64 notrace r4k_read_sched_clock(void)
 	return read_c0_count();
 }
 
+static inline unsigned int rdhwr_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+static bool rdhwr_count_usable(void)
+{
+	unsigned int prev, curr, i;
+
+	/*
+	 * Older QEMUs have a broken implementation of RDHWR for the CP0 count
+	 * which always returns a constant value. Try to identify this and don't
+	 * use it in the VDSO if it is broken. This workaround can be removed
+	 * once the fix has been in QEMU stable for a reasonable amount of time.
+	 */
+	for (i = 0, prev = rdhwr_count(); i < 100; i++) {
+		curr = rdhwr_count();
+
+		if (curr != prev)
+			return true;
+
+		prev = curr;
+	}
+
+	pr_warn("Not using R4K clocksource in VDSO due to broken RDHWR\n");
+	return false;
+}
+
 int __init init_r4k_clocksource(void)
 {
 	if (!cpu_has_counter || !mips_hpt_frequency)
@@ -36,6 +73,13 @@ int __init init_r4k_clocksource(void)
 	/* Calculate a somewhat reasonable rating value */
 	clocksource_mips.rating = 200 + mips_hpt_frequency / 10000000;
 
+	/*
+	 * R2 onwards makes the count accessible to user mode so it can be used
+	 * by the VDSO (HWREna is configured by configure_hwrena()).
+	 */
+	if (cpu_has_mips_r2_r6 && rdhwr_count_usable())
+		clocksource_mips.archdata.vdso_clock_mode = VDSO_CLOCK_R4K;
+
 	clocksource_register_hz(&clocksource_mips, mips_hpt_frequency);
 
 	sched_clock_register(r4k_read_sched_clock, 32, mips_hpt_frequency);
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 56cc3c4377fb..975e99759bab 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -12,9 +12,12 @@
 #include <linux/elf.h>
 #include <linux/err.h>
 #include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/irqchip/mips-gic.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/timekeeper_internal.h>
 
 #include <asm/abi.h>
 #include <asm/vdso.h>
@@ -23,7 +26,7 @@
 static union mips_vdso_data vdso_data __page_aligned_data;
 
 /*
- * Mapping for the VDSO data pages. The real pages are mapped manually, as
+ * Mapping for the VDSO data/GIC pages. The real pages are mapped manually, as
  * what we map and where within the area they are mapped is determined at
  * runtime.
  */
@@ -64,25 +67,67 @@ static int __init init_vdso(void)
 }
 subsys_initcall(init_vdso);
 
+void update_vsyscall(struct timekeeper *tk)
+{
+	vdso_data_write_begin(&vdso_data);
+
+	vdso_data.xtime_sec = tk->xtime_sec;
+	vdso_data.xtime_nsec = tk->tkr_mono.xtime_nsec;
+	vdso_data.wall_to_mono_sec = tk->wall_to_monotonic.tv_sec;
+	vdso_data.wall_to_mono_nsec = tk->wall_to_monotonic.tv_nsec;
+	vdso_data.cs_shift = tk->tkr_mono.shift;
+
+	vdso_data.clock_mode = tk->tkr_mono.clock->archdata.vdso_clock_mode;
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.cs_mult = tk->tkr_mono.mult;
+		vdso_data.cs_cycle_last = tk->tkr_mono.cycle_last;
+		vdso_data.cs_mask = tk->tkr_mono.mask;
+	}
+
+	vdso_data_write_end(&vdso_data);
+}
+
+void update_vsyscall_tz(void)
+{
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.tz_minuteswest = sys_tz.tz_minuteswest;
+		vdso_data.tz_dsttime = sys_tz.tz_dsttime;
+	}
+}
+
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
 	struct mips_vdso_image *image = current->thread.abi->vdso;
 	struct mm_struct *mm = current->mm;
-	unsigned long base, vdso_addr;
+	unsigned long gic_size, vvar_size, size, base, data_addr, vdso_addr;
 	struct vm_area_struct *vma;
+	struct resource gic_res;
 	int ret;
 
 	down_write(&mm->mmap_sem);
 
-	base = get_unmapped_area(NULL, 0, PAGE_SIZE + image->size, 0, 0);
+	/*
+	 * Determine total area size. This includes the VDSO data itself, the
+	 * data page, and the GIC user page if present. Always create a mapping
+	 * for the GIC user area if the GIC is present regardless of whether it
+	 * is the current clocksource, in case it comes into use later on. We
+	 * only map a page even though the total area is 64K, as we only need
+	 * the counter registers at the start.
+	 */
+	gic_size = gic_present ? PAGE_SIZE : 0;
+	vvar_size = gic_size + PAGE_SIZE;
+	size = vvar_size + image->size;
+
+	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
-	vdso_addr = base + PAGE_SIZE;
+	data_addr = base + gic_size;
+	vdso_addr = data_addr + PAGE_SIZE;
 
-	vma = _install_special_mapping(mm, base, PAGE_SIZE,
+	vma = _install_special_mapping(mm, base, vvar_size,
 				       VM_READ | VM_MAYREAD,
 				       &vdso_vvar_mapping);
 	if (IS_ERR(vma)) {
@@ -90,8 +135,22 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		goto out;
 	}
 
+	/* Map GIC user page. */
+	if (gic_size) {
+		ret = gic_get_usm_range(&gic_res);
+		if (ret)
+			goto out;
+
+		ret = io_remap_pfn_range(vma, base,
+					 gic_res.start >> PAGE_SHIFT,
+					 gic_size,
+					 pgprot_noncached(PAGE_READONLY));
+		if (ret)
+			goto out;
+	}
+
 	/* Map data page. */
-	ret = remap_pfn_range(vma, base,
+	ret = remap_pfn_range(vma, data_addr,
 			      virt_to_phys(&vdso_data) >> PAGE_SHIFT,
 			      PAGE_SIZE, PAGE_READONLY);
 	if (ret)
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile
index 9a8a6b373eb0..c2820997ea9b 100644
--- a/arch/mips/vdso/Makefile
+++ b/arch/mips/vdso/Makefile
@@ -1,5 +1,5 @@
 # Objects to go into the VDSO.
-obj-vdso-y := elf.o sigreturn.o
+obj-vdso-y := elf.o gettimeofday.o sigreturn.o
 
 # Common compiler flags between ABIs.
 ccflags-vdso := \
diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
new file mode 100644
index 000000000000..ce89c9e294f9
--- /dev/null
+++ b/arch/mips/vdso/gettimeofday.c
@@ -0,0 +1,232 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include "vdso.h"
+
+#include <linux/compiler.h>
+#include <linux/irqchip/mips-gic.h>
+#include <linux/time.h>
+
+#include <asm/clocksource.h>
+#include <asm/io.h>
+#include <asm/mips-cm.h>
+#include <asm/unistd.h>
+#include <asm/vdso.h>
+
+static __always_inline int do_realtime_coarse(struct timespec *ts,
+					      const union mips_vdso_data *data)
+{
+	u32 start_seq;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	return 0;
+}
+
+static __always_inline int do_monotonic_coarse(struct timespec *ts,
+					       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	timespec_add_ns(ts, to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_CSRC_R4K
+
+static __always_inline u64 read_r4k_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+#endif
+
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static __always_inline u64 read_gic_count(const union mips_vdso_data *data)
+{
+	void __iomem *gic = get_gic(data);
+	u32 hi, hi2, lo;
+
+	do {
+		hi = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+		lo = __raw_readl(gic + GIC_UMV_SH_COUNTER_31_00_OFS);
+		hi2 = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+	} while (hi2 != hi);
+
+	return (((u64)hi) << 32) + lo;
+}
+
+#endif
+
+static __always_inline u64 get_ns(const union mips_vdso_data *data)
+{
+	u64 cycle_now, delta, nsec;
+
+	switch (data->clock_mode) {
+#ifdef CONFIG_CSRC_R4K
+	case VDSO_CLOCK_R4K:
+		cycle_now = read_r4k_count();
+		break;
+#endif
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+	case VDSO_CLOCK_GIC:
+		cycle_now = read_gic_count(data);
+		break;
+#endif
+	default:
+		return 0;
+	}
+
+	delta = (cycle_now - data->cs_cycle_last) & data->cs_mask;
+
+	nsec = (delta * data->cs_mult) + data->xtime_nsec;
+	nsec >>= data->cs_shift;
+
+	return nsec;
+}
+
+static __always_inline int do_realtime(struct timespec *ts,
+				       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns);
+
+	return 0;
+}
+
+static __always_inline int do_monotonic(struct timespec *ts,
+					const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns + to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
+
+/*
+ * This is behind the ifdef so that we don't provide the symbol when there's no
+ * possibility of there being a usable clocksource, because there's nothing we
+ * can do without it. When libc fails the symbol lookup it should fall back on
+ * the standard syscall path.
+ */
+int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	struct timespec ts;
+	int ret;
+
+	ret = do_realtime(&ts, data);
+	if (ret)
+		return ret;
+
+	if (tv) {
+		tv->tv_sec = ts.tv_sec;
+		tv->tv_usec = ts.tv_nsec / 1000;
+	}
+
+	if (tz) {
+		tz->tz_minuteswest = data->tz_minuteswest;
+		tz->tz_dsttime = data->tz_dsttime;
+	}
+
+	return 0;
+}
+
+#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */
+
+int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	int ret;
+
+	switch (clkid) {
+	case CLOCK_REALTIME_COARSE:
+		ret = do_realtime_coarse(ts, data);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = do_monotonic_coarse(ts, data);
+		break;
+	case CLOCK_REALTIME:
+		ret = do_realtime(ts, data);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = do_monotonic(ts, data);
+		break;
+	default:
+		ret = -ENOSYS;
+		break;
+	}
+
+	/* If we return -ENOSYS libc should fall back to a syscall. */
+	return ret;
+}
diff --git a/arch/mips/vdso/vdso.h b/arch/mips/vdso/vdso.h
index 5dfe1e5fac14..92c1c20832e6 100644
--- a/arch/mips/vdso/vdso.h
+++ b/arch/mips/vdso/vdso.h
@@ -75,4 +75,13 @@ static inline const union mips_vdso_data *get_vdso_data(void)
 	return (const union mips_vdso_data *)(get_vdso_base() - PAGE_SIZE);
 }
 
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static inline void __iomem *get_gic(const union mips_vdso_data *data)
+{
+	return (void __iomem *)data - PAGE_SIZE;
+}
+
+#endif /* CONFIG_CLKSRC_MIPS_GIC */
+
 #endif /* __ASSEMBLY__ */
diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
index 21655b6fefc5..0bda37c5a1e6 100644
--- a/arch/mips/vdso/vdso.lds.S
+++ b/arch/mips/vdso/vdso.lds.S
@@ -95,6 +95,9 @@ PHDRS
 VERSION
 {
 	LINUX_2.6 {
+	global:
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
 	local: *;
 	};
 }
diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index 02a1945e5093..89d3e4d7900c 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -140,9 +140,10 @@ static cycle_t gic_hpt_read(struct clocksource *cs)
 }
 
 static struct clocksource gic_clocksource = {
-	.name	= "GIC",
-	.read	= gic_hpt_read,
-	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.name		= "GIC",
+	.read		= gic_hpt_read,
+	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
+	.archdata	= { .vdso_clock_mode = VDSO_CLOCK_GIC },
 };
 
 static void __init __gic_clocksource_init(void)
-- 
2.6.1


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-10-12 10:16       ` Thomas Gleixner
@ 2015-10-15  9:37         ` Qais Yousef
  2015-10-15 10:18           ` Thomas Gleixner
  0 siblings, 1 reply; 35+ messages in thread
From: Qais Yousef @ 2015-10-15  9:37 UTC (permalink / raw)
  To: Thomas Gleixner, Marc Zyngier
  Cc: Markos Chandras, linux-mips, Alex Smith, Jason Cooper, linux-kernel

On 10/12/2015 11:16 AM, Thomas Gleixner wrote:
> On Mon, 12 Oct 2015, Marc Zyngier wrote:
>> On 12/10/15 10:40, Markos Chandras wrote:
>>> From: Alex Smith <alex.smith@imgtec.com>
>>>
>>> The GIC provides a "user-mode visible" section containing a mirror of
>>> the counter registers which can be mapped into user memory. This will
>>> be used by the VDSO time function implementations, so provide a
>>> function to map it in.
> <SNIP>
>   
>> This looks much better than the previous version (though I cannot find
>> the two other patches on LKML just yet).
> Yes, it looks better. But I really have to ask the question why we are
> trying to pack the world and some more into an irq chip driver. We
> already have the completely misplaced gic_read_count() there.

This code has a bad history. It was scattered all over the place in arch 
code. Andrew Bresticker did a good job cleaning it up and moved it to 
this irqchip driver.

     https://lkml.org/lkml/2014/9/18/487
     https://lkml.org/lkml/2014/10/20/481

>
> While I understand that all of this is in the GIC block at least
> according to the documentation, technically it's different hardware
> blocks. And logically it's different as well.

Yes but they're exposed through the same register interface.

>
> So why not describe the various blocks (interrupt controller, timer,
> shadow timer) as separate entities in the device tree and let each
> subsystem look them up on their own. This cross subsystem hackery is
> just horrible and does not buy anything except merge dependencies and
> other avoidable hassle.

There's a mips-gic-timer driver in drivers/clocksource. But in device 
tree it's a subnode of the irqchip driver.

http://lxr.free-electrons.com/source/drivers/clocksource/mips-gic-timer.c
http://lxr.free-electrons.com/source/Documentation/devicetree/bindings/interrupt-controller/mips-gic.txt

>
> Thoughts?
>

It could be refactored but the DT binding already specifies the GIC 
timer as a subnode of GIC. Exposing this usermode register is the only 
thing left in the register set that GIC driver wasn't dealing with.

Little gain in changing all of this now I think?

Thanks,
Qais

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v2 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section
  2015-10-15  9:37         ` Qais Yousef
@ 2015-10-15 10:18           ` Thomas Gleixner
  0 siblings, 0 replies; 35+ messages in thread
From: Thomas Gleixner @ 2015-10-15 10:18 UTC (permalink / raw)
  To: Qais Yousef
  Cc: Marc Zyngier, Markos Chandras, linux-mips, Alex Smith,
	Jason Cooper, linux-kernel

On Thu, 15 Oct 2015, Qais Yousef wrote:
> It could be refactored but the DT binding already specifies the GIC timer as a
> subnode of GIC. Exposing this usermode register is the only thing left in the
> register set that GIC driver wasn't dealing with.
> 
> Little gain in changing all of this now I think?

Well, the not so little gain is clear separation and a sane design.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 35+ messages in thread

* [PATCH v3 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-12 10:24   ` [PATCH v2 " Markos Chandras
@ 2015-10-21  8:57     ` Markos Chandras
  2015-10-23  1:41       ` [v3, " Leonid Yegoshin
  2016-01-25 22:36       ` [PATCH v3 " Hauke Mehrtens
  0 siblings, 2 replies; 35+ messages in thread
From: Markos Chandras @ 2015-10-21  8:57 UTC (permalink / raw)
  To: linux-mips; +Cc: Alex Smith, linux-kernel, Markos Chandras

From: Alex Smith <alex.smith@imgtec.com>

Add user-mode implementations of gettimeofday() and clock_gettime() to
the VDSO. This is currently usable with 2 clocksources: the CP0 count
register, which is accessible to user-mode via RDHWR on R2 and later
cores, or the MIPS Global Interrupt Controller (GIC) timer, which
provides a "user-mode visible" section containing a mirror of its
counter registers. This section must be mapped into user memory, which
is done below the VDSO data page.
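
As a rough illustration of the layout described above (GIC user page,
then the data page, then the VDSO image), the sketch below computes the
two addresses the same way get_vdso_data() and get_gic() do later in
this patch. The example_* names and the 4K page size are assumptions
made for the example; the kernel uses PAGE_SIZE.

```c
#include <stdint.h>

/* Assumed page size for illustration only; the kernel uses PAGE_SIZE. */
#define EXAMPLE_PAGE_SIZE 4096UL

/* The data page sits immediately below the VDSO image. */
static inline uintptr_t example_data_addr(uintptr_t vdso_base)
{
	return vdso_base - EXAMPLE_PAGE_SIZE;
}

/* The GIC user page, when mapped, sits immediately below the data page. */
static inline uintptr_t example_gic_addr(uintptr_t vdso_base)
{
	return example_data_addr(vdso_base) - EXAMPLE_PAGE_SIZE;
}
```

Keeping both pages at fixed offsets from the image base means the VDSO
code needs no relocations or extra pointers to find them.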

When a supported clocksource is not in use, the VDSO functions will
return -ENOSYS, which causes libc to fall back on the standard syscall
path.
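
The fallback described above could look roughly like the sketch below
on the libc side. This is not the glibc patch referenced in the cover
letter: the function pointer type, the stub and example_clock_gettime()
are hypothetical names used only to show the control flow.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

/* Hypothetical pointer type for a looked-up __vdso_clock_gettime. */
typedef int (*vdso_clock_gettime_t)(clockid_t, struct timespec *);

/* Stand-in for a VDSO whose current clocksource is not supported. */
static int stub_vdso_enosys(clockid_t clkid, struct timespec *ts)
{
	(void)clkid;
	(void)ts;
	return -ENOSYS;
}

/*
 * Try the VDSO function first. If the symbol lookup failed (NULL) or
 * the VDSO reports -ENOSYS, use the real system call instead. Note
 * that syscall() returns -1 and sets errno on failure.
 */
static int example_clock_gettime(vdso_clock_gettime_t vdso_fn,
				 clockid_t clkid, struct timespec *ts)
{
	if (vdso_fn) {
		int ret = vdso_fn(clkid, ts);

		if (ret != -ENOSYS)
			return ret;
	}

	return (int)syscall(SYS_clock_gettime, clkid, ts);
}
```

Passing stub_vdso_enosys or NULL as vdso_fn exercises the two fallback
cases: an unsupported clocksource and a missing symbol.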

When support for neither of these clocksources is compiled into the
kernel at all, the VDSO still provides clock_gettime(), as the coarse
realtime/monotonic clocks can still be implemented. However,
gettimeofday() is not provided in this case as nothing can be done
without a suitable clocksource. This causes the symbol lookup to fail
in libc and it will then always use the standard syscall path.

This patch includes a workaround for a bug in QEMU which results in
RDHWR on the CP0 count register always returning a constant (incorrect)
value. A fix for this has been submitted, and the workaround can be
removed after the fix has been in stable releases for a reasonable
amount of time.

A simple performance test which calls gettimeofday() 1000 times in a
loop and calculates the average execution time gives the following
results on a Malta + I6400 (running at 20MHz):

 - Syscall:    ~31000 ns
 - VDSO (GIC): ~15000 ns
 - VDSO (CP0): ~9500 ns

[markos.chandras@imgtec.com:
- Minor code re-arrangements in order for mappings to be made
in the order they appear to the process' address space.
- Move do_{monotonic, realtime} outside of the MIPS_CLOCK_VSYSCALL ifdef
- Use gic_get_usm_range so we can do the GIC mapping in the
arch/mips/kernel/vdso instead of the GIC irqchip driver]

Cc: linux-kernel@vger.kernel.org
Signed-off-by: Alex Smith <alex.smith@imgtec.com>
Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
---
Changes since v2:
- Do not export VDSO symbols if the toolchain does not have proper support
for the VDSO.

Changes since v1:
- Use gic_get_usm_range so we can do the GIC mapping in the
arch/mips/kernel/vdso instead of the GIC irqchip driver
---
 arch/mips/Kconfig                    |   5 +
 arch/mips/include/asm/clocksource.h  |  29 +++++
 arch/mips/include/asm/vdso.h         |  68 +++++++++-
 arch/mips/kernel/csrc-r4k.c          |  44 +++++++
 arch/mips/kernel/vdso.c              |  71 ++++++++++-
 arch/mips/vdso/gettimeofday.c        | 232 +++++++++++++++++++++++++++++++++++
 arch/mips/vdso/vdso.h                |   9 ++
 arch/mips/vdso/vdso.lds.S            |   5 +
 drivers/clocksource/mips-gic-timer.c |   7 +-
 9 files changed, 460 insertions(+), 10 deletions(-)
 create mode 100644 arch/mips/include/asm/clocksource.h
 create mode 100644 arch/mips/vdso/gettimeofday.c

diff --git a/arch/mips/Kconfig b/arch/mips/Kconfig
index e3aa5b0b4ef1..68f4f246887c 100644
--- a/arch/mips/Kconfig
+++ b/arch/mips/Kconfig
@@ -60,6 +60,8 @@ config MIPS
 	select SYSCTL_EXCEPTION_TRACE
 	select HAVE_VIRT_CPU_ACCOUNTING_GEN
 	select HAVE_IRQ_TIME_ACCOUNTING
+	select GENERIC_TIME_VSYSCALL
+	select ARCH_CLOCKSOURCE_DATA
 
 menu "Machine selection"
 
@@ -1036,6 +1038,9 @@ config CSRC_R4K
 config CSRC_SB1250
 	bool
 
+config MIPS_CLOCK_VSYSCALL
+	def_bool CSRC_R4K || CLKSRC_MIPS_GIC
+
 config GPIO_TXX9
 	select ARCH_REQUIRE_GPIOLIB
 	bool
diff --git a/arch/mips/include/asm/clocksource.h b/arch/mips/include/asm/clocksource.h
new file mode 100644
index 000000000000..3deb1d0c1a94
--- /dev/null
+++ b/arch/mips/include/asm/clocksource.h
@@ -0,0 +1,29 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#ifndef __ASM_CLOCKSOURCE_H
+#define __ASM_CLOCKSOURCE_H
+
+#include <linux/types.h>
+
+/* VDSO clocksources. */
+#define VDSO_CLOCK_NONE		0	/* No suitable clocksource. */
+#define VDSO_CLOCK_R4K		1	/* Use the coprocessor 0 count. */
+#define VDSO_CLOCK_GIC		2	/* Use the GIC. */
+
+/**
+ * struct arch_clocksource_data - Architecture-specific clocksource information.
+ * @vdso_clock_mode: Method the VDSO should use to access the clocksource.
+ */
+struct arch_clocksource_data {
+	u8 vdso_clock_mode;
+};
+
+#endif /* __ASM_CLOCKSOURCE_H */
diff --git a/arch/mips/include/asm/vdso.h b/arch/mips/include/asm/vdso.h
index db2d45be8f2e..8f4ca5dd992b 100644
--- a/arch/mips/include/asm/vdso.h
+++ b/arch/mips/include/asm/vdso.h
@@ -13,6 +13,8 @@
 
 #include <linux/mm_types.h>
 
+#include <asm/barrier.h>
+
 /**
  * struct mips_vdso_image - Details of a VDSO image.
  * @data: Pointer to VDSO image data (page-aligned).
@@ -53,18 +55,82 @@ extern struct mips_vdso_image vdso_image_n32;
 
 /**
  * union mips_vdso_data - Data provided by the kernel for the VDSO.
+ * @xtime_sec:		Current real time (seconds part).
+ * @xtime_nsec:		Current real time (nanoseconds part, shifted).
+ * @wall_to_mono_sec:	Wall-to-monotonic offset (seconds part).
+ * @wall_to_mono_nsec:	Wall-to-monotonic offset (nanoseconds part).
+ * @seq_count:		Counter to synchronise updates (odd = updating).
+ * @cs_shift:		Clocksource shift value.
+ * @clock_mode:		Clocksource to use for time functions.
+ * @cs_mult:		Clocksource multiplier value.
+ * @cs_cycle_last:	Clock cycle value at last update.
+ * @cs_mask:		Clocksource mask value.
+ * @tz_minuteswest:	Minutes west of Greenwich (from timezone).
+ * @tz_dsttime:		Type of DST correction (from timezone).
  *
  * This structure contains data needed by functions within the VDSO. It is
- * populated by the kernel and mapped read-only into user memory.
+ * populated by the kernel and mapped read-only into user memory. The time
+ * fields are mirrors of internal data from the timekeeping infrastructure.
  *
  * Note: Care should be taken when modifying as the layout must remain the same
  * for both 64- and 32-bit (for 32-bit userland on 64-bit kernel).
  */
 union mips_vdso_data {
 	struct {
+		u64 xtime_sec;
+		u64 xtime_nsec;
+		u32 wall_to_mono_sec;
+		u32 wall_to_mono_nsec;
+		u32 seq_count;
+		u32 cs_shift;
+		u8 clock_mode;
+		u32 cs_mult;
+		u64 cs_cycle_last;
+		u64 cs_mask;
+		s32 tz_minuteswest;
+		s32 tz_dsttime;
 	};
 
 	u8 page[PAGE_SIZE];
 };
 
+static inline u32 vdso_data_read_begin(const union mips_vdso_data *data)
+{
+	u32 seq;
+
+	while (true) {
+		seq = ACCESS_ONCE(data->seq_count);
+		if (likely(!(seq & 1))) {
+			/* Paired with smp_wmb() in vdso_data_write_*(). */
+			smp_rmb();
+			return seq;
+		}
+
+		cpu_relax();
+	}
+}
+
+static inline bool vdso_data_read_retry(const union mips_vdso_data *data,
+					u32 start_seq)
+{
+	/* Paired with smp_wmb() in vdso_data_write_*(). */
+	smp_rmb();
+	return unlikely(data->seq_count != start_seq);
+}
+
+static inline void vdso_data_write_begin(union mips_vdso_data *data)
+{
+	++data->seq_count;
+
+	/* Ensure sequence update is written before other data page values. */
+	smp_wmb();
+}
+
+static inline void vdso_data_write_end(union mips_vdso_data *data)
+{
+	/* Ensure data values are written before updating sequence again. */
+	smp_wmb();
+	++data->seq_count;
+}
+
 #endif /* __ASM_VDSO_H */
diff --git a/arch/mips/kernel/csrc-r4k.c b/arch/mips/kernel/csrc-r4k.c
index e5ed7ada1433..1f910563fdf6 100644
--- a/arch/mips/kernel/csrc-r4k.c
+++ b/arch/mips/kernel/csrc-r4k.c
@@ -28,6 +28,43 @@ static u64 notrace r4k_read_sched_clock(void)
 	return read_c0_count();
 }
 
+static inline unsigned int rdhwr_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+static bool rdhwr_count_usable(void)
+{
+	unsigned int prev, curr, i;
+
+	/*
+	 * Older QEMUs have a broken implementation of RDHWR for the CP0 count
+	 * which always returns a constant value. Try to identify this and don't
+	 * use it in the VDSO if it is broken. This workaround can be removed
+	 * once the fix has been in QEMU stable for a reasonable amount of time.
+	 */
+	for (i = 0, prev = rdhwr_count(); i < 100; i++) {
+		curr = rdhwr_count();
+
+		if (curr != prev)
+			return true;
+
+		prev = curr;
+	}
+
+	pr_warn("Not using R4K clocksource in VDSO due to broken RDHWR\n");
+	return false;
+}
+
 int __init init_r4k_clocksource(void)
 {
 	if (!cpu_has_counter || !mips_hpt_frequency)
@@ -36,6 +73,13 @@ int __init init_r4k_clocksource(void)
 	/* Calculate a somewhat reasonable rating value */
 	clocksource_mips.rating = 200 + mips_hpt_frequency / 10000000;
 
+	/*
+	 * R2 onwards makes the count accessible to user mode so it can be used
+	 * by the VDSO (HWREna is configured by configure_hwrena()).
+	 */
+	if (cpu_has_mips_r2_r6 && rdhwr_count_usable())
+		clocksource_mips.archdata.vdso_clock_mode = VDSO_CLOCK_R4K;
+
 	clocksource_register_hz(&clocksource_mips, mips_hpt_frequency);
 
 	sched_clock_register(r4k_read_sched_clock, 32, mips_hpt_frequency);
diff --git a/arch/mips/kernel/vdso.c b/arch/mips/kernel/vdso.c
index 56cc3c4377fb..975e99759bab 100644
--- a/arch/mips/kernel/vdso.c
+++ b/arch/mips/kernel/vdso.c
@@ -12,9 +12,12 @@
 #include <linux/elf.h>
 #include <linux/err.h>
 #include <linux/init.h>
+#include <linux/ioport.h>
+#include <linux/irqchip/mips-gic.h>
 #include <linux/mm.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/timekeeper_internal.h>
 
 #include <asm/abi.h>
 #include <asm/vdso.h>
@@ -23,7 +26,7 @@
 static union mips_vdso_data vdso_data __page_aligned_data;
 
 /*
- * Mapping for the VDSO data pages. The real pages are mapped manually, as
+ * Mapping for the VDSO data/GIC pages. The real pages are mapped manually, as
  * what we map and where within the area they are mapped is determined at
  * runtime.
  */
@@ -64,25 +67,67 @@ static int __init init_vdso(void)
 }
 subsys_initcall(init_vdso);
 
+void update_vsyscall(struct timekeeper *tk)
+{
+	vdso_data_write_begin(&vdso_data);
+
+	vdso_data.xtime_sec = tk->xtime_sec;
+	vdso_data.xtime_nsec = tk->tkr_mono.xtime_nsec;
+	vdso_data.wall_to_mono_sec = tk->wall_to_monotonic.tv_sec;
+	vdso_data.wall_to_mono_nsec = tk->wall_to_monotonic.tv_nsec;
+	vdso_data.cs_shift = tk->tkr_mono.shift;
+
+	vdso_data.clock_mode = tk->tkr_mono.clock->archdata.vdso_clock_mode;
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.cs_mult = tk->tkr_mono.mult;
+		vdso_data.cs_cycle_last = tk->tkr_mono.cycle_last;
+		vdso_data.cs_mask = tk->tkr_mono.mask;
+	}
+
+	vdso_data_write_end(&vdso_data);
+}
+
+void update_vsyscall_tz(void)
+{
+	if (vdso_data.clock_mode != VDSO_CLOCK_NONE) {
+		vdso_data.tz_minuteswest = sys_tz.tz_minuteswest;
+		vdso_data.tz_dsttime = sys_tz.tz_dsttime;
+	}
+}
+
 int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 {
 	struct mips_vdso_image *image = current->thread.abi->vdso;
 	struct mm_struct *mm = current->mm;
-	unsigned long base, vdso_addr;
+	unsigned long gic_size, vvar_size, size, base, data_addr, vdso_addr;
 	struct vm_area_struct *vma;
+	struct resource gic_res;
 	int ret;
 
 	down_write(&mm->mmap_sem);
 
-	base = get_unmapped_area(NULL, 0, PAGE_SIZE + image->size, 0, 0);
+	/*
+	 * Determine total area size. This includes the VDSO data itself, the
+	 * data page, and the GIC user page if present. Always create a mapping
+	 * for the GIC user area if the GIC is present regardless of whether it
+	 * is the current clocksource, in case it comes into use later on. We
+	 * only map a page even though the total area is 64K, as we only need
+	 * the counter registers at the start.
+	 */
+	gic_size = gic_present ? PAGE_SIZE : 0;
+	vvar_size = gic_size + PAGE_SIZE;
+	size = vvar_size + image->size;
+
+	base = get_unmapped_area(NULL, 0, size, 0, 0);
 	if (IS_ERR_VALUE(base)) {
 		ret = base;
 		goto out;
 	}
 
-	vdso_addr = base + PAGE_SIZE;
+	data_addr = base + gic_size;
+	vdso_addr = data_addr + PAGE_SIZE;
 
-	vma = _install_special_mapping(mm, base, PAGE_SIZE,
+	vma = _install_special_mapping(mm, base, vvar_size,
 				       VM_READ | VM_MAYREAD,
 				       &vdso_vvar_mapping);
 	if (IS_ERR(vma)) {
@@ -90,8 +135,22 @@ int arch_setup_additional_pages(struct linux_binprm *bprm, int uses_interp)
 		goto out;
 	}
 
+	/* Map GIC user page. */
+	if (gic_size) {
+		ret = gic_get_usm_range(&gic_res);
+		if (ret)
+			goto out;
+
+		ret = io_remap_pfn_range(vma, base,
+					 gic_res.start >> PAGE_SHIFT,
+					 gic_size,
+					 pgprot_noncached(PAGE_READONLY));
+		if (ret)
+			goto out;
+	}
+
 	/* Map data page. */
-	ret = remap_pfn_range(vma, base,
+	ret = remap_pfn_range(vma, data_addr,
 			      virt_to_phys(&vdso_data) >> PAGE_SHIFT,
 			      PAGE_SIZE, PAGE_READONLY);
 	if (ret)
diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
new file mode 100644
index 000000000000..ce89c9e294f9
--- /dev/null
+++ b/arch/mips/vdso/gettimeofday.c
@@ -0,0 +1,232 @@
+/*
+ * Copyright (C) 2015 Imagination Technologies
+ * Author: Alex Smith <alex.smith@imgtec.com>
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License as published by the
+ * Free Software Foundation;  either version 2 of the  License, or (at your
+ * option) any later version.
+ */
+
+#include "vdso.h"
+
+#include <linux/compiler.h>
+#include <linux/irqchip/mips-gic.h>
+#include <linux/time.h>
+
+#include <asm/clocksource.h>
+#include <asm/io.h>
+#include <asm/mips-cm.h>
+#include <asm/unistd.h>
+#include <asm/vdso.h>
+
+static __always_inline int do_realtime_coarse(struct timespec *ts,
+					      const union mips_vdso_data *data)
+{
+	u32 start_seq;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	return 0;
+}
+
+static __always_inline int do_monotonic_coarse(struct timespec *ts,
+					       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		ts->tv_sec = data->xtime_sec;
+		ts->tv_nsec = data->xtime_nsec >> data->cs_shift;
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	timespec_add_ns(ts, to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_CSRC_R4K
+
+static __always_inline u64 read_r4k_count(void)
+{
+	unsigned int count;
+
+	__asm__ __volatile__(
+	"	.set push\n"
+	"	.set mips32r2\n"
+	"	rdhwr	%0, $2\n"
+	"	.set pop\n"
+	: "=r" (count));
+
+	return count;
+}
+
+#endif
+
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static __always_inline u64 read_gic_count(const union mips_vdso_data *data)
+{
+	void __iomem *gic = get_gic(data);
+	u32 hi, hi2, lo;
+
+	do {
+		hi = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+		lo = __raw_readl(gic + GIC_UMV_SH_COUNTER_31_00_OFS);
+		hi2 = __raw_readl(gic + GIC_UMV_SH_COUNTER_63_32_OFS);
+	} while (hi2 != hi);
+
+	return (((u64)hi) << 32) + lo;
+}
+
+#endif
+
+static __always_inline u64 get_ns(const union mips_vdso_data *data)
+{
+	u64 cycle_now, delta, nsec;
+
+	switch (data->clock_mode) {
+#ifdef CONFIG_CSRC_R4K
+	case VDSO_CLOCK_R4K:
+		cycle_now = read_r4k_count();
+		break;
+#endif
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+	case VDSO_CLOCK_GIC:
+		cycle_now = read_gic_count(data);
+		break;
+#endif
+	default:
+		return 0;
+	}
+
+	delta = (cycle_now - data->cs_cycle_last) & data->cs_mask;
+
+	nsec = (delta * data->cs_mult) + data->xtime_nsec;
+	nsec >>= data->cs_shift;
+
+	return nsec;
+}
+
+static __always_inline int do_realtime(struct timespec *ts,
+				       const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns);
+
+	return 0;
+}
+
+static __always_inline int do_monotonic(struct timespec *ts,
+					const union mips_vdso_data *data)
+{
+	u32 start_seq;
+	u64 ns;
+	u32 to_mono_sec;
+	u32 to_mono_nsec;
+
+	do {
+		start_seq = vdso_data_read_begin(data);
+
+		if (data->clock_mode == VDSO_CLOCK_NONE)
+			return -ENOSYS;
+
+		ts->tv_sec = data->xtime_sec;
+		ns = get_ns(data);
+
+		to_mono_sec = data->wall_to_mono_sec;
+		to_mono_nsec = data->wall_to_mono_nsec;
+	} while (vdso_data_read_retry(data, start_seq));
+
+	ts->tv_sec += to_mono_sec;
+	ts->tv_nsec = 0;
+	timespec_add_ns(ts, ns + to_mono_nsec);
+
+	return 0;
+}
+
+#ifdef CONFIG_MIPS_CLOCK_VSYSCALL
+
+/*
+ * This is behind the ifdef so that we don't provide the symbol when there's no
+ * possibility of there being a usable clocksource, because there's nothing we
+ * can do without it. When libc fails the symbol lookup it should fall back on
+ * the standard syscall path.
+ */
+int __vdso_gettimeofday(struct timeval *tv, struct timezone *tz)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	struct timespec ts;
+	int ret;
+
+	ret = do_realtime(&ts, data);
+	if (ret)
+		return ret;
+
+	if (tv) {
+		tv->tv_sec = ts.tv_sec;
+		tv->tv_usec = ts.tv_nsec / 1000;
+	}
+
+	if (tz) {
+		tz->tz_minuteswest = data->tz_minuteswest;
+		tz->tz_dsttime = data->tz_dsttime;
+	}
+
+	return 0;
+}
+
+#endif /* CONFIG_MIPS_CLOCK_VSYSCALL */
+
+int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
+{
+	const union mips_vdso_data *data = get_vdso_data();
+	int ret;
+
+	switch (clkid) {
+	case CLOCK_REALTIME_COARSE:
+		ret = do_realtime_coarse(ts, data);
+		break;
+	case CLOCK_MONOTONIC_COARSE:
+		ret = do_monotonic_coarse(ts, data);
+		break;
+	case CLOCK_REALTIME:
+		ret = do_realtime(ts, data);
+		break;
+	case CLOCK_MONOTONIC:
+		ret = do_monotonic(ts, data);
+		break;
+	default:
+		ret = -ENOSYS;
+		break;
+	}
+
+	/* If we return -ENOSYS libc should fall back to a syscall. */
+	return ret;
+}
diff --git a/arch/mips/vdso/vdso.h b/arch/mips/vdso/vdso.h
index 0bb6b1adc385..cfb1be441dec 100644
--- a/arch/mips/vdso/vdso.h
+++ b/arch/mips/vdso/vdso.h
@@ -77,4 +77,13 @@ static inline const union mips_vdso_data *get_vdso_data(void)
 	return (const union mips_vdso_data *)(get_vdso_base() - PAGE_SIZE);
 }
 
+#ifdef CONFIG_CLKSRC_MIPS_GIC
+
+static inline void __iomem *get_gic(const union mips_vdso_data *data)
+{
+	return (void __iomem *)data - PAGE_SIZE;
+}
+
+#endif /* CONFIG_CLKSRC_MIPS_GIC */
+
 #endif /* __ASSEMBLY__ */
diff --git a/arch/mips/vdso/vdso.lds.S b/arch/mips/vdso/vdso.lds.S
index 21655b6fefc5..8df7dd53e8e0 100644
--- a/arch/mips/vdso/vdso.lds.S
+++ b/arch/mips/vdso/vdso.lds.S
@@ -95,6 +95,11 @@ PHDRS
 VERSION
 {
 	LINUX_2.6 {
+#ifndef DISABLE_MIPS_VDSO
+	global:
+		__vdso_clock_gettime;
+		__vdso_gettimeofday;
+#endif
 	local: *;
 	};
 }
diff --git a/drivers/clocksource/mips-gic-timer.c b/drivers/clocksource/mips-gic-timer.c
index 02a1945e5093..89d3e4d7900c 100644
--- a/drivers/clocksource/mips-gic-timer.c
+++ b/drivers/clocksource/mips-gic-timer.c
@@ -140,9 +140,10 @@ static cycle_t gic_hpt_read(struct clocksource *cs)
 }
 
 static struct clocksource gic_clocksource = {
-	.name	= "GIC",
-	.read	= gic_hpt_read,
-	.flags	= CLOCK_SOURCE_IS_CONTINUOUS,
+	.name		= "GIC",
+	.read		= gic_hpt_read,
+	.flags		= CLOCK_SOURCE_IS_CONTINUOUS,
+	.archdata	= { .vdso_clock_mode = VDSO_CLOCK_GIC },
 };
 
 static void __init __gic_clocksource_init(void)
-- 
2.6.2


^ permalink raw reply related	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-21  8:57     ` [PATCH v3 " Markos Chandras
@ 2015-10-23  1:41       ` Leonid Yegoshin
  2015-10-27 14:47         ` Ralf Baechle
  2016-01-25 22:36       ` [PATCH v3 " Hauke Mehrtens
  1 sibling, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-23  1:41 UTC (permalink / raw)
  To: Markos Chandras, linux-mips; +Cc: Alex Smith, linux-kernel

You cannot use the R4K CP0_count in SMP (multicore) without a
core-specific adjustment. After the first power-saving event with the
core clock off or the core down, the values of CP0_count in different
cores are completely different.

Until the system includes a patch like
http://patchwork.linux-mips.org/patch/10871/

     "MIPS: Setup an instruction emulation in VDSO protected page
     instead of user stack"

which creates per-thread memory and puts an adjustment value into that
memory (which can be changed during a re-schedule to another core), the
use of the R4K counter is incorrect in an SMP environment.

Note: As an alternative, it is also possible to set up and maintain a
per-core page holding that adjustment value.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-23  1:41       ` [v3, " Leonid Yegoshin
@ 2015-10-27 14:47         ` Ralf Baechle
  2015-10-27 20:46           ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: Ralf Baechle @ 2015-10-27 14:47 UTC (permalink / raw)
  To: Leonid Yegoshin; +Cc: Markos Chandras, linux-mips, Alex Smith, linux-kernel

On Thu, Oct 22, 2015 at 06:41:30PM -0700, Leonid Yegoshin wrote:

> You can not use R4K CP0_count in SMP (multicore) without core-specific
> adjustment.
> After first power-saving with core clock off or core down the values in
> CP0_count
> in different cores are absolutely different.
> 
> Until you include in system a patch like
> http://patchwork.linux-mips.org/patch/10871/
> 
>     "MIPS: Setup an instruction emulation in VDSO protected page instead of
> user stack"
> 
> which creates a per-thread memory and put into that memory an adjustment
> value
> (which can be changed during re-schedule to another core), the use of R4K
> counter is incorrect
> in SMP environment.
> 
> Note: It is also possible to setup and maintain a per-core page with that
> value too as
> an alternative variant for adjustment.

The CPU hot plugging code is supposed to resynchronize the counters when
a CPU is coming online again so that case should be handled.  Beyond that
the r4k timer code in the kernel also doesn't support clock scaling
so I'm ok if the VDSO series doesn't support this properly.

  Ralf

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 14:47         ` Ralf Baechle
@ 2015-10-27 20:46           ` Leonid Yegoshin
  2015-10-27 21:02             ` David Daney
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-27 20:46 UTC (permalink / raw)
  To: Ralf Baechle; +Cc: Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 10/27/2015 07:47 AM, Ralf Baechle wrote:
> On Thu, Oct 22, 2015 at 06:41:30PM -0700, Leonid Yegoshin wrote:
>
>> You can not use R4K CP0_count in SMP (multicore) without core-specific
>> adjustment.
>> After first power-saving with core clock off or core down the values in
>> CP0_count
>> in different cores are absolutely different.
>>
>> Until you include in system a patch like
>> http://patchwork.linux-mips.org/patch/10871/
>>
>>      "MIPS: Setup an instruction emulation in VDSO protected page instead of
>> user stack"
>>
>> which creates a per-thread memory and put into that memory an adjustment
>> value
>> (which can be changed during re-schedule to another core), the use of R4K
>> counter is incorrect
>> in SMP environment.
>>
>> Note: It is also possible to setup and maintain a per-core page with that
>> value too as
>> an alternative variant for adjustment.
> The CPU hot plugging code is supposed to resynchronize the counters when
> a CPU is coming online again so that case should be handled.  Beyond that
> the r4k timer code in the kernel also doesn't support clock scaling
> so I'm ok if the VDSO series doesn't support this properly.
>
>    Ralf

It doesn't work this way: the standard CP0_counter synchronization code
takes up to a hundred milliseconds to complete, running loop cycles on
two CPUs. This is clearly seen on the Malta FPGA board.

The non-standard approach (one-way sync, writing the CP0_counter value
to memory on CPU0 before CPU1 wakes up) is not precise because it can't
predict how much time CPU1 will spend in wakeup, first because of
hardware, for example, and then software.

I believe that until this issue is fixed, R4K-only CPUs should be
excluded from VDSO timing acceleration.

And finally, clock scaling: what would we do if there are two CPUs with
different clock ratios in the system? It seems the common kernel timing
subsystem can handle that.

- Leonid.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 20:46           ` Leonid Yegoshin
@ 2015-10-27 21:02             ` David Daney
  2015-10-27 21:15               ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: David Daney @ 2015-10-27 21:02 UTC (permalink / raw)
  To: Leonid Yegoshin
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 10/27/2015 01:46 PM, Leonid Yegoshin wrote:
[...]
>
> And finally. clock scaling - what we would do if there are two CPUs with
> different clock ratios in system? It seems like common kernel timing
> subsystem can handle that.
>

The code that executes in userspace must have access to a consistent 
clock source.  If you are running on a SMP system that doesn't have 
synchronized CP0.Count registers, then your gettimeofday() cannot use 
CP0.Count (RDHWR $2).

As far as I know, CP0.Count is the only available counter visible to 
userspace, so you would have to disable the accelerated versions of 
gettimeofday() where you cannot assert that the counters are always 
synchronized.

David Daney



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 21:02             ` David Daney
@ 2015-10-27 21:15               ` Leonid Yegoshin
  2015-10-27 21:44                 ` David Daney
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-27 21:15 UTC (permalink / raw)
  To: David Daney
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 10/27/2015 02:02 PM, David Daney wrote:
> On 10/27/2015 01:46 PM, Leonid Yegoshin wrote:
> [...]
>>
>> And finally. clock scaling - what we would do if there are two CPUs with
>> different clock ratios in system? It seems like common kernel timing
>> subsystem can handle that.
>>
>
> The code that executes in userspace must have access to a consistent 
> clock source.  If you are running on a SMP system that doesn't have 
> synchronized CP0.Count registers, then your gettimeofday() cannot use 
> CP0.Count (RDHWR $2).

Right, I agree.

>
> As far as I know, CP0.Count is the only available counter visible to 
> userspace, so you would have to disable the accelerated versions of 
> gettimeofday() where you cannot assert that the counters are always 
> synchronized.

Any system with GIC may have access to the same GIC global counter in a 
special separate page available for mapping by user in RO mode and it 
seems Alex did that.

Besides that this GIC global counter is used as a major system 
clocksource in systems with GIC.
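
For reference, read_gic_count() in the patch reads this 64-bit counter
as two 32-bit halves and retries if the high half changes in between.
The sketch below shows the same pattern against an ordinary struct; the
example_* names are hypothetical and the struct is only a stand-in for
the mapped GIC registers, not real MMIO.

```c
#include <stdint.h>

/* Stand-in for the two 32-bit counter registers; not real MMIO. */
struct example_counter_regs {
	volatile uint32_t lo;
	volatile uint32_t hi;
};

static uint64_t example_read_counter(const struct example_counter_regs *regs)
{
	uint32_t hi, hi2, lo;

	/*
	 * Read high, low, then high again. If the high half changed, the
	 * low half wrapped between the reads, so try again.
	 */
	do {
		hi = regs->hi;
		lo = regs->lo;
		hi2 = regs->hi;
	} while (hi2 != hi);

	return ((uint64_t)hi << 32) | lo;
}
```

Because every core reads the same shared registers, the value does not
depend on which core the thread happens to run on.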

- Leonid



^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 21:15               ` Leonid Yegoshin
@ 2015-10-27 21:44                 ` David Daney
  2015-10-27 21:49                   ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: David Daney @ 2015-10-27 21:44 UTC (permalink / raw)
  To: Leonid Yegoshin
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 10/27/2015 02:15 PM, Leonid Yegoshin wrote:
> On 10/27/2015 02:02 PM, David Daney wrote:
>> On 10/27/2015 01:46 PM, Leonid Yegoshin wrote:
>> [...]
>>>
>>> And finally. clock scaling - what we would do if there are two CPUs with
>>> different clock ratios in system? It seems like common kernel timing
>>> subsystem can handle that.
>>>
>>
>> The code that executes in userspace must have access to a consistent
>> clock source.  If you are running on a SMP system that doesn't have
>> synchronized CP0.Count registers, then your gettimeofday() cannot use
>> CP0.Count (RDHWR $2).
>
> Right, I agree.
>
>>
>> As far as I know, CP0.Count is the only available counter visible to
>> userspace, so you would have to disable the accelerated versions of
>> gettimeofday() where you cannot assert that the counters are always
>> synchronized.
>
> Any system with a GIC may have access to the same GIC global counter in a
> special separate page that userspace can map read-only, and it
> seems Alex did that.
>
> Besides that, this GIC global counter is used as the main system
> clocksource in systems with a GIC.

Yes, I had forgotten about the GIC thing.

In any event, there is a set of systems where we could run into problems 
with unsynchronized clocks.  There needs to be an easy way to 
enable/disable the gettimeofday() acceleration at run time based on the 
properties of the counter (GIC, CP0.Count, or whatever) chosen at boot time.

For example, on OCTEON single-chip systems we synchronize the 
counters and they don't drift.  So, we can use CP0.Count.  However, on 
two-chip NUMA configurations there may be clock drift between the two 
chips, so CP0.Count cannot be used as a clocksource.  We have a single 
kernel image that runs on both types of systems, so we have to be able 
to enable/disable the gettimeofday() acceleration.

David Daney


>
> - Leonid
>
>
>
>


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 21:44                 ` David Daney
@ 2015-10-27 21:49                   ` Leonid Yegoshin
  2015-10-28 10:20                     ` Alex Smith
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-27 21:49 UTC (permalink / raw)
  To: David Daney
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel


> For example, on OCTEON single-chip systems we synchronize the 
> counters and they don't drift.  So, we can use CP0.Count. However, on 
> two-chip NUMA configurations there may be clock drift between the two 
> chips, so CP0.Count cannot be used as a clocksource.  We have a single 
> kernel image that runs on both types of systems, so we have to be able 
> to enable/disable the gettimeofday() acceleration.
>
Much more interesting is the situation where different CPUs run at 
different clock frequencies.

It seems to me that the per-thread memory idea may be required soon.

- Leonid

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-27 21:49                   ` Leonid Yegoshin
@ 2015-10-28 10:20                     ` Alex Smith
  2015-10-28 18:21                       ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Smith @ 2015-10-28 10:20 UTC (permalink / raw)
  To: Leonid Yegoshin, David Daney
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 27 October 2015 at 20:46, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
> It doesn't work this way - the standard CP0_counter synchronization code
> takes up to a hundred milliseconds to complete, running loop cycles
> on two CPUs. This is clearly seen on the Malta FPGA board.
>
> Non-standard sync (one way: write the CP0_counter value to memory on CPU0
> before CPU1 wakeup) is not precise because it can't predict how much time
> CPU1 can spend in wakeup, first because of HW, for example, and then SW.
>
> I believe, until this issue is fixed the R4K only CPU should be excluded
> from VDSO timing acceleration.

The VDSO code will currently use the CP0 count whenever the kernel is
using it as its primary clocksource, aside from the case where RDHWR
is broken as it is on old QEMUs.

Maybe I'm missing something but I don't see anything in the generic
timekeeping code that handles the same clocksource being
unsynchronised or running at a different rate on different CPUs.

Given that, if you think there is an issue that prevents the VDSO from
using it then that would surely affect the kernel as well and needs to
be fixed separately?

If it really is necessary to prevent the VDSO from using a certain
clocksource even though the kernel is using it, it should be a simple
matter of setting clocksource.archdata.vdso_clock_mode to
VDSO_CLOCK_NONE. This is how this patch stops it using the CP0 count
when RDHWR is broken.
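
As a rough illustration of that opt-out (a userspace sketch, not code
from the patch): the structs below are mock stand-ins for struct
clocksource and its archdata, VDSO_CLOCK_R4K and VDSO_CLOCK_GIC are
assumed names, and struct r4k_platform with its two flags is purely
hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* VDSO_CLOCK_NONE is the value named above; the other names are assumed. */
enum vdso_clock_mode {
	VDSO_CLOCK_NONE,	/* VDSO time functions defer to the syscall */
	VDSO_CLOCK_R4K,		/* CP0 count read via RDHWR (assumed name) */
	VDSO_CLOCK_GIC,		/* GIC user-mode counter (assumed name) */
};

/* Mock of the arch-specific part of a clocksource. */
struct arch_clocksource_data {
	enum vdso_clock_mode vdso_clock_mode;
};

/* Mock of struct clocksource, reduced to what the sketch needs. */
struct clocksource {
	const char *name;
	struct arch_clocksource_data archdata;
};

/* Hypothetical platform facts established at boot. */
struct r4k_platform {
	bool rdhwr_count_usable;	/* false where RDHWR is broken */
	bool counts_synchronized;	/* false where per-CPU counts may drift */
};

/* Allow the VDSO to use the CP0 count only when both conditions hold. */
static void r4k_set_vdso_mode(struct clocksource *cs,
			      const struct r4k_platform *p)
{
	assert(cs != NULL && p != NULL);

	if (p->rdhwr_count_usable && p->counts_synchronized)
		cs->archdata.vdso_clock_mode = VDSO_CLOCK_R4K;
	else
		cs->archdata.vdso_clock_mode = VDSO_CLOCK_NONE;
}
```

The kernel could keep using the clocksource either way; only the mode
exported to the VDSO changes.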

Alex

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 10:20                     ` Alex Smith
@ 2015-10-28 18:21                       ` Leonid Yegoshin
  2015-10-28 18:30                         ` Alex Smith
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-28 18:21 UTC (permalink / raw)
  To: Alex Smith, David Daney
  Cc: Ralf Baechle, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On 10/28/2015 03:20 AM, Alex Smith wrote:
> On 27 October 2015 at 20:46, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>> I believe, until this issue is fixed the R4K only CPU should be excluded
>> from VDSO timing acceleration.
> The VDSO code will currently use the CP0 count whenever the kernel is
> using it as its primary clocksource, aside from the case where RDHWR
> is broken as it is on old QEMUs.

1) I don't see that in the code - there is no check that the kernel 
actually uses the R4K clocksource as primary (A), and if the kernel uses 
the R4K count as a clocksource and later switches to some more precise 
clocksource then there is no change in VDSO gettimeofday handling (B).

2) The way the MIPS kernel today uses CP0_COUNT on any core as the 
same clocksource is correct only until the first power-saving event with 
the CPU clock disabled (skipping Octeon). After that it is incorrect use 
without accurate synchronization, and that synchronization doesn't exist.

And I recall that today the kernel uses only CPU0's CP0_COUNT to update 
time... I may be wrong and need to check, but that could be good code.

>
> Maybe I'm missing something but I don't see anything in the generic
> timekeeping code that handles the same clocksource being
> unsynchronised or running at a different rate on different CPUs.

(I would like to skip here the generic timekeeping code capabilities, 
just to restrict a discussion to HW capabilities)

>
> Given that, if you think there is an issue that prevents the VDSO from
> using it then that would surely affect the kernel as well and needs to
> be fixed separately?
>
> If it really is necessary to prevent the VDSO from using a certain
> clocksource even though the kernel is using it, it should be a simple
> matter of setting clocksource.archdata.vdso_clock_mode to
> VDSO_CLOCK_NONE. This is how this patch stops it using the CP0 count
> when RDHWR is broken.

OK, just add a kernel build-time check that it is not SMP without a GIC 
clocksource, or that it is Octeon. That would be enough to stop a mess.

- Leonid


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 18:21                       ` Leonid Yegoshin
@ 2015-10-28 18:30                         ` Alex Smith
  2015-10-28 18:57                           ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Smith @ 2015-10-28 18:30 UTC (permalink / raw)
  To: Leonid Yegoshin
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 28 October 2015 at 18:21, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>
> On 10/28/2015 03:20 AM, Alex Smith wrote:
>>
>> On 27 October 2015 at 20:46, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>>>
>>> I believe, until this issue is fixed the R4K only CPU should be excluded
>>> from VDSO timing acceleration.
>>
>> The VDSO code will currently use the CP0 count whenever the kernel is
>> using it as its primary clocksource, aside from the case where RDHWR
>> is broken as it is on old QEMUs.
>
>
> 1) I don't see that in the code - there is no check that the kernel actually uses the R4K clocksource as primary (A), and if the kernel uses the R4K count as a clocksource and later switches to some more precise clocksource then there is no change in VDSO gettimeofday handling (B).

Incorrect. The vdso_clock_mode flag in arch_clocksource_data that I
mentioned in my previous email is copied into the VDSO data page by
update_vsyscall(), which is called when the clocksource changes.
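
To make that flow concrete, here is a small userspace mock (a sketch
only; the _mock names and struct layouts are invented for illustration,
VDSO_CLOCK_R4K is an assumed name, and the sequence counter that the
real data page would need for consistent reads is left out):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

enum vdso_clock_mode {
	VDSO_CLOCK_NONE,
	VDSO_CLOCK_R4K,		/* assumed name for the CP0 count mode */
};

/* Mock clocksource carrying only the VDSO mode flag. */
struct clocksource_mock {
	enum vdso_clock_mode vdso_clock_mode;
};

/* Mock of the data page shared between kernel and VDSO. */
struct vdso_data_mock {
	enum vdso_clock_mode clock_mode;
};

/* Kernel side: run on every timekeeping update or clocksource change. */
static void update_vsyscall_mock(struct vdso_data_mock *vd,
				 const struct clocksource_mock *cur)
{
	assert(vd != NULL && cur != NULL);
	vd->clock_mode = cur->vdso_clock_mode;
}

/* VDSO side: refuse to use a counter the current clocksource forbids. */
static int vdso_gettimeofday_mock(const struct vdso_data_mock *vd)
{
	if (vd->clock_mode == VDSO_CLOCK_NONE)
		return -ENOSYS;	/* caller takes the syscall path */
	return 0;		/* real code would read the counter here */
}
```

After a switch to a clocksource whose mode is VDSO_CLOCK_NONE, the next
VDSO call sees the new mode and stops using the CP0 count.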

>
> 2) The way the MIPS kernel today uses CP0_COUNT on any core as the same clocksource is correct only until the first power-saving event with the CPU clock disabled (skipping Octeon). After that it is incorrect use without accurate synchronization, and that synchronization doesn't exist.
>
> And I recall that today the kernel uses only CPU0's CP0_COUNT to update time... I may be wrong and need to check, but that could be good code.
>
>>
>> Maybe I'm missing something but I don't see anything in the generic
>> timekeeping code that handles the same clocksource being
>> unsynchronised or running at a different rate on different CPUs.
>
>
> (I would like to skip here the generic timekeeping code capabilities, just to restrict a discussion to HW capabilities)
>
>>
>> Given that, if you think there is an issue that prevents the VDSO from
>> using it then that would surely affect the kernel as well and needs to
>> be fixed separately?
>>
>> If it really is necessary to prevent the VDSO from using a certain
>> clocksource even though the kernel is using it, it should be a simple
>> matter of setting clocksource.archdata.vdso_clock_mode to
>> VDSO_CLOCK_NONE. This is how this patch stops it using the CP0 count
>> when RDHWR is broken.
>
>
> OK, just add a kernel build-time check that it is not SMP without a GIC clocksource, or that it is Octeon. That would be enough to stop a mess.

If you feel it's necessary then please do.

Thanks,
Alex

>
> - Leonid
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 18:30                         ` Alex Smith
@ 2015-10-28 18:57                           ` Leonid Yegoshin
  2015-10-28 19:04                             ` Alex Smith
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-28 18:57 UTC (permalink / raw)
  To: Alex Smith
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 10/28/2015 11:30 AM, Alex Smith wrote:
> On 28 October 2015 at 18:21, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>>
>>
>> 1) I don't see that in the code - there is no check that the kernel actually uses the R4K clocksource as primary (A), and if the kernel uses the R4K count as a clocksource and later switches to some more precise clocksource then there is no change in VDSO gettimeofday handling (B).
> Incorrect. The vdso_clock_mode flag in arch_clocksource_data that I
> mentioned in my previous email is copied into the VDSO data page by
> update_vsyscall(), which is called when the clocksource changes.

OK, I see this, good.

>
>> 2) The way the MIPS kernel today uses CP0_COUNT on any core as the same clocksource is correct only until the first power-saving event with the CPU clock disabled (skipping Octeon). After that it is incorrect use without accurate synchronization, and that synchronization doesn't exist.
>>
>> And I recall that today the kernel uses only CPU0's CP0_COUNT to update time... I may be wrong and need to check, but that could be good code.
>>
>>> Maybe I'm missing something but I don't see anything in the generic
>>> timekeeping code that handles the same clocksource being
>>> unsynchronised or running at a different rate on different CPUs.
>>
>> (I would like to skip here the generic timekeeping code capabilities, just to restrict a discussion to HW capabilities)
>>
>>> Given that, if you think there is an issue that prevents the VDSO from
>>> using it then that would surely affect the kernel as well and needs to
>>> be fixed separately?
>>>
>>> If it really is necessary to prevent the VDSO from using a certain
>>> clocksource even though the kernel is using it, it should be a simple
>>> matter of setting clocksource.archdata.vdso_clock_mode to
>>> VDSO_CLOCK_NONE. This is how this patch stops it using the CP0 count
>>> when RDHWR is broken.
>>
>> OK, just add a kernel build-time check that it is not SMP without a GIC clocksource, or that it is Octeon. That would be enough to stop a mess.
> If you feel it's necessary then please do.

Please resend a patch with this fix.

- Leonid.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 18:57                           ` Leonid Yegoshin
@ 2015-10-28 19:04                             ` Alex Smith
  2015-10-28 19:28                               ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Smith @ 2015-10-28 19:04 UTC (permalink / raw)
  To: Leonid Yegoshin
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 28 October 2015 at 18:57, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
> On 10/28/2015 11:30 AM, Alex Smith wrote:
>>
>> On 28 October 2015 at 18:21, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
>> wrote:
>>>
>>>
>>>
>>> 1) I don't see that in the code - there is no check that the kernel
>>> actually uses the R4K clocksource as primary (A), and if the kernel uses
>>> the R4K count as a clocksource and later switches to some more precise
>>> clocksource then there is no change in VDSO gettimeofday handling (B).
>>
>> Incorrect. The vdso_clock_mode flag in arch_clocksource_data that I
>> mentioned in my previous email is copied into the VDSO data page by
>> update_vsyscall(), which is called when the clocksource changes.
>
>
> OK, I see this, good.
>
>>
>>> 2) The way the MIPS kernel today uses CP0_COUNT on any core as the
>>> same clocksource is correct only until the first power-saving event with
>>> the CPU clock disabled (skipping Octeon). After that it is incorrect use
>>> without accurate synchronization, and that synchronization doesn't exist.
>>>
>>> And I recall that today the kernel uses only CPU0's CP0_COUNT to update
>>> time... I may be wrong and need to check, but that could be good code.
>>>
>>>> Maybe I'm missing something but I don't see anything in the generic
>>>> timekeeping code that handles the same clocksource being
>>>> unsynchronised or running at a different rate on different CPUs.
>>>
>>>
>>> (I would like to skip here the generic timekeeping code capabilities,
>>> just to restrict a discussion to HW capabilities)
>>>
>>>> Given that, if you think there is an issue that prevents the VDSO from
>>>> using it then that would surely affect the kernel as well and needs to
>>>> be fixed separately?
>>>>
>>>> If it really is necessary to prevent the VDSO from using a certain
>>>> clocksource even though the kernel is using it, it should be a simple
>>>> matter of setting clocksource.archdata.vdso_clock_mode to
>>>> VDSO_CLOCK_NONE. This is how this patch stops it using the CP0 count
>>>> when RDHWR is broken.
>>>
>>>
>>> OK, just add a kernel build-time check that it is not SMP without a GIC
>>> clocksource, or that it is Octeon. That would be enough to stop a mess.
>>
>> If you feel it's necessary then please do.
>
>
> Please resend a patch with this fix.

As I've explained the VDSO will only use the CP0 counter in the same
situations that the kernel would when it is the active clocksource.
Any issue that makes the counter unreliable affects the kernel as well
and is unrelated to the VDSO, so a fix does not belong in this patch.

Alex

>
> - Leonid.

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 19:04                             ` Alex Smith
@ 2015-10-28 19:28                               ` Leonid Yegoshin
  2015-10-28 19:55                                 ` Alex Smith
  0 siblings, 1 reply; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-28 19:28 UTC (permalink / raw)
  To: Alex Smith
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 10/28/2015 12:04 PM, Alex Smith wrote:
> On 28 October 2015 at 18:57, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>>
> As I've explained the VDSO will only use the CP0 counter in the same
> situations that the kernel would when it is the active clocksource.
> Any issue that makes the counter unreliable affects the kernel as well
> and is unrelated to the VDSO, so a fix does not belong in this patch.

What would you do if some SoC with different types of cores defines 
the CP0_COUNT of CPU1 etc. as a DIFFERENT clocksource from CPU0 (because 
of frequency etc.)? Timekeeping can select the CPU0 clocksource but the 
code still uses the local CPU1 CP0_COUNT for gettimeofday().

And this kind of solution is the first in line for getting accurate 
timing on systems without a GIC and with different clock frequencies.

- Leonid


^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 19:28                               ` Leonid Yegoshin
@ 2015-10-28 19:55                                 ` Alex Smith
  2015-10-28 20:15                                   ` Leonid Yegoshin
  0 siblings, 1 reply; 35+ messages in thread
From: Alex Smith @ 2015-10-28 19:55 UTC (permalink / raw)
  To: Leonid Yegoshin
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 28 October 2015 at 19:28, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
> On 10/28/2015 12:04 PM, Alex Smith wrote:
>>
>> On 28 October 2015 at 18:57, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com>
>> wrote:
>>>
>>>
>> As I've explained the VDSO will only use the CP0 counter in the same
>> situations that the kernel would when it is the active clocksource.
>> Any issue that makes the counter unreliable affects the kernel as well
>> and is unrelated to the VDSO, so a fix does not belong in this patch.
>
>
> What would you do if some SoC with different types of cores defines the
> CP0_COUNT of CPU1 etc. as a DIFFERENT clocksource from CPU0 (because of
> frequency etc.)? Timekeeping can select the CPU0 clocksource but the code
> still uses the local CPU1 CP0_COUNT for gettimeofday().

Clocksources are not per-CPU. If the CP0 counter is the current
clocksource, then both the kernel and VDSO implementations of
gettimeofday will read out the CP0 counter from whatever CPU they run
on.

If in future there is some behaviour dependent on the current CPU in
the kernel gettimeofday implementation, then sure, something will need
to be done about it, but right now I see no issue that specifically
affects the VDSO code.

Alex

>
> And this kind of solution is the first in line for getting accurate timing
> on systems without a GIC and with different clock frequencies.
>
> - Leonid
>

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [v3, 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-28 19:55                                 ` Alex Smith
@ 2015-10-28 20:15                                   ` Leonid Yegoshin
  0 siblings, 0 replies; 35+ messages in thread
From: Leonid Yegoshin @ 2015-10-28 20:15 UTC (permalink / raw)
  To: Alex Smith
  Cc: David Daney, Ralf Baechle, Markos Chandras, linux-mips,
	Alex Smith, linux-kernel

On 10/28/2015 12:55 PM, Alex Smith wrote:
> On 28 October 2015 at 19:28, Leonid Yegoshin <Leonid.Yegoshin@imgtec.com> wrote:
>> . 
> Clocksources are not per-CPU. If the CP0 counter is the current
> clocksource, then both the kernel and VDSO implementations of
> gettimeofday will read out the CP0 counter from whatever CPU they run
> on.

OK, it was an invalid example. Let's be specific - in the case of 
different clock frequencies on different CPUs it is easy to adjust for 
that in the kernel via clocksource->read() etc., but it is impossible to 
adjust for that in the VDSO implementation.

And that can't be fixed easily without some kind of "per-thread" data 
page for the correct multipliers.
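
A small arithmetic sketch of why the multiplier matters (all names here
are invented for illustration; it only mimics the usual mult/shift
cycles-to-nanoseconds conversion and is not taken from the patch):

```c
#include <assert.h>
#include <stdint.h>

/* Conversion factors: ns = (cycles * mult) >> shift. */
struct cyc2ns {
	uint32_t mult;
	uint32_t shift;
};

/* Derive mult for a counter ticking at hz, rounding to nearest. */
static struct cyc2ns cyc2ns_for_freq(uint32_t hz, uint32_t shift)
{
	struct cyc2ns c;
	uint64_t mult;

	assert(hz != 0 && shift < 32);
	mult = ((UINT64_C(1000000000) << shift) + hz / 2) / hz;
	assert(mult <= UINT32_MAX);

	c.mult = (uint32_t)mult;
	c.shift = shift;
	return c;
}

/* Convert a counter delta to nanoseconds with the given factors. */
static uint64_t cycles_to_ns(uint64_t cycles, struct cyc2ns c)
{
	return (cycles * c.mult) >> c.shift;
}
```

With one shared multiplier computed for a 100 MHz counter, a delta of
50,000,000 cycles read on a CPU whose counter runs at 50 MHz converts to
0.5 s although 1 s has elapsed; only a multiplier matching the CPU that
was actually read gives the right answer.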

There are many problems with the assumption that on all kinds of MIPS 
cores the R4K CP0_COUNT registers are in sync across CPUs. Even the 
current kernel has problems here, but I think that is no excuse to build 
more on top of it.

- Leonid.



^ permalink raw reply	[flat|nested] 35+ messages in thread

* RE: [PATCH 1/3] MIPS: Initial implementation of a VDSO
  2015-09-28 13:07     ` Matthew Fortune
@ 2015-11-20 18:15       ` Maciej W. Rozycki
  0 siblings, 0 replies; 35+ messages in thread
From: Maciej W. Rozycki @ 2015-11-20 18:15 UTC (permalink / raw)
  To: Matthew Fortune
  Cc: Alex Smith, Markos Chandras, linux-mips, Alex Smith, linux-kernel

On Mon, 28 Sep 2015, Matthew Fortune wrote:

> > > +       /* lapc <symbol> is an alias to addiupc reg, <symbol> - .
> > > +        *
> > > +        * We can't use addiupc because there is no label-label
> > > +        * support for the addiupc reloc
> > > +        */
> > > +       __asm__("lapc   %0, _start                      \n"
> > > +               : "=r" (addr) : :);
> > 
> > Just curious - if lapc is just an alias to addiupc, why does that work
> > but not addiupc? IIRC I did try addiupc previously but removed it
> > because it wasn't working, didn't know about lapc!
> 
> This is just an unfortunate quirk of how the implementation is done in
> binutils. We don't recognise the special case that:
> 
> addiupc <reg>, <sym> - .
> 
> is the same as
> 
> lapc <reg>, <sym>
> 
> And therefore don't know that we can just use the MIPS_PC19_S2 reloc
> (name of that reloc may not be perfectly correct). It is a special
> case as the RHS of the expression in ADDIUPC above can be theoretically
> anything so we only support assembly time constants with addiupc.
> 
> Apart from the need to document the LAPC alias somewhere I'm not sure
> we need do anything to improve addiupc itself particularly.

 For the record -- this corresponds to how the LA macro and the 
PC-relative ADDIU instruction are handled when assembling MIPS16 code.

 And the place to document such peculiarities is obviously an assembly 
language manual.  A few have been written for the MIPS architecture 
already and with recent updates to the instruction set perhaps it is time 
for a revised edition or yet another book.

  Maciej

^ permalink raw reply	[flat|nested] 35+ messages in thread

* Re: [PATCH v3 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime()
  2015-10-21  8:57     ` [PATCH v3 " Markos Chandras
  2015-10-23  1:41       ` [v3, " Leonid Yegoshin
@ 2016-01-25 22:36       ` Hauke Mehrtens
  1 sibling, 0 replies; 35+ messages in thread
From: Hauke Mehrtens @ 2016-01-25 22:36 UTC (permalink / raw)
  To: Markos Chandras, linux-mips; +Cc: Alex Smith, linux-kernel

On 10/21/2015 10:57 AM, Markos Chandras wrote:
> From: Alex Smith <alex.smith@imgtec.com>
> 
> Add user-mode implementations of gettimeofday() and clock_gettime() to
> the VDSO. This is currently usable with 2 clocksources: the CP0 count
> register, which is accessible to user-mode via RDHWR on R2 and later
> cores, or the MIPS Global Interrupt Controller (GIC) timer, which
> provides a "user-mode visible" section containing a mirror of its
> counter registers. This section must be mapped into user memory, which
> is done below the VDSO data page.
> 
> When a supported clocksource is not in use, the VDSO functions will
> return -ENOSYS, which causes libc to fall back on the standard syscall
> path.
> 
> When support for neither of these clocksources is compiled into the
> kernel at all, the VDSO still provides clock_gettime(), as the coarse
> realtime/monotonic clocks can still be implemented. However,
> gettimeofday() is not provided in this case as nothing can be done
> without a suitable clocksource. This causes the symbol lookup to fail
> in libc and it will then always use the standard syscall path.
> 
> This patch includes a workaround for a bug in QEMU which results in
> RDHWR on the CP0 count register always returning a constant (incorrect)
> value. A fix for this has been submitted, and the workaround can be
> removed after the fix has been in stable releases for a reasonable
> amount of time.
> 
> A simple performance test which calls gettimeofday() 1000 times in a
> loop and calculates the average execution time gives the following
> results on a Malta + I6400 (running at 20MHz):
> 
>  - Syscall:    ~31000 ns
>  - VDSO (GIC): ~15000 ns
>  - VDSO (CP0): ~9500 ns
> 
> [markos.chandras@imgtec.com:
> - Minor code re-arrangements in order for mappings to be made
> in the order they appear to the process' address space.
> - Move do_{monotonic, realtime} outside of the MIPS_CLOCK_VSYSCALL ifdef
> - Use gic_get_usm_range so we can do the GIC mapping in the
> arch/mips/kernel/vdso instead of the GIC irqchip driver]
> 
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Alex Smith <alex.smith@imgtec.com>
> Signed-off-by: Markos Chandras <markos.chandras@imgtec.com>
> ---
> Changes since v2:
> - Do not export VDSO symbols if the toolchain does not have proper support
> for the VDSO.
> 
> Changes since v1:
> - Use gic_get_usm_range so we can do the GIC mapping in the
> arch/mips/kernel/vdso instead of the GIC irqchip driver
> ---
>  arch/mips/Kconfig                    |   5 +
>  arch/mips/include/asm/clocksource.h  |  29 +++++
>  arch/mips/include/asm/vdso.h         |  68 +++++++++-
>  arch/mips/kernel/csrc-r4k.c          |  44 +++++++
>  arch/mips/kernel/vdso.c              |  71 ++++++++++-
>  arch/mips/vdso/gettimeofday.c        | 232 +++++++++++++++++++++++++++++++++++
>  arch/mips/vdso/vdso.h                |   9 ++
>  arch/mips/vdso/vdso.lds.S            |   5 +
>  drivers/clocksource/mips-gic-timer.c |   7 +-
>  9 files changed, 460 insertions(+), 10 deletions(-)
>  create mode 100644 arch/mips/include/asm/clocksource.h
>  create mode 100644 arch/mips/vdso/gettimeofday.c
> 

....

> +
> +int __vdso_clock_gettime(clockid_t clkid, struct timespec *ts)
> +{
> +	const union mips_vdso_data *data = get_vdso_data();
> +	int ret;
> +
> +	switch (clkid) {
> +	case CLOCK_REALTIME_COARSE:
> +		ret = do_realtime_coarse(ts, data);
> +		break;
> +	case CLOCK_MONOTONIC_COARSE:
> +		ret = do_monotonic_coarse(ts, data);
> +		break;
> +	case CLOCK_REALTIME:
> +		ret = do_realtime(ts, data);
> +		break;
> +	case CLOCK_MONOTONIC:
> +		ret = do_monotonic(ts, data);
> +		break;
> +	default:
> +		ret = -ENOSYS;
> +		break;
> +	}
> +
> +	/* If we return -ENOSYS libc should fall back to a syscall. */

This comment is important.

The other architectures (I checked arm64, tile, x86) call the
original syscall instead of returning -ENOSYS here. This will confuse
people like me who are trying to use this feature.

When the libc does not call the normal syscall this will cause problems.
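
For illustration, the fallback a libc would need looks roughly like the
sketch below. This is an assumption about the caller's side, not code
from glibc or from this patch: the function-pointer type, the stub and
clock_gettime_with_fallback() are hypothetical names, and it assumes
struct timespec matches what the native clock_gettime syscall fills in.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* for syscall() */
#endif
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Shape of a VDSO entry point that returns 0 or a negative errno. */
typedef int (*vdso_clock_gettime_fn)(clockid_t, struct timespec *);

/* Stub standing in for a VDSO with no usable clocksource. */
static int vdso_stub_enosys(clockid_t clkid, struct timespec *ts)
{
	(void)clkid;
	(void)ts;
	return -ENOSYS;
}

/* Try the VDSO first; on -ENOSYS (or no VDSO) use the real syscall. */
static int clock_gettime_with_fallback(vdso_clock_gettime_fn vdso,
				       clockid_t clkid, struct timespec *ts)
{
	assert(ts != NULL);

	if (vdso != NULL) {
		int ret = vdso(clkid, ts);

		if (ret != -ENOSYS) {
			if (ret < 0) {
				errno = -ret;
				return -1;
			}
			return 0;
		}
	}
	return (int)syscall(SYS_clock_gettime, clkid, ts);
}
```

A libc that calls the VDSO symbol without the -ENOSYS check would hand
the error straight back to the application.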

> +	return ret;
> +}

Hauke

^ permalink raw reply	[flat|nested] 35+ messages in thread

end of thread, other threads:[~2016-01-25 22:36 UTC | newest]

Thread overview: 35+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-09-28 10:03 [PATCH 0/3] MIPS VDSO support Markos Chandras
2015-09-28 10:10 ` [PATCH 1/3] MIPS: Initial implementation of a VDSO Markos Chandras
2015-09-28 10:54   ` Alex Smith
2015-09-28 13:07     ` Matthew Fortune
2015-11-20 18:15       ` Maciej W. Rozycki
2015-09-28 10:11 ` [PATCH 2/3] irqchip: irq-mips-gic: Provide function to map GIC user section Markos Chandras
2015-09-28 10:55   ` Marc Zyngier
2015-09-28 14:16     ` Qais Yousef
2015-09-28 15:03       ` Marc Zyngier
2015-10-05  8:22     ` Markos Chandras
2015-10-12  9:40   ` [PATCH v2 " Markos Chandras
2015-10-12  9:51     ` Marc Zyngier
2015-10-12 10:16       ` Thomas Gleixner
2015-10-15  9:37         ` Qais Yousef
2015-10-15 10:18           ` Thomas Gleixner
2015-09-28 10:12 ` [PATCH 3/3] MIPS: VDSO: Add implementations of gettimeofday() and clock_gettime() Markos Chandras
2015-09-28 13:15   ` kbuild test robot
2015-10-12 10:24   ` [PATCH v2 " Markos Chandras
2015-10-21  8:57     ` [PATCH v3 " Markos Chandras
2015-10-23  1:41       ` [v3, " Leonid Yegoshin
2015-10-27 14:47         ` Ralf Baechle
2015-10-27 20:46           ` Leonid Yegoshin
2015-10-27 21:02             ` David Daney
2015-10-27 21:15               ` Leonid Yegoshin
2015-10-27 21:44                 ` David Daney
2015-10-27 21:49                   ` Leonid Yegoshin
2015-10-28 10:20                     ` Alex Smith
2015-10-28 18:21                       ` Leonid Yegoshin
2015-10-28 18:30                         ` Alex Smith
2015-10-28 18:57                           ` Leonid Yegoshin
2015-10-28 19:04                             ` Alex Smith
2015-10-28 19:28                               ` Leonid Yegoshin
2015-10-28 19:55                                 ` Alex Smith
2015-10-28 20:15                                   ` Leonid Yegoshin
2016-01-25 22:36       ` [PATCH v3 " Hauke Mehrtens
