linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation.
@ 2019-12-23 14:31 Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
                   ` (10 more replies)
  0 siblings, 11 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

This is a second tentative to switch powerpc/32 vdso to generic C implementation.

It will likely not work on 64 bits or even build properly at the moment.

powerpc is a bit special for VDSO as well as system calls in the
way that it requires setting CR SO bit which cannot be done in C.
Therefore, entry/exit and fallback needs to be performed in ASM.

To allow that, the fallback calls are moved out of the common code
and left to the arches.

A few other changes in the common code have allowed performance improvement.

The performance has improved since first RFC, but it is still lower than
current assembly VDSO.

On a powerpc8xx, with current powerpc/32 ASM VDSO:

gettimeofday:    vdso: 737 nsec/call
clock-getres-realtime:    vdso: 475 nsec/call
clock-gettime-realtime:    vdso: 892 nsec/call
clock-getres-monotonic:    vdso: 475 nsec/call
clock-gettime-monotonic:    vdso: 1014 nsec/call

First try of C implementation:

gettimeofday:    vdso: 1533 nsec/call
clock-getres-realtime:    vdso: 853 nsec/call
clock-gettime-realtime:    vdso: 1570 nsec/call
clock-getres-monotonic:    vdso: 835 nsec/call
clock-gettime-monotonic:    vdso: 1605 nsec/call

With this series:

gettimeofday:    vdso: 1016 nsec/call
clock-getres-realtime:    vdso: 560 nsec/call
clock-gettime-realtime:    vdso: 1192 nsec/call
clock-getres-monotonic:    vdso: 560 nsec/call
clock-gettime-monotonic:    vdso: 1192 nsec/call


Changes made to other arches are untested, not even compiled.


Christophe Leroy (10):
  lib: vdso: ensure all arches have 32bit fallback
  lib: vdso: move call to fallback out of common code.
  lib: vdso: Change __cvdso_clock_gettime/getres_common() to
    __cvdso_clock_gettime/getres()
  lib: vdso: get pointer to vdso data from the arch
  lib: vdso: inline do_hres()
  lib: vdso: make do_coarse() return 0
  lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  lib: vdso: Avoid duplication in __cvdso_clock_getres()
  powerpc/vdso32: inline __get_datapage()
  powerpc/32: Switch VDSO to C implementation.

 arch/arm/include/asm/vdso/gettimeofday.h          |  26 +++
 arch/arm/vdso/vgettimeofday.c                     |  32 ++-
 arch/arm64/include/asm/vdso/compat_gettimeofday.h |   2 -
 arch/arm64/include/asm/vdso/gettimeofday.h        |  26 +++
 arch/arm64/kernel/vdso/vgettimeofday.c            |  24 +-
 arch/arm64/kernel/vdso32/vgettimeofday.c          |  39 +++-
 arch/mips/include/asm/vdso/gettimeofday.h         |  28 ++-
 arch/mips/vdso/vgettimeofday.c                    |  56 ++++-
 arch/powerpc/Kconfig                              |   2 +
 arch/powerpc/include/asm/vdso/gettimeofday.h      |  45 ++++
 arch/powerpc/include/asm/vdso/vsyscall.h          |  27 +++
 arch/powerpc/include/asm/vdso_datapage.h          |  28 +--
 arch/powerpc/kernel/asm-offsets.c                 |  23 +-
 arch/powerpc/kernel/time.c                        |  92 +-------
 arch/powerpc/kernel/vdso.c                        |  19 +-
 arch/powerpc/kernel/vdso32/Makefile               |  19 +-
 arch/powerpc/kernel/vdso32/cacheflush.S           |   9 +-
 arch/powerpc/kernel/vdso32/datapage.S             |  28 +--
 arch/powerpc/kernel/vdso32/gettimeofday.S         | 265 +++-------------------
 arch/powerpc/kernel/vdso32/vgettimeofday.c        |  32 +++
 arch/x86/entry/vdso/vclock_gettime.c              |  52 ++++-
 arch/x86/include/asm/vdso/gettimeofday.h          |  28 ++-
 lib/vdso/gettimeofday.c                           | 100 +++-----
 23 files changed, 505 insertions(+), 497 deletions(-)
 create mode 100644 arch/powerpc/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/powerpc/include/asm/vdso/vsyscall.h
 create mode 100644 arch/powerpc/kernel/vdso32/vgettimeofday.c

-- 
2.13.3


^ permalink raw reply	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  2:07   ` Andy Lutomirski
  2019-12-30 12:27   ` Arnd Bergmann
  2019-12-23 14:31 ` [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code Christophe Leroy
                   ` (9 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

In order to simplify next step which moves fallback call at arch
level, ensure all arches have a 32bit fallback instead of handling
the lack of 32bit fallback in the common code based
on VDSO_HAS_32BIT_FALLBACK

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/arm/include/asm/vdso/gettimeofday.h          | 26 +++++++++++++++++++++
 arch/arm64/include/asm/vdso/compat_gettimeofday.h |  2 --
 arch/arm64/include/asm/vdso/gettimeofday.h        | 26 +++++++++++++++++++++
 arch/mips/include/asm/vdso/gettimeofday.h         | 28 +++++++++++++++++++++--
 arch/x86/include/asm/vdso/gettimeofday.h          | 28 +++++++++++++++++++++--
 lib/vdso/gettimeofday.c                           | 10 --------
 6 files changed, 104 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/vdso/gettimeofday.h b/arch/arm/include/asm/vdso/gettimeofday.h
index 0ad2429c324f..55f8ad6e7777 100644
--- a/arch/arm/include/asm/vdso/gettimeofday.h
+++ b/arch/arm/include/asm/vdso/gettimeofday.h
@@ -70,6 +70,32 @@ static __always_inline int clock_getres_fallback(
 	return ret;
 }
 
+static __always_inline
+long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_gettime_fallback(clock, &ts);
+
+	if (likely(!ret)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
+static __always_inline
+long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_getres_fallback(clock, &ts);
+
+	if (likely(!ret && _ts)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
 static __always_inline u64 __arch_get_hw_counter(int clock_mode)
 {
 #ifdef CONFIG_ARM_ARCH_TIMER
diff --git a/arch/arm64/include/asm/vdso/compat_gettimeofday.h b/arch/arm64/include/asm/vdso/compat_gettimeofday.h
index c50ee1b7d5cd..bab700e37a03 100644
--- a/arch/arm64/include/asm/vdso/compat_gettimeofday.h
+++ b/arch/arm64/include/asm/vdso/compat_gettimeofday.h
@@ -16,8 +16,6 @@
 
 #define VDSO_HAS_CLOCK_GETRES		1
 
-#define VDSO_HAS_32BIT_FALLBACK		1
-
 static __always_inline
 int gettimeofday_fallback(struct __kernel_old_timeval *_tv,
 			  struct timezone *_tz)
diff --git a/arch/arm64/include/asm/vdso/gettimeofday.h b/arch/arm64/include/asm/vdso/gettimeofday.h
index b08f476b72b4..c41c86a07423 100644
--- a/arch/arm64/include/asm/vdso/gettimeofday.h
+++ b/arch/arm64/include/asm/vdso/gettimeofday.h
@@ -66,6 +66,32 @@ int clock_getres_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
 	return ret;
 }
 
+static __always_inline
+long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_gettime_fallback(clock, &ts);
+
+	if (likely(!ret)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
+static __always_inline
+long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_getres_fallback(clock, &ts);
+
+	if (likely(!ret && _ts)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
 static __always_inline u64 __arch_get_hw_counter(s32 clock_mode)
 {
 	u64 res;
diff --git a/arch/mips/include/asm/vdso/gettimeofday.h b/arch/mips/include/asm/vdso/gettimeofday.h
index b08825531e9f..60608e930a5c 100644
--- a/arch/mips/include/asm/vdso/gettimeofday.h
+++ b/arch/mips/include/asm/vdso/gettimeofday.h
@@ -109,8 +109,6 @@ static __always_inline int clock_getres_fallback(
 
 #if _MIPS_SIM != _MIPS_SIM_ABI64
 
-#define VDSO_HAS_32BIT_FALLBACK	1
-
 static __always_inline long clock_gettime32_fallback(
 					clockid_t _clkid,
 					struct old_timespec32 *_ts)
@@ -150,6 +148,32 @@ static __always_inline int clock_getres32_fallback(
 
 	return error ? -ret : ret;
 }
+#else
+static __always_inline
+long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_gettime_fallback(clock, &ts);
+
+	if (likely(!ret)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
+static __always_inline
+long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_getres_fallback(clock, &ts);
+
+	if (likely(!ret && _ts)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
 #endif
 
 #ifdef CONFIG_CSRC_R4K
diff --git a/arch/x86/include/asm/vdso/gettimeofday.h b/arch/x86/include/asm/vdso/gettimeofday.h
index e9ee139cf29e..e1e16c2fdba0 100644
--- a/arch/x86/include/asm/vdso/gettimeofday.h
+++ b/arch/x86/include/asm/vdso/gettimeofday.h
@@ -94,9 +94,33 @@ long clock_getres_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
 	return ret;
 }
 
-#else
+static __always_inline
+long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_gettime_fallback(clock, &ts);
 
-#define VDSO_HAS_32BIT_FALLBACK	1
+	if (likely(!ret)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
+static __always_inline
+long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
+{
+	struct __kernel_timespec ts;
+	int ret = clock_getres_fallback(clock, &ts);
+
+	if (likely(!ret && _ts)) {
+		_ts->tv_sec = ts.tv_sec;
+		_ts->tv_nsec = ts.tv_nsec;
+	}
+	return ret;
+}
+
+#else
 
 static __always_inline
 long clock_gettime_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 9ecfd3b547ba..59189ed49352 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -125,13 +125,8 @@ __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
 
 	ret = __cvdso_clock_gettime_common(clock, &ts);
 
-#ifdef VDSO_HAS_32BIT_FALLBACK
 	if (unlikely(ret))
 		return clock_gettime32_fallback(clock, res);
-#else
-	if (unlikely(ret))
-		ret = clock_gettime_fallback(clock, &ts);
-#endif
 
 	if (likely(!ret)) {
 		res->tv_sec = ts.tv_sec;
@@ -238,13 +233,8 @@ __cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
 
 	ret = __cvdso_clock_getres_common(clock, &ts);
 
-#ifdef VDSO_HAS_32BIT_FALLBACK
 	if (unlikely(ret))
 		return clock_getres32_fallback(clock, res);
-#else
-	if (unlikely(ret))
-		ret = clock_getres_fallback(clock, &ts);
-#endif
 
 	if (likely(!ret && res)) {
 		res->tv_sec = ts.tv_sec;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code.
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  2:24   ` Andy Lutomirski
  2019-12-23 14:31 ` [RFC PATCH v2 03/10] lib: vdso: Change __cvdso_clock_gettime/getres_common() to __cvdso_clock_gettime/getres() Christophe Leroy
                   ` (8 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

On powerpc, VDSO functions and syscalls cannot be implemented in C
because the Linux kernel ABI requires that CR[SO] bit is set in case
of error and cleared when no error.

As this cannot be done in C, C VDSO functions and syscall'based
fallback need a trampoline in ASM.

By moving the fallback calls out of the common code, arches like
powerpc can implement both the call to C VDSO and the fallback call
in a single trampoline function.

The two advantages are:
- No need play back and forth with CR[SO] and negative return value.
- No stack frame is required in VDSO C functions for the fallbacks.

The performance improvement is significant on powerpc.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/arm/vdso/vgettimeofday.c            | 28 +++++++++++++++---
 arch/arm64/kernel/vdso/vgettimeofday.c   | 21 ++++++++++++--
 arch/arm64/kernel/vdso32/vgettimeofday.c | 35 ++++++++++++++++++++---
 arch/mips/vdso/vgettimeofday.c           | 49 +++++++++++++++++++++++++++-----
 arch/x86/entry/vdso/vclock_gettime.c     | 42 +++++++++++++++++++++++----
 lib/vdso/gettimeofday.c                  | 31 ++++----------------
 6 files changed, 156 insertions(+), 50 deletions(-)

diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
index 1976c6f325a4..5451afb715e6 100644
--- a/arch/arm/vdso/vgettimeofday.c
+++ b/arch/arm/vdso/vgettimeofday.c
@@ -10,25 +10,45 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
-	return __cvdso_clock_gettime32(clock, ts);
+	int ret = __cvdso_clock_gettime32(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime32_fallback(clock, &ts);
 }
 
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
-	return __cvdso_clock_getres_time32(clock_id, res);
+	int ret = __cvdso_clock_getres_time32(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres32_fallback(clock, res);
 }
 
 /* Avoid unresolved references emitted by GCC */
diff --git a/arch/arm64/kernel/vdso/vgettimeofday.c b/arch/arm64/kernel/vdso/vgettimeofday.c
index 747635501a14..62694876b216 100644
--- a/arch/arm64/kernel/vdso/vgettimeofday.c
+++ b/arch/arm64/kernel/vdso/vgettimeofday.c
@@ -11,17 +11,32 @@
 int __kernel_clock_gettime(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int __kernel_gettimeofday(struct __kernel_old_timeval *tv,
 			  struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int __kernel_clock_getres(clockid_t clock_id,
 			  struct __kernel_timespec *res)
 {
-	return __cvdso_clock_getres(clock_id, res);
+	int ret =  __cvdso_clock_getres(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres_fallback(clock, res);
 }
diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
index 54fc1c2ce93f..6770d2bedd1f 100644
--- a/arch/arm64/kernel/vdso32/vgettimeofday.c
+++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
@@ -11,37 +11,64 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
+	int ret;
+
 	/* The checks below are required for ABI consistency with arm */
 	if ((u32)ts >= TASK_SIZE_32)
 		return -EFAULT;
 
-	return __cvdso_clock_gettime32(clock, ts);
+	ret = __cvdso_clock_gettime32(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime32_fallback(clock, &ts);
 }
 
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
+	int ret;
+
 	/* The checks below are required for ABI consistency with arm */
 	if ((u32)ts >= TASK_SIZE_32)
 		return -EFAULT;
 
-	return __cvdso_clock_gettime(clock, ts);
+	ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
+	int ret;
+	struct __kernel_timespec ts;
+
 	/* The checks below are required for ABI consistency with arm */
 	if ((u32)res >= TASK_SIZE_32)
 		return -EFAULT;
 
-	return __cvdso_clock_getres_time32(clock_id, res);
+	ret = __cvdso_clock_getres_time32(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres32_fallback(clock, res);
 }
 
 /* Avoid unresolved references emitted by GCC */
diff --git a/arch/mips/vdso/vgettimeofday.c b/arch/mips/vdso/vgettimeofday.c
index 6ebdc37c89fc..02758c58454c 100644
--- a/arch/mips/vdso/vgettimeofday.c
+++ b/arch/mips/vdso/vgettimeofday.c
@@ -14,25 +14,45 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
-	return __cvdso_clock_gettime32(clock, ts);
+	int ret = __cvdso_clock_gettime32(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime32_fallback(clock, ts);
 }
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
-	return __cvdso_clock_getres_time32(clock_id, res);
+	int ret = __cvdso_clock_getres_time32(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres32_fallback(clock, res);
 }
 
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 #else
@@ -40,19 +60,34 @@ int __vdso_clock_gettime64(clockid_t clock,
 int __vdso_clock_gettime(clockid_t clock,
 			 struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int __vdso_clock_getres(clockid_t clock_id,
 			struct __kernel_timespec *res)
 {
-	return __cvdso_clock_getres(clock_id, res);
+	int ret =  __cvdso_clock_getres(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres_fallback(clock, res);
 }
 
 #endif
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index 7d70935b6758..fde511cfe284 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -19,7 +19,12 @@ extern __kernel_old_time_t __vdso_time(__kernel_old_time_t *t);
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
 {
-	return __cvdso_gettimeofday(tv, tz);
+	int ret = __cvdso_gettimeofday(tv, tz);
+
+	if (likely(!ret))
+		return ret;
+
+	return gettimeofday_fallback(tv, tz);
 }
 
 int gettimeofday(struct __kernel_old_timeval *, struct timezone *)
@@ -40,7 +45,12 @@ extern int __vdso_clock_getres(clockid_t clock, struct __kernel_timespec *res);
 
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int clock_gettime(clockid_t, struct __kernel_timespec *)
@@ -49,7 +59,12 @@ int clock_gettime(clockid_t, struct __kernel_timespec *)
 int __vdso_clock_getres(clockid_t clock,
 			struct __kernel_timespec *res)
 {
-	return __cvdso_clock_getres(clock, res);
+	int ret =  __cvdso_clock_getres(clock_id, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres_fallback(clock, res);
 }
 int clock_getres(clockid_t, struct __kernel_timespec *)
 	__attribute__((weak, alias("__vdso_clock_getres")));
@@ -61,7 +76,12 @@ extern int __vdso_clock_getres(clockid_t clock, struct old_timespec32 *res);
 
 int __vdso_clock_gettime(clockid_t clock, struct old_timespec32 *ts)
 {
-	return __cvdso_clock_gettime32(clock, ts);
+	int ret = __cvdso_clock_gettime32(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime32_fallback(clock, ts);
 }
 
 int clock_gettime(clockid_t, struct old_timespec32 *)
@@ -69,7 +89,12 @@ int clock_gettime(clockid_t, struct old_timespec32 *)
 
 int __vdso_clock_gettime64(clockid_t clock, struct __kernel_timespec *ts)
 {
-	return __cvdso_clock_gettime(clock, ts);
+	int ret = __cvdso_clock_gettime(clock, ts);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_gettime_fallback(clock, ts);
 }
 
 int clock_gettime64(clockid_t, struct __kernel_timespec *)
@@ -77,7 +102,12 @@ int clock_gettime64(clockid_t, struct __kernel_timespec *)
 
 int __vdso_clock_getres(clockid_t clock, struct old_timespec32 *res)
 {
-	return __cvdso_clock_getres_time32(clock, res);
+	int ret = __cvdso_clock_getres_time32(clock, res);
+
+	if (likely(!ret))
+		return ret;
+
+	return clock_getres32_fallback(clock, res);
 }
 
 int clock_getres(clockid_t, struct old_timespec32 *)
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 59189ed49352..4618e274f1d5 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -16,9 +16,6 @@
  * - __arch_get_vdso_data(): to get the vdso datapage.
  * - __arch_get_hw_counter(): to get the hw counter based on the
  *   clock_mode.
- * - gettimeofday_fallback(): fallback for gettimeofday.
- * - clock_gettime_fallback(): fallback for clock_gettime.
- * - clock_getres_fallback(): fallback for clock_getres.
  */
 #ifdef ENABLE_COMPAT_VDSO
 #include <asm/vdso/compat_gettimeofday.h>
@@ -110,23 +107,14 @@ __cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
 static __maybe_unused int
 __cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime_common(clock, ts);
-
-	if (unlikely(ret))
-		return clock_gettime_fallback(clock, ts);
-	return 0;
+	return __cvdso_clock_gettime_common(clock, ts);
 }
 
 static __maybe_unused int
 __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret;
-
-	ret = __cvdso_clock_gettime_common(clock, &ts);
-
-	if (unlikely(ret))
-		return clock_gettime32_fallback(clock, res);
+	int ret = __cvdso_clock_gettime_common(clock, &ts);
 
 	if (likely(!ret)) {
 		res->tv_sec = ts.tv_sec;
@@ -144,7 +132,7 @@ __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
 		struct __kernel_timespec ts;
 
 		if (do_hres(&vd[CS_HRES_COARSE], CLOCK_REALTIME, &ts))
-			return gettimeofday_fallback(tv, tz);
+			return -1;
 
 		tv->tv_sec = ts.tv_sec;
 		tv->tv_usec = (u32)ts.tv_nsec / NSEC_PER_USEC;
@@ -218,23 +206,14 @@ int __cvdso_clock_getres_common(clockid_t clock, struct __kernel_timespec *res)
 
 int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
 {
-	int ret = __cvdso_clock_getres_common(clock, res);
-
-	if (unlikely(ret))
-		return clock_getres_fallback(clock, res);
-	return 0;
+	return __cvdso_clock_getres_common(clock, res);
 }
 
 static __maybe_unused int
 __cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret;
-
-	ret = __cvdso_clock_getres_common(clock, &ts);
-
-	if (unlikely(ret))
-		return clock_getres32_fallback(clock, res);
+	int ret = __cvdso_clock_getres_common(clock, &ts);
 
 	if (likely(!ret && res)) {
 		res->tv_sec = ts.tv_sec;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 03/10] lib: vdso: Change __cvdso_clock_gettime/getres_common() to __cvdso_clock_gettime/getres()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch Christophe Leroy
                   ` (7 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

__cvdso_clock_getres() just calls __cvdso_clock_getres_common().
__cvdso_clock_gettime() just calls __cvdso_clock_getres_common().

Drop __cvdso_clock_getres() and __cvdso_clock_gettime()
Rename __cvdso_clock_gettime_common() into __cvdso_clock_gettime()
Rename __cvdso_clock_getres_common() into __cvdso_clock_getres()

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/vdso/gettimeofday.c | 19 ++++---------------
 1 file changed, 4 insertions(+), 15 deletions(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 4618e274f1d5..c6eeeb47f446 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -79,7 +79,7 @@ static void do_coarse(const struct vdso_data *vd, clockid_t clk,
 }
 
 static __maybe_unused int
-__cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
+__cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
 {
 	const struct vdso_data *vd = __arch_get_vdso_data();
 	u32 msk;
@@ -105,16 +105,10 @@ __cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
 }
 
 static __maybe_unused int
-__cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
-{
-	return __cvdso_clock_gettime_common(clock, ts);
-}
-
-static __maybe_unused int
 __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret = __cvdso_clock_gettime_common(clock, &ts);
+	int ret = __cvdso_clock_gettime(clock, &ts);
 
 	if (likely(!ret)) {
 		res->tv_sec = ts.tv_sec;
@@ -161,7 +155,7 @@ static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time
 
 #ifdef VDSO_HAS_CLOCK_GETRES
 static __maybe_unused
-int __cvdso_clock_getres_common(clockid_t clock, struct __kernel_timespec *res)
+int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
 {
 	const struct vdso_data *vd = __arch_get_vdso_data();
 	u64 hrtimer_res;
@@ -204,16 +198,11 @@ int __cvdso_clock_getres_common(clockid_t clock, struct __kernel_timespec *res)
 	return 0;
 }
 
-int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
-{
-	return __cvdso_clock_getres_common(clock, res);
-}
-
 static __maybe_unused int
 __cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret = __cvdso_clock_getres_common(clock, &ts);
+	int ret = __cvdso_clock_getres(clock, &ts);
 
 	if (likely(!ret && res)) {
 		res->tv_sec = ts.tv_sec;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (2 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 03/10] lib: vdso: Change __cvdso_clock_gettime/getres_common() to __cvdso_clock_gettime/getres() Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  2:27   ` Andy Lutomirski
  2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
                   ` (6 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

On powerpc, __arch_get_vdso_data() clobbers the link register,
requiring the caller to set a stack frame in order to save it.

As the parent function already has to set a stack frame and save
the link register to call the C vdso function, retriving the
vdso data pointer there is lighter.

Give arches the opportunity to hand the vdso data pointer
to C vdso functions.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/arm/vdso/vgettimeofday.c            | 12 ++++++++----
 arch/arm64/kernel/vdso/vgettimeofday.c   |  9 ++++++---
 arch/arm64/kernel/vdso32/vgettimeofday.c | 12 ++++++++----
 arch/mips/vdso/vgettimeofday.c           | 21 ++++++++++++++-------
 arch/x86/entry/vdso/vclock_gettime.c     | 22 +++++++++++++++-------
 lib/vdso/gettimeofday.c                  | 28 ++++++++++++++--------------
 6 files changed, 65 insertions(+), 39 deletions(-)

diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
index 5451afb715e6..efad7d508d06 100644
--- a/arch/arm/vdso/vgettimeofday.c
+++ b/arch/arm/vdso/vgettimeofday.c
@@ -10,7 +10,8 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
-	int ret = __cvdso_clock_gettime32(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime32(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -21,7 +22,8 @@ int __vdso_clock_gettime(clockid_t clock,
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -32,7 +34,8 @@ int __vdso_clock_gettime64(clockid_t clock,
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -43,7 +46,8 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
-	int ret = __cvdso_clock_getres_time32(clock_id, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_getres_time32(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
diff --git a/arch/arm64/kernel/vdso/vgettimeofday.c b/arch/arm64/kernel/vdso/vgettimeofday.c
index 62694876b216..9a7122ec6d17 100644
--- a/arch/arm64/kernel/vdso/vgettimeofday.c
+++ b/arch/arm64/kernel/vdso/vgettimeofday.c
@@ -11,7 +11,8 @@
 int __kernel_clock_gettime(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -22,7 +23,8 @@ int __kernel_clock_gettime(clockid_t clock,
 int __kernel_gettimeofday(struct __kernel_old_timeval *tv,
 			  struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -33,7 +35,8 @@ int __kernel_gettimeofday(struct __kernel_old_timeval *tv,
 int __kernel_clock_getres(clockid_t clock_id,
 			  struct __kernel_timespec *res)
 {
-	int ret =  __cvdso_clock_getres(clock_id, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret =  __cvdso_clock_getres(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
diff --git a/arch/arm64/kernel/vdso32/vgettimeofday.c b/arch/arm64/kernel/vdso32/vgettimeofday.c
index 6770d2bedd1f..3eb6a82c1c25 100644
--- a/arch/arm64/kernel/vdso32/vgettimeofday.c
+++ b/arch/arm64/kernel/vdso32/vgettimeofday.c
@@ -11,13 +11,14 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
+	const struct vdso_data *vd = __arch_get_vdso_data();
 	int ret;
 
 	/* The checks below are required for ABI consistency with arm */
 	if ((u32)ts >= TASK_SIZE_32)
 		return -EFAULT;
 
-	ret = __cvdso_clock_gettime32(clock, ts);
+	ret = __cvdso_clock_gettime32(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -28,13 +29,14 @@ int __vdso_clock_gettime(clockid_t clock,
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
+	const struct vdso_data *vd = __arch_get_vdso_data();
 	int ret;
 
 	/* The checks below are required for ABI consistency with arm */
 	if ((u32)ts >= TASK_SIZE_32)
 		return -EFAULT;
 
-	ret = __cvdso_clock_gettime(clock, ts);
+	ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -45,7 +47,8 @@ int __vdso_clock_gettime64(clockid_t clock,
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -56,6 +59,7 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
+	const struct vdso_data *vd = __arch_get_vdso_data();
 	int ret;
 	struct __kernel_timespec ts;
 
@@ -63,7 +67,7 @@ int __vdso_clock_getres(clockid_t clock_id,
 	if ((u32)res >= TASK_SIZE_32)
 		return -EFAULT;
 
-	ret = __cvdso_clock_getres_time32(clock_id, res);
+	ret = __cvdso_clock_getres_time32(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
diff --git a/arch/mips/vdso/vgettimeofday.c b/arch/mips/vdso/vgettimeofday.c
index 02758c58454c..65d79d4695da 100644
--- a/arch/mips/vdso/vgettimeofday.c
+++ b/arch/mips/vdso/vgettimeofday.c
@@ -14,7 +14,8 @@
 int __vdso_clock_gettime(clockid_t clock,
 			 struct old_timespec32 *ts)
 {
-	int ret = __cvdso_clock_gettime32(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime32(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -25,7 +26,8 @@ int __vdso_clock_gettime(clockid_t clock,
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -36,7 +38,8 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 int __vdso_clock_getres(clockid_t clock_id,
 			struct old_timespec32 *res)
 {
-	int ret = __cvdso_clock_getres_time32(clock_id, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_getres_time32(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
@@ -47,7 +50,8 @@ int __vdso_clock_getres(clockid_t clock_id,
 int __vdso_clock_gettime64(clockid_t clock,
 			   struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -60,7 +64,8 @@ int __vdso_clock_gettime64(clockid_t clock,
 int __vdso_clock_gettime(clockid_t clock,
 			 struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -71,7 +76,8 @@ int __vdso_clock_gettime(clockid_t clock,
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 			struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -82,7 +88,8 @@ int __vdso_gettimeofday(struct __kernel_old_timeval *tv,
 int __vdso_clock_getres(clockid_t clock_id,
 			struct __kernel_timespec *res)
 {
-	int ret =  __cvdso_clock_getres(clock_id, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret =  __cvdso_clock_getres(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
diff --git a/arch/x86/entry/vdso/vclock_gettime.c b/arch/x86/entry/vdso/vclock_gettime.c
index fde511cfe284..b87847b660dd 100644
--- a/arch/x86/entry/vdso/vclock_gettime.c
+++ b/arch/x86/entry/vdso/vclock_gettime.c
@@ -19,7 +19,8 @@ extern __kernel_old_time_t __vdso_time(__kernel_old_time_t *t);
 
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
 {
-	int ret = __cvdso_gettimeofday(tv, tz);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_gettimeofday(vd, tv, tz);
 
 	if (likely(!ret))
 		return ret;
@@ -32,7 +33,9 @@ int gettimeofday(struct __kernel_old_timeval *, struct timezone *)
 
 __kernel_old_time_t __vdso_time(__kernel_old_time_t *t)
 {
-	return __cvdso_time(t);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+
+	return __cvdso_time(vd, t);
 }
 
 __kernel_old_time_t time(__kernel_old_time_t *t)	__attribute__((weak, alias("__vdso_time")));
@@ -45,7 +48,8 @@ extern int __vdso_clock_getres(clockid_t clock, struct __kernel_timespec *res);
 
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -59,7 +63,8 @@ int clock_gettime(clockid_t, struct __kernel_timespec *)
 int __vdso_clock_getres(clockid_t clock,
 			struct __kernel_timespec *res)
 {
-	int ret =  __cvdso_clock_getres(clock_id, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret =  __cvdso_clock_getres(vd, clock_id, res);
 
 	if (likely(!ret))
 		return ret;
@@ -76,7 +81,8 @@ extern int __vdso_clock_getres(clockid_t clock, struct old_timespec32 *res);
 
 int __vdso_clock_gettime(clockid_t clock, struct old_timespec32 *ts)
 {
-	int ret = __cvdso_clock_gettime32(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime32(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -89,7 +95,8 @@ int clock_gettime(clockid_t, struct old_timespec32 *)
 
 int __vdso_clock_gettime64(clockid_t clock, struct __kernel_timespec *ts)
 {
-	int ret = __cvdso_clock_gettime(clock, ts);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_gettime(vd, clock, ts);
 
 	if (likely(!ret))
 		return ret;
@@ -102,7 +109,8 @@ int clock_gettime64(clockid_t, struct __kernel_timespec *)
 
 int __vdso_clock_getres(clockid_t clock, struct old_timespec32 *res)
 {
-	int ret = __cvdso_clock_getres_time32(clock, res);
+	const struct vdso_data *vd = __arch_get_vdso_data();
+	int ret = __cvdso_clock_getres_time32(vd, clock, res);
 
 	if (likely(!ret))
 		return ret;
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index c6eeeb47f446..24e1ba838260 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -13,7 +13,6 @@
 /*
  * The generic vDSO implementation requires that gettimeofday.h
  * provides:
- * - __arch_get_vdso_data(): to get the vdso datapage.
  * - __arch_get_hw_counter(): to get the hw counter based on the
  *   clock_mode.
  */
@@ -79,9 +78,9 @@ static void do_coarse(const struct vdso_data *vd, clockid_t clk,
 }
 
 static __maybe_unused int
-__cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
+__cvdso_clock_gettime(const struct vdso_data *vd, clockid_t clock,
+		      struct __kernel_timespec *ts)
 {
-	const struct vdso_data *vd = __arch_get_vdso_data();
 	u32 msk;
 
 	/* Check for negative values or invalid clocks */
@@ -105,10 +104,11 @@ __cvdso_clock_gettime(clockid_t clock, struct __kernel_timespec *ts)
 }
 
 static __maybe_unused int
-__cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
+__cvdso_clock_gettime32(const struct vdso_data *vd, clockid_t clock,
+			struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret = __cvdso_clock_gettime(clock, &ts);
+	int ret = __cvdso_clock_gettime(vd, clock, &ts);
 
 	if (likely(!ret)) {
 		res->tv_sec = ts.tv_sec;
@@ -118,10 +118,9 @@ __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
 }
 
 static __maybe_unused int
-__cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
+__cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv,
+		     struct timezone *tz)
 {
-	const struct vdso_data *vd = __arch_get_vdso_data();
-
 	if (likely(tv != NULL)) {
 		struct __kernel_timespec ts;
 
@@ -141,9 +140,9 @@ __cvdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz)
 }
 
 #ifdef VDSO_HAS_TIME
-static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time)
+static __maybe_unused __kernel_old_time_t
+__cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
 {
-	const struct vdso_data *vd = __arch_get_vdso_data();
 	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
 
 	if (time)
@@ -155,9 +154,9 @@ static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time
 
 #ifdef VDSO_HAS_CLOCK_GETRES
 static __maybe_unused
-int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
+int __cvdso_clock_getres(const struct vdso_data *vd, clockid_t clock,
+			 struct __kernel_timespec *res)
 {
-	const struct vdso_data *vd = __arch_get_vdso_data();
 	u64 hrtimer_res;
 	u32 msk;
 	u64 ns;
@@ -199,10 +198,11 @@ int __cvdso_clock_getres(clockid_t clock, struct __kernel_timespec *res)
 }
 
 static __maybe_unused int
-__cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
+__cvdso_clock_getres_time32(const struct vdso_data *vd, clockid_t clock,
+			    struct old_timespec32 *res)
 {
 	struct __kernel_timespec ts;
-	int ret = __cvdso_clock_getres(clock, &ts);
+	int ret = __cvdso_clock_getres(vd, clock, &ts);
 
 	if (likely(!ret && res)) {
 		res->tv_sec = ts.tv_sec;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 05/10] lib: vdso: inline do_hres()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (3 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  2:29   ` Andy Lutomirski
  2019-12-30 12:07   ` Arnd Bergmann
  2019-12-23 14:31 ` [RFC PATCH v2 06/10] lib: vdso: make do_coarse() return 0 Christophe Leroy
                   ` (5 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

do_hres() is called from several places, so GCC doesn't inline
it at first.

do_hres() takes a struct __kernel_timespec * parameter for
passing the result. In the 32 bits case, this parameter corresponds
to a local var in the caller. In order to provide a pointer
to this structure, the caller has to put it in its stack and
do_hres() has to write the result in the stack. This is suboptimal,
especially on RISC processor like powerpc.

By making GCC inline the function, the struct __kernel_timespec
remains a local var using registers, avoiding the need to write and
read stack.

The improvement is significant on powerpc.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/vdso/gettimeofday.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 24e1ba838260..86d5b1c8796b 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -34,8 +34,8 @@ u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
 }
 #endif
 
-static int do_hres(const struct vdso_data *vd, clockid_t clk,
-		   struct __kernel_timespec *ts)
+static inline int do_hres(const struct vdso_data *vd, clockid_t clk,
+			  struct __kernel_timespec *ts)
 {
 	const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
 	u64 cycles, last, sec, ns;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 06/10] lib: vdso: make do_coarse() return 0
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (4 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
                   ` (4 subsequent siblings)
  10 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

do_coarse() is similare to do_hres() except that it never
fails.

Change its type to int instead of void and get it return 0
at all time. This cleans the code a bit.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/vdso/gettimeofday.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 86d5b1c8796b..17b4cff6e5f0 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -64,7 +64,7 @@ static inline int do_hres(const struct vdso_data *vd, clockid_t clk,
 	return 0;
 }
 
-static void do_coarse(const struct vdso_data *vd, clockid_t clk,
+static int do_coarse(const struct vdso_data *vd, clockid_t clk,
 		      struct __kernel_timespec *ts)
 {
 	const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
@@ -75,6 +75,8 @@ static void do_coarse(const struct vdso_data *vd, clockid_t clk,
 		ts->tv_sec = vdso_ts->sec;
 		ts->tv_nsec = vdso_ts->nsec;
 	} while (unlikely(vdso_read_retry(vd, seq)));
+
+	return 0;
 }
 
 static __maybe_unused int
@@ -92,14 +94,13 @@ __cvdso_clock_gettime(const struct vdso_data *vd, clockid_t clock,
 	 * clocks are handled in the VDSO directly.
 	 */
 	msk = 1U << clock;
-	if (likely(msk & VDSO_HRES)) {
+	if (likely(msk & VDSO_HRES))
 		return do_hres(&vd[CS_HRES_COARSE], clock, ts);
-	} else if (msk & VDSO_COARSE) {
-		do_coarse(&vd[CS_HRES_COARSE], clock, ts);
-		return 0;
-	} else if (msk & VDSO_RAW) {
+	else if (msk & VDSO_COARSE)
+		return do_coarse(&vd[CS_HRES_COARSE], clock, ts);
+	else if (msk & VDSO_RAW)
 		return do_hres(&vd[CS_RAW], clock, ts);
-	}
+
 	return -1;
 }
 
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (5 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 06/10] lib: vdso: make do_coarse() return 0 Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  1:58   ` Andy Lutomirski
  2020-01-10 21:12   ` Thomas Gleixner
  2019-12-23 14:31 ` [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres() Christophe Leroy
                   ` (3 subsequent siblings)
  10 siblings, 2 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

READ_ONCE() forces the read of the 64 bit value of
vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
only the lower part is needed.

This results in a suboptimal code:

00000af4 <__c_kernel_time>:
 af4:	2c 03 00 00 	cmpwi   r3,0
 af8:	81 44 00 20 	lwz     r10,32(r4)
 afc:	81 64 00 24 	lwz     r11,36(r4)
 b00:	41 82 00 08 	beq     b08 <__c_kernel_time+0x14>
 b04:	91 63 00 00 	stw     r11,0(r3)
 b08:	7d 63 5b 78 	mr      r3,r11
 b0c:	4e 80 00 20 	blr

By removing the READ_ONCE(), only the lower part is read from
memory, and the code is cleaner:

00000af4 <__c_kernel_time>:
 af4:	7c 69 1b 79 	mr.     r9,r3
 af8:	80 64 00 24 	lwz     r3,36(r4)
 afc:	4d 82 00 20 	beqlr
 b00:	90 69 00 00 	stw     r3,0(r9)
 b04:	4e 80 00 20 	blr

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/vdso/gettimeofday.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 17b4cff6e5f0..5a17a9d2e6cd 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
 static __maybe_unused __kernel_old_time_t
 __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
 {
-	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+	__kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
 
 	if (time)
 		*time = t;
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (6 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-24  1:59   ` Andy Lutomirski
  2019-12-23 14:31 ` [RFC PATCH v2 09/10] powerpc/vdso32: inline __get_datapage() Christophe Leroy
                   ` (2 subsequent siblings)
  10 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

VDSO_HRES and VDSO_RAW clocks are handled the same way.

Don't duplicate code.

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 lib/vdso/gettimeofday.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 5a17a9d2e6cd..aa4a167bf1e0 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -172,7 +172,7 @@ int __cvdso_clock_getres(const struct vdso_data *vd, clockid_t clock,
 	 * clocks are handled in the VDSO directly.
 	 */
 	msk = 1U << clock;
-	if (msk & VDSO_HRES) {
+	if (msk & (VDSO_HRES | VDSO_RAW)) {
 		/*
 		 * Preserves the behaviour of posix_get_hrtimer_res().
 		 */
@@ -182,11 +182,6 @@ int __cvdso_clock_getres(const struct vdso_data *vd, clockid_t clock,
 		 * Preserves the behaviour of posix_get_coarse_res().
 		 */
 		ns = LOW_RES_NSEC;
-	} else if (msk & VDSO_RAW) {
-		/*
-		 * Preserves the behaviour of posix_get_hrtimer_res().
-		 */
-		ns = hrtimer_res;
 	} else {
 		return -1;
 	}
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 09/10] powerpc/vdso32: inline __get_datapage()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (7 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres() Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2019-12-23 14:31 ` [RFC PATCH v2 10/10] powerpc/32: Switch VDSO to C implementation Christophe Leroy
  2020-01-09 17:52 ` Surprising code generated for vdso_read_begin() Christophe Leroy
  10 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

__get_datapage() is only a few instructions to retrieve the
address of the page where the kernel stores data to the VDSO.

By inlining this function into its users, a bl/blr pair and
a mflr/mtlr pair is avoided, plus a few reg moves.

The improvement is noticeable (about 55 nsec/call on an 8xx)

vdsotest before the patch:
gettimeofday:    vdso: 731 nsec/call
clock-gettime-realtime-coarse:    vdso: 668 nsec/call
clock-gettime-monotonic-coarse:    vdso: 745 nsec/call

vdsotest after the patch:
gettimeofday:    vdso: 677 nsec/call
clock-gettime-realtime-coarse:    vdso: 613 nsec/call
clock-gettime-monotonic-coarse:    vdso: 690 nsec/call

Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

---
v3: define get_datapage macro in asm/vdso_datapage.h
v4: fixed build failure with old binutils
---
 arch/powerpc/include/asm/vdso_datapage.h  | 10 ++++++++++
 arch/powerpc/kernel/vdso32/cacheflush.S   |  9 ++++-----
 arch/powerpc/kernel/vdso32/datapage.S     | 28 +++-------------------------
 arch/powerpc/kernel/vdso32/gettimeofday.S | 12 +++++-------
 4 files changed, 22 insertions(+), 37 deletions(-)

diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index 40f13f3626d3..ee5319a6f4e3 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -118,6 +118,16 @@ struct vdso_data {
 
 extern struct vdso_data *vdso_data;
 
+#else /* __ASSEMBLY__ */
+
+.macro get_datapage ptr, tmp
+	bcl	20, 31, .+4
+	mflr	\ptr
+	addi	\ptr, \ptr, (__kernel_datapage_offset - (.-4))@l
+	lwz	\tmp, 0(\ptr)
+	add	\ptr, \tmp, \ptr
+.endm
+
 #endif /* __ASSEMBLY__ */
 
 #endif /* __KERNEL__ */
diff --git a/arch/powerpc/kernel/vdso32/cacheflush.S b/arch/powerpc/kernel/vdso32/cacheflush.S
index 7f882e7b9f43..d178ec8c279d 100644
--- a/arch/powerpc/kernel/vdso32/cacheflush.S
+++ b/arch/powerpc/kernel/vdso32/cacheflush.S
@@ -8,6 +8,7 @@
 #include <asm/processor.h>
 #include <asm/ppc_asm.h>
 #include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
 #include <asm/asm-offsets.h>
 
 	.text
@@ -24,14 +25,12 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
   .cfi_startproc
 	mflr	r12
   .cfi_register lr,r12
-	mr	r11,r3
-	bl	__get_datapage@local
+	get_datapage	r10, r0
 	mtlr	r12
-	mr	r10,r3
 
 	lwz	r7,CFG_DCACHE_BLOCKSZ(r10)
 	addi	r5,r7,-1
-	andc	r6,r11,r5		/* round low to line bdy */
+	andc	r6,r3,r5		/* round low to line bdy */
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5		/* ensure we get enough */
 	lwz	r9,CFG_DCACHE_LOGBLOCKSZ(r10)
@@ -48,7 +47,7 @@ V_FUNCTION_BEGIN(__kernel_sync_dicache)
 
 	lwz	r7,CFG_ICACHE_BLOCKSZ(r10)
 	addi	r5,r7,-1
-	andc	r6,r11,r5		/* round low to line bdy */
+	andc	r6,r3,r5		/* round low to line bdy */
 	subf	r8,r6,r4		/* compute length */
 	add	r8,r8,r5
 	lwz	r9,CFG_ICACHE_LOGBLOCKSZ(r10)
diff --git a/arch/powerpc/kernel/vdso32/datapage.S b/arch/powerpc/kernel/vdso32/datapage.S
index 6c7401bd284e..1095d818f94a 100644
--- a/arch/powerpc/kernel/vdso32/datapage.S
+++ b/arch/powerpc/kernel/vdso32/datapage.S
@@ -10,35 +10,13 @@
 #include <asm/asm-offsets.h>
 #include <asm/unistd.h>
 #include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
 
 	.text
 	.global	__kernel_datapage_offset;
 __kernel_datapage_offset:
 	.long	0
 
-V_FUNCTION_BEGIN(__get_datapage)
-  .cfi_startproc
-	/* We don't want that exposed or overridable as we want other objects
-	 * to be able to bl directly to here
-	 */
-	.protected __get_datapage
-	.hidden __get_datapage
-
-	mflr	r0
-  .cfi_register lr,r0
-
-	bcl	20,31,data_page_branch
-data_page_branch:
-	mflr	r3
-	mtlr	r0
-	addi	r3, r3, __kernel_datapage_offset-data_page_branch
-	lwz	r0,0(r3)
-  .cfi_restore lr
-	add	r3,r0,r3
-	blr
-  .cfi_endproc
-V_FUNCTION_END(__get_datapage)
-
 /*
  * void *__kernel_get_syscall_map(unsigned int *syscall_count) ;
  *
@@ -53,7 +31,7 @@ V_FUNCTION_BEGIN(__kernel_get_syscall_map)
 	mflr	r12
   .cfi_register lr,r12
 	mr	r4,r3
-	bl	__get_datapage@local
+	get_datapage	r3, r0
 	mtlr	r12
 	addi	r3,r3,CFG_SYSCALL_MAP32
 	cmpli	cr0,r4,0
@@ -75,7 +53,7 @@ V_FUNCTION_BEGIN(__kernel_get_tbfreq)
   .cfi_startproc
 	mflr	r12
   .cfi_register lr,r12
-	bl	__get_datapage@local
+	get_datapage	r3, r0
 	lwz	r4,(CFG_TB_TICKS_PER_SEC + 4)(r3)
 	lwz	r3,CFG_TB_TICKS_PER_SEC(r3)
 	mtlr	r12
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index 3306672f57a9..d6c1d331e8cb 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -9,6 +9,7 @@
 #include <asm/processor.h>
 #include <asm/ppc_asm.h>
 #include <asm/vdso.h>
+#include <asm/vdso_datapage.h>
 #include <asm/asm-offsets.h>
 #include <asm/unistd.h>
 
@@ -33,8 +34,7 @@ V_FUNCTION_BEGIN(__kernel_gettimeofday)
 
 	mr	r10,r3			/* r10 saves tv */
 	mr	r11,r4			/* r11 saves tz */
-	bl	__get_datapage@local	/* get data page */
-	mr	r9, r3			/* datapage ptr in r9 */
+	get_datapage	r9, r0
 	cmplwi	r10,0			/* check if tv is NULL */
 	beq	3f
 	lis	r7,1000000@ha		/* load up USEC_PER_SEC */
@@ -74,8 +74,7 @@ V_FUNCTION_BEGIN(__kernel_clock_gettime)
 	mflr	r12			/* r12 saves lr */
   .cfi_register lr,r12
 	mr	r11,r4			/* r11 saves tp */
-	bl	__get_datapage@local	/* get data page */
-	mr	r9,r3			/* datapage ptr in r9 */
+	get_datapage	r9, r0
 	lis	r7,NSEC_PER_SEC@h	/* want nanoseconds */
 	ori	r7,r7,NSEC_PER_SEC@l
 50:	bl	__do_get_tspec@local	/* get sec/nsec from tb & kernel */
@@ -156,7 +155,7 @@ V_FUNCTION_BEGIN(__kernel_clock_getres)
 
 	mflr	r12
   .cfi_register lr,r12
-	bl	__get_datapage@local	/* get data page */
+	get_datapage	r3, r0
 	lwz	r5, CLOCK_HRTIMER_RES(r3)
 	mtlr	r12
 	li	r3,0
@@ -190,8 +189,7 @@ V_FUNCTION_BEGIN(__kernel_time)
   .cfi_register lr,r12
 
 	mr	r11,r3			/* r11 holds t */
-	bl	__get_datapage@local
-	mr	r9, r3			/* datapage ptr in r9 */
+	get_datapage	r9, r0
 
 	lwz	r3,STAMP_XTIME_SEC+LOPART(r9)
 
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* [RFC PATCH v2 10/10] powerpc/32: Switch VDSO to C implementation.
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (8 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 09/10] powerpc/vdso32: inline __get_datapage() Christophe Leroy
@ 2019-12-23 14:31 ` Christophe Leroy
  2020-01-09 17:52 ` Surprising code generated for vdso_read_begin() Christophe Leroy
  10 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2019-12-23 14:31 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

This is a tentative to switch powerpc/32 vdso to
generic C implementation.

It will likely not work on 64 bits or even build properly
at the moment, hence the RFC status.

powerpc is a bit special for VDSO as well as system calls in the
way that it requires setting CR SO bit which cannot be done in C.
Therefore, entry/exit and fallback needs to be performed in ASM.

On powerpc 8xx, performance is degraded by 30-40% for gettime and
by 15-20% for getres

On a powerpc885 at 132MHz:
With current powerpc/32 ASM VDSO:

gettimeofday:    vdso: 737 nsec/call
clock-getres-realtime-coarse:    vdso: 3081 nsec/call
clock-gettime-realtime-coarse:    vdso: 2861 nsec/call
clock-getres-realtime:    vdso: 475 nsec/call
clock-gettime-realtime:    vdso: 892 nsec/call
clock-getres-boottime:    vdso: 2621 nsec/call
clock-gettime-boottime:    vdso: 3857 nsec/call
clock-getres-tai:    vdso: 2620 nsec/call
clock-gettime-tai:    vdso: 3854 nsec/call
clock-getres-monotonic-raw:    vdso: 2621 nsec/call
clock-gettime-monotonic-raw:    vdso: 3499 nsec/call
clock-getres-monotonic-coarse:    vdso: 3083 nsec/call
clock-gettime-monotonic-coarse:    vdso: 3082 nsec/call
clock-getres-monotonic:    vdso: 475 nsec/call
clock-gettime-monotonic:    vdso: 1014 nsec/call

Once switched to C implementation:

gettimeofday:    vdso: 1016 nsec/call
clock-getres-realtime-coarse:    vdso: 614 nsec/call
clock-gettime-realtime-coarse:    vdso: 760 nsec/call
clock-getres-realtime:    vdso: 560 nsec/call
clock-gettime-realtime:    vdso: 1192 nsec/call
clock-getres-boottime:    vdso: 560 nsec/call
clock-gettime-boottime:    vdso: 1194 nsec/call
clock-getres-tai:    vdso: 560 nsec/call
clock-gettime-tai:    vdso: 1192 nsec/call
clock-getres-monotonic-raw:    vdso: 560 nsec/call
clock-gettime-monotonic-raw:    vdso: 1248 nsec/call
clock-getres-monotonic-coarse:    vdso: 614 nsec/call
clock-gettime-monotonic-coarse:    vdso: 760 nsec/call
clock-getres-monotonic:    vdso: 560 nsec/call
clock-gettime-monotonic:    vdso: 1192 nsec/call

On a powerpc 8321 running at 333MHz
With current powerpc/32 ASM VDSO:

gettimeofday:    vdso: 190 nsec/call
clock-getres-realtime-coarse:    vdso: 1449 nsec/call
clock-gettime-realtime-coarse:    vdso: 1352 nsec/call
clock-getres-realtime:    vdso: 135 nsec/call
clock-gettime-realtime:    vdso: 244 nsec/call
clock-getres-boottime:    vdso: 1313 nsec/call
clock-gettime-boottime:    vdso: 1701 nsec/call
clock-getres-tai:    vdso: 1268 nsec/call
clock-gettime-tai:    vdso: 1742 nsec/call
clock-getres-monotonic-raw:    vdso: 1310 nsec/call
clock-gettime-monotonic-raw:    vdso: 1584 nsec/call
clock-getres-monotonic-coarse:    vdso: 1488 nsec/call
clock-gettime-monotonic-coarse:    vdso: 1503 nsec/call
clock-getres-monotonic:    vdso: 135 nsec/call
clock-gettime-monotonic:    vdso: 283 nsec/call

Once switched to C implementation:

gettimeofday:    vdso: 347 nsec/call
clock-getres-realtime-coarse:    vdso: 169 nsec/call
clock-gettime-realtime-coarse:    vdso: 271 nsec/call
clock-getres-realtime:    vdso: 150 nsec/call
clock-gettime-realtime:    vdso: 383 nsec/call
clock-getres-boottime:    vdso: 157 nsec/call
clock-gettime-boottime:    vdso: 377 nsec/call
clock-getres-tai:    vdso: 150 nsec/call
clock-gettime-tai:    vdso: 380 nsec/call
clock-getres-monotonic-raw:    vdso: 153 nsec/call
clock-gettime-monotonic-raw:    vdso: 407 nsec/call
clock-getres-monotonic-coarse:    vdso: 169 nsec/call
clock-gettime-monotonic-coarse:    vdso: 271 nsec/call
clock-getres-monotonic:    vdso: 153 nsec/call
clock-gettime-monotonic:    vdso: 377 nsec/call
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
 arch/powerpc/Kconfig                         |   2 +
 arch/powerpc/include/asm/vdso/gettimeofday.h |  45 +++++
 arch/powerpc/include/asm/vdso/vsyscall.h     |  27 +++
 arch/powerpc/include/asm/vdso_datapage.h     |  18 +-
 arch/powerpc/kernel/asm-offsets.c            |  23 +--
 arch/powerpc/kernel/time.c                   |  92 +---------
 arch/powerpc/kernel/vdso.c                   |  19 +-
 arch/powerpc/kernel/vdso32/Makefile          |  19 +-
 arch/powerpc/kernel/vdso32/gettimeofday.S    | 261 ++++-----------------------
 arch/powerpc/kernel/vdso32/vgettimeofday.c   |  32 ++++
 10 files changed, 178 insertions(+), 360 deletions(-)
 create mode 100644 arch/powerpc/include/asm/vdso/gettimeofday.h
 create mode 100644 arch/powerpc/include/asm/vdso/vsyscall.h
 create mode 100644 arch/powerpc/kernel/vdso32/vgettimeofday.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 1ec34e16ed65..bd04c68baf91 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -169,6 +169,7 @@ config PPC
 	select GENERIC_STRNCPY_FROM_USER
 	select GENERIC_STRNLEN_USER
 	select GENERIC_TIME_VSYSCALL
+	select GENERIC_GETTIMEOFDAY
 	select HAVE_ARCH_AUDITSYSCALL
 	select HAVE_ARCH_HUGE_VMAP		if PPC_BOOK3S_64 && PPC_RADIX_MMU
 	select HAVE_ARCH_JUMP_LABEL
@@ -198,6 +199,7 @@ config PPC
 	select HAVE_FUNCTION_GRAPH_TRACER
 	select HAVE_FUNCTION_TRACER
 	select HAVE_GCC_PLUGINS			if GCC_VERSION >= 50200   # plugin support on gcc <= 5.1 is buggy on PPC
+	select HAVE_GENERIC_VDSO
 	select HAVE_HW_BREAKPOINT		if PERF_EVENTS && (PPC_BOOK3S || PPC_8xx)
 	select HAVE_IDE
 	select HAVE_IOREMAP_PROT
diff --git a/arch/powerpc/include/asm/vdso/gettimeofday.h b/arch/powerpc/include/asm/vdso/gettimeofday.h
new file mode 100644
index 000000000000..e170e12a78bb
--- /dev/null
+++ b/arch/powerpc/include/asm/vdso/gettimeofday.h
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_VDSO_GETTIMEOFDAY_H
+#define __ASM_VDSO_GETTIMEOFDAY_H
+
+#ifndef __ASSEMBLY__
+
+#include <asm/time.h>
+#include <asm/unistd.h>
+#include <uapi/linux/time.h>
+
+#define VDSO_HAS_CLOCK_GETRES		1
+
+#define VDSO_HAS_TIME			1
+
+static __always_inline u64 __arch_get_hw_counter(s32 clock_mode)
+{
+	/*
+	 * clock_mode == 0 implies that vDSO are enabled otherwise
+	 * fallback on syscall.
+	 */
+	if (clock_mode)
+		return ULLONG_MAX;
+
+	return get_tb() & LLONG_MAX;
+}
+
+/*
+ * powerpc specific delta calculation.
+ *
+ * This variant removes the masking of the subtraction because the
+ * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
+ * which would result in a pointless operation. The compiler cannot
+ * optimize it away as the mask comes from the vdso data and is not compile
+ * time constant.
+ */
+static __always_inline
+u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
+{
+	return (cycles - last) * mult;
+}
+#define vdso_calc_delta vdso_calc_delta
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_GETTIMEOFDAY_H */
diff --git a/arch/powerpc/include/asm/vdso/vsyscall.h b/arch/powerpc/include/asm/vdso/vsyscall.h
new file mode 100644
index 000000000000..d12c2298cbb8
--- /dev/null
+++ b/arch/powerpc/include/asm/vdso/vsyscall.h
@@ -0,0 +1,27 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __ASM_VDSO_VSYSCALL_H
+#define __ASM_VDSO_VSYSCALL_H
+
+#ifndef __ASSEMBLY__
+
+#include <linux/timekeeper_internal.h>
+#include <asm/vdso_datapage.h>
+
+extern struct vdso_arch_data *vdso_arch_data;
+
+/*
+ * Update the vDSO data page to keep in sync with kernel timekeeping.
+ */
+static __always_inline
+struct vdso_data *__powerpc_get_k_vdso_data(void)
+{
+	return vdso_arch_data->data;
+}
+#define __arch_get_k_vdso_data __powerpc_get_k_vdso_data
+
+/* The asm-generic header needs to be included after the definitions above */
+#include <asm-generic/vdso/vsyscall.h>
+
+#endif /* !__ASSEMBLY__ */
+
+#endif /* __ASM_VDSO_VSYSCALL_H */
diff --git a/arch/powerpc/include/asm/vdso_datapage.h b/arch/powerpc/include/asm/vdso_datapage.h
index ee5319a6f4e3..6b9424ca91c4 100644
--- a/arch/powerpc/include/asm/vdso_datapage.h
+++ b/arch/powerpc/include/asm/vdso_datapage.h
@@ -36,6 +36,7 @@
 
 #include <linux/unistd.h>
 #include <linux/time.h>
+#include <vdso/datapage.h>
 
 #define SYSCALL_MAP_SIZE      ((NR_syscalls + 31) / 32)
 
@@ -93,20 +94,9 @@ struct vdso_data {
 /*
  * And here is the simpler 32 bits version
  */
-struct vdso_data {
-	__u64 tb_orig_stamp;		/* Timebase at boot		0x30 */
+struct vdso_arch_data {
+	struct vdso_data data[CS_BASES];
 	__u64 tb_ticks_per_sec;		/* Timebase tics / sec		0x38 */
-	__u64 tb_to_xs;			/* Inverse of TB to 2^20	0x40 */
-	__u64 stamp_xsec;		/*				0x48 */
-	__u32 tb_update_count;		/* Timebase atomicity ctr	0x50 */
-	__u32 tz_minuteswest;		/* Minutes west of Greenwich	0x58 */
-	__u32 tz_dsttime;		/* Type of dst correction	0x5C */
-	__s32 wtom_clock_sec;			/* Wall to monotonic clock */
-	__s32 wtom_clock_nsec;
-	__s32 stamp_xtime_sec;		/* xtime seconds as at tb_orig_stamp */
-	__s32 stamp_xtime_nsec;		/* xtime nsecs as at tb_orig_stamp */
-	__u32 stamp_sec_fraction;	/* fractional seconds of stamp_xtime */
-	__u32 hrtimer_res;		/* hrtimer resolution */
    	__u32 syscall_map_32[SYSCALL_MAP_SIZE]; /* map of syscalls */
 	__u32 dcache_block_size;	/* L1 d-cache block size     */
 	__u32 icache_block_size;	/* L1 i-cache block size     */
@@ -116,7 +106,7 @@ struct vdso_data {
 
 #endif /* CONFIG_PPC64 */
 
-extern struct vdso_data *vdso_data;
+extern struct vdso_arch_data *vdso_arch_data;
 
 #else /* __ASSEMBLY__ */
 
diff --git a/arch/powerpc/kernel/asm-offsets.c b/arch/powerpc/kernel/asm-offsets.c
index 3d47aec7becf..5927c9cf9cc7 100644
--- a/arch/powerpc/kernel/asm-offsets.c
+++ b/arch/powerpc/kernel/asm-offsets.c
@@ -376,23 +376,12 @@ int main(void)
 #endif /* ! CONFIG_PPC64 */
 
 	/* datapage offsets for use by vdso */
-	OFFSET(CFG_TB_ORIG_STAMP, vdso_data, tb_orig_stamp);
-	OFFSET(CFG_TB_TICKS_PER_SEC, vdso_data, tb_ticks_per_sec);
-	OFFSET(CFG_TB_TO_XS, vdso_data, tb_to_xs);
-	OFFSET(CFG_TB_UPDATE_COUNT, vdso_data, tb_update_count);
-	OFFSET(CFG_TZ_MINUTEWEST, vdso_data, tz_minuteswest);
-	OFFSET(CFG_TZ_DSTTIME, vdso_data, tz_dsttime);
-	OFFSET(CFG_SYSCALL_MAP32, vdso_data, syscall_map_32);
-	OFFSET(WTOM_CLOCK_SEC, vdso_data, wtom_clock_sec);
-	OFFSET(WTOM_CLOCK_NSEC, vdso_data, wtom_clock_nsec);
-	OFFSET(STAMP_XTIME_SEC, vdso_data, stamp_xtime_sec);
-	OFFSET(STAMP_XTIME_NSEC, vdso_data, stamp_xtime_nsec);
-	OFFSET(STAMP_SEC_FRAC, vdso_data, stamp_sec_fraction);
-	OFFSET(CLOCK_HRTIMER_RES, vdso_data, hrtimer_res);
-	OFFSET(CFG_ICACHE_BLOCKSZ, vdso_data, icache_block_size);
-	OFFSET(CFG_DCACHE_BLOCKSZ, vdso_data, dcache_block_size);
-	OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_data, icache_log_block_size);
-	OFFSET(CFG_DCACHE_LOGBLOCKSZ, vdso_data, dcache_log_block_size);
+	OFFSET(CFG_TB_TICKS_PER_SEC, vdso_arch_data, tb_ticks_per_sec);
+	OFFSET(CFG_SYSCALL_MAP32, vdso_arch_data, syscall_map_32);
+	OFFSET(CFG_ICACHE_BLOCKSZ, vdso_arch_data, icache_block_size);
+	OFFSET(CFG_DCACHE_BLOCKSZ, vdso_arch_data, dcache_block_size);
+	OFFSET(CFG_ICACHE_LOGBLOCKSZ, vdso_arch_data, icache_log_block_size);
+	OFFSET(CFG_DCACHE_LOGBLOCKSZ, vdso_arch_data, dcache_log_block_size);
 #ifdef CONFIG_PPC64
 	OFFSET(CFG_SYSCALL_MAP64, vdso_data, syscall_map_64);
 	OFFSET(TVAL64_TV_SEC, __kernel_old_timeval, tv_sec);
diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index 1168e8b37e30..e0c8023db42c 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -882,95 +882,6 @@ static notrace u64 timebase_read(struct clocksource *cs)
 	return (u64)get_tb();
 }
 
-
-void update_vsyscall(struct timekeeper *tk)
-{
-	struct timespec64 xt;
-	struct clocksource *clock = tk->tkr_mono.clock;
-	u32 mult = tk->tkr_mono.mult;
-	u32 shift = tk->tkr_mono.shift;
-	u64 cycle_last = tk->tkr_mono.cycle_last;
-	u64 new_tb_to_xs, new_stamp_xsec;
-	u64 frac_sec;
-
-	if (clock != &clocksource_timebase)
-		return;
-
-	xt.tv_sec = tk->xtime_sec;
-	xt.tv_nsec = (long)(tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift);
-
-	/* Make userspace gettimeofday spin until we're done. */
-	++vdso_data->tb_update_count;
-	smp_mb();
-
-	/*
-	 * This computes ((2^20 / 1e9) * mult) >> shift as a
-	 * 0.64 fixed-point fraction.
-	 * The computation in the else clause below won't overflow
-	 * (as long as the timebase frequency is >= 1.049 MHz)
-	 * but loses precision because we lose the low bits of the constant
-	 * in the shift.  Note that 19342813113834067 ~= 2^(20+64) / 1e9.
-	 * For a shift of 24 the error is about 0.5e-9, or about 0.5ns
-	 * over a second.  (Shift values are usually 22, 23 or 24.)
-	 * For high frequency clocks such as the 512MHz timebase clock
-	 * on POWER[6789], the mult value is small (e.g. 32768000)
-	 * and so we can shift the constant by 16 initially
-	 * (295147905179 ~= 2^(20+64-16) / 1e9) and then do the
-	 * remaining shifts after the multiplication, which gives a
-	 * more accurate result (e.g. with mult = 32768000, shift = 24,
-	 * the error is only about 1.2e-12, or 0.7ns over 10 minutes).
-	 */
-	if (mult <= 62500000 && clock->shift >= 16)
-		new_tb_to_xs = ((u64) mult * 295147905179ULL) >> (clock->shift - 16);
-	else
-		new_tb_to_xs = (u64) mult * (19342813113834067ULL >> clock->shift);
-
-	/*
-	 * Compute the fractional second in units of 2^-32 seconds.
-	 * The fractional second is tk->tkr_mono.xtime_nsec >> tk->tkr_mono.shift
-	 * in nanoseconds, so multiplying that by 2^32 / 1e9 gives
-	 * it in units of 2^-32 seconds.
-	 * We assume shift <= 32 because clocks_calc_mult_shift()
-	 * generates shift values in the range 0 - 32.
-	 */
-	frac_sec = tk->tkr_mono.xtime_nsec << (32 - shift);
-	do_div(frac_sec, NSEC_PER_SEC);
-
-	/*
-	 * Work out new stamp_xsec value for any legacy users of systemcfg.
-	 * stamp_xsec is in units of 2^-20 seconds.
-	 */
-	new_stamp_xsec = frac_sec >> 12;
-	new_stamp_xsec += tk->xtime_sec * XSEC_PER_SEC;
-
-	/*
-	 * tb_update_count is used to allow the userspace gettimeofday code
-	 * to assure itself that it sees a consistent view of the tb_to_xs and
-	 * stamp_xsec variables.  It reads the tb_update_count, then reads
-	 * tb_to_xs and stamp_xsec and then reads tb_update_count again.  If
-	 * the two values of tb_update_count match and are even then the
-	 * tb_to_xs and stamp_xsec values are consistent.  If not, then it
-	 * loops back and reads them again until this criteria is met.
-	 */
-	vdso_data->tb_orig_stamp = cycle_last;
-	vdso_data->stamp_xsec = new_stamp_xsec;
-	vdso_data->tb_to_xs = new_tb_to_xs;
-	vdso_data->wtom_clock_sec = tk->wall_to_monotonic.tv_sec;
-	vdso_data->wtom_clock_nsec = tk->wall_to_monotonic.tv_nsec;
-	vdso_data->stamp_xtime_sec = xt.tv_sec;
-	vdso_data->stamp_xtime_nsec = xt.tv_nsec;
-	vdso_data->stamp_sec_fraction = frac_sec;
-	vdso_data->hrtimer_res = hrtimer_resolution;
-	smp_wmb();
-	++(vdso_data->tb_update_count);
-}
-
-void update_vsyscall_tz(void)
-{
-	vdso_data->tz_minuteswest = sys_tz.tz_minuteswest;
-	vdso_data->tz_dsttime = sys_tz.tz_dsttime;
-}
-
 static void __init clocksource_init(void)
 {
 	struct clocksource *clock;
@@ -1140,8 +1051,7 @@ void __init time_init(void)
 		sys_tz.tz_dsttime = 0;
 	}
 
-	vdso_data->tb_update_count = 0;
-	vdso_data->tb_ticks_per_sec = tb_ticks_per_sec;
+	vdso_arch_data->tb_ticks_per_sec = tb_ticks_per_sec;
 
 	/* initialise and enable the large decrementer (if we have one) */
 	set_decrementer_max();
diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c
index eae9ddaecbcf..d1e4f3a3a781 100644
--- a/arch/powerpc/kernel/vdso.c
+++ b/arch/powerpc/kernel/vdso.c
@@ -17,6 +17,7 @@
 #include <linux/elf.h>
 #include <linux/security.h>
 #include <linux/memblock.h>
+#include <vdso/datapage.h>
 
 #include <asm/pgtable.h>
 #include <asm/processor.h>
@@ -71,10 +72,10 @@ static int vdso_ready;
  * with it, it will become dynamically allocated
  */
 static union {
-	struct vdso_data	data;
+	struct vdso_arch_data	arch_data;
 	u8			page[PAGE_SIZE];
 } vdso_data_store __page_aligned_data;
-struct vdso_data *vdso_data = &vdso_data_store.data;
+struct vdso_arch_data *vdso_arch_data = &vdso_data_store.arch_data;
 
 /* Format of the patch table */
 struct vdso_patch_def
@@ -661,7 +662,7 @@ static void __init vdso_setup_syscall_map(void)
 				0x80000000UL >> (i & 0x1f);
 #else /* CONFIG_PPC64 */
 		if (sys_call_table[i] != sys_ni_syscall)
-			vdso_data->syscall_map_32[i >> 5] |=
+			vdso_arch_data->syscall_map_32[i >> 5] |=
 				0x80000000UL >> (i & 0x1f);
 #endif /* CONFIG_PPC64 */
 	}
@@ -729,10 +730,10 @@ static int __init vdso_init(void)
 	vdso64_pages = (&vdso64_end - &vdso64_start) >> PAGE_SHIFT;
 	DBG("vdso64_kbase: %p, 0x%x pages\n", vdso64_kbase, vdso64_pages);
 #else
-	vdso_data->dcache_block_size = L1_CACHE_BYTES;
-	vdso_data->dcache_log_block_size = L1_CACHE_SHIFT;
-	vdso_data->icache_block_size = L1_CACHE_BYTES;
-	vdso_data->icache_log_block_size = L1_CACHE_SHIFT;
+	vdso_arch_data->dcache_block_size = L1_CACHE_BYTES;
+	vdso_arch_data->dcache_log_block_size = L1_CACHE_SHIFT;
+	vdso_arch_data->icache_block_size = L1_CACHE_BYTES;
+	vdso_arch_data->icache_log_block_size = L1_CACHE_SHIFT;
 #endif /* CONFIG_PPC64 */
 
 
@@ -775,7 +776,7 @@ static int __init vdso_init(void)
 		get_page(pg);
 		vdso32_pagelist[i] = pg;
 	}
-	vdso32_pagelist[i++] = virt_to_page(vdso_data);
+	vdso32_pagelist[i++] = virt_to_page(vdso_arch_data);
 	vdso32_pagelist[i] = NULL;
 #endif
 
@@ -792,7 +793,7 @@ static int __init vdso_init(void)
 	vdso64_pagelist[i] = NULL;
 #endif /* CONFIG_PPC64 */
 
-	get_page(virt_to_page(vdso_data));
+	get_page(virt_to_page(vdso_arch_data));
 
 	smp_wmb();
 	vdso_ready = 1;
diff --git a/arch/powerpc/kernel/vdso32/Makefile b/arch/powerpc/kernel/vdso32/Makefile
index 06f54d947057..09edcd1a2dc7 100644
--- a/arch/powerpc/kernel/vdso32/Makefile
+++ b/arch/powerpc/kernel/vdso32/Makefile
@@ -2,10 +2,17 @@
 
 # List of files in the vdso, has to be asm only for now
 
+ARCH_REL_TYPE_ABS := R_PPC_JUMP_SLOT|R_PPC_GLOB_DAT|R_PPC_ADDR32|R_PPC_ADDR24|R_PPC_ADDR16|R_PPC_ADDR16_LO|R_PPC_ADDR16_HI|R_PPC_ADDR16_HA|R_PPC_ADDR14|R_PPC_ADDR14_BRTAKEN|R_PPC_ADDR14_BRNTAKEN
+include $(srctree)/lib/vdso/Makefile
+
 obj-vdso32-$(CONFIG_PPC64) = getcpu.o
 obj-vdso32 = sigtramp.o gettimeofday.o datapage.o cacheflush.o note.o \
 		$(obj-vdso32-y)
 
+ifneq ($(c-gettimeofday-y),)
+  CFLAGS_vgettimeofday.o += -include $(c-gettimeofday-y)
+endif
+
 # Build rules
 
 ifdef CROSS32_COMPILE
@@ -38,8 +45,8 @@ CPPFLAGS_vdso32.lds += -P -C -Upowerpc
 $(obj)/vdso32_wrapper.o : $(obj)/vdso32.so
 
 # link rule for the .so file, .lds has to be first
-$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) FORCE
-	$(call if_changed,vdso32ld)
+$(obj)/vdso32.so.dbg: $(src)/vdso32.lds $(obj-vdso32) $(obj)/vgettimeofday.o FORCE
+	$(call if_changed,vdso32ld_and_check)
 
 # strip rule for the .so file
 $(obj)/%.so: OBJCOPYFLAGS := -S
@@ -49,12 +56,16 @@ $(obj)/%.so: $(obj)/%.so.dbg FORCE
 # assembly rules for the .S files
 $(obj-vdso32): %.o: %.S FORCE
 	$(call if_changed_dep,vdso32as)
+$(obj)/vgettimeofday.o: %.o: %.c FORCE
+	$(call if_changed_dep,vdso32cc)
 
 # actual build commands
-quiet_cmd_vdso32ld = VDSO32L $@
-      cmd_vdso32ld = $(VDSOCC) $(c_flags) $(CC32FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^)
+quiet_cmd_vdso32ld_and_check = VDSO32L $@
+      cmd_vdso32ld_and_check = $(VDSOCC) $(c_flags) $(CC32FLAGS) -o $@ -Wl,-T$(filter %.lds,$^) $(filter %.o,$^) ; $(cmd_vdso_check)
 quiet_cmd_vdso32as = VDSO32A $@
       cmd_vdso32as = $(VDSOCC) $(a_flags) $(CC32FLAGS) -c -o $@ $<
+quiet_cmd_vdso32cc = VDSO32A $@
+      cmd_vdso32cc = $(VDSOCC) $(c_flags) $(CC32FLAGS) -c -o $@ $<
 
 # install commands for the unstripped file
 quiet_cmd_vdso_install = INSTALL $@
diff --git a/arch/powerpc/kernel/vdso32/gettimeofday.S b/arch/powerpc/kernel/vdso32/gettimeofday.S
index d6c1d331e8cb..d767d0089ad0 100644
--- a/arch/powerpc/kernel/vdso32/gettimeofday.S
+++ b/arch/powerpc/kernel/vdso32/gettimeofday.S
@@ -13,14 +13,32 @@
 #include <asm/asm-offsets.h>
 #include <asm/unistd.h>
 
-/* Offset for the low 32-bit part of a field of long type */
-#ifdef CONFIG_PPC64
-#define LOPART	4
-#else
-#define LOPART	0
-#endif
-
 	.text
+
+.macro cvdso_call_with_fallback funct, syscall
+	stwu	r1, -16(r1)
+	mflr	r0
+	stw	r0, 20(r1)
+	stw	r3, 8(r1)
+	stw	r4, 12(r1)
+	get_datapage	r5, r0
+	bl	\funct
+	lwz	r0, 20(r1)
+	cmpwi	r3, 0
+	mtlr	r0
+	bne-	99f
+	addi	r1, r1, 16
+	crclr	so
+	blr
+99:
+	lwz	r3, 8(r1)
+	lwz	r4, 12(r1)
+	li	r0, \syscall
+	addi	r1, r1, 16
+	sc
+	blr
+.endm
+
 /*
  * Exact prototype of gettimeofday
  *
@@ -29,31 +47,7 @@
  */
 V_FUNCTION_BEGIN(__kernel_gettimeofday)
   .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-
-	mr	r10,r3			/* r10 saves tv */
-	mr	r11,r4			/* r11 saves tz */
-	get_datapage	r9, r0
-	cmplwi	r10,0			/* check if tv is NULL */
-	beq	3f
-	lis	r7,1000000@ha		/* load up USEC_PER_SEC */
-	addi	r7,r7,1000000@l		/* so we get microseconds in r4 */
-	bl	__do_get_tspec@local	/* get sec/usec from tb & kernel */
-	stw	r3,TVAL32_TV_SEC(r10)
-	stw	r4,TVAL32_TV_USEC(r10)
-
-3:	cmplwi	r11,0			/* check if tz is NULL */
-	beq	1f
-	lwz	r4,CFG_TZ_MINUTEWEST(r9)/* fill tz */
-	lwz	r5,CFG_TZ_DSTTIME(r9)
-	stw	r4,TZONE_TZ_MINWEST(r11)
-	stw	r5,TZONE_TZ_DSTTIME(r11)
-
-1:	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
+	cvdso_call_with_fallback __c_kernel_gettimeofday, __NR_gettimeofday
   .cfi_endproc
 V_FUNCTION_END(__kernel_gettimeofday)
 
@@ -65,76 +59,7 @@ V_FUNCTION_END(__kernel_gettimeofday)
  */
 V_FUNCTION_BEGIN(__kernel_clock_gettime)
   .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpli	cr0,r3,CLOCK_REALTIME
-	cmpli	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	mflr	r12			/* r12 saves lr */
-  .cfi_register lr,r12
-	mr	r11,r4			/* r11 saves tp */
-	get_datapage	r9, r0
-	lis	r7,NSEC_PER_SEC@h	/* want nanoseconds */
-	ori	r7,r7,NSEC_PER_SEC@l
-50:	bl	__do_get_tspec@local	/* get sec/nsec from tb & kernel */
-	bne	cr1,80f			/* not monotonic -> all done */
-
-	/*
-	 * CLOCK_MONOTONIC
-	 */
-
-	/* now we must fixup using wall to monotonic. We need to snapshot
-	 * that value and do the counter trick again. Fortunately, we still
-	 * have the counter value in r8 that was returned by __do_get_xsec.
-	 * At this point, r3,r4 contain our sec/nsec values, r5 and r6
-	 * can be used, r7 contains NSEC_PER_SEC.
-	 */
-
-	lwz	r5,(WTOM_CLOCK_SEC+LOPART)(r9)
-	lwz	r6,WTOM_CLOCK_NSEC(r9)
-
-	/* We now have our offset in r5,r6. We create a fake dependency
-	 * on that value and re-check the counter
-	 */
-	or	r0,r6,r5
-	xor	r0,r0,r0
-	add	r9,r9,r0
-	lwz	r0,(CFG_TB_UPDATE_COUNT+LOPART)(r9)
-        cmpl    cr0,r8,r0		/* check if updated */
-	bne-	50b
-
-	/* Calculate and store result. Note that this mimics the C code,
-	 * which may cause funny results if nsec goes negative... is that
-	 * possible at all ?
-	 */
-	add	r3,r3,r5
-	add	r4,r4,r6
-	cmpw	cr0,r4,r7
-	cmpwi	cr1,r4,0
-	blt	1f
-	subf	r4,r7,r4
-	addi	r3,r3,1
-1:	bge	cr1,80f
-	addi	r3,r3,-1
-	add	r4,r4,r7
-
-80:	stw	r3,TSPC32_TV_SEC(r11)
-	stw	r4,TSPC32_TV_NSEC(r11)
-
-	mtlr	r12
-	crclr	cr0*4+so
-	li	r3,0
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-99:
-	li	r0,__NR_clock_gettime
-  .cfi_restore lr
-	sc
-	blr
+	cvdso_call_with_fallback __c_kernel_clock_gettime, __NR_clock_gettime
   .cfi_endproc
 V_FUNCTION_END(__kernel_clock_gettime)
 
@@ -147,32 +72,7 @@ V_FUNCTION_END(__kernel_clock_gettime)
  */
 V_FUNCTION_BEGIN(__kernel_clock_getres)
   .cfi_startproc
-	/* Check for supported clock IDs */
-	cmpwi	cr0,r3,CLOCK_REALTIME
-	cmpwi	cr1,r3,CLOCK_MONOTONIC
-	cror	cr0*4+eq,cr0*4+eq,cr1*4+eq
-	bne	cr0,99f
-
-	mflr	r12
-  .cfi_register lr,r12
-	get_datapage	r3, r0
-	lwz	r5, CLOCK_HRTIMER_RES(r3)
-	mtlr	r12
-	li	r3,0
-	cmpli	cr0,r4,0
-	crclr	cr0*4+so
-	beqlr
-	stw	r3,TSPC32_TV_SEC(r4)
-	stw	r5,TSPC32_TV_NSEC(r4)
-	blr
-
-	/*
-	 * syscall fallback
-	 */
-99:
-	li	r0,__NR_clock_getres
-	sc
-	blr
+	cvdso_call_with_fallback __c_kernel_clock_getres, __NR_clock_getres
   .cfi_endproc
 V_FUNCTION_END(__kernel_clock_getres)
 
@@ -185,104 +85,15 @@ V_FUNCTION_END(__kernel_clock_getres)
  */
 V_FUNCTION_BEGIN(__kernel_time)
   .cfi_startproc
-	mflr	r12
-  .cfi_register lr,r12
-
-	mr	r11,r3			/* r11 holds t */
-	get_datapage	r9, r0
-
-	lwz	r3,STAMP_XTIME_SEC+LOPART(r9)
-
-	cmplwi	r11,0			/* check if t is NULL */
-	beq	2f
-	stw	r3,0(r11)		/* store result at *t */
-2:	mtlr	r12
+	stwu	r1, -16(r1)
+	mflr	r0
+	stw	r0, 20(r1)
+	get_datapage	r4, r0
+	bl	__c_kernel_time
+	lwz	r0, 20(r1)
 	crclr	cr0*4+so
+	mtlr	r0
+	addi	r1, r1, 16
 	blr
   .cfi_endproc
 V_FUNCTION_END(__kernel_time)
-
-/*
- * This is the core of clock_gettime() and gettimeofday(),
- * it returns the current time in r3 (seconds) and r4.
- * On entry, r7 gives the resolution of r4, either USEC_PER_SEC
- * or NSEC_PER_SEC, giving r4 in microseconds or nanoseconds.
- * It expects the datapage ptr in r9 and doesn't clobber it.
- * It clobbers r0, r5 and r6.
- * On return, r8 contains the counter value that can be reused.
- * This clobbers cr0 but not any other cr field.
- */
-__do_get_tspec:
-  .cfi_startproc
-	/* Check for update count & load values. We use the low
-	 * order 32 bits of the update count
-	 */
-1:	lwz	r8,(CFG_TB_UPDATE_COUNT+LOPART)(r9)
-	andi.	r0,r8,1			/* pending update ? loop */
-	bne-	1b
-	xor	r0,r8,r8		/* create dependency */
-	add	r9,r9,r0
-
-	/* Load orig stamp (offset to TB) */
-	lwz	r5,CFG_TB_ORIG_STAMP(r9)
-	lwz	r6,(CFG_TB_ORIG_STAMP+4)(r9)
-
-	/* Get a stable TB value */
-2:	MFTBU(r3)
-	MFTBL(r4)
-	MFTBU(r0)
-	cmplw	cr0,r3,r0
-	bne-	2b
-
-	/* Subtract tb orig stamp and shift left 12 bits.
-	 */
-	subfc	r4,r6,r4
-	subfe	r0,r5,r3
-	slwi	r0,r0,12
-	rlwimi.	r0,r4,12,20,31
-	slwi	r4,r4,12
-
-	/*
-	 * Load scale factor & do multiplication.
-	 * We only use the high 32 bits of the tb_to_xs value.
-	 * Even with a 1GHz timebase clock, the high 32 bits of
-	 * tb_to_xs will be at least 4 million, so the error from
-	 * ignoring the low 32 bits will be no more than 0.25ppm.
-	 * The error will just make the clock run very very slightly
-	 * slow until the next time the kernel updates the VDSO data,
-	 * at which point the clock will catch up to the kernel's value,
-	 * so there is no long-term error accumulation.
-	 */
-	lwz	r5,CFG_TB_TO_XS(r9)	/* load values */
-	mulhwu	r4,r4,r5
-	li	r3,0
-
-	beq+	4f			/* skip high part computation if 0 */
-	mulhwu	r3,r0,r5
-	mullw	r5,r0,r5
-	addc	r4,r4,r5
-	addze	r3,r3
-4:
-	/* At this point, we have seconds since the xtime stamp
-	 * as a 32.32 fixed-point number in r3 and r4.
-	 * Load & add the xtime stamp.
-	 */
-	lwz	r5,STAMP_XTIME_SEC+LOPART(r9)
-	lwz	r6,STAMP_SEC_FRAC(r9)
-	addc	r4,r4,r6
-	adde	r3,r3,r5
-
-	/* We create a fake dependency on the result in r3/r4
-	 * and re-check the counter
-	 */
-	or	r6,r4,r3
-	xor	r0,r6,r6
-	add	r9,r9,r0
-	lwz	r0,(CFG_TB_UPDATE_COUNT+LOPART)(r9)
-        cmplw	cr0,r8,r0		/* check if updated */
-	bne-	1b
-
-	mulhwu	r4,r4,r7		/* convert to micro or nanoseconds */
-
-	blr
-  .cfi_endproc
diff --git a/arch/powerpc/kernel/vdso32/vgettimeofday.c b/arch/powerpc/kernel/vdso32/vgettimeofday.c
new file mode 100644
index 000000000000..290b4415bf85
--- /dev/null
+++ b/arch/powerpc/kernel/vdso32/vgettimeofday.c
@@ -0,0 +1,32 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * ARM64 userspace implementations of gettimeofday() and similar.
+ *
+ * Copyright (C) 2018 ARM Limited
+ *
+ */
+#include <linux/time.h>
+#include <linux/types.h>
+
+int __c_kernel_clock_gettime(clockid_t clock, struct old_timespec32 *ts,
+			     const struct vdso_data *vd)
+{
+	return __cvdso_clock_gettime32(vd, clock, ts);
+}
+
+int __c_kernel_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz,
+			    const struct vdso_data *vd)
+{
+	return __cvdso_gettimeofday(vd, tv, tz);
+}
+
+int __c_kernel_clock_getres(clockid_t clock_id, struct old_timespec32 *res,
+			    const struct vdso_data *vd)
+{
+	return __cvdso_clock_getres_time32(vd, clock_id, res);
+}
+
+time_t __c_kernel_time(time_t *time, const struct vdso_data *vd)
+{
+	return __cvdso_time(vd, time);
+}
-- 
2.13.3


^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
@ 2019-12-24  1:58   ` Andy Lutomirski
  2019-12-24 11:12     ` christophe leroy
  2020-01-10 21:12   ` Thomas Gleixner
  1 sibling, 1 reply; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  1:58 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> READ_ONCE() forces the read of the 64 bit value of
> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
> only the lower part is needed.

Seems reasonable and very unlikely to be harmful.  That being said,
this function really ought to be considered deprecated -- 32-bit
time_t is insufficient.

Do you get even better code if you move the read into the if statement?

Reviewed-by: Andy Lutomirski <luto@kernel.org>

--Andy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres()
  2019-12-23 14:31 ` [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres() Christophe Leroy
@ 2019-12-24  1:59   ` Andy Lutomirski
  0 siblings, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  1:59 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> VDSO_HRES and VDSO_RAW clocks are handled the same way.
>
> Don't duplicate code.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Reviewed-by: Andy Lutomirski <luto@kernel.org>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
@ 2019-12-24  2:07   ` Andy Lutomirski
  2020-01-10 20:56     ` Thomas Gleixner
  2019-12-30 12:27   ` Arnd Bergmann
  1 sibling, 1 reply; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  2:07 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> In order to simplify next step which moves fallback call at arch
> level, ensure all arches have a 32bit fallback instead of handling
> the lack of 32bit fallback in the common code based
> on VDSO_HAS_32BIT_FALLBACK

I don't like this.  You've implemented what appear to be nonsensical
fallbacks (the 32-bit fallback for a 64-bit vDSO build?  There's no
such thing).

How exactly does this simplify patch 2?

--Andy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code.
  2019-12-23 14:31 ` [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code Christophe Leroy
@ 2019-12-24  2:24   ` Andy Lutomirski
  2019-12-24 11:41     ` christophe leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  2:24 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> On powerpc, VDSO functions and syscalls cannot be implemented in C
> because the Linux kernel ABI requires that CR[SO] bit is set in case
> of error and cleared when no error.
>
> As this cannot be done in C, C VDSO functions and syscall'based
> fallback need a trampoline in ASM.
>
> By moving the fallback calls out of the common code, arches like
> powerpc can implement both the call to C VDSO and the fallback call
> in a single trampoline function.

Maybe the issue is that I'm not a powerpc person, but I don't
understand this.  The common vDSO code is in C.  Presumably this means
that you need an asm trampoline no matter what to call the C code.  Is
the improvement that, with this change, you can have the asm
trampoline do a single branch, so it's logically:

ret = [call the C code];
if (ret == 0) {
 set success bit;
} else {
 ret = fallback;
 if (ret == 0)
  set success bit;
else
  set failure bit;
}

return ret;

instead of:

ret = [call the C code, which includes the fallback];
if (ret == 0)
  set success bit;
else
  set failure bit;

It's not obvious to me that the former ought to be faster.

>
> The two advantages are:
> - No need play back and forth with CR[SO] and negative return value.
> - No stack frame is required in VDSO C functions for the fallbacks.

How is no stack frame required?  Do you mean that the presence of the
fallback causes worse code generation?  Can you improve the fallback
instead?

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-23 14:31 ` [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch Christophe Leroy
@ 2019-12-24  2:27   ` Andy Lutomirski
  2019-12-24 11:53     ` christophe leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  2:27 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> On powerpc, __arch_get_vdso_data() clobbers the link register,
> requiring the caller to set a stack frame in order to save it.
>
> As the parent function already has to set a stack frame and save
> the link register to call the C vdso function, retriving the
> vdso data pointer there is lighter.

I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
issue that you can't retrieve the program counter on power without
clobbering the link register?

I would imagine that this patch generates worse code on any
architecture with PC-relative addressing modes (which includes at
least x86_64, and I would guess includes most modern architectures).

--Andy

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()
  2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
@ 2019-12-24  2:29   ` Andy Lutomirski
  2019-12-30 12:07   ` Arnd Bergmann
  1 sibling, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24  2:29 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino,
	Andrew Lutomirski, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> do_hres() is called from several places, so GCC doesn't inline
> it at first.
>
> do_hres() takes a struct __kernel_timespec * parameter for
> passing the result. In the 32 bits case, this parameter corresponds
> to a local var in the caller. In order to provide a pointer
> to this structure, the caller has to put it in its stack and
> do_hres() has to write the result in the stack. This is suboptimal,
> especially on RISC processor like powerpc.
>
> By making GCC inline the function, the struct __kernel_timespec
> remains a local var using registers, avoiding the need to write and
> read stack.
>
> The improvement is significant on powerpc.

I'm okay with it, mainly because I don't expect many workloads to have
more than one copy of the code hot at the same time.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2019-12-24  1:58   ` Andy Lutomirski
@ 2019-12-24 11:12     ` christophe leroy
  2019-12-24 12:04       ` Andy Lutomirski
  0 siblings, 1 reply; 45+ messages in thread
From: christophe leroy @ 2019-12-24 11:12 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino, LKML,
	linuxppc-dev, linux-arm-kernel, open list:MIPS, X86 ML



Le 24/12/2019 à 02:58, Andy Lutomirski a écrit :
> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> READ_ONCE() forces the read of the 64 bit value of
>> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
>> only the lower part is needed.
> 
> Seems reasonable and very unlikely to be harmful.  That being said,
> this function really ought to be considered deprecated -- 32-bit
> time_t is insufficient.
> 
> Do you get even better code if you move the read into the if statement?

Euh ...

How can you return t when time pointer is NULL if you read t only when 
time pointer is not NULL ?

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code.
  2019-12-24  2:24   ` Andy Lutomirski
@ 2019-12-24 11:41     ` christophe leroy
  2019-12-24 12:09       ` Andy Lutomirski
  0 siblings, 1 reply; 45+ messages in thread
From: christophe leroy @ 2019-12-24 11:41 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino, LKML,
	linuxppc-dev, linux-arm-kernel, open list:MIPS, X86 ML



Le 24/12/2019 à 03:24, Andy Lutomirski a écrit :
> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> On powerpc, VDSO functions and syscalls cannot be implemented in C
>> because the Linux kernel ABI requires that CR[SO] bit is set in case
>> of error and cleared when no error.
>>
>> As this cannot be done in C, C VDSO functions and syscall'based
>> fallback need a trampoline in ASM.
>>
>> By moving the fallback calls out of the common code, arches like
>> powerpc can implement both the call to C VDSO and the fallback call
>> in a single trampoline function.
> 
> Maybe the issue is that I'm not a powerpc person, but I don't
> understand this.  The common vDSO code is in C.  Presumably this means
> that you need an asm trampoline no matter what to call the C code.  Is
> the improvement that, with this change, you can have the asm
> trampoline do a single branch, so it's logically:
> 
> ret = [call the C code];
> if (ret == 0) {
>   set success bit;
> } else {
>   ret = fallback;
>   if (ret == 0)
>    set success bit;
> else
>    set failure bit;
> }

More simple than above, in fact it is:

ret = [call the C code];
if (ret == 0) {
  set success bit;
} else {
  ret = fallback [ which sets the success/failure bit];
}
return ret


> 
> return ret;
> 
> instead of:
> 
> ret = [call the C code, which includes the fallback];

C code cannot handle the success/failure bit so we need to do something 
which does:

int assembly_to_fallback()
{
	ret = [syscall the fallback]
	if (success bit set)
		return ret;
	else
		return -ret;
}

Also means going back and forth between the success bit and negative return.

> if (ret == 0)
>    set success bit;
> else
>    set failure bit;
> 
> It's not obvious to me that the former ought to be faster.
> 
>>
>> The two advantages are:
>> - No need play back and forth with CR[SO] and negative return value.
>> - No stack frame is required in VDSO C functions for the fallbacks.
> 
> How is no stack frame required?  Do you mean that the presence of the
> fallback causes worse code generation?  Can you improve the fallback
> instead?
> 

When function F1 calls function F2 (with BL insn), the link register 
(LR) is set with the return address in F1, so that at the end of F2, F2 
branches to LR (with BLR insn), that's how you return from functions.

When F2 calls function F3, the same happens, LR is set to the return of 
F3 into F2. It means that F2 has to save LR in order to be able to 
return to F1, otherwise the return address from F2 into F1 is lost.

But ... thinking about it once more, indeed fallback means doing a 
syscall, and in fact I realise that syscalls won't clobber LR, so it 
should be possible to do something. Let me try it.

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-24  2:27   ` Andy Lutomirski
@ 2019-12-24 11:53     ` christophe leroy
  2019-12-24 12:15       ` Andy Lutomirski
  0 siblings, 1 reply; 45+ messages in thread
From: christophe leroy @ 2019-12-24 11:53 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Thomas Gleixner, Vincenzo Frascino, LKML,
	linuxppc-dev, linux-arm-kernel, open list:MIPS, X86 ML



Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> On powerpc, __arch_get_vdso_data() clobbers the link register,
>> requiring the caller to set a stack frame in order to save it.
>>
>> As the parent function already has to set a stack frame and save
>> the link register to call the C vdso function, retriving the
>> vdso data pointer there is lighter.
> 
> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
> issue that you can't retrieve the program counter on power without
> clobbering the link register?

Yes it can be inlined (I did it in V1 
https://patchwork.ozlabs.org/patch/1180571/), but you can't do it 
without clobbering the link register, because the only way to get the 
program counter is to do to as if you were calling another function but 
you call to the address which just follows where you are, so that it 
sets LR which the simulated return address which corresponds to the 
address following the branch.

static __always_inline
const struct vdso_data *__arch_get_vdso_data(void)
{
	void *ptr;

	asm volatile(
		"	bcl	20, 31, .+4;\n"
		"	mflr	%0;\n"
		"	addi	%0, %0, __kernel_datapage_offset - (.-4);\n"
		: "=b"(ptr) : : "lr");

	return ptr + *(unsigned long *)ptr;
}

> 
> I would imagine that this patch generates worse code on any
> architecture with PC-relative addressing modes (which includes at
> least x86_64, and I would guess includes most modern architectures).

Why ? Powerpc is also using PC-relative addressing for all calls but 
indirect calls.

As the other arch C VDSO callers are in C and written in such a way that 
callee is inlined into callers, and as __arch_get_vdso_data() is 
inlined, it should make no difference, shouldn't it ?

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2019-12-24 11:12     ` christophe leroy
@ 2019-12-24 12:04       ` Andy Lutomirski
  0 siblings, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24 12:04 UTC (permalink / raw)
  To: christophe leroy
  Cc: Andy Lutomirski, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Thomas Gleixner,
	Vincenzo Frascino, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML


> On Dec 24, 2019, at 7:12 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> 
> 
> 
>> Le 24/12/2019 à 02:58, Andy Lutomirski a écrit :
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>> <christophe.leroy@c-s.fr> wrote:
>>> 
>>> READ_ONCE() forces the read of the 64 bit value of
>>> vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec allthough
>>> only the lower part is needed.
>> Seems reasonable and very unlikely to be harmful.  That being said,
>> this function really ought to be considered deprecated -- 32-bit
>> time_t is insufficient.
>> Do you get even better code if you move the read into the if statement?
> 
> Euh ...
> 
> How can you return t when time pointer is NULL if you read t only when time pointer is not NULL ?
> 
> 

Duh, never mind.

But this means your patch may be buggy: you need to make sure the compiler returns the *same* value it stores. Maybe you’re saved by the potential aliasing between the data page and the passed parameter and the value you read, but that’sa bad thing to rely on.

Try barrier() after the read.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code.
  2019-12-24 11:41     ` christophe leroy
@ 2019-12-24 12:09       ` Andy Lutomirski
  0 siblings, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24 12:09 UTC (permalink / raw)
  To: christophe leroy
  Cc: Andy Lutomirski, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Thomas Gleixner,
	Vincenzo Frascino, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML


> On Dec 24, 2019, at 7:41 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> 
> 
> 
>> Le 24/12/2019 à 03:24, Andy Lutomirski a écrit :
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>> <christophe.leroy@c-s.fr> wrote:
>>> 
>>> On powerpc, VDSO functions and syscalls cannot be implemented in C
>>> because the Linux kernel ABI requires that CR[SO] bit is set in case
>>> of error and cleared when no error.
>>> 
>>> As this cannot be done in C, C VDSO functions and syscall'based
>>> fallback need a trampoline in ASM.
>>> 
>>> By moving the fallback calls out of the common code, arches like
>>> powerpc can implement both the call to C VDSO and the fallback call
>>> in a single trampoline function.
>> Maybe the issue is that I'm not a powerpc person, but I don't
>> understand this.  The common vDSO code is in C.  Presumably this means
>> that you need an asm trampoline no matter what to call the C code.  Is
>> the improvement that, with this change, you can have the asm
>> trampoline do a single branch, so it's logically:
>> ret = [call the C code];
>> if (ret == 0) {
>>  set success bit;
>> } else {
>>  ret = fallback;
>>  if (ret == 0)
>>   set success bit;
>> else
>>   set failure bit;
>> }
> 
> More simple than above, in fact it is:
> 
> ret = [call the C code];
> if (ret == 0) {
> set success bit;
> } else {
> ret = fallback [ which sets the success/failure bit];
> }
> return ret

Cute.

> 
> 
>> return ret;
>> instead of:
>> ret = [call the C code, which includes the fallback];
> 
> C code cannot handle the success/failure bit so we need to do something which does:
> 
> int assembly_to_fallback()
> {
>    ret = [syscall the fallback]
>    if (success bit set)
>        return ret;
>    else
>        return -ret;
> }

Wait, your calling convention has syscalls return positive values on error?

But I think this is moot. The syscalls in question never return nonzero success values, so you should be able to inline the syscall without worrying about this.

> 
> Also means going back and forth between the success bit and negative return.
> 
>> if (ret == 0)
>>   set success bit;
>> else
>>   set failure bit;
>> It's not obvious to me that the former ought to be faster.
>>> 
>>> The two advantages are:
>>> - No need play back and forth with CR[SO] and negative return value.
>>> - No stack frame is required in VDSO C functions for the fallbacks.
>> How is no stack frame required?  Do you mean that the presence of the
>> fallback causes worse code generation?  Can you improve the fallback
>> instead?
> 
> When function F1 calls function F2 (with BL insn), the link register (LR) is set with the return address in F1, so that at the end of F2, F2 branches to LR (with BLR insn), that's how you return from functions.
> 
> When F2 calls function F3, the same happens, LR is set to the return of F3 into F2. It means that F2 has to save LR in order to be able to return to F1, otherwise the return address from F2 into F1 is lost.
> 
> But ... thinking about it once more, indeed fallback means doing a syscall, and in fact I realise that syscalls won't clobber LR, so it should be possible to do something. Let me try it.
> 

With that plus assume that nonzero return means failure, I think you should have all your bases covered.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-24 11:53     ` christophe leroy
@ 2019-12-24 12:15       ` Andy Lutomirski
  2019-12-24 12:41         ` Andy Lutomirski
  2019-12-24 14:46         ` Segher Boessenkool
  0 siblings, 2 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24 12:15 UTC (permalink / raw)
  To: christophe leroy
  Cc: Andy Lutomirski, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Thomas Gleixner,
	Vincenzo Frascino, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML



> On Dec 24, 2019, at 7:53 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> 
> 
> 
>> Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>> <christophe.leroy@c-s.fr> wrote:
>>> 
>>> On powerpc, __arch_get_vdso_data() clobbers the link register,
>>> requiring the caller to set a stack frame in order to save it.
>>> 
>>> As the parent function already has to set a stack frame and save
>>> the link register to call the C vdso function, retriving the
>>> vdso data pointer there is lighter.
>> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
>> issue that you can't retrieve the program counter on power without
>> clobbering the link register?
> 
> Yes it can be inlined (I did it in V1 https://patchwork.ozlabs.org/patch/1180571/), but you can't do it without clobbering the link register, because the only way to get the program counter is to do to as if you were calling another function but you call to the address which just follows where you are, so that it sets LR which the simulated return address which corresponds to the address following the branch.
> 
> static __always_inline
> const struct vdso_data *__arch_get_vdso_data(void)
> {
>    void *ptr;
> 
>    asm volatile(
>        "    bcl    20, 31, .+4;\n"
>        "    mflr    %0;\n"
>        "    addi    %0, %0, __kernel_datapage_offset - (.-4);\n"
>        : "=b"(ptr) : : "lr");
> 
>    return ptr + *(unsigned long *)ptr;
> }
> 
>> I would imagine that this patch generates worse code on any
>> architecture with PC-relative addressing modes (which includes at
>> least x86_64, and I would guess includes most modern architectures).
> 
> Why ? Powerpc is also using PC-relative addressing for all calls but indirect calls.

I mean PC-relative access for data.  The data page is at a constant, known offset from the vDSO text.

I haven’t checked how much x86_64 benefits from this, but at least the non-array fields ought to be accessible with a PC-relative access.

It should be possible to refactor a little bit so that the compiler can still see what’s going on.  Maybe your patch actually does this. I’d want to look at the assembly.  This also might not matter much on x86_64 in particular, since x86_64 can convert a PC-relative address to an absolute address with a single instruction with no clobbers.

Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-24 12:15       ` Andy Lutomirski
@ 2019-12-24 12:41         ` Andy Lutomirski
  2019-12-24 14:46         ` Segher Boessenkool
  1 sibling, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2019-12-24 12:41 UTC (permalink / raw)
  To: christophe leroy
  Cc: Andy Lutomirski, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Arnd Bergmann, Thomas Gleixner,
	Vincenzo Frascino, LKML, linuxppc-dev, linux-arm-kernel,
	open list:MIPS, X86 ML

On Tue, Dec 24, 2019 at 4:15 AM Andy Lutomirski <luto@amacapital.net> wrote:
>
>
>
> > On Dec 24, 2019, at 7:53 PM, christophe leroy <christophe.leroy@c-s.fr> wrote:
> >
> > 
> >
> >> Le 24/12/2019 à 03:27, Andy Lutomirski a écrit :
> >>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> >>> <christophe.leroy@c-s.fr> wrote:
> >>>
> >>> On powerpc, __arch_get_vdso_data() clobbers the link register,
> >>> requiring the caller to set a stack frame in order to save it.
> >>>
> >>> As the parent function already has to set a stack frame and save
> >>> the link register to call the C vdso function, retriving the
> >>> vdso data pointer there is lighter.
> >> I'm confused.  Can't you inline __arch_get_vdso_data()?  Or is the
> >> issue that you can't retrieve the program counter on power without
> >> clobbering the link register?
> >
> > Yes it can be inlined (I did it in V1 https://patchwork.ozlabs.org/patch/1180571/), but you can't do it without clobbering the link register, because the only way to get the program counter is to do to as if you were calling another function but you call to the address which just follows where you are, so that it sets LR which the simulated return address which corresponds to the address following the branch.
> >
> > static __always_inline
> > const struct vdso_data *__arch_get_vdso_data(void)
> > {
> >    void *ptr;
> >
> >    asm volatile(
> >        "    bcl    20, 31, .+4;\n"
> >        "    mflr    %0;\n"
> >        "    addi    %0, %0, __kernel_datapage_offset - (.-4);\n"
> >        : "=b"(ptr) : : "lr");
> >
> >    return ptr + *(unsigned long *)ptr;
> > }
> >
> >> I would imagine that this patch generates worse code on any
> >> architecture with PC-relative addressing modes (which includes at
> >> least x86_64, and I would guess includes most modern architectures).
> >
> > Why ? Powerpc is also using PC-relative addressing for all calls but indirect calls.
>
> I mean PC-relative access for data.  The data page is at a constant, known offset from the vDSO text.
>
> I haven’t checked how much x86_64 benefits from this, but at least the non-array fields ought to be accessible with a PC-relative access.
>
> It should be possible to refactor a little bit so that the compiler can still see what’s going on.  Maybe your patch actually does this. I’d want to look at the assembly.  This also might not matter much on x86_64 in particular, since x86_64 can convert a PC-relative address to an absolute address with a single instruction with no clobbers.
>
> Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

Indeed the x86 code is also suboptimal, but at least the unnecessary
absolute address calculation is cheap on x86_64.  Ideally we'd pass
around offsets into the vdso data instead of passing pointers, and
maybe the compiler will figure it out.  I can try to play with this in
the morning.

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch
  2019-12-24 12:15       ` Andy Lutomirski
  2019-12-24 12:41         ` Andy Lutomirski
@ 2019-12-24 14:46         ` Segher Boessenkool
  1 sibling, 0 replies; 45+ messages in thread
From: Segher Boessenkool @ 2019-12-24 14:46 UTC (permalink / raw)
  To: Andy Lutomirski
  Cc: christophe leroy, Arnd Bergmann, X86 ML, LKML, open list:MIPS,
	Paul Mackerras, Andy Lutomirski, Thomas Gleixner,
	Vincenzo Frascino, linuxppc-dev, linux-arm-kernel

On Tue, Dec 24, 2019 at 08:15:11PM +0800, Andy Lutomirski wrote:
> Does power have PC-relative data access?  If so, I wonder if the code can be arranged so that even the array accesses don’t require computing an absolute address at any point.

Not before ISA 3.0 (that is Power9).

The bcl/mflr dance isn't *that* expensive, if you use it sparingly.


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()
  2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
  2019-12-24  2:29   ` Andy Lutomirski
@ 2019-12-30 12:07   ` Arnd Bergmann
  2020-01-10 21:07     ` Thomas Gleixner
  1 sibling, 1 reply; 45+ messages in thread
From: Arnd Bergmann @ 2019-12-30 12:07 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Vincenzo Frascino, Andy Lutomirski,
	linux-kernel, linuxppc-dev, Linux ARM,
	open list:BROADCOM NVRAM DRIVER, the arch/x86 maintainers

On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> do_hres() is called from several places, so GCC doesn't inline
> it at first.
>
> do_hres() takes a struct __kernel_timespec * parameter for
> passing the result. In the 32 bits case, this parameter corresponds
> to a local var in the caller. In order to provide a pointer
> to this structure, the caller has to put it in its stack and
> do_hres() has to write the result in the stack. This is suboptimal,
> especially on RISC processor like powerpc.
>
> By making GCC inline the function, the struct __kernel_timespec
> remains a local var using registers, avoiding the need to write and
> read stack.
>
> The improvement is significant on powerpc.
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

Good idea, I can see how this ends up being an improvement
for most of the callers.

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
  2019-12-24  2:07   ` Andy Lutomirski
@ 2019-12-30 12:27   ` Arnd Bergmann
  2020-01-02 11:29     ` Arnd Bergmann
  1 sibling, 1 reply; 45+ messages in thread
From: Arnd Bergmann @ 2019-12-30 12:27 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Vincenzo Frascino, Andy Lutomirski,
	linux-kernel, linuxppc-dev, Linux ARM,
	open list:BROADCOM NVRAM DRIVER, the arch/x86 maintainers

On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
>
> In order to simplify next step which moves fallback call at arch
> level, ensure all arches have a 32bit fallback instead of handling
> the lack of 32bit fallback in the common code based
> on VDSO_HAS_32BIT_FALLBACK
>
> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>

I like the idea of removing VDSO_HAS_32BIT_FALLBACK and ensuring
that all 32-bit architectures implement them, but we really should *not*
have any implementation calling the 64-bit syscalls.

> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +       struct __kernel_timespec ts;
> +       int ret = clock_gettime_fallback(clock, &ts);
> +
> +       if (likely(!ret)) {
> +               _ts->tv_sec = ts.tv_sec;
> +               _ts->tv_nsec = ts.tv_nsec;
> +       }
> +       return ret;
> +}
> +
> +static __always_inline
> +long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +       struct __kernel_timespec ts;
> +       int ret = clock_getres_fallback(clock, &ts);
> +
> +       if (likely(!ret && _ts)) {
> +               _ts->tv_sec = ts.tv_sec;
> +               _ts->tv_nsec = ts.tv_nsec;
> +       }
> +       return ret;
> +}

Please change these to call __NR_clock_gettime and __NR_clock_getres_time
instead of __NR_clock_gettime64/__NR_clock_getres_time64 for multiple reasons.

- When doing migration between containers, the vdso may get copied into
  an application running on a kernel that does not support the time64
  variants, and then the fallback fails.

- When CONFIG_COMPAT_32BIT_TIME is disabled, the time32 syscalls
  return -ENOSYS, and the vdso version should have the exact same behavior
  to avoid surprises. In particular an application that checks clock_gettime()
  to see if the time32 are in part of the kernel would get an incorrect result
  here.

arch/arm64/include/asm/vdso/compat_gettimeofday.h already does this,
I think you can just copy the implementation or find a way to share it.

> diff --git a/arch/arm64/include/asm/vdso/gettimeofday.h b/arch/arm64/include/asm/vdso/gettimeofday.h
> index b08f476b72b4..c41c86a07423 100644
> --- a/arch/arm64/include/asm/vdso/gettimeofday.h
> +++ b/arch/arm64/include/asm/vdso/gettimeofday.h
> @@ -66,6 +66,32 @@ int clock_getres_fallback(clockid_t _clkid, struct __kernel_timespec *_ts)
>         return ret;
>  }
>
> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +       struct __kernel_timespec ts;
> +       int ret = clock_gettime_fallback(clock, &ts);
> +
> +       if (likely(!ret)) {
> +               _ts->tv_sec = ts.tv_sec;
> +               _ts->tv_nsec = ts.tv_nsec;
> +       }
> +       return ret;
> +}

As Andy said, this makes no sense at all, nothing should ever call this on a
64-bit architecture.

> diff --git a/arch/mips/include/asm/vdso/gettimeofday.h b/arch/mips/include/asm/vdso/gettimeofday.h
> index b08825531e9f..60608e930a5c 100644
> --- a/arch/mips/include/asm/vdso/gettimeofday.h
> +++ b/arch/mips/include/asm/vdso/gettimeofday.h
> @@ -109,8 +109,6 @@ static __always_inline int clock_getres_fallback(
>
>  #if _MIPS_SIM != _MIPS_SIM_ABI64
>
> -#define VDSO_HAS_32BIT_FALLBACK        1
> -
>  static __always_inline long clock_gettime32_fallback(
>                                         clockid_t _clkid,
>                                         struct old_timespec32 *_ts)
> @@ -150,6 +148,32 @@ static __always_inline int clock_getres32_fallback(
>
>         return error ? -ret : ret;
>  }
> +#else
> +static __always_inline
> +long clock_gettime32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> +{
> +       struct __kernel_timespec ts;
> +       int ret = clock_gettime_fallback(clock, &ts);
> +
> +       if (likely(!ret)) {
> +               _ts->tv_sec = ts.tv_sec;
> +               _ts->tv_nsec = ts.tv_nsec;
> +       }
> +       return ret;
> +}
> +

Same here.

> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -125,13 +125,8 @@ __cvdso_clock_gettime32(clockid_t clock, struct old_timespec32 *res)
>
>         ret = __cvdso_clock_gettime_common(clock, &ts);
>
> -#ifdef VDSO_HAS_32BIT_FALLBACK
>         if (unlikely(ret))
>                 return clock_gettime32_fallback(clock, res);
> -#else
> -       if (unlikely(ret))
> -               ret = clock_gettime_fallback(clock, &ts);
> -#endif
>
>         if (likely(!ret)) {
>                 res->tv_sec = ts.tv_sec;

Removing the #ifdef and the fallback seems fine. I think this is actually
required for correctness on arm32 as well. Maybe enclose the entire function in

#ifdef VDSO_HAS_CLOCK_GETTIME32

to only define it when it is called?

> @@ -238,13 +233,8 @@ __cvdso_clock_getres_time32(clockid_t clock, struct old_timespec32 *res)
>
>         ret = __cvdso_clock_getres_common(clock, &ts);
>
> -#ifdef VDSO_HAS_32BIT_FALLBACK
>         if (unlikely(ret))
>                 return clock_getres32_fallback(clock, res);
> -#else
> -       if (unlikely(ret))
> -               ret = clock_getres_fallback(clock, &ts);
> -#endif

The same applies to all the getres stuff of course.

      Arnd

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2019-12-30 12:27   ` Arnd Bergmann
@ 2020-01-02 11:29     ` Arnd Bergmann
  2020-01-09 15:43       ` Christophe Leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Arnd Bergmann @ 2020-01-02 11:29 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Vincenzo Frascino, Andy Lutomirski,
	linux-kernel, linuxppc-dev, Linux ARM,
	open list:BROADCOM NVRAM DRIVER, the arch/x86 maintainers

On Mon, Dec 30, 2019 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
> On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy <christophe.leroy@c-s.fr> wrote:
> > +static __always_inline
> > +long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
> > +{
> > +       struct __kernel_timespec ts;
> > +       int ret = clock_getres_fallback(clock, &ts);
> > +
> > +       if (likely(!ret && _ts)) {
> > +               _ts->tv_sec = ts.tv_sec;
> > +               _ts->tv_nsec = ts.tv_nsec;
> > +       }
> > +       return ret;
> > +}
>
> Please change these to call __NR_clock_gettime and __NR_clock_getres_time
> instead of __NR_clock_gettime64/__NR_clock_getres_time64 for multiple reasons.
>
> - When doing migration between containers, the vdso may get copied into
>   an application running on a kernel that does not support the time64
>   variants, and then the fallback fails.
>
> - When CONFIG_COMPAT_32BIT_TIME is disabled, the time32 syscalls
>   return -ENOSYS, and the vdso version should have the exact same behavior
>   to avoid surprises. In particular an application that checks clock_gettime()
>   to see if the time32 are in part of the kernel would get an incorrect result
>   here.
>
> arch/arm64/include/asm/vdso/compat_gettimeofday.h already does this,
> I think you can just copy the implementation or find a way to share it.

There was a related discussion on this after a vdso regression on mips,
and I suggested to drop the time32 functions completely from the
vdso when CONFIG_COMPAT_32BIT_TIME is disabled, such as

diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S
b/arch/powerpc/kernel/vdso32/vdso32.lds.S
index 00c025ba4a92..605f259fa24c 100644
--- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
+++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
@@ -145,10 +145,12 @@ VERSION

                __kernel_get_syscall_map;
 #ifndef CONFIG_PPC_BOOK3S_601
+#ifdef CONFIG_COMPAT_32BIT_TIME
                __kernel_gettimeofday;
                __kernel_clock_gettime;
                __kernel_clock_getres;
                __kernel_time;
+#endif
                __kernel_get_tbfreq;
 #endif
                __kernel_sync_dicache;

Any opinions on this? If everyone agrees with that approach, I can
send a cross-architecture patch to do this everywhere. It's probably
best though if Christophe adds that to his series as it touches a lot
of the same files and I would prefer to avoid conflicting changes.

       Arnd

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2020-01-02 11:29     ` Arnd Bergmann
@ 2020-01-09 15:43       ` Christophe Leroy
  0 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2020-01-09 15:43 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Vincenzo Frascino, Andy Lutomirski,
	linux-kernel, linuxppc-dev, Linux ARM,
	open list:BROADCOM NVRAM DRIVER, the arch/x86 maintainers



Le 02/01/2020 à 12:29, Arnd Bergmann a écrit :
> On Mon, Dec 30, 2019 at 1:27 PM Arnd Bergmann <arnd@arndb.de> wrote:
>> On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy <christophe.leroy@c-s.fr> wrote:
>>> +static __always_inline
>>> +long clock_getres32_fallback(clockid_t _clkid, struct old_timespec32 *_ts)
>>> +{
>>> +       struct __kernel_timespec ts;
>>> +       int ret = clock_getres_fallback(clock, &ts);
>>> +
>>> +       if (likely(!ret && _ts)) {
>>> +               _ts->tv_sec = ts.tv_sec;
>>> +               _ts->tv_nsec = ts.tv_nsec;
>>> +       }
>>> +       return ret;
>>> +}
>>
>> Please change these to call __NR_clock_gettime and __NR_clock_getres_time
>> instead of __NR_clock_gettime64/__NR_clock_getres_time64 for multiple reasons.
>>
>> - When doing migration between containers, the vdso may get copied into
>>    an application running on a kernel that does not support the time64
>>    variants, and then the fallback fails.
>>
>> - When CONFIG_COMPAT_32BIT_TIME is disabled, the time32 syscalls
>>    return -ENOSYS, and the vdso version should have the exact same behavior
>>    to avoid surprises. In particular an application that checks clock_gettime()
>>    to see if the time32 are in part of the kernel would get an incorrect result
>>    here.
>>
>> arch/arm64/include/asm/vdso/compat_gettimeofday.h already does this,
>> I think you can just copy the implementation or find a way to share it.
> 
> There was a related discussion on this after a vdso regression on mips,
> and I suggested to drop the time32 functions completely from the
> vdso when CONFIG_COMPAT_32BIT_TIME is disabled, such as
> 
> diff --git a/arch/powerpc/kernel/vdso32/vdso32.lds.S
> b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> index 00c025ba4a92..605f259fa24c 100644
> --- a/arch/powerpc/kernel/vdso32/vdso32.lds.S
> +++ b/arch/powerpc/kernel/vdso32/vdso32.lds.S
> @@ -145,10 +145,12 @@ VERSION
> 
>                  __kernel_get_syscall_map;
>   #ifndef CONFIG_PPC_BOOK3S_601
> +#ifdef CONFIG_COMPAT_32BIT_TIME
>                  __kernel_gettimeofday;
>                  __kernel_clock_gettime;
>                  __kernel_clock_getres;
>                  __kernel_time;
> +#endif
>                  __kernel_get_tbfreq;
>   #endif
>                  __kernel_sync_dicache;
> 
> Any opinions on this? If everyone agrees with that approach, I can
> send a cross-architecture patch to do this everywhere. It's probably
> best though if Christophe adds that to his series as it touches a lot
> of the same files and I would prefer to avoid conflicting changes.
> 

I guess it would be wise.

I don't think my series to switch powerpc to C VDSO will get ready 
anytime soon, because (in addition to the performance impact) I'm having 
hard time with the 32 bits VDSO for PPC64. Many of the powerpc header 
files used by lib/vdso/gettimeofday.c are not ready for generating 32 
bits code for PPC64. Main problem is that at many places, #ifdef 
CONFIG_PPC64 is used instead of #ifdef __powerpc64__. There are also 
some CONFIG options like CONFIG_GENERIC_ATOMIC64 that are selected only 
when CONFIG_PPC32 is set, but which are required for building the 32 
bits VDSO. For that I don't even know how to deal with it.

So, feel free to send your patch, and if my series comes early enough to 
conflict, I'll manage it.

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Surprising code generated for vdso_read_begin()
  2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
                   ` (9 preceding siblings ...)
  2019-12-23 14:31 ` [RFC PATCH v2 10/10] powerpc/32: Switch VDSO to C implementation Christophe Leroy
@ 2020-01-09 17:52 ` Christophe Leroy
  2020-01-09 20:07   ` Segher Boessenkool
  10 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2020-01-09 17:52 UTC (permalink / raw)
  To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto, Segher Boessenkool
  Cc: x86, linuxppc-dev, linux-kernel, linux-arm-kernel, linux-mips

Wondering why we get something so complicated/redundant for 
vdso_read_begin() <include/vdso/helpers.h>

static __always_inline u32 vdso_read_begin(const struct vdso_data *vd)
{
	u32 seq;

	while ((seq = READ_ONCE(vd->seq)) & 1)
		cpu_relax();

	smp_rmb();
	return seq;
}


  6e0:   81 05 00 f0     lwz     r8,240(r5)
  6e4:   71 09 00 01     andi.   r9,r8,1
  6e8:   41 82 00 10     beq     6f8 <__c_kernel_clock_gettime+0x158>
  6ec:   81 05 00 f0     lwz     r8,240(r5)
  6f0:   71 0a 00 01     andi.   r10,r8,1
  6f4:   40 82 ff f8     bne     6ec <__c_kernel_clock_gettime+0x14c>
  6f8:

r5 being vd pointer

Why the first triplet, not only the second triplet ? Something wrong 
with using READ_ONCE() for that ?

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-01-09 17:52 ` Surprising code generated for vdso_read_begin() Christophe Leroy
@ 2020-01-09 20:07   ` Segher Boessenkool
  2020-01-10  6:45     ` Christophe Leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Segher Boessenkool @ 2020-01-09 20:07 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto, x86, linuxppc-dev, linux-kernel,
	linux-arm-kernel, linux-mips

On Thu, Jan 09, 2020 at 05:52:34PM +0000, Christophe Leroy wrote:
> Wondering why we get something so complicated/redundant for 
> vdso_read_begin() <include/vdso/helpers.h>
> 
> static __always_inline u32 vdso_read_begin(const struct vdso_data *vd)
> {
> 	u32 seq;
> 
> 	while ((seq = READ_ONCE(vd->seq)) & 1)
> 		cpu_relax();
> 
> 	smp_rmb();
> 	return seq;
> }
> 
> 
>  6e0:   81 05 00 f0     lwz     r8,240(r5)
>  6e4:   71 09 00 01     andi.   r9,r8,1
>  6e8:   41 82 00 10     beq     6f8 <__c_kernel_clock_gettime+0x158>
>  6ec:   81 05 00 f0     lwz     r8,240(r5)
>  6f0:   71 0a 00 01     andi.   r10,r8,1
>  6f4:   40 82 ff f8     bne     6ec <__c_kernel_clock_gettime+0x14c>
>  6f8:
> 
> r5 being vd pointer
> 
> Why the first triplet, not only the second triplet ? Something wrong 
> with using READ_ONCE() for that ?

It looks like the compiler did loop peeling.  What GCC version is this?
Please try current trunk (to become GCC 10), or at least GCC 9?


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-01-09 20:07   ` Segher Boessenkool
@ 2020-01-10  6:45     ` Christophe Leroy
  2020-01-11 11:33       ` Segher Boessenkool
  0 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2020-01-10  6:45 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto, x86, linuxppc-dev, linux-kernel,
	linux-arm-kernel, linux-mips



Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
> On Thu, Jan 09, 2020 at 05:52:34PM +0000, Christophe Leroy wrote:
>> Wondering why we get something so complicated/redundant for
>> vdso_read_begin() <include/vdso/helpers.h>
>>
>> static __always_inline u32 vdso_read_begin(const struct vdso_data *vd)
>> {
>> 	u32 seq;
>>
>> 	while ((seq = READ_ONCE(vd->seq)) & 1)
>> 		cpu_relax();
>>
>> 	smp_rmb();
>> 	return seq;
>> }
>>
>>
>>   6e0:   81 05 00 f0     lwz     r8,240(r5)
>>   6e4:   71 09 00 01     andi.   r9,r8,1
>>   6e8:   41 82 00 10     beq     6f8 <__c_kernel_clock_gettime+0x158>
>>   6ec:   81 05 00 f0     lwz     r8,240(r5)
>>   6f0:   71 0a 00 01     andi.   r10,r8,1
>>   6f4:   40 82 ff f8     bne     6ec <__c_kernel_clock_gettime+0x14c>
>>   6f8:
>>
>> r5 being vd pointer
>>
>> Why the first triplet, not only the second triplet ? Something wrong
>> with using READ_ONCE() for that ?
> 
> It looks like the compiler did loop peeling.  What GCC version is this?
> Please try current trunk (to become GCC 10), or at least GCC 9?
> 

It is with GCC 5.5

https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more 
recent than 8.1

With 8.1, the problem doesn't show up.

Thanks
Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2019-12-24  2:07   ` Andy Lutomirski
@ 2020-01-10 20:56     ` Thomas Gleixner
  2020-01-10 21:02       ` Andy Lutomirski
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2020-01-10 20:56 UTC (permalink / raw)
  To: Andy Lutomirski, Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Arnd Bergmann, Vincenzo Frascino, Andrew Lutomirski, LKML,
	linuxppc-dev, linux-arm-kernel, open list:MIPS, X86 ML

Andy Lutomirski <luto@kernel.org> writes:

> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> In order to simplify next step which moves fallback call at arch
>> level, ensure all arches have a 32bit fallback instead of handling
>> the lack of 32bit fallback in the common code based
>> on VDSO_HAS_32BIT_FALLBACK
>
> I don't like this.  You've implemented what appear to be nonsensical
> fallbacks (the 32-bit fallback for a 64-bit vDSO build?  There's no
> such thing).
>
> How exactly does this simplify patch 2?

There is a patchset from Vincenzo which fell through the cracks which
addresses the VDS_HAS_32BIT_FALLBACK issue properly. I'm about to pick
it up. See:

 https://lore.kernel.org/lkml/20190830135902.20861-1-vincenzo.frascino@arm.com/

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback
  2020-01-10 20:56     ` Thomas Gleixner
@ 2020-01-10 21:02       ` Andy Lutomirski
  0 siblings, 0 replies; 45+ messages in thread
From: Andy Lutomirski @ 2020-01-10 21:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Andy Lutomirski, Christophe Leroy, Benjamin Herrenschmidt,
	Paul Mackerras, Michael Ellerman, Arnd Bergmann,
	Vincenzo Frascino, Andrew Lutomirski, LKML, linuxppc-dev,
	linux-arm-kernel, open list:MIPS, X86 ML



> On Jan 10, 2020, at 10:56 AM, Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> Andy Lutomirski <luto@kernel.org> writes:
> 
>>> On Mon, Dec 23, 2019 at 6:31 AM Christophe Leroy
>>> <christophe.leroy@c-s.fr> wrote:
>>> 
>>> In order to simplify next step which moves fallback call at arch
>>> level, ensure all arches have a 32bit fallback instead of handling
>>> the lack of 32bit fallback in the common code based
>>> on VDSO_HAS_32BIT_FALLBACK
>> 
>> I don't like this.  You've implemented what appear to be nonsensical
>> fallbacks (the 32-bit fallback for a 64-bit vDSO build?  There's no
>> such thing).
>> 
>> How exactly does this simplify patch 2?
> 
> There is a patchset from Vincenzo which fell through the cracks which
> addresses the VDS_HAS_32BIT_FALLBACK issue properly. I'm about to pick
> it up. See:
> 
> https://lore.kernel.org/lkml/20190830135902.20861-1-vincenzo.frascino@arm.com/
> 

Thanks.  I had been wondering why the conditionals were still there, since I remember seeing these patches.

> Thanks,
> 
>        tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()
  2019-12-30 12:07   ` Arnd Bergmann
@ 2020-01-10 21:07     ` Thomas Gleixner
  2020-01-11  9:06       ` Christophe Leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2020-01-10 21:07 UTC (permalink / raw)
  To: Arnd Bergmann, Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Vincenzo Frascino, Andy Lutomirski, linux-kernel, linuxppc-dev,
	Linux ARM, open list:BROADCOM NVRAM DRIVER,
	the arch/x86 maintainers

Arnd Bergmann <arnd@arndb.de> writes:
> On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
>>
>> do_hres() is called from several places, so GCC doesn't inline
>> it at first.
>>
>> do_hres() takes a struct __kernel_timespec * parameter for
>> passing the result. In the 32 bits case, this parameter corresponds
>> to a local var in the caller. In order to provide a pointer
>> to this structure, the caller has to put it in its stack and
>> do_hres() has to write the result in the stack. This is suboptimal,
>> especially on RISC processor like powerpc.
>>
>> By making GCC inline the function, the struct __kernel_timespec
>> remains a local var using registers, avoiding the need to write and
>> read stack.
>>
>> The improvement is significant on powerpc.
>>
>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>
> Good idea, I can see how this ends up being an improvement
> for most of the callers.
>
> Acked-by: Arnd Bergmann <arnd@arndb.de>

  https://lore.kernel.org/r/20191112012724.250792-3-dima@arista.com

On the way to be applied.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
  2019-12-24  1:58   ` Andy Lutomirski
@ 2020-01-10 21:12   ` Thomas Gleixner
  2020-01-11  8:05     ` Christophe Leroy
  1 sibling, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2020-01-10 21:12 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, arnd, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

Christophe Leroy <christophe.leroy@c-s.fr> writes:
>
> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
> index 17b4cff6e5f0..5a17a9d2e6cd 100644
> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
>  static __maybe_unused __kernel_old_time_t
>  __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
>  {
> -	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> +	__kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>  
>  	if (time)
>  		*time = t;

Allows the compiler to load twice, i.e. the returned value might be different from the
stored value. So no.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2020-01-10 21:12   ` Thomas Gleixner
@ 2020-01-11  8:05     ` Christophe Leroy
  2020-01-11 11:07       ` Thomas Gleixner
  0 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2020-01-11  8:05 UTC (permalink / raw)
  To: Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, arnd, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86



On 01/10/2020 09:12 PM, Thomas Gleixner wrote:
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>
>> diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
>> index 17b4cff6e5f0..5a17a9d2e6cd 100644
>> --- a/lib/vdso/gettimeofday.c
>> +++ b/lib/vdso/gettimeofday.c
>> @@ -144,7 +144,7 @@ __cvdso_gettimeofday(const struct vdso_data *vd, struct __kernel_old_timeval *tv
>>   static __maybe_unused __kernel_old_time_t
>>   __cvdso_time(const struct vdso_data *vd, __kernel_old_time_t *time)
>>   {
>> -	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
>> +	__kernel_old_time_t t = vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec;
>>   
>>   	if (time)
>>   		*time = t;
> 
> Allows the compiler to load twice, i.e. the returned value might be different from the
> stored value. So no.
> 

With READ_ONCE() the 64 bits are being read:

00000ac8 <__c_kernel_time>:
  ac8:	2c 03 00 00 	cmpwi   r3,0
  acc:	81 44 00 20 	lwz     r10,32(r4)
  ad0:	81 64 00 24 	lwz     r11,36(r4)
  ad4:	41 82 00 08 	beq     adc <__c_kernel_time+0x14>
  ad8:	91 63 00 00 	stw     r11,0(r3)
  adc:	7d 63 5b 78 	mr      r3,r11
  ae0:	4e 80 00 20 	blr

Without the READ_ONCE() only 32 bits are read. That's the most optimal.

00000ac8 <__c_kernel_time>:
  ac8:	7c 69 1b 79 	mr.     r9,r3
  acc:	80 64 00 24 	lwz     r3,36(r4)
  ad0:	4d 82 00 20 	beqlr
  ad4:	90 69 00 00 	stw     r3,0(r9)
  ad8:	4e 80 00 20 	blr

Without READ_ONCE() but with a barrier() after the read, we should get 
the same result but GCC (GCC 8.1) does less good:

00000ac8 <__c_kernel_time>:
  ac8:	81 24 00 24 	lwz     r9,36(r4)
  acc:	2f 83 00 00 	cmpwi   cr7,r3,0
  ad0:	41 9e 00 08 	beq     cr7,ad8 <__c_kernel_time+0x10>
  ad4:	91 23 00 00 	stw     r9,0(r3)
  ad8:	7d 23 4b 78 	mr      r3,r9
  adc:	4e 80 00 20 	blr

Assuming both part of the 64 bits data will fall into a single 
cacheline, the second read is in the noise.

So agreed to drop this change.

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 05/10] lib: vdso: inline do_hres()
  2020-01-10 21:07     ` Thomas Gleixner
@ 2020-01-11  9:06       ` Christophe Leroy
  0 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2020-01-11  9:06 UTC (permalink / raw)
  To: Thomas Gleixner, Arnd Bergmann
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Vincenzo Frascino, Andy Lutomirski, linux-kernel, linuxppc-dev,
	Linux ARM, open list:BROADCOM NVRAM DRIVER,
	the arch/x86 maintainers



On 01/10/2020 09:07 PM, Thomas Gleixner wrote:
> Arnd Bergmann <arnd@arndb.de> writes:
>> On Mon, Dec 23, 2019 at 3:31 PM Christophe Leroy
>> <christophe.leroy@c-s.fr> wrote:
>>>
>>> do_hres() is called from several places, so GCC doesn't inline
>>> it at first.
>>>
>>> do_hres() takes a struct __kernel_timespec * parameter for
>>> passing the result. In the 32 bits case, this parameter corresponds
>>> to a local var in the caller. In order to provide a pointer
>>> to this structure, the caller has to put it in its stack and
>>> do_hres() has to write the result in the stack. This is suboptimal,
>>> especially on RISC processor like powerpc.
>>>
>>> By making GCC inline the function, the struct __kernel_timespec
>>> remains a local var using registers, avoiding the need to write and
>>> read stack.
>>>
>>> The improvement is significant on powerpc.
>>>
>>> Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
>>
>> Good idea, I can see how this ends up being an improvement
>> for most of the callers.
>>
>> Acked-by: Arnd Bergmann <arnd@arndb.de>
> 
>    https://lore.kernel.org/r/20191112012724.250792-3-dima@arista.com
> 
> On the way to be applied.
> 

Oh nice, I get even better result with the way it is done by Dmitry 
compared to my own first patch.

On an mpc8xx at 132Mhz (32bits powerpc), before the patch I have
gettimeofday:    vdso: 1256 nsec/call
clock-gettime-monotonic-raw:    vdso: 1449 nsec/call
clock-gettime-monotonic-coarse:    vdso: 768 nsec/call
clock-gettime-monotonic:    vdso: 1390 nsec/call

With the patch I have:
gettimeofday:    vdso: 947 nsec/call
clock-gettime-monotonic-raw:    vdso: 1156 nsec/call
clock-gettime-monotonic-coarse:    vdso: 638 nsec/call
clock-gettime-monotonic:    vdso: 1094 nsec/call

So that's a 20-25% improvement.

I modified it slightly as follows:

diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 9e474d54814f..b793f211bca8 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -37,8 +37,8 @@ u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, 
u32 mult)
  }
  #endif

-static int do_hres(const struct vdso_data *vd, clockid_t clk,
-		   struct __kernel_timespec *ts)
+static __always_inline int do_hres(const struct vdso_data *vd, 
clockid_t clk,
+				   struct __kernel_timespec *ts)
  {
  	const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
  	u64 cycles, last, sec, ns;
@@ -67,8 +67,8 @@ static int do_hres(const struct vdso_data *vd, 
clockid_t clk,
  	return 0;
  }

-static void do_coarse(const struct vdso_data *vd, clockid_t clk,
-		      struct __kernel_timespec *ts)
+static __always_inline int do_coarse(const struct vdso_data *vd, 
clockid_t clk,
+				     struct __kernel_timespec *ts)
  {
  	const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
  	u32 seq;
@@ -78,6 +78,8 @@ static void do_coarse(const struct vdso_data *vd, 
clockid_t clk,
  		ts->tv_sec = vdso_ts->sec;
  		ts->tv_nsec = vdso_ts->nsec;
  	} while (unlikely(vdso_read_retry(vd, seq)));
+
+	return 0;
  }

  static __maybe_unused int
@@ -95,15 +97,16 @@ __cvdso_clock_gettime_common(const struct vdso_data 
*vd, clockid_t clock,
  	 * clocks are handled in the VDSO directly.
  	 */
  	msk = 1U << clock;
-	if (likely(msk & VDSO_HRES)) {
-		return do_hres(&vd[CS_HRES_COARSE], clock, ts);
-	} else if (msk & VDSO_COARSE) {
-		do_coarse(&vd[CS_HRES_COARSE], clock, ts);
-		return 0;
-	} else if (msk & VDSO_RAW) {
-		return do_hres(&vd[CS_RAW], clock, ts);
-	}
-	return -1;
+	if (likely(msk & VDSO_HRES))
+		vd += CS_HRES_COARSE;
+	else if (msk & VDSO_COARSE)
+		return do_coarse(&vd[CS_HRES_COARSE], clock, ts);
+	else if (msk & VDSO_RAW)
+		vd += CS_RAW;
+	else
+		return -1;
+
+	return do_hres(vd, clock, ts);
  }

  static __maybe_unused int

---

Christophe

^ permalink raw reply related	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2020-01-11  8:05     ` Christophe Leroy
@ 2020-01-11 11:07       ` Thomas Gleixner
  2020-01-13  6:52         ` Christophe Leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Thomas Gleixner @ 2020-01-11 11:07 UTC (permalink / raw)
  To: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, arnd, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86

Christophe Leroy <christophe.leroy@c-s.fr> writes:
>
> With READ_ONCE() the 64 bits are being read:
>
> Without the READ_ONCE() only 32 bits are read. That's the most optimal.
>
> Without READ_ONCE() but with a barrier() after the read, we should get 
> the same result but GCC (GCC 8.1) does less good:
>
> Assuming both part of the 64 bits data will fall into a single 
> cacheline, the second read is in the noise.

They definitely are in the same cacheline.

> So agreed to drop this change.

We could be smart about this and force the compiler to issue a 32bit
read for 32bit builds. See below. Not sure whether it's worth it, but
OTOH it will take quite a while until the 32bit time interfaces die
completely.

Thanks,

        tglx

8<------------
--- a/include/vdso/datapage.h
+++ b/include/vdso/datapage.h
@@ -21,6 +21,18 @@
 #define CS_RAW		1
 #define CS_BASES	(CS_RAW + 1)
 
+#ifdef __LITTLE_ENDIAN
+struct sec_hl {
+	u32	sec_l;
+	u32	sec_h;
+};
+#else
+struct sec_hl {
+	u32	sec_h;
+	u32	sec_l;
+};
+#endif
+
 /**
  * struct vdso_timestamp - basetime per clock_id
  * @sec:	seconds
@@ -35,7 +47,10 @@
  * vdso_data.cs[x].shift.
  */
 struct vdso_timestamp {
-	u64	sec;
+	union {
+		u64		sec;
+		struct sec_hl	sec_hl;
+	};
 	u64	nsec;
 };
 
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -165,8 +165,13 @@ static __maybe_unused int
 static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time)
 {
 	const struct vdso_data *vd = __arch_get_vdso_data();
-	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+	__kernel_old_time_t t;
 
+#if BITS_PER_LONG == 32
+	t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec_hl.sec_l);
+#else
+	t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
+#endif
 	if (time)
 		*time = t;
 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-01-10  6:45     ` Christophe Leroy
@ 2020-01-11 11:33       ` Segher Boessenkool
  2020-02-16 18:10         ` Arnd Bergmann
  0 siblings, 1 reply; 45+ messages in thread
From: Segher Boessenkool @ 2020-01-11 11:33 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman, arnd,
	tglx, vincenzo.frascino, luto, x86, linuxppc-dev, linux-kernel,
	linux-arm-kernel, linux-mips

On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
> Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
> >It looks like the compiler did loop peeling.  What GCC version is this?
> >Please try current trunk (to become GCC 10), or at least GCC 9?
> 
> It is with GCC 5.5
> 
> https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more 
> recent than 8.1

Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
this hard and/or painful to do?

> With 8.1, the problem doesn't show up.

Good to hear that.  Hrm, I wonder when it was fixed...


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time()
  2020-01-11 11:07       ` Thomas Gleixner
@ 2020-01-13  6:52         ` Christophe Leroy
  0 siblings, 0 replies; 45+ messages in thread
From: Christophe Leroy @ 2020-01-13  6:52 UTC (permalink / raw)
  To: Thomas Gleixner, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, arnd, vincenzo.frascino, luto
  Cc: linux-kernel, linuxppc-dev, linux-arm-kernel, linux-mips, x86



Le 11/01/2020 à 12:07, Thomas Gleixner a écrit :
> Christophe Leroy <christophe.leroy@c-s.fr> writes:
>>
>> With READ_ONCE() the 64 bits are being read:
>>
>> Without the READ_ONCE() only 32 bits are read. That's the most optimal.
>>
>> Without READ_ONCE() but with a barrier() after the read, we should get
>> the same result but GCC (GCC 8.1) does less good:
>>
>> Assuming both part of the 64 bits data will fall into a single
>> cacheline, the second read is in the noise.
> 
> They definitely are in the same cacheline.
> 
>> So agreed to drop this change.
> 
> We could be smart about this and force the compiler to issue a 32bit
> read for 32bit builds. See below. Not sure whether it's worth it, but
> OTOH it will take quite a while until the 32bit time interfaces die
> completely.

I don't think it is worth something so big to just save 1 or 2 cycles in 
time() function. Lets keep it as it is.

Thanks,
Christophe

> 
> Thanks,
> 
>          tglx
> 
> 8<------------
> --- a/include/vdso/datapage.h
> +++ b/include/vdso/datapage.h
> @@ -21,6 +21,18 @@
>   #define CS_RAW		1
>   #define CS_BASES	(CS_RAW + 1)
>   
> +#ifdef __LITTLE_ENDIAN
> +struct sec_hl {
> +	u32	sec_l;
> +	u32	sec_h;
> +};
> +#else
> +struct sec_hl {
> +	u32	sec_h;
> +	u32	sec_l;
> +};
> +#endif
> +
>   /**
>    * struct vdso_timestamp - basetime per clock_id
>    * @sec:	seconds
> @@ -35,7 +47,10 @@
>    * vdso_data.cs[x].shift.
>    */
>   struct vdso_timestamp {
> -	u64	sec;
> +	union {
> +		u64		sec;
> +		struct sec_hl	sec_hl;
> +	};
>   	u64	nsec;
>   };
>   
> --- a/lib/vdso/gettimeofday.c
> +++ b/lib/vdso/gettimeofday.c
> @@ -165,8 +165,13 @@ static __maybe_unused int
>   static __maybe_unused __kernel_old_time_t __cvdso_time(__kernel_old_time_t *time)
>   {
>   	const struct vdso_data *vd = __arch_get_vdso_data();
> -	__kernel_old_time_t t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> +	__kernel_old_time_t t;
>   
> +#if BITS_PER_LONG == 32
> +	t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec_hl.sec_l);
> +#else
> +	t = READ_ONCE(vd[CS_HRES_COARSE].basetime[CLOCK_REALTIME].sec);
> +#endif
>   	if (time)
>   		*time = t;
>   
> 

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-01-11 11:33       ` Segher Boessenkool
@ 2020-02-16 18:10         ` Arnd Bergmann
  2020-02-19  8:45           ` Christophe Leroy
  0 siblings, 1 reply; 45+ messages in thread
From: Arnd Bergmann @ 2020-02-16 18:10 UTC (permalink / raw)
  To: Segher Boessenkool
  Cc: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Thomas Gleixner, Vincenzo Frascino,
	Andy Lutomirski, the arch/x86 maintainers, linuxppc-dev,
	linux-kernel, Linux ARM, open list:BROADCOM NVRAM DRIVER

On Sat, Jan 11, 2020 at 12:33 PM Segher Boessenkool
<segher@kernel.crashing.org> wrote:
>
> On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
> > Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
> > >It looks like the compiler did loop peeling.  What GCC version is this?
> > >Please try current trunk (to become GCC 10), or at least GCC 9?
> >
> > It is with GCC 5.5
> >
> > https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more
> > recent than 8.1
>
> Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
> this hard and/or painful to do?

To follow up on this older thread, I have now uploaded 6.5, 7.5, 8.3 and 9.2
binaries, as well as a recent 10.0 snapshot.

I hope these work, let me know if there are problems.

       Arnd

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-02-16 18:10         ` Arnd Bergmann
@ 2020-02-19  8:45           ` Christophe Leroy
  2020-02-19  9:52             ` Arnd Bergmann
  0 siblings, 1 reply; 45+ messages in thread
From: Christophe Leroy @ 2020-02-19  8:45 UTC (permalink / raw)
  To: Arnd Bergmann, Segher Boessenkool
  Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
	Thomas Gleixner, Vincenzo Frascino, Andy Lutomirski,
	the arch/x86 maintainers, linuxppc-dev, linux-kernel, Linux ARM,
	open list:BROADCOM NVRAM DRIVER



Le 16/02/2020 à 19:10, Arnd Bergmann a écrit :
> On Sat, Jan 11, 2020 at 12:33 PM Segher Boessenkool
> <segher@kernel.crashing.org> wrote:
>>
>> On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
>>> Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
>>>> It looks like the compiler did loop peeling.  What GCC version is this?
>>>> Please try current trunk (to become GCC 10), or at least GCC 9?
>>>
>>> It is with GCC 5.5
>>>
>>> https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more
>>> recent than 8.1
>>
>> Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
>> this hard and/or painful to do?
> 
> To follow up on this older thread, I have now uploaded 6.5, 7.5, 8.3 and 9.2
> binaries, as well as a recent 10.0 snapshot.
> 

Thanks Arnd,

I have built the VDSO with 9.2, I get less performant result than with 
8.2 (same performance as with 5.5).

After a quick look, I see:
- Irrelevant NOPs to align loops and stuff, allthough -mpcu=860 should 
avoid that.
- A stack frame is set for saving r31 in __c_kernel_clock_gettime. GCC 
8.1 don't need that, all VDSO functions are frameless with 8.1

Christophe

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-02-19  8:45           ` Christophe Leroy
@ 2020-02-19  9:52             ` Arnd Bergmann
  2020-02-19 13:08               ` Segher Boessenkool
  0 siblings, 1 reply; 45+ messages in thread
From: Arnd Bergmann @ 2020-02-19  9:52 UTC (permalink / raw)
  To: Christophe Leroy
  Cc: Segher Boessenkool, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Thomas Gleixner, Vincenzo Frascino,
	Andy Lutomirski, the arch/x86 maintainers, linuxppc-dev,
	linux-kernel, Linux ARM, open list:BROADCOM NVRAM DRIVER

On Wed, Feb 19, 2020 at 9:45 AM Christophe Leroy
<christophe.leroy@c-s.fr> wrote:
> Le 16/02/2020 à 19:10, Arnd Bergmann a écrit :
> > On Sat, Jan 11, 2020 at 12:33 PM Segher Boessenkool
> > <segher@kernel.crashing.org> wrote:
> >>
> >> On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
> >>> Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
> >>>> It looks like the compiler did loop peeling.  What GCC version is this?
> >>>> Please try current trunk (to become GCC 10), or at least GCC 9?
> >>>
> >>> It is with GCC 5.5
> >>>
> >>> https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more
> >>> recent than 8.1
> >>
> >> Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
> >> this hard and/or painful to do?
> >
> > To follow up on this older thread, I have now uploaded 6.5, 7.5, 8.3 and 9.2
> > binaries, as well as a recent 10.0 snapshot.
> >
>
> Thanks Arnd,
>
> I have built the VDSO with 9.2, I get less performant result than with
> 8.2 (same performance as with 5.5).
>
> After a quick look, I see:
> - Irrelevant NOPs to align loops and stuff, allthough -mpcu=860 should
> avoid that.
> - A stack frame is set for saving r31 in __c_kernel_clock_gettime. GCC
> 8.1 don't need that, all VDSO functions are frameless with 8.1

If you think it should be fixed in gcc, maybe try to reproduce it in
https://godbolt.org/ and open a gcc bug against that.

Also, please try the gcc-10 snapshot, which has the highest chance
of getting fixes if it shows the same issue (or worse).

     Arnd

^ permalink raw reply	[flat|nested] 45+ messages in thread

* Re: Surprising code generated for vdso_read_begin()
  2020-02-19  9:52             ` Arnd Bergmann
@ 2020-02-19 13:08               ` Segher Boessenkool
  0 siblings, 0 replies; 45+ messages in thread
From: Segher Boessenkool @ 2020-02-19 13:08 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: Christophe Leroy, Benjamin Herrenschmidt, Paul Mackerras,
	Michael Ellerman, Thomas Gleixner, Vincenzo Frascino,
	Andy Lutomirski, the arch/x86 maintainers, linuxppc-dev,
	linux-kernel, Linux ARM, open list:BROADCOM NVRAM DRIVER

On Wed, Feb 19, 2020 at 10:52:16AM +0100, Arnd Bergmann wrote:
> On Wed, Feb 19, 2020 at 9:45 AM Christophe Leroy
> <christophe.leroy@c-s.fr> wrote:
> > Le 16/02/2020 à 19:10, Arnd Bergmann a écrit :
> > > On Sat, Jan 11, 2020 at 12:33 PM Segher Boessenkool
> > > <segher@kernel.crashing.org> wrote:
> > >>
> > >> On Fri, Jan 10, 2020 at 07:45:44AM +0100, Christophe Leroy wrote:
> > >>> Le 09/01/2020 à 21:07, Segher Boessenkool a écrit :
> > >>>> It looks like the compiler did loop peeling.  What GCC version is this?
> > >>>> Please try current trunk (to become GCC 10), or at least GCC 9?
> > >>>
> > >>> It is with GCC 5.5
> > >>>
> > >>> https://mirrors.edge.kernel.org/pub/tools/crosstool/ doesn't have more
> > >>> recent than 8.1
> > >>
> > >> Arnd, can you update the tools?  We are at 8.3 and 9.2 now :-)  Or is
> > >> this hard and/or painful to do?
> > >
> > > To follow up on this older thread, I have now uploaded 6.5, 7.5, 8.3 and 9.2
> > > binaries, as well as a recent 10.0 snapshot.
> > >
> >
> > Thanks Arnd,
> >
> > I have built the VDSO with 9.2, I get less performant result than with
> > 8.2 (same performance as with 5.5).
> >
> > After a quick look, I see:
> > - Irrelevant NOPs to align loops and stuff, allthough -mpcu=860 should
> > avoid that.
> > - A stack frame is set for saving r31 in __c_kernel_clock_gettime. GCC
> > 8.1 don't need that, all VDSO functions are frameless with 8.1
> 
> If you think it should be fixed in gcc, maybe try to reproduce it in
> https://godbolt.org/

(Feel free to skip this step; and don't put links to godbolt (or anything
else external) in our bugzilla, please; such links go stale before you
know it.)

> and open a gcc bug against that.

Yes please :-)

> Also, please try the gcc-10 snapshot, which has the highest chance
> of getting fixes if it shows the same issue (or worse).

If it is a regression, chances are it will be backported.  (But not to
9.3, which is due in just a few weeks, just like 8.4).  If it is just a
side effect of some other change, it will probably *not* be undone, not
on trunk (GCC 10) either.  It depends.

But sure, always test trunk if you can.


Segher

^ permalink raw reply	[flat|nested] 45+ messages in thread

end of thread, other threads:[~2020-02-19 13:08 UTC | newest]

Thread overview: 45+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-12-23 14:31 [RFC PATCH v2 00/10] powerpc/32: switch VDSO to C implementation Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 01/10] lib: vdso: ensure all arches have 32bit fallback Christophe Leroy
2019-12-24  2:07   ` Andy Lutomirski
2020-01-10 20:56     ` Thomas Gleixner
2020-01-10 21:02       ` Andy Lutomirski
2019-12-30 12:27   ` Arnd Bergmann
2020-01-02 11:29     ` Arnd Bergmann
2020-01-09 15:43       ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 02/10] lib: vdso: move call to fallback out of common code Christophe Leroy
2019-12-24  2:24   ` Andy Lutomirski
2019-12-24 11:41     ` christophe leroy
2019-12-24 12:09       ` Andy Lutomirski
2019-12-23 14:31 ` [RFC PATCH v2 03/10] lib: vdso: Change __cvdso_clock_gettime/getres_common() to __cvdso_clock_gettime/getres() Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 04/10] lib: vdso: get pointer to vdso data from the arch Christophe Leroy
2019-12-24  2:27   ` Andy Lutomirski
2019-12-24 11:53     ` christophe leroy
2019-12-24 12:15       ` Andy Lutomirski
2019-12-24 12:41         ` Andy Lutomirski
2019-12-24 14:46         ` Segher Boessenkool
2019-12-23 14:31 ` [RFC PATCH v2 05/10] lib: vdso: inline do_hres() Christophe Leroy
2019-12-24  2:29   ` Andy Lutomirski
2019-12-30 12:07   ` Arnd Bergmann
2020-01-10 21:07     ` Thomas Gleixner
2020-01-11  9:06       ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 06/10] lib: vdso: make do_coarse() return 0 Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 07/10] lib: vdso: don't use READ_ONCE() in __c_kernel_time() Christophe Leroy
2019-12-24  1:58   ` Andy Lutomirski
2019-12-24 11:12     ` christophe leroy
2019-12-24 12:04       ` Andy Lutomirski
2020-01-10 21:12   ` Thomas Gleixner
2020-01-11  8:05     ` Christophe Leroy
2020-01-11 11:07       ` Thomas Gleixner
2020-01-13  6:52         ` Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 08/10] lib: vdso: Avoid duplication in __cvdso_clock_getres() Christophe Leroy
2019-12-24  1:59   ` Andy Lutomirski
2019-12-23 14:31 ` [RFC PATCH v2 09/10] powerpc/vdso32: inline __get_datapage() Christophe Leroy
2019-12-23 14:31 ` [RFC PATCH v2 10/10] powerpc/32: Switch VDSO to C implementation Christophe Leroy
2020-01-09 17:52 ` Surprising code generated for vdso_read_begin() Christophe Leroy
2020-01-09 20:07   ` Segher Boessenkool
2020-01-10  6:45     ` Christophe Leroy
2020-01-11 11:33       ` Segher Boessenkool
2020-02-16 18:10         ` Arnd Bergmann
2020-02-19  8:45           ` Christophe Leroy
2020-02-19  9:52             ` Arnd Bergmann
2020-02-19 13:08               ` Segher Boessenkool

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).