From: Christophe Leroy <christophe.leroy@c-s.fr>
To: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Michael Ellerman <mpe@ellerman.id.au>,
arnd@arndb.de, tglx@linutronix.de, vincenzo.frascino@arm.com,
luto@kernel.org
Cc: linux-kernel@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
linux-arm-kernel@lists.infradead.org, linux-mips@vger.kernel.org,
x86@kernel.org
Subject: [RFC PATCH v3 04/12] lib: vdso: inline do_hres() and do_coarse()
Date: Mon, 13 Jan 2020 17:08:42 +0000 (UTC)
Message-ID: <25d3e027aeef5cdbe1b205ecfbf8d80270fc2bd9.1578934751.git.christophe.leroy@c-s.fr>
In-Reply-To: <cover.1578934751.git.christophe.leroy@c-s.fr>
do_hres() is called from several places, so GCC doesn't inline
it on its own.
do_hres() takes a struct __kernel_timespec * parameter for
returning the result. In the 32-bit case, this parameter points
to a local variable in the caller. To provide a pointer
to this structure, the caller has to place it on its stack, and
do_hres() has to write the result to the stack. This is suboptimal,
especially on RISC processors like powerpc.
By making GCC inline the function, the struct __kernel_timespec
remains a local variable held in registers, avoiding the need to
write to and read from the stack.
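
To illustrate the point about the out-parameter, here is a minimal userspace sketch. It is not the vDSO code: struct ts64, struct ts32 and all function names are hypothetical stand-ins, and the attributes assume GCC or Clang. With the out-of-line helper, the caller's local struct must have an address, so it is stored on the stack. With the forced-inline helper, the compiler is free to keep both fields in registers; whether it actually does depends on the target and the optimisation level.

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/* Hypothetical stand-ins for the 64-bit and 32-bit timespec layouts. */
struct ts64 { int64_t tv_sec; int64_t tv_nsec; };
struct ts32 { int32_t tv_sec; int32_t tv_nsec; };

/* Out of line: the caller must pass a real address, so its local
 * struct ts64 has to live in memory (normally on the stack). */
static __attribute__((noinline)) int fill_outofline(uint64_t ns, struct ts64 *ts)
{
	ts->tv_sec = (int64_t)(ns / NSEC_PER_SEC);
	ts->tv_nsec = (int64_t)(ns % NSEC_PER_SEC);
	return 0;
}

/* Forced inline: same body, but once expanded in the caller the
 * pointer can be optimised away and the fields kept in registers. */
static inline __attribute__((always_inline)) int fill_inlined(uint64_t ns, struct ts64 *ts)
{
	ts->tv_sec = (int64_t)(ns / NSEC_PER_SEC);
	ts->tv_nsec = (int64_t)(ns % NSEC_PER_SEC);
	return 0;
}

/* Mimics a 32-bit entry point that narrows a 64-bit result. */
int gettime32_outofline(uint64_t ns, struct ts32 *res)
{
	struct ts64 ts;	/* address escapes to fill_outofline() */
	int ret = fill_outofline(ns, &ts);

	assert(res);
	if (ret == 0) {
		res->tv_sec = (int32_t)ts.tv_sec;
		res->tv_nsec = (int32_t)ts.tv_nsec;
	}
	return ret;
}

int gettime32_inlined(uint64_t ns, struct ts32 *res)
{
	struct ts64 ts;	/* address never leaves this function */
	int ret = fill_inlined(ns, &ts);

	assert(res);
	if (ret == 0) {
		res->tv_sec = (int32_t)ts.tv_sec;
		res->tv_nsec = (int32_t)ts.tv_nsec;
	}
	return ret;
}
```

Both variants return the same values; comparing the generated assembly of the two callers (for example with gcc -O2 -S) is one way to see whether the stack traffic disappears on a given target.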
The improvement is significant on powerpc:
Before:
gettimeofday: vdso: 1379 nsec/call
clock-gettime-realtime-coarse: vdso: 868 nsec/call
clock-gettime-realtime: vdso: 1511 nsec/call
clock-gettime-monotonic-raw: vdso: 1576 nsec/call
After:
gettimeofday: vdso: 1078 nsec/call
clock-gettime-realtime-coarse: vdso: 807 nsec/call
clock-gettime-realtime: vdso: 1256 nsec/call
clock-gettime-monotonic-raw: vdso: 1316 nsec/call
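
For reference, the relative gain can be worked out from the figures above. The helper below is only a sketch for that arithmetic; the struct and function names are made up for this note. Applied to the four measurements, it gives reductions between roughly 7% (realtime-coarse) and 22% (gettimeofday).

```c
#include <assert.h>
#include <stddef.h>

/* One row of the before/after measurements quoted above. */
struct bench {
	const char *name;
	unsigned int before_ns;
	unsigned int after_ns;
};

const struct bench results[] = {
	{ "gettimeofday",                  1379, 1078 },
	{ "clock-gettime-realtime-coarse",  868,  807 },
	{ "clock-gettime-realtime",        1511, 1256 },
	{ "clock-gettime-monotonic-raw",   1576, 1316 },
};

const size_t nr_results = sizeof(results) / sizeof(results[0]);

/* Relative reduction in call time, in percent of the "before" value. */
double reduction_pct(unsigned int before_ns, unsigned int after_ns)
{
	assert(before_ns > 0);
	return 100.0 * ((double)before_ns - (double)after_ns) / (double)before_ns;
}
```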
At the same time, change the return type of do_coarse() to int. This
improves the readability of the if/else if/else if/else chain
in __cvdso_clock_gettime_common().
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
lib/vdso/gettimeofday.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
diff --git a/lib/vdso/gettimeofday.c b/lib/vdso/gettimeofday.c
index 42bd8ab955fa..d75e44ba716f 100644
--- a/lib/vdso/gettimeofday.c
+++ b/lib/vdso/gettimeofday.c
@@ -38,8 +38,8 @@ u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
}
#endif
-static int do_hres(const struct vdso_data *vd, clockid_t clk,
- struct __kernel_timespec *ts)
+static __always_inline int do_hres(const struct vdso_data *vd, clockid_t clk,
+ struct __kernel_timespec *ts)
{
const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
u64 cycles, last, sec, ns;
@@ -68,8 +68,8 @@ static int do_hres(const struct vdso_data *vd, clockid_t clk,
return 0;
}
-static void do_coarse(const struct vdso_data *vd, clockid_t clk,
- struct __kernel_timespec *ts)
+static __always_inline int do_coarse(const struct vdso_data *vd, clockid_t clk,
+ struct __kernel_timespec *ts)
{
const struct vdso_timestamp *vdso_ts = &vd->basetime[clk];
u32 seq;
@@ -79,6 +79,8 @@ static void do_coarse(const struct vdso_data *vd, clockid_t clk,
ts->tv_sec = vdso_ts->sec;
ts->tv_nsec = vdso_ts->nsec;
} while (unlikely(vdso_read_retry(vd, seq)));
+
+ return 0;
}
static __maybe_unused int
@@ -96,15 +98,16 @@ __cvdso_clock_gettime_common(clockid_t clock, struct __kernel_timespec *ts)
* clocks are handled in the VDSO directly.
*/
msk = 1U << clock;
- if (likely(msk & VDSO_HRES)) {
- return do_hres(&vd[CS_HRES_COARSE], clock, ts);
- } else if (msk & VDSO_COARSE) {
- do_coarse(&vd[CS_HRES_COARSE], clock, ts);
- return 0;
- } else if (msk & VDSO_RAW) {
- return do_hres(&vd[CS_RAW], clock, ts);
- }
- return -1;
+ if (likely(msk & VDSO_HRES))
+ vd += CS_HRES_COARSE;
+ else if (msk & VDSO_COARSE)
+ return do_coarse(&vd[CS_HRES_COARSE], clock, ts);
+ else if (msk & VDSO_RAW)
+ vd += CS_RAW;
+ else
+ return -1;
+
+ return do_hres(vd, clock, ts);
}
static __maybe_unused int
--
2.13.3
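
A note on the reworked dispatch in the last hunk: the high-resolution and raw branches now only adjust the data pointer, and do_hres() is called from a single place at the end of the function. With __always_inline this appears to mean the body is expanded once in this function rather than twice. The standalone sketch below mirrors that control flow with mock types. Everything prefixed mock_ or MOCK_ is hypothetical, the bit masks are illustrative values rather than the kernel's definitions, and the attribute assumes GCC or Clang.

```c
#include <assert.h>

/* Illustrative constants only; not the kernel's definitions. */
#define MOCK_CS_HRES_COARSE	0
#define MOCK_CS_RAW		1
#define MOCK_HRES	((1U << 0) | (1U << 1))
#define MOCK_COARSE	(1U << 5)
#define MOCK_RAW	(1U << 4)

struct mock_data { int id; };

/* Stand-in for do_hres(): reports which data set it was given. */
static inline __attribute__((always_inline))
int mock_hres(const struct mock_data *vd, int *out)
{
	*out = 100 + vd->id;
	return 0;
}

/* Stand-in for do_coarse(): returns int so the caller can return it directly. */
static inline __attribute__((always_inline))
int mock_coarse(const struct mock_data *vd, int *out)
{
	*out = 200 + vd->id;
	return 0;
}

int mock_clock_gettime(const struct mock_data *vd, unsigned int clock, int *out)
{
	unsigned int msk;

	assert(vd && out);
	if (clock >= 32)
		return -1;

	msk = 1U << clock;
	if (msk & MOCK_HRES)
		vd += MOCK_CS_HRES_COARSE;	/* only select the data set */
	else if (msk & MOCK_COARSE)
		return mock_coarse(&vd[MOCK_CS_HRES_COARSE], out);
	else if (msk & MOCK_RAW)
		vd += MOCK_CS_RAW;		/* only select the data set */
	else
		return -1;

	/* Single call site shared by the high-resolution and raw paths. */
	return mock_hres(vd, out);
}
```

Called with an array of two mock_data entries, clock 0 goes through the shared call with the first entry, clock 4 through the same call with the second entry, clock 5 takes the coarse path, and any other clock returns -1.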