* [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline
@ 2017-10-27 22:25 Mark Salyzyn
2017-10-30 14:10 ` Mark Rutland
2017-10-30 15:59 ` Russell King - ARM Linux
0 siblings, 2 replies; 4+ messages in thread
From: Mark Salyzyn @ 2017-10-27 22:25 UTC (permalink / raw)
To: linux-kernel
Cc: Mark Salyzyn, James Morse, Russell King, Catalin Marinas,
Will Deacon, Andy Lutomirski, Dmitry Safonov, John Stultz,
Mark Rutland, Laura Abbott, Kees Cook, Ard Biesheuvel,
Andy Gross, Kevin Brodsky, Andrew Pinski, linux-arm-kernel,
Mark Salyzyn
Ensure monotonic and realtime are inline, small price to pay for
high volume common request.
Signed-off-by: Mark Salyzyn <salyzyn@android.com>
Cc: James Morse <james.morse@arm.com>
Cc: Russell King <linux@armlinux.org.uk>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Andy Lutomirski <luto@amacapital.net>
Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
Cc: John Stultz <john.stultz@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Laura Abbott <labbott@redhat.com>
Cc: Kees Cook <keescook@chromium.org>
Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Cc: Andy Gross <andy.gross@linaro.org>
Cc: Kevin Brodsky <kevin.brodsky@arm.com>
Cc: Andrew Pinski <apinski@cavium.com>
Cc: linux-kernel@vger.kernel.org
Cc: linux-arm-kernel@lists.infradead.org
v2:
- split first CL into 4 of 7 pieces
v3:
- rebase (unchanged)
---
arch/arm/vdso/vgettimeofday.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
index 5f596911bd53..71003a1997c4 100644
--- a/arch/arm/vdso/vgettimeofday.c
+++ b/arch/arm/vdso/vgettimeofday.c
@@ -99,7 +99,7 @@ static notrace int do_monotonic_coarse(const struct vdso_data *vd,
 
 #ifdef CONFIG_ARM_ARCH_TIMER
 
-static notrace u64 get_ns(const struct vdso_data *vd)
+static __always_inline notrace u64 get_ns(const struct vdso_data *vd)
 {
 	u64 cycle_delta;
 	u64 cycle_now;
@@ -115,7 +115,9 @@ static notrace u64 get_ns(const struct vdso_data *vd)
 	return nsec;
 }
 
-static notrace int do_realtime(const struct vdso_data *vd, struct timespec *ts)
+/* Code size doesn't matter (vdso is 4k/16k/64k anyway) and this is faster. */
+static __always_inline notrace int do_realtime(const struct vdso_data *vd,
+					       struct timespec *ts)
 {
 	u64 nsecs;
 	u32 seq;
@@ -137,7 +139,8 @@ static notrace int do_realtime(const struct vdso_data *vd, struct timespec *ts)
 	return 0;
 }
 
-static notrace int do_monotonic(const struct vdso_data *vd, struct timespec *ts)
+static __always_inline notrace int do_monotonic(const struct vdso_data *vd,
+						struct timespec *ts)
 {
 	struct timespec tomono;
 	u64 nsecs;
--
2.15.0.rc2.357.g7e34df9404-goog
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline
2017-10-27 22:25 [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline Mark Salyzyn
@ 2017-10-30 14:10 ` Mark Rutland
2017-10-30 15:59 ` Russell King - ARM Linux
1 sibling, 0 replies; 4+ messages in thread
From: Mark Rutland @ 2017-10-30 14:10 UTC (permalink / raw)
To: Mark Salyzyn
Cc: linux-kernel, James Morse, Russell King, Catalin Marinas,
Will Deacon, Andy Lutomirski, Dmitry Safonov, John Stultz,
Laura Abbott, Kees Cook, Ard Biesheuvel, Andy Gross,
Kevin Brodsky, Andrew Pinski, linux-arm-kernel, Mark Salyzyn
On Fri, Oct 27, 2017 at 03:25:28PM -0700, Mark Salyzyn wrote:
> Ensure monotonic and realtime are inline, small price to pay for
> high volume common request.
Does this make a noticeable difference on any workload?
What does this do to the binary size?
Thanks,
Mark.
>
> Signed-off-by: Mark Salyzyn <salyzyn@android.com>
> Cc: James Morse <james.morse@arm.com>
> Cc: Russell King <linux@armlinux.org.uk>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will.deacon@arm.com>
> Cc: Andy Lutomirski <luto@amacapital.net>
> Cc: Dmitry Safonov <dsafonov@virtuozzo.com>
> Cc: John Stultz <john.stultz@linaro.org>
> Cc: Mark Rutland <mark.rutland@arm.com>
> Cc: Laura Abbott <labbott@redhat.com>
> Cc: Kees Cook <keescook@chromium.org>
> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org>
> Cc: Andy Gross <andy.gross@linaro.org>
> Cc: Kevin Brodsky <kevin.brodsky@arm.com>
> Cc: Andrew Pinski <apinski@cavium.com>
> Cc: linux-kernel@vger.kernel.org
> Cc: linux-arm-kernel@lists.infradead.org
>
> v2:
> - split first CL into 4 of 7 pieces
>
> v3:
> - rebase (unchanged)
>
> ---
> arch/arm/vdso/vgettimeofday.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/arch/arm/vdso/vgettimeofday.c b/arch/arm/vdso/vgettimeofday.c
> index 5f596911bd53..71003a1997c4 100644
> --- a/arch/arm/vdso/vgettimeofday.c
> +++ b/arch/arm/vdso/vgettimeofday.c
> @@ -99,7 +99,7 @@ static notrace int do_monotonic_coarse(const struct vdso_data *vd,
>
> #ifdef CONFIG_ARM_ARCH_TIMER
>
> -static notrace u64 get_ns(const struct vdso_data *vd)
> +static __always_inline notrace u64 get_ns(const struct vdso_data *vd)
> {
> u64 cycle_delta;
> u64 cycle_now;
> @@ -115,7 +115,9 @@ static notrace u64 get_ns(const struct vdso_data *vd)
> return nsec;
> }
>
> -static notrace int do_realtime(const struct vdso_data *vd, struct timespec *ts)
> +/* Code size doesn't matter (vdso is 4k/16k/64k anyway) and this is faster. */
> +static __always_inline notrace int do_realtime(const struct vdso_data *vd,
> + struct timespec *ts)
> {
> u64 nsecs;
> u32 seq;
> @@ -137,7 +139,8 @@ static notrace int do_realtime(const struct vdso_data *vd, struct timespec *ts)
> return 0;
> }
>
> -static notrace int do_monotonic(const struct vdso_data *vd, struct timespec *ts)
> +static __always_inline notrace int do_monotonic(const struct vdso_data *vd,
> + struct timespec *ts)
> {
> struct timespec tomono;
> u64 nsecs;
> --
> 2.15.0.rc2.357.g7e34df9404-goog
>
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline
2017-10-27 22:25 [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline Mark Salyzyn
2017-10-30 14:10 ` Mark Rutland
@ 2017-10-30 15:59 ` Russell King - ARM Linux
2017-10-31 15:28 ` Mark Salyzyn
1 sibling, 1 reply; 4+ messages in thread
From: Russell King - ARM Linux @ 2017-10-30 15:59 UTC (permalink / raw)
To: Mark Salyzyn
Cc: linux-kernel, James Morse, Catalin Marinas, Will Deacon,
Andy Lutomirski, Dmitry Safonov, John Stultz, Mark Rutland,
Laura Abbott, Kees Cook, Ard Biesheuvel, Andy Gross,
Kevin Brodsky, Andrew Pinski, linux-arm-kernel, Mark Salyzyn
On Fri, Oct 27, 2017 at 03:25:28PM -0700, Mark Salyzyn wrote:
> Ensure monotonic and realtime are inline, small price to pay for
> high volume common request.
Is this just based on a hunch, or is it based on proper measurement?
If proper measurement, where's the data? What CPU was it measured
with? How does this change affect other CPUs?
--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 8.8Mbps down 630kbps up
According to speedtest.net: 8.21Mbps down 510kbps up
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [PATCH v3 04/12] arm: vdso: enforce monotonic and realtime as inline
2017-10-30 15:59 ` Russell King - ARM Linux
@ 2017-10-31 15:28 ` Mark Salyzyn
0 siblings, 0 replies; 4+ messages in thread
From: Mark Salyzyn @ 2017-10-31 15:28 UTC (permalink / raw)
To: Russell King - ARM Linux
Cc: linux-kernel, James Morse, Catalin Marinas, Will Deacon,
Andy Lutomirski, Dmitry Safonov, John Stultz, Mark Rutland,
Laura Abbott, Kees Cook, Ard Biesheuvel, Andy Gross,
Kevin Brodsky, Andrew Pinski, linux-arm-kernel, Mark Salyzyn
On 10/30/2017 08:59 AM, Russell King - ARM Linux wrote:
> On Fri, Oct 27, 2017 at 03:25:28PM -0700, Mark Salyzyn wrote:
>> Ensure monotonic and realtime are inline, small price to pay for
>> high volume common request.
> Is this just based on a hunch, or is it based on proper measurement?
> If proper measurement, where's the data? What CPU was it measured
> with? How does this change affect other CPUs?
>
It tested faster in the past. The story today is less conclusive, and
the change is not worth it.
[TL;DR]
Code size in all cases is about half of a 4K page, and the change in
size is small whether the functions are inlined or not.
Originally coded to match assembler for arm64. I tested it when I was
first formulating the series and found a 2-4% improvement on arm
(Nexus6, backport to 3.10) and arm64 (Nexus 6P, backport to 3.18). But
that was (a technological) eon ago.
However, I retested today, as-is, with the patch in and out, side by
side: clock_gettime for CLOCK_MONOTONIC, CLOCK_BOOTTIME and
CLOCK_REALTIME, locked cores, affinity to the little cores (0-3), 50M
iterations, device cooled down for 15 minutes between (vdso64+vdso32)
runs, 16 runs each, averaged, on a Hikey960, 4.9 kernel, GCC 4.9 -O2.
I get a slightly different story (with the complete private patch stack
that has vdso32):
vdso64
realtime: -4.8% (worse)
monotonic: +1.9% (better)
boottime: +3.2%
vdso32
realtime: +4.7% (better)
monotonic: +3.2%
boottime: +3.7%
The maximum deviation across the sample runs was on the order of +/-1%.
I cannot explain the (highly repeatable) anomaly of vdso64 realtime
being slower while vdso32 realtime is faster by about the same amount.
realtime is unique in the set because a common routine serves both
__vdso_clock_gettime and __vdso_gettimeofday, and it is where I
expected the gains (the hunch).
I have tried other combinations of forced inlines to try to cope with
the clock_gettime(CLOCK_REALTIME) speed, and found it to be a slippery
tuning exercise. As such, I now come to the conclusion that, given the
(small?) gains, it is better to trust the C compiler (especially if
this is used by a wider set of architectures) and drop this patch (and
its side effect for boottime) from the series.
It should be noted that, on the same test bench, the new C coded vdso64
is +2.9% and +11% faster for realtime and monotonic respectively than
the hand coded assembler it is replacing. Additional props for the C
compiler doing the "right thing".
-- Mark
^ permalink raw reply [flat|nested] 4+ messages in thread