From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Paul Mackerras <paulus@samba.org>,
Michael Ellerman <mpe@ellerman.id.au>,
nathanl@linux.ibm.com, anton@ozlabs.org,
linux-arch@vger.kernel.org, arnd@arndb.de,
linux-kernel@vger.kernel.org, luto@kernel.org,
tglx@linutronix.de, vincenzo.frascino@arm.com,
linuxppc-dev@lists.ozlabs.org
Subject: Re: [PATCH v10 2/5] powerpc/vdso: Prepare for switching VDSO to generic C implementation.
Date: Wed, 5 Aug 2020 09:03:07 -0500 [thread overview]
Message-ID: <20200805140307.GO6753@gate.crashing.org> (raw)
Message-ID: <20200805140307.ZYQS5rbR3gLGITp_jSiCW5MpmcQe4oRRb6tm_U-QHoQ@z> (raw)
In-Reply-To: <348528c33cd4007f3fee7fe643ef160843d09a6c.1596611196.git.christophe.leroy@csgroup.eu>
Hi!
On Wed, Aug 05, 2020 at 07:09:23AM +0000, Christophe Leroy wrote:
> Provide vdso_shift_ns(), as the generic x >> s gives the following
> bad result:
>
> 18: 35 25 ff e0 addic. r9,r5,-32
> 1c: 41 80 00 10 blt 2c <shift+0x14>
> 20: 7c 64 4c 30 srw r4,r3,r9
> 24: 38 60 00 00 li r3,0
> ...
> 2c: 54 69 08 3c rlwinm r9,r3,1,0,30
> 30: 21 45 00 1f subfic r10,r5,31
> 34: 7c 84 2c 30 srw r4,r4,r5
> 38: 7d 29 50 30 slw r9,r9,r10
> 3c: 7c 63 2c 30 srw r3,r3,r5
> 40: 7d 24 23 78 or r4,r9,r4
>
> In our case the shift is always <= 32. In addition, the upper 32 bits
> of the result are likely nul. Lets GCC know it, it also optimises the
> following calculations.
>
> With the patch, we get:
> 0: 21 25 00 20 subfic r9,r5,32
> 4: 7c 69 48 30 slw r9,r3,r9
> 8: 7c 84 2c 30 srw r4,r4,r5
> c: 7d 24 23 78 or r4,r9,r4
> 10: 7c 63 2c 30 srw r3,r3,r5
See below. Such code is valid on PowerPC for all shift < 64, and a
future version of GCC will do that (it is on various TODO lists, it is
bound to happen *some* day ;-), but it won't help you yet of course).
> +/*
> + * The macros sets two stack frames, one for the caller and one for the callee
> + * because there are no requirement for the caller to set a stack frame when
> + * calling VDSO so it may have omitted to set one, especially on PPC64
> + */
If the caller follows the ABI, there always is a stack frame. So what
is going on?
> +/*
> + * powerpc specific delta calculation.
> + *
> + * This variant removes the masking of the subtraction because the
> + * clocksource mask of all VDSO capable clocksources on powerpc is U64_MAX
> + * which would result in a pointless operation. The compiler cannot
> + * optimize it away as the mask comes from the vdso data and is not compile
> + * time constant.
> + */
It cannot optimise it because it does not know shift < 32. The code
below is incorrect for shift equal to 32, fwiw.
> +static __always_inline u64 vdso_calc_delta(u64 cycles, u64 last, u64 mask, u32 mult)
> +{
> + return (cycles - last) * mult;
> +}
> +#define vdso_calc_delta vdso_calc_delta
> +
> +#ifndef __powerpc64__
> +static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
> +{
> + u32 hi = ns >> 32;
> + u32 lo = ns;
> +
> + lo >>= shift;
> + lo |= hi << (32 - shift);
> + hi >>= shift;
> + if (likely(hi == 0))
> + return lo;
Removing these two lines shouldn't change generated object code? Or not
make it worse, at least.
> + return ((u64)hi << 32) | lo;
> +}
What does the compiler do for just
static __always_inline u64 vdso_shift_ns(u64 ns, unsigned long shift)
return ns >> (shift & 31);
}
?
Segher
next prev parent reply other threads:[~2020-08-05 17:02 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-08-05 7:09 [PATCH v10 0/5] powerpc: switch VDSO to C implementation Christophe Leroy
2020-08-05 7:09 ` [PATCH v10 1/5] powerpc/processor: Move cpu_relax() into asm/vdso/processor.h Christophe Leroy
2020-08-05 7:09 ` [PATCH v10 2/5] powerpc/vdso: Prepare for switching VDSO to generic C implementation Christophe Leroy
2020-08-05 14:03 ` Segher Boessenkool [this message]
2020-08-05 14:03 ` Segher Boessenkool
2020-08-05 16:40 ` Christophe Leroy
2020-08-05 16:40 ` Christophe Leroy
2020-08-05 18:40 ` Segher Boessenkool
2020-08-05 18:40 ` Segher Boessenkool
2020-08-06 5:46 ` Christophe Leroy
2020-08-05 16:51 ` Christophe Leroy
2020-08-05 16:51 ` Christophe Leroy
2020-08-05 20:55 ` Segher Boessenkool
2020-08-05 7:09 ` [PATCH v10 3/5] powerpc/vdso: Save and restore TOC pointer on PPC64 Christophe Leroy
2020-08-05 7:09 ` [PATCH v10 4/5] powerpc/vdso: Switch VDSO to generic C implementation Christophe Leroy
2020-08-05 7:09 ` Christophe Leroy
2020-08-05 7:09 ` [PATCH v10 5/5] powerpc/vdso: Provide __kernel_clock_gettime64() on vdso32 Christophe Leroy
2020-08-05 7:09 ` Christophe Leroy
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200805140307.GO6753@gate.crashing.org \
--to=segher@kernel.crashing.org \
--cc=anton@ozlabs.org \
--cc=arnd@arndb.de \
--cc=benh@kernel.crashing.org \
--cc=christophe.leroy@csgroup.eu \
--cc=linux-arch@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=luto@kernel.org \
--cc=mpe@ellerman.id.au \
--cc=nathanl@linux.ibm.com \
--cc=paulus@samba.org \
--cc=tglx@linutronix.de \
--cc=vincenzo.frascino@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).