* [PATCH] powerpc: slightly improve cache helpers
@ 2019-05-07 13:31 ` Christophe Leroy
0 siblings, 0 replies; 8+ messages in thread
From: Christophe Leroy @ 2019-05-07 13:31 UTC (permalink / raw)
To: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
Segher Boessenkool
Cc: linux-kernel, linuxppc-dev
Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
that are summed to obtain the target address. Using '%y0' argument
gives GCC the opportunity to use both registers instead of only one
with the second being forced to 0.
Suggested-by: Segher Boessenkool <segher@kernel.crashing.org>
Signed-off-by: Christophe Leroy <christophe.leroy@c-s.fr>
---
arch/powerpc/include/asm/cache.h | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 40ea5b3781c6..5a22a869a20b 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -85,22 +85,22 @@ extern void _set_L3CR(unsigned long);
 
 static inline void dcbz(void *addr)
 {
-	__asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbz %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbi(void *addr)
 {
-	__asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbi %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbf(void *addr)
 {
-	__asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbf %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbst(void *addr)
 {
-	__asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+	__asm__ __volatile__ ("dcbst %y0" : : "m"(*(u8 *)addr) : "memory");
 }
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
--
2.13.3
* Re: [PATCH] powerpc: slightly improve cache helpers
2019-05-07 13:31 ` Christophe Leroy
@ 2019-05-07 15:10 ` Segher Boessenkool
-1 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2019-05-07 15:10 UTC (permalink / raw)
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
linux-kernel, linuxppc-dev
Hi Christophe,
On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> that are summed to obtain the target address. Using '%y0' argument
> gives GCC the opportunity to use both registers instead of only one
> with the second being forced to 0.
That's not quite right. Sorry if I didn't explain it properly.
"m" allows all memory. But this instruction only allows reg,reg and
0,reg addressing. For that you need to use constraint "Z".
The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").
Segher
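
For illustration, a minimal sketch of what the suggestion above might look
like for dcbz: the "Z" constraint limits the operand to indexed or indirect
addressing, and "%y0" prints it in the "reg,reg" / "0,reg" form the
instruction accepts. The names dcbz_sketch and SKETCH_BLOCK_BYTES, the local
u8 typedef and the non-PowerPC memset() fallback are assumptions added here
so the snippet builds anywhere; they are not part of the patch.

```c
#include <string.h>

/* Assumption: stands in for the kernel's u8 type. */
typedef unsigned char u8;

/* Assumption: block size used only by the non-PowerPC fallback. */
#define SKETCH_BLOCK_BYTES 32

/* Hypothetical helper: zero the data cache block containing addr. */
static inline void dcbz_sketch(void *addr)
{
#if defined(__powerpc__) || defined(__powerpc64__)
	/*
	 * "Z" only accepts reg+reg or plain register-indirect addresses;
	 * %y0 prints the operand as "ra,rb" or "0,rb".
	 */
	__asm__ __volatile__ ("dcbz %y0" : : "Z"(*(u8 *)addr) : "memory");
#else
	/* Stand-in so the sketch compiles elsewhere: clear one assumed block. */
	memset(addr, 0, SKETCH_BLOCK_BYTES);
#endif
}
```

With "Z", GCC is free to pick either a single base register or a base plus
index pair, which is the flexibility the original commit message was after.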
* Re: [PATCH] powerpc: slightly improve cache helpers
2019-05-07 15:10 ` Segher Boessenkool
@ 2019-05-07 16:53 ` Christophe Leroy
-1 siblings, 0 replies; 8+ messages in thread
From: Christophe Leroy @ 2019-05-07 16:53 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
linux-kernel, linuxppc-dev
On 07/05/2019 at 17:10, Segher Boessenkool wrote:
> Hi Christophe,
>
> On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
>> Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
>> that are summed to obtain the target address. Using '%y0' argument
>> gives GCC the opportunity to use both registers instead of only one
>> with the second being forced to 0.
>
> That's not quite right. Sorry if I didn't explain it properly.
>
> "m" allows all memory. But this instruction only allows reg,reg and
> 0,reg addressing. For that you need to use constraint "Z".
But gcc help
(https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
says it is better to use 'm':
Z
Memory operand that is an indexed or indirect from a register (it
is usually better to use ‘m’ or ‘es’ in asm statements)
That's the reason why I used 'm', I thought it was equivalent.
Christophe
>
> The output modifier "%y0" just makes [reg] (i.e. simple indirect addressing)
> print as "0,reg" instead of "0(reg)" as it would by default (for just "%0").
>
>
> Segher
>
* Re: [PATCH] powerpc: slightly improve cache helpers
2019-05-07 16:53 ` Christophe Leroy
@ 2019-05-08 14:40 ` Segher Boessenkool
-1 siblings, 0 replies; 8+ messages in thread
From: Segher Boessenkool @ 2019-05-08 14:40 UTC (permalink / raw)
To: Christophe Leroy
Cc: Benjamin Herrenschmidt, Paul Mackerras, Michael Ellerman,
linux-kernel, linuxppc-dev
On Tue, May 07, 2019 at 06:53:30PM +0200, Christophe Leroy wrote:
> On 07/05/2019 at 17:10, Segher Boessenkool wrote:
> >On Tue, May 07, 2019 at 01:31:39PM +0000, Christophe Leroy wrote:
> >>Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
> >>that are summed to obtain the target address. Using '%y0' argument
> >>gives GCC the opportunity to use both registers instead of only one
> >>with the second being forced to 0.
> >
> >That's not quite right. Sorry if I didn't explain it properly.
> >
> >"m" allows all memory. But this instruction only allows reg,reg and
> >0,reg addressing. For that you need to use constraint "Z".
>
> But gcc help
> (https://gcc.gnu.org/onlinedocs/gcc/Machine-Constraints.html#Machine-Constraints)
> says it is better to use 'm':
It says it *usually* is better to use "m". What it really should say is
it is better to use "m" _when that is valid_. It is not valid for the
cache block instructions.
I'll fix up the comment... "es" is ancient, too, nowadays it is
equivalent to just "m" (and you need "m<>" to allow pre-modify addressing).
> Z
>
> Memory operand that is an indexed or indirect from a register (it
> is usually better to use ‘m’ or ‘es’ in asm statements)
>
> That's the reason why I used 'm', I thought it was equivalent.
Yeah, the manual text could be clearer.
Segher
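
As a contrast, here is a hedged sketch of a case where "m" is valid: an
ordinary load can use displacement or indexed addressing, so any memory
operand is acceptable and the "%X1" output modifier selects the indexed
mnemonic (lwzx) when GCC picks a reg+reg address. The helper name
load_word_sketch and the plain C fallback for non-PowerPC builds are
assumptions made for illustration only.

```c
/* Hypothetical helper: load one 32-bit word through inline asm. */
static inline unsigned int load_word_sketch(const unsigned int *p)
{
	unsigned int val;

#if defined(__powerpc__) || defined(__powerpc64__)
	/*
	 * "m" lets GCC choose d(ra) or ra,rb addressing; %X1 appends
	 * "x" to the mnemonic when the indexed form was chosen.
	 */
	__asm__ ("lwz%X1 %0,%1" : "=r"(val) : "m"(*p));
#else
	/* Stand-in so the sketch compiles on other targets. */
	val = *p;
#endif
	return val;
}
```

The cache block instructions have no displacement form, which is why the
same "m" operand is not safe for them.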
end of thread, other threads:[~2019-05-08 14:42 UTC | newest]
Thread overview:
2019-05-07 13:31 [PATCH] powerpc: slightly improve cache helpers Christophe Leroy
2019-05-07 15:10 ` Segher Boessenkool
2019-05-07 16:53 ` Christophe Leroy
2019-05-08 14:40 ` Segher Boessenkool