linux-parisc.vger.kernel.org archive mirror
* [PATCH][RFC] parisc: Use local tlb purges only on UP machines
@ 2022-09-25  6:56 Helge Deller
  2022-09-25 14:33 ` John David Anglin
  0 siblings, 1 reply; 9+ messages in thread
From: Helge Deller @ 2022-09-25  6:56 UTC (permalink / raw)
  To: linux-parisc, James Bottomley, John David Anglin

Limit usage of CPU-local tlb flushes in pacache.S to non-SMP machines.
On 32-bit kernels this was already the case; with this patch the same
behaviour now applies to 64-bit kernels too.

Signed-off-by: Helge Deller <deller@gmx.de>

diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S
index 9a0018f1f42c..920f6ef5c3e5 100644
--- a/arch/parisc/kernel/pacache.S
+++ b/arch/parisc/kernel/pacache.S
@@ -539,15 +539,10 @@ ENTRY_CFI(copy_user_page_asm)

 	/* Purge any old translations */

-#ifdef CONFIG_PA20
-	pdtlb,l		%r0(%r28)
-	pdtlb,l		%r0(%r29)
-#else
 0:	pdtlb		%r0(%r28)
 1:	pdtlb		%r0(%r29)
 	ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB)
 	ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SMP, INSN_PxTLB)
-#endif

 #ifdef CONFIG_64BIT
 	/* PA8x00 CPUs can consume 2 loads or 1 store per cycle.
@@ -670,12 +665,8 @@ ENTRY_CFI(clear_user_page_asm)

 	/* Purge any old translation */

-#ifdef CONFIG_PA20
-	pdtlb,l		%r0(%r28)
-#else
 0:	pdtlb		%r0(%r28)
 	ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB)
-#endif

 #ifdef CONFIG_64BIT
 	ldi		(PAGE_SIZE / 128), %r1
@@ -736,12 +727,8 @@ ENTRY_CFI(flush_dcache_page_asm)

 	/* Purge any old translation */

-#ifdef CONFIG_PA20
-	pdtlb,l		%r0(%r28)
-#else
 0:	pdtlb		%r0(%r28)
 	ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB)
-#endif

 88:	ldil		L%dcache_stride, %r1
 	ldw		R%dcache_stride(%r1), r31
@@ -785,12 +772,8 @@ ENTRY_CFI(purge_dcache_page_asm)

 	/* Purge any old translation */

-#ifdef CONFIG_PA20
-	pdtlb,l		%r0(%r28)
-#else
 0:	pdtlb		%r0(%r28)
 	ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB)
-#endif

 88:	ldil		L%dcache_stride, %r1
 	ldw		R%dcache_stride(%r1), r31
@@ -837,17 +820,11 @@ ENTRY_CFI(flush_icache_page_asm)
 	 * have a flat address space, it's not clear which TLB will be
 	 * used.  So, we purge both entries.  */

-#ifdef CONFIG_PA20
-	pdtlb,l		%r0(%r28)
-1:	pitlb,l         %r0(%sr4,%r28)
-	ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SPLIT_TLB, INSN_NOP)
-#else
 0:	pdtlb		%r0(%r28)
 1:	pitlb           %r0(%sr4,%r28)
 	ALTERNATIVE(0b, 0b+4, ALT_COND_NO_SMP, INSN_PxTLB)
 	ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SMP, INSN_PxTLB)
 	ALTERNATIVE(1b, 1b+4, ALT_COND_NO_SPLIT_TLB, INSN_NOP)
-#endif

 88:	ldil		L%icache_stride, %r1
 	ldw		R%icache_stride(%r1), %r31
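
For readers unfamiliar with the ALTERNATIVE() lines in the diff, the idea is roughly this: the broadcast purge is assembled by default, and a table entry asks the kernel to rewrite that instruction at boot if a condition (here ALT_COND_NO_SMP) holds. The userspace C sketch below only models that idea. The struct layout, the constant values, the bit position used for the local form, and the function name are all assumptions made for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative values only; the kernel's real constants may differ. */
#define ALT_COND_NO_SMP   0x01u      /* condition: only one CPU is in use */
#define INSN_PXTLB_LOCAL  0x02u      /* marker: convert the purge to its local form */
#define LOCAL_BIT         (1u << 10) /* hypothetical encoding bit for the ",l" form */

/* One patch-table entry: which instruction to touch, and under what condition. */
struct alt_entry {
    size_t   index;        /* index of the instruction word to patch */
    uint32_t cond;         /* condition mask that must be fully satisfied */
    uint32_t replacement;  /* literal replacement word, or a marker value */
};

/* Patch `code` in place and return the number of instructions rewritten. */
size_t apply_alternatives_model(uint32_t *code, size_t ncode,
                                const struct alt_entry *tab, size_t ntab,
                                uint32_t active_conds)
{
    size_t patched = 0;

    for (size_t i = 0; i < ntab; i++) {
        if ((tab[i].cond & active_conds) != tab[i].cond)
            continue;   /* condition not met: keep the original instruction */
        if (tab[i].index >= ncode)
            continue;   /* defensive: ignore out-of-range entries */

        if (tab[i].replacement == INSN_PXTLB_LOCAL)
            code[tab[i].index] |= LOCAL_BIT;   /* keep operands, set the local bit */
        else
            code[tab[i].index] = tab[i].replacement;
        patched++;
    }
    return patched;
}
```

With no condition active the instruction words are left untouched; with ALT_COND_NO_SMP active the matching word gains the local bit while its operands stay the same.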


* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25  6:56 [PATCH][RFC] parisc: Use local tlb purges only on UP machines Helge Deller
@ 2022-09-25 14:33 ` John David Anglin
  2022-09-25 17:51   ` Helge Deller
  0 siblings, 1 reply; 9+ messages in thread
From: John David Anglin @ 2022-09-25 14:33 UTC (permalink / raw)
  To: Helge Deller, linux-parisc, James Bottomley

I believe this change is wrong and will reduce performance.  The TLB setup for a TMPALIAS
flush is local to any given CPU.  So, we should only need a local TLB purge.

A local TLB purge doesn't require locking to serialize PxTLB broadcasts.  It is also faster than
a global TLB purge.

Indeed, the case that might be wrong is the one that uses pdtlb. It potentially needs serialization
on SMP machines.  See comment in pgtable.h.
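
The distinction drawn above (a broadcast purge may need serialization, a local purge does not) can be modelled in plain C. This is only a userspace sketch: the spinlock, the counters, and the function names are hypothetical stand-ins, and the purge itself is a placeholder rather than a real pdtlb.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a lock that serializes broadcast purges. */
static atomic_flag purge_lock = ATOMIC_FLAG_INIT;

/* Counters that make the sketch observable; not part of any real interface. */
atomic_ulong lock_acquisitions;
atomic_ulong purges_issued;

/* Placeholder for the actual purge instruction. */
static void issue_purge(unsigned long vaddr)
{
    (void)vaddr;
    atomic_fetch_add(&purges_issued, 1);
}

/* Broadcast purge: visible to every CPU, so callers take turns. */
void purge_tlb_broadcast(unsigned long vaddr)
{
    while (atomic_flag_test_and_set_explicit(&purge_lock, memory_order_acquire))
        ;   /* spin until the lock is free */
    atomic_fetch_add(&lock_acquisitions, 1);
    issue_purge(vaddr);
    atomic_flag_clear_explicit(&purge_lock, memory_order_release);
}

/* Local purge: affects only the calling CPU, so no lock is taken. */
void purge_tlb_local(unsigned long vaddr)
{
    issue_purge(vaddr);
}
```

The local path does strictly less work per call, which is the performance argument made above.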

Dave


-- 
John David Anglin  dave.anglin@bell.net



* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 14:33 ` John David Anglin
@ 2022-09-25 17:51   ` Helge Deller
  2022-09-25 18:19     ` John David Anglin
  0 siblings, 1 reply; 9+ messages in thread
From: Helge Deller @ 2022-09-25 17:51 UTC (permalink / raw)
  To: John David Anglin, linux-parisc, James Bottomley

On 9/25/22 16:33, John David Anglin wrote:
> I believe this change is wrong and will reduce performance.

Yes, could be. That's why I intentionally marked it RFC.

> The TLB setup for a TMPALIAS flush is local to any given CPU.  So, we
> should only need a local TLB purge.

Can those functions theoretically be preempted?
If so, could it be scheduled on another CPU where the TMPALIAS isn't flushed?

> A local TLB purge doesn't require locking to serialize PxTLB broadcasts.  It is also faster than
> a global TLB purge.

True.

> Indeed, the case that might be wrong is the one that uses pdtlb.

Which do you have in mind here?

> It potentially needs serialization
> on SMP machines.  See comment in pgtable.h.

One goal of that patch was to drop the CONFIG_PA20 ifdef case,
because a 32-bit kernel could be compiled for PA8000 in which case
the "pdtlb,l" will burn the machine.
I'll send another patch which takes care of this without changing
the local pdtlb purges on 64-bit CPUs.

Helge


* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 17:51   ` Helge Deller
@ 2022-09-25 18:19     ` John David Anglin
  2022-09-25 18:44       ` John David Anglin
  0 siblings, 1 reply; 9+ messages in thread
From: John David Anglin @ 2022-09-25 18:19 UTC (permalink / raw)
  To: Helge Deller, linux-parisc, James Bottomley

On 2022-09-25 1:51 p.m., Helge Deller wrote:
> On 9/25/22 16:33, John David Anglin wrote:
>> I believe this change is wrong and will reduce performance.
>
> Yes, could be. That's why I intentionally marked it RFC.
>
>> The TLB setup for a TMPALIAS flush is local to any given CPU.  So, we
>> should only need a local TLB purge.
>
> Can those functions theoretically be preempted?
> If so, could it be scheduled on another CPU where the TMPALIAS isn't flushed?
I think preempt_disable/preempt_enable are used before/after all calls to these routines.

It would be better if these calls were in the flush routines.
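
A minimal sketch of that suggestion, with the pinning moved into the flush routine so a caller cannot forget it. The kernel's preempt_disable()/preempt_enable() cannot be called from userspace, so this models them with a counter; every name ending in _model, and the wrapper flush_page_pinned(), is hypothetical.

```c
#include <assert.h>

/* Userspace stand-ins for preempt_disable()/preempt_enable(). */
int preempt_count_model;
void preempt_disable_model(void) { preempt_count_model++; }
void preempt_enable_model(void)  { preempt_count_model--; }

/* Preemption depth observed inside the most recent flush. */
int depth_seen_in_flush;

/* Placeholder for an assembly routine such as flush_dcache_page_asm. */
static void flush_page_asm_model(unsigned long phys, unsigned long vaddr)
{
    (void)phys;
    (void)vaddr;
    depth_seen_in_flush = preempt_count_model;
}

/* Hypothetical wrapper: the task stays on one CPU for the whole
 * set-up-alias, flush, purge sequence. */
void flush_page_pinned(unsigned long phys, unsigned long vaddr)
{
    preempt_disable_model();
    flush_page_asm_model(phys, vaddr);
    preempt_enable_model();
}
```

The flush always runs with the counter raised, and the counter is back at its previous value when the wrapper returns.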
>
>> A local TLB purge doesn't require locking to serialize PxTLB broadcasts.  It is also faster than
>> a global TLB purge.
>
> True.
>
>> Indeed, the case that might be wrong is the one that uses pdtlb.
>
> Which do you have in mind here?
The pdtlb instructions in these routines are not serialized. Theoretically, this could result in bus
contention and system malfunction.  Maybe only D and R class machines are PA 1.1 and SMP.
>
>> It potentially needs serialization
>> on SMP machines.  See comment in pgtable.h.
>
> One goal of that patch was to drop the CONFIG_PA20 ifdef case,
> because a 32-bit kernel could be compiled for PA8000 in which case
> the "pdtlb,l" will burn the machine.
Don't think so.  "pdtlb,l" is available on all PA 2.0 machines. It's not 64-bit specific.
> I'll send another patch which takes care of this without changing
> the local pdtlb purges on 64-bit CPUs.

Dave

-- 
John David Anglin  dave.anglin@bell.net



* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 18:19     ` John David Anglin
@ 2022-09-25 18:44       ` John David Anglin
  2022-09-25 18:58         ` Helge Deller
  0 siblings, 1 reply; 9+ messages in thread
From: John David Anglin @ 2022-09-25 18:44 UTC (permalink / raw)
  To: Helge Deller, linux-parisc, James Bottomley

On 2022-09-25 2:19 p.m., John David Anglin wrote:
>> One goal of that patch was to drop the CONFIG_PA20 ifdef case,
>> because a 32-bit kernel could be compiled for PA8000 in which case
>> the "pdtlb,l" will burn the machine.
> Don't think so.  "pdtlb,l" is available on all PA 2.0 machines. It's not 64-bit specific.
There is some difference in implementation between PA 1.1 and 2.0. 64-bit register
values are used in the PA 2.0 implementation.

Dave

-- 
John David Anglin  dave.anglin@bell.net



* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 18:44       ` John David Anglin
@ 2022-09-25 18:58         ` Helge Deller
  2022-09-25 19:27           ` John David Anglin
  0 siblings, 1 reply; 9+ messages in thread
From: Helge Deller @ 2022-09-25 18:58 UTC (permalink / raw)
  To: John David Anglin, linux-parisc, James Bottomley

On 9/25/22 20:44, John David Anglin wrote:
> On 2022-09-25 2:19 p.m., John David Anglin wrote:
>>> One goal of that patch was to drop the CONFIG_PA20 ifdef case,
>>> because a 32-bit kernel could be compiled for PA8000 in which case
>>> the "pdtlb,l" will burn the machine.
>> Don't think so.  "pdtlb,l" is available on all PA 2.0 machines. It's not 64-bit specific.
> There is some difference in implementation between PA 1.1 and 2.0. 64-bit register
> values are used in the PA 2.0 implementation.

That's right.
But if you build a 32-bit kernel and choose to optimize for PA8x00 CPUs,
the CONFIG_PA20 is set and the local-purge is used unconditionally.
That breaks e.g. when running such a kernel in qemu (which is 32-bit only).
See my just-posted patch.

Helge


* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 18:58         ` Helge Deller
@ 2022-09-25 19:27           ` John David Anglin
  2022-09-25 20:00             ` Helge Deller
  0 siblings, 1 reply; 9+ messages in thread
From: John David Anglin @ 2022-09-25 19:27 UTC (permalink / raw)
  To: Helge Deller, linux-parisc, James Bottomley

On 2022-09-25 2:58 p.m., Helge Deller wrote:
> On 9/25/22 20:44, John David Anglin wrote:
>> On 2022-09-25 2:19 p.m., John David Anglin wrote:
>>>> One goal of that patch was to drop the CONFIG_PA20 ifdef case,
>>>> because a 32-bit kernel could be compiled for PA8000 in which case
>>>> the "pdtlb,l" will burn the machine.
>>> Don't think so.  "pdtlb,l" is available on all PA 2.0 machines. It's not 64-bit specific.
>> There is some difference in implementation between PA 1.1 and 2.0. 64-bit register
>> values are used in the PA 2.0 implementation.
>
> That's right.
> But if you build a 32-bit kernel and choose to optimize for PA8x00 CPUs,
> the CONFIG_PA20 is set and the local-purge is used unconditionally.
> That breaks e.g. when running such a kernel in qemu (which is 32-bit only).
I don't think that's a valid kernel configuration for qemu.  It only supports the PA 1.1
instruction set.  PA8x00 CPUs always support the PA 2.0 instruction set even when running
in 32-bit mode.
> See my just-posted patch.

Dave

-- 
John David Anglin  dave.anglin@bell.net



* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 19:27           ` John David Anglin
@ 2022-09-25 20:00             ` Helge Deller
  2022-09-25 20:21               ` John David Anglin
  0 siblings, 1 reply; 9+ messages in thread
From: Helge Deller @ 2022-09-25 20:00 UTC (permalink / raw)
  To: John David Anglin, linux-parisc, James Bottomley

On 9/25/22 21:27, John David Anglin wrote:
> On 2022-09-25 2:58 p.m., Helge Deller wrote:
>> On 9/25/22 20:44, John David Anglin wrote:
>>> On 2022-09-25 2:19 p.m., John David Anglin wrote:
>>>>> One goal of that patch was to drop the CONFIG_PA20 ifdef case,
>>>>> because a 32-bit kernel could be compiled for PA8000 in which case
>>>>> the "pdtlb,l" will burn the machine.
>>>> Don't think so.  "pdtlb,l" is available on all PA 2.0 machines. It's not 64-bit specific.
>>> There is some difference in implementation between PA 1.1 and 2.0. 64-bit register
>>> values are used in the PA 2.0 implementation.
>>
>> That's right.
>> But if you build a 32-bit kernel and choose to optimize for PA8x00 CPUs,
>> the CONFIG_PA20 is set and the local-purge is used unconditionally.
>> That breaks e.g. when running such a kernel in qemu (which is 32-bit only).
> I don't think that's a valid kernel configuration for qemu.

I agree - it's probably not the best choice if you want to run qemu...

> It only supports the PA 1.1 instruction set.  PA8x00 CPUs always
> support the PA 2.0 instruction set even when running in 32-bit mode.
The 32-bit kernel is built with the 32-bit compiler.
As far as I understand arch/parisc/Makefile, choosing the CPU type
just enables *tuning* for the selected CPU. See Makefile:

# select which processor to optimise for
cflags-$(CONFIG_PA7000)         += -march=1.1 -mschedule=7100
cflags-$(CONFIG_PA7200)         += -march=1.1 -mschedule=7200
cflags-$(CONFIG_PA7100LC)       += -march=1.1 -mschedule=7100LC
cflags-$(CONFIG_PA7300LC)       += -march=1.1 -mschedule=7300
cflags-$(CONFIG_PA8X00)         += -march=2.0 -mschedule=8000

The only assembler instructions which break qemu are our manually
added ones: "pdtlb,l" and "ldd" (for flushing - see my first patch).
The two patches I sent today allow such a kernel to boot, and
they still keep the PA2.0 support when run on PA2.0 machines.
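
The difference between deciding at build time and deciding at boot can be sketched as follows. The patches referred to are not part of this thread, so this is an assumption about the general approach, not a description of them; the enum and function names are illustrative.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical CPU families; the names are illustrative. */
enum cpu_arch_model { ARCH_PA11, ARCH_PA20 };

enum purge_kind { PURGE_BROADCAST, PURGE_LOCAL };

/* Build-time style: the configuration option alone decides, so a kernel
 * built for PA8x00 uses the local form wherever it happens to boot. */
enum purge_kind pick_purge_compile_time(bool config_pa20)
{
    return config_pa20 ? PURGE_LOCAL : PURGE_BROADCAST;
}

/* Boot-time style: the local form is chosen only when the CPU the kernel
 * is actually running on is PA 2.0. */
enum purge_kind pick_purge_runtime(enum cpu_arch_model booted_on)
{
    return booted_on == ARCH_PA20 ? PURGE_LOCAL : PURGE_BROADCAST;
}
```

In this model a kernel configured for PA 2.0 but booted on a PA 1.1-only target picks the local form under the build-time rule and the broadcast form under the boot-time rule.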

Helge


* Re: [PATCH][RFC] parisc: Use local tlb purges only on UP machines
  2022-09-25 20:00             ` Helge Deller
@ 2022-09-25 20:21               ` John David Anglin
  0 siblings, 0 replies; 9+ messages in thread
From: John David Anglin @ 2022-09-25 20:21 UTC (permalink / raw)
  To: Helge Deller, linux-parisc, James Bottomley

On 2022-09-25 4:00 p.m., Helge Deller wrote:
>> It only supports the PA 1.1 instruction set.  PA8x00 CPUs always
>> support the PA 2.0 instruction set even when running in 32-bit mode.
> The 32-bit kernel is built with the 32-bit compiler.
> As far as I understand arch/parisc/Makefile, choosing the CPU type
> just enables *tuning* for the selected CPU. See Makefile:
>
> # select which processor to optimise for
> cflags-$(CONFIG_PA7000)         += -march=1.1 -mschedule=7100
> cflags-$(CONFIG_PA7200)         += -march=1.1 -mschedule=7200
> cflags-$(CONFIG_PA7100LC)       += -march=1.1 -mschedule=7100LC
> cflags-$(CONFIG_PA7300LC)       += -march=1.1 -mschedule=7300
> cflags-$(CONFIG_PA8X00)         += -march=2.0 -mschedule=8000
>
> The only assembler instructions which break qemu are our manually
> added ones: "pdtlb,l" and "ldd" (for flushing - see my first patch).
> The two patches I sent today allow such a kernel to boot, and
> they still keep the PA2.0 support when run on PA2.0 machines.
I think that patch is fine.  Probably the same needs to be done for the use of the "pdtlb,l"
instruction in the tmpalias flushes.

We used to have locking to serialize the TLB purges in the tmpalias flushes when
we used the pdtlb flush.

It's true the 32-bit compiler doesn't generate much that's not PA 1.1 compatible when
-march=2.0 is specified.  At one time, there was some push to allow 64-bit register use
in the 32-bit runtime.  Some HP-UX systems preserve the full 64 bits in context switches, etc.
So, one can get away with using 64-bit operations as long as one doesn't try to preserve
values across calls.

I believe "ldd" for prefetch works both in 32 and 64-bit PA 2.0 environments.

Dave

-- 
John David Anglin  dave.anglin@bell.net

