All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-19 13:11 ` James Hogan
  0 siblings, 0 replies; 8+ messages in thread
From: James Hogan @ 2017-09-19 13:11 UTC (permalink / raw)
  To: linux-mips; +Cc: James Hogan, Ralf Baechle

The inline asm in __write_64bit_c0_split() modifies the 64-bit input
operand by shifting the high register left by 32, and constructing the
full 64-bit value in the low register (even on a 32-bit kernel), so if
that value is used again it could cause breakage as GCC would assume the
registers haven't changed when they have.

To quote the GCC extended asm documentation:
> Warning: Do not modify the contents of input-only operands (except for
> inputs tied to outputs). The compiler assumes that on exit from the
> asm statement these operands contain the same values as they had
> before executing the statement.

Avoid modifying the input by using a temporary variable as an output
which is modified instead of the input and not otherwise used. The asm
is always __volatile__ so GCC shouldn't optimise it out. The low
register of the temporary output is written before the high register of
the input is read, so we have two constraint alternatives, one where
both use the same registers (for when the input value isn't subsequently
used), and one with an early clobber on the output in case the low
output uses the same register as the high input. This allows the
resulting assembly to remain mostly unchanged.

A diff of a MIPS32r6 kernel reveals only three differences, two in
relation to write_c0_r10k_diag() in cpu_probe() (register allocation
rearranged slightly but otherwise identical), and one in relation to
write_c0_cvmmemctl2() in kvm_vz_local_flush_guesttlb_all(), but the
octeon CPU is only supported on 64-bit kernels where
__write_64bit_c0_split() isn't used so that shouldn't matter in
practice. So there currently doesn't appear to be anything broken by
this bug.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
---
 arch/mips/include/asm/mipsregs.h | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index e4ed1bc9a734..a6810923b3f0 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -1377,29 +1377,32 @@ do {									\
 
 #define __write_64bit_c0_split(source, sel, val)			\
 do {									\
+	unsigned long long __tmp;					\
 	unsigned long __flags;						\
 									\
 	local_irq_save(__flags);					\
 	if (sel == 0)							\
 		__asm__ __volatile__(					\
 			".set\tmips64\n\t"				\
-			"dsll\t%L0, %L0, 32\n\t"			\
+			"dsll\t%L0, %L1, 32\n\t"			\
 			"dsrl\t%L0, %L0, 32\n\t"			\
-			"dsll\t%M0, %M0, 32\n\t"			\
+			"dsll\t%M0, %M1, 32\n\t"			\
 			"or\t%L0, %L0, %M0\n\t"				\
 			"dmtc0\t%L0, " #source "\n\t"			\
 			".set\tmips0"					\
-			: : "r" (val));					\
+			: "=&r,r" (__tmp)				\
+			: "r,0" (val));					\
 	else								\
 		__asm__ __volatile__(					\
 			".set\tmips64\n\t"				\
-			"dsll\t%L0, %L0, 32\n\t"			\
+			"dsll\t%L0, %L1, 32\n\t"			\
 			"dsrl\t%L0, %L0, 32\n\t"			\
-			"dsll\t%M0, %M0, 32\n\t"			\
+			"dsll\t%M0, %M1, 32\n\t"			\
 			"or\t%L0, %L0, %M0\n\t"				\
 			"dmtc0\t%L0, " #source ", " #sel "\n\t"		\
 			".set\tmips0"					\
-			: : "r" (val));					\
+			: "=&r,r" (__tmp)				\
+			: "r,0" (val));					\
 	local_irq_restore(__flags);					\
 } while (0)
 
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-19 13:11 ` James Hogan
  0 siblings, 0 replies; 8+ messages in thread
From: James Hogan @ 2017-09-19 13:11 UTC (permalink / raw)
  To: linux-mips; +Cc: James Hogan, Ralf Baechle

The inline asm in __write_64bit_c0_split() modifies the 64-bit input
operand by shifting the high register left by 32, and constructing the
full 64-bit value in the low register (even on a 32-bit kernel), so if
that value is used again it could cause breakage as GCC would assume the
registers haven't changed when they have.

To quote the GCC extended asm documentation:
> Warning: Do not modify the contents of input-only operands (except for
> inputs tied to outputs). The compiler assumes that on exit from the
> asm statement these operands contain the same values as they had
> before executing the statement.

Avoid modifying the input by using a temporary variable as an output
which is modified instead of the input and not otherwise used. The asm
is always __volatile__ so GCC shouldn't optimise it out. The low
register of the temporary output is written before the high register of
the input is read, so we have two constraint alternatives, one where
both use the same registers (for when the input value isn't subsequently
used), and one with an early clobber on the output in case the low
output uses the same register as the high input. This allows the
resulting assembly to remain mostly unchanged.

A diff of a MIPS32r6 kernel reveals only three differences, two in
relation to write_c0_r10k_diag() in cpu_probe() (register allocation
rearranged slightly but otherwise identical), and one in relation to
write_c0_cvmmemctl2() in kvm_vz_local_flush_guesttlb_all(), but the
octeon CPU is only supported on 64-bit kernels where
__write_64bit_c0_split() isn't used so that shouldn't matter in
practice. So there currently doesn't appear to be anything broken by
this bug.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: linux-mips@linux-mips.org
---
 arch/mips/include/asm/mipsregs.h | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/mips/include/asm/mipsregs.h b/arch/mips/include/asm/mipsregs.h
index e4ed1bc9a734..a6810923b3f0 100644
--- a/arch/mips/include/asm/mipsregs.h
+++ b/arch/mips/include/asm/mipsregs.h
@@ -1377,29 +1377,32 @@ do {									\
 
 #define __write_64bit_c0_split(source, sel, val)			\
 do {									\
+	unsigned long long __tmp;					\
 	unsigned long __flags;						\
 									\
 	local_irq_save(__flags);					\
 	if (sel == 0)							\
 		__asm__ __volatile__(					\
 			".set\tmips64\n\t"				\
-			"dsll\t%L0, %L0, 32\n\t"			\
+			"dsll\t%L0, %L1, 32\n\t"			\
 			"dsrl\t%L0, %L0, 32\n\t"			\
-			"dsll\t%M0, %M0, 32\n\t"			\
+			"dsll\t%M0, %M1, 32\n\t"			\
 			"or\t%L0, %L0, %M0\n\t"				\
 			"dmtc0\t%L0, " #source "\n\t"			\
 			".set\tmips0"					\
-			: : "r" (val));					\
+			: "=&r,r" (__tmp)				\
+			: "r,0" (val));					\
 	else								\
 		__asm__ __volatile__(					\
 			".set\tmips64\n\t"				\
-			"dsll\t%L0, %L0, 32\n\t"			\
+			"dsll\t%L0, %L1, 32\n\t"			\
 			"dsrl\t%L0, %L0, 32\n\t"			\
-			"dsll\t%M0, %M0, 32\n\t"			\
+			"dsll\t%M0, %M1, 32\n\t"			\
 			"or\t%L0, %L0, %M0\n\t"				\
 			"dmtc0\t%L0, " #source ", " #sel "\n\t"		\
 			".set\tmips0"					\
-			: : "r" (val));					\
+			: "=&r,r" (__tmp)				\
+			: "r,0" (val));					\
 	local_irq_restore(__flags);					\
 } while (0)
 
-- 
2.14.1

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 15:29   ` Maciej W. Rozycki
  0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2017-09-29 15:29 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Ralf Baechle

On Tue, 19 Sep 2017, James Hogan wrote:

> Avoid modifying the input by using a temporary variable as an output
> which is modified instead of the input and not otherwise used. The asm
> is always __volatile__ so GCC shouldn't optimise it out. The low
> register of the temporary output is written before the high register of
> the input is read, so we have two constraint alternatives, one where
> both use the same registers (for when the input value isn't subsequently
> used), and one with an early clobber on the output in case the low
> output uses the same register as the high input. This allows the
> resulting assembly to remain mostly unchanged.

 Clever and well-spotted!

Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>

 NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
use one instruction less; patch posted separately).

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 15:29   ` Maciej W. Rozycki
  0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2017-09-29 15:29 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Ralf Baechle

On Tue, 19 Sep 2017, James Hogan wrote:

> Avoid modifying the input by using a temporary variable as an output
> which is modified instead of the input and not otherwise used. The asm
> is always __volatile__ so GCC shouldn't optimise it out. The low
> register of the temporary output is written before the high register of
> the input is read, so we have two constraint alternatives, one where
> both use the same registers (for when the input value isn't subsequently
> used), and one with an early clobber on the output in case the low
> output uses the same register as the high input. This allows the
> resulting assembly to remain mostly unchanged.

 Clever and well-spotted!

Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>

 NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
use one instruction less; patch posted separately).

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 16:16     ` James Hogan
  0 siblings, 0 replies; 8+ messages in thread
From: James Hogan @ 2017-09-29 16:16 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-mips, Ralf Baechle

[-- Attachment #1: Type: text/plain, Size: 1251 bytes --]

On Fri, Sep 29, 2017 at 04:29:25PM +0100, Maciej W. Rozycki wrote:
> On Tue, 19 Sep 2017, James Hogan wrote:
> 
> > Avoid modifying the input by using a temporary variable as an output
> > which is modified instead of the input and not otherwise used. The asm
> > is always __volatile__ so GCC shouldn't optimise it out. The low
> > register of the temporary output is written before the high register of
> > the input is read, so we have two constraint alternatives, one where
> > both use the same registers (for when the input value isn't subsequently
> > used), and one with an early clobber on the output in case the low
> > output uses the same register as the high input. This allows the
> > resulting assembly to remain mostly unchanged.
> 
>  Clever and well-spotted!
> 
> Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>

Thanks Maciej!

> 
>  NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
> use one instruction less; patch posted separately).

Yes, that did occur to me, but it seemed simpler for this non-critical
case to use the more compatible sequence (whereas for the equivalent VZ
guest accessors, which imply R5+, I've used DINS in my not-yet-submitted
patches).

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 16:16     ` James Hogan
  0 siblings, 0 replies; 8+ messages in thread
From: James Hogan @ 2017-09-29 16:16 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-mips, Ralf Baechle

[-- Attachment #1: Type: text/plain, Size: 1251 bytes --]

On Fri, Sep 29, 2017 at 04:29:25PM +0100, Maciej W. Rozycki wrote:
> On Tue, 19 Sep 2017, James Hogan wrote:
> 
> > Avoid modifying the input by using a temporary variable as an output
> > which is modified instead of the input and not otherwise used. The asm
> > is always __volatile__ so GCC shouldn't optimise it out. The low
> > register of the temporary output is written before the high register of
> > the input is read, so we have two constraint alternatives, one where
> > both use the same registers (for when the input value isn't subsequently
> > used), and one with an early clobber on the output in case the low
> > output uses the same register as the high input. This allows the
> > resulting assembly to remain mostly unchanged.
> 
>  Clever and well-spotted!
> 
> Reviewed-by: Maciej W. Rozycki <macro@imgtec.com>

Thanks Maciej!

> 
>  NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
> use one instruction less; patch posted separately).

Yes, that did occur to me, but it seemed simpler for this non-critical
case to use the more compatible sequence (whereas for the equivalent VZ
guest accessors, which imply R5+, I've used DINS in my not-yet-submitted
patches).

Cheers
James

[-- Attachment #2: Digital signature --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 23:12       ` Maciej W. Rozycki
  0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2017-09-29 23:12 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Ralf Baechle

On Fri, 29 Sep 2017, James Hogan wrote:

> >  NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
> > use one instruction less; patch posted separately).
> 
> Yes, that did occur to me, but it seemed simpler for this non-critical
> case to use the more compatible sequence (whereas for the equivalent VZ
> guest accessors, which imply R5+, I've used DINS in my not-yet-submitted
> patches).

 It would have to be a separate change anyway, however your observation 
about 32-bit EJTAG/Cache exception handlers clobbering the temporary 
64-bit value made in the other thread also applies here and we have R2 
code variants for other stuff already, so I think it would be a worthwhile 
improvement.

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: [PATCH] MIPS: Fix input modify in __write_64bit_c0_split()
@ 2017-09-29 23:12       ` Maciej W. Rozycki
  0 siblings, 0 replies; 8+ messages in thread
From: Maciej W. Rozycki @ 2017-09-29 23:12 UTC (permalink / raw)
  To: James Hogan; +Cc: linux-mips, Ralf Baechle

On Fri, 29 Sep 2017, James Hogan wrote:

> >  NB you could use DINS on MIPS64r2+ (and `__read_64bit_c0_split' could 
> > use one instruction less; patch posted separately).
> 
> Yes, that did occur to me, but it seemed simpler for this non-critical
> case to use the more compatible sequence (whereas for the equivalent VZ
> guest accessors, which imply R5+, I've used DINS in my not-yet-submitted
> patches).

 It would have to be a separate change anyway, however your observation 
about 32-bit EJTAG/Cache exception handlers clobbering the temporary 
64-bit value made in the other thread also applies here and we have R2 
code variants for other stuff already, so I think it would be a worthwhile 
improvement.

  Maciej

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2017-09-29 23:13 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2017-09-19 13:11 [PATCH] MIPS: Fix input modify in __write_64bit_c0_split() James Hogan
2017-09-19 13:11 ` James Hogan
2017-09-29 15:29 ` Maciej W. Rozycki
2017-09-29 15:29   ` Maciej W. Rozycki
2017-09-29 16:16   ` James Hogan
2017-09-29 16:16     ` James Hogan
2017-09-29 23:12     ` Maciej W. Rozycki
2017-09-29 23:12       ` Maciej W. Rozycki

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.