linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
@ 2022-04-01 14:06 Laurent Dufour
  2022-04-13  5:58 ` Nicholas Piggin
  2022-05-03 15:06 ` Michael Ellerman
  0 siblings, 2 replies; 11+ messages in thread
From: Laurent Dufour @ 2022-04-01 14:06 UTC (permalink / raw)
  To: mpe; +Cc: linux-kernel, linuxppc-dev, stable, Nicholas Piggin

RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits
mode (MSR[SF] unset).

The change in MSR is done in enter_rtas() in a relatively complex way,
since the MSR value could be hardcoded.

Furthermore, a panic has been reported when hitting the watchdog interrupt
while running in RTAS, this leads to the following stack trace:

[69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
[69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
[69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
[69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
[69244.027587][   C24] Supported: No, Unreleased kernel
[69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
[69244.027609][   C24] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
[69244.027612][   C24] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
[69244.027615][   C24] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
[69244.027625][   C24] CFAR: 000000000000011c IRQMASK: 1
[69244.027625][   C24] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
[69244.027625][   C24] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
[69244.027625][   C24] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
[69244.027625][   C24] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
[69244.027625][   C24] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
[69244.027625][   C24] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
[69244.027625][   C24] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
[69244.027625][   C24] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
[69244.027663][   C24] NIP [000000001fb41050] 0x1fb41050
[69244.027696][   C24] LR [000000001fb4104c] 0x1fb4104c
[69244.027699][   C24] Call Trace:
[69244.027701][   C24] Instruction dump:
[69244.027723][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[69244.027728][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
[69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
[69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
[69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
[69244.028307][T87504] Supported: No, Unreleased kernel
[69244.028385][T87504] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
[69244.028408][T87504] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
[69244.028418][T87504] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
[69244.028429][T87504] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
[69244.028444][T87504] CFAR: 000000000000011c IRQMASK: 1
[69244.028444][T87504] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
[69244.028444][T87504] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
[69244.028444][T87504] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
[69244.028444][T87504] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
[69244.028444][T87504] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
[69244.028444][T87504] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
[69244.028444][T87504] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
[69244.028444][T87504] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
[69244.028534][T87504] NIP [000000001fb41050] 0x1fb41050
[69244.028543][T87504] LR [000000001fb4104c] 0x1fb4104c
[69244.028549][T87504] Call Trace:
[69244.028554][T87504] Instruction dump:
[69244.028561][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[69244.028575][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
[69244.028607][T87504] ---[ end trace 3ddec07f638c34a2 ]---

This happens because MSR[RI] is unset when entering RTAS but there is no
valid reason to not set it here.

RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
"7.2.1 Machine State":

 R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its
 own critical regions from recursion by setting the MSRRI bit to 0 when in
 the critical regions.

Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
hardcoded value meaning real mode, 32 bits and Recoverable Interrupt is
loaded.

In addition a check is added in do_enter_rtas() to detect calls made with
MSR[RI] unset, as we are forcing it on later.

This patch has been tested on the following machines:
Power KVM Guest
  P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
PowerVM LPAR
  P8 9119-MME (FW860.A1)
  p9 9008-22L (FW950.00)
  P10 9080-HEX (FW1010.00)

Changes in V2:
 - Change comment in code to indicate NMI (Nick's comment)
 - Add reference to PAPR+ in the change log (Michael's comment)

Cc: stable@vger.kernel.org
Suggested-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
---
 arch/powerpc/kernel/entry_64.S | 20 ++++++++------------
 arch/powerpc/kernel/rtas.c     |  5 +++++
 2 files changed, 13 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
index 9581906b5ee9..65cb14b56f8d 100644
--- a/arch/powerpc/kernel/entry_64.S
+++ b/arch/powerpc/kernel/entry_64.S
@@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
 	clrldi	r4,r4,2			/* convert to realmode address */
        	mtlr	r4
 
-	li	r0,0
-	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
-	andc	r0,r6,r0
-	
-        li      r9,1
-        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
-	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
-	andc	r6,r0,r9
-
 __enter_rtas:
-	sync				/* disable interrupts so SRR0/1 */
-	mtmsrd	r0			/* don't get trashed */
-
 	LOAD_REG_ADDR(r4, rtas)
 	ld	r5,RTASENTRY(r4)	/* get the rtas->entry value */
 	ld	r4,RTASBASE(r4)		/* get the rtas->base value */
+
+	/* RTAS runs in 32bits real mode but let MSR[]RI on as we may hit
+	 * NMI (SRESET or MCE). RTAS should disable RI in its critical
+	 * regions (as specified in PAPR+ section 7.2.1). */
+	LOAD_REG_IMMEDIATE(r6, MSR_ME|MSR_RI)
+
+	li      r0,0
+	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
 	
 	mtspr	SPRN_SRR0,r5
 	mtspr	SPRN_SRR1,r6
diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
index 1f42aabbbab3..d7775b8c8853 100644
--- a/arch/powerpc/kernel/rtas.c
+++ b/arch/powerpc/kernel/rtas.c
@@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
 
 static inline void do_enter_rtas(unsigned long args)
 {
+	unsigned long msr;
+
+	msr = mfmsr();
+	BUG_ON(!(msr & MSR_RI));
+
 	enter_rtas(args);
 
 	srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
-- 
2.35.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-04-01 14:06 [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS Laurent Dufour
@ 2022-04-13  5:58 ` Nicholas Piggin
  2022-04-21 14:09   ` Laurent Dufour
  2022-05-03 15:06 ` Michael Ellerman
  1 sibling, 1 reply; 11+ messages in thread
From: Nicholas Piggin @ 2022-04-13  5:58 UTC (permalink / raw)
  To: Laurent Dufour, mpe; +Cc: linux-kernel, linuxppc-dev, stable

Excerpts from Laurent Dufour's message of April 2, 2022 12:06 am:
> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits
> mode (MSR[SF] unset).
> 
> The change in MSR is done in enter_rtas() in a relatively complex way,
> since the MSR value could be hardcoded.
> 
> Furthermore, a panic has been reported when hitting the watchdog interrupt
> while running in RTAS, this leads to the following stack trace:
> 
> [69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
> [69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
> [69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
> [69244.027587][   C24] Supported: No, Unreleased kernel
> [69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
> [69244.027609][   C24] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
> [69244.027612][   C24] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
> [69244.027615][   C24] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
> [69244.027625][   C24] CFAR: 000000000000011c IRQMASK: 1
> [69244.027625][   C24] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
> [69244.027625][   C24] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
> [69244.027625][   C24] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
> [69244.027625][   C24] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
> [69244.027625][   C24] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
> [69244.027625][   C24] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
> [69244.027625][   C24] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
> [69244.027625][   C24] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
> [69244.027663][   C24] NIP [000000001fb41050] 0x1fb41050
> [69244.027696][   C24] LR [000000001fb4104c] 0x1fb4104c
> [69244.027699][   C24] Call Trace:
> [69244.027701][   C24] Instruction dump:
> [69244.027723][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.027728][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
> [69244.028307][T87504] Supported: No, Unreleased kernel
> [69244.028385][T87504] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
> [69244.028408][T87504] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
> [69244.028418][T87504] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
> [69244.028429][T87504] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
> [69244.028444][T87504] CFAR: 000000000000011c IRQMASK: 1
> [69244.028444][T87504] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
> [69244.028444][T87504] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
> [69244.028444][T87504] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
> [69244.028444][T87504] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
> [69244.028444][T87504] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
> [69244.028444][T87504] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
> [69244.028444][T87504] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
> [69244.028444][T87504] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
> [69244.028534][T87504] NIP [000000001fb41050] 0x1fb41050
> [69244.028543][T87504] LR [000000001fb4104c] 0x1fb4104c
> [69244.028549][T87504] Call Trace:
> [69244.028554][T87504] Instruction dump:
> [69244.028561][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.028575][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.028607][T87504] ---[ end trace 3ddec07f638c34a2 ]---
> 
> This happens because MSR[RI] is unset when entering RTAS but there is no
> valid reason to not set it here.
> 
> RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
> "7.2.1 Machine State":
> 
>  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its
>  own critical regions from recursion by setting the MSRRI bit to 0 when in
>  the critical regions.
> 
> Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
> hardcoded value meaning real mode, 32 bits and Recoverable Interrupt is
> loaded.
> 
> In addition a check is added in do_enter_rtas() to detect calls made with
> MSR[RI] unset, as we are forcing it on later.
> 
> This patch has been tested on the following machines:
> Power KVM Guest
>   P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
> PowerVM LPAR
>   P8 9119-MME (FW860.A1)
>   p9 9008-22L (FW950.00)
>   P10 9080-HEX (FW1010.00)
> 
> Changes in V2:
>  - Change comment in code to indicate NMI (Nick's comment)
>  - Add reference to PAPR+ in the change log (Michael's comment)
> 
> Cc: stable@vger.kernel.org
> Suggested-by: Nicholas Piggin <npiggin@gmail.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>

Still looks good,

Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

> ---
>  arch/powerpc/kernel/entry_64.S | 20 ++++++++------------
>  arch/powerpc/kernel/rtas.c     |  5 +++++
>  2 files changed, 13 insertions(+), 12 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 9581906b5ee9..65cb14b56f8d 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>  	clrldi	r4,r4,2			/* convert to realmode address */
>         	mtlr	r4
>  
> -	li	r0,0
> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
> -	andc	r0,r6,r0
> -	
> -        li      r9,1
> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
> -	andc	r6,r0,r9
> -
>  __enter_rtas:
> -	sync				/* disable interrupts so SRR0/1 */
> -	mtmsrd	r0			/* don't get trashed */
> -
>  	LOAD_REG_ADDR(r4, rtas)
>  	ld	r5,RTASENTRY(r4)	/* get the rtas->entry value */
>  	ld	r4,RTASBASE(r4)		/* get the rtas->base value */
> +
> +	/* RTAS runs in 32bits real mode but let MSR[]RI on as we may hit
> +	 * NMI (SRESET or MCE). RTAS should disable RI in its critical
> +	 * regions (as specified in PAPR+ section 7.2.1). */
> +	LOAD_REG_IMMEDIATE(r6, MSR_ME|MSR_RI)
> +
> +	li      r0,0
> +	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
>  	
>  	mtspr	SPRN_SRR0,r5
>  	mtspr	SPRN_SRR1,r6
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1f42aabbbab3..d7775b8c8853 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>  
>  static inline void do_enter_rtas(unsigned long args)
>  {
> +	unsigned long msr;
> +
> +	msr = mfmsr();
> +	BUG_ON(!(msr & MSR_RI));
> +
>  	enter_rtas(args);
>  
>  	srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
> -- 
> 2.35.1
> 
> 

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-04-13  5:58 ` Nicholas Piggin
@ 2022-04-21 14:09   ` Laurent Dufour
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Dufour @ 2022-04-21 14:09 UTC (permalink / raw)
  To: mpe; +Cc: linuxppc-dev, linux-kernel, stable, Nicholas Piggin

On 13/04/2022, 07:58:42, Nicholas Piggin wrote:
> Excerpts from Laurent Dufour's message of April 2, 2022 12:06 am:
>> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits
>> mode (MSR[SF] unset).
>>
>> The change in MSR is done in enter_rtas() in a relatively complex way,
>> since the MSR value could be hardcoded.
>>
>> Furthermore, a panic has been reported when hitting the watchdog interrupt
>> while running in RTAS, this leads to the following stack trace:
>>
>> [69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
>> [69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
>> [69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
>> [69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
>> [69244.027587][   C24] Supported: No, Unreleased kernel
>> [69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
>> [69244.027609][   C24] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
>> [69244.027612][   C24] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
>> [69244.027615][   C24] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
>> [69244.027625][   C24] CFAR: 000000000000011c IRQMASK: 1
>> [69244.027625][   C24] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
>> [69244.027625][   C24] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
>> [69244.027625][   C24] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
>> [69244.027625][   C24] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
>> [69244.027625][   C24] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
>> [69244.027625][   C24] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
>> [69244.027625][   C24] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
>> [69244.027625][   C24] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
>> [69244.027663][   C24] NIP [000000001fb41050] 0x1fb41050
>> [69244.027696][   C24] LR [000000001fb4104c] 0x1fb4104c
>> [69244.027699][   C24] Call Trace:
>> [69244.027701][   C24] Instruction dump:
>> [69244.027723][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.027728][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
>> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
>> [69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
>> [69244.028307][T87504] Supported: No, Unreleased kernel
>> [69244.028385][T87504] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
>> [69244.028408][T87504] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
>> [69244.028418][T87504] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
>> [69244.028429][T87504] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
>> [69244.028444][T87504] CFAR: 000000000000011c IRQMASK: 1
>> [69244.028444][T87504] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
>> [69244.028444][T87504] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
>> [69244.028444][T87504] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
>> [69244.028444][T87504] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
>> [69244.028444][T87504] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
>> [69244.028444][T87504] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
>> [69244.028444][T87504] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
>> [69244.028444][T87504] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
>> [69244.028534][T87504] NIP [000000001fb41050] 0x1fb41050
>> [69244.028543][T87504] LR [000000001fb4104c] 0x1fb4104c
>> [69244.028549][T87504] Call Trace:
>> [69244.028554][T87504] Instruction dump:
>> [69244.028561][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.028575][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.028607][T87504] ---[ end trace 3ddec07f638c34a2 ]---
>>
>> This happens because MSR[RI] is unset when entering RTAS but there is no
>> valid reason to not set it here.
>>
>> RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
>> "7.2.1 Machine State":
>>
>>  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its
>>  own critical regions from recursion by setting the MSRRI bit to 0 when in
>>  the critical regions.
>>
>> Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
>> hardcoded value meaning real mode, 32 bits and Recoverable Interrupt is
>> loaded.
>>
>> In addition a check is added in do_enter_rtas() to detect calls made with
>> MSR[RI] unset, as we are forcing it on later.
>>
>> This patch has been tested on the following machines:
>> Power KVM Guest
>>   P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
>> PowerVM LPAR
>>   P8 9119-MME (FW860.A1)
>>   p9 9008-22L (FW950.00)
>>   P10 9080-HEX (FW1010.00)
>>
>> Changes in V2:
>>  - Change comment in code to indicate NMI (Nick's comment)
>>  - Add reference to PAPR+ in the change log (Michael's comment)
>>
>> Cc: stable@vger.kernel.org
>> Suggested-by: Nicholas Piggin <npiggin@gmail.com>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> 
> Still looks good,
> 
> Reviewed-by: Nicholas Piggin <npiggin@gmail.com>

Hi Michael,

Do you consider peeking that patch ?

Thanks,
Laurent.

> 
>> ---
>>  arch/powerpc/kernel/entry_64.S | 20 ++++++++------------
>>  arch/powerpc/kernel/rtas.c     |  5 +++++
>>  2 files changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 9581906b5ee9..65cb14b56f8d 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>         	mtlr	r4
>>  
>> -	li	r0,0
>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>> -	andc	r0,r6,r0
>> -	
>> -        li      r9,1
>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>> -	andc	r6,r0,r9
>> -
>>  __enter_rtas:
>> -	sync				/* disable interrupts so SRR0/1 */
>> -	mtmsrd	r0			/* don't get trashed */
>> -
>>  	LOAD_REG_ADDR(r4, rtas)
>>  	ld	r5,RTASENTRY(r4)	/* get the rtas->entry value */
>>  	ld	r4,RTASBASE(r4)		/* get the rtas->base value */
>> +
>> +	/* RTAS runs in 32bits real mode but let MSR[]RI on as we may hit
>> +	 * NMI (SRESET or MCE). RTAS should disable RI in its critical
>> +	 * regions (as specified in PAPR+ section 7.2.1). */
>> +	LOAD_REG_IMMEDIATE(r6, MSR_ME|MSR_RI)
>> +
>> +	li      r0,0
>> +	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
>>  	
>>  	mtspr	SPRN_SRR0,r5
>>  	mtspr	SPRN_SRR1,r6
>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>> index 1f42aabbbab3..d7775b8c8853 100644
>> --- a/arch/powerpc/kernel/rtas.c
>> +++ b/arch/powerpc/kernel/rtas.c
>> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>>  
>>  static inline void do_enter_rtas(unsigned long args)
>>  {
>> +	unsigned long msr;
>> +
>> +	msr = mfmsr();
>> +	BUG_ON(!(msr & MSR_RI));
>> +
>>  	enter_rtas(args);
>>  
>>  	srr_regs_clobbered(); /* rtas uses SRRs, invalidate */
>> -- 
>> 2.35.1
>>
>>


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-04-01 14:06 [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS Laurent Dufour
  2022-04-13  5:58 ` Nicholas Piggin
@ 2022-05-03 15:06 ` Michael Ellerman
  2022-05-03 16:16   ` Fabiano Rosas
  2022-05-03 16:23   ` Laurent Dufour
  1 sibling, 2 replies; 11+ messages in thread
From: Michael Ellerman @ 2022-05-03 15:06 UTC (permalink / raw)
  To: Laurent Dufour; +Cc: linux-kernel, linuxppc-dev, stable, Nicholas Piggin

Laurent Dufour <ldufour@linux.ibm.com> writes:
> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits
> mode (MSR[SF] unset).

Probably also worth mentioning that it runs in big endian mode :)

It is specified in PAPR (R1-7.2.1-6).

> The change in MSR is done in enter_rtas() in a relatively complex way,
> since the MSR value could be hardcoded.
>
> Furthermore, a panic has been reported when hitting the watchdog interrupt
> while running in RTAS, this leads to the following stack trace:
>
> [69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
> [69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
> [69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
> [69244.027587][   C24] Supported: No, Unreleased kernel
> [69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
> [69244.027609][   C24] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
> [69244.027612][   C24] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
> [69244.027615][   C24] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
> [69244.027625][   C24] CFAR: 000000000000011c IRQMASK: 1
> [69244.027625][   C24] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
> [69244.027625][   C24] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
> [69244.027625][   C24] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
> [69244.027625][   C24] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
> [69244.027625][   C24] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
> [69244.027625][   C24] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
> [69244.027625][   C24] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
> [69244.027625][   C24] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
> [69244.027663][   C24] NIP [000000001fb41050] 0x1fb41050
> [69244.027696][   C24] LR [000000001fb4104c] 0x1fb4104c
> [69244.027699][   C24] Call Trace:
> [69244.027701][   C24] Instruction dump:
> [69244.027723][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.027728][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
> [69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
> [69244.028307][T87504] Supported: No, Unreleased kernel
> [69244.028385][T87504] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
> [69244.028408][T87504] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
> [69244.028418][T87504] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
> [69244.028429][T87504] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
> [69244.028444][T87504] CFAR: 000000000000011c IRQMASK: 1
> [69244.028444][T87504] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
> [69244.028444][T87504] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
> [69244.028444][T87504] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
> [69244.028444][T87504] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
> [69244.028444][T87504] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
> [69244.028444][T87504] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
> [69244.028444][T87504] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
> [69244.028444][T87504] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
> [69244.028534][T87504] NIP [000000001fb41050] 0x1fb41050
> [69244.028543][T87504] LR [000000001fb4104c] 0x1fb4104c
> [69244.028549][T87504] Call Trace:
> [69244.028554][T87504] Instruction dump:
> [69244.028561][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.028575][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
> [69244.028607][T87504] ---[ end trace 3ddec07f638c34a2 ]---
>
> This happens because MSR[RI] is unset when entering RTAS but there is no
> valid reason to not set it here.
>
> RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
> "7.2.1 Machine State":
>
>  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its
>  own critical regions from recursion by setting the MSRRI bit to 0 when in
>  the critical regions.
>
> Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
> hardcoded value meaning real mode, 32 bits and Recoverable Interrupt is
> loaded.
>
> In addition a check is added in do_enter_rtas() to detect calls made with
> MSR[RI] unset, as we are forcing it on later.
>
> This patch has been tested on the following machines:
> Power KVM Guest
>   P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
> PowerVM LPAR
>   P8 9119-MME (FW860.A1)
>   p9 9008-22L (FW950.00)
>   P10 9080-HEX (FW1010.00)
>
> Changes in V2:
>  - Change comment in code to indicate NMI (Nick's comment)
>  - Add reference to PAPR+ in the change log (Michael's comment)
>
> Cc: stable@vger.kernel.org
> Suggested-by: Nicholas Piggin <npiggin@gmail.com>
> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
> ---
>  arch/powerpc/kernel/entry_64.S | 20 ++++++++------------
>  arch/powerpc/kernel/rtas.c     |  5 +++++
>  2 files changed, 13 insertions(+), 12 deletions(-)
>
> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
> index 9581906b5ee9..65cb14b56f8d 100644
> --- a/arch/powerpc/kernel/entry_64.S
> +++ b/arch/powerpc/kernel/entry_64.S
> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>  	clrldi	r4,r4,2			/* convert to realmode address */
>         	mtlr	r4
>  
> -	li	r0,0
> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
> -	andc	r0,r6,r0
> -	
> -        li      r9,1
> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
> -	andc	r6,r0,r9
 
One advantage of the old method is it can adapt to new MSR bits being
set by the kernel.

For example we used to use RTAS on powernv, and this code didn't need
updating to cater to MSR_HV being set. We will probably never use RTAS
on bare-metal again, so that's OK.

But your change might break secure virtual machines, because it clears
MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
don't know whether it matters if it's called with MSR_S set or not?

Not sure if anyone will remember, or has a working setup they can test.
Maybe for now we just copy MSR_S from the kernel MSR the way the
current code does.

>  __enter_rtas:
> -	sync				/* disable interrupts so SRR0/1 */
> -	mtmsrd	r0			/* don't get trashed */
> -
>  	LOAD_REG_ADDR(r4, rtas)
>  	ld	r5,RTASENTRY(r4)	/* get the rtas->entry value */
>  	ld	r4,RTASBASE(r4)		/* get the rtas->base value */
> +
> +	/* RTAS runs in 32bits real mode but let MSR[]RI on as we may hit

"32-bit big endian real mode"

> +	 * NMI (SRESET or MCE). RTAS should disable RI in its critical
> +	 * regions (as specified in PAPR+ section 7.2.1). */
> +	LOAD_REG_IMMEDIATE(r6, MSR_ME|MSR_RI)
> +
> +	li      r0,0
> +	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
>  	
>  	mtspr	SPRN_SRR0,r5
>  	mtspr	SPRN_SRR1,r6
> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
> index 1f42aabbbab3..d7775b8c8853 100644
> --- a/arch/powerpc/kernel/rtas.c
> +++ b/arch/powerpc/kernel/rtas.c
> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>  
>  static inline void do_enter_rtas(unsigned long args)
>  {
> +	unsigned long msr;
> +
> +	msr = mfmsr();
> +	BUG_ON(!(msr & MSR_RI));

I'm not sure about this.

We call RTAS in some low-level places, so if we ever hit this BUG_ON
then it might cause us to crash badly, or recursively BUG.

A WARN_ON_ONCE() might be safer?

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 15:06 ` Michael Ellerman
@ 2022-05-03 16:16   ` Fabiano Rosas
  2022-05-03 16:47     ` Laurent Dufour
  2022-05-04  4:26     ` Michael Ellerman
  2022-05-03 16:23   ` Laurent Dufour
  1 sibling, 2 replies; 11+ messages in thread
From: Fabiano Rosas @ 2022-05-03 16:16 UTC (permalink / raw)
  To: Michael Ellerman, Laurent Dufour
  Cc: linuxppc-dev, linux-kernel, stable, Nicholas Piggin

Michael Ellerman <mpe@ellerman.id.au> writes:

>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 9581906b5ee9..65cb14b56f8d 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>         	mtlr	r4
>>  
>> -	li	r0,0
>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>> -	andc	r0,r6,r0
>> -	
>> -        li      r9,1
>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>> -	andc	r6,r0,r9
>  
> One advantage of the old method is it can adapt to new MSR bits being
> set by the kernel.
>
> For example we used to use RTAS on powernv, and this code didn't need
> updating to cater to MSR_HV being set. We will probably never use RTAS
> on bare-metal again, so that's OK.
>
> But your change might break secure virtual machines, because it clears
> MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
> don't know whether it matters if it's called with MSR_S set or not?
>
> Not sure if anyone will remember, or has a working setup they can test.
> Maybe for now we just copy MSR_S from the kernel MSR the way the
> current code does.

Would the kernel even be able to change the bit? I think only urfid can
clear MSR_S.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 15:06 ` Michael Ellerman
  2022-05-03 16:16   ` Fabiano Rosas
@ 2022-05-03 16:23   ` Laurent Dufour
  2022-05-04  5:59     ` Michael Ellerman
  1 sibling, 1 reply; 11+ messages in thread
From: Laurent Dufour @ 2022-05-03 16:23 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev, stable, Nicholas Piggin

Thanks Michael for reviewing this.

On 03/05/2022, 17:06:41, Michael Ellerman wrote:
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> RTAS runs in real mode (MSR[DR] and MSR[IR] unset) and in 32bits
>> mode (MSR[SF] unset).
> 
> Probably also worth mentioning that it runs in big endian mode :)
> 
> It is specified in PAPR (R1-7.2.1-6).

Sure!

> 
>> The change in MSR is done in enter_rtas() in a relatively complex way,
>> since the MSR value could be hardcoded.
>>
>> Furthermore, a panic has been reported when hitting the watchdog interrupt
>> while running in RTAS, this leads to the following stack trace:
>>
>> [69244.027433][   C24] watchdog: CPU 24 Hard LOCKUP
>> [69244.027442][   C24] watchdog: CPU 24 TB:997512652051031, last heartbeat TB:997504470175378 (15980ms ago)
>> [69244.027451][   C24] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
>> [69244.027555][   C24]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
>> [69244.027587][   C24] Supported: No, Unreleased kernel
>> [69244.027600][   C24] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
>> [69244.027609][   C24] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
>> [69244.027612][   C24] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
>> [69244.027615][   C24] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
>> [69244.027625][   C24] CFAR: 000000000000011c IRQMASK: 1
>> [69244.027625][   C24] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
>> [69244.027625][   C24] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
>> [69244.027625][   C24] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
>> [69244.027625][   C24] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
>> [69244.027625][   C24] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
>> [69244.027625][   C24] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
>> [69244.027625][   C24] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
>> [69244.027625][   C24] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
>> [69244.027663][   C24] NIP [000000001fb41050] 0x1fb41050
>> [69244.027696][   C24] LR [000000001fb4104c] 0x1fb4104c
>> [69244.027699][   C24] Call Trace:
>> [69244.027701][   C24] Instruction dump:
>> [69244.027723][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.027728][   C24] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.027762][T87504] Oops: Unrecoverable System Reset, sig: 6 [#1]
>> [69244.028044][T87504] LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
>> [69244.028089][T87504] Modules linked in: chacha_generic(E) libchacha(E) xxhash_generic(E) wp512(E) sha3_generic(E) rmd160(E) poly1305_generic(E) libpoly1305(E) michael_mic(E) md4(E) crc32_generic(E) cmac(E) ccm(E) algif_rng(E) twofish_generic(E) twofish_common(E) serpent_generic(E) fcrypt(E) des_generic(E) libdes(E) cast6_generic(E) cast5_generic(E) cast_common(E) camellia_generic(E) blowfish_generic(E) blowfish_common(E) algif_skcipher(E) algif_hash(E) gcm(E) algif_aead(E) af_alg(E) tun(E) rpcsec_gss_krb5(E) auth_rpcgss(E)
>> nfsv4(E) dns_resolver(E) rpadlpar_io(EX) rpaphp(EX) xsk_diag(E) tcp_diag(E) udp_diag(E) raw_diag(E) inet_diag(E) unix_diag(E) af_packet_diag(E) netlink_diag(E) nfsv3(E) nfs_acl(E) nfs(E) lockd(E) grace(E) sunrpc(E) fscache(E) netfs(E) af_packet(E) rfkill(E) bonding(E) tls(E) ibmveth(EX) crct10dif_vpmsum(E) rtc_generic(E) drm(E) drm_panel_orientation_quirks(E) fuse(E) configfs(E) backlight(E) ip_tables(E) x_tables(E) dm_service_time(E) sd_mod(E) t10_pi(E)
>> [69244.028171][T87504]  ibmvfc(EX) scsi_transport_fc(E) vmx_crypto(E) gf128mul(E) btrfs(E) blake2b_generic(E) libcrc32c(E) crc32c_vpmsum(E) xor(E) raid6_pq(E) dm_mirror(E) dm_region_hash(E) dm_log(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E)
>> [69244.028307][T87504] Supported: No, Unreleased kernel
>> [69244.028385][T87504] CPU: 24 PID: 87504 Comm: drmgr Kdump: loaded Tainted: G            E  X    5.14.21-150400.71.1.bz196362_2-default #1 SLE15-SP4 (unreleased) 0d821077ef4faa8dfaf370efb5fdca1fa35f4e2c
>> [69244.028408][T87504] NIP:  000000001fb41050 LR: 000000001fb4104c CTR: 0000000000000000
>> [69244.028418][T87504] REGS: c00000000fc33d60 TRAP: 0100   Tainted: G            E  X     (5.14.21-150400.71.1.bz196362_2-default)
>> [69244.028429][T87504] MSR:  8000000002981000 <SF,VEC,VSX,ME>  CR: 48800002  XER: 20040020
>> [69244.028444][T87504] CFAR: 000000000000011c IRQMASK: 1
>> [69244.028444][T87504] GPR00: 0000000000000003 ffffffffffffffff 0000000000000001 00000000000050dc
>> [69244.028444][T87504] GPR04: 000000001ffb6100 0000000000000020 0000000000000001 000000001fb09010
>> [69244.028444][T87504] GPR08: 0000000020000000 0000000000000000 0000000000000000 0000000000000000
>> [69244.028444][T87504] GPR12: 80040000072a40a8 c00000000ff8b680 0000000000000007 0000000000000034
>> [69244.028444][T87504] GPR16: 000000001fbf6e94 000000001fbf6d84 000000001fbd1db0 000000001fb3f008
>> [69244.028444][T87504] GPR20: 000000001fb41018 ffffffffffffffff 000000000000017f fffffffffffff68f
>> [69244.028444][T87504] GPR24: 000000001fb18fe8 000000001fb3e000 000000001fb1adc0 000000001fb1cf40
>> [69244.028444][T87504] GPR28: 000000001fb26000 000000001fb460f0 000000001fb17f18 000000001fb17000
>> [69244.028534][T87504] NIP [000000001fb41050] 0x1fb41050
>> [69244.028543][T87504] LR [000000001fb4104c] 0x1fb4104c
>> [69244.028549][T87504] Call Trace:
>> [69244.028554][T87504] Instruction dump:
>> [69244.028561][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.028575][T87504] XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX XXXXXXXX
>> [69244.028607][T87504] ---[ end trace 3ddec07f638c34a2 ]---
>>
>> This happens because MSR[RI] is unset when entering RTAS but there is no
>> valid reason to not set it here.
>>
>> RTAS is expected to be called with MSR[RI] as specified in PAPR+ section
>> "7.2.1 Machine State":
>>
>>  R1–7.2.1–9. If called with MSR[RI] equal to 1, then RTAS must protect its
>>  own critical regions from recursion by setting the MSRRI bit to 0 when in
>>  the critical regions.
>>
>> Fixing this by reviewing the way MSR is compute before calling RTAS. Now a
>> hardcoded value meaning real mode, 32 bits and Recoverable Interrupt is
>> loaded.
>>
>> In addition a check is added in do_enter_rtas() to detect calls made with
>> MSR[RI] unset, as we are forcing it on later.
>>
>> This patch has been tested on the following machines:
>> Power KVM Guest
>>   P8 S822L (host Ubuntu kernel 5.11.0-49-generic)
>> PowerVM LPAR
>>   P8 9119-MME (FW860.A1)
>>   p9 9008-22L (FW950.00)
>>   P10 9080-HEX (FW1010.00)
>>
>> Changes in V2:
>>  - Change comment in code to indicate NMI (Nick's comment)
>>  - Add reference to PAPR+ in the change log (Michael's comment)
>>
>> Cc: stable@vger.kernel.org
>> Suggested-by: Nicholas Piggin <npiggin@gmail.com>
>> Signed-off-by: Laurent Dufour <ldufour@linux.ibm.com>
>> ---
>>  arch/powerpc/kernel/entry_64.S | 20 ++++++++------------
>>  arch/powerpc/kernel/rtas.c     |  5 +++++
>>  2 files changed, 13 insertions(+), 12 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>> index 9581906b5ee9..65cb14b56f8d 100644
>> --- a/arch/powerpc/kernel/entry_64.S
>> +++ b/arch/powerpc/kernel/entry_64.S
>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>         	mtlr	r4
>>  
>> -	li	r0,0
>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>> -	andc	r0,r6,r0
>> -	
>> -        li      r9,1
>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>> -	andc	r6,r0,r9
>  
> One advantage of the old method is it can adapt to new MSR bits being
> set by the kernel.
> 
> For example we used to use RTAS on powernv, and this code didn't need
> updating to cater to MSR_HV being set. We will probably never use RTAS
> on bare-metal again, so that's OK.
> 
> But your change might break secure virtual machines, because it clears
> MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
> don't know whether it matters if it's called with MSR_S set or not?
> 
> Not sure if anyone will remember, or has a working setup they can test.
> Maybe for now we just copy MSR_S from the kernel MSR the way the
> current code does.

I could update the code to deal with MSR[S], but I can't see how I would
test that :/

> 
>>  __enter_rtas:
>> -	sync				/* disable interrupts so SRR0/1 */
>> -	mtmsrd	r0			/* don't get trashed */
>> -
>>  	LOAD_REG_ADDR(r4, rtas)
>>  	ld	r5,RTASENTRY(r4)	/* get the rtas->entry value */
>>  	ld	r4,RTASBASE(r4)		/* get the rtas->base value */
>> +
>> +	/* RTAS runs in 32bits real mode but let MSR[]RI on as we may hit
> 
> "32-bit big endian real mode"

Yep!

> 
>> +	 * NMI (SRESET or MCE). RTAS should disable RI in its critical
>> +	 * regions (as specified in PAPR+ section 7.2.1). */
>> +	LOAD_REG_IMMEDIATE(r6, MSR_ME|MSR_RI)
>> +
>> +	li      r0,0
>> +	mtmsrd  r0,1                    /* disable RI before using SRR0/1 */
>>  	
>>  	mtspr	SPRN_SRR0,r5
>>  	mtspr	SPRN_SRR1,r6
>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>> index 1f42aabbbab3..d7775b8c8853 100644
>> --- a/arch/powerpc/kernel/rtas.c
>> +++ b/arch/powerpc/kernel/rtas.c
>> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>>  
>>  static inline void do_enter_rtas(unsigned long args)
>>  {
>> +	unsigned long msr;
>> +
>> +	msr = mfmsr();
>> +	BUG_ON(!(msr & MSR_RI));
> 
> I'm not sure about this.
> 
> We call RTAS in some low-level places, so if we ever hit this BUG_ON
> then it might cause us to crash badly, or recursively BUG.
> 
> A WARN_ON_ONCE() might be safer?

I'm afraid a BUG_ON is required here. Since MSR[RI] is set on RTAS exit so
if it was not set when calling RTAS, that's a real issue and should
generate unexpected behaviour.

Do you have places in mind where RTAS could be called with !MSR[RI]?

Cheers,
Laurent.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 16:16   ` Fabiano Rosas
@ 2022-05-03 16:47     ` Laurent Dufour
  2022-05-04  4:27       ` Michael Ellerman
  2022-05-04  4:26     ` Michael Ellerman
  1 sibling, 1 reply; 11+ messages in thread
From: Laurent Dufour @ 2022-05-03 16:47 UTC (permalink / raw)
  To: Fabiano Rosas, Michael Ellerman
  Cc: linuxppc-dev, linux-kernel, stable, Nicholas Piggin

On 03/05/2022, 18:16:29, Fabiano Rosas wrote:
> Michael Ellerman <mpe@ellerman.id.au> writes:
> 
>>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>>> index 9581906b5ee9..65cb14b56f8d 100644
>>> --- a/arch/powerpc/kernel/entry_64.S
>>> +++ b/arch/powerpc/kernel/entry_64.S
>>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>>         	mtlr	r4
>>>  
>>> -	li	r0,0
>>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>>> -	andc	r0,r6,r0
>>> -	
>>> -        li      r9,1
>>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>>> -	andc	r6,r0,r9
>>  
>> One advantage of the old method is it can adapt to new MSR bits being
>> set by the kernel.
>>
>> For example we used to use RTAS on powernv, and this code didn't need
>> updating to cater to MSR_HV being set. We will probably never use RTAS
>> on bare-metal again, so that's OK.
>>
>> But your change might break secure virtual machines, because it clears
>> MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
>> don't know whether it matters if it's called with MSR_S set or not?
>>
>> Not sure if anyone will remember, or has a working setup they can test.
>> Maybe for now we just copy MSR_S from the kernel MSR the way the
>> current code does.
> 
> Would the kernel even be able to change the bit? I think only urfid can
> clear MSR_S.

That's a good point, thanks Fabiano!

The POWER ISA programming note about MSR[S] is explicit:
"MSR[S] can be set to 1 only by the System Call instruction and some
interrupts. It can be set to 0 only by urfid."

Since RTAS is entered using rfid, MSR[S] should remain whatever is the
value in SRR1.

And that's what POWER ISA is saying about the rfid instruction, in the
synopsis of the instruction the bit 41 of the resulting MSR (aka MSR[S]) is
not impacted.

So there is no need to take care of this MSR bit in our code, but for sure,
this should be well commented.

Michael, do you agree?

Cheers,
Laurent.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 16:16   ` Fabiano Rosas
  2022-05-03 16:47     ` Laurent Dufour
@ 2022-05-04  4:26     ` Michael Ellerman
  1 sibling, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2022-05-04  4:26 UTC (permalink / raw)
  To: Fabiano Rosas, Laurent Dufour
  Cc: linuxppc-dev, linux-kernel, stable, Nicholas Piggin

Fabiano Rosas <farosas@linux.ibm.com> writes:
> Michael Ellerman <mpe@ellerman.id.au> writes:
>>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>>> index 9581906b5ee9..65cb14b56f8d 100644
>>> --- a/arch/powerpc/kernel/entry_64.S
>>> +++ b/arch/powerpc/kernel/entry_64.S
>>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>>         	mtlr	r4
>>>  
>>> -	li	r0,0
>>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>>> -	andc	r0,r6,r0
>>> -	
>>> -        li      r9,1
>>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>>> -	andc	r6,r0,r9
>>  
>> One advantage of the old method is it can adapt to new MSR bits being
>> set by the kernel.
>>
>> For example we used to use RTAS on powernv, and this code didn't need
>> updating to cater to MSR_HV being set. We will probably never use RTAS
>> on bare-metal again, so that's OK.
>>
>> But your change might break secure virtual machines, because it clears
>> MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
>> don't know whether it matters if it's called with MSR_S set or not?
>>
>> Not sure if anyone will remember, or has a working setup they can test.
>> Maybe for now we just copy MSR_S from the kernel MSR the way the
>> current code does.
>
> Would the kernel even be able to change the bit? I think only urfid can
> clear MSR_S.

Good point :)

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 16:47     ` Laurent Dufour
@ 2022-05-04  4:27       ` Michael Ellerman
  0 siblings, 0 replies; 11+ messages in thread
From: Michael Ellerman @ 2022-05-04  4:27 UTC (permalink / raw)
  To: Laurent Dufour, Fabiano Rosas
  Cc: linuxppc-dev, linux-kernel, stable, Nicholas Piggin

Laurent Dufour <ldufour@linux.ibm.com> writes:
> On 03/05/2022, 18:16:29, Fabiano Rosas wrote:
>> Michael Ellerman <mpe@ellerman.id.au> writes:
>> 
>>>> diff --git a/arch/powerpc/kernel/entry_64.S b/arch/powerpc/kernel/entry_64.S
>>>> index 9581906b5ee9..65cb14b56f8d 100644
>>>> --- a/arch/powerpc/kernel/entry_64.S
>>>> +++ b/arch/powerpc/kernel/entry_64.S
>>>> @@ -330,22 +330,18 @@ _GLOBAL(enter_rtas)
>>>>  	clrldi	r4,r4,2			/* convert to realmode address */
>>>>         	mtlr	r4
>>>>  
>>>> -	li	r0,0
>>>> -	ori	r0,r0,MSR_EE|MSR_SE|MSR_BE|MSR_RI
>>>> -	andc	r0,r6,r0
>>>> -	
>>>> -        li      r9,1
>>>> -        rldicr  r9,r9,MSR_SF_LG,(63-MSR_SF_LG)
>>>> -	ori	r9,r9,MSR_IR|MSR_DR|MSR_FE0|MSR_FE1|MSR_FP|MSR_RI|MSR_LE
>>>> -	andc	r6,r0,r9
>>>  
>>> One advantage of the old method is it can adapt to new MSR bits being
>>> set by the kernel.
>>>
>>> For example we used to use RTAS on powernv, and this code didn't need
>>> updating to cater to MSR_HV being set. We will probably never use RTAS
>>> on bare-metal again, so that's OK.
>>>
>>> But your change might break secure virtual machines, because it clears
>>> MSR_S whereas the old code didn't. I think SVMs did use RTAS, but I
>>> don't know whether it matters if it's called with MSR_S set or not?
>>>
>>> Not sure if anyone will remember, or has a working setup they can test.
>>> Maybe for now we just copy MSR_S from the kernel MSR the way the
>>> current code does.
>> 
>> Would the kernel even be able to change the bit? I think only urfid can
>> clear MSR_S.
>
> That's a good point, thanks Fabiano!
>
> The POWER ISA programming note about MSR[S] is explicit:
> "MSR[S] can be set to 1 only by the System Call instruction and some
> interrupts. It can be set to 0 only by urfid."
>
> Since RTAS is entered using rfid, MSR[S] should remain whatever is the
> value in SRR1.
>
> And that's what POWER ISA is saying about the rfid instruction, in the
> synopsis of the instruction the bit 41 of the resulting MSR (aka MSR[S]) is
> not impacted.
>
> So there is no need to take care of this MSR bit in our code, but for sure,
> this should be well commented.
>
> Michael, do you agree?

Yep.

Can you send a v3 with updated change log and comment mentioning MSR_S
and MSR_LE, thanks.

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-03 16:23   ` Laurent Dufour
@ 2022-05-04  5:59     ` Michael Ellerman
  2022-05-04 11:01       ` Laurent Dufour
  0 siblings, 1 reply; 11+ messages in thread
From: Michael Ellerman @ 2022-05-04  5:59 UTC (permalink / raw)
  To: Laurent Dufour; +Cc: linux-kernel, linuxppc-dev, stable, Nicholas Piggin

Laurent Dufour <ldufour@linux.ibm.com> writes:
> On 03/05/2022, 17:06:41, Michael Ellerman wrote:
>> Laurent Dufour <ldufour@linux.ibm.com> writes:
...
>>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>>> index 1f42aabbbab3..d7775b8c8853 100644
>>> --- a/arch/powerpc/kernel/rtas.c
>>> +++ b/arch/powerpc/kernel/rtas.c
>>> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>>>  
>>>  static inline void do_enter_rtas(unsigned long args)
>>>  {
>>> +	unsigned long msr;
>>> +
>>> +	msr = mfmsr();
>>> +	BUG_ON(!(msr & MSR_RI));
>> 
>> I'm not sure about this.
>> 
>> We call RTAS in some low-level places, so if we ever hit this BUG_ON
>> then it might cause us to crash badly, or recursively BUG.
>> 
>> A WARN_ON_ONCE() might be safer?
>
> I'm afraid a BUG_ON is required here. Since MSR[RI] is set on RTAS exit so
> if it was not set when calling RTAS, that's a real issue and should
> generate unexpected behaviour.
>
> Do you have places in mind where RTAS could be called with !MSR[RI]?

The main one I can think of is if someone is using
CONFIG_UDBG_RTAS_CONSOLE, then udbg_rtascon_putc() is wired up as
udbg_putc() and that might be called from anywhere, including xmon.

There's also RTAS calls in low-level xics interrupt code, that might get
called during panic/crash.

I don't expect any of those places to be called with MSR[RI] unset, but
I'm worried that if we're already crashing and for some reason MSR[RI]
is unset, then that BUG_ON will just make things worse.

eg. imagine taking a BUG_ON() for every character we try to print as
part of an oops.

Admittedly CONFIG_UDBG_RTAS_CONSOLE is old and probably not used much
anymore, but I'm still a bit paranoid :)

cheers

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS
  2022-05-04  5:59     ` Michael Ellerman
@ 2022-05-04 11:01       ` Laurent Dufour
  0 siblings, 0 replies; 11+ messages in thread
From: Laurent Dufour @ 2022-05-04 11:01 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: linux-kernel, linuxppc-dev, stable, Nicholas Piggin

On 04/05/2022, 07:59:29, Michael Ellerman wrote:
> Laurent Dufour <ldufour@linux.ibm.com> writes:
>> On 03/05/2022, 17:06:41, Michael Ellerman wrote:
>>> Laurent Dufour <ldufour@linux.ibm.com> writes:
> ...
>>>> diff --git a/arch/powerpc/kernel/rtas.c b/arch/powerpc/kernel/rtas.c
>>>> index 1f42aabbbab3..d7775b8c8853 100644
>>>> --- a/arch/powerpc/kernel/rtas.c
>>>> +++ b/arch/powerpc/kernel/rtas.c
>>>> @@ -49,6 +49,11 @@ void enter_rtas(unsigned long);
>>>>  
>>>>  static inline void do_enter_rtas(unsigned long args)
>>>>  {
>>>> +	unsigned long msr;
>>>> +
>>>> +	msr = mfmsr();
>>>> +	BUG_ON(!(msr & MSR_RI));
>>>
>>> I'm not sure about this.
>>>
>>> We call RTAS in some low-level places, so if we ever hit this BUG_ON
>>> then it might cause us to crash badly, or recursively BUG.
>>>
>>> A WARN_ON_ONCE() might be safer?
>>
>> I'm afraid a BUG_ON is required here. Since MSR[RI] is set on RTAS exit so
>> if it was not set when calling RTAS, that's a real issue and should
>> generate unexpected behaviour.
>>
>> Do you have places in mind where RTAS could be called with !MSR[RI]?
> 
> The main one I can think of is if someone is using
> CONFIG_UDBG_RTAS_CONSOLE, then udbg_rtascon_putc() is wired up as
> udbg_putc() and that might be called from anywhere, including xmon.
> 
> There's also RTAS calls in low-level xics interrupt code, that might get
> called during panic/crash.
> 
> I don't expect any of those places to be called with MSR[RI] unset, but
> I'm worried that if we're already crashing and for some reason MSR[RI]
> is unset, then that BUG_ON will just make things worse.
> 
> eg. imagine taking a BUG_ON() for every character we try to print as
> part of an oops.
> 
> Admittedly CONFIG_UDBG_RTAS_CONSOLE is old and probably not used much
> anymore, but I'm still a bit paranoid :)

I think you're right to be paranoid :)

This part of code can be really sensitive.
I boot a kernel built with CONFIG_UDBG_RTAS_CONSOLE, xmon is working fine,
but I cannot pretend this is covering all the RTAS call cases.

My hope with BUG_ON() is to raise the issue, as soon as possible, so it can
be addressed during the test phase.

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2022-05-04 11:01 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-04-01 14:06 [PATCH v2] powerpc/rtas: Keep MSR[RI] set when calling RTAS Laurent Dufour
2022-04-13  5:58 ` Nicholas Piggin
2022-04-21 14:09   ` Laurent Dufour
2022-05-03 15:06 ` Michael Ellerman
2022-05-03 16:16   ` Fabiano Rosas
2022-05-03 16:47     ` Laurent Dufour
2022-05-04  4:27       ` Michael Ellerman
2022-05-04  4:26     ` Michael Ellerman
2022-05-03 16:23   ` Laurent Dufour
2022-05-04  5:59     ` Michael Ellerman
2022-05-04 11:01       ` Laurent Dufour

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).