* [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
@ 2023-06-14 15:12 ` Sachin Sant
  0 siblings, 0 replies; 23+ messages in thread
From: Sachin Sant @ 2023-06-14 15:12 UTC (permalink / raw)
  To: open list; +Cc: linuxppc-dev, jarkko

The following crash is observed during a kexec operation on an
IBM Power10 server:

[ 34.381548] Kernel attempted to read user page (50) - exploit attempt? (uid: 0)
[ 34.381562] BUG: Kernel NULL pointer dereference on read at 0x00000050
[ 34.381565] Faulting instruction address: 0xc0000000009db1e4
[ 34.381569] Oops: Kernel access of bad area, sig: 11 [#1]
[ 34.381572] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[ 34.381576] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[ 34.381613] CPU: 18 PID: 5918 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-00037-gb6dad5178cea #3
[ 34.381618] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[ 34.381621] NIP: c0000000009db1e4 LR: c0000000009db928 CTR: c0000000009eab60
[ 34.381625] REGS: c00000009742f780 TRAP: 0300 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
[ 34.381628] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44488884 XER: 00000001
[ 34.381638] CFAR: c0000000009db19c DAR: 0000000000000050 DSISR: 40000000 IRQMASK: 0 
[ 34.381638] GPR00: c0000000009db928 c00000009742fa20 c0000000014a1500 c0000000081d0000 
[ 34.381638] GPR04: c00000000d842c50 c00000000d842c50 0000000000000025 fffffffffffe0000 
[ 34.381638] GPR08: 0000000000000000 0000000000000000 0000000000000009 c008000000785280 
[ 34.381638] GPR12: c0000000009eab60 c00000135fab7f00 0000000000000000 0000000000000000 
[ 34.381638] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381638] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381638] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000002e21e08 
[ 34.381638] GPR28: c00000000d842c48 c000000002a02208 c00000000321c0c0 c0000000081d0000 
[ 34.381674] NIP [c0000000009db1e4] tpm_amd_is_rng_defective+0x74/0x240
[ 34.381681] LR [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381685] Call Trace:
[ 34.381686] [c00000009742faa0] [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381690] [c00000009742fae0] [c0000000009eab94] tpm_ibmvtpm_remove+0x34/0x130
[ 34.381695] [c00000009742fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
[ 34.381701] [c00000009742fb90] [c000000000a01ecc] device_shutdown+0x21c/0x39c
[ 34.381705] [c00000009742fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
[ 34.381710] [c00000009742fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
[ 34.381714] [c00000009742fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
[ 34.381718] [c00000009742fe10] [c000000000034adc] system_call_exception+0x13c/0x340
[ 34.381723] [c00000009742fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[ 34.381729] --- interrupt: 3000 at 0x7fff9c5459f0
[ 34.381732] NIP: 00007fff9c5459f0 LR: 0000000000000000 CTR: 0000000000000000
[ 34.381735] REGS: c00000009742fe80 TRAP: 3000 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
[ 34.381738] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42422884 XER: 00000000
[ 34.381747] IRQMASK: 0 
[ 34.381747] GPR00: 0000000000000058 00007ffffad83d70 000000012fc47f00 fffffffffee1dead 
[ 34.381747] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003 
[ 34.381747] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000 
[ 34.381747] GPR12: 0000000000000000 00007fff9c7bb2c0 000000012fc3f598 0000000000000000 
[ 34.381747] GPR16: ffffffffffffffff 0000000000000000 000000012fc1fcc0 0000000000000000 
[ 34.381747] GPR20: 0000000000008913 0000000000008914 000000014b891020 0000000000000003 
[ 34.381747] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007ffffad83ef0 
[ 34.381747] GPR28: 000000012fc19f10 00007fff9c6419c0 000000014b891080 000000014b891040 
[ 34.381781] NIP [00007fff9c5459f0] 0x7fff9c5459f0
[ 34.381784] LR [0000000000000000] 0x0
[ 34.381786] --- interrupt: 3000
[ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
[ 34.381800] ---[ end trace 0000000000000000 ]---
[ 34.384090] pstore: backend (nvram) writing error (-1)

Git bisect points to the following patch:

commit bd8621ca1510e6e802df9855bdc35a04a3cfa932
    tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site

Reverting the commit allows a successful kexec operation.

- Sachin



* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-14 15:12 ` Sachin Sant
@ 2023-06-15  3:07   ` Michael Ellerman
  -1 siblings, 0 replies; 23+ messages in thread
From: Michael Ellerman @ 2023-06-15  3:07 UTC (permalink / raw)
  To: Sachin Sant, open list; +Cc: linuxppc-dev, jarkko

Sachin Sant <sachinp@linux.ibm.com> writes:
> Following crash is observed during a kexec operation on 
> IBM Power10 server:
>
> [ 34.381548] Kernel attempted to read user page (50) - exploit attempt? (uid: 0)
> [ 34.381562] BUG: Kernel NULL pointer dereference on read at 0x00000050
> [ 34.381565] Faulting instruction address: 0xc0000000009db1e4
> [ 34.381569] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 34.381572] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [ 34.381576] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
> [ 34.381613] CPU: 18 PID: 5918 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-00037-gb6dad5178cea #3
> [ 34.381618] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 34.381621] NIP: c0000000009db1e4 LR: c0000000009db928 CTR: c0000000009eab60
> [ 34.381625] REGS: c00000009742f780 TRAP: 0300 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
> [ 34.381628] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44488884 XER: 00000001
> [ 34.381638] CFAR: c0000000009db19c DAR: 0000000000000050 DSISR: 40000000 IRQMASK: 0 
> [ 34.381638] GPR00: c0000000009db928 c00000009742fa20 c0000000014a1500 c0000000081d0000 
> [ 34.381638] GPR04: c00000000d842c50 c00000000d842c50 0000000000000025 fffffffffffe0000 
> [ 34.381638] GPR08: 0000000000000000 0000000000000000 0000000000000009 c008000000785280 
> [ 34.381638] GPR12: c0000000009eab60 c00000135fab7f00 0000000000000000 0000000000000000 
> [ 34.381638] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [ 34.381638] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [ 34.381638] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000002e21e08 
> [ 34.381638] GPR28: c00000000d842c48 c000000002a02208 c00000000321c0c0 c0000000081d0000 
> [ 34.381674] NIP [c0000000009db1e4] tpm_amd_is_rng_defective+0x74/0x240
> [ 34.381681] LR [c0000000009db928] tpm_chip_unregister+0x138/0x160
> [ 34.381685] Call Trace:
> [ 34.381686] [c00000009742faa0] [c0000000009db928] tpm_chip_unregister+0x138/0x160
> [ 34.381690] [c00000009742fae0] [c0000000009eab94] tpm_ibmvtpm_remove+0x34/0x130
...
> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 

  2c:   28 07 23 e9     ld      r9,1832(r3)
  30:   50 00 89 e9     ld      r12,80(r9)

Where r3 is *chip.
r9 is NULL, and 80 = 0x50.

Looks like a NULL chip->ops, which oopses in:

static int tpm_request_locality(struct tpm_chip *chip)
{
	int rc;

	if (!chip->ops->request_locality)
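
The arithmetic behind that reading can be checked in isolation. The sketch below is a user-space stand-in, not kernel code: `struct demo_ops`, its `slot` padding, and `fault_address()` are invented for illustration, and the ten placeholder slots are chosen only so that `request_locality` lands at byte 0x50 on a 64-bit target, matching the `ld r12,80(r9)` above. The real layout of the ops structure is not shown in this thread. The point is just that reading a member through a NULL base touches the member's offset, which is what shows up as the fault address.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct demo_chip;

/* Hypothetical ops table: ten pointer-sized placeholder slots, then the
 * callback of interest. With 8-byte pointers that is 10 * 8 = 0x50. */
struct demo_ops {
	void *slot[10];
	int (*request_locality)(struct demo_chip *chip, int loc);
};

/* Address the CPU would touch when loading ops->request_locality
 * from an ops table located at ops_base. */
static uintptr_t fault_address(uintptr_t ops_base)
{
	return ops_base + offsetof(struct demo_ops, request_locality);
}

/* With a NULL base, the fault address is exactly the member offset. */
static void demo_check(void)
{
	assert(fault_address(0) == offsetof(struct demo_ops, request_locality));
}
```

On an LP64 build `fault_address(0)` evaluates to 0x50, the same small address reported in the oops, which is why a fault at a low address like this usually points at a NULL structure pointer rather than a wild one.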


Can you test the patch below?

cheers


diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index cd48033b804a..82eb36e2e16d 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -36,7 +36,7 @@ static int tpm_request_locality(struct tpm_chip *chip)
 {
 	int rc;
 
-	if (!chip->ops->request_locality)
+	if (!chip->ops || !chip->ops->request_locality)
 		return 0;
 
 	rc = chip->ops->request_locality(chip, 0);
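
For readers who want to see the guard in action outside the kernel, here is a minimal user-space sketch of the same check. Everything prefixed `demo_` is hypothetical, and the tail of the function after the callback call is an assumption about what a caller might do with the result, not a copy of the kernel source. It shows only that `||` short-circuits, so `chip->ops->request_locality` is never evaluated when `chip->ops` is NULL.

```c
#include <assert.h>
#include <stddef.h>

struct demo_chip;

/* Hypothetical, cut-down ops table with a single callback. */
struct demo_ops {
	int (*request_locality)(struct demo_chip *chip, int loc);
};

struct demo_chip {
	const struct demo_ops *ops;	/* NULL in the reported crash */
	int locality;
};

/* Stand-in callback: grants whatever locality was asked for. */
static int demo_grant(struct demo_chip *chip, int loc)
{
	(void)chip;
	return loc;
}

/* Same shape as the patched check: the left operand is tested first,
 * so a NULL ops pointer returns early instead of being dereferenced. */
static int demo_request_locality(struct demo_chip *chip)
{
	int rc;

	assert(chip != NULL);

	if (!chip->ops || !chip->ops->request_locality)
		return 0;

	rc = chip->ops->request_locality(chip, 0);
	if (rc < 0)
		return rc;

	chip->locality = rc;	/* assumed bookkeeping, for illustration */
	return 0;
}
```

Calling `demo_request_locality()` on a chip whose `ops` is NULL returns 0 and leaves the chip untouched. Note that a guard like this protects only this one accessor; any other code path that loads through the same pointer still needs its own check.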


* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-15  3:07   ` Michael Ellerman
@ 2023-06-15  4:57     ` Sachin Sant
  -1 siblings, 0 replies; 23+ messages in thread
From: Sachin Sant @ 2023-06-15  4:57 UTC (permalink / raw)
  To: Michael Ellerman; +Cc: open list, linuxppc-dev, jarkko


>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
> 
>  2c:   28 07 23 e9     ld      r9,1832(r3)
>  30:   50 00 89 e9     ld      r12,80(r9)
> 
> Where r3 is *chip.
> r9 is NULL, and 80 = 0x50.
> 
> Looks like a NULL chip->ops, which oopses in:
> 
> static int tpm_request_locality(struct tpm_chip *chip)
> {
> 	int rc;
> 
> 	if (!chip->ops->request_locality)
> 
> 
> Can you test the patch below?
> 

It proceeds further but then runs into the following crash:

[  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
[  103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018
[  103.269595] Faulting instruction address: 0xc0000000009dcf34
[  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
[  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
[  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
[  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G            E      6.4.0-rc6-dirty #8
[  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
[  103.269653] NIP:  c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60
[  103.269656] REGS: c0000000a113f510 TRAP: 0300   Tainted: G            E       (6.4.0-rc6-dirty)
[  103.269660] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88484886  XER: 00000001
[  103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0
[  103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000
[  103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016
[  103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000
[  103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000
[  103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
[  103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000
[  103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000
[  103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300
[  103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190
[  103.269717] Call Trace:
[  103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable)
[  103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190
[  103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110
[  103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230
[  103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250
[  103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160
[  103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130
[  103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
[  103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c
[  103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
[  103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
[  103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
[  103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340
[  103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
[  103.269781] --- interrupt: 3000 at 0x7fff805459f0
[  103.269784] NIP:  00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000
[  103.269786] REGS: c0000000a113fe80 TRAP: 3000   Tainted: G            E       (6.4.0-rc6-dirty)
[  103.269790] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 42422884  XER: 00000000
[  103.269799] IRQMASK: 0
[  103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead
[  103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
[  103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000
[  103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000
[  103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000
[  103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003
[  103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40
[  103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040
[  103.269833] NIP [00007fff805459f0] 0x7fff805459f0
[  103.269836] LR [0000000000000000] 0x0
[  103.269838] --- interrupt: 3000
[  103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018
[  103.269852] ---[ end trace 0000000000000000 ]---

- Sachin
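
The `Code:` lines of the two oopses can be compared by decoding the instruction words around the marked one. The sketch below is a small stand-alone decoder for PowerPC DS-form loads (primary opcode 58, which covers `ld`); `struct ds_form`, `decode_ds()` and `demo_check()` are names made up for this illustration, and it handles only that one instruction form. Decoding `e9230728` gives `ld r9,1832(r3)` in both traces, followed by `ld r12,80(r9)` in the first and `ld r12,24(r9)` in the second. That suggests, without proving it, that both faults come from loading through the same NULL pointer at different member offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC DS-form instruction word. */
struct ds_form {
	unsigned opcode;	/* primary opcode, 58 for ld/ldu/lwa */
	unsigned rt;		/* target register number */
	unsigned ra;		/* base register number */
	int displacement;	/* signed byte offset, low two bits cleared */
	unsigned xo;		/* extended opcode: 0 = ld, 1 = ldu, 2 = lwa */
};

static struct ds_form decode_ds(uint32_t insn)
{
	struct ds_form d;

	d.opcode = insn >> 26;
	d.rt = (insn >> 21) & 0x1f;
	d.ra = (insn >> 16) & 0x1f;
	d.xo = insn & 0x3;
	/* Drop the two XO bits, then sign-extend the 16-bit field. */
	d.displacement = (int16_t)(insn & 0xfffc);
	return d;
}

/* The marked words from the two Code: lines in this thread. */
static void demo_check(void)
{
	struct ds_form first = decode_ds(0xe9890050u);
	struct ds_form second = decode_ds(0xe9890018u);

	assert(first.opcode == 58 && first.xo == 0);
	assert(first.ra == second.ra);
	assert(first.displacement == 0x50 && second.displacement == 0x18);
}
```

The displacements 0x50 and 0x18 match the DAR values in the two register dumps, which is consistent with the base register holding zero in both cases.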


* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-14 15:12 ` Sachin Sant
  (?)
  (?)
@ 2023-06-15 12:04 ` Linux regression tracking #adding (Thorsten Leemhuis)
  -1 siblings, 0 replies; 23+ messages in thread
From: Linux regression tracking #adding (Thorsten Leemhuis) @ 2023-06-15 12:04 UTC (permalink / raw)
  To: Sachin Sant, open list
  Cc: jarkko, linuxppc-dev, Linux kernel regressions list

[CCing the regression list, as it should be in the loop for regressions:
https://docs.kernel.org/admin-guide/reporting-regressions.html]

[TLDR: I'm adding this report to the list of tracked Linux kernel
regressions; the text you find below is based on a few template
paragraphs you might have encountered already in similar form.
See link in footer if these mails annoy you.]

On 14.06.23 17:12, Sachin Sant wrote:
> Following crash is observed during a kexec operation on 
> IBM Power10 server:
> 
> [ 34.381548] Kernel attempted to read user page (50) - exploit attempt? (uid: 0)
> [ 34.381562] BUG: Kernel NULL pointer dereference on read at 0x00000050
> [ 34.381565] Faulting instruction address: 0xc0000000009db1e4
> [ 34.381569] Oops: Kernel access of bad area, sig: 11 [#1]
> [ 34.381572] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [ 34.381576] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
> [ 34.381613] CPU: 18 PID: 5918 Comm: kexec Kdump: loaded Tainted: G E 6.4.0-rc6-00037-gb6dad5178cea #3
> [ 34.381618] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [ 34.381621] NIP: c0000000009db1e4 LR: c0000000009db928 CTR: c0000000009eab60
> [ 34.381625] REGS: c00000009742f780 TRAP: 0300 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
> [ 34.381628] MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 44488884 XER: 00000001
> [ 34.381638] CFAR: c0000000009db19c DAR: 0000000000000050 DSISR: 40000000 IRQMASK: 0 
> [ 34.381638] GPR00: c0000000009db928 c00000009742fa20 c0000000014a1500 c0000000081d0000 
> [ 34.381638] GPR04: c00000000d842c50 c00000000d842c50 0000000000000025 fffffffffffe0000 
> [ 34.381638] GPR08: 0000000000000000 0000000000000000 0000000000000009 c008000000785280 
> [ 34.381638] GPR12: c0000000009eab60 c00000135fab7f00 0000000000000000 0000000000000000 
> [ 34.381638] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [ 34.381638] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
> [ 34.381638] GPR24: 0000000000000000 0000000000000000 0000000000000000 c000000002e21e08 
> [ 34.381638] GPR28: c00000000d842c48 c000000002a02208 c00000000321c0c0 c0000000081d0000 
> [ 34.381674] NIP [c0000000009db1e4] tpm_amd_is_rng_defective+0x74/0x240
> [ 34.381681] LR [c0000000009db928] tpm_chip_unregister+0x138/0x160
> [ 34.381685] Call Trace:
> [ 34.381686] [c00000009742faa0] [c0000000009db928] tpm_chip_unregister+0x138/0x160
> [ 34.381690] [c00000009742fae0] [c0000000009eab94] tpm_ibmvtpm_remove+0x34/0x130
> [ 34.381695] [c00000009742fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
> [ 34.381701] [c00000009742fb90] [c000000000a01ecc] device_shutdown+0x21c/0x39c
> [ 34.381705] [c00000009742fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
> [ 34.381710] [c00000009742fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
> [ 34.381714] [c00000009742fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
> [ 34.381718] [c00000009742fe10] [c000000000034adc] system_call_exception+0x13c/0x340
> [ 34.381723] [c00000009742fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
> [ 34.381729] --- interrupt: 3000 at 0x7fff9c5459f0
> [ 34.381732] NIP: 00007fff9c5459f0 LR: 0000000000000000 CTR: 0000000000000000
> [ 34.381735] REGS: c00000009742fe80 TRAP: 3000 Tainted: G E (6.4.0-rc6-00037-gb6dad5178cea)
> [ 34.381738] MSR: 800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE> CR: 42422884 XER: 00000000
> [ 34.381747] IRQMASK: 0 
> [ 34.381747] GPR00: 0000000000000058 00007ffffad83d70 000000012fc47f00 fffffffffee1dead 
> [ 34.381747] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003 
> [ 34.381747] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000 
> [ 34.381747] GPR12: 0000000000000000 00007fff9c7bb2c0 000000012fc3f598 0000000000000000 
> [ 34.381747] GPR16: ffffffffffffffff 0000000000000000 000000012fc1fcc0 0000000000000000 
> [ 34.381747] GPR20: 0000000000008913 0000000000008914 000000014b891020 0000000000000003 
> [ 34.381747] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007ffffad83ef0 
> [ 34.381747] GPR28: 000000012fc19f10 00007fff9c6419c0 000000014b891080 000000014b891040 
> [ 34.381781] NIP [00007fff9c5459f0] 0x7fff9c5459f0
> [ 34.381784] LR [0000000000000000] 0x0
> [ 34.381786] --- interrupt: 3000
> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
> [ 34.381800] ---[ end trace 0000000000000000 ]---
> [ 34.384090] pstore: backend (nvram) writing error (-1)
> 
> Git bisect points to following patch
> 
> commit bd8621ca1510e6e802df9855bdc35a04a3cfa932
>     tpm: Add !tpm_amd_is_rng_defective() to the hwrng_unregister() call site
> 
> Reverting the commit allows a successful kexec operation.

Thanks for the report. To be sure the issue doesn't fall through the
cracks unnoticed, I'm adding it to regzbot, the Linux kernel regression
tracking bot:

#regzbot ^introduced bd8621ca1510e6e802df9855bdc35a04a3cfa932
#regzbot title tpm/ppc: crash during a kexec
#regzbot ignore-activity

This isn't a regression? This issue or a fix for it are already
discussed somewhere else? It was fixed already? You want to clarify when
the regression started to happen? Or point out I got the title or
something else totally wrong? Then just reply and tell me -- ideally
while also telling regzbot about it, as explained by the page listed in
the footer of this mail.

Developers: When fixing the issue, remember to add 'Link:' tags pointing
to the report (the parent of this mail). See page linked in footer for
details.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
That page also explains what to do if mails like this annoy you.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-15  4:57     ` Sachin Sant
@ 2023-06-22  7:44       ` Linux regression tracking (Thorsten Leemhuis)
  -1 siblings, 0 replies; 23+ messages in thread
From: Linux regression tracking (Thorsten Leemhuis) @ 2023-06-22  7:44 UTC (permalink / raw)
  To: Sachin Sant, Michael Ellerman; +Cc: open list, linuxppc-dev, jarkko

Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
for once, to make this easily accessible to everyone.

As Linus will likely release 6.4 on this or the following Sunday a quick
question: is there any hope this regression might be fixed any time
soon? Doesn't look like it, as it seems nothing happened for a few days,
but maybe I missed something.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.

#regzbot poke

On 15.06.23 06:57, Sachin Sant wrote:
> 
>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
>>
>>  2c:   28 07 23 e9     ld      r9,1832(r3)
>>  30:   50 00 89 e9     ld      r12,80(r9)
>>
>> Where r3 is *chip.
>> r9 is NULL, and 80 = 0x50.
>>
>> Looks like a NULL chip->ops, which oopses in:
>>
>> static int tpm_request_locality(struct tpm_chip *chip)
>> {
>> int rc;
>>
>> if (!chip->ops->request_locality)
>>
>>
>> Can you test the patch below?
>>
> 
> It proceeds further but then run into following crash
> 
> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018
> [  103.269595] Faulting instruction address: 0xc0000000009dcf34
> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G            E      6.4.0-rc6-dirty #8
> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
> [  103.269653] NIP:  c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60
> [  103.269656] REGS: c0000000a113f510 TRAP: 0300   Tainted: G            E       (6.4.0-rc6-dirty)
> [  103.269660] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88484886  XER: 00000001
> [  103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0
> [  103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000
> [  103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016
> [  103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000
> [  103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000
> [  103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [  103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
> [  103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000
> [  103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000
> [  103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300
> [  103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190
> [  103.269717] Call Trace:
> [  103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable)
> [  103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190
> [  103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110
> [  103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230
> [  103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250
> [  103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160
> [  103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130
> [  103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
> [  103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c
> [  103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
> [  103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
> [  103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
> [  103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340
> [  103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
> [  103.269781] --- interrupt: 3000 at 0x7fff805459f0
> [  103.269784] NIP:  00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000
> [  103.269786] REGS: c0000000a113fe80 TRAP: 3000   Tainted: G            E       (6.4.0-rc6-dirty)
> [  103.269790] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 42422884  XER: 00000000
> [  103.269799] IRQMASK: 0
> [  103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead
> [  103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
> [  103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000
> [  103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000
> [  103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000
> [  103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003
> [  103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40
> [  103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040
> [  103.269833] NIP [00007fff805459f0] 0x7fff805459f0
> [  103.269836] LR [0000000000000000] 0x0
> [  103.269838] --- interrupt: 3000
> [  103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018
> [  103.269852] ---[ end trace 0000000000000000 ]---
> 
> - Sachin
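
The disassembly quoted above ("ld r9,1832(r3)" followed by "ld r12,80(r9)") can be
cross-checked against the raw words in the "Code:" lines. What follows is a
minimal sketch, not kernel code: it decodes only the Power ISA DS-form "ld"
instruction (primary opcode 58, extended opcode 0), and the names ds_load,
decode_ld and format_ld are made up for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Fields of a DS-form "ld RT,DS(RA)" instruction. */
struct ds_load {
    unsigned rt;   /* destination register */
    unsigned ra;   /* base register */
    int disp;      /* signed byte displacement */
};

/* Decode one instruction word; returns 0 for "ld", -1 for anything else. */
static int decode_ld(uint32_t insn, struct ds_load *out)
{
    assert(out != NULL);

    /* Primary opcode is the top 6 bits; the low 2 bits select ld/ldu/lwa. */
    if ((insn >> 26) != 58 || (insn & 0x3) != 0)
        return -1;

    out->rt = (insn >> 21) & 0x1f;
    out->ra = (insn >> 16) & 0x1f;

    /* The displacement is the low 16 bits with the 2 opcode bits cleared. */
    out->disp = (int)(insn & 0xfffc);
    if (out->disp & 0x8000)
        out->disp -= 0x10000;   /* sign-extend from 16 bits */
    return 0;
}

/* Render the word as assembler text; returns snprintf()'s result or -1. */
static int format_ld(uint32_t insn, char *buf, size_t len)
{
    struct ds_load d;

    if (decode_ld(insn, &d) != 0)
        return -1;
    return snprintf(buf, len, "ld r%u,%d(r%u)", d.rt, d.disp, d.ra);
}
```

Feeding it the word marked with angle brackets in each oops gives r12 loaded
from r9 at displacement 80 (0x50) for the first crash and 24 (0x18) for the
second, which matches the two reported fault addresses when r9 is NULL.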

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-22  7:44       ` Linux regression tracking (Thorsten Leemhuis)
@ 2023-06-22 12:36         ` Michael Ellerman
  -1 siblings, 0 replies; 23+ messages in thread
From: Michael Ellerman @ 2023-06-22 12:36 UTC (permalink / raw)
  To: Linux regressions mailing list, Sachin Sant
  Cc: open list, linuxppc-dev, jarkko, mario.limonciello, linux-integrity

"Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> As Linus will likely release 6.4 on this or the following Sunday a quick
> question: is there any hope this regression might be fixed any time
> soon?

No.

I have added the author of the commit to Cc, maybe they can help?

The immediate question is, is it expected for chip->ops to be NULL in
this path? Obviously on actual AMD systems that isn't the case,
otherwise the code would crash there. But is the fact that chip->ops is
NULL a bug in the ibmvtpm driver, or a possibility that has been
overlooked by the checking code?
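
For readers following along, the reason a NULL chip->ops shows up as a fault
at a small address such as 0x50 or 0x18 is that the load address is the NULL
base plus the offset of the member being read. A minimal sketch of that
arithmetic follows. The struct below is hypothetical: it is not the kernel's
ops structure, and its padding arrays are sized only so that, with 8-byte
pointers, the two callbacks land at the offsets seen in the oopses. The name
request_locality comes from the analysis quoted below; op_at_0x18 is a
placeholder for whichever callback tpm_try_transmit() was about to call.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical ops table (assumption, not the real layout). The void *
 * arrays stand in for members this sketch does not model.
 */
struct mock_ops {
    void *unmodelled_a[3];
    int (*op_at_0x18)(void *chip);
    void *unmodelled_b[6];
    int (*request_locality)(void *chip);
};

/* Address a load touches when reading the member at `offset` from `base`. */
static uintptr_t fault_address(const struct mock_ops *base, size_t offset)
{
    assert(offset < sizeof(struct mock_ops));
    return (uintptr_t)base + offset;
}
```

On a 64-bit build, fault_address(NULL, offsetof(struct mock_ops,
request_locality)) evaluates to 0x50 and the same call for op_at_0x18 gives
0x18, so both oopses are consistent with the same NULL pointer being
dereferenced at two different call sites.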

cheers

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
> On 15.06.23 06:57, Sachin Sant wrote:
>> 
>>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
>>>
>>>  2c:   28 07 23 e9     ld      r9,1832(r3)
>>>  30:   50 00 89 e9     ld      r12,80(r9)
>>>
>>> Where r3 is *chip.
>>> r9 is NULL, and 80 = 0x50.
>>>
>>> Looks like a NULL chip->ops, which oopses in:
>>>
>>> static int tpm_request_locality(struct tpm_chip *chip)
>>> {
>>> int rc;
>>>
>>> if (!chip->ops->request_locality)
>>>
>>>
>>> Can you test the patch below?
>>>
>> 
>> It proceeds further but then run into following crash
>> 
>> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
>> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018
>> [  103.269595] Faulting instruction address: 0xc0000000009dcf34
>> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
>> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
>> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G            E      6.4.0-rc6-dirty #8
>> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
>> [  103.269653] NIP:  c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60
>> [  103.269656] REGS: c0000000a113f510 TRAP: 0300   Tainted: G            E       (6.4.0-rc6-dirty)
>> [  103.269660] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88484886  XER: 00000001
>> [  103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0
>> [  103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000
>> [  103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016
>> [  103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000
>> [  103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000
>> [  103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [  103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>> [  103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000
>> [  103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000
>> [  103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300
>> [  103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>> [  103.269717] Call Trace:
>> [  103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable)
>> [  103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>> [  103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110
>> [  103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230
>> [  103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250
>> [  103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160
>> [  103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130
>> [  103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
>> [  103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c
>> [  103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
>> [  103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
>> [  103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
>> [  103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340
>> [  103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
>> [  103.269781] --- interrupt: 3000 at 0x7fff805459f0
>> [  103.269784] NIP:  00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000
>> [  103.269786] REGS: c0000000a113fe80 TRAP: 3000   Tainted: G            E       (6.4.0-rc6-dirty)
>> [  103.269790] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 42422884  XER: 00000000
>> [  103.269799] IRQMASK: 0
>> [  103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead
>> [  103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
>> [  103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000
>> [  103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000
>> [  103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000
>> [  103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003
>> [  103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40
>> [  103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040
>> [  103.269833] NIP [00007fff805459f0] 0x7fff805459f0
>> [  103.269836] LR [0000000000000000] 0x0
>> [  103.269838] --- interrupt: 3000
>> [  103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018
>> [  103.269852] ---[ end trace 0000000000000000 ]---
>> 
>> - Sachin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
@ 2023-06-22 12:36         ` Michael Ellerman
  0 siblings, 0 replies; 23+ messages in thread
From: Michael Ellerman @ 2023-06-22 12:36 UTC (permalink / raw)
  To: Linux regressions mailing list, Sachin Sant
  Cc: linux-integrity, jarkko, linuxppc-dev, open list, mario.limonciello

"Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> for once, to make this easily accessible to everyone.
>
> As Linus will likely release 6.4 on this or the following Sunday a quick
> question: is there any hope this regression might be fixed any time
> soon?

No.

I have added the author of the commit to Cc, maybe they can help?

The immediate question is, is it expected for chip->ops to be NULL in
this path? Obviously on actual AMD systems that isn't the case,
otherwise the code would crash there. But is the fact that chip->ops is
NULL a bug in the ibmvtpm driver, or a possibility that has been
overlooked by the checking code.

cheers

> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> If I did something stupid, please tell me, as explained on that page.
>
> #regzbot poke
>
> On 15.06.23 06:57, Sachin Sant wrote:
>> 
>>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6 
>>>
>>>  2c:   28 07 23 e9     ld      r9,1832(r3)
>>>  30:   50 00 89 e9     ld      r12,80(r9)
>>>
>>> Where r3 is *chip.
>>> r9 is NULL, and 80 = 0x50.
>>>
>>> Looks like a NULL chip->ops, which oopses in:
>>>
>>> static int tpm_request_locality(struct tpm_chip *chip)
>>> {
>>> int rc;
>>>
>>> if (!chip->ops->request_locality)
>>>
>>>
>>> Can you test the patch below?
>>>
>> 
>> It proceeds further but then run into following crash
>> 
>> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
>> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018
>> [  103.269595] Faulting instruction address: 0xc0000000009dcf34
>> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
>> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
>> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G            E      6.4.0-rc6-dirty #8
>> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
>> [  103.269653] NIP:  c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60
>> [  103.269656] REGS: c0000000a113f510 TRAP: 0300   Tainted: G            E       (6.4.0-rc6-dirty)
>> [  103.269660] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88484886  XER: 00000001
>> [  103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0  [  103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000  [  103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016  [  103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000  [  103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000  [  103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000  [  103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000  [  103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000  [  103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000  [  103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300
>> [  103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>> [  103.269717] Call Trace:
>> [  103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable)
>> [  103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>> [  103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110
>> [  103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230
>> [  103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250
>> [  103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160
>> [  103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130
>> [  103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
>> [  103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c
>> [  103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
>> [  103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
>> [  103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
>> [  103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340
>> [  103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
>> [  103.269781] --- interrupt: 3000 at 0x7fff805459f0
>> [  103.269784] NIP:  00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000
>> [  103.269786] REGS: c0000000a113fe80 TRAP: 3000   Tainted: G            E       (6.4.0-rc6-dirty)
>> [  103.269790] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 42422884  XER: 00000000
>> [  103.269799] IRQMASK: 0
>> [  103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead
>> [  103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
>> [  103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000
>> [  103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000
>> [  103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000
>> [  103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003
>> [  103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40
>> [  103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040
>> [  103.269833] NIP [00007fff805459f0] 0x7fff805459f0
>> [  103.269836] LR [0000000000000000] 0x0
>> [  103.269838] --- interrupt: 3000
>> [  103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018
>> [  103.269852] ---[ end trace 0000000000000000 ]---
>> 
>> - Sachin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-22 12:36         ` Michael Ellerman
@ 2023-06-22 14:38           ` Limonciello, Mario
  -1 siblings, 0 replies; 23+ messages in thread
From: Limonciello, Mario @ 2023-06-22 14:38 UTC (permalink / raw)
  To: Michael Ellerman, Linux regressions mailing list, Sachin Sant
  Cc: open list, linuxppc-dev, jarkko, linux-integrity


On 6/22/2023 7:36 AM, Michael Ellerman wrote:
> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>> for once, to make this easily accessible to everyone.
>>
>> As Linus will likely release 6.4 on this or the following Sunday a quick
>> question: is there any hope this regression might be fixed any time
>> soon?
> No.
>
> I have added the author of the commit to Cc, maybe they can help?
>
> The immediate question is, is it expected for chip->ops to be NULL in
> this path? Obviously on actual AMD systems that isn't the case,
> otherwise the code would crash there. But is the fact that chip->ops is
> NULL a bug in the ibmvtpm driver, or a possibility that has been
> overlooked by the checking code?
>
> cheers

All that code assumes that the TPM is still functional, which
does not seem to be the case for your TPM.

This should fix it:

diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 5be91591cb3b..7082b031741e 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
         u64 version;
         int ret;

+       if (!chip->ops)
+               return false;
+
         if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
                 return false;
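
For what it's worth, the effect of the guard can be modelled in userspace.
This is only a sketch under assumptions: the model_* names, the flag value,
the version cutoff and the read_version callback are made up for
illustration and are not the kernel's API. It just shows that a chip whose
ops pointer is already NULL is rejected before anything dereferences it.

```c
#include <stdbool.h>
#include <stddef.h>

struct model_chip;

/* Hypothetical stand-in for the driver callbacks (not struct tpm_class_ops). */
struct model_ops {
	int (*read_version)(const struct model_chip *chip,
			    unsigned long long *version);
};

/* Hypothetical stand-in for struct tpm_chip. */
struct model_chip {
	unsigned int flags;
	const struct model_ops *ops;	/* NULL once the driver tore the chip down */
};

#define MODEL_FLAG_TPM2		(1u << 0)	/* illustrative value */
#define MODEL_FIXED_VERSION	100ULL		/* illustrative cutoff */

/* Sample callback that reports a version below the cutoff. */
static int model_read_old_version(const struct model_chip *chip,
				  unsigned long long *version)
{
	(void)chip;
	*version = MODEL_FIXED_VERSION - 1;
	return 0;
}

static bool model_is_rng_defective(const struct model_chip *chip)
{
	unsigned long long version;

	if (!chip->ops)			/* the added guard */
		return false;

	if (!(chip->flags & MODEL_FLAG_TPM2))
		return false;

	if (chip->ops->read_version(chip, &version))
		return false;		/* query failed, assume not affected */

	return version < MODEL_FIXED_VERSION;
}
```

With ops set to NULL the function returns false without touching the
callback, which is the behaviour the hunk above is after.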

>> Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
>> --
>> Everything you wanna know about Linux kernel regression tracking:
>> https://linux-regtracking.leemhuis.info/about/#tldr
>> If I did something stupid, please tell me, as explained on that page.
>>
>> #regzbot poke
>>
>> On 15.06.23 06:57, Sachin Sant wrote:
>>>>> [ 34.381788] Code: 5463063e 408201c8 38210080 4e800020 60000000 60000000 60000000 7c0802a6 fbe10078 7c7f1b78 f8010090 e9230728 <e9890050> 2c2c0000 41820020 7d8903a6
>>>>   2c:   28 07 23 e9     ld      r9,1832(r3)
>>>>   30:   50 00 89 e9     ld      r12,80(r9)
>>>>
>>>> Where r3 is *chip.
>>>> r9 is NULL, and 80 = 0x50.
>>>>
>>>> Looks like a NULL chip->ops, which oopses in:
>>>>
>>>> static int tpm_request_locality(struct tpm_chip *chip)
>>>> {
>>>> int rc;
>>>>
>>>> if (!chip->ops->request_locality)
>>>>
>>>>
>>>> Can you test the patch below?
>>>>
>>> It proceeds further but then runs into the following crash:
>>>
>>> [  103.269574] Kernel attempted to read user page (18) - exploit attempt? (uid: 0)
>>> [  103.269589] BUG: Kernel NULL pointer dereference on read at 0x00000018
>>> [  103.269595] Faulting instruction address: 0xc0000000009dcf34
>>> [  103.269599] Oops: Kernel access of bad area, sig: 11 [#1]
>>> [  103.269602] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries
>>> [  103.269606] Modules linked in: dm_mod(E) nft_fib_inet(E) nft_fib_ipv4(E) nft_fib_ipv6(E) nft_fib(E) nft_reject_inet(E) nf_reject_ipv4(E) nf_reject_ipv6(E) nft_reject(E) nft_ct(E) nft_chain_nat(E) nf_nat(E) nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) bonding(E) tls(E) rfkill(E) ip_set(E) sunrpc(E) nf_tables(E) nfnetlink(E) pseries_rng(E) aes_gcm_p10_crypto(E) drm(E) drm_panel_orientation_quirks(E) xfs(E) libcrc32c(E) sd_mod(E) sr_mod(E) t10_pi(E) crc64_rocksoft_generic(E) cdrom(E) crc64_rocksoft(E) crc64(E) sg(E) ibmvscsi(E) scsi_transport_srp(E) ibmveth(E) vmx_crypto(E) fuse(E)
>>> [  103.269644] CPU: 18 PID: 6872 Comm: kexec Kdump: loaded Tainted: G            E      6.4.0-rc6-dirty #8
>>> [  103.269649] Hardware name: IBM,9080-HEX POWER10 (raw) 0x800200 0xf000006 of:IBM,FW1030.20 (NH1030_058) hv:phyp pSeries
>>> [  103.269653] NIP:  c0000000009dcf34 LR: c0000000009dd2bc CTR: c0000000009eaa60
>>> [  103.269656] REGS: c0000000a113f510 TRAP: 0300   Tainted: G            E       (6.4.0-rc6-dirty)
>>> [  103.269660] MSR:  800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE>  CR: 88484886  XER: 00000001
>>> [  103.269669] CFAR: c0000000009dd2b8 DAR: 0000000000000018 DSISR: 40000000 IRQMASK: 0
>>> [  103.269669] GPR00: c0000000009dd2bc c0000000a113f7b0 c0000000014a1500 c000000090310000
>>> [  103.269669] GPR04: c00000009f770000 0000000000000016 0000060000007a01 0000000000000016
>>> [  103.269669] GPR08: c00000009f770000 0000000000000000 0000000000000000 0000000000008000
>>> [  103.269669] GPR12: c0000000009eaa60 c00000135fab7f00 0000000000000000 0000000000000000
>>> [  103.269669] GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> [  103.269669] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000
>>> [  103.269669] GPR24: 0000000000000000 0000000000000016 c000000090310000 0000000000001000
>>> [  103.269669] GPR28: c00000009f770000 000000007a010000 c00000009f770000 c000000090310000
>>> [  103.269707] NIP [c0000000009dcf34] tpm_try_transmit+0x74/0x300
>>> [  103.269713] LR [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>>> [  103.269717] Call Trace:
>>> [  103.269718] [c0000000a113f7b0] [c0000000a113f880] 0xc0000000a113f880 (unreliable)
>>> [  103.269724] [c0000000a113f840] [c0000000009dd2bc] tpm_transmit+0xfc/0x190
>>> [  103.269727] [c0000000a113f900] [c0000000009dd398] tpm_transmit_cmd+0x48/0x110
>>> [  103.269731] [c0000000a113f980] [c0000000009df1b0] tpm2_get_tpm_pt+0x140/0x230
>>> [  103.269736] [c0000000a113fa20] [c0000000009db208] tpm_amd_is_rng_defective+0xb8/0x250
>>> [  103.269739] [c0000000a113faa0] [c0000000009db828] tpm_chip_unregister+0x138/0x160
>>> [  103.269743] [c0000000a113fae0] [c0000000009eaa94] tpm_ibmvtpm_remove+0x34/0x130
>>> [  103.269748] [c0000000a113fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
>>> [  103.269754] [c0000000a113fb90] [c000000000a01dcc] device_shutdown+0x21c/0x39c
>>> [  103.269758] [c0000000a113fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
>>> [  103.269762] [c0000000a113fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
>>> [  103.269766] [c0000000a113fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
>>> [  103.269770] [c0000000a113fe10] [c000000000034adc] system_call_exception+0x13c/0x340
>>> [  103.269776] [c0000000a113fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec
>>> [  103.269781] --- interrupt: 3000 at 0x7fff805459f0
>>> [  103.269784] NIP:  00007fff805459f0 LR: 0000000000000000 CTR: 0000000000000000
>>> [  103.269786] REGS: c0000000a113fe80 TRAP: 3000   Tainted: G            E       (6.4.0-rc6-dirty)
>>> [  103.269790] MSR:  800000000280f033 <SF,VEC,VSX,EE,PR,FP,ME,IR,DR,RI,LE>  CR: 42422884  XER: 00000000
>>> [  103.269799] IRQMASK: 0
>>> [  103.269799] GPR00: 0000000000000058 00007fffc07a68c0 0000000110437f00 fffffffffee1dead
>>> [  103.269799] GPR04: 0000000028121969 0000000045584543 0000000000000000 0000000000000003
>>> [  103.269799] GPR08: 0000000000100000 0000000000000000 0000000000000000 0000000000000000
>>> [  103.269799] GPR12: 0000000000000000 00007fff8089b2c0 000000011042f598 0000000000000000
>>> [  103.269799] GPR16: ffffffffffffffff 0000000000000000 000000011040fcc0 0000000000000000
>>> [  103.269799] GPR20: 0000000000008913 0000000000008914 0000000149c61020 0000000000000003
>>> [  103.269799] GPR24: 0000000000000000 0000000000000001 0000000000000003 00007fffc07a6a40
>>> [  103.269799] GPR28: 0000000110409f10 00007fff806419c0 0000000149c61080 0000000149c61040
>>> [  103.269833] NIP [00007fff805459f0] 0x7fff805459f0
>>> [  103.269836] LR [0000000000000000] 0x0
>>> [  103.269838] --- interrupt: 3000
>>> [  103.269839] Code: 83a40006 2c090000 41820208 7c0802a6 79250020 7c25d840 f80100a0 41810224 fbe10088 f8410018 7c7f1b78 e9230728 <e9890018> 7d8903a6 4e800421 e8410018
>>> [  103.269852] ---[ end trace 0000000000000000 ]---
>>>
>>> - Sachin
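
As a side note, the fault addresses in the two oopses (0x50 and 0x18) look
like member offsets from a NULL ops pointer. Below is a small userspace
sketch of that arithmetic. It assumes a 64-bit ABI and that the member
order of struct tpm_class_ops is as I recall it from include/linux/tpm.h;
the model_ names are made up and the struct is only a layout model.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct model_chip;	/* opaque; only pointers to it are used */

/* Assumed to mirror the member order of struct tpm_class_ops. */
struct model_tpm_class_ops {
	unsigned int flags;
	const uint8_t req_complete_mask;
	const uint8_t req_complete_val;
	bool (*req_canceled)(struct model_chip *chip, uint8_t status);
	int (*recv)(struct model_chip *chip, uint8_t *buf, size_t len);
	int (*send)(struct model_chip *chip, uint8_t *buf, size_t len);
	void (*cancel)(struct model_chip *chip);
	uint8_t (*status)(struct model_chip *chip);
	void (*update_timeouts)(struct model_chip *chip,
				unsigned long *timeout_cap);
	void (*update_durations)(struct model_chip *chip,
				 unsigned long *duration_cap);
	int (*go_idle)(struct model_chip *chip);
	int (*cmd_ready)(struct model_chip *chip);
	int (*request_locality)(struct model_chip *chip, int loc);
	int (*relinquish_locality)(struct model_chip *chip, int loc);
	void (*clk_enable)(struct model_chip *chip, bool value);
};

/* Address a load of ops->request_locality would hit when ops is NULL. */
static size_t model_fault_addr_request_locality(void)
{
	return offsetof(struct model_tpm_class_ops, request_locality);
}

/* Address a load of ops->send would hit when ops is NULL. */
static size_t model_fault_addr_send(void)
{
	return offsetof(struct model_tpm_class_ops, send);
}
```

Under those assumptions the two helpers should come out as 0x50 and 0x18,
which would be consistent with the first oops being the request_locality
check and the second being the ops->send call in tpm_try_transmit.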

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-22 14:38           ` Limonciello, Mario
@ 2023-06-23  2:52             ` Sachin Sant
  -1 siblings, 0 replies; 23+ messages in thread
From: Sachin Sant @ 2023-06-23  2:52 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: Michael Ellerman, Linux regressions mailing list, open list,
	linuxppc-dev, jarkko, linux-integrity, Aneesh Kumar K.V



> On 22-Jun-2023, at 8:08 PM, Limonciello, Mario <Mario.Limonciello@amd.com> wrote:
> 
> 
> On 6/22/2023 7:36 AM, Michael Ellerman wrote:
>> "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
>>> Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
>>> for once, to make this easily accessible to everyone.
>>> 
>>> As Linus will likely release 6.4 on this or the following Sunday a quick
>>> question: is there any hope this regression might be fixed any time
>>> soon?
>> No.
>> 
>> I have added the author of the commit to Cc, maybe they can help?
>> 
>> The immediate question is, is it expected for chip->ops to be NULL in
>> this path? Obviously on actual AMD systems that isn't the case,
>> otherwise the code would crash there. But is the fact that chip->ops is
>> NULL a bug in the ibmvtpm driver, or a possibility that has been
>> overlooked by the checking code?
>> 
>> cheers
> 
> All that code assumes that the TPM is still functional which
> seems not to be the case for your TPM.
> 
> This should fix it:

Yes, with this change kexec works correctly.

Since Aneesh first reported this problem, I am including a Reported-by credit for him:

Reported-by: Aneesh Kumar K. V <aneesh.kumar@linux.ibm.com>
Reported-by: Sachin Sant <sachinp@linux.ibm.com>
Tested-by: Sachin Sant <sachinp@linux.ibm.com>

-Sachin

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-22 14:38           ` Limonciello, Mario
@ 2023-06-29 17:06             ` Jerry Snitselaar
  -1 siblings, 0 replies; 23+ messages in thread
From: Jerry Snitselaar @ 2023-06-29 17:06 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant,
	open list, linuxppc-dev, jarkko, linux-integrity

On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote:
> 
> On 6/22/2023 7:36 AM, Michael Ellerman wrote:
> > "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
> > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> > > for once, to make this easily accessible to everyone.
> > > 
> > > As Linus will likely release 6.4 on this or the following Sunday a quick
> > > question: is there any hope this regression might be fixed any time
> > > soon?
> > No.
> > 
> > I have added the author of the commit to Cc, maybe they can help?
> > 
> > The immediate question is, is it expected for chip->ops to be NULL in
> > this path? Obviously on actual AMD systems that isn't the case,
> > otherwise the code would crash there. But is the fact that chip->ops is
> > NULL a bug in the ibmvtpm driver, or a possibility that has been
> > overlooked by the checking code?
> > 
> > cheers
> 
> All that code assumes that the TPM is still functional which
> seems not to be the case for your TPM.
> 
> This should fix it:
> 
> diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> index 5be91591cb3b..7082b031741e 100644
> --- a/drivers/char/tpm/tpm-chip.c
> +++ b/drivers/char/tpm/tpm-chip.c
> @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
>         u64 version;
>         int ret;
> 
> +       if (!chip->ops)
> +               return false;
> +
>         if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
>                 return false;


Should tpm_amd_is_rng_defective() compile to nothing on non-x86 architectures?
This code is all about working around an issue with the AMD fTPM, right?

Regards,
Jerry


^ permalink raw reply	[flat|nested] 23+ messages in thread

* RE: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-29 17:06             ` Jerry Snitselaar
@ 2023-06-29 17:28               ` Limonciello, Mario
  -1 siblings, 0 replies; 23+ messages in thread
From: Limonciello, Mario @ 2023-06-29 17:28 UTC (permalink / raw)
  To: Jerry Snitselaar
  Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant,
	open list, linuxppc-dev, jarkko, linux-integrity

[Public]

> -----Original Message-----
> From: Jerry Snitselaar <jsnitsel@redhat.com>
> Sent: Thursday, June 29, 2023 12:07 PM
> To: Limonciello, Mario <Mario.Limonciello@amd.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list
> <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open
> list <linux-kernel@vger.kernel.org>;
> linuxppc-dev <linuxppc-dev@lists.ozlabs.org>; jarkko@kernel.org;
> linux-integrity@vger.kernel.org
> Subject: Re: [6.4-rc6] Crash during a kexec operation
> (tpm_amd_is_rng_defective)
>
> On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote:
> >
> > On 6/22/2023 7:36 AM, Michael Ellerman wrote:
> > > "Linux regression tracking (Thorsten Leemhuis)" <regressions@leemhuis.info> writes:
> > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> > > > for once, to make this easily accessible to everyone.
> > > >
> > > > As Linus will likely release 6.4 on this or the following Sunday a quick
> > > > question: is there any hope this regression might be fixed any time
> > > > soon?
> > > No.
> > >
> > > I have added the author of the commit to Cc, maybe they can help?
> > >
> > > The immediate question is, is it expected for chip->ops to be NULL in
> > > this path? Obviously on actual AMD systems that isn't the case,
> > > otherwise the code would crash there. But is the fact that chip->ops is
> > > NULL a bug in the ibmvtpm driver, or a possibility that has been
> > > overlooked by the checking code?
> > >
> > > cheers
> >
> > All that code assumes that the TPM is still functional which
> > seems not to be the case for your TPM.
> >
> > This should fix it:
> >
> > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > index 5be91591cb3b..7082b031741e 100644
> > --- a/drivers/char/tpm/tpm-chip.c
> > +++ b/drivers/char/tpm/tpm-chip.c
> > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> >         u64 version;
> >         int ret;
> >
> > +       if (!chip->ops)
> > +               return false;
> > +
> >         if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
> >                 return false;
>
>
> Should tpm_amd_is_rng_defective compile to nothing on non-x86
> architectures? This code is all about
> working around an issue with the AMD fTPM, right?
>

That's a good point.  Yes, it could, and that would also solve this problem.
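
Roughly along these lines, as an untested userspace sketch rather than a
patch. SKETCH_CONFIG_X86 stands in for the kernel's CONFIG_X86, and the
sketch_ names and the bad_version field are made up for illustration:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct tpm_chip. */
struct sketch_chip {
	const void *ops;	/* NULL once the chip has been torn down */
	bool bad_version;	/* placeholder for the firmware version query */
};

#ifdef SKETCH_CONFIG_X86	/* stand-in for CONFIG_X86 in this sketch */
static bool sketch_is_rng_defective(const struct sketch_chip *chip)
{
	if (!chip || !chip->ops)
		return false;

	/* The AMD-specific firmware version check would go here. */
	return chip->bad_version;
}
#else
/* On every other architecture the check folds away to a constant. */
static inline bool sketch_is_rng_defective(const struct sketch_chip *chip)
{
	(void)chip;
	return false;
}
#endif
```

Built without SKETCH_CONFIG_X86 the helper is a constant false, so the
unregister path would never try to talk to the TPM on powerpc.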


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-29 17:28               ` Limonciello, Mario
@ 2023-06-29 17:43                 ` Jerry Snitselaar
  -1 siblings, 0 replies; 23+ messages in thread
From: Jerry Snitselaar @ 2023-06-29 17:43 UTC (permalink / raw)
  To: Limonciello, Mario
  Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant,
	open list, linuxppc-dev, jarkko, linux-integrity

On Thu, Jun 29, 2023 at 05:28:58PM +0000, Limonciello, Mario wrote:
> [Public]
> 
> > -----Original Message-----
> > From: Jerry Snitselaar <jsnitsel@redhat.com>
> > Sent: Thursday, June 29, 2023 12:07 PM
> > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list
> > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open
> > list <linux-kernel@vger.kernel.org>;
> > linuxppc-dev <linuxppc-dev@lists.ozlabs.org>; jarkko@kernel.org;
> > linux-integrity@vger.kernel.org
> > Subject: Re: [6.4-rc6] Crash during a kexec operation
> > (tpm_amd_is_rng_defective)
> >
> > On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote:
> > >
> > > On 6/22/2023 7:36 AM, Michael Ellerman wrote:
> > > > "Linux regression tracking (Thorsten Leemhuis)"
> > <regressions@leemhuis.info> writes:
> > > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> > > > > for once, to make this easily accessible to everyone.
> > > > >
> > > > > As Linus will likely release 6.4 on this or the following Sunday a quick
> > > > > question: is there any hope this regression might be fixed any time
> > > > > soon?
> > > > No.
> > > >
> > > > I have added the author of the commit to Cc, maybe they can help?
> > > >
> > > > The immediate question is, is it expected for chip->ops to be NULL in
> > > > this path? Obviously on actual AMD systems that isn't the case,
> > > > otherwise the code would crash there. But is the fact that chip->ops is
> > > > NULL a bug in the ibmvtpm driver, or a possibility that has been
> > > > overlooked by the checking code.
> > > >
> > > > cheers
> > >
> > > All that code assumes that the TPM is still functional which
> > > seems not to be the case for your TPM.
> > >
> > > This should fix it:
> > >
> > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > > index 5be91591cb3b..7082b031741e 100644
> > > --- a/drivers/char/tpm/tpm-chip.c
> > > +++ b/drivers/char/tpm/tpm-chip.c
> > > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> > >         u64 version;
> > >         int ret;
> > >
> > > +       if (!chip->ops)
> > > +               return false;
> > > +
> > >         if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
> > >                 return false;
> >
> >
> > Should tpm_amd_is_rng_defective compile to nothing on non-x86
> > architectures? This code is all about
> > working around an issue with the AMD fTPM, right?
> >
> 
> That's a good point.  Yes it could and that would also solve this problem.
> 
Or I guess more accurately for non-x86 it should be:

static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
{
	return false;
}



* RE: [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective)
  2023-06-29 17:43                 ` Jerry Snitselaar
@ 2023-06-29 17:45                   ` Limonciello, Mario
  -1 siblings, 0 replies; 23+ messages in thread
From: Limonciello, Mario @ 2023-06-29 17:45 UTC (permalink / raw)
  To: Jerry Snitselaar
  Cc: Michael Ellerman, Linux regressions mailing list, Sachin Sant,
	open list, linuxppc-dev, jarkko, linux-integrity

[AMD Official Use Only - General]

> -----Original Message-----
> From: Jerry Snitselaar <jsnitsel@redhat.com>
> Sent: Thursday, June 29, 2023 12:43 PM
> To: Limonciello, Mario <Mario.Limonciello@amd.com>
> Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list
> <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>; open
> list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc-
> dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org
> Subject: Re: [6.4-rc6] Crash during a kexec operation
> (tpm_amd_is_rng_defective)
>
> On Thu, Jun 29, 2023 at 05:28:58PM +0000, Limonciello, Mario wrote:
> > [Public]
> >
> > > -----Original Message-----
> > > From: Jerry Snitselaar <jsnitsel@redhat.com>
> > > Sent: Thursday, June 29, 2023 12:07 PM
> > > To: Limonciello, Mario <Mario.Limonciello@amd.com>
> > > Cc: Michael Ellerman <mpe@ellerman.id.au>; Linux regressions mailing list
> > > <regressions@lists.linux.dev>; Sachin Sant <sachinp@linux.ibm.com>;
> open
> > > list <linux-kernel@vger.kernel.org>; linuxppc-dev <linuxppc-
> > > dev@lists.ozlabs.org>; jarkko@kernel.org; linux-integrity@vger.kernel.org
> > > Subject: Re: [6.4-rc6] Crash during a kexec operation
> > > (tpm_amd_is_rng_defective)
> > >
> > > On Thu, Jun 22, 2023 at 09:38:04AM -0500, Limonciello, Mario wrote:
> > > >
> > > > On 6/22/2023 7:36 AM, Michael Ellerman wrote:
> > > > > "Linux regression tracking (Thorsten Leemhuis)"
> > > <regressions@leemhuis.info> writes:
> > > > > > Hi, Thorsten here, the Linux kernel's regression tracker. Top-posting
> > > > > > for once, to make this easily accessible to everyone.
> > > > > >
> > > > > > As Linus will likely release 6.4 on this or the following Sunday a quick
> > > > > > question: is there any hope this regression might be fixed any time
> > > > > > soon?
> > > > > No.
> > > > >
> > > > > I have added the author of the commit to Cc, maybe they can help?
> > > > >
> > > > > The immediate question is, is it expected for chip->ops to be NULL in
> > > > > this path? Obviously on actual AMD systems that isn't the case,
> > > > > otherwise the code would crash there. But is the fact that chip->ops is
> > > > > NULL a bug in the ibmvtpm driver, or a possibility that has been
> > > > > overlooked by the checking code.
> > > > >
> > > > > cheers
> > > >
> > > > All that code assumes that the TPM is still functional which
> > > > seems not to be the case for your TPM.
> > > >
> > > > This should fix it:
> > > >
> > > > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > > > index 5be91591cb3b..7082b031741e 100644
> > > > --- a/drivers/char/tpm/tpm-chip.c
> > > > +++ b/drivers/char/tpm/tpm-chip.c
> > > > @@ -525,6 +525,9 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> > > >         u64 version;
> > > >         int ret;
> > > >
> > > > +       if (!chip->ops)
> > > > +               return false;
> > > > +
> > > >         if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
> > > >                 return false;
> > >
> > >
> > > Should tpm_amd_is_rng_defective compile to nothing on non-x86
> > > architectures? This code is all about
> > > working around an issue with the AMD fTPM, right?
> > >
> >
> > That's a good point.  Yes it could and that would also solve this problem.
> >
> Or I guess more accurately for non-x86 it should be:
>
> static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> {
>       return false;
> }


Right, but it should be inline.  Would you mind sending something out for
your cleaner idea to supersede my other solution that still hasn't been merged?
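
For readers following the thread, here is a minimal user-space sketch of the
stub approach discussed above. It is not the patch that was eventually sent.
The macro SKETCH_X86 stands in for the kernel's CONFIG_X86, and the struct
definitions and flag value are simplified placeholders, not the real kernel
declarations. The x86 branch keeps the NULL check from the earlier diff and
elides the actual fTPM version query.

```c
#include <stdbool.h>
#include <stddef.h>

/* Placeholder types (assumption): the real kernel structs are far larger. */
struct tpm_class_ops {
	int unused;
};

struct tpm_chip {
	const struct tpm_class_ops *ops; /* may be NULL once the chip is torn down */
	unsigned int flags;
};

/* Illustrative flag value (assumption), used only by this sketch. */
#define TPM_CHIP_FLAG_TPM2 (1u << 1)

#ifdef SKETCH_X86 /* stands in for CONFIG_X86 */
static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
{
	/* Guard from the earlier diff: do not touch a chip without ops. */
	if (!chip->ops)
		return false;

	if (!(chip->flags & TPM_CHIP_FLAG_TPM2))
		return false;

	/* The AMD fTPM version query would go here; elided in this sketch. */
	return false;
}
#else
/* On other architectures the check is a constant the compiler can fold away. */
static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
{
	(void)chip;
	return false;
}
#endif
```

With the stub in place, a caller on a non-x86 build never dereferences
chip->ops through this function, whether or not the pointer is NULL.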



end of thread, other threads:[~2023-06-29 17:46 UTC | newest]

Thread overview: 23+ messages
2023-06-14 15:12 [6.4-rc6] Crash during a kexec operation (tpm_amd_is_rng_defective) Sachin Sant
2023-06-15  3:07 ` Michael Ellerman
2023-06-15  4:57   ` Sachin Sant
2023-06-22  7:44     ` Linux regression tracking (Thorsten Leemhuis)
2023-06-22 12:36       ` Michael Ellerman
2023-06-22 14:38         ` Limonciello, Mario
2023-06-23  2:52           ` Sachin Sant
2023-06-29 17:06           ` Jerry Snitselaar
2023-06-29 17:28             ` Limonciello, Mario
2023-06-29 17:43               ` Jerry Snitselaar
2023-06-29 17:45                 ` Limonciello, Mario
2023-06-15 12:04 ` Linux regression tracking #adding (Thorsten Leemhuis)
