From: Vlastimil Babka <>
To: Peter Huewe <>,
	Jarkko Sakkinen <>,
	Jason Gunthorpe <>, Jan Dabros <>
Cc: LKML <>,
	Dominik Brodowski <>,
	"Jason A. Donenfeld" <>,
	Herbert Xu <>,
	Linus Torvalds <>
Subject: [REGRESSION] suspend to ram fails in 6.1 due to tpm errors
Date: Mon, 28 Nov 2022 09:15:33 +0100


I've noticed that my Lenovo T460 has developed a failure to suspend to RAM,
related to TPM errors. It seems that at each suspend/resume cycle there's a
chance the errors develop, and then further suspends are blocked by the stuck
TPM (or its driver?). I can say for sure it never happened before 6.1; however,
I didn't test all 6.1 RCs, nor suspend/resume enough times, to pinpoint whether
it started with 6.1-rc1 (which would naturally be suspicious). Bisecting would
also be time consuming and unreliable, so hopefully this historical account
will be sufficient:

The tpm messages on boot are always the same, here from 6.0.0-rc7:

tpm_tis 00:08: 1.2 TPM (device-id 0x1B, rev-id 16)
tpm tpm0: TPM is disabled/deactivated (0x6)
tpm tpm0: tpm_read_log_acpi: TCPA log area empty

And normally during resume from suspend to ram I can see:

tpm tpm0: TPM is disabled/deactivated (0x6)

With 6.1-rc3 (the first 6.1-rcX I've tried on this laptop) this was still
behaving OK, suspend/resume went fine 4 times until I updated the kernel
and rebooted. Maybe it just wasn't enough cycles to hit the issue.
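As a rough illustration of why a handful of clean cycles proves little: if each resume fails independently with probability p, the chance of n clean resumes in a row is (1 - p)^n. The sketch below is not part of the original report; the value p = 0.25 (about one failure per four resumes) is only a guess read off the counts mentioned in this mail, not a measured rate, and the function names are made up for illustration.

```python
import math


def prob_all_clean(p_fail: float, cycles: int) -> float:
    """Probability that `cycles` independent resumes all succeed."""
    if not 0.0 <= p_fail <= 1.0:
        raise ValueError("p_fail must be within [0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return (1.0 - p_fail) ** cycles


def cycles_needed(p_fail: float, confidence: float) -> int:
    """Smallest n for which at least one failure is seen with
    probability >= confidence, assuming independent cycles."""
    if not 0.0 < p_fail < 1.0 or not 0.0 < confidence < 1.0:
        raise ValueError("p_fail and confidence must be within (0, 1)")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))
```

Under that assumed rate, `prob_all_clean(0.25, 4)` is about 0.32, so four good resumes on a kernel would happen roughly one time in three even if the kernel were affected. The same arithmetic is why bisecting an intermittent failure like this is unreliable: each "good" verdict needs many cycles to be trustworthy.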

With 6.1-rc4, there were initially 3 resumes OK, but on the 4th resume I saw:

tpm tpm0: tpm_try_transmit: send(): error -5
tpm tpm0: invalid TPM_STS.x 0xff, dumping stack for forensics
CPU: 2 PID: 15299 Comm: systemd-sleep Not tainted 6.1.0-rc4-2.gc03e512-default #1 openSUSE Tumbleweed (unreleased) 232cc11569ae1616983f707f1010e2c19601c7ee
Hardware name: LENOVO 20FMS27W03/20FMS27W03, BIOS R06ET71W (1.45 ) 02/21/2022
Call Trace:
 ? tpm_tcg_read_bytes+0x8f/0xa0
 ? set_next_entity+0xda/0x150
 ? pnp_bus_suspend+0x10/0x10
 ? dpm_show_time.cold+0x62/0x62
 ? do_syscall_64+0x67/0x80
 ? do_syscall_64+0x67/0x80
RIP: 0033:0x7f6197f079d4
Code: ff eb b7 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa 90 90 80 3d cd 0f 0f 00 00 74 13 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 54 c3 0f 1f 00 48 83 ec 28 48 89 54 24 18 48
RSP: 002b:00007ffdaf9a92d8 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
RAX: ffffffffffffffda RBX: 0000000000000004 RCX: 00007f6197f079d4
RDX: 0000000000000004 RSI: 00007ffdaf9a93c0 RDI: 0000000000000004
RBP: 00007ffdaf9a93c0 R08: 0000563fe8e33650 R09: 0000000000000073
R10: 00000000ffffffff R11: 0000000000000202 R12: 0000000000000004
R13: 0000563fe8e152d0 R14: 0000000000000004 R15: 00007f6197fe69e0

and a second later

tpm tpm0: tpm_try_transmit: send(): error -62

But that appears not to be the main issue. I only noticed it now, while
gathering the info, and not back then, as there were another 5 resumes OK
within the same kernel boot.

But then on another resume I got:

tpm tpm0: A TPM error (28) occurred continue selftest

And afterwards, many messages scattered in the log:

tpm tpm0: A TPM error (28) occurred attempting get random

And since then, suspend to ram no longer works and I see this:

tpm tpm0: Error (28) sending savestate before suspend
tpm_tis 00:08: PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x80 returns 28
tpm_tis 00:08: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 28
tpm_tis 00:08: PM: failed to suspend: error 28
PM: Some devices failed to suspend, or early wake event detected
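For reference, the numeric codes in the messages quoted so far can be decoded with a small lookup. This is an illustrative sketch, not something from the original report: the negative values are taken to be Linux errno numbers, and the positive values to be TPM 1.2 return codes (TPM_BASE + n in the TCG TPM 1.2 specification). The tables cover only the codes that appear in this mail, and the helper name `decode` is made up.

```python
# Linux errno numbers that show up (negated) in the send() messages.
LINUX_ERRNO = {
    5: "EIO",     # I/O error
    62: "ETIME",  # timer expired
}

# TPM 1.2 return codes; only the ones seen in this report are listed.
TPM12_RC = {
    6: "TPM_DEACTIVATED",
    7: "TPM_DISABLED",
    28: "TPM_FAILEDSELFTEST",
}


def decode(code: int) -> str:
    """Map a code from the log to a symbolic name.

    Assumption: negative values are errnos, positive values are
    TPM 1.2 return codes passed through by the driver.
    """
    if code < 0:
        return LINUX_ERRNO.get(-code, f"unknown errno {-code}")
    return TPM12_RC.get(code, f"unknown TPM 1.2 code {code}")
```

Read this way, `decode(28)` gives `TPM_FAILEDSELFTEST`, which would mean the positive 28 returned by tpm_pm_suspend is the TPM's own return code rather than a kernel errno.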

After a reboot to 6.1-rc6, there were initially 3 resumes OK, and then again on the 4th resume:

tpm tpm0: A TPM error (28) occurred continue selftest

and the same story followed, with the errors attempting get random and suspend
failing. Notably this was without the tpm_try_transmit splat above, so that is
probably indeed not the main issue. The moment things go wrong is
the "A TPM error (28) occurred continue selftest" during resume.
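A minimal sketch of how a saved kernel log could be scanned for that moment, assuming the log is available as a list of text lines (for example from a saved dmesg or journal dump). The patterns are copied from the messages quoted in this mail; the function name and the returned dictionary layout are made up for illustration.

```python
import re

# The message that marks where things go wrong.
SELFTEST_RE = re.compile(r"A TPM error \((\d+)\) occurred continue selftest")

# Follow-up failures reported after that point.
FOLLOWUP_RE = re.compile(
    r"A TPM error \(\d+\) occurred attempting get random"
    r"|Error \(\d+\) sending savestate before suspend"
)


def summarize_tpm_failure(lines):
    """Find the first selftest failure and count later follow-up errors."""
    first_index = None
    code = None
    followups = 0
    for index, line in enumerate(lines):
        if first_index is None:
            match = SELFTEST_RE.search(line)
            if match:
                first_index = index
                code = int(match.group(1))
        elif FOLLOWUP_RE.search(line):
            followups += 1
    return {"first_index": first_index, "code": code, "followups": followups}
```

Running it over each boot's log separately would show whether the selftest failure always precedes the get random and savestate errors, as it did in the two boots described here.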

Dominik on IRC pointed me to commit b006c439d58d ("hwrng: core - start hwrng
kthread also for untrusted sources"), which could make sense if the TPM was not
used at all before and now it's used for randomness. But then it probably "just"
uncovered a pre-existing issue? Maybe there's a race with getting the randomness
and suspend? Could it be exactly what this patch is attempting to fix?
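Whether the TPM is actually the active hwrng source on a given boot can be checked from sysfs. The sketch below is an illustration, not part of the original report: it reads the `rng_current` and `rng_available` attributes of the hw_random misc device, and it assumes that the TPM's hwrng name starts with "tpm", which should be verified on the machine in question.

```python
from pathlib import Path

# sysfs directory of the hw_random misc device.
HWRNG_DIR = Path("/sys/class/misc/hw_random")


def read_hwrng_state(base=HWRNG_DIR):
    """Return the current and available hwrng names, or None/[] if unreadable."""
    def read_attr(name):
        try:
            return (Path(base) / name).read_text().strip()
        except OSError:
            return None

    current = read_attr("rng_current")
    available = read_attr("rng_available")
    return {
        "current": current,
        "available": available.split() if available else [],
    }


def tpm_is_current(state):
    """Assumption: the TPM's hwrng name begins with 'tpm'."""
    return bool(state["current"]) and state["current"].startswith("tpm")
```

Comparing the output on 6.0 and 6.1 on the same machine would not prove the theory, but it would show whether the TPM is selected as the hwrng source in both cases.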


#regzbot introduced: v6.0..v6.1-rc4
