linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: "Jason A. Donenfeld" <Jason@zx2c4.com>
To: James Bottomley <James.Bottomley@HansenPartnership.com>
Cc: Vlastimil Babka <vbabka@suse.cz>, Peter Huewe <peterhuewe@gmx.de>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jan Dabros <jsd@semihalf.com>,
	regressions@lists.linux.dev, LKML <linux-kernel@vger.kernel.org>,
	linux-integrity@vger.kernel.org,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors
Date: Thu, 29 Dec 2022 05:03:44 +0100	[thread overview]
Message-ID: <Y60RoP77HnwaukEA@zx2c4.com> (raw)
In-Reply-To: <c39cc02da9f60412a0f7f7772ef3d89e4a081d38.camel@HansenPartnership.com>

On Wed, Dec 28, 2022 at 06:07:25PM -0500, James Bottomley wrote:
> On Wed, 2022-12-28 at 21:22 +0100, Vlastimil Babka wrote:
> > Ugh, while the problem [1] was fixed in 6.1, it's now happening again
> > on the T460 with 6.2-rc1. Except I didn't see any oops message or
> > "tpm_try_transmit" error this time. The first indication of a problem
> > is this during a resume from suspend to ram:
> > 
> > tpm tpm0: A TPM error (28) occurred continue selftest
> > 
> > and then periodically 
> > 
> > tpm tpm0: A TPM error (28) occurred attempting get random
> 
> That's a TPM 1.2 error which means the TPM failed the selftest.  The
> original problem was reported against TPM 2.0  because of a missing
> try_get_ops().

No, I'm pretty sure the original bug, which was fixed by "char: tpm:
Protect tpm_pm_suspend with locks" regards 1.2 as well, especially
considering it's the same hardware from Vlastimil causing this. I also
recall seeing this in 1.2 when I ran this with the TPM emulator. So
that's not correct.

> The tpm 1.2 command path was never changed to require
> this (and in fact hasn't changed for ages, TPM 1.2 being a bit
> obsolete).

False. I'll just copy and paste the diff of that commit:

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 1621ce818705..d69905233aff 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -401,13 +401,14 @@ int tpm_pm_suspend(struct device *dev)
 	    !pm_suspend_via_firmware())
 		goto suspended;

-	if (!tpm_chip_start(chip)) {
+	rc = tpm_try_get_ops(chip);
+	if (!rc) {
 		if (chip->flags & TPM_CHIP_FLAG_TPM2)
 			tpm2_shutdown(chip, TPM2_SU_STATE);
 		else
 			rc = tpm1_pm_suspend(chip, tpm_suspend_pcr);

-		tpm_chip_stop(chip);
+		tpm_put_ops(chip);
 	}

 suspended:


So, as you can see, this affects the call to tpm1_pm_suspend.

> So this looks like a new problem with TPM 1.2 and
> suspend/resume, likely in the BIOS of your system.

No, this is not a BIOS problem. This is a kernel bug in the TPM 1.2
driver. Yes, it'd be very nice to wave this all away and blame it on the
BIOS, but I really don't think that's the case, especially considering
this is all reproducible in the emulator.

I recall seeing something pretty similar to this report with the
selftest as well. Basically, the call to tpm1_do_selftest can race with
the call to tpm1_get_random, presumably because tpm1_get_random doesn't
do any locking on its own. So it might be a good idea to make sure that
tpm1_get_random() isn't running before tpm1_do_selftest() or any other
TPM commands run.

I spent a long time working through the TPM code when this came up
during 6.1. I set up the TPM emulator with QEMU and reproduced this and
had a whole test setup and S3 fuzzer. It took a long time, and when I was
done, I paged it all out of my brain. When I found that patch from Jan
that fixed the problem most of the time (but not all the time), I wasted
tons of time trying to get the TPM maintainers to take the patch and
send it to Linus as part of rc7 or rc8. But they all ignored me, and
eventually Linus just took that patch directly.

So I don't think I want to go down another rabbit hole here, having
experienced the TPM maintainers not really caring much, and that sucking
away the remaining energy I had before to keep looking at the issue and
its edge cases not handled by Jan's patch.

On the contrary, it'd make a big difference if the TPM maintainers could
actually help analyze the code that they're most familiar with, so that
we can get to the bottom of this. That's a lot better than some random
drive-by patches from a non-TPM person like me; before the 6.1 bug, I'd
never even looked at these drivers. 

My plan is to therefore be available to help and analyze and test and
maybe even write some code, if the TPM maintainers take the lead on
getting to the bottom of this. But if this hits neglect again like last
time, I'll just send a `depends on BROKEN if PM` patch to the TPM
hw_random driver and see what happens... That's a really awful solution
though, so I hope the maintainers will wake up this cycle.

Regards,
Jason

  reply	other threads:[~2022-12-29  4:04 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-28 20:22 [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors Vlastimil Babka
2022-12-28 23:07 ` James Bottomley
2022-12-29  4:03   ` Jason A. Donenfeld [this message]
2022-12-29  4:16     ` Jason A. Donenfeld
2023-01-05 13:59     ` Thorsten Leemhuis
2023-01-05 14:25       ` Vlastimil Babka
2023-01-05 14:47         ` [PATCH] tpm: Disable hwrng for TPM 1 if PM_SLEEP is enabled Jason A. Donenfeld
2023-01-05 14:53           ` Jason A. Donenfeld
2023-01-05 21:58           ` Linus Torvalds
2023-01-05 22:29             ` Jason A. Donenfeld
2023-01-06  3:01               ` [PATCH v2] tpm: Allow system suspend to continue when TPM suspend fails Jason A. Donenfeld
2023-01-06 16:01                 ` Jason A. Donenfeld
     [not found]                   ` <CAA25o9RGVbiXS6ne53gdM1K706zT=hm5c-KuMWrCA_CJtJDXdw@mail.gmail.com>
2023-01-06 17:16                     ` Jason A. Donenfeld
2023-01-06 18:59                 ` Linus Torvalds
2023-01-06 20:04                   ` Luigi Semenzato
2023-01-06 22:28                     ` Linus Torvalds
2023-01-09 16:05                       ` Jason A. Donenfeld
2023-01-16  8:12                 ` Jarkko Sakkinen
2023-01-16 14:03                   ` Jason A. Donenfeld
2023-01-21  0:07                     ` Jarkko Sakkinen
2023-01-16 11:44                 ` Jarkko Sakkinen
2023-01-16 14:00                   ` Vlastimil Babka
2023-01-21  0:03                     ` Jarkko Sakkinen
2023-01-05 15:17       ` [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors James Bottomley
2023-01-05 15:27         ` Jason A. Donenfeld
2023-01-05 15:32           ` Jason A. Donenfeld
2023-01-09 16:08       ` Jason A. Donenfeld
2023-01-10 17:19         ` Vlastimil Babka
2023-01-20 23:47           ` Jarkko Sakkinen
2023-03-14  9:35         ` Thorsten Leemhuis
2023-03-14 12:19           ` Jarkko Sakkinen
2023-03-14 12:47             ` Jason A. Donenfeld
2023-03-14 13:05               ` Jarkko Sakkinen
2023-03-14 13:08                 ` Jarkko Sakkinen
2023-03-14 13:53                   ` Jason A. Donenfeld
2023-03-14 14:23                     ` Jarkko Sakkinen
2023-04-21 15:03                       ` Jarkko Sakkinen
2023-04-21 18:27                         ` Jason A. Donenfeld
2023-04-23 15:34                           ` Jarkko Sakkinen
2023-04-25 23:34                             ` Jarkko Sakkinen
2023-04-26  1:32                               ` Jason A. Donenfeld
2023-04-26 16:07                                 ` Jarkko Sakkinen
2023-04-26 17:00                                   ` Jarkko Sakkinen
2023-01-04  9:10 ` Johannes Altmanninger
2023-01-16 11:30 ` Jarkko Sakkinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y60RoP77HnwaukEA@zx2c4.com \
    --to=jason@zx2c4.com \
    --cc=James.Bottomley@HansenPartnership.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=jarkko@kernel.org \
    --cc=jgg@ziepe.ca \
    --cc=jsd@semihalf.com \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@dominikbrodowski.net \
    --cc=peterhuewe@gmx.de \
    --cc=regressions@lists.linux.dev \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).