regressions.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Jarkko Sakkinen <jarkko@kernel.org>
To: Vlastimil Babka <vbabka@suse.cz>
Cc: "Jason A. Donenfeld" <Jason@zx2c4.com>,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Peter Huewe <peterhuewe@gmx.de>, Jason Gunthorpe <jgg@ziepe.ca>,
	Jan Dabros <jsd@semihalf.com>,
	regressions@lists.linux.dev, LKML <linux-kernel@vger.kernel.org>,
	linux-integrity@vger.kernel.org,
	Dominik Brodowski <linux@dominikbrodowski.net>,
	Herbert Xu <herbert@gondor.apana.org.au>,
	Johannes Altmanninger <aclopte@gmail.com>,
	stable@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH v2] tpm: Allow system suspend to continue when TPM suspend fails
Date: Sat, 21 Jan 2023 00:03:57 +0000	[thread overview]
Message-ID: <Y8sr7YJ8e8eSpPFv@kernel.org> (raw)
In-Reply-To: <af2e5a17-514b-8759-2464-7ebb384a17ba@suse.cz>

On Mon, Jan 16, 2023 at 03:00:03PM +0100, Vlastimil Babka wrote:
> On 1/16/23 12:44, Jarkko Sakkinen wrote:
> > On Fri, Jan 06, 2023 at 04:01:56AM +0100, Jason A. Donenfeld wrote:
> >> TPM 1 is sometimes broken across system suspends, due to races or
> >> locking issues or something else that haven't been diagnosed or fixed
> >> yet, most likely having to do with concurrent reads from the TPM's
> >> hardware random number generator driver. These issues prevent the system
> >> from actually suspending, with errors like:
> >> 
> >>   tpm tpm0: A TPM error (28) occurred continue selftest
> >>   ...
> > 
> > <REMOVE>
> > 
> >>   tpm tpm0: A TPM error (28) occurred attempting get random
> >>   ...
> >>   tpm tpm0: Error (28) sending savestate before suspend
> >>   tpm_tis 00:08: PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x80 returns 28
> >>   tpm_tis 00:08: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 28
> >>   tpm_tis 00:08: PM: failed to suspend: error 28
> >>   PM: Some devices failed to suspend, or early wake event detected
> > 
> > </REMOVE>
> > 
> > Unrelated to thix particular fix.
> 
> Not sure I understand.
> AFAIK this is not a proper fix, but a workaround for when laptop suspend no
> longer works because TPM fails to suspend. The error messages quoted above
> are very much related to the problem of suspend not working, and this patch
> did work as advertised at least for me. I see errors but they don't prevent
> suspend anymore:
> 
> https://lore.kernel.org/all/58d7a42c-9e6b-ab2a-617f-d5e373bf63cb@suse.cz/
> 
> >> This issue was partially fixed by 23393c646142 ("char: tpm: Protect
> >> tpm_pm_suspend with locks"), in a last minute 6.1 commit that Linus took
> >> directly because the TPM maintainers weren't available. However, it
> >> seems like this just addresses the most common cases of the bug, rather
> >> than addressing it entirely. So there are more things to fix still,
> >> apparently.
> >> 
> >> In lieu of actually fixing the underlying bug, just allow system suspend
> >> to continue, so that laptops still go to sleep fine. Later, this can be
> >> reverted when the real bug is fixed.
> >> 
> >> Link: https://lore.kernel.org/lkml/7cbe96cf-e0b5-ba63-d1b4-f63d2e826efa@suse.cz/
> >> Cc: stable@vger.kernel.org # 6.1+
> >> Reported-by: Vlastimil Babka <vbabka@suse.cz>
> >> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
> >> Signed-off-by: Jason A. Donenfeld <Jason@zx2c4.com>
> >> ---
> >> This is basically untested and I haven't worked out if there are any
> >> awful implications of letting the system sleep when TPM suspend fails.
> >> Maybe some PCRs get cleared and that will make everything explode on
> >> resume? Maybe it doesn't matter? Somebody well versed in TPMology should
> >> probably [n]ack this approach.
> >> 
> >>  drivers/char/tpm/tpm-interface.c | 5 ++++-
> >>  1 file changed, 4 insertions(+), 1 deletion(-)
> >> 
> >> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> >> index d69905233aff..6df9067ef7f9 100644
> >> --- a/drivers/char/tpm/tpm-interface.c
> >> +++ b/drivers/char/tpm/tpm-interface.c
> >> @@ -412,7 +412,10 @@ int tpm_pm_suspend(struct device *dev)
> >>  	}
> >>  
> >>  suspended:
> >> -	return rc;
> >> +	if (rc)
> >> +		pr_err("Unable to suspend tpm-%d (error %d), but continuing system suspend\n",
> >> +		       chip->dev_num, rc);
> >> +	return 0;
> >>  }
> >>  EXPORT_SYMBOL_GPL(tpm_pm_suspend);
> >>  
> >> -- 
> >> 2.39.0
> >> 
> > 
> > This tpm_tis local issue, nothing to do with tpm_pm_suspend(). Executing
> > the selftest as part of wake up, is TPM 1.2 dTPM specific requirement, and
> > the call is located in tpm_tis_resume() [*].
> > 
> > [*] https://lore.kernel.org/lkml/Y8U1QxA4GYvPWDky@kernel.org/
> 
> Yes the changelog at the top does say "due to races or locking issues or
> something else that haven't been diagnosed or fixed yet"
> 
> I don't know what causes the TPM to start returning error 28 on resume and
> never recover from it. But it didn't happen before hwrng started using the
> TPM. Before that, it was probably just the selftest ever doing anything with
> the TPM, and on its own I don't recall it ever (before 6.1) failing and
> preventing further suspend/resume.

Would it be possible to test this theory by commenting out tpm_add_hwrng()
call from tpm_chip_register()?

Since they are called sequentially any sort of concurrency issue can be
probably ruled out.

One thing that I noticed is that probably it would be more safe-play to
move tpm_add_hwrng() call after creating the character device, as there's
no need to do it before anything else.

BR, Jarkko

  reply	other threads:[~2023-01-21  0:04 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-12-28 20:22 [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors Vlastimil Babka
2022-12-28 23:07 ` James Bottomley
2022-12-29  4:03   ` Jason A. Donenfeld
2022-12-29  4:16     ` Jason A. Donenfeld
2023-01-05 13:59     ` Thorsten Leemhuis
2023-01-05 14:25       ` Vlastimil Babka
2023-01-05 14:47         ` [PATCH] tpm: Disable hwrng for TPM 1 if PM_SLEEP is enabled Jason A. Donenfeld
2023-01-05 14:53           ` Jason A. Donenfeld
2023-01-05 21:58           ` Linus Torvalds
2023-01-05 22:29             ` Jason A. Donenfeld
2023-01-06  3:01               ` [PATCH v2] tpm: Allow system suspend to continue when TPM suspend fails Jason A. Donenfeld
2023-01-06 16:01                 ` Jason A. Donenfeld
     [not found]                   ` <CAA25o9RGVbiXS6ne53gdM1K706zT=hm5c-KuMWrCA_CJtJDXdw@mail.gmail.com>
2023-01-06 17:16                     ` Jason A. Donenfeld
2023-01-06 18:59                 ` Linus Torvalds
2023-01-06 20:04                   ` Luigi Semenzato
2023-01-06 22:28                     ` Linus Torvalds
2023-01-09 16:05                       ` Jason A. Donenfeld
2023-01-16  8:12                 ` Jarkko Sakkinen
2023-01-16 14:03                   ` Jason A. Donenfeld
2023-01-21  0:07                     ` Jarkko Sakkinen
2023-01-16 11:44                 ` Jarkko Sakkinen
2023-01-16 14:00                   ` Vlastimil Babka
2023-01-21  0:03                     ` Jarkko Sakkinen [this message]
2023-01-05 15:17       ` [REGRESSION] suspend to ram fails in 6.2-rc1 due to tpm errors James Bottomley
2023-01-05 15:27         ` Jason A. Donenfeld
2023-01-05 15:32           ` Jason A. Donenfeld
2023-01-09 16:08       ` Jason A. Donenfeld
2023-01-10 17:19         ` Vlastimil Babka
2023-01-20 23:47           ` Jarkko Sakkinen
2023-03-14  9:35         ` Thorsten Leemhuis
2023-03-14 12:19           ` Jarkko Sakkinen
2023-03-14 12:47             ` Jason A. Donenfeld
2023-03-14 13:05               ` Jarkko Sakkinen
2023-03-14 13:08                 ` Jarkko Sakkinen
2023-03-14 13:53                   ` Jason A. Donenfeld
2023-03-14 14:23                     ` Jarkko Sakkinen
2023-04-21 15:03                       ` Jarkko Sakkinen
2023-04-21 18:27                         ` Jason A. Donenfeld
2023-04-23 15:34                           ` Jarkko Sakkinen
2023-04-25 23:34                             ` Jarkko Sakkinen
2023-04-26  1:32                               ` Jason A. Donenfeld
2023-04-26 16:07                                 ` Jarkko Sakkinen
2023-04-26 17:00                                   ` Jarkko Sakkinen
2023-01-04  9:10 ` Johannes Altmanninger
2023-01-16 11:30 ` Jarkko Sakkinen

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y8sr7YJ8e8eSpPFv@kernel.org \
    --to=jarkko@kernel.org \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=Jason@zx2c4.com \
    --cc=aclopte@gmail.com \
    --cc=herbert@gondor.apana.org.au \
    --cc=jgg@ziepe.ca \
    --cc=jsd@semihalf.com \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux@dominikbrodowski.net \
    --cc=peterhuewe@gmx.de \
    --cc=regressions@leemhuis.info \
    --cc=regressions@lists.linux.dev \
    --cc=stable@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).