linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Todd Brandt <todd.e.brandt@linux.intel.com>
To: Mario Limonciello <mario.limonciello@amd.com>,
	Jarkko Sakkinen <jarkko@kernel.org>,
	linux-integrity@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, len.brown@intel.com,
	charles.d.prestopine@intel.com, rafael.j.wysocki@intel.com
Subject: Re: REGRESSION WITH BISECT: v6.5-rc6 TPM patch breaks S3 on some Intel systems
Date: Fri, 18 Aug 2023 14:39:58 -0700	[thread overview]
Message-ID: <9f3b82466e36aef3591d03176c04663c89625d4a.camel@linux.intel.com> (raw)
In-Reply-To: <eec91766-10a9-4d50-8e82-376f52f54be8@amd.com>

On Fri, 2023-08-18 at 13:11 -0500, Mario Limonciello wrote:
> On 8/18/2023 13:07, Jarkko Sakkinen wrote:
> > On Fri Aug 18, 2023 at 8:57 PM EEST, Mario Limonciello wrote:
> > > On 8/18/2023 12:53, Jarkko Sakkinen wrote:
> > > > On Fri Aug 18, 2023 at 8:21 PM EEST, Mario Limonciello wrote:
> > > > > On 8/18/2023 12:00, Jarkko Sakkinen wrote:
> > > > > > On Fri Aug 18, 2023 at 4:58 AM EEST, Limonciello, Mario
> > > > > > wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > On 8/17/2023 5:33 PM, Jarkko Sakkinen wrote:
> > > > > > > > On Fri Aug 18, 2023 at 1:25 AM EEST, Todd Brandt wrote:
> > > > > > > > > On Fri, 2023-08-18 at 00:47 +0300, Jarkko Sakkinen
> > > > > > > > > wrote:
> > > > > > > > > > On Fri Aug 18, 2023 at 12:09 AM EEST, Todd Brandt
> > > > > > > > > > wrote:
> > > > > > > > > > > While testing S3 on 6.5.0-rc6 we've found that 5
> > > > > > > > > > > systems are seeing
> > > > > > > > > > > a
> > > > > > > > > > > crash and reboot situation when S3 suspend is
> > > > > > > > > > > initiated. To
> > > > > > > > > > > reproduce
> > > > > > > > > > > it, this call is all that's required "sudo
> > > > > > > > > > > sleepgraph -m mem
> > > > > > > > > > > -rtcwake
> > > > > > > > > > > 15".
> > > > > > > > > > 
> > > > > > > > > > 1. Are there logs available?
> > > > > > > > > > 2. Is this the test case: 
> > > > > > > > > > https://pypi.org/project/sleepgraph/ (never
> > > > > > > > > > used it before).
> > > > > > > > > 
> > > > > > > > > There are no dmesg logs because the S3 crash wipes
> > > > > > > > > them out. Sleepgraph
> > > > > > > > > isn't actually necessary to activate it, just an S3
> > > > > > > > > suspend "echo mem >
> > > > > > > > > /sys/power/state".
> > > > > > > > > 
> > > > > > > > > So far it appears to only have affected test systems,
> > > > > > > > > not production
> > > > > > > > > hardware, and none of them have TPM chips, so I'm
> > > > > > > > > beginning to wonder
> > > > > > > > > if this patch just inadvertently activated a bug
> > > > > > > > > somewhere else in the
> > > > > > > > > kernel that happens to affect test hardware.
> > > > > > > > > 
> > > > > > > > > I'll continue to debug it, this isn't an emergency as
> > > > > > > > > so far I haven't
> > > > > > > > > seen it in production hardware.
> > > > > > > > 
> > > > > > > > OK, I'll still see if I could reproduce it just in
> > > > > > > > case.
> > > > > > > > 
> > > > > > > > BR, Jarkko
> > > > > > > 
> > > > > > > I'd like to better understand what kind of TPM
> > > > > > > initialization path has
> > > > > > > run.  Does the machine have some sort of TPM that failed
> > > > > > > to fully
> > > > > > > initialize perhaps?
> > > > > > > 
> > > > > > > If you can't share a full bootup dmesg, can you at least
> > > > > > > share
> > > > > > > 
> > > > > > > # dmesg | grep -i tpm
> > > > > > 
> > > > > > It would be more useful perhaps to get full dmesg output
> > > > > > after power on
> > > > > > and before going into suspend.
> > > > > > 
> > > > > > Also ftrace filter could be added to the kernel command-
> > > > > > line:
> > > > > > 
> > > > > > ftrace=function ftrace_filter=tpm*
> > > > > > 
> > > > > > After bootup:
> > > > > > 
> > > > > > mount -t tracefs nodev /sys/kernel/tracing
> > > > > > cat /sys/kernel/tracing/trace
> > > > > > 
> > > > > > BR, Jarkko
> > > > > 
> > > > > Todd and I have gone back and forth a little bit on the
> > > > > bugzilla
> > > > > (https://bugzilla.kernel.org/show_bug.cgi?id=217804), and it
> > > > > seems that
> > > > > this isn't an S3 problem - it's a probing problem.
> > > > > 
> > > > > [    1.132521] tpm_crb: probe of INTC6001:00 failed with
> > > > > error 378
> > > > > 
> > > > > That error 378 specifically matches TPM2_CC_GET_CAPABILITY,
> > > > > which is the
> > > > > same command that was being requested.  This leads me to
> > > > > believe the TPM
> > > > > isn't ready at the time of probing.
> > > > > 
> > > > > In this case one solution is we could potentially ignore
> > > > > failures for
> > > > > that tpm2_get_tpm_pt() call, but I think we should first
> > > > > understand why
> > > > > it doesn't work at probing time for this TPM to ensure the
> > > > > actual quirk
> > > > > isn't built on a house of cards.
> > > > 
> > > > Given that there is nothing known broken (at the moment) in
> > > > production,
> > > > I think the following might be a reasonable change.
> > > > 
> > > > BR, Jarkko
> > > > 
> > > 
> > > Yeah that would prevent it.
> > > 
> > > Here's a simpler change that I think should work too though:
> > > diff --git a/drivers/char/tpm/tpm_crb.c
> > > b/drivers/char/tpm/tpm_crb.c
> > > index 9eb1a18590123..b0e9931fe436c 100644
> > > --- a/drivers/char/tpm/tpm_crb.c
> > > +++ b/drivers/char/tpm/tpm_crb.c
> > > @@ -472,8 +472,7 @@ static int crb_check_flags(struct tpm_chip
> > > *chip)
> > >           if (ret)
> > >                   return ret;
> > > 
> > > -       ret = tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val,
> > > NULL);
> > > -       if (ret)
> > > +       if (tpm2_get_tpm_pt(chip, TPM2_PT_MANUFACTURER, &val,
> > > NULL))
> > >                   goto release;
> > > 
> > >           if (val == 0x414D4400U /* AMD */)
> > > 
> > > I think Todd needs to check whether TPM works with that or not
> > > though.
> > 
> > Hmm... I'm sorry if I have a blind spot now but what is that
> > changing?
> > 
> > BR, Jarkko
> 
> It throws away the error code if it fails for some reason.
> Todd just checked it works too.  I'll drop it on the M/L for review.

I just ran 6.5.0-rc6 plus this patch on all 5 machines where the
problem was detected and they work now. It looks good.

Tested-by: Todd Brandt <todd.e.brandt@linux.intel.com>


  parent reply	other threads:[~2023-08-18 21:40 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-08-17 21:09 REGRESSION WITH BISECT: v6.5-rc6 TPM patch breaks S3 on some Intel systems Todd Brandt
2023-08-17 21:47 ` Jarkko Sakkinen
2023-08-17 22:25   ` Todd Brandt
2023-08-17 22:33     ` Jarkko Sakkinen
2023-08-18  1:58       ` Limonciello, Mario
2023-08-18 17:00         ` Jarkko Sakkinen
2023-08-18 17:21           ` Mario Limonciello
     [not found]             ` <CUVV2MQRCGET.2U22LFQPX1J3G@suppilovahvero>
2023-08-18 17:57               ` Mario Limonciello
2023-08-18 18:07                 ` Jarkko Sakkinen
2023-08-18 18:11                   ` Mario Limonciello
2023-08-18 20:08                     ` Todd Brandt
2023-08-18 21:39                     ` Todd Brandt [this message]
2023-09-02  2:47                       ` Tyler Stachecki
2023-09-04 22:30                         ` Jarkko Sakkinen
2023-08-18 16:51       ` Jarkko Sakkinen
2023-08-18  0:10 ` Bagas Sanjaya

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=9f3b82466e36aef3591d03176c04663c89625d4a.camel@linux.intel.com \
    --to=todd.e.brandt@linux.intel.com \
    --cc=charles.d.prestopine@intel.com \
    --cc=jarkko@kernel.org \
    --cc=len.brown@intel.com \
    --cc=linux-integrity@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=rafael.j.wysocki@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).