linux-kernel.vger.kernel.org archive mirror
From: James Bottomley <James.Bottomley@HansenPartnership.com>
To: "Mimi Zohar" <zohar@linux.ibm.com>,
	"Jarkko Sakkinen" <jarkko.sakkinen@linux.intel.com>,
	"Peter Hüwe" <PeterHuewe@gmx.de>
Cc: Calvin Owens <calvinowens@fb.com>, Jason Gunthorpe <jgg@ziepe.ca>,
	Arnd Bergmann <arnd@arndb.de>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-integrity@vger.kernel.org, linux-kernel@vger.kernel.org,
	kernel-team@fb.com
Subject: Re: [PATCH] tpm: Make timeout logic simpler and more robust
Date: Tue, 12 Mar 2019 10:14:16 -0700
Message-ID: <1552410856.3083.28.camel@HansenPartnership.com>
In-Reply-To: <1552409969.24794.68.camel@linux.ibm.com>

On Tue, 2019-03-12 at 12:59 -0400, Mimi Zohar wrote:
> On Tue, 2019-03-12 at 07:42 -0700, James Bottomley wrote:
> > On Tue, 2019-03-12 at 14:50 +0200, Jarkko Sakkinen wrote:
> > > On Mon, Mar 11, 2019 at 05:27:43PM -0700, James Bottomley wrote:
> > > > On Mon, 2019-03-11 at 16:54 -0700, Calvin Owens wrote:
> > > > > > We're having lots of problems with TPM commands timing out,
> > > > > and we're seeing these problems across lots of different
> > > > > hardware (both v1/v2).
> > > > > 
> > > > > I instrumented the driver to collect latency data, but I
> > > > > wasn't able to find any specific timeout to fix: it seems
> > > > > like many of them are too aggressive. So I tried replacing
> > > > > all the timeout logic with a single universal long timeout,
> > > > > and found that makes our TPMs 100% reliable.
> > > > > 
> > > > > Given that this timeout logic is very complex, problematic,
> > > > > and appears to serve no real purpose, I propose simply
> > > > > deleting all of it.
> > > > 
> > > > "no real purpose" is a bit strong given that all these timeouts
> > > > are standards mandated.  The purpose stated by the standards is
> > > > that there needs to be a way of differentiating the TPM crashed
> > > > from the TPM is taking a very long time to respond.  For a
> > > > normally functioning TPM it looks complex and unnecessary, but
> > > > for a malfunctioning one it's a lifesaver.
> > > 
> > > Standards should be only followed when they make practical sense
> > > and ignored when not. The range is only up to 2s anyway.
> > 
> > I don't disagree ... and I'm certainly not going to defend the TCG
> > because I do think the complexity of some of its standards
> > contributed to the lack of use of TPM 1.2.
> > 
> > However, I am saying we should root cause this problem rather than
> > take a blind shot at the apparent timeout complexity.  My timeout
> > instability is definitely related to the polling adjustments, so
> > it's not unreasonable to think Facebook's might be as well.
> 
> James, I thought Peter sent you a tis "debug" tool to help you debug
> the problem you're seeing.  Whatever happened?

No, not seen one.  I have tried to debug the problem, but it's really
odd: my TPM is a polled nuvoton (so no irq line).  If you poll the data
ready bit on my TPM too often, it simply drops off the bus and every
TPM operation after that times out.  The only way to recover is to
reboot.

James
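
[Editor's illustration, not part of the original message and not kernel
code.] The failure described above, a polled device that drops off the
bus when its ready bit is read too often, can be modelled in a few lines.
This is a minimal sketch under stated assumptions: `poll_until_ready`,
`FakeClock` and `FragileDevice` are hypothetical names, the device is a
simulation, and the millisecond values used with it are arbitrary rather
than taken from any TPM specification or from the Linux tpm driver. It
shows a poll loop that enforces a minimum gap between reads while still
giving up at an overall deadline, so a slow device and a dead one can be
told apart.

```python
import time
from typing import Callable


def _monotonic_ms() -> float:
    """Real monotonic clock in milliseconds (default for the poll loop)."""
    return time.monotonic() * 1000.0


def _sleep_ms(ms: float) -> None:
    """Real sleep taking milliseconds (default for the poll loop)."""
    time.sleep(ms / 1000.0)


def poll_until_ready(read_status: Callable[[], bool],
                     timeout_ms: float,
                     min_interval_ms: float,
                     clock: Callable[[], float] = _monotonic_ms,
                     sleep: Callable[[float], None] = _sleep_ms) -> bool:
    """Return True once read_status() reports ready, False on timeout.

    Consecutive reads are always separated by min_interval_ms, so the
    loop cannot hammer the status register faster than that.
    """
    deadline = clock() + timeout_ms
    while True:
        if read_status():
            return True
        # Give up if another full interval would overshoot the deadline.
        if clock() + min_interval_ms > deadline:
            return False
        sleep(min_interval_ms)


class FakeClock:
    """Deterministic clock for the simulation; time only moves on sleep."""

    def __init__(self) -> None:
        self.t = 0

    def now_ms(self) -> int:
        return self.t

    def sleep_ms(self, ms: int) -> None:
        self.t += ms


class FragileDevice:
    """Hypothetical device that wedges for good if read too frequently."""

    def __init__(self, clock: FakeClock, ready_at_ms: int, min_gap_ms: int) -> None:
        self.clock = clock
        self.ready_at_ms = ready_at_ms
        self.min_gap_ms = min_gap_ms
        self.last_read_ms = None
        self.wedged = False

    def read_status(self) -> bool:
        now = self.clock.now_ms()
        # Two reads closer together than min_gap_ms wedge the device.
        if self.last_read_ms is not None and now - self.last_read_ms < self.min_gap_ms:
            self.wedged = True
        self.last_read_ms = now
        return (not self.wedged) and now >= self.ready_at_ms
```

With the simulated clock, polling every 10 ms against a device that
tolerates reads no closer than 5 ms apart succeeds, while polling every
1 ms wedges it and the loop runs to its deadline. In this model a longer
overall timeout does not help once the device has wedged; only the
polling interval does.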



Thread overview: 18+ messages
2019-03-11 23:54 [PATCH] tpm: Make timeout logic simpler and more robust Calvin Owens
2019-03-12  0:27 ` James Bottomley
2019-03-12 12:50   ` Jarkko Sakkinen
2019-03-12 14:42     ` James Bottomley
2019-03-12 15:39       ` Jarkko Sakkinen
2019-03-12 19:41         ` Calvin Owens
2019-03-12 16:59       ` Mimi Zohar
2019-03-12 17:14         ` James Bottomley [this message]
2019-03-12 18:32           ` Mimi Zohar
2019-03-12 19:37   ` Calvin Owens
2019-03-12 12:36 ` Jarkko Sakkinen
2019-03-12 16:56   ` Mimi Zohar
2019-03-12 14:55 ` Jarkko Sakkinen
2019-03-12 17:04 ` Mimi Zohar
2019-03-12 20:08   ` Calvin Owens
2019-03-12 20:56     ` Mimi Zohar
2019-03-13 13:22   ` Jarkko Sakkinen
2019-03-13 13:23     ` Jarkko Sakkinen
