All of lore.kernel.org
 help / color / mirror / Atom feed
* TPM selftest failure in 4.15
@ 2018-02-01 12:16 James Bottomley
  2018-02-01 12:21 ` Paul Menzel
                   ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-01 12:16 UTC (permalink / raw)
  To: linux-integrity

Embarrassingly enough, I'm just on my way to do a TPM talk at FOSDEM.
 I installed my shiny new 4.15 kernel on the 'plane and this is what I
got after I arrived this morning:

jejb@jarvis:~> dmesg | grep -i tpm
[    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
(v03        Tpm2Tabl 00000001 AMI  00000000)
[    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
[    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
[    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
[    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
[    4.216236] tpm tpm0: TPM self test failed
[    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)

The error is TPM_RC_TESTING, which means it looks like we don't wait
long enough for the selftests to complete.  I get this all the time
booting with 4.15.  Fortunately I have a 4.13 backup kernel which is
fine (otherwise I'd be a bit hosed since all my keys now require a
TPM).

I'll debug on the train; my current suspicion is that the TPM_LONG
duration might be a bit short for this chip (A nuvoton 6xx in a dell
XPS-13).

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:16 TPM selftest failure in 4.15 James Bottomley
@ 2018-02-01 12:21 ` Paul Menzel
  2018-02-01 12:42   ` James Bottomley
  2018-02-08 12:49 ` Jarkko Sakkinen
  2018-02-08 18:45 ` Ken Goldman
  2 siblings, 1 reply; 42+ messages in thread
From: Paul Menzel @ 2018-02-01 12:21 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-integrity

[-- Attachment #1: Type: multipart/signed, Size: 2144 bytes --]

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:21 ` Paul Menzel
@ 2018-02-01 12:42   ` James Bottomley
  2018-02-01 15:24     ` James Bottomley
  2018-02-08 13:03     ` Jarkko Sakkinen
  0 siblings, 2 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-01 12:42 UTC (permalink / raw)
  To: Paul Menzel; +Cc: linux-integrity

On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> Dear James,
> 
> 
> On 02/01/18 13:16, James Bottomley wrote:
> > 
> > Embarrassingly enough, I'm just on my way to do a TPM talk at
> > FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
> > this is what I got after I arrived this morning:
> > 
> > jejb@jarvis:~> dmesg | grep -i tpm
> > [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> > (v03        Tpm2Tabl 00000001 AMI  00000000)
> > [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 
> > 2)
> > [    1.608863] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    1.640052] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    1.691215] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    1.782377] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    1.953539] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    2.284701] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    2.935743] tpm tpm0: A TPM error (2314) occurred continue
> > selftest
> > [    4.216236] tpm tpm0: TPM self test failed
> > [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-
> > 19)
> > 
> > The error is TPM_RC_TESTING, which means it looks like we don't
> > wait long enough for the selftests to complete.  I get this all the
> > time booting with 4.15.  Fortunately I have a 4.13 backup kernel
> > which is fine (otherwise I'd be a bit hosed since all my keys now
> > require a TPM).
> > 
> > I'll debug on the train; my current suspicion is that the TPM_LONG
> > duration might be a bit short for this chip (A nuvoton 6xx in a
> > dell XPS-13).
> 
> Please join the thread [1], where I reported the same problem for the
> Dell XPS 13 9360. Unfortunately, no solution was found, especially,
> as I did not use the TPM. Other owners of that system unfortunately
> didn't have time to report back if it work for them, so the
> "conclusion" kind of was, that my TPM was broken, and had to be
> tested.

OK, I'll try to find a fix.  It's clearly a marginal problem since I've
booted most -rc kernels without issue, so there's some slight timing
change in 4.15 that triggered it.  It could also be a shutdown issue.
 Any NV ram stuff deferred to start up would take a variable amount of
time.

You'd almost think it's some sort of TPM self protest: the more stuff I
use it for the more problems it seems to create.  I'm definitely
motivated to fix it because without a TPM I can't actually do much with
my laptop.

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:42   ` James Bottomley
@ 2018-02-01 15:24     ` James Bottomley
  2018-02-01 17:40       ` Jason Gunthorpe
                         ` (2 more replies)
  2018-02-08 13:03     ` Jarkko Sakkinen
  1 sibling, 3 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-01 15:24 UTC (permalink / raw)
  To: Paul Menzel; +Cc: linux-integrity

On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> > 
> > Dear James,
> > 
> > 
> > On 02/01/18 13:16, James Bottomley wrote:
> > > 
> > > 
> > > Embarrassingly enough, I'm just on my way to do a TPM talk at
> > > FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
> > > this is what I got after I arrived this morning:
> > > 
> > > jejb@jarvis:~> dmesg | grep -i tpm
> > > [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> > > (v03        Tpm2Tabl 00000001 AMI  00000000)
> > > [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-
> > > id 
> > > 2)
> > > [    1.608863] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    1.640052] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    1.691215] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    1.782377] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    1.953539] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    2.284701] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    2.935743] tpm tpm0: A TPM error (2314) occurred continue
> > > selftest
> > > [    4.216236] tpm tpm0: TPM self test failed
> > > [    4.236829] ima: No TPM chip found, activating TPM-bypass!
> > > (rc=-
> > > 19)
> > > 
> > > The error is TPM_RC_TESTING, which means it looks like we don't
> > > wait long enough for the selftests to complete.  I get this all
> > > the time booting with 4.15.  Fortunately I have a 4.13 backup
> > > kernel which is fine (otherwise I'd be a bit hosed since all my
> > > keys now require a TPM).
> > > 
> > > I'll debug on the train; my current suspicion is that the
> > > TPM_LONG duration might be a bit short for this chip (A nuvoton
> > > 6xx in a dell XPS-13).
> > 
> > Please join the thread [1], where I reported the same problem for
> > the Dell XPS 13 9360. Unfortunately, no solution was found,
> > especially, as I did not use the TPM. Other owners of that system
> > unfortunately didn't have time to report back if it work for them,
> > so the "conclusion" kind of was, that my TPM was broken, and had to
> > be tested.
> 
> OK, I'll try to find a fix.  It's clearly a marginal problem since
> I've booted most -rc kernels without issue, so there's some slight
> timing change in 4.15 that triggered it.  It could also be a shutdown
> issue.  Any NV ram stuff deferred to start up would take a variable
> amount of time.
> 
> You'd almost think it's some sort of TPM self protest: the more stuff
> I use it for the more problems it seems to create.  I'm definitely
> motivated to fix it because without a TPM I can't actually do much
> with my laptop.

OK, I investigated but now my TPM has returned to normal (as in it
passes the selftest immediately).  There's clearly something that makes
it return TPM_RC_TESTING to every self test probe for seconds at a
time, but I don't know what it is.  Sending a different command seems
to cause the problem to clear (Managed to reproduce once with the patch
to verify), so this is my proposed fix.  It's clearly nonsensical to
detach the driver because the self test still returns TPM_RC_TESTING,
so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
warning message so we'll see it in the logs if it causes problems.
 Given that this seems to be some type of internal TPM issue, I don't
believe changing the timings would work.

James

---

diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index f40d20671a78..3e1b062d8888 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
 		/* wait longer the next round */
 		delay_msec *= 2;
 	}
+	if (rc == TPM2_RC_TESTING) {
+		/*
+		 * A return of RC_TESTING means the TPM is still
+		 * running self tests.  If one fails it will go into
+		 * failure mode and return RC_FAILED to every command,
+		 * so treat a still in testing return as a success
+		 * rather than causing a driver detach.
+		 */
+		dev_err(&chip->dev,"TPM: Still in testing mode after %dms, continuing\n", delay_msec);
+		rc = TPM2_RC_SUCCESS;
+	}
 
 	return rc;
 }

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 15:24     ` James Bottomley
@ 2018-02-01 17:40       ` Jason Gunthorpe
  2018-02-01 18:46         ` James Bottomley
  2018-02-08 17:27         ` Ken Goldman
  2018-02-01 19:16       ` TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx) Paul Menzel
  2018-02-08 13:05       ` TPM selftest failure in 4.15 Jarkko Sakkinen
  2 siblings, 2 replies; 42+ messages in thread
From: Jason Gunthorpe @ 2018-02-01 17:40 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 03:24:08PM +0000, James Bottomley wrote:

> OK, I investigated but now my TPM has returned to normal (as in it
> passes the selftest immediately).  There's clearly something that makes
> it return TPM_RC_TESTING to every self test probe for seconds at a
> time, but I don't know what it is.  Sending a different command seems
> to cause the problem to clear (Managed to reproduce once with the patch
> to verify), so this is my proposed fix.  It's clearly nonsensical to
> detach the driver because the self test still returns TPM_RC_TESTING,
> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
> warning message so we'll see it in the logs if it causes problems.
>  Given that this seems to be some type of internal TPM issue, I don't
> believe changing the timings would work.

But if a selftest returns TPM2_RC_TESTING I would expect the next
command to also fail with a testing in progress code? At least by the
spec..

The point of invoking selftest is to get to a state where future TPM
commands will succeed, so returning immediately on RC_TESTING seems
wrong?

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 17:40       ` Jason Gunthorpe
@ 2018-02-01 18:46         ` James Bottomley
  2018-02-01 18:59           ` Jason Gunthorpe
  2018-02-08 17:27         ` Ken Goldman
  1 sibling, 1 reply; 42+ messages in thread
From: James Bottomley @ 2018-02-01 18:46 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Paul Menzel, linux-integrity

On Thu, 2018-02-01 at 10:40 -0700, Jason Gunthorpe wrote:
> On Thu, Feb 01, 2018 at 03:24:08PM +0000, James Bottomley wrote:
> 
> > 
> > OK, I investigated but now my TPM has returned to normal (as in it
> > passes the selftest immediately).  There's clearly something that
> > makes it return TPM_RC_TESTING to every self test probe for seconds
> > at a time, but I don't know what it is.  Sending a different
> > command seems to cause the problem to clear (Managed to reproduce
> > once with the patch to verify), so this is my proposed fix.  It's
> > clearly nonsensical to detach the driver because the self test
> > still returns TPM_RC_TESTING, so convert that return to a
> > TPM_RC_SUCCESS on timeout.  It prints a warning message so we'll
> > see it in the logs if it causes problems.  Given that this seems to
> > be some type of internal TPM issue, I don't believe changing the
> > timings would work.
> 
> But if a selftest returns TPM2_RC_TESTING I would expect the next
> command to also fail with a testing in progress code? At least by the
> spec..

No, the spec says only that the command may fail with TPM_RC_TESTING
*if* the system it requires is under test.

I have no idea what's up with my TPM.  The selftests usually complete
in under 1ms and return TPM_RC_SUCCESS.  However occasionally the self
test can return TPM_RC_TESTING for seconds.

I've already booted with the patch and verified it doesn't induce any
failures for me.  Either the long running test is in a completely
unconnected subsystem or invoking a different command forces the TPM to
clear the testing state.

> The point of invoking selftest is to get to a state where future TPM
> commands will succeed, so returning immediately on RC_TESTING seems
> wrong?

Well it's definitely less wrong than the converse of actually detaching
the TPM driver because a self test is apparently still in progress.  If
you depend on the TPM for a critical function, your entire system is
hosed at that point.

I honestly don't think we should be waiting for the self test at all.
 We should kick it off and treat any TPM_RC_TESTING error as -EAGAIN.
 We're already under fire for slow boot sequences and adding 2s just to
wait for the TPM to self test adds to that for no real value.

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 18:46         ` James Bottomley
@ 2018-02-01 18:59           ` Jason Gunthorpe
  2018-02-01 20:00             ` James Bottomley
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2018-02-01 18:59 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:

> I honestly don't think we should be waiting for the self test at all.
> We should kick it off and treat any TPM_RC_TESTING error as -EAGAIN.
> We're already under fire for slow boot sequences and adding 2s just to
> wait for the TPM to self test adds to that for no real value.

Arguably the BIOS should have completed the selftest - this stuff
generally only exists to support embedded.

I don't like the idea of EAGAIN, that just expose all our users to
this mess.

I would support making transmit_cmd genericly wait and retry if the
TPM insists we need to wait for selftest to complete the specific
command though.

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 15:24     ` James Bottomley
  2018-02-01 17:40       ` Jason Gunthorpe
@ 2018-02-01 19:16       ` Paul Menzel
  2018-02-01 19:17         ` Paul Menzel
  2018-02-08 13:18         ` Jarkko Sakkinen
  2018-02-08 13:05       ` TPM selftest failure in 4.15 Jarkko Sakkinen
  2 siblings, 2 replies; 42+ messages in thread
From: Paul Menzel @ 2018-02-01 19:16 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-integrity, Mario Limonciello, regression, Alexander Steffen

Dear James,


Am 01.02.2018 um 16:24 schrieb James Bottomley:
> On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
>> On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:

>>> On 02/01/18 13:16, James Bottomley wrote:
>>>>
>>>>
>>>> Embarrassingly enough, I'm just on my way to do a TPM talk at
>>>> FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
>>>> this is what I got after I arrived this morning:
>>>>
>>>> jejb@jarvis:~> dmesg | grep -i tpm
>>>> [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034 (v03        Tpm2Tabl 00000001 AMI  00000000)
>>>> [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
>>>> [    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>> [    4.216236] tpm tpm0: TPM self test failed
>>>> [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
>>>>
>>>> The error is TPM_RC_TESTING, which means it looks like we don't
>>>> wait long enough for the selftests to complete.  I get this all
>>>> the time booting with 4.15.  Fortunately I have a 4.13 backup
>>>> kernel which is fine (otherwise I'd be a bit hosed since all my
>>>> keys now require a TPM).
>>>>
>>>> I'll debug on the train; my current suspicion is that the
>>>> TPM_LONG duration might be a bit short for this chip (A nuvoton
>>>> 6xx in a dell XPS-13).
>>>
>>> Please join the thread [1], where I reported the same problem for
>>> the Dell XPS 13 9360. Unfortunately, no solution was found,
>>> especially, as I did not use the TPM. Other owners of that system
>>> unfortunately didn't have time to report back if it work for them,
>>> so the "conclusion" kind of was, that my TPM was broken, and had to
>>> be tested.
>>
>> OK, I'll try to find a fix.  It's clearly a marginal problem since
>> I've booted most -rc kernels without issue, so there's some slight
>> timing change in 4.15 that triggered it.  It could also be a shutdown
>> issue.  Any NV ram stuff deferred to start up would take a variable
>> amount of time.
>>
>> You'd almost think it's some sort of TPM self protest: the more stuff
>> I use it for the more problems it seems to create.  I'm definitely
>> motivated to fix it because without a TPM I can't actually do much
>> with my laptop.
> 
> OK, I investigated but now my TPM has returned to normal (as in it
> passes the selftest immediately).  There's clearly something that makes
> it return TPM_RC_TESTING to every self test probe for seconds at a
> time, but I don't know what it is.  Sending a different command seems
> to cause the problem to clear (Managed to reproduce once with the patch
> to verify), so this is my proposed fix.  It's clearly nonsensical to
> detach the driver because the self test still returns TPM_RC_TESTING,
> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
> warning message so we'll see it in the logs if it causes problems.
>   Given that this seems to be some type of internal TPM issue, I don't
> believe changing the timings would work.

Maybe Mario can confirm this issue too, now that Linux 4.15 is released. 
Maybe he also has a way to get the Nuvoton people involved.

> ---
> 
> diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> index f40d20671a78..3e1b062d8888 100644
> --- a/drivers/char/tpm/tpm2-cmd.c
> +++ b/drivers/char/tpm/tpm2-cmd.c
> @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
>   		/* wait longer the next round */
>   		delay_msec *= 2;
>   	}
> +	if (rc == TPM2_RC_TESTING) {
> +		/*
> +		 * A return of RC_TESTING means the TPM is still
> +		 * running self tests.  If one fails it will go into
> +		 * failure mode and return RC_FAILED to every command,
> +		 * so treat a still in testing return as a success
> +		 * rather than causing a driver detach.
> +		 */
> +		dev_err(&chip->dev,"TPM: Still in testing mode after %dms, continuing\n", delay_msec);
> +		rc = TPM2_RC_SUCCESS;
> +	}
>   
>   	return rc;
>   }

Alexander replied the following in the other thread. No idea if you read 
it yet.

> The list of "A TPM error (2314) occurred continue selftest" is caused by my 
> commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React correctly to
> RC_TESTING from TPM 2.0 self tests") [1]. 2314 is TPM_RC_TESTING, so the TPM
> tells us that self-tests are still running in the background. This problem was
> not visible in previous versions, since it (incorrectly) ignored > TPM_RC_TESTING.

Maybe the commit should be reverted for now until this has cleared up 
for the Dell XPS 13 9360(?) to adhere to Linux' no regression policy.


Kind regards,

Paul


PS: Alexander will also be at FOSDEM and mentioned your talk [2].


[1] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=125a2210541079e8e7c69e629ad06cabed788f8[2] 
https://lists.01.org/pipermail/tpm2/2018-January/000486.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 19:16       ` TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx) Paul Menzel
@ 2018-02-01 19:17         ` Paul Menzel
  2018-02-01 20:12           ` Mario.Limonciello
  2018-02-08 13:18         ` Jarkko Sakkinen
  1 sibling, 1 reply; 42+ messages in thread
From: Paul Menzel @ 2018-02-01 19:17 UTC (permalink / raw)
  To: James Bottomley
  Cc: linux-integrity, Mario Limonciello, regressions, Alexander Steffen

[resend with regressions@ address fixed, sorry]

Am 01.02.2018 um 20:16 schrieb Paul Menzel:
> Dear James,
> 
> 
> Am 01.02.2018 um 16:24 schrieb James Bottomley:
>> On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
>>> On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> 
>>>> On 02/01/18 13:16, James Bottomley wrote:
>>>>>
>>>>>
>>>>> Embarrassingly enough, I'm just on my way to do a TPM talk at
>>>>> FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
>>>>> this is what I got after I arrived this morning:
>>>>>
>>>>> jejb@jarvis:~> dmesg | grep -i tpm
>>>>> [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034 
>>>>> (v03        Tpm2Tabl 00000001 AMI  00000000)
>>>>> [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
>>>>> [    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    4.216236] tpm tpm0: TPM self test failed
>>>>> [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
>>>>>
>>>>> The error is TPM_RC_TESTING, which means it looks like we don't
>>>>> wait long enough for the selftests to complete.  I get this all
>>>>> the time booting with 4.15.  Fortunately I have a 4.13 backup
>>>>> kernel which is fine (otherwise I'd be a bit hosed since all my
>>>>> keys now require a TPM).
>>>>>
>>>>> I'll debug on the train; my current suspicion is that the
>>>>> TPM_LONG duration might be a bit short for this chip (A nuvoton
>>>>> 6xx in a dell XPS-13).
>>>>
>>>> Please join the thread [1], where I reported the same problem for
>>>> the Dell XPS 13 9360. Unfortunately, no solution was found,
>>>> especially, as I did not use the TPM. Other owners of that system
>>>> unfortunately didn't have time to report back if it work for them,
>>>> so the "conclusion" kind of was, that my TPM was broken, and had to
>>>> be tested.
>>>
>>> OK, I'll try to find a fix.  It's clearly a marginal problem since
>>> I've booted most -rc kernels without issue, so there's some slight
>>> timing change in 4.15 that triggered it.  It could also be a shutdown
>>> issue.  Any NV ram stuff deferred to start up would take a variable
>>> amount of time.
>>>
>>> You'd almost think it's some sort of TPM self protest: the more stuff
>>> I use it for the more problems it seems to create.  I'm definitely
>>> motivated to fix it because without a TPM I can't actually do much
>>> with my laptop.
>>
>> OK, I investigated but now my TPM has returned to normal (as in it
>> passes the selftest immediately).  There's clearly something that makes
>> it return TPM_RC_TESTING to every self test probe for seconds at a
>> time, but I don't know what it is.  Sending a different command seems
>> to cause the problem to clear (Managed to reproduce once with the patch
>> to verify), so this is my proposed fix.  It's clearly nonsensical to
>> detach the driver because the self test still returns TPM_RC_TESTING,
>> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
>> warning message so we'll see it in the logs if it causes problems.
>>   Given that this seems to be some type of internal TPM issue, I don't
>> believe changing the timings would work.
> 
> Maybe Mario can confirm this issue too, now that Linux 4.15 is released. 
> Maybe he also has a way to get the Nuvoton people involved.
> 
>> ---
>>
>> diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
>> index f40d20671a78..3e1b062d8888 100644
>> --- a/drivers/char/tpm/tpm2-cmd.c
>> +++ b/drivers/char/tpm/tpm2-cmd.c
>> @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
>>           /* wait longer the next round */
>>           delay_msec *= 2;
>>       }
>> +    if (rc == TPM2_RC_TESTING) {
>> +        /*
>> +         * A return of RC_TESTING means the TPM is still
>> +         * running self tests.  If one fails it will go into
>> +         * failure mode and return RC_FAILED to every command,
>> +         * so treat a still in testing return as a success
>> +         * rather than causing a driver detach.
>> +         */
>> +        dev_err(&chip->dev,"TPM: Still in testing mode after %dms, 
>> continuing\n", delay_msec);
>> +        rc = TPM2_RC_SUCCESS;
>> +    }
>>       return rc;
>>   }
> 
> Alexander replied the following in the other thread. No idea if you read 
> it yet.
> 
>> The list of "A TPM error (2314) occurred continue selftest" is caused 
>> by my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React 
>> correctly to
>> RC_TESTING from TPM 2.0 self tests") [1]. 2314 is TPM_RC_TESTING, so 
>> the TPM
>> tells us that self-tests are still running in the background. This 
>> problem was
>> not visible in previous versions, since it (incorrectly) ignored > 
>> TPM_RC_TESTING.
> 
> Maybe the commit should be reverted for now until this has cleared up 
> for the Dell XPS 13 9360(?) to adhere to Linux' no regression policy.
> 
> 
> Kind regards,
> 
> Paul
> 
> 
> PS: Alexander will also be at FOSDEM and mentioned your talk [2].
> 
> 
> [1] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=125a2210541079e8e7c69e629ad06cabed788f8[2] 
> https://lists.01.org/pipermail/tpm2/2018-January/000486.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 18:59           ` Jason Gunthorpe
@ 2018-02-01 20:00             ` James Bottomley
  2018-02-01 20:35               ` Jason Gunthorpe
  2018-02-08 13:10               ` Jarkko Sakkinen
  0 siblings, 2 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-01 20:00 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Paul Menzel, linux-integrity

On Thu, 2018-02-01 at 11:59 -0700, Jason Gunthorpe wrote:
> On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:
> 
> > 
> > I honestly don't think we should be waiting for the self test at
> > all.
> > We should kick it off and treat any TPM_RC_TESTING error as
> > -EAGAIN.
> > We're already under fire for slow boot sequences and adding 2s just
> > to
> > wait for the TPM to self test adds to that for no real value.
> 
> Arguably the BIOS should have completed the selftest - this stuff
> generally only exists to support embedded.
> 
> I don't like the idea of EAGAIN, that just expose all our users to
> this mess.
> 
> I would support making transmit_cmd genericly wait and retry if the
> TPM insists we need to wait for selftest to complete the specific
> command though.

OK, how about this then?

James

---

diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
index 1d6729be4cd6..84ed271c060b 100644
--- a/drivers/char/tpm/tpm-interface.c
+++ b/drivers/char/tpm/tpm-interface.c
@@ -521,12 +521,32 @@ ssize_t tpm_transmit_cmd(struct tpm_chip *chip, struct tpm_space *space,
 	const struct tpm_output_header *header = buf;
 	int err;
 	ssize_t len;
+	unsigned int delay_msec = 20;
 
-	len = tpm_transmit(chip, space, (u8 *)buf, bufsiz, flags);
-	if (len <  0)
-		return len;
+	/*
+	 * on first probe we kick off a TPM self test in the
+	 * background This means the TPM may return RC_TESTING to any
+	 * command that tries to use a subsystem under test, so do an
+	 * exponential backoff wait if that happens
+	 */
+	for (;;) {
+		len = tpm_transmit(chip, space, (u8 *)buf, bufsiz, flags);
+		if (len <  0)
+			return len;
+
+		err = be32_to_cpu(header->return_code);
+		if (err != TPM2_RC_TESTING ||
+		    (flags & TPM_TRANSMIT_NOWAIT))
+			break;
+
+		delay_msec *= 2;
+		if (delay_msec > TPM2_DURATION_LONG) {
+			dev_err(&chip->dev,"TPM: still running self tests, giving up waiting\n");
+			break;
+		}
+		tpm_msleep(delay_msec);
+	}
 
-	err = be32_to_cpu(header->return_code);
 	if (err != 0 && desc)
 		dev_err(&chip->dev, "A TPM error (%d) occurred %s\n", err,
 			desc);
diff --git a/drivers/char/tpm/tpm.h b/drivers/char/tpm/tpm.h
index 528cffbd49d3..47c5a5206325 100644
--- a/drivers/char/tpm/tpm.h
+++ b/drivers/char/tpm/tpm.h
@@ -495,6 +495,7 @@ extern struct idr dev_nums_idr;
 enum tpm_transmit_flags {
 	TPM_TRANSMIT_UNLOCKED	= BIT(0),
 	TPM_TRANSMIT_RAW	= BIT(1),
+	TPM_TRANSMIT_NOWAIT	= BIT(2),
 };
 
 ssize_t tpm_transmit(struct tpm_chip *chip, struct tpm_space *space,
diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
index f40d20671a78..106c126b4fe0 100644
--- a/drivers/char/tpm/tpm2-cmd.c
+++ b/drivers/char/tpm/tpm2-cmd.c
@@ -849,28 +849,24 @@ static const struct tpm_input_header tpm2_selftest_header = {
 static int tpm2_do_selftest(struct tpm_chip *chip)
 {
 	int rc;
-	unsigned int delay_msec = 20;
-	long duration;
 	struct tpm2_cmd cmd;
 
-	duration = jiffies_to_msecs(
-		tpm2_calc_ordinal_duration(chip, TPM2_CC_SELF_TEST));
-
-	while (duration > 0) {
-		cmd.header.in = tpm2_selftest_header;
-		cmd.params.selftest_in.full_test = 0;
-
-		rc = tpm_transmit_cmd(chip, NULL, &cmd, TPM2_SELF_TEST_IN_SIZE,
-				      0, 0, "continue selftest");
-
-		if (rc != TPM2_RC_TESTING)
-			break;
-
-		tpm_msleep(delay_msec);
-		duration -= delay_msec;
-
-		/* wait longer the next round */
-		delay_msec *= 2;
+	cmd.header.in = tpm2_selftest_header;
+	cmd.params.selftest_in.full_test = 0;
+
+	rc = tpm_transmit_cmd(chip, NULL, &cmd, TPM2_SELF_TEST_IN_SIZE,
+			      0, TPM_TRANSMIT_NOWAIT, "continue selftest");
+
+	if (rc == TPM2_RC_TESTING) {
+		/*
+		 * A return of RC_TESTING means the TPM is still
+		 * running self tests.  If one fails it will go into
+		 * failure mode and return RC_FAILED to every command,
+		 * so treat a still in testing return as a success
+		 * rather than causing a driver detach.
+		 */
+		dev_info(&chip->dev,"TPM: Running self test in background\n");
+		rc = TPM2_RC_SUCCESS;
 	}
 
 	return rc;

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* RE: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 19:17         ` Paul Menzel
@ 2018-02-01 20:12           ` Mario.Limonciello
  2018-02-01 21:06             ` Mario.Limonciello
                               ` (2 more replies)
  0 siblings, 3 replies; 42+ messages in thread
From: Mario.Limonciello @ 2018-02-01 20:12 UTC (permalink / raw)
  To: pmenzel, James.Bottomley; +Cc: linux-integrity, regressions, Alexander.Steffen



> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Sent: Thursday, February 1, 2018 1:17 PM
> To: James Bottomley <James.Bottomley@HansenPartnership.com>
> Cc: linux-integrity <linux-integrity@vger.kernel.org>; Limonciello, Mario
> <Mario_Limonciello@Dell.com>; regressions@leemhuis.info; Alexander Steffen
> <Alexander.Steffen@infineon.com>
> Subject: Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
> 
> [resend with regressions@ address fixed, sorry]
> 
> Am 01.02.2018 um 20:16 schrieb Paul Menzel:
> > Dear James,
> >
> >
> > Am 01.02.2018 um 16:24 schrieb James Bottomley:
> >> On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> >>> On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> >
> >>>> On 02/01/18 13:16, James Bottomley wrote:
> >>>>>
> >>>>>
> >>>>> Embarrassingly enough, I'm just on my way to do a TPM talk at
> >>>>> FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
> >>>>> this is what I got after I arrived this morning:
> >>>>>
> >>>>> jejb@jarvis:~> dmesg | grep -i tpm
> >>>>> [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> >>>>> (v03        Tpm2Tabl 00000001 AMI  00000000)
> >>>>> [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
> >>>>> [    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
> >>>>> [    4.216236] tpm tpm0: TPM self test failed
> >>>>> [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> >>>>>
> >>>>> The error is TPM_RC_TESTING, which means it looks like we don't
> >>>>> wait long enough for the selftests to complete.  I get this all
> >>>>> the time booting with 4.15.  Fortunately I have a 4.13 backup
> >>>>> kernel which is fine (otherwise I'd be a bit hosed since all my
> >>>>> keys now require a TPM).
> >>>>>
> >>>>> I'll debug on the train; my current suspicion is that the
> >>>>> TPM_LONG duration might be a bit short for this chip (A nuvoton
> >>>>> 6xx in a dell XPS-13).
> >>>>
> >>>> Please join the thread [1], where I reported the same problem for
> >>>> the Dell XPS 13 9360. Unfortunately, no solution was found,
> >>>> especially, as I did not use the TPM. Other owners of that system
> >>>> unfortunately didn't have time to report back if it work for them,
> >>>> so the "conclusion" kind of was, that my TPM was broken, and had to
> >>>> be tested.
> >>>
> >>> OK, I'll try to find a fix.  It's clearly a marginal problem since
> >>> I've booted most -rc kernels without issue, so there's some slight
> >>> timing change in 4.15 that triggered it.  It could also be a shutdown
> >>> issue.  Any NV ram stuff deferred to start up would take a variable
> >>> amount of time.
> >>>
> >>> You'd almost think it's some sort of TPM self protest: the more stuff
> >>> I use it for the more problems it seems to create.  I'm definitely
> >>> motivated to fix it because without a TPM I can't actually do much
> >>> with my laptop.
> >>
> >> OK, I investigated but now my TPM has returned to normal (as in it
> >> passes the selftest immediately).  There's clearly something that makes
> >> it return TPM_RC_TESTING to every self test probe for seconds at a
> >> time, but I don't know what it is.  Sending a different command seems
> >> to cause the problem to clear (Managed to reproduce once with the patch
> >> to verify), so this is my proposed fix.  It's clearly nonsensical to
> >> detach the driver because the self test still returns TPM_RC_TESTING,
> >> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
> >> warning message so we'll see it in the logs if it causes problems.
> >>   Given that this seems to be some type of internal TPM issue, I don't
> >> believe changing the timings would work.
> >
> > Maybe Mario can confirm this issue too, now that Linux 4.15 is released.
> > Maybe he also has a way to get the Nuvoton people involved.

James,

Did you actually experiment with changing the timings?

I was told that TPMs that are FIPS validated (such as that in the XPS 13) may 
take longer for the self tests to run.

> >
> >> ---
> >>
> >> diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> >> index f40d20671a78..3e1b062d8888 100644
> >> --- a/drivers/char/tpm/tpm2-cmd.c
> >> +++ b/drivers/char/tpm/tpm2-cmd.c
> >> @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
> >>           /* wait longer the next round */
> >>           delay_msec *= 2;
> >>       }
> >> +    if (rc == TPM2_RC_TESTING) {
> >> +        /*
> >> +         * A return of RC_TESTING means the TPM is still
> >> +         * running self tests.  If one fails it will go into
> >> +         * failure mode and return RC_FAILED to every command,
> >> +         * so treat a still in testing return as a success
> >> +         * rather than causing a driver detach.
> >> +         */
> >> +        dev_err(&chip->dev,"TPM: Still in testing mode after %dms,
> >> continuing\n", delay_msec);
> >> +        rc = TPM2_RC_SUCCESS;
> >> +    }
> >>       return rc;
> >>   }
> >
> > Alexander replied the following in the other thread. No idea if you read
> > it yet.
> >
> >> The list of "A TPM error (2314) occurred continue selftest" is caused
> >> by my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React
> >> correctly to
> >> RC_TESTING from TPM 2.0 self tests") [1]. 2314 is TPM_RC_TESTING, so
> >> the TPM
> >> tells us that self-tests are still running in the background. This
> >> problem was
> >> not visible in previous versions, since it (incorrectly) ignored >
> >> TPM_RC_TESTING.
> >
> > Maybe the commit should be reverted for now until this has cleared up
> > for the Dell XPS 13 9360(?) to adhere to Linux' no regression policy.
> >
> >
> > Kind regards,
> >
> > Paul
> >
> >
> > PS: Alexander will also be at FOSDEM and mentioned your talk [2].
> >
> >
> > [1]
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=125
> a2210541079e8e7c69e629ad06cabed788f8[2]
> > https://lists.01.org/pipermail/tpm2/2018-January/000486.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 20:00             ` James Bottomley
@ 2018-02-01 20:35               ` Jason Gunthorpe
  2018-02-01 21:06                 ` James Bottomley
  2018-02-08 13:10               ` Jarkko Sakkinen
  1 sibling, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2018-02-01 20:35 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 09:00:04PM +0100, James Bottomley wrote:
> On Thu, 2018-02-01 at 11:59 -0700, Jason Gunthorpe wrote:
> > On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:
> > 
> > > 
> > > I honestly don't think we should be waiting for the self test at
> > > all.
> > > We should kick it off and treat any TPM_RC_TESTING error as
> > > -EAGAIN.
> > > We're already under fire for slow boot sequences and adding 2s just
> > > to
> > > wait for the TPM to self test adds to that for no real value.
> > 
> > Arguably the BIOS should have completed the selftest - this stuff
> > generally only exists to support embedded.
> > 
> > I don't like the idea of EAGAIN, that just expose all our users to
> > this mess.
> > 
> > I would support making transmit_cmd genericly wait and retry if the
> > TPM insists we need to wait for selftest to complete the specific
> > command though.
> 
> OK, how about this then?

Yeah, I like this concept much better, thanks

> diff --git a/drivers/char/tpm/tpm-interface.c b/drivers/char/tpm/tpm-interface.c
> index 1d6729be4cd6..84ed271c060b 100644
> +++ b/drivers/char/tpm/tpm-interface.c
> @@ -521,12 +521,32 @@ ssize_t tpm_transmit_cmd(struct tpm_chip *chip, struct tpm_space *space,
>  	const struct tpm_output_header *header = buf;
>  	int err;
>  	ssize_t len;
> +	unsigned int delay_msec = 20;
>  
> -	len = tpm_transmit(chip, space, (u8 *)buf, bufsiz, flags);
> -	if (len <  0)
> -		return len;
> +	/*
> +	 * on first probe we kick off a TPM self test in the
> +	 * background This means the TPM may return RC_TESTING to any
> +	 * command that tries to use a subsystem under test, so do an
> +	 * exponential backoff wait if that happens
> +	 */
> +	for (;;) {
> +		len = tpm_transmit(chip, space, (u8 *)buf, bufsiz, flags);
> +		if (len <  0)
> +			return len;
> +
> +		err = be32_to_cpu(header->return_code);
> +		if (err != TPM2_RC_TESTING ||
> +		    (flags & TPM_TRANSMIT_NOWAIT))
> +			break;

Do TPM and TPM2 use a different return code here?

Jason

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 20:35               ` Jason Gunthorpe
@ 2018-02-01 21:06                 ` James Bottomley
  0 siblings, 0 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-01 21:06 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Paul Menzel, linux-integrity

On Thu, 2018-02-01 at 13:35 -0700, Jason Gunthorpe wrote:
> On Thu, Feb 01, 2018 at 09:00:04PM +0100, James Bottomley wrote:
> > 
> > On Thu, 2018-02-01 at 11:59 -0700, Jason Gunthorpe wrote:
> > > 
> > > On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:
> > > 
> > > > 
> > > > 
> > > > I honestly don't think we should be waiting for the self test
> > > > at
> > > > all.
> > > > We should kick it off and treat any TPM_RC_TESTING error as
> > > > -EAGAIN.
> > > > We're already under fire for slow boot sequences and adding 2s
> > > > just
> > > > to
> > > > wait for the TPM to self test adds to that for no real value.
> > > 
> > > Arguably the BIOS should have completed the selftest - this stuff
> > > generally only exists to support embedded.
> > > 
> > > I don't like the idea of EAGAIN, that just expose all our users
> > > to
> > > this mess.
> > > 
> > > I would support making transmit_cmd genericly wait and retry if
> > > the
> > > TPM insists we need to wait for selftest to complete the specific
> > > command though.
> > 
> > OK, how about this then?
> 
> Yeah, I like this concept much better, thanks

Great, I'll put it through a few tests then send a formal patch.

> > diff --git a/drivers/char/tpm/tpm-interface.c
> > b/drivers/char/tpm/tpm-interface.c
> > index 1d6729be4cd6..84ed271c060b 100644
> > +++ b/drivers/char/tpm/tpm-interface.c
> > @@ -521,12 +521,32 @@ ssize_t tpm_transmit_cmd(struct tpm_chip
> > *chip, struct tpm_space *space,
> >  	const struct tpm_output_header *header = buf;
> >  	int err;
> >  	ssize_t len;
> > +	unsigned int delay_msec = 20;
> >  
> > -	len = tpm_transmit(chip, space, (u8 *)buf, bufsiz, flags);
> > -	if (len <  0)
> > -		return len;
> > +	/*
> > +	 * on first probe we kick off a TPM self test in the
> > +	 * background This means the TPM may return RC_TESTING to
> > any
> > +	 * command that tries to use a subsystem under test, so do
> > an
> > +	 * exponential backoff wait if that happens
> > +	 */
> > +	for (;;) {
> > +		len = tpm_transmit(chip, space, (u8 *)buf, bufsiz,
> > flags);
> > +		if (len <  0)
> > +			return len;
> > +
> > +		err = be32_to_cpu(header->return_code);
> > +		if (err != TPM2_RC_TESTING ||
> > +		    (flags & TPM_TRANSMIT_NOWAIT))
> > +			break;
> 
> Do TPM and TPM2 use a different return code here?

Yes, TPM2_RC_TESTING is 0x90a and TPM_DOING_SELFTEST is 0x802

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* RE: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 20:12           ` Mario.Limonciello
@ 2018-02-01 21:06             ` Mario.Limonciello
  2018-02-01 22:22               ` Jason Gunthorpe
  2018-02-02  5:46             ` James Bottomley
  2018-02-08 16:53             ` Ken Goldman
  2 siblings, 1 reply; 42+ messages in thread
From: Mario.Limonciello @ 2018-02-01 21:06 UTC (permalink / raw)
  To: Alexander.Steffen; +Cc: linux-integrity, regressions, James.Bottomley, pmenzel

> -----Original Message-----
> From: Limonciello, Mario
> Sent: Thursday, February 1, 2018 2:12 PM
> To: 'Paul Menzel' <pmenzel@molgen.mpg.de>; James Bottomley
> <James.Bottomley@HansenPartnership.com>
> Cc: linux-integrity <linux-integrity@vger.kernel.org>; regressions@leemhuis.info;
> Alexander Steffen <Alexander.Steffen@infineon.com>
> Subject: RE: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
> 
> 
> 
> > -----Original Message-----
> > From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> > Sent: Thursday, February 1, 2018 1:17 PM
> > To: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Cc: linux-integrity <linux-integrity@vger.kernel.org>; Limonciello, Mario
> > <Mario_Limonciello@Dell.com>; regressions@leemhuis.info; Alexander Steffen
> > <Alexander.Steffen@infineon.com>
> > Subject: Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
> >
> > [resend with regressions@ address fixed, sorry]
> >
> > Am 01.02.2018 um 20:16 schrieb Paul Menzel:
> > > Dear James,
> > >
> > >
> > > Am 01.02.2018 um 16:24 schrieb James Bottomley:
> > >> On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> > >>> On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> > >
> > >>>> On 02/01/18 13:16, James Bottomley wrote:
> > >>>>>
> > >>>>>
> > >>>>> Embarrassingly enough, I'm just on my way to do a TPM talk at
> > >>>>> FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
> > >>>>> this is what I got after I arrived this morning:
> > >>>>>
> > >>>>> jejb@jarvis:~> dmesg | grep -i tpm
> > >>>>> [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> > >>>>> (v03        Tpm2Tabl 00000001 AMI  00000000)
> > >>>>> [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
> > >>>>> [    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >>>>> [    4.216236] tpm tpm0: TPM self test failed
> > >>>>> [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > >>>>>
> > >>>>> The error is TPM_RC_TESTING, which means it looks like we don't
> > >>>>> wait long enough for the selftests to complete.  I get this all
> > >>>>> the time booting with 4.15.  Fortunately I have a 4.13 backup
> > >>>>> kernel which is fine (otherwise I'd be a bit hosed since all my
> > >>>>> keys now require a TPM).
> > >>>>>
> > >>>>> I'll debug on the train; my current suspicion is that the
> > >>>>> TPM_LONG duration might be a bit short for this chip (A nuvoton
> > >>>>> 6xx in a dell XPS-13).
> > >>>>
> > >>>> Please join the thread [1], where I reported the same problem for
> > >>>> the Dell XPS 13 9360. Unfortunately, no solution was found,
> > >>>> especially, as I did not use the TPM. Other owners of that system
> > >>>> unfortunately didn't have time to report back if it work for them,
> > >>>> so the "conclusion" kind of was, that my TPM was broken, and had to
> > >>>> be tested.
> > >>>
> > >>> OK, I'll try to find a fix.  It's clearly a marginal problem since
> > >>> I've booted most -rc kernels without issue, so there's some slight
> > >>> timing change in 4.15 that triggered it.  It could also be a shutdown
> > >>> issue.  Any NV ram stuff deferred to start up would take a variable
> > >>> amount of time.
> > >>>
> > >>> You'd almost think it's some sort of TPM self protest: the more stuff
> > >>> I use it for the more problems it seems to create.  I'm definitely
> > >>> motivated to fix it because without a TPM I can't actually do much
> > >>> with my laptop.
> > >>
> > >> OK, I investigated but now my TPM has returned to normal (as in it
> > >> passes the selftest immediately).  There's clearly something that makes
> > >> it return TPM_RC_TESTING to every self test probe for seconds at a
> > >> time, but I don't know what it is.  Sending a different command seems
> > >> to cause the problem to clear (Managed to reproduce once with the patch
> > >> to verify), so this is my proposed fix.  It's clearly nonsensical to
> > >> detach the driver because the self test still returns TPM_RC_TESTING,
> > >> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
> > >> warning message so we'll see it in the logs if it causes problems.
> > >>   Given that this seems to be some type of internal TPM issue, I don't
> > >> believe changing the timings would work.
> > >
> > > Maybe Mario can confirm this issue too, now that Linux 4.15 is released.
> > > Maybe he also has a way to get the Nuvoton people involved.
> 
> James,
> 
> Did you actually experiment with changing the timings?
> 
> I was told that TPMs that are FIPS validated (such as that in the XPS 13) may
> take longer for the self tests to run.
> 
> > >
> > >> ---
> > >>
> > >> diff --git a/drivers/char/tpm/tpm2-cmd.c b/drivers/char/tpm/tpm2-cmd.c
> > >> index f40d20671a78..3e1b062d8888 100644
> > >> --- a/drivers/char/tpm/tpm2-cmd.c
> > >> +++ b/drivers/char/tpm/tpm2-cmd.c
> > >> @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct tpm_chip *chip)
> > >>           /* wait longer the next round */
> > >>           delay_msec *= 2;
> > >>       }
> > >> +    if (rc == TPM2_RC_TESTING) {
> > >> +        /*
> > >> +         * A return of RC_TESTING means the TPM is still
> > >> +         * running self tests.  If one fails it will go into
> > >> +         * failure mode and return RC_FAILED to every command,
> > >> +         * so treat a still in testing return as a success
> > >> +         * rather than causing a driver detach.
> > >> +         */
> > >> +        dev_err(&chip->dev,"TPM: Still in testing mode after %dms,
> > >> continuing\n", delay_msec);
> > >> +        rc = TPM2_RC_SUCCESS;
> > >> +    }
> > >>       return rc;
> > >>   }
> > >

I discussed this with some folks and although it would fix the problem it is not
accurately characterizing the situation.  What is likely happening here is that
issuing the self test command in succession is causing the TPM to restart the
self test and not complete.  Instead the selfTestDone bit should be polled.

I feel Paul is right, if a solution can't be brought up to do this instead, this 
commit 125a2210541079e8e7c69e629ad06cabed788f8c should be reverted.

> > > Alexander replied the following in the other thread. No idea if you read
> > > it yet.
> > >
> > >> The list of "A TPM error (2314) occurred continue selftest" is caused
> > >> by my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React
> > >> correctly to
> > >> RC_TESTING from TPM 2.0 self tests") [1]. 2314 is TPM_RC_TESTING, so
> > >> the TPM
> > >> tells us that self-tests are still running in the background. This
> > >> problem was
> > >> not visible in previous versions, since it (incorrectly) ignored >
> > >> TPM_RC_TESTING.
> > >
> > > Maybe the commit should be reverted for now until this has cleared up
> > > for the Dell XPS 13 9360(?) to adhere to Linux' no regression policy.
> > >
> > >
> > > Kind regards,
> > >
> > > Paul
> > >
> > >
> > > PS: Alexander will also be at FOSDEM and mentioned your talk [2].
> > >
> > >
> > > [1]
> > >
> >
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit?id=125
> > a2210541079e8e7c69e629ad06cabed788f8[2]
> > > https://lists.01.org/pipermail/tpm2/2018-January/000486.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 21:06             ` Mario.Limonciello
@ 2018-02-01 22:22               ` Jason Gunthorpe
  2018-02-02  5:46                 ` James Bottomley
  0 siblings, 1 reply; 42+ messages in thread
From: Jason Gunthorpe @ 2018-02-01 22:22 UTC (permalink / raw)
  To: Mario.Limonciello
  Cc: Alexander.Steffen, linux-integrity, regressions, James.Bottomley,
	pmenzel

On Thu, Feb 01, 2018 at 09:06:55PM +0000, Mario.Limonciello@dell.com wrote:

> I discussed this with some folks and although it would fix the
> problem it is not accurately characterizing the situation.  What is
> likely happening here is that issuing the self test command in
> succession is causing the TPM to restart the self test and not
> complete.  Instead the selfTestDone bit should be polled.

Oh, I wonder if that is beacuse this was copied from the TPM1 version
where we don't have a bit like that??

Should we check this selfTestDone bit before event sending the self
test?

Jsaon

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 20:12           ` Mario.Limonciello
  2018-02-01 21:06             ` Mario.Limonciello
@ 2018-02-02  5:46             ` James Bottomley
  2018-02-08 16:53             ` Ken Goldman
  2 siblings, 0 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-02  5:46 UTC (permalink / raw)
  To: Mario.Limonciello, pmenzel
  Cc: linux-integrity, regressions, Alexander.Steffen

On Thu, 2018-02-01 at 20:12 +0000, Mario.Limonciello@dell.com wrote:
> 
> > 
> > -----Original Message-----
> > From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> > Sent: Thursday, February 1, 2018 1:17 PM
> > To: James Bottomley <James.Bottomley@HansenPartnership.com>
> > Cc: linux-integrity <linux-integrity@vger.kernel.org>; Limonciello,
> > Mario
> > <Mario_Limonciello@Dell.com>; regressions@leemhuis.info; Alexander
> > Steffen
> > <Alexander.Steffen@infineon.com>
> > Subject: Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton
> > 6xx)
> > 
> > [resend with regressions@ address fixed, sorry]
> > 
> > Am 01.02.2018 um 20:16 schrieb Paul Menzel:
> > > 
> > > Dear James,
> > > 
> > > 
> > > Am 01.02.2018 um 16:24 schrieb James Bottomley:
> > > > 
> > > > On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> > > > > 
> > > > > On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> > > 
> > > > 
> > > > > 
> > > > > > 
> > > > > > On 02/01/18 13:16, James Bottomley wrote:
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > Embarrassingly enough, I'm just on my way to do a TPM
> > > > > > > talk at
> > > > > > > FOSDEM.   I installed my shiny new 4.15 kernel on the
> > > > > > > 'plane and
> > > > > > > this is what I got after I arrived this morning:
> > > > > > > 
> > > > > > > jejb@jarvis:~> dmesg | grep -i tpm
> > > > > > > [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> > > > > > > (v03        Tpm2Tabl 00000001 AMI  00000000)
> > > > > > > [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id
> > > > > > > 0xFE, rev-id 2)
> > > > > > > [    1.608863] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    1.640052] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    1.691215] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    1.782377] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    1.953539] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    2.284701] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    2.935743] tpm tpm0: A TPM error (2314) occurred
> > > > > > > continue selftest
> > > > > > > [    4.216236] tpm tpm0: TPM self test failed
> > > > > > > [    4.236829] ima: No TPM chip found, activating TPM-
> > > > > > > bypass! (rc=-19)
> > > > > > > 
> > > > > > > The error is TPM_RC_TESTING, which means it looks like we
> > > > > > > don't wait long enough for the selftests to complete.  I
> > > > > > > get this all the time booting with 4.15.  Fortunately I
> > > > > > > have a 4.13 backup kernel which is fine (otherwise I'd be
> > > > > > > a bit hosed since all my keys now require a TPM).
> > > > > > > 
> > > > > > > I'll debug on the train; my current suspicion is that the
> > > > > > > TPM_LONG duration might be a bit short for this chip (A
> > > > > > > nuvoton 6xx in a dell XPS-13).
> > > > > > 
> > > > > > Please join the thread [1], where I reported the same
> > > > > > problem for the Dell XPS 13 9360. Unfortunately, no
> > > > > > solution was found, especially, as I did not use the TPM.
> > > > > > Other owners of that system unfortunately didn't have time
> > > > > > to report back if it work for them, so the "conclusion"
> > > > > > kind of was, that my TPM was broken, and had to be tested.
> > > > > 
> > > > > OK, I'll try to find a fix.  It's clearly a marginal problem
> > > > > since I've booted most -rc kernels without issue, so there's
> > > > > some slight timing change in 4.15 that triggered it.  It
> > > > > could also be a shutdown issue.  Any NV ram stuff deferred to
> > > > > start up would take a variable amount of time.
> > > > > 
> > > > > You'd almost think it's some sort of TPM self protest: the
> > > > > more stuff I use it for the more problems it seems to create.
> > > > >  I'm definitely motivated to fix it because without a TPM I
> > > > > can't actually do much with my laptop.
> > > > 
> > > > OK, I investigated but now my TPM has returned to normal (as in
> > > > it passes the selftest immediately).  There's clearly something
> > > > that makes it return TPM_RC_TESTING to every self test probe
> > > > for seconds at a time, but I don't know what it is.  Sending a
> > > > different command seems to cause the problem to clear (Managed
> > > > to reproduce once with the patch to verify), so this is my
> > > > proposed fix.  It's clearly nonsensical to detach the driver
> > > > because the self test still returns TPM_RC_TESTING,
> > > > so convert that return to a TPM_RC_SUCCESS on timeout.  It
> > > > prints a warning message so we'll see it in the logs if it
> > > > causes problems.   Given that this seems to be some type of
> > > > internal TPM issue, I don't believe changing the timings would
> > > > work.
> > > 
> > > Maybe Mario can confirm this issue too, now that Linux 4.15 is
> > > released. Maybe he also has a way to get the Nuvoton people
> > > involved.
> 
> James,
> 
> Did you actually experiment with changing the timings?

No, I already said: waiting 2s for a device driver init is already too
great a burden on the boot sequence.  I don't honestly think waiting
longer would help either ... 2s is a huge amount of time so there's
something else going on with the TPM.

James

> I was told that TPMs that are FIPS validated (such as that in the XPS
> 13) may take longer for the self tests to run.
> 
> > 
> > > 
> > > 
> > > > 
> > > > ---
> > > > 
> > > > diff --git a/drivers/char/tpm/tpm2-cmd.c
> > > > b/drivers/char/tpm/tpm2-cmd.c
> > > > index f40d20671a78..3e1b062d8888 100644
> > > > --- a/drivers/char/tpm/tpm2-cmd.c
> > > > +++ b/drivers/char/tpm/tpm2-cmd.c
> > > > @@ -872,6 +872,17 @@ static int tpm2_do_selftest(struct
> > > > tpm_chip *chip)
> > > >           /* wait longer the next round */
> > > >           delay_msec *= 2;
> > > >       }
> > > > +    if (rc == TPM2_RC_TESTING) {
> > > > +        /*
> > > > +         * A return of RC_TESTING means the TPM is still
> > > > +         * running self tests.  If one fails it will go into
> > > > +         * failure mode and return RC_FAILED to every command,
> > > > +         * so treat a still in testing return as a success
> > > > +         * rather than causing a driver detach.
> > > > +         */
> > > > +        dev_err(&chip->dev,"TPM: Still in testing mode after
> > > > %dms,
> > > > continuing\n", delay_msec);
> > > > +        rc = TPM2_RC_SUCCESS;
> > > > +    }
> > > >       return rc;
> > > >   }
> > > 
> > > Alexander replied the following in the other thread. No idea if
> > > you read
> > > it yet.
> > > 
> > > > 
> > > > The list of "A TPM error (2314) occurred continue selftest" is
> > > > caused
> > > > by my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm:
> > > > React
> > > > correctly to
> > > > RC_TESTING from TPM 2.0 self tests") [1]. 2314 is
> > > > TPM_RC_TESTING, so
> > > > the TPM
> > > > tells us that self-tests are still running in the background.
> > > > This
> > > > problem was
> > > > not visible in previous versions, since it (incorrectly)
> > > > ignored >
> > > > TPM_RC_TESTING.
> > > 
> > > Maybe the commit should be reverted for now until this has
> > > cleared up
> > > for the Dell XPS 13 9360(?) to adhere to Linux' no regression
> > > policy.
> > > 
> > > 
> > > Kind regards,
> > > 
> > > Paul
> > > 
> > > 
> > > PS: Alexander will also be at FOSDEM and mentioned your talk [2].
> > > 
> > > 
> > > [1]
> > > 
> > https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
> > commit?id=125
> > a2210541079e8e7c69e629ad06cabed788f8[2]
> > > 
> > > https://lists.01.org/pipermail/tpm2/2018-January/000486.html

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 22:22               ` Jason Gunthorpe
@ 2018-02-02  5:46                 ` James Bottomley
  0 siblings, 0 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-02  5:46 UTC (permalink / raw)
  To: Jason Gunthorpe, Mario.Limonciello
  Cc: Alexander.Steffen, linux-integrity, regressions, pmenzel

On Thu, 2018-02-01 at 15:22 -0700, Jason Gunthorpe wrote:
> On Thu, Feb 01, 2018 at 09:06:55PM +0000, Mario.Limonciello@dell.com
> wrote:
> 
> > 
> > I discussed this with some folks and although it would fix the
> > problem it is not accurately characterizing the situation.  What is
> > likely happening here is that issuing the self test command in
> > succession is causing the TPM to restart the self test and not
> > complete.  Instead the selfTestDone bit should be polled.
> 
> Oh, I wonder if that is beacuse this was copied from the TPM1 version
> where we don't have a bit like that??
> 
> Should we check this selfTestDone bit before event sending the self
> test?

The proposed patch already takes care of that by only sending the self
test once.  I suppose checking the selfTestDone bit and not sending a
self test at all could be an additional optimization on top, though.

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:16 TPM selftest failure in 4.15 James Bottomley
  2018-02-01 12:21 ` Paul Menzel
@ 2018-02-08 12:49 ` Jarkko Sakkinen
  2018-02-08 18:45 ` Ken Goldman
  2 siblings, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-08 12:49 UTC (permalink / raw)
  To: James Bottomley; +Cc: linux-integrity

On Thu, Feb 01, 2018 at 12:16:11PM +0000, James Bottomley wrote:
> Embarrassingly enough, I'm just on my way to do a TPM talk at FOSDEM.
>  I installed my shiny new 4.15 kernel on the 'plane and this is what I
> got after I arrived this morning:
> 
> jejb@jarvis:~> dmesg | grep -i tpm
> [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> (v03        Tpm2Tabl 00000001 AMI  00000000)
> [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 2)
> [    1.608863] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.640052] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.691215] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.782377] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.953539] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    2.284701] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    2.935743] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    4.216236] tpm tpm0: TPM self test failed
> [    4.236829] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> 
> The error is TPM_RC_TESTING, which means it looks like we don't wait
> long enough for the selftests to complete.  I get this all the time
> booting with 4.15.  Fortunately I have a 4.13 backup kernel which is
> fine (otherwise I'd be a bit hosed since all my keys now require a
> TPM).
> 
> I'll debug on the train; my current suspicion is that the TPM_LONG
> duration might be a bit short for this chip (A nuvoton 6xx in a dell
> XPS-13).
> 
> James

I also have XPS 13. I'll check if I can reproduce this. Sorry for the
latency. At busy times I have some latency for messages that do not
include me in to/cc fields but I eventually catch these. If you want
a quicker response, do that in the future.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:42   ` James Bottomley
  2018-02-01 15:24     ` James Bottomley
@ 2018-02-08 13:03     ` Jarkko Sakkinen
  1 sibling, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-08 13:03 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 12:42:50PM +0000, James Bottomley wrote:
> OK, I'll try to find a fix.  It's clearly a marginal problem since I've
> booted most -rc kernels without issue, so there's some slight timing
> change in 4.15 that triggered it.  It could also be a shutdown issue.
>  Any NV ram stuff deferred to start up would take a variable amount of
> time.
> 
> You'd almost think it's some sort of TPM self protest: the more stuff I
> use it for the more problems it seems to create.  I'm definitely
> motivated to fix it because without a TPM I can't actually do much with
> my laptop.

I checked what was landed to 4.15:

* 125a22105410 ("tpm: React correctly to RC_TESTING from TPM 2.0 self tests")
* 87434f58be31 ("tpm: Use dynamic delay to wait for TPM 2.0 self test result")
* 2482b1bba512 ("tpm: Trigger only missing TPM 2.0 self tests")

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 15:24     ` James Bottomley
  2018-02-01 17:40       ` Jason Gunthorpe
  2018-02-01 19:16       ` TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx) Paul Menzel
@ 2018-02-08 13:05       ` Jarkko Sakkinen
  2 siblings, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-08 13:05 UTC (permalink / raw)
  To: James Bottomley; +Cc: Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 03:24:08PM +0000, James Bottomley wrote:
> On Thu, 2018-02-01 at 12:42 +0000, James Bottomley wrote:
> > On Thu, 2018-02-01 at 13:21 +0100, Paul Menzel wrote:
> > > 
> > > Dear James,
> > > 
> > > 
> > > On 02/01/18 13:16, James Bottomley wrote:
> > > > 
> > > > 
> > > > Embarrassingly enough, I'm just on my way to do a TPM talk at
> > > > FOSDEM.   I installed my shiny new 4.15 kernel on the 'plane and
> > > > this is what I got after I arrived this morning:
> > > > 
> > > > jejb@jarvis:~> dmesg | grep -i tpm
> > > > [    0.000000] ACPI: TPM2 0x0000000079446CC0 000034
> > > > (v03        Tpm2Tabl 00000001 AMI  00000000)
> > > > [    1.598059] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-
> > > > id 
> > > > 2)
> > > > [    1.608863] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    1.640052] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    1.691215] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    1.782377] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    1.953539] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    2.284701] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    2.935743] tpm tpm0: A TPM error (2314) occurred continue
> > > > selftest
> > > > [    4.216236] tpm tpm0: TPM self test failed
> > > > [    4.236829] ima: No TPM chip found, activating TPM-bypass!
> > > > (rc=-
> > > > 19)
> > > > 
> > > > The error is TPM_RC_TESTING, which means it looks like we don't
> > > > wait long enough for the selftests to complete.  I get this all
> > > > the time booting with 4.15.  Fortunately I have a 4.13 backup
> > > > kernel which is fine (otherwise I'd be a bit hosed since all my
> > > > keys now require a TPM).
> > > > 
> > > > I'll debug on the train; my current suspicion is that the
> > > > TPM_LONG duration might be a bit short for this chip (A nuvoton
> > > > 6xx in a dell XPS-13).
> > > 
> > > Please join the thread [1], where I reported the same problem for
> > > the Dell XPS 13 9360. Unfortunately, no solution was found,
> > > especially, as I did not use the TPM. Other owners of that system
> > > unfortunately didn't have time to report back if it work for them,
> > > so the "conclusion" kind of was, that my TPM was broken, and had to
> > > be tested.
> > 
> > OK, I'll try to find a fix.  It's clearly a marginal problem since
> > I've booted most -rc kernels without issue, so there's some slight
> > timing change in 4.15 that triggered it.  It could also be a shutdown
> > issue.  Any NV ram stuff deferred to start up would take a variable
> > amount of time.
> > 
> > You'd almost think it's some sort of TPM self protest: the more stuff
> > I use it for the more problems it seems to create.  I'm definitely
> > motivated to fix it because without a TPM I can't actually do much
> > with my laptop.
> 
> OK, I investigated but now my TPM has returned to normal (as in it
> passes the selftest immediately).  There's clearly something that makes
> it return TPM_RC_TESTING to every self test probe for seconds at a
> time, but I don't know what it is.  Sending a different command seems
> to cause the problem to clear (Managed to reproduce once with the patch
> to verify), so this is my proposed fix.  It's clearly nonsensical to
> detach the driver because the self test still returns TPM_RC_TESTING,
> so convert that return to a TPM_RC_SUCCESS on timeout.  It prints a
> warning message so we'll see it in the logs if it causes problems.
>  Given that this seems to be some type of internal TPM issue, I don't
> believe changing the timings would work.

I don't think this is a sane rationale for a fix if the driver has
worked just fine on 4.14. It would be better to first identify the
commit that causes the regression and plan after that how to fix it.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 20:00             ` James Bottomley
  2018-02-01 20:35               ` Jason Gunthorpe
@ 2018-02-08 13:10               ` Jarkko Sakkinen
  2018-02-08 17:02                 ` James Bottomley
  1 sibling, 1 reply; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-08 13:10 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Thu, Feb 01, 2018 at 09:00:04PM +0100, James Bottomley wrote:
> On Thu, 2018-02-01 at 11:59 -0700, Jason Gunthorpe wrote:
> > On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:
> > 
> > > 
> > > I honestly don't think we should be waiting for the self test at
> > > all.
> > > We should kick it off and treat any TPM_RC_TESTING error as
> > > -EAGAIN.
> > > We're already under fire for slow boot sequences and adding 2s just
> > > to
> > > wait for the TPM to self test adds to that for no real value.
> > 
> > Arguably the BIOS should have completed the selftest - this stuff
> > generally only exists to support embedded.
> > 
> > I don't like the idea of EAGAIN, that just expose all our users to
> > this mess.
> > 
> > I would support making transmit_cmd genericly wait and retry if the
> > TPM insists we need to wait for selftest to complete the specific
> > command though.
> 
> OK, how about this then?
> 
> James

As long as there is no identified a regression it is a waste
of time to review these...

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 19:16       ` TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx) Paul Menzel
  2018-02-01 19:17         ` Paul Menzel
@ 2018-02-08 13:18         ` Jarkko Sakkinen
  1 sibling, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-08 13:18 UTC (permalink / raw)
  To: Paul Menzel
  Cc: James Bottomley, linux-integrity, Mario Limonciello, regression,
	Alexander Steffen

On Thu, Feb 01, 2018 at 08:16:26PM +0100, Paul Menzel wrote:
> > The list of "A TPM error (2314) occurred continue selftest" is caused by
> > my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React
> > correctly to
> > RC_TESTING from TPM 2.0 self tests") [1]. 2314 is TPM_RC_TESTING, so the TPM
> > tells us that self-tests are still running in the background. This problem was
> > not visible in previous versions, since it (incorrectly) ignored > TPM_RC_TESTING.
> 
> Maybe the commit should be reverted for now until this has cleared up for
> the Dell XPS 13 9360(?) to adhere to Linux' no regression policy.

I think this should be done as an immediate action with Cc to stable.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx)
  2018-02-01 20:12           ` Mario.Limonciello
  2018-02-01 21:06             ` Mario.Limonciello
  2018-02-02  5:46             ` James Bottomley
@ 2018-02-08 16:53             ` Ken Goldman
  2 siblings, 0 replies; 42+ messages in thread
From: Ken Goldman @ 2018-02-08 16:53 UTC (permalink / raw)
  To: linux-integrity

On 2/1/2018 3:12 PM, Mario.Limonciello@dell.com wrote:
> 
> I was told that TPMs that are FIPS validated (such as that in the XPS 13) may
> take longer for the self tests to run.

I don't understand why the SeftTest command should take longer.

I understand that a FIPS TPM must do all the self tests before it 
returns any results, while a non-FIPS must only test the algorithms 
required to return the result.  So, some other command will take longer.

~~

 From the TPM spec:

FIPS 140-2 requires that all power-on self-tests be complete before the 
TPM returns any value that depends on the results of a testable 
function. If compliance with FIPS 140-2 is required, any command that 
requires use of an untested function causes the TPM to operate as if 
TPM2_SelfTest(fullTest = NO) was received. The TPM returns 
TPM_RC_TESTING and continues to return TPM_RC_TESTING until all tests 
are complete. Alternatively, it may complete all tests and then complete 
the command. It may also return TPM_RC_NEEDS_TEST.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-08 13:10               ` Jarkko Sakkinen
@ 2018-02-08 17:02                 ` James Bottomley
  2018-02-09 10:02                   ` Jarkko Sakkinen
  0 siblings, 1 reply; 42+ messages in thread
From: James Bottomley @ 2018-02-08 17:02 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Thu, 2018-02-08 at 15:10 +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 01, 2018 at 09:00:04PM +0100, James Bottomley wrote:
> > 
> > On Thu, 2018-02-01 at 11:59 -0700, Jason Gunthorpe wrote:
> > > 
> > > On Thu, Feb 01, 2018 at 07:46:04PM +0100, James Bottomley wrote:
> > > 
> > > > 
> > > > 
> > > > I honestly don't think we should be waiting for the self test
> > > > at
> > > > all.
> > > > We should kick it off and treat any TPM_RC_TESTING error as
> > > > -EAGAIN.
> > > > We're already under fire for slow boot sequences and adding 2s
> > > > just
> > > > to
> > > > wait for the TPM to self test adds to that for no real value.
> > > 
> > > Arguably the BIOS should have completed the selftest - this stuff
> > > generally only exists to support embedded.
> > > 
> > > I don't like the idea of EAGAIN, that just expose all our users
> > > to
> > > this mess.
> > > 
> > > I would support making transmit_cmd genericly wait and retry if
> > > the
> > > TPM insists we need to wait for selftest to complete the specific
> > > command though.
> > 
> > OK, how about this then?
> > 
> > James
> 
> As long as there is no identified a regression it is a waste
> of time to review these...

There is an identified regression: the TPM driver will now periodically
fail to attach.  However, there's no point reviewing until we agree
what the fix is.  I was just waiting to verify this fixed my problem
(which means seeing the messages it spits out proving the TPM has
remained in self test).  I have now seen this and the driver still
works, so I can submit a formal patch.

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 17:40       ` Jason Gunthorpe
  2018-02-01 18:46         ` James Bottomley
@ 2018-02-08 17:27         ` Ken Goldman
  1 sibling, 0 replies; 42+ messages in thread
From: Ken Goldman @ 2018-02-08 17:27 UTC (permalink / raw)
  To: linux-integrity

On 2/1/2018 12:40 PM, Jason Gunthorpe wrote:

> 
> But if a selftest returns TPM2_RC_TESTING I would expect the next
> command to also fail with a testing in progress code? At least by the
> spec..

It will return that code if the command requires an untested function.
For FIPS, the code is returned if there is any untested function.

Commands that don't require testable functions will succeed.

> The point of invoking selftest is to get to a state where future TPM
> commands will succeed, so returning immediately on RC_TESTING seems
> wrong?

Blocking vs. non-blocking is a TPM vendor implementation option.

Re., "seems wrong", the PC Client TPM changed the requirement, so that
a guaranteed non-blocking option is available.

~~
2.	On receipt of TPM2_SelfTest(fullTest == YES), the TPM SHALL perform a 
full self-test and return the result when all tests are complete.
~~

Of course, at some layer, the device drive is still polling, either for 
a success code (non-blocking) or a response (blocking).

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-01 12:16 TPM selftest failure in 4.15 James Bottomley
  2018-02-01 12:21 ` Paul Menzel
  2018-02-08 12:49 ` Jarkko Sakkinen
@ 2018-02-08 18:45 ` Ken Goldman
  2 siblings, 0 replies; 42+ messages in thread
From: Ken Goldman @ 2018-02-08 18:45 UTC (permalink / raw)
  To: linux-integrity

On 2/1/2018 7:16 AM, James Bottomley wrote:
> 
> The error is TPM_RC_TESTING, which means it looks like we don't wait
> long enough for the selftests to complete.  I get this all the time
> booting with 4.15.  Fortunately I have a 4.13 backup kernel which is
> fine (otherwise I'd be a bit hosed since all my keys now require a
> TPM).

The current TCG PC Client Device Driver WG wisdom, if you don't want to 
poll the status bit, is to issue the TPM2_SelfTest with fullTest = yes.
'yes' forces the command to block (not return a response) until the 
command completes.

The TPM vendors suspect that a loop that re-issues the command in 
response to TPM_RC_TESTING can make matters worse.  If a crypto 
algorithm self test keeps getting aborted for the TPM to process the 
command, it has to restart.

My feeling is that, if you going to sleep/loop on something, it's better 
to loop on dataAvail than resending and looping on TPM_RC_TESTING.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-08 17:02                 ` James Bottomley
@ 2018-02-09 10:02                   ` Jarkko Sakkinen
  2018-02-09 10:30                     ` Nayna Jain
                                       ` (3 more replies)
  0 siblings, 4 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-09 10:02 UTC (permalink / raw)
  To: James Bottomley; +Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> There is an identified regression: the TPM driver will now periodically
> fail to attach.  However, there's no point reviewing until we agree
> what the fix is.  I was just waiting to verify this fixed my problem
> (which means seeing the messages it spits out proving the TPM has
> remained in self test).  I have now seen this and the driver still
> works, so I can submit a formal patch.

For the self-test the duration falls down to 2 seconds as the specs do
not contain any well-defined duration for it, or at least I haven't
found it.

I see three alternative ways the fix the self-test:

1. Execute self-test with fullTest = YES.
2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
   started. Issue a warning on time-out. Check for this flag in
   tpm_transmit_cmd() and tpm_write() and resend self-test command
   if the flag is stil test before the actual command. Return -EBUSY
   and print a warning if self-test is still active.
3. Increase the duration to the "correct" value if we have one.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 10:02                   ` Jarkko Sakkinen
@ 2018-02-09 10:30                     ` Nayna Jain
  2018-02-15 12:00                       ` Jarkko Sakkinen
  2018-02-09 11:47                     ` Alexander Steffen
                                       ` (2 subsequent siblings)
  3 siblings, 1 reply; 42+ messages in thread
From: Nayna Jain @ 2018-02-09 10:30 UTC (permalink / raw)
  To: Jarkko Sakkinen, James Bottomley
  Cc: Jason Gunthorpe, Paul Menzel, linux-integrity



On 02/09/2018 03:32 PM, Jarkko Sakkinen wrote:
> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
>> There is an identified regression: the TPM driver will now periodically
>> fail to attach.  However, there's no point reviewing until we agree
>> what the fix is.  I was just waiting to verify this fixed my problem
>> (which means seeing the messages it spits out proving the TPM has
>> remained in self test).  I have now seen this and the driver still
>> works, so I can submit a formal patch.
> For the self-test the duration falls down to 2 seconds as the specs do
> not contain any well-defined duration for it, or at least I haven't
> found it.
>
> I see three alternative ways the fix the self-test:
>
> 1. Execute self-test with fullTest = YES.
> 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
>     started. Issue a warning on time-out. Check for this flag in
>     tpm_transmit_cmd() and tpm_write() and resend self-test command
>     if the flag is stil test before the actual command. Return -EBUSY
>     and print a warning if self-test is still active.
> 3. Increase the duration to the "correct" value if we have one.
>
> /Jarkko

IIUC, I see the duration and timeout specified in
TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision:
Section 5.5.1.3 Command Timing: Table 15

For selftest it is given as:

Command                                      |  Duration[ms]    | 
Timeout[ms]

TPM2_SelfTest(fullTest=YES)                 1000 2000
TPM2_SelfTest(fullTest=NO) 20                        750

With that I think, TPM is expected to complete the selftest in max 2000 ms.

Thanks & Regards,
      - Nayna

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 10:02                   ` Jarkko Sakkinen
  2018-02-09 10:30                     ` Nayna Jain
@ 2018-02-09 11:47                     ` Alexander Steffen
  2018-02-15 12:12                       ` Jarkko Sakkinen
  2018-02-09 12:26                     ` Mimi Zohar
  2018-02-09 16:18                     ` James Bottomley
  3 siblings, 1 reply; 42+ messages in thread
From: Alexander Steffen @ 2018-02-09 11:47 UTC (permalink / raw)
  To: Jarkko Sakkinen, James Bottomley
  Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On 09.02.2018 11:02, Jarkko Sakkinen wrote:
> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
>> There is an identified regression: the TPM driver will now periodically
>> fail to attach.  However, there's no point reviewing until we agree
>> what the fix is.  I was just waiting to verify this fixed my problem
>> (which means seeing the messages it spits out proving the TPM has
>> remained in self test).  I have now seen this and the driver still
>> works, so I can submit a formal patch.
> 
> For the self-test the duration falls down to 2 seconds as the specs do
> not contain any well-defined duration for it, or at least I haven't
> found it.
> 
> I see three alternative ways the fix the self-test:
> 
> 1. Execute self-test with fullTest = YES.

I had proposed some fixes in this direction last year:
https://patchwork.kernel.org/patch/10105483/
https://patchwork.kernel.org/patch/10130535/

Those combine the fast test execution with fullTest = NO for 
spec-compliant TPMs with a fallback to fullTest = YES.

> 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
>     started. Issue a warning on time-out. Check for this flag in
>     tpm_transmit_cmd() and tpm_write() and resend self-test command
>     if the flag is stil test before the actual command. Return -EBUSY
>     and print a warning if self-test is still active.
> 3. Increase the duration to the "correct" value if we have one.

What about option #4? Just not calling TPM2_SelfTest at all.

I had already proposed that solution in 
https://www.spinics.net/lists/linux-integrity/msg01007.html, but did not 
get much feedback. According to the specification, that won't lose any 
test coverage ("If a command requires use of an untested algorithm or 
functional module, the TPM performs the test and then completes the 
command actions." 
(https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-1-Architecture-01.38.pdf 
chapter 12.3, page 83/59).) and will definitely be the best option in 
terms of performance and complexity of the driver code.

Alexander

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 10:02                   ` Jarkko Sakkinen
  2018-02-09 10:30                     ` Nayna Jain
  2018-02-09 11:47                     ` Alexander Steffen
@ 2018-02-09 12:26                     ` Mimi Zohar
  2018-02-09 16:23                       ` James Bottomley
  2018-02-09 16:18                     ` James Bottomley
  3 siblings, 1 reply; 42+ messages in thread
From: Mimi Zohar @ 2018-02-09 12:26 UTC (permalink / raw)
  To: Jarkko Sakkinen, James Bottomley
  Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, 2018-02-09 at 12:02 +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > There is an identified regression: the TPM driver will now periodically
> > fail to attach.  However, there's no point reviewing until we agree
> > what the fix is.  I was just waiting to verify this fixed my problem
> > (which means seeing the messages it spits out proving the TPM has
> > remained in self test).  I have now seen this and the driver still
> > works, so I can submit a formal patch.
> 
> For the self-test the duration falls down to 2 seconds as the specs do
> not contain any well-defined duration for it, or at least I haven't
> found it.
> 
> I see three alternative ways the fix the self-test:
> 
> 1. Execute self-test with fullTest = YES.
> 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
>    started. Issue a warning on time-out. Check for this flag in
>    tpm_transmit_cmd() and tpm_write() and resend self-test command
>    if the flag is stil test before the actual command. Return -EBUSY
>    and print a warning if self-test is still active.
> 3. Increase the duration to the "correct" value if we have one.

Please take into account that the TPM must complete initialization
BEFORE IMA looks for the TPM, otherwise IMA goes into TPM-bypass mode.
 This consistently happens with a full self-test (option 1).
Increasing the wait duration will most likely cause this to happen as
well (option 2).

Based on James' patch description, which says "There are various
theories that resending the self-test command actually causes the
tests to restart and thus triggers more TPM_RC_TESTING returns until
the timeout is exceeded", I think IMA's detecting the TPM, by doing a
PCR read, will cause the self-test to restart (option 2).

Reverting the patch that enabled the TPM full self-test, or commenting
out self-test completely, allows IMA to properly find the TPM.

Side note, if the kernel emits TPM initialization warnings, there
should be a successfully completed message as well.

Mimi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 10:02                   ` Jarkko Sakkinen
                                       ` (2 preceding siblings ...)
  2018-02-09 12:26                     ` Mimi Zohar
@ 2018-02-09 16:18                     ` James Bottomley
  3 siblings, 0 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-09 16:18 UTC (permalink / raw)
  To: Jarkko Sakkinen; +Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, 2018-02-09 at 12:02 +0200, Jarkko Sakkinen wrote:
> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > 
> > There is an identified regression: the TPM driver will now
> > periodically
> > fail to attach.  However, there's no point reviewing until we agree
> > what the fix is.  I was just waiting to verify this fixed my
> > problem
> > (which means seeing the messages it spits out proving the TPM has
> > remained in self test).  I have now seen this and the driver still
> > works, so I can submit a formal patch.
> 
> For the self-test the duration falls down to 2 seconds as the specs
> do
> not contain any well-defined duration for it, or at least I haven't
> found it.
> 
> I see three alternative ways the fix the self-test:
> 
> 1. Execute self-test with fullTest = YES.
> 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
>    started. Issue a warning on time-out. Check for this flag in
>    tpm_transmit_cmd() and tpm_write() and resend self-test command
>    if the flag is stil test before the actual command. Return -EBUSY
>    and print a warning if self-test is still active.
> 3. Increase the duration to the "correct" value if we have one.

Actually, I disagree.  The first principle we got out of the discussion
was don't re-send the selftest command if the TPM says it's still
running self tests, so we definitely need only to send it once.

I think if the TPM returns returns RC_TESTING we continue on booting
and let it run selftest in the background.  We don't need a flag
because it will return RC_TESTING to any command that tries to exercise
a system under test.  So all we need to do is look for that return,
pause and retry up to a timeout.  If you look at the patch I submitted:

https://marc.info/?l=linux-integrity&m=151812288917753

That's pretty much what it does.

I really think adding more delay to boot up to try and determine when
the selftests "finish" is the wrong way to do this given that data from
the XPS-13 confirms this is somewhere above 2s, which is already a huge
boot wait.

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 12:26                     ` Mimi Zohar
@ 2018-02-09 16:23                       ` James Bottomley
  2018-02-09 21:23                         ` Mimi Zohar
  2018-04-08 18:27                         ` Ken Goldman
  0 siblings, 2 replies; 42+ messages in thread
From: James Bottomley @ 2018-02-09 16:23 UTC (permalink / raw)
  To: Mimi Zohar, Jarkko Sakkinen; +Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, 2018-02-09 at 07:26 -0500, Mimi Zohar wrote:
> On Fri, 2018-02-09 at 12:02 +0200, Jarkko Sakkinen wrote:
> > 
> > On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > > 
> > > There is an identified regression: the TPM driver will now
> > > periodically
> > > fail to attach.  However, there's no point reviewing until we
> > > agree
> > > what the fix is.  I was just waiting to verify this fixed my
> > > problem
> > > (which means seeing the messages it spits out proving the TPM has
> > > remained in self test).  I have now seen this and the driver
> > > still
> > > works, so I can submit a formal patch.
> > 
> > For the self-test the duration falls down to 2 seconds as the specs
> > do
> > not contain any well-defined duration for it, or at least I haven't
> > found it.
> > 
> > I see three alternative ways the fix the self-test:
> > 
> > 1. Execute self-test with fullTest = YES.
> > 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
> >    started. Issue a warning on time-out. Check for this flag in
> >    tpm_transmit_cmd() and tpm_write() and resend self-test command
> >    if the flag is stil test before the actual command. Return
> > -EBUSY
> >    and print a warning if self-test is still active.
> > 3. Increase the duration to the "correct" value if we have one.
> 
> Please take into account that the TPM must complete initialization
> BEFORE IMA looks for the TPM, otherwise IMA goes into TPM-bypass
> mode.  This consistently happens with a full self-test (option 1).
> Increasing the wait duration will most likely cause this to happen as
> well (option 2).
> 
> Based on James' patch description, which says "There are various
> theories that resending the self-test command actually causes the
> tests to restart and thus triggers more TPM_RC_TESTING returns until
> the timeout is exceeded", I think IMA's detecting the TPM, by doing a
> PCR read, will cause the self-test to restart (option 2).

Actually, no, only doing another selftest seems to restart.  Any other
command will either get a normal return (if the TPM has already tested
the subsystem) or an RC_TESTING return indicate it's still being
tested.  It seems that the PCRs are one of the earliest things to come
on-line, so IMA always works.  This is on my current laptop where I'm
running this patch with IMA: I no longer see the IMA bypass message and
IMA comes up normally.

The delay loop in transmit is instrumented, so I never see the TPM
return RC_TESTING to a PCR read even if it comes out of selftest as
RC_TESTING (I'll keep looking, though, it's only been a few days).

> Reverting the patch that enabled the TPM full self-test, or
> commenting out self-test completely, allows IMA to properly find the
> TPM.
> 
> Side note, if the kernel emits TPM initialization warnings, there
> should be a successfully completed message as well.

It would actually be really nice if we printed TPM identification
information as well (having just had to go over a load of systems and
answer the question what TPM do they have ...)

James

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 16:23                       ` James Bottomley
@ 2018-02-09 21:23                         ` Mimi Zohar
  2018-04-08 18:27                         ` Ken Goldman
  1 sibling, 0 replies; 42+ messages in thread
From: Mimi Zohar @ 2018-02-09 21:23 UTC (permalink / raw)
  To: James Bottomley, Jarkko Sakkinen
  Cc: Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, 2018-02-09 at 08:23 -0800, James Bottomley wrote:
> On Fri, 2018-02-09 at 07:26 -0500, Mimi Zohar wrote:
> > On Fri, 2018-02-09 at 12:02 +0200, Jarkko Sakkinen wrote:
> > > 
> > > On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > > > 
> > > > There is an identified regression: the TPM driver will now
> > > > periodically
> > > > fail to attach.  However, there's no point reviewing until we
> > > > agree
> > > > what the fix is.  I was just waiting to verify this fixed my
> > > > problem
> > > > (which means seeing the messages it spits out proving the TPM has
> > > > remained in self test).  I have now seen this and the driver
> > > > still
> > > > works, so I can submit a formal patch.
> > > 
> > > For the self-test the duration falls down to 2 seconds as the specs
> > > do
> > > not contain any well-defined duration for it, or at least I haven't
> > > found it.
> > > 
> > > I see three alternative ways the fix the self-test:
> > > 
> > > 1. Execute self-test with fullTest = YES.
> > > 2. Have a flag TPM_CHIP_TESTING that is set when the self-test is
> > >    started. Issue a warning on time-out. Check for this flag in
> > >    tpm_transmit_cmd() and tpm_write() and resend self-test command
> > >    if the flag is stil test before the actual command. Return
> > > -EBUSY
> > >    and print a warning if self-test is still active.
> > > 3. Increase the duration to the "correct" value if we have one.
> > 
> > Please take into account that the TPM must complete initialization
> > BEFORE IMA looks for the TPM, otherwise IMA goes into TPM-bypass
> > mode.  This consistently happens with a full self-test (option 1).
> > Increasing the wait duration will most likely cause this to happen as
> > well (option 2).
> > 
> > Based on James' patch description, which says "There are various
> > theories that resending the self-test command actually causes the
> > tests to restart and thus triggers more TPM_RC_TESTING returns until
> > the timeout is exceeded", I think IMA's detecting the TPM, by doing a
> > PCR read, will cause the self-test to restart (option 2).
> 
> Actually, no, only doing another selftest seems to restart.  Any other
> command will either get a normal return (if the TPM has already tested
> the subsystem) or an RC_TESTING return indicate it's still being
> tested.

Jarkko's option 2 suggested resending the self-test, thus my comment.

> It seems that the PCRs are one of the earliest things to come
> on-line, so IMA always works.  This is on my current laptop where I'm
> running this patch with IMA: I no longer see the IMA bypass message and
> IMA comes up normally.

Right, as long as the TPM isn't doing a full self-test.

Mimi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 10:30                     ` Nayna Jain
@ 2018-02-15 12:00                       ` Jarkko Sakkinen
  0 siblings, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-15 12:00 UTC (permalink / raw)
  To: Nayna Jain; +Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, Feb 09, 2018 at 04:00:44PM +0530, Nayna Jain wrote:
> IIUC, I see the duration and timeout specified in
> TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision:
> Section 5.5.1.3 Command Timing: Table 15
>
> For selftest it is given as:
>
> Command                                      |  Duration[ms]    |
> Timeout[ms]
>
> TPM2_SelfTest(fullTest=YES)                 1000 2000
> TPM2_SelfTest(fullTest=NO) 20                        750
>
> With that I think, TPM is expected to complete the selftest in max 2000 ms.

Thanks. So it looks that some TPMs are not spec compliant.

> Thanks & Regards,
>      - Nayna

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 11:47                     ` Alexander Steffen
@ 2018-02-15 12:12                       ` Jarkko Sakkinen
  2018-02-15 15:13                         ` Mimi Zohar
  2018-02-16 18:27                         ` Alexander Steffen
  0 siblings, 2 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-15 12:12 UTC (permalink / raw)
  To: Alexander Steffen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
> On 09.02.2018 11:02, Jarkko Sakkinen wrote:
> > On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > > There is an identified regression: the TPM driver will now periodically
> > > fail to attach.  However, there's no point reviewing until we agree
> > > what the fix is.  I was just waiting to verify this fixed my problem
> > > (which means seeing the messages it spits out proving the TPM has
> > > remained in self test).  I have now seen this and the driver still
> > > works, so I can submit a formal patch.
> > 
> > For the self-test the duration falls down to 2 seconds as the specs do
> > not contain any well-defined duration for it, or at least I haven't
> > found it.
> > 
> > I see three alternative ways the fix the self-test:
> > 
> > 1. Execute self-test with fullTest = YES.
> 
> I had proposed some fixes in this direction last year:
> https://patchwork.kernel.org/patch/10105483/
> https://patchwork.kernel.org/patch/10130535/
> 
> Those combine the fast test execution with fullTest = NO for spec-compliant
> TPMs with a fallback to fullTest = YES.

The first was accepted.

The 2nd wasn't accpeted mainly for reasons that for me only acceptable
dependency is:

1. Patch that is part of the same patch set.
2. A merged commit.

I didn't event look at the code for the second one at that point because
it was formally done wrong.

What it is doing would be acceptable for me. I still think that TPM
should be fully tested before letting IMA for example to use it.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-15 12:12                       ` Jarkko Sakkinen
@ 2018-02-15 15:13                         ` Mimi Zohar
  2018-02-16 18:30                           ` Alexander Steffen
  2018-02-16 18:27                         ` Alexander Steffen
  1 sibling, 1 reply; 42+ messages in thread
From: Mimi Zohar @ 2018-02-15 15:13 UTC (permalink / raw)
  To: Jarkko Sakkinen, Alexander Steffen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On Thu, 2018-02-15 at 14:12 +0200, Jarkko Sakkinen wrote:
> On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
> > On 09.02.2018 11:02, Jarkko Sakkinen wrote:
> > > On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
> > > > There is an identified regression: the TPM driver will now periodically
> > > > fail to attach.  However, there's no point reviewing until we agree
> > > > what the fix is.  I was just waiting to verify this fixed my problem
> > > > (which means seeing the messages it spits out proving the TPM has
> > > > remained in self test).  I have now seen this and the driver still
> > > > works, so I can submit a formal patch.
> > > 
> > > For the self-test the duration falls down to 2 seconds as the specs do
> > > not contain any well-defined duration for it, or at least I haven't
> > > found it.
> > > 
> > > I see three alternative ways the fix the self-test:
> > > 
> > > 1. Execute self-test with fullTest = YES.
> > 
> > I had proposed some fixes in this direction last year:
> > https://patchwork.kernel.org/patch/10105483/
> > https://patchwork.kernel.org/patch/10130535/
> > 
> > Those combine the fast test execution with fullTest = NO for spec-compliant
> > TPMs with a fallback to fullTest = YES.
> 
> The first was accepted.
> 
> The 2nd wasn't accpeted mainly for reasons that for me only acceptable
> dependency is:
> 
> 1. Patch that is part of the same patch set.
> 2. A merged commit.
> 
> I didn't event look at the code for the second one at that point because
> it was formally done wrong.
> 
> What it is doing would be acceptable for me. I still think that TPM
> should be fully tested before letting IMA for example to use it.

Why?  The short selftest has worked fine up to now.  The full selftest
delays the TPM way too long and causes IMA to go into TPM-bypass mode.
 The faster the TPM is available, the better for IMA.

It seems all commands, except selftest, the code sleeps in a loop and
checks for the command to finish, but doesn't resend the command.
 Only for selftest is the command resent, instead of just waiting for
it to complete.  Is there any reason for this?

Mimi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-15 12:12                       ` Jarkko Sakkinen
  2018-02-15 15:13                         ` Mimi Zohar
@ 2018-02-16 18:27                         ` Alexander Steffen
  2018-02-20 13:05                           ` Jarkko Sakkinen
  1 sibling, 1 reply; 42+ messages in thread
From: Alexander Steffen @ 2018-02-16 18:27 UTC (permalink / raw)
  To: Jarkko Sakkinen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On 15.02.2018 13:12, Jarkko Sakkinen wrote:
> On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
>> On 09.02.2018 11:02, Jarkko Sakkinen wrote:
>>> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
>>>> There is an identified regression: the TPM driver will now periodically
>>>> fail to attach.  However, there's no point reviewing until we agree
>>>> what the fix is.  I was just waiting to verify this fixed my problem
>>>> (which means seeing the messages it spits out proving the TPM has
>>>> remained in self test).  I have now seen this and the driver still
>>>> works, so I can submit a formal patch.
>>>
>>> For the self-test the duration falls down to 2 seconds as the specs do
>>> not contain any well-defined duration for it, or at least I haven't
>>> found it.
>>>
>>> I see three alternative ways the fix the self-test:
>>>
>>> 1. Execute self-test with fullTest = YES.
>>
>> I had proposed some fixes in this direction last year:
>> https://patchwork.kernel.org/patch/10105483/
>> https://patchwork.kernel.org/patch/10130535/
>>
>> Those combine the fast test execution with fullTest = NO for spec-compliant
>> TPMs with a fallback to fullTest = YES.
> 
> The first was accepted.
> 
> The 2nd wasn't accpeted mainly for reasons that for me only acceptable
> dependency is:
> 
> 1. Patch that is part of the same patch set.
> 2. A merged commit.
> 
> I didn't event look at the code for the second one at that point because
> it was formally done wrong.

Ah, sorry, I thought this was the easiest solution, since it seemed 
likely that the first patch would be merged at some point.

If a similar situation arises, should I then include the first patch in 
a series together with the second, even if that means that there will be 
two identical copies of the first patch (one from when it was first 
published, one as part of the new series)? Or should I just avoid the 
dependency in the second patch, though that will lead to merge conflicts 
when you want accept both patches?

Alexander

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-15 15:13                         ` Mimi Zohar
@ 2018-02-16 18:30                           ` Alexander Steffen
  2018-02-19  9:15                             ` Nayna Jain
  0 siblings, 1 reply; 42+ messages in thread
From: Alexander Steffen @ 2018-02-16 18:30 UTC (permalink / raw)
  To: Mimi Zohar, Jarkko Sakkinen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On 15.02.2018 16:13, Mimi Zohar wrote:
> On Thu, 2018-02-15 at 14:12 +0200, Jarkko Sakkinen wrote:
>> On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
>>> On 09.02.2018 11:02, Jarkko Sakkinen wrote:
>>>> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
>>>>> There is an identified regression: the TPM driver will now periodically
>>>>> fail to attach.  However, there's no point reviewing until we agree
>>>>> what the fix is.  I was just waiting to verify this fixed my problem
>>>>> (which means seeing the messages it spits out proving the TPM has
>>>>> remained in self test).  I have now seen this and the driver still
>>>>> works, so I can submit a formal patch.
>>>>
>>>> For the self-test the duration falls down to 2 seconds as the specs do
>>>> not contain any well-defined duration for it, or at least I haven't
>>>> found it.
>>>>
>>>> I see three alternative ways the fix the self-test:
>>>>
>>>> 1. Execute self-test with fullTest = YES.
>>>
>>> I had proposed some fixes in this direction last year:
>>> https://patchwork.kernel.org/patch/10105483/
>>> https://patchwork.kernel.org/patch/10130535/
>>>
>>> Those combine the fast test execution with fullTest = NO for spec-compliant
>>> TPMs with a fallback to fullTest = YES.
>>
>> The first was accepted.
>>
>> The 2nd wasn't accpeted mainly for reasons that for me only acceptable
>> dependency is:
>>
>> 1. Patch that is part of the same patch set.
>> 2. A merged commit.
>>
>> I didn't event look at the code for the second one at that point because
>> it was formally done wrong.
>>
>> What it is doing would be acceptable for me. I still think that TPM
>> should be fully tested before letting IMA for example to use it.
> 
> Why?  The short selftest has worked fine up to now.  The full selftest
> delays the TPM way too long and causes IMA to go into TPM-bypass mode.
>   The faster the TPM is available, the better for IMA.

I assume that IMA can deal with the TPM not returning RC_SUCCESS for a 
command, right? Then, if speed is the goal, not explicitly triggering 
any selftests and instead letting the TPM handle them internally still 
seems like the best solution. If the user needs a fully tested TPM for 
some reason, he can trigger the selftests himself. But the kernel cannot 
at the same time guarantee that all selftests pass (especially on 
broken/slow devices that take forever to complete them) and make the TPM 
available as early as possible.

I think we had a similar example elsewhere already: The kernel does not 
check all SMART results before accessing the hard disk, why should it do 
that for the TPM?

> It seems all commands, except selftest, the code sleeps in a loop and
> checks for the command to finish, but doesn't resend the command.
>   Only for selftest is the command resent, instead of just waiting for
> it to complete.  Is there any reason for this?

The selftest command is special in that the vendor can choose to have it 
execute its actions synchronously (like all other commands) or 
asynchronously (that is, return immediately and schedule the actions in 
the background). Only in the asynchronous case is the command repeated, 
as a simple way of querying the result. Since we call the command with 
fullTest=NO, only missing selftests are scheduled and with each 
iteration there should be fewer and fewer tests left to be executed, 
similar to how RC_YIELDED works.

Alexander

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-16 18:30                           ` Alexander Steffen
@ 2018-02-19  9:15                             ` Nayna Jain
  2018-02-19 22:26                               ` Jason Gunthorpe
  0 siblings, 1 reply; 42+ messages in thread
From: Nayna Jain @ 2018-02-19  9:15 UTC (permalink / raw)
  To: Alexander Steffen, Mimi Zohar, Jarkko Sakkinen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity



On 02/17/2018 12:00 AM, Alexander Steffen wrote:
> On 15.02.2018 16:13, Mimi Zohar wrote:
>> On Thu, 2018-02-15 at 14:12 +0200, Jarkko Sakkinen wrote:
>>> On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
>>>> On 09.02.2018 11:02, Jarkko Sakkinen wrote:
>>>>> On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley wrote:
>>>>>> There is an identified regression: the TPM driver will now 
>>>>>> periodically
>>>>>> fail to attach.  However, there's no point reviewing until we agree
>>>>>> what the fix is.  I was just waiting to verify this fixed my problem
>>>>>> (which means seeing the messages it spits out proving the TPM has
>>>>>> remained in self test).  I have now seen this and the driver still
>>>>>> works, so I can submit a formal patch.
>>>>>
>>>>> For the self-test the duration falls down to 2 seconds as the 
>>>>> specs do
>>>>> not contain any well-defined duration for it, or at least I haven't
>>>>> found it.
>>>>>
>>>>> I see three alternative ways the fix the self-test:
>>>>>
>>>>> 1. Execute self-test with fullTest = YES.
>>>>
>>>> I had proposed some fixes in this direction last year:
>>>> https://patchwork.kernel.org/patch/10105483/
>>>> https://patchwork.kernel.org/patch/10130535/
>>>>
>>>> Those combine the fast test execution with fullTest = NO for 
>>>> spec-compliant
>>>> TPMs with a fallback to fullTest = YES.
>>>
>>> The first was accepted.
>>>
>>> The 2nd wasn't accpeted mainly for reasons that for me only acceptable
>>> dependency is:
>>>
>>> 1. Patch that is part of the same patch set.
>>> 2. A merged commit.
>>>
>>> I didn't event look at the code for the second one at that point 
>>> because
>>> it was formally done wrong.
>>>
>>> What it is doing would be acceptable for me. I still think that TPM
>>> should be fully tested before letting IMA for example to use it.
>>
>> Why?  The short selftest has worked fine up to now.  The full selftest
>> delays the TPM way too long and causes IMA to go into TPM-bypass mode.
>>   The faster the TPM is available, the better for IMA.
>
> I assume that IMA can deal with the TPM not returning RC_SUCCESS for a 
> command, right? Then, if speed is the goal, not explicitly triggering 
> any selftests and instead letting the TPM handle them internally still 
> seems like the best solution. If the user needs a fully tested TPM for 
> some reason, he can trigger the selftests himself. But the kernel 
> cannot at the same time guarantee that all selftests pass (especially 
> on broken/slow devices that take forever to complete them) and make 
> the TPM available as early as possible.
>
> I think we had a similar example elsewhere already: The kernel does 
> not check all SMART results before accessing the hard disk, why should 
> it do that for the TPM?
>
>> It seems all commands, except selftest, the code sleeps in a loop and
>> checks for the command to finish, but doesn't resend the command.
>>   Only for selftest is the command resent, instead of just waiting for
>> it to complete.  Is there any reason for this?
>
> The selftest command is special in that the vendor can choose to have 
> it execute its actions synchronously (like all other commands) or 
> asynchronously (that is, return immediately and schedule the actions 
> in the background). Only in the asynchronous case is the command 
> repeated, as a simple way of querying the result. Since we call the 
> command with fullTest=NO, only missing selftests are scheduled and 
> with each iteration there should be fewer and fewer tests left to be 
> executed, similar to how RC_YIELDED works.
To query for the result, it need not send self test command again, I 
think we can use TPM2_GetTestResult().
According to Spec, TPM Rev 2.0 Part 3 - Commands , Section 10.4, the 
command will return the following possible test results.

TPM_RC_NEEDS_TEST - SelfTest has not been executed.
TPM_RC_TESTING - Tests are still in progress
TPM_RC_FAILURE - Tests Failure.
TPM_RC_SUCCESS - Tests successfully completed.

Thanks & Regards,
     - Nayna

>
> Alexander
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-19  9:15                             ` Nayna Jain
@ 2018-02-19 22:26                               ` Jason Gunthorpe
  0 siblings, 0 replies; 42+ messages in thread
From: Jason Gunthorpe @ 2018-02-19 22:26 UTC (permalink / raw)
  To: Nayna Jain
  Cc: Alexander Steffen, Mimi Zohar, Jarkko Sakkinen, James Bottomley,
	Paul Menzel, linux-integrity

On Mon, Feb 19, 2018 at 02:45:31PM +0530, Nayna Jain wrote:

> To query for the result, it need not send self test command again, I think
> we can use TPM2_GetTestResult().
> According to Spec, TPM Rev 2.0 Part 3 - Commands , Section 10.4, the command
> will return the following possible test results.
> 
> TPM_RC_NEEDS_TEST - SelfTest has not been executed.
> TPM_RC_TESTING - Tests are still in progress
> TPM_RC_FAILURE - Tests Failure.
> TPM_RC_SUCCESS - Tests successfully completed.

That sounds like a great way to fix this TPM without creating a whole
new flow..

TPM1 doesn't have this command which is why the self test loop was
written with repeated self test, it just got copied that way when
someone implemented the same thing for TPM2..

Jason>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-16 18:27                         ` Alexander Steffen
@ 2018-02-20 13:05                           ` Jarkko Sakkinen
  0 siblings, 0 replies; 42+ messages in thread
From: Jarkko Sakkinen @ 2018-02-20 13:05 UTC (permalink / raw)
  To: Alexander Steffen
  Cc: James Bottomley, Jason Gunthorpe, Paul Menzel, linux-integrity

On Fri, 2018-02-16 at 19:27 +0100, Alexander Steffen wrote:
> On 15.02.2018 13:12, Jarkko Sakkinen wrote:
> > On Fri, Feb 09, 2018 at 12:47:10PM +0100, Alexander Steffen wrote:
> > > On 09.02.2018 11:02, Jarkko Sakkinen wrote:
> > > > On Thu, Feb 08, 2018 at 09:02:00AM -0800, James Bottomley
> > > > wrote:
> > > > > There is an identified regression: the TPM driver will now
> > > > > periodically
> > > > > fail to attach.  However, there's no point reviewing until we
> > > > > agree
> > > > > what the fix is.  I was just waiting to verify this fixed my
> > > > > problem
> > > > > (which means seeing the messages it spits out proving the TPM
> > > > > has
> > > > > remained in self test).  I have now seen this and the driver
> > > > > still
> > > > > works, so I can submit a formal patch.
> > > > 
> > > > For the self-test the duration falls down to 2 seconds as the
> > > > specs do
> > > > not contain any well-defined duration for it, or at least I
> > > > haven't
> > > > found it.
> > > > 
> > > > I see three alternative ways the fix the self-test:
> > > > 
> > > > 1. Execute self-test with fullTest = YES.
> > > 
> > > I had proposed some fixes in this direction last year:
> > > https://patchwork.kernel.org/patch/10105483/
> > > https://patchwork.kernel.org/patch/10130535/
> > > 
> > > Those combine the fast test execution with fullTest = NO for
> > > spec-compliant
> > > TPMs with a fallback to fullTest = YES.
> > 
> > The first was accepted.
> > 
> > The 2nd wasn't accpeted mainly for reasons that for me only
> > acceptable
> > dependency is:
> > 
> > 1. Patch that is part of the same patch set.
> > 2. A merged commit.
> > 
> > I didn't event look at the code for the second one at that point
> > because
> > it was formally done wrong.
> 
> Ah, sorry, I thought this was the easiest solution, since it seemed 
> likely that the first patch would be merged at some point.
> 
> If a similar situation arises, should I then include the first patch
> in 
> a series together with the second, even if that means that there will
> be 
> two identical copies of the first patch (one from when it was first 
> published, one as part of the new series)? Or should I just avoid
> the 
> dependency in the second patch, though that will lead to merge
> conflicts 
> when you want accept both patches?
> 
> Alexander

Yes, it would be best to include all patches to the patch set that
have not yet been merged in order to make it self-contained and
easy test and review.

/Jarkko

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: TPM selftest failure in 4.15
  2018-02-09 16:23                       ` James Bottomley
  2018-02-09 21:23                         ` Mimi Zohar
@ 2018-04-08 18:27                         ` Ken Goldman
  1 sibling, 0 replies; 42+ messages in thread
From: Ken Goldman @ 2018-04-08 18:27 UTC (permalink / raw)
  Cc: linux-integrity

On 2/9/2018 11:23 AM, James Bottomley wrote:
> On Fri, 2018-02-09 at 07:26 -0500, Mimi Zohar wrote:
>> On Fri, 2018-02-09 at 12:02 +0200, Jarkko Sakkinen wrote:

> It seems that the PCRs are one of the earliest things to come
> on-line, so IMA always works.  This is on my current laptop where I'm
> running this patch with IMA: I no longer see the IMA bypass message and
> IMA comes up normally.
> 

I would expect PCRs to be available by the time IMA needs them because 
the firmware also extends to PCRs.  Therefore, the hash algorithms will 
be tested very early in boot.

I only worry about the resume from suspend case, where firmware won't be
doing extends.

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2018-04-08 18:27 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-02-01 12:16 TPM selftest failure in 4.15 James Bottomley
2018-02-01 12:21 ` Paul Menzel
2018-02-01 12:42   ` James Bottomley
2018-02-01 15:24     ` James Bottomley
2018-02-01 17:40       ` Jason Gunthorpe
2018-02-01 18:46         ` James Bottomley
2018-02-01 18:59           ` Jason Gunthorpe
2018-02-01 20:00             ` James Bottomley
2018-02-01 20:35               ` Jason Gunthorpe
2018-02-01 21:06                 ` James Bottomley
2018-02-08 13:10               ` Jarkko Sakkinen
2018-02-08 17:02                 ` James Bottomley
2018-02-09 10:02                   ` Jarkko Sakkinen
2018-02-09 10:30                     ` Nayna Jain
2018-02-15 12:00                       ` Jarkko Sakkinen
2018-02-09 11:47                     ` Alexander Steffen
2018-02-15 12:12                       ` Jarkko Sakkinen
2018-02-15 15:13                         ` Mimi Zohar
2018-02-16 18:30                           ` Alexander Steffen
2018-02-19  9:15                             ` Nayna Jain
2018-02-19 22:26                               ` Jason Gunthorpe
2018-02-16 18:27                         ` Alexander Steffen
2018-02-20 13:05                           ` Jarkko Sakkinen
2018-02-09 12:26                     ` Mimi Zohar
2018-02-09 16:23                       ` James Bottomley
2018-02-09 21:23                         ` Mimi Zohar
2018-04-08 18:27                         ` Ken Goldman
2018-02-09 16:18                     ` James Bottomley
2018-02-08 17:27         ` Ken Goldman
2018-02-01 19:16       ` TPM selftest failure in 4.15 (Dell XPS 13, Nuvoton 6xx) Paul Menzel
2018-02-01 19:17         ` Paul Menzel
2018-02-01 20:12           ` Mario.Limonciello
2018-02-01 21:06             ` Mario.Limonciello
2018-02-01 22:22               ` Jason Gunthorpe
2018-02-02  5:46                 ` James Bottomley
2018-02-02  5:46             ` James Bottomley
2018-02-08 16:53             ` Ken Goldman
2018-02-08 13:18         ` Jarkko Sakkinen
2018-02-08 13:05       ` TPM selftest failure in 4.15 Jarkko Sakkinen
2018-02-08 13:03     ` Jarkko Sakkinen
2018-02-08 12:49 ` Jarkko Sakkinen
2018-02-08 18:45 ` Ken Goldman

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.