* [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
@ 2017-12-06 12:34 ` Paul Menzel
  0 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-06 12:34 UTC (permalink / raw)
  To: linux-integrity; +Cc: linux-kernel

Dear Linux folks,


With Linux 4.15-rc2 built by the Ubuntu Kernel Team, the error messages 
below are shown by the Linux kernel. These are new.

```
Dez 06 13:22:24 Ixpees kernel: microcode: microcode updated early to 
revision 0x62, date = 2017-04-27
Dez 06 13:22:24 Ixpees kernel: Linux version 4.15.0-041500rc2-generic 
(kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3)) #201712031230 
SMP Sun Dec 3 17:32:03 UTC 2017
Dez 06 13:22:24 Ixpees kernel: Command line: 
BOOT_IMAGE=/vmlinuz-4.15.0-041500rc2-generic 
root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
Dez 06 13:22:24 Ixpees kernel: KERNEL supported cpus:
Dez 06 13:22:24 Ixpees kernel:   Intel GenuineIntel
Dez 06 13:22:24 Ixpees kernel:   AMD AuthenticAMD
Dez 06 13:22:24 Ixpees kernel:   Centaur CentaurHauls
Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x001: 
'x87 floating point registers'
Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x002: 
'SSE registers'
[…]
Dez 06 13:22:24 Ixpees kernel: tpm_tis MSFT0101:00: 2.0 TPM (device-id 
0xFE, rev-id 4)
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred 
continue selftest
Dez 06 13:22:24 Ixpees kernel: tpm tpm0: TPM self test failed
```

Any idea what to do about those? If these are intentional, it’d be great 
to give some hint to the user, what effect this has.


Kind regards,

Paul


* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-06 12:34 ` Paul Menzel
@ 2017-12-06 16:40   ` Jason Gunthorpe
  -1 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2017-12-06 16:40 UTC (permalink / raw)
  To: Paul Menzel; +Cc: linux-integrity, linux-kernel

On Wed, Dec 06, 2017 at 01:34:35PM +0100, Paul Menzel wrote:

> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: TPM self test failed
> 
> Any idea what to do about those? If these are intentional, it’d be great to
> give some hint to the user, what effect this has.

The TPM driver shouldn't load if self test fails, and we don't expect
self test to ever fail.

So..
1) The TPM is busted? Assuming not since you probably used an
   earlier kernel?
2) The CRB driver is no longer executing commands properly?
   My guess would be
     f5357413dbaa ("tpm/tpm_crb: Use start method value from ACPI table directly")
   borked it.

   Maybe this TPM needed one of the workarounds and the
   restructuring broke it, or maybe it was working 'accidentally' and
   got broken.
3) For some reason the core code is not handling commands correctly?
   Don't see any patches on that topic though.

Unless someone else has an idea, we may have to ask you to do a
bisection on this bug?

Jason

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-06 12:34 ` Paul Menzel
@ 2017-12-07 15:56   ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-07 15:56 UTC (permalink / raw)
  To: pmenzel, linux-integrity; +Cc: linux-kernel

> Dear Linux folks,
> 
> 
> With Linux 4.15-rc2 built by the Ubuntu Kernel Team, the error messages
> below are shown by the Linux kernel. These are new.
> 
> ```
> Dez 06 13:22:24 Ixpees kernel: microcode: microcode updated early to
> revision 0x62, date = 2017-04-27
> Dez 06 13:22:24 Ixpees kernel: Linux version 4.15.0-041500rc2-generic
> (kernel@gloin) (gcc version 7.2.0 (Ubuntu 7.2.0-8ubuntu3)) #201712031230
> SMP Sun Dec 3 17:32:03 UTC 2017
> Dez 06 13:22:24 Ixpees kernel: Command line:
> BOOT_IMAGE=/vmlinuz-4.15.0-041500rc2-generic
> root=/dev/mapper/ubuntu--vg-root ro quiet splash vt.handoff=7
> Dez 06 13:22:24 Ixpees kernel: KERNEL supported cpus:
> Dez 06 13:22:24 Ixpees kernel:   Intel GenuineIntel
> Dez 06 13:22:24 Ixpees kernel:   AMD AuthenticAMD
> Dez 06 13:22:24 Ixpees kernel:   Centaur CentaurHauls
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x001:
> 'x87 floating point registers'
> Dez 06 13:22:24 Ixpees kernel: x86/fpu: Supporting XSAVE feature 0x002:
> 'SSE registers'
> […]
> Dez 06 13:22:24 Ixpees kernel: tpm_tis MSFT0101:00: 2.0 TPM (device-id
> 0xFE, rev-id 4)
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: A TPM error (2314) occurred
> continue selftest
> Dez 06 13:22:24 Ixpees kernel: tpm tpm0: TPM self test failed
> ```
> 
> Any idea what to do about those?

The list of "A TPM error (2314) occurred continue selftest" is caused by my commit 125a2210541079e8e7c69e629ad06cabed788f8c ("tpm: React correctly to RC_TESTING from TPM 2.0 self tests"). 2314 is TPM_RC_TESTING, so the TPM tells us that self-tests are still running in the background. This problem was not visible in previous versions, since it (incorrectly) ignored TPM_RC_TESTING.

The final error "TPM self test failed" is then the driver giving up after too many TPM_RC_TESTING responses have been received.
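
For reference, the decimal value in the log can be mapped to the TPM 2.0 response code as follows. This is a minimal userspace sketch, not driver code: the constant values are taken from the TPM 2.0 specification (RC_WARN is 0x900, TPM_RC_TESTING is RC_WARN + 0x00A), while the helper names tpm_rc_is_testing and tpm_rc_print are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* TPM 2.0 response code constants (TPM 2.0 specification, Part 2). */
#define RC_WARN        0x900u
#define TPM_RC_TESTING (RC_WARN + 0x00Au)

/* The kernel logs the code in decimal: 0x90A is 2314. */
static_assert(TPM_RC_TESTING == 2314, "TPM_RC_TESTING must be 2314");

/* Hypothetical helper: nonzero if rc means "self tests still running". */
int tpm_rc_is_testing(unsigned int rc)
{
    return rc == TPM_RC_TESTING;
}

/* Hypothetical helper: prints a response code in decimal and hex. */
void tpm_rc_print(unsigned int rc)
{
    printf("rc %u = 0x%03X%s\n", rc, rc,
           tpm_rc_is_testing(rc) ? " (TPM_RC_TESTING)" : "");
}
```

Calling tpm_rc_print(2314) shows the same value in hex, which is the form used in the specification tables.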

What confuses me a bit is that according to your log all of that happens within the same second. The driver uses tpm2_calc_ordinal_duration for TPM2_CC_SELF_TEST which maps to TPM_LONG and finally to TPM2_DURATION_LONG = 2000ms = 2s. So you should see those retries for at least two seconds. But since it does not give the TPM enough time to execute the tests, you see it failing in the end.

Could you try to find out how much time exactly (in milliseconds) it takes from the first "A TPM error" to the final "TPM self test failed" message? Is it possible that tpm_msleep delays for significantly less time in this case?

Also, while looking at the code I've noticed that the retry loop is not written in the best possible way and should actually try one more time. Could you make the following change to tpm2_do_selftest in drivers/char/tpm/tpm2-cmd.c and see whether that helps? 

---
        duration = jiffies_to_msecs(
                tpm2_calc_ordinal_duration(chip, TPM2_CC_SELF_TEST));
 
-       while (duration > 0) {
+       while (1) {
                cmd.header.in = tpm2_selftest_header;
                cmd.params.selftest_in.full_test = 0;
 
                rc = tpm_transmit_cmd(chip, NULL, &cmd, TPM2_SELF_TEST_IN_SIZE,
                                      0, 0, "continue selftest");
 
-               if (rc != TPM2_RC_TESTING)
+               if (rc != TPM2_RC_TESTING || duration <= 0)
                        break;
 
                tpm_msleep(delay_msec);
---
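
To see what "try one more time" means in practice, the loop can be modelled in userspace. The sketch below is an assumption-laden model, not the driver: the TPM is treated as always answering "still testing", the budget is the 2000 ms mentioned above, the first delay is 20 ms, and the two statements that subtract the delay and double it are assumed, since they lie outside the hunk shown. count_attempts is a hypothetical name.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Userspace model of the retry loop. Each loop round stands for one
 * "continue selftest" command that the TPM answers with "still testing".
 * Returns how many commands are sent before the loop gives up.
 */
int count_attempts(long duration_ms, bool patched)
{
    long delay_ms = 20;   /* assumed initial delay */
    int attempts = 0;

    assert(duration_ms > 0);

    /* Original: "while (duration > 0)". Patched: "while (1)". */
    while (patched || duration_ms > 0) {
        attempts++;       /* stands in for tpm_transmit_cmd() */

        /* Patched: the budget is checked only after a command was sent. */
        if (patched && duration_ms <= 0)
            break;

        duration_ms -= delay_ms;  /* stands in for tpm_msleep(delay_ms) */
        delay_ms *= 2;            /* assumed: wait longer the next round */
    }
    return attempts;
}
```

Under these assumptions the original loop sends 7 commands and then sleeps 1280 ms without looking at the TPM again, while the patched loop sends an 8th command after that last sleep.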

> If these are intentional, it’d be great
> to give some hint to the user, what effect this has.

I agree that those error messages in their current form are not that helpful for the users. But they are part of the general driver architecture, and are also caused by other parts of the code (e.g. when using a TPM 1.2 that is deactivated or when the platform did not send a startup command). We should find a way to hide (or at least mark) those kinds of expected errors.

Alexander

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-07 15:56   ` Alexander.Steffen
@ 2017-12-07 18:37     ` Jason Gunthorpe
  -1 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2017-12-07 18:37 UTC (permalink / raw)
  To: Alexander.Steffen; +Cc: pmenzel, linux-integrity, linux-kernel

On Thu, Dec 07, 2017 at 03:56:07PM +0000, Alexander.Steffen@infineon.com wrote:

> > If these are intentional, it’d be great
> > to give some hint to the user, what effect this has.
> 
> I agree that those error messages in their current form are not that
> helpful for the users. But they are part of the general driver
> architecture, and are also caused by other parts of the code
> (e.g. when using a TPM 1.2 that is deactivated or when the platform
> did not send a startup command). We should find a way to hide (or at
> least mark) those kind of expected errors.

Other parts of the TPM code did suppress 'expected' errors like this,
I'm not sure if it got removed during Jarkko's cleanup though - we
need to put stuff like that back, it should not printk for something
like this.

For this, if we are waiting then it should compute an absolute time
after which it will give up.

Code it like this instead and get rid of the ugly 'duration' scheme.

ktime_t stop = ktime_add_ns(ktime_get(), [timeout in ns]);
do
{
}
while (ktime_before(ktime_get(), stop));
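
The same absolute-deadline pattern can be tried out in userspace. The following is only a sketch under stated assumptions: clock_gettime(CLOCK_MONOTONIC) stands in for ktime_get(), nanosleep() stands in for the driver's sleep helper, and every function name here (now_ns, sleep_ms, retry_until_deadline, ready_on_third_call, never_ready) is invented for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock in nanoseconds; userspace stand-in for ktime_get(). */
int64_t now_ns(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Sleeps for roughly the given number of milliseconds. */
void sleep_ms(long ms)
{
    struct timespec ts = { ms / 1000, (ms % 1000) * 1000000L };

    nanosleep(&ts, NULL);
}

/*
 * Calls try_once() until it returns true or timeout_ms has passed on the
 * monotonic clock. The delay between tries doubles, starting at 1 ms.
 * Returns true on success; *tries receives the number of calls made.
 */
bool retry_until_deadline(bool (*try_once)(void *), void *ctx,
                          long timeout_ms, int *tries)
{
    int64_t stop;
    long delay_ms = 1;

    assert(try_once != NULL && tries != NULL);

    /* Compute the absolute deadline once, before the first try. */
    stop = now_ns() + (int64_t)timeout_ms * 1000000LL;
    *tries = 0;
    do {
        (*tries)++;
        if (try_once(ctx))
            return true;
        sleep_ms(delay_ms);
        delay_ms *= 2;
    } while (now_ns() < stop);

    return false;
}

/* Example callback: reports success on the third call. */
bool ready_on_third_call(void *ctx)
{
    int *calls = ctx;

    return ++*calls >= 3;
}

/* Example callback: never reports success. */
bool never_ready(void *ctx)
{
    (void)ctx;
    return false;
}
```

Because the deadline is compared against the clock rather than against a sum of requested delays, a sleep that returns early or late does not change how long the loop keeps trying. The growing delay between tries can still be kept, as it is here.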

Jason

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-07 18:37     ` Jason Gunthorpe
@ 2017-12-08 12:14       ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-08 12:14 UTC (permalink / raw)
  To: jgg; +Cc: pmenzel, linux-integrity, linux-kernel

> On Thu, Dec 07, 2017 at 03:56:07PM +0000, Alexander.Steffen@infineon.com
> wrote:
> 
> > > If these are intentional, it’d be great
> > > to give some hint to the user, what effect this has.
> >
> > I agree that those error messages in their current form are not that
> > helpful for the users. But they are part of the general driver
> > architecture, and are also caused by other parts of the code
> > (e.g. when using a TPM 1.2 that is deactivated or when the platform
> > did not send a startup command). We should find a way to hide (or at
> > least mark) those kind of expected errors.
> 
> Other parts of the TPM code did suppress 'expected' errors like this,
> I'm not sure if it got removed during Jarkko's cleanup though - we
> need to put stuff like that back, it should not printk for something
> like this.

Yes, I've got this task here somewhere, just no time to do it...

> For this, if we are waiting then it should compute an absolute time
> after which it will give up.
> 
> Code it like this instead and get rid of the ugly 'duration' scheme.
> 
> ktime_t stop = ktime_add_ns(ktime_get(), [timeout in ns]);
> do
> {
> }
> while (ktime_before(ktime_get(), stop));

Is it really that ugly? I still need delay_msec to increase the delay each round. I can see the benefit of your suggestion when it is important to get the timing exactly right (and also account for time spent elsewhere, when our code might not be executing). But in this case having delays that are approximately right (or longer than intended) is sufficient.

Anyway, from the log messages it is clear that tpm_msleep got called seven times with delays of 20/40/80/160/320/640/1280ms. But still all timestamps lie within the same second. How can this be with a cumulated delay of ~2.5s?
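
The ~2.5s figure follows from the doubling schedule: seven delays starting at 20 ms add up to 20 * (2^7 - 1) = 2540 ms. A small sketch that reproduces the schedule and its sum (backoff_total_ms is a hypothetical helper, not a kernel function):

```c
#include <assert.h>
#include <stddef.h>

/*
 * Fills out[] (if not NULL) with a doubling delay schedule that starts at
 * first_ms, and returns the sum of all delays in milliseconds.
 */
long backoff_total_ms(long first_ms, size_t rounds, long *out)
{
    long delay = first_ms;
    long total = 0;

    assert(first_ms > 0);

    for (size_t i = 0; i < rounds; i++) {
        if (out)
            out[i] = delay;
        total += delay;
        delay *= 2;
    }
    return total;
}
```

With first_ms = 20 and rounds = 7 the schedule is 20/40/80/160/320/640/1280 ms and the total is 2540 ms, so the seven log lines would be expected to span at least two different seconds.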

Also, I've just noticed that despite the name tpm_msleep calls usleep_range, not msleep. Can this have an influence? Should tpm_msleep call msleep for longer delays, as suggested by Documentation/timers/timers-howto.txt?

Alexander

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-08 12:14       ` Alexander.Steffen
  (?)
@ 2017-12-08 15:56       ` Jason Gunthorpe
  2017-12-08 16:07           ` Paul Menzel
  2017-12-08 16:17           ` Mimi Zohar
  -1 siblings, 2 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2017-12-08 15:56 UTC (permalink / raw)
  To: Alexander.Steffen; +Cc: pmenzel, linux-integrity, linux-kernel

On Fri, Dec 08, 2017 at 12:14:04PM +0000, Alexander.Steffen@infineon.com wrote:

> Is it really that ugly? I still need delay_msec to increase the
> delay each round. I can see the benefit of your suggestion when it
> is important to get the timing exactly right (and also account for
> time spent elsewhere, when our code might not be executing). But in
> this case having delays that are approximately right (or longer than
> intended) is sufficient.

For timeouts like this we really need to be above the TPM specified
delay in all cases, even if usleep_range selected something
smaller/larger.. The only way to do that is with an absolute timeout..


> Anyway, from the log messages it is clear that tpm_msleep got called
> seven times with delays of 20/40/80/160/320/640/1280ms. But still
> all timestamps lie within the same second. How can this be with a
> cumulated delay of ~2.5s?

Yes, that does seem to be the bug, our sleep function doesn't work
anymore for some reason :|

> Also, I've just noticed that despite the name tpm_msleep calls
> usleep_range, not msleep. Can this have an influence? Should
> tpm_msleep call msleep for longer delays, as suggested by
> Documentation/timers/timers-howto.txt?

This change was introduced recently and is probably the source of this
regression.

Jason

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-08 15:56       ` Jason Gunthorpe
@ 2017-12-08 16:07           ` Paul Menzel
  2017-12-08 16:17           ` Mimi Zohar
  1 sibling, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-08 16:07 UTC (permalink / raw)
  To: Jason Gunthorpe, Alexander.Steffen; +Cc: linux-integrity, linux-kernel

Dear Jason, dear Alexander,


Thank you for your replies.


Am 08.12.2017 um 16:56 schrieb Jason Gunthorpe:
> On Fri, Dec 08, 2017 at 12:14:04PM +0000, Alexander.Steffen@infineon.com wrote:

[…]

>> Anyway, from the log messages it is clear that tpm_msleep got called
>> seven times with delays of 20/40/80/160/320/640/1280ms. But still
>> all timestamps lie within the same second. How can this be with a
>> cumulated delay of ~2.5s?
> 
> Yes, that does seem to be the bug, our sleep function doesn't work
> anymore for some reason :|

I have no access to the system right now, but want to point out, that 
the log was created by `journalctl -k`, so I do not know if that messes 
with the time stamps. I checked the output of `dmesg` but didn’t see the 
TPM error messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM 
(device-id 0xFE, rev-id 4)`. Do I need to pass a different error message 
to `dmesg`?

>> Also, I've just noticed that despite the name tpm_msleep calls
>> usleep_range, not msleep. Can this have an influence? Should
>> tpm_msleep call msleep for longer delays, as suggested by
>> Documentation/timers/timers-howto.txt?
> 
> This change was introduced recently and is probably the source of this
> regression.

I’ll try to test this on Monday.


Kind regards,

Paul

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-08 15:56       ` Jason Gunthorpe
@ 2017-12-08 16:17           ` Mimi Zohar
  2017-12-08 16:17           ` Mimi Zohar
  1 sibling, 0 replies; 51+ messages in thread
From: Mimi Zohar @ 2017-12-08 16:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Alexander.Steffen; +Cc: pmenzel, linux-integrity, linux-kernel

On Fri, 2017-12-08 at 08:56 -0700, Jason Gunthorpe wrote:
> On Fri, Dec 08, 2017 at 12:14:04PM +0000, Alexander.Steffen@infineon.com wrote:
> 
> > Is it really that ugly? I still need delay_msec to increase the
> > delay each round. I can see the benefit of your suggestion when it
> > is important to get the timing exactly right (and also account for
> > time spent elsewhere, when our code might not be executing). But in
> > this case having delays that are approximately right (or longer than
> > intended) is sufficient.
> 
> For timeouts like this we really need to be above the TPM specified
> delay in all cases, even if usleep_range selected something
> smaller/larger.. The only way to do that is with an absolute timeout..
> 
> 
> > Anyway, from the log messages it is clear that tpm_msleep got called
> > seven times with delays of 20/40/80/160/320/640/1280ms. But still
> > all timestamps lie within the same second. How can this be with a
> > cumulated delay of ~2.5s?
> 
> Yes, that does seem to be the bug, our sleep function doesn't work
> > anymore for some reason :|
> 
> > Also, I've just noticed that despite the name tpm_msleep calls
> > usleep_range, not msleep. Can this have an influence? Should
> > tpm_msleep call msleep for longer delays, as suggested by
> > Documentation/timers/timers-howto.txt?
> 
> This change was introduced recently and is probably the source of this
> regression.

msleep() waited a lot longer than the requested time, causing long
delays.  Using usleep_range() still waits more than the requested
time, but less than msleep().

static inline void tpm_msleep(unsigned int delay_msec)
{
        usleep_range((delay_msec * 1000) - TPM_TIMEOUT_RANGE_US,
                     delay_msec * 1000);
}

Other TPM performance improvements have not yet been upstreamed.
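
To make the range passed to usleep_range() concrete, the bounds can be computed for the delays seen in this thread. This is a sketch only: the value 300 for TPM_TIMEOUT_RANGE_US is an assumption (the thread does not state it), and struct sleep_range_us and tpm_msleep_range are hypothetical names.

```c
#include <assert.h>

/* Assumed value; the thread does not state TPM_TIMEOUT_RANGE_US. */
#define TPM_TIMEOUT_RANGE_US 300UL

struct sleep_range_us {
    unsigned long min_us;
    unsigned long max_us;
};

/* Mirrors the two arguments that tpm_msleep() passes to usleep_range(). */
struct sleep_range_us tpm_msleep_range(unsigned int delay_msec)
{
    struct sleep_range_us r;

    /* Guard against underflow of the unsigned subtraction below. */
    assert(delay_msec * 1000UL > TPM_TIMEOUT_RANGE_US);

    r.min_us = delay_msec * 1000UL - TPM_TIMEOUT_RANGE_US;
    r.max_us = delay_msec * 1000UL;
    return r;
}
```

Under that assumption a 20 ms request becomes the range 19700..20000 us and a 1280 ms request becomes 1279700..1280000 us, so the requested delay is the upper bound of the range rather than the lower one.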

Mimi

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-08 16:07           ` Paul Menzel
@ 2017-12-08 16:18             ` Jason Gunthorpe
  -1 siblings, 0 replies; 51+ messages in thread
From: Jason Gunthorpe @ 2017-12-08 16:18 UTC (permalink / raw)
  To: Paul Menzel; +Cc: Alexander.Steffen, linux-integrity, linux-kernel

On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:

> I have no access to the system right now, but want to point out that the
> log was created by `journalctl -k`, so I do not know if that messes with the
> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> rev-id 4)`. Do I need to pass a different error message to `dmesg`?

It is a good question, I don't know. If your kernel isn't set up to
timestamp messages then the journal timestamps will certainly be garbage.

No idea why you wouldn't see the messages in dmesg; if they are not in
dmesg they couldn't get into the journal.

Jason

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-08 16:18             ` Jason Gunthorpe
@ 2017-12-11 12:54               ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-11 12:54 UTC (permalink / raw)
  To: Jason Gunthorpe; +Cc: Alexander.Steffen, linux-integrity, linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1986 bytes --]

Dear Jason,


On 12/08/17 17:18, Jason Gunthorpe wrote:
> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> 
>> I have no access to the system right now, but want to point out that the
>> log was created by `journalctl -k`, so I do not know if that messes with the
>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> 
> It is a good question, I don't know. If your kernel isn't set up to
> timestamp messages then the journal timestamps will certainly be garbage.
> 
> No idea why you wouldn't see the messages in dmesg; if they are not in
> dmesg they couldn't get into the journal.

It looks like I was running an older Linux kernel version, when running 
`dmesg`. Sorry for the noise. Here are the messages with the Linux 
kernel time stamps, showing that the delays work correctly.

```
$ uname -a
Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ sudo dmesg | grep TPM
[    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl 00000001 AMI  00000000)
[    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
[    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
[    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
[    3.734808] tpm tpm0: TPM self test failed
[    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
```
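
To make the timing explicit, the gaps between consecutive messages
can be computed from the time stamps above. This is only a sketch;
the array simply copies the values from the log, and gap_ms() is a
name made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Time stamps (microseconds since boot) of the seven "continue selftest"
 * messages and the final "self test failed" message from the log above.
 */
static const long stamps_us[] = {
        1125250, 1156645, 1208053, 1299640,
        1471223, 1802819, 2454320, 3734808,
};

#define N_STAMPS (sizeof(stamps_us) / sizeof(stamps_us[0]))

/* Gap in whole milliseconds between message i and message i + 1. */
static long gap_ms(size_t i)
{
        assert(i + 1 < N_STAMPS);
        return (stamps_us[i + 1] - stamps_us[i]) / 1000;
}
```

The gaps come out at about 31, 51, 91, 171, 331, 651 and 1280 ms,
which matches the 20/40/80/160/320/640/1280 ms schedule plus roughly
11 ms per round, presumably the time the command itself takes.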


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-11 12:54               ` Paul Menzel
@ 2017-12-11 16:08                 ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-11 16:08 UTC (permalink / raw)
  To: pmenzel, jgg; +Cc: linux-integrity, linux-kernel

> Dear Jason,
> 
> 
> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >
> >> I have no access to the system right now, but want to point out that the
> >> log was created by `journalctl -k`, so I do not know if that messes with the
> >> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >
> > It is a good question, I don't know. If your kernel isn't set up to
> > timestamp messages then the journal timestamps will certainly be garbage.
> >
> > No idea why you wouldn't see the messages in dmesg; if they are not in
> > dmesg they couldn't get into the journal.
> 
> It looks like I was running an older Linux kernel version, when running
> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> kernel time stamps, showing that the delays work correctly.
> 
> ```
> $ uname -a
> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> $ sudo dmesg | grep TPM
> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
> 00000001 AMI  00000000)
> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    3.734808] tpm tpm0: TPM self test failed
> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> ```

Thanks for the fixed log. So your TPM seems to be rather slow with executing the selftests. Could you try to apply the patch that I've just sent you? It ensures that your TPM gets more time to execute all the tests, up to the limit set in the PTP.

(I would have sent that patch also to linux-kernel@vger.kernel.org, since it was included in the discussion, but for some strange reason my mail system declared that to be an "Invalid address"...)

Alexander

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-11 16:08                 ` Alexander.Steffen
@ 2017-12-14 10:33                   ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-14 10:33 UTC (permalink / raw)
  To: Alexander Steffen, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, Mario Limonciello

[-- Attachment #1: Type: text/plain, Size: 3434 bytes --]

[Mario from Dell added to CC list.]

Dear Alexander,


On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:

>> On 12/08/17 17:18, Jason Gunthorpe wrote:
>>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
>>>
>>>> I have no access to the system right now, but want to point out that the
>>>> log was created by `journalctl -k`, so I do not know if that messes with the
>>>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
>>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
>>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
>>>
>>> It is a good question, I don't know. If your kernel isn't set up to
>>> timestamp messages then the journal timestamps will certainly be garbage.
>>>
>>> No idea why you wouldn't see the messages in dmesg; if they are not in
>>> dmesg they couldn't get into the journal.
>>
>> It looks like I was running an older Linux kernel version, when running
>> `dmesg`. Sorry for the noise. Here are the messages with the Linux
>> kernel time stamps, showing that the delays work correctly.
>>
>> ```
>> $ uname -a
>> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
>> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>> $ sudo dmesg | grep TPM
>> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
>> 00000001 AMI  00000000)
>> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
>> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
>> [    3.734808] tpm tpm0: TPM self test failed
>> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
>> ```
> 
> Thanks for the fixed log. So your TPM seems to be rather slow with executing the selftests. Could you try to apply the patch that I've just sent you? It ensures that your TPM gets more time to execute all the tests, up to the limit set in the PTP.

Thank you for your patch. Judging from the time stamps, it seems it 
works, but the TPM still fails.

```
$ dmesg | grep tpm
[    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
[    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
[    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
[    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
[    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
[    6.303242] tpm tpm0: TPM self test failed
```

To be clear, this issue is not reproducible during every start. (But 
that was the same before.)


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-14 10:33                   ` Paul Menzel
@ 2017-12-14 12:20                     ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-14 12:20 UTC (permalink / raw)
  To: pmenzel, jgg; +Cc: linux-integrity, linux-kernel, mario.limonciello

> [Mario from Dell added to CC list.]
> 
> Dear Alexander,
> 
> 
> On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:
> 
> >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> >>>
> >>>> I have no access to the system right now, but want to point out that the
> >>>> log was created by `journalctl -k`, so I do not know if that messes with the
> >>>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> >>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> >>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> >>>
> >>> It is a good question, I don't know. If your kernel isn't set up to
> >>> timestamp messages then the journal timestamps will certainly be garbage.
> >>>
> >>> No idea why you wouldn't see the messages in dmesg; if they are not in
> >>> dmesg they couldn't get into the journal.
> >>
> >> It looks like I was running an older Linux kernel version, when running
> >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> >> kernel time stamps, showing that the delays work correctly.
> >>
> >> ```
> >> $ uname -a
> >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> >> $ sudo dmesg | grep TPM
> >> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
> >> 00000001 AMI  00000000)
> >> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> >> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> >> [    3.734808] tpm tpm0: TPM self test failed
> >> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> >> ```
> >
> > Thanks for the fixed log. So your TPM seems to be rather slow with
> > executing the selftests. Could you try to apply the patch that I've just sent
> > you? It ensures that your TPM gets more time to execute all the tests, up to
> > the limit set in the PTP.
> 
> Thank you for your patch. Judging from the time stamps, it seems it
> works, but the TPM still fails.
> 
> ```
> $ dmesg | grep tpm
> [    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    6.303242] tpm tpm0: TPM self test failed
> ```
> 
> To be clear, this issue is not reproducible during every start. (But
> that was the same before.)

Thanks for testing. Now you are in the unlucky situation that your TPM was probably always broken, but old kernels did not detect that and used it anyway.

To add some more detail about the problem: The PTP limits the maximum runtime of the TPM2_SelfTest command that we try to execute here to 2000ms (see https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf table 15, page 65 in the PDF, page 57 according to the printed page numbers). Technically, we have no evidence that your TPM violates that specification, because it does reply to the command within 2000ms; it just has not completed the selftests within that timeframe. But clearly the intention of the specification authors was to have the selftests completed within that limit; there is no sense in allowing 2s just for the TPM to generate an answer without actually making any progress.

The TPM2_SelfTest command is special in that it is allowed to either execute all selftests and then return TPM_RC_SUCCESS or just schedule the selftest execution in the background and return TPM_RC_TESTING immediately (see https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf chapter 10.2.1, page 43/29). Your TPM apparently chooses the second option, but (sometimes?) fails to complete the selftests within the limit that we set (which is far longer than the 2s from the PTP).
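
As a side note, the 2314 in the log is decimal for 0x90A. If I read Part 2 of the TPM 2.0 library specification correctly, that is RC_WARN (0x900) plus 0x00A, i.e. TPM_RC_TESTING. A minimal userspace sketch of that decoding follows; the constant values are quoted from memory and should be checked against the specification, and tpm2_rc_is_testing() is a name invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values assumed from TPM 2.0 Part 2 (response codes); please verify. */
#define RC_WARN         0x900u
#define TPM_RC_SUCCESS  0x000u
#define TPM_RC_TESTING  (RC_WARN + 0x00Au)

/* The decimal value printed by the kernel should match TPM_RC_TESTING. */
static_assert(TPM_RC_TESTING == 2314, "0x90A is 2314 in decimal");

/* True if the response code means "selftests still running, ask again". */
static bool tpm2_rc_is_testing(uint32_t rc)
{
        return rc == TPM_RC_TESTING;
}
```

So each "A TPM error (2314) occurred" line would just be the TPM saying that it is still testing, not reporting a failed test.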

I'm not sure what to do about that now. We could increase the timeout even further, but if your TPM does not abide by the specification, what would be the right limit? Maybe there is a bug in your TPM that sometimes causes it to end up in a state where it can never complete the selftests.

The only other idea I have would be to use a different variant of the TPM2_SelfTest command. Currently, we execute the selftest command with the parameter fullTest=NO, so that the TPM only has to execute the missing tests (which should be the fastest implementation for a spec-compliant TPM). Maybe instead of giving up, we can extend the current algorithm to try fullTest=YES once, which should reset the selftest state so that maybe then your TPM can complete them successfully. I'll try to implement a patch to that effect.
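
To make the idea concrete, here is a rough userspace sketch of such a fallback. It is not the patch: the callback type, selftest_with_fallback(), and the mock TPM are all made up for illustration, the response code values are assumptions, and whether a real TPM recovers after fullTest=YES is exactly what would need testing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TPM_RC_SUCCESS  0x000u
#define TPM_RC_TESTING  0x90Au  /* assumed value, decimal 2314 */

/* Hypothetical transport hook: sends TPM2_SelfTest(fullTest), returns the rc. */
typedef uint32_t (*selftest_fn)(bool full_test, void *ctx);

/*
 * Poll with fullTest=NO up to max_polls times; if the TPM still reports
 * TPM_RC_TESTING afterwards, try fullTest=YES exactly once.
 */
static uint32_t selftest_with_fallback(selftest_fn send, void *ctx,
                                       unsigned int max_polls)
{
        unsigned int i;
        uint32_t rc;

        assert(send != NULL);
        for (i = 0; i < max_polls; i++) {
                rc = send(false, ctx);
                if (rc != TPM_RC_TESTING)
                        return rc;
                /* the driver would sleep here, doubling the delay each round */
        }
        /* Last resort: request a full selftest once. */
        return send(true, ctx);
}

/* Mock for demonstration: a TPM that only finishes after fullTest=YES. */
struct stuck_tpm {
        unsigned int partial_calls;
        unsigned int full_calls;
};

static uint32_t stuck_tpm_selftest(bool full_test, void *ctx)
{
        struct stuck_tpm *tpm = ctx;

        if (full_test) {
                tpm->full_calls++;
                return TPM_RC_SUCCESS;
        }
        tpm->partial_calls++;
        return TPM_RC_TESTING;
}
```

With the mock, nine polls with fullTest=NO all return TPM_RC_TESTING and the single fullTest=YES call then succeeds; a spec-compliant TPM would return from the loop early and never reach the fallback.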

Alexander

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-14 12:20                     ` Alexander.Steffen
@ 2017-12-14 14:15                       ` Mario.Limonciello
  -1 siblings, 0 replies; 51+ messages in thread
From: Mario.Limonciello @ 2017-12-14 14:15 UTC (permalink / raw)
  To: Alexander.Steffen, pmenzel, jgg; +Cc: linux-integrity, linux-kernel

> -----Original Message-----
> From: Alexander.Steffen@infineon.com [mailto:Alexander.Steffen@infineon.com]
> Sent: Thursday, December 14, 2017 6:21 AM
> To: pmenzel@molgen.mpg.de; jgg@ziepe.ca
> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Limonciello,
> Mario <Mario_Limonciello@Dell.com>
> Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> occurred continue selftest`
> 
> > [Mario from Dell added to CC list.]
> >
> > Dear Alexander,
> >
> >
> > On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:
> >
> > >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > >>>
> > >>>> I have no access to the system right now, but want to point out that the
> > >>>> log was created by `journalctl -k`, so I do not know if that messes with the
> > >>>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> > >>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> > >>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > >>>
> > >>> It is a good question, I don't know. If your kernel isn't set up to
> > >>> timestamp messages then the journal timestamps will certainly be garbage.
> > >>>
> > >>> No idea why you wouldn't see the messages in dmesg; if they are not in
> > >>> dmesg they couldn't get into the journal.
> > >>
> > >> It looks like I was running an older Linux kernel version, when running
> > >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > >> kernel time stamps, showing that the delays work correctly.
> > >>
> > >> ```
> > >> $ uname -a
> > >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > >> $ sudo dmesg | grep TPM
> > >> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
> > >> 00000001 AMI  00000000)
> > >> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > >> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > >> [    3.734808] tpm tpm0: TPM self test failed
> > >> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > >> ```
> > >
> > > Thanks for the fixed log. So your TPM seems to be rather slow with
> > > executing the selftests. Could you try to apply the patch that I've just sent
> > > you? It ensures that your TPM gets more time to execute all the tests, up to
> > > the limit set in the PTP.
> >
> > Thank you for your patch. Judging from the time stamps, it seems it
> > works, but the TPM still fails.
> >
> > ```
> > $ dmesg | grep tpm
> > [    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > [    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    6.303242] tpm tpm0: TPM self test failed
> > ```
> >
> > To be clear, this issue is not reproducible during every start. (But
> > that was the same before.)
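
The spacing of the retries in the log above can be checked with a few lines of Python. This is only an illustrative sketch: the timestamps are copied from the log, and the helper name `gaps_ms` is made up for this example. The error value 2314 is 0x90A in hexadecimal, which is the TPM_RC_TESTING warning code mentioned further down in this message.

```python
# Kernel timestamps (in seconds) of the nine "A TPM error (2314)" lines above.
stamps = [1.111768, 1.143020, 1.194251, 1.285509, 1.457103,
          1.788709, 2.440216, 3.731704, 6.303216]


def gaps_ms(ts):
    """Return the delay between consecutive timestamps in milliseconds."""
    return [round((b - a) * 1000) for a, b in zip(ts, ts[1:])]


gaps = gaps_ms(stamps)
print(gaps)        # delay before each retry
print(hex(2314))   # the logged error code in hexadecimal
```

Each gap is roughly twice the previous one, which is consistent with a retry delay that doubles on every attempt plus a small, roughly constant per-command overhead.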
> 
> Thanks for testing. Now you are in the unlucky situation that your TPM was
> probably always broken, but old kernels did not detect that and used it anyway.
> 
Something that Paul can consider is to upgrade the TPM firmware if it's not already
upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware update
issued.  It has been posted to LVFS and can be upgraded using fwupd/fwupdate.
Note: If your TPM is currently owned you will need to go into BIOS setup to clear it
first before upgrading.

I don’t have any insight into what changed between firmware versions.  It might not
change this at all.

> To add some more details to what the problem is: The PTP limits the maximum
> runtime of the TPM2_SelfTest command that we try to execute here to 2000ms
> (see https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
> table 15, page 65 in the PDF, page 57 according to the
> printed page numbers). Technically, we have no evidence that your TPM is in
> violation of that specification, because it does reply to the command within
> 2000ms, it just has not completed the selftests within that timeframe. But clearly
> the intention of the specification authors was to have the selftests completed
> within that limit, there is no sense in allowing 2s just for the TPM to generate an
> answer without actually making any progress.
> 
> The TPM2_SelfTest command is special in that it is allowed to either execute all
> selftests and then return TPM_RC_SUCCESS or just schedule the selftest execution
> in the background and return TPM_RC_TESTING immediately (see
> https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
> chapter 10.2.1, page 43/29). Your TPM apparently chooses
> the second option, but (sometimes?) fails to complete the selftests within the limit
> that we set (which is far longer than the 2s from the PTP).
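
A minimal sketch of the polling behaviour described above, not the kernel's actual code: `send_selftest` is a hypothetical callable standing in for "send TPM2_SelfTest and return the response code", and the 20 ms initial delay and 2000 ms budget are illustrative values only.

```python
import time

TPM_RC_SUCCESS = 0x000
TPM_RC_TESTING = 0x90A  # 2314: self-tests are still running


def wait_for_selftest(send_selftest, initial_delay_ms=20, budget_ms=2000,
                      sleep=time.sleep):
    """Re-issue the self-test command with a doubling delay.

    Returns the last response code: TPM_RC_SUCCESS once the tests have
    finished, TPM_RC_TESTING if the waiting budget is used up first, or
    any other code as soon as the device reports it.
    """
    delay_ms = initial_delay_ms
    waited_ms = 0
    while True:
        rc = send_selftest()
        if rc != TPM_RC_TESTING or waited_ms >= budget_ms:
            return rc
        sleep(delay_ms / 1000)   # give the TPM time to make progress
        waited_ms += delay_ms
        delay_ms *= 2            # back off: 20, 40, 80, ... ms
```

The `sleep` parameter exists only so that the loop can be exercised without real delays, for example by passing `list.append` of a scratch list.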
> 
> I'm not sure what to do about that now. We could increase the timeout even
> further, but if your TPM does not abide by the specification, what would be the
> right limit? Maybe there is a bug in your TPM that sometimes causes it to end up in
> a state where it can never complete the selftests.

Are there any representatives from the other TPM vendors on the linux-integrity
mailing list?  Maybe someone from the vendor involved in this laptop can comment
if they know of limitations in the self tests on this particular model and can 
recommend a solution.

> 
> The only other idea I have would be to use a different variant of the TPM2_SelfTest
> command. Currently, we execute the selftest command with the parameter
> fullTest=NO, so that the TPM only has to execute the missing tests (which should be
> the fastest implementation for a spec-compliant TPM). Maybe instead of giving up,
> we can extend the current algorithm to try fullTest=YES once, which should reset
> the selftest state so that maybe then your TPM can complete them successfully. I'll
> try to implement a patch to that effect.
> 
> Alexander
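
The fallback proposed above could look roughly like the following sketch. It is an assumption about the shape of such a patch, not the patch itself: `run_selftest` is a hypothetical helper that issues TPM2_SelfTest with the given fullTest flag, waits as long as the caller is willing to, and returns the final response code.

```python
TPM_RC_SUCCESS = 0x000
TPM_RC_TESTING = 0x90A  # 2314: self-tests are still running


def selftest_with_fallback(run_selftest):
    """Try the incremental self-test first, then a single full self-test."""
    rc = run_selftest(False)       # fullTest=NO: only the missing tests
    if rc == TPM_RC_TESTING:
        rc = run_selftest(True)    # fullTest=YES: one extra attempt
    return rc
```

Only one extra attempt is made, so a TPM that never finishes still ends in a failure instead of an endless loop.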

If you're fairly certain it's a TPM bug, another possibility is to add a quirk that
skips the self tests based on TPM model + TPM firmware version.
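
As a rough illustration of what such a quirk could look like (a sketch only; the table entry is a placeholder and does not describe any real device, and all names are made up for this example):

```python
# Hypothetical quirk table: (vendor string, firmware version) pairs for which
# the explicit self-test would be skipped. The entry is a placeholder.
SKIP_SELFTEST_QUIRKS = {
    ("XXXX", (0, 0)),
}


def should_skip_selftest(vendor, firmware_version):
    """Return True if this model/firmware combination is listed as a quirk."""
    return (vendor, tuple(firmware_version)) in SKIP_SELFTEST_QUIRKS
```

Keying on the firmware version as well as the model means that a fixed firmware release would automatically get the normal self-test handling again.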

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-14 14:15                       ` Mario.Limonciello
@ 2017-12-14 16:12                         ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-14 16:12 UTC (permalink / raw)
  To: Mario.Limonciello, pmenzel, jgg; +Cc: linux-integrity, linux-kernel

> > -----Original Message-----
> > From: Alexander.Steffen@infineon.com [mailto:Alexander.Steffen@infineon.com]
> > Sent: Thursday, December 14, 2017 6:21 AM
> > To: pmenzel@molgen.mpg.de; jgg@ziepe.ca
> > Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Limonciello,
> > Mario <Mario_Limonciello@Dell.com>
> > Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> > occurred continue selftest`
> >
> > > [Mario from Dell added to CC list.]
> > >
> > > Dear Alexander,
> > >
> > >
> > > On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:
> > >
> > > >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > > >>>
> > > > >>>> I have no access to the system right now, but want to point out, that the
> > > > >>>> log was created by `journactl -k`, so I do not know if that messes with the
> > > > >>>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> > > > >>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> > > > >>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > > > >>>
> > > > >>> It is a good question, I don't know.. If your kernel isn't setup to
> > > > >>> timestamp messages then the journalstamp will certainly be garbage.
> > > > >>>
> > > > >>> No idea why you wouldn't see the messages in dmesg, if they are not in
> > > > >>> dmesg they couldn't get into the journal
> > > >>
> > > >> It looks like I was running an older Linux kernel version, when running
> > > >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > > >> kernel time stamps, showing that the delays work correctly.
> > > >>
> > > >> ```
> > > >> $ uname -a
> > > >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > > >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > >> $ sudo dmesg | grep TPM
> > > > >> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
> > > > >> 00000001 AMI  00000000)
> > > >> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > >> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > >> [    3.734808] tpm tpm0: TPM self test failed
> > > >> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > > >> ```
> > > >
> > > > > Thanks for the fixed log. So your TPM seems to be rather slow with
> > > > > executing the selftests. Could you try to apply the patch that I've just
> > > > > sent you? It ensures that your TPM gets more time to execute all the
> > > > > tests, up to the limit set in the PTP.
> > >
> > > Thank you for your patch. Judging from the time stamps, it seems it
> > > works, but the TPM still fails.
> > >
> > > ```
> > > $ dmesg | grep tpm
> > > [    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > [    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > [    6.303242] tpm tpm0: TPM self test failed
> > > ```
> > >
> > > To be clear, this issue is not reproducible during every start. (But
> > > that was the same before.)
> >
> > Thanks for testing. Now you are in the unlucky situation that your TPM was
> > probably always broken, but old kernels did not detect that and used it anyway.
> >
> Something that Paul can consider is to upgrade the TPM firmware if it's not already
> upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware update
> issued.  It has been posted to LVFS and can be upgraded using fwupd/fwupdate.
> Note: If your TPM is currently owned you will need to go into BIOS setup to clear it
> first before upgrading.

I'm not familiar with the specific TPM in your model, but according to the log it is a TPM 2.0, which does not really carry over the owner concept of a TPM 1.2. Is clearing it still necessary for an upgrade then?

> I don’t have any insight into what changed between firmware versions.  It might not
> change this at all.
> 
> > To add some more details to what the problem is: The PTP limits the maximum
> > runtime of the TPM2_SelfTest command that we try to execute here to 2000ms
> > (see https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
> > table 15, page 65 in the PDF, page 57 according to the
> > printed page numbers). Technically, we have no evidence that your TPM is in
> > violation of that specification, because it does reply to the command within
> > 2000ms, it just has not completed the selftests within that timeframe. But clearly
> > the intention of the specification authors was to have the selftests completed
> > within that limit, there is no sense in allowing 2s just for the TPM to generate an
> > answer without actually making any progress.
> >
> > The TPM2_SelfTest command is special in that it is allowed to either execute all
> > selftests and then return TPM_RC_SUCCESS or just schedule the selftest execution
> > in the background and return TPM_RC_TESTING immediately (see
> > https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
> > chapter 10.2.1, page 43/29). Your TPM apparently chooses
> > the second option, but (sometimes?) fails to complete the selftests within the limit
> > that we set (which is far longer than the 2s from the PTP).
> >
> > I'm not sure what to do about that now. We could increase the timeout even
> > further, but if your TPM does not abide by the specification, what would be the
> > right limit? Maybe there is a bug in your TPM that sometimes causes it to end up in
> > a state where it can never complete the selftests.
> 
> Are there any representatives from the other TPM vendors on the linux-integrity
> mailing list?  Maybe someone from the vendor involved in this laptop can comment
> if they know of limitations in the self tests on this particular model and can
> recommend a solution.
> 
> >
> > The only other idea I have would be to use a different variant of the TPM2_SelfTest
> > command. Currently, we execute the selftest command with the parameter
> > fullTest=NO, so that the TPM only has to execute the missing tests (which should be
> > the fastest implementation for a spec-compliant TPM). Maybe instead of giving up,
> > we can extend the current algorithm to try fullTest=YES once, which should reset
> > the selftest state so that maybe then your TPM can complete them successfully. I'll
> > try to implement a patch to that effect.
> >
> > Alexander
> 
> If you're fairly certain it's a TPM bug, another possibility is to quirk to skip self tests
> based on TPM model + TPM firmware version.

As a last resort maybe, yes. But currently the kernel's policy is that it only wants to talk to a TPM device that is guaranteed to be error-free, i.e. has executed the selftests correctly. I'd like to change that for other reasons (see the patches that I just posted for details), but now that you mention it, maybe there is a simple solution that solves both problems:

The TPM specification says "If a command requires use of an untested algorithm or functional module, the TPM performs the test and then completes the command actions." (https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-1-Architecture-01.38.pdf chapter 12.3, page 83/59). So as far as I understand that, there is no need for us to explicitly execute selftests on any TPM (2.0) device, the TPM is required to do that automatically. So what about getting rid of the selftest call completely?

It will improve startup performance, because we do not have to wait for the TPM to complete all selftests. The worst case is that the first execution of a command requiring a specific functionality will be a bit slower, because the TPM has to do the selftests first. But maybe even that won't be the case, since the same chapter in the specification also says "It is preferable for the TPM to perform self-tests on untested algorithms and functional blocks as a background task to increase the likelihood that algorithms are tested before they are needed."

The only disadvantage I can see from a user's point of view is that a broken TPM device will be discovered only when it is first used, not already when the kernel loads the driver. But that also applies to other devices: you will not notice a broken flash drive just from plugging it in, only when you try to access the data. And if a user really cares, they are always free to execute TPM2_SelfTest via /dev/tpm*. Any other objections?
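
For illustration, running TPM2_SelfTest by hand could look like the following sketch. The function names and the default path `/dev/tpm0` are assumptions for this example; the command layout (tag, size, command code, fullTest byte) follows the TPM 2.0 command format, and access to the device node normally requires root and no other process holding the device open.

```python
import struct

TPM_ST_NO_SESSIONS = 0x8001   # command tag: no authorization sessions
TPM_CC_SELF_TEST = 0x00000143  # command code of TPM2_SelfTest


def build_selftest_command(full_test=True):
    """Serialize TPM2_SelfTest: tag, total size, command code, fullTest."""
    body = struct.pack(">B", 1 if full_test else 0)
    size = 2 + 4 + 4 + len(body)
    return struct.pack(">HII", TPM_ST_NO_SESSIONS, size, TPM_CC_SELF_TEST) + body


def parse_response_code(response):
    """Extract the 32-bit response code from a TPM 2.0 response header."""
    _tag, _size, rc = struct.unpack(">HII", response[:10])
    return rc


def run_selftest(path="/dev/tpm0", full_test=True):
    """Send the command to the TPM character device and return its code."""
    with open(path, "r+b", buffering=0) as tpm:
        tpm.write(build_selftest_command(full_test))
        return parse_response_code(tpm.read(4096))
```

A return value of 0 from `run_selftest()` would mean success, while 2314 (0x90A) would again be TPM_RC_TESTING.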

Alexander

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-14 16:12                         ` Alexander.Steffen
@ 2017-12-14 19:43                           ` Mario.Limonciello
  -1 siblings, 0 replies; 51+ messages in thread
From: Mario.Limonciello @ 2017-12-14 19:43 UTC (permalink / raw)
  To: Alexander.Steffen, pmenzel, jgg; +Cc: linux-integrity, linux-kernel

> -----Original Message-----
> From: Alexander.Steffen@infineon.com [mailto:Alexander.Steffen@infineon.com]
> Sent: Thursday, December 14, 2017 10:12 AM
> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; pmenzel@molgen.mpg.de;
> jgg@ziepe.ca
> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org
> Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> occurred continue selftest`
> 
> > > -----Original Message-----
> > > From: Alexander.Steffen@infineon.com
> > [mailto:Alexander.Steffen@infineon.com]
> > > Sent: Thursday, December 14, 2017 6:21 AM
> > > To: pmenzel@molgen.mpg.de; jgg@ziepe.ca
> > > Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org;
> > Limonciello,
> > > Mario <Mario_Limonciello@Dell.com>
> > > Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error
> > (2314)
> > > occurred continue selftest`
> > >
> > > > [Mario from Dell added to CC list.]
> > > >
> > > > Dear Alexander,
> > > >
> > > >
> > > > On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:
> > > >
> > > > >> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > > >>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > > > >>>
> > > > >>>> I have no access to the system right now, but want to point out,
> > that
> > > > the
> > > > > >>>> log was created by `journalctl -k`, so I do not know if that messes
> > with
> > > > the
> > > > >>>> time stamps. I checked the output of `dmesg` but didn’t see the
> > TPM
> > > > error
> > > > >>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM
> > (device-
> > > > id 0xFE,
> > > > >>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > > > >>>
> > > > >>> It is a good question, I don't know.. If your kernel isn't setup to
> > > > >>> timestamp messages then the journalstamp will certainly be
> > garbage.
> > > > >>>
> > > > >>> No idea why you wouldn't see the messages in dmesg, if they are
> > not in
> > > > >>> dmesg they couldn't get into the journal
> > > > >>
> > > > >> It looks like I was running an older Linux kernel version, when running
> > > > >> `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > > > >> kernel time stamps, showing that the delays work correctly.
> > > > >>
> > > > >> ```
> > > > >> $ uname -a
> > > > >> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > > > >> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > > > >> $ sudo dmesg | grep TPM
> > > > >> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03
> > Tpm2Tabl
> > > > >> 00000001 AMI  00000000)
> > > > >> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > > >> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > >> [    3.734808] tpm tpm0: TPM self test failed
> > > > >> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > > > >> ```
> > > > >
> > > > > Thanks for the fixed log. So your TPM seems to be rather slow with
> > > > > executing the selftests. Could you try to apply the patch that I've just sent
> > you? It
> > > > ensures that your TPM gets more time to execute all the tests, up to the
> > limit
> > > > set in the PTP.
> > > >
> > > > Thank you for your patch. Judging from the time stamps, it seems it
> > > > works, but the TPM still fails.
> > > >
> > > > ```
> > > > $ dmesg | grep tpm
> > > > [    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > > > [    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
> > > > [    6.303242] tpm tpm0: TPM self test failed
> > > > ```
> > > >
> > > > To be clear, this issue is not reproducible during every start. (But
> > > > that was the same before.)
> > >
> > > Thanks for testing. Now you are in the unlucky situation that your TPM was
> > > probably always broken, but old kernels did not detect that and used it
> > anyway.
> > >
> > Something that Paul can consider is to upgrade the TPM firmware if it's not
> > already
> > upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware
> > update
> > issued.  It has been posted to LVFS and can be upgraded using
> > fwupd/fwupdate.
> > Note: If your TPM is currently owned you will need to go into BIOS setup to
> > clear it
> > first before upgrading.
> 
> I'm not familiar with the specific TPM in your model, but according to the log it is a
> TPM 2.0, which does not really carry over the owner concept of a TPM 1.2. Is
> clearing it still necessary for an upgrade then?

Yes, it's required for the TPM model/vendor that is used in the XPS model that
Paul has.  If you try to run the upgrade without clearing it, the firmware will
reject the upgrade.

> 
> > I don’t have any insight into what changed between firmware versions.  It
> > might not
> > change this at all.
> >
> > > To add some more details to what the problem is: The PTP limits the maximum
> > > runtime of the TPM2_SelfTest command that we try to execute here to 2000ms
> > > (see
> > > https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
> > > table 15, page 65 in the PDF, page 57 according to the printed page numbers).
> > > Technically, we have no evidence that your TPM is in violation of that
> > > specification, because it does reply to the command within 2000ms, it just
> > > has not completed the selftests within that timeframe. But clearly the
> > > intention of the specification authors was to have the selftests completed
> > > within that limit, there is no sense in allowing 2s just for the TPM to
> > > generate an answer without actually making any progress.
> > >
> > > The TPM2_SelfTest command is special in that it is allowed to either execute
> > > all selftests and then return TPM_RC_SUCCESS or just schedule the selftest
> > > execution in the background and return TPM_RC_TESTING immediately (see
> > > https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
> > > chapter 10.2.1, page 43/29). Your TPM apparently chooses the second option,
> > > but (sometimes?) fails to complete the selftests within the limit that we
> > > set (which is far longer than the 2s from the PTP).
> > >
> > > I'm not sure what to do about that now. We could increase the timeout
> > even
> > > further, but if your TPM does not abide by the specification, what would be
> > the
> > > right limit? Maybe there is a bug in your TPM that sometimes causes it to
> > end up in
> > > a state where it can never complete the selftests.
> >
> > Are there any representatives from the other TPM vendors on the
> > linux-integrity mailing list?  Maybe someone from the vendor involved in this
> > laptop can comment if they know of limitations in the self tests on this
> > particular model and can recommend a solution.
> >
> > >
> > > The only other idea I have would be to use a different variant of the
> > TPM2_SelfTest
> > > command. Currently, we execute the selftest command with the
> > parameter
> > > fullTest=NO, so that the TPM only has to execute the missing tests (which
> > should be
> > > the fastest implementation for a spec-compliant TPM). Maybe instead of
> > giving up,
> > > we can extend the current algorithm to try fullTest=YES once, which should
> > reset
> > > the selftest state so that maybe then your TPM can complete them
> > successfully. I'll
> > > try to implement a patch to that effect.
> > >
> > > Alexander
> >
> > If you're fairly certain it's a TPM bug, another possibility is to quirk to skip self
> > tests
> > based on TPM model + TPM firmware version.
> 
> As a last resort maybe, yes. But currently the kernel's policy is that it only wants to
> talk to a TPM device that is guaranteed to be error-free, i.e. has executed the
> selftests correctly. I'd like to change that for other reasons (see the patches that I
> just posted for details), but now that you mention it, maybe there is a simple
> solution that solves both problems:
> 
> The TPM specification says "If a command requires use of an untested algorithm or
> functional module, the TPM performs the test and then completes the command
> actions."
> (https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-1-Architecture-01.38.pdf
> chapter 12.3, page 83/59). So as far as I understand that, there is no need for
> us to explicitly execute selftests on any TPM (2.0) device, the TPM is required
> to do that automatically. So what about getting rid of the selftest call
> completely?
> 
> It will improve startup performance, because we do not have to wait for the TPM to
> complete all selftests. The worst case is that the first execution of a command
> requiring a specific functionality will be a bit slower, because the TPM has to do the
> selftests first. But maybe even that won't be the case, since the same chapter in the
> specification also says "It is preferable for the TPM to perform self-tests on
> untested algorithms and functional blocks as a background task to increase the
> likelihood that algorithms are tested before they are needed."
> 
> The only disadvantage I can see from a user's point of view is that he will discover a
> broken TPM device only when he tries to use it, not already when the kernel tries to
> load the driver. But that also applies to other devices, you will not notice a broken
> flash drive unless you try to access the data, not from just plugging it in. And if a
> user really cares, he is always free to execute TPM2_SelfTest via /dev/tpm*. Any
> other objections?
> 
> Alexander

The logic behind this idea sounds good to me.  The only potential problem would be
if the kernel were ever to use the TPM directly for storing data.  It might then
hit this at an inopportune time.
Otherwise no objections from my side, but I'm no decision maker in this area.
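
To make the "inopportune time" concern a bit more concrete, here is a rough Python sketch of what a caller might have to do if it can no longer assume that the selftests finished at driver load: retry while the TPM answers TPM_RC_TESTING, doubling the delay each time, and give up once a waiting budget is spent. The function name, the 20 ms starting delay and the 2000 ms budget are assumptions for illustration only (the doubling merely mirrors the spacing of the timestamps in the dmesg output quoted above); this is not the kernel's code.

```python
import time
from typing import Callable

TPM2_RC_TESTING = 0x90A  # 2314 decimal, selftests still running


def call_with_testing_retry(
    send: Callable[[], int],
    first_delay_ms: int = 20,   # assumed starting delay
    budget_ms: int = 2000,      # assumed total waiting budget
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call send() until it stops returning TPM_RC_TESTING or the budget is spent.

    send is a hypothetical callable that submits one TPM command and returns
    its response code. The last response code is returned in either case.
    """
    delay_ms = first_delay_ms
    waited_ms = 0
    while True:
        rc = send()
        if rc != TPM2_RC_TESTING or waited_ms >= budget_ms:
            return rc
        sleep(delay_ms / 1000)
        waited_ms += delay_ms
        delay_ms *= 2
```

With these defaults a TPM that never finishes its tests costs the caller about 2.5 s before the error is reported, which is the kind of stall that would hurt in the middle of a data path.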

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-14 19:43                           ` Mario.Limonciello
@ 2017-12-15 11:54                             ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-15 11:54 UTC (permalink / raw)
  To: Mario Limonciello, Alexander Steffen, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, Rafael J. Wysocki, Len Brown

[-- Attachment #1: Type: text/plain, Size: 11975 bytes --]

[Adding Rafael and Len as they, to my knowledge, also use or have access 
to a Dell XPS 13 9360. With the latest Linux master, do you get TPM 
self-test errors when cold-starting the system without the power supply 
plugged in?]

Dear Mario, dear Alexander,


the line breaks added to the quoted parts really mess up the quoting. 
Can we please try to use MUAs that avoid that, or fix it manually?


On 12/14/17 20:43, Mario.Limonciello@dell.com wrote:
>> -----Original Message-----
>> From: Alexander.Steffen@infineon.com [mailto:Alexander.Steffen@infineon.com]
>> Sent: Thursday, December 14, 2017 10:12 AM
>> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; pmenzel@molgen.mpg.de;
>> jgg@ziepe.ca
>> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org
>> Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
>> occurred continue selftest`
>>
>>>> -----Original Message-----
>>>> From: Alexander.Steffen@infineon.com
>>> [mailto:Alexander.Steffen@infineon.com]
>>>> Sent: Thursday, December 14, 2017 6:21 AM
>>>> To: pmenzel@molgen.mpg.de; jgg@ziepe.ca
>>>> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org;
>>> Limonciello,
>>>> Mario <Mario_Limonciello@Dell.com>
>>>> Subject: RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error
>>> (2314)
>>>> occurred continue selftest`
>>>>
>>>>> [Mario from Dell added to CC list.]
>>>>>
>>>>> Dear Alexander,
>>>>>
>>>>>
>>>>> On 12/11/17 17:08, Alexander.Steffen@infineon.com wrote:
>>>>>
>>>>>>> On 12/08/17 17:18, Jason Gunthorpe wrote:
>>>>>>>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
>>>>>>>>
>>>>>>>>> I have no access to the system right now, but want to point out,
>>> that
>>>>> the
>>>>>>>>> log was created by `journalctl -k`, so I do not know if that messes
>>> with
>>>>> the
>>>>>>>>> time stamps. I checked the output of `dmesg` but didn’t see the
>>> TPM
>>>>> error
>>>>>>>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM
>>> (device-
>>>>> id 0xFE,
>>>>>>>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
>>>>>>>>
>>>>>>>> It is a good question, I don't know.. If your kernel isn't setup to
>>>>>>>> timestamp messages then the journalstamp will certainly be
>>> garbage.
>>>>>>>>
>>>>>>>> No idea why you wouldn't see the messages in dmesg, if they are
>>> not in
>>>>>>>> dmesg they couldn't get into the journal
>>>>>>>
>>>>>>> It looks like I was running an older Linux kernel version, when running
>>>>>>> `dmesg`. Sorry for the noise. Here are the messages with the Linux
>>>>>>> kernel time stamps, showing that the delays work correctly.
>>>>>>>
>>>>>>> ```
>>>>>>> $ uname -a
>>>>>>> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
>>>>>>> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>>>>> $ sudo dmesg | grep TPM
>>>>>>> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03
>>> Tpm2Tabl
>>>>>>> 00000001 AMI  00000000)
>>>>>>> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
>>>>>>> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>>>> [    3.734808] tpm tpm0: TPM self test failed
>>>>>>> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
>>>>>>> ```
>>>>>>
>>>>>> Thanks for the fixed log. So your TPM seems to be rather slow with
>>>>> executing the selftests. Could you try to apply the patch that I've just sent
>>> you? It
>>>>> ensures that your TPM gets more time to execute all the tests, up to the
>>> limit
>>>>> set in the PTP.
>>>>>
>>>>> Thank you for your patch. Judging from the time stamps, it seems it
>>>>> works, but the TPM still fails.
>>>>>
>>>>> ```
>>>>> $ dmesg | grep tpm
>>>>> [    1.100958] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
>>>>> [    1.111768] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.143020] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.194251] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.285509] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.457103] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    1.788709] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    2.440216] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    3.731704] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    6.303216] tpm tpm0: A TPM error (2314) occurred continue selftest
>>>>> [    6.303242] tpm tpm0: TPM self test failed
>>>>> ```
>>>>>
>>>>> To be clear, this issue is not reproducible during every start. (But
>>>>> that was the same before.)

I think I found out how to reproduce the issue. Cold start the system 
without the power supply connected.

>>>> Thanks for testing. Now you are in the unlucky situation that your TPM was
>>>> probably always broken, but old kernels did not detect that and used it anyway.

Just to clarify, I do not know if the TPM could ever be used. I believe 
the module loaded but the user space tools (tpm2_version or so) always 
returned an error in my tests.

>>> Something that Paul can consider is to upgrade the TPM firmware if it's not
>>> already
>>> upgraded.  Since the launch of XPS 9360 there was at least one TPM firmware
>>> update
>>> issued.  It has been posted to LVFS and can be upgraded using
>>> fwupd/fwupdate.
>>> Note: If your TPM is currently owned you will need to go into BIOS setup to
>>> clear it
>>> first before upgrading.
>>
>> I'm not familiar with the specific TPM in your model, but according to the log it is a
>> TPM 2.0, which does not really carry over the owner concept of a TPM 1.2. Is
>> clearing it still necessary for an upgrade then?
> 
> Yes it's required for the TPM model/vendor that is used in the XPS model that
> Paul has.  If you try to run the upgrade without clearing it the firmware will
> reject the upgrade.

Mario, thank you for your quick reaction.

[…]

1.  Can you reproduce this issue too?
2.  How do I find out what TPM firmware version is installed?
3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix 
the issue.
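
On question 2, one route that might work (an untested sketch, not something confirmed in this thread) is to ask the TPM itself with TPM2_GetCapability for the TPM_PT_FIRMWARE_VERSION_1 and TPM_PT_FIRMWARE_VERSION_2 properties. The constants below are the values from the TPM 2.0 library specification as far as I know them; the function names are made up for illustration, and how the two 32-bit values map to a human-readable version is vendor-specific.

```python
import struct

TPM2_ST_NO_SESSIONS = 0x8001
TPM2_CC_GET_CAPABILITY = 0x0000017A
TPM2_CAP_TPM_PROPERTIES = 0x00000006
TPM2_PT_FIRMWARE_VERSION_1 = 0x0000010B  # PT_FIXED + 11
TPM2_PT_FIRMWARE_VERSION_2 = 0x0000010C  # PT_FIXED + 12


def build_firmware_query() -> bytes:
    """Return a TPM2_GetCapability command asking for both version properties."""
    params = struct.pack(">III", TPM2_CAP_TPM_PROPERTIES,
                         TPM2_PT_FIRMWARE_VERSION_1, 2)
    header = struct.pack(">HII", TPM2_ST_NO_SESSIONS, 10 + len(params),
                         TPM2_CC_GET_CAPABILITY)
    return header + params


def parse_properties(response: bytes) -> dict:
    """Parse a TPM_CAP_TPM_PROPERTIES response into {property: value}."""
    _tag, _size, rc = struct.unpack_from(">HII", response, 0)
    if rc != 0:
        raise RuntimeError(f"TPM returned response code 0x{rc:03x}")
    # moreData (1 byte), capability (4 bytes), property count (4 bytes)
    _more, _cap, count = struct.unpack_from(">BII", response, 10)
    props = {}
    for i in range(count):
        prop, value = struct.unpack_from(">II", response, 19 + 8 * i)
        props[prop] = value
    return props
```

The bytes from `build_firmware_query()` would be written to the TPM character device and the reply read back from it and passed to `parse_properties()`. That of course only works on a boot where the driver actually registered the device.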

>>>> To add some more details to what the problem is: The PTP limits the maximum
>>>> runtime of the TPM2_SelfTest command that we try to execute here to 2000ms
>>>> (see
>>>> https://trustedcomputinggroup.org/wp-content/uploads/TCG_PC_Client_Platform_TPM_Profile_PTP_Specification_Family_2.0_Revision_1.3v22.pdf
>>>> table 15, page 65 in the PDF, page 57 according to the printed page numbers).
>>>> Technically, we have no evidence that your TPM is in violation of that
>>>> specification, because it does reply to the command within 2000ms, it just
>>>> has not completed the selftests within that timeframe. But clearly the
>>>> intention of the specification authors was to have the selftests completed
>>>> within that limit, there is no sense in allowing 2s just for the TPM to
>>>> generate an answer without actually making any progress.
>>>>
>>>> The TPM2_SelfTest command is special in that it is allowed to either execute
>>>> all selftests and then return TPM_RC_SUCCESS or just schedule the selftest
>>>> execution in the background and return TPM_RC_TESTING immediately (see
>>>> https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-Part-3-Commands-01.38.pdf
>>>> chapter 10.2.1, page 43/29). Your TPM apparently chooses the second option,
>>>> but (sometimes?) fails to complete the selftests within the limit that we
>>>> set (which is far longer than the 2s from the PTP).
>>>>
>>>> I'm not sure what to do about that now. We could increase the timeout
>>> even
>>>> further, but if your TPM does not abide by the specification, what would be
>>> the
>>>> right limit? Maybe there is a bug in your TPM that sometimes causes it to
>>> end up in
>>>> a state where it can never complete the selftests.
>>>
>>> Are there any representatives from the other TPM vendors on the
>>> linux-integrity mailing list?  Maybe someone from the vendor involved in this
>>> laptop can comment if they know of limitations in the self tests on this
>>> particular model and can recommend a solution.
>>>
>>>>
>>>> The only other idea I have would be to use a different variant of the
>>> TPM2_SelfTest
>>>> command. Currently, we execute the selftest command with the
>>> parameter
>>>> fullTest=NO, so that the TPM only has to execute the missing tests (which
>>> should be
>>>> the fastest implementation for a spec-compliant TPM). Maybe instead of
>>> giving up,
>>>> we can extend the current algorithm to try fullTest=YES once, which should
>>> reset
>>>> the selftest state so that maybe then your TPM can complete them
>>> successfully. I'll
>>>> try to implement a patch to that effect.
>>>
>>> If you're fairly certain it's a TPM bug, another possibility is to quirk to skip self
>>> tests
>>> based on TPM model + TPM firmware version.
>>
>> As a last resort maybe, yes. But currently the kernel's policy is that it only wants to
>> talk to a TPM device that is guaranteed to be error-free, i.e. has executed the
>> selftests correctly. I'd like to change that for other reasons (see the patches that I
>> just posted for details), but now that you mention it, maybe there is a simple
>> solution that solves both problems:
>>
>> The TPM specification says "If a command requires use of an untested algorithm or
>> functional module, the TPM performs the test and then completes the command
>> actions." (https://trustedcomputinggroup.org/wp-content/uploads/TPM-Rev-2.0-
>> Part-1-Architecture-01.38.pdf chapter 12.3, page 83/59). So as far as I understand
>> that, there is no need for us to explicitly execute selftests on any TPM (2.0) device,
>> the TPM is required to do that automatically. So what about getting rid of the
>> selftest call completely?
>>
>> It will improve startup performance, because we do not have to wait for the TPM to
>> complete all selftests. The worst case is that the first execution of a command
>> requiring a specific functionality will be a bit slower, because the TPM has to do the
>> selftests first. But maybe even that won't be the case, since the same chapter in the
>> specification also says "It is preferable for the TPM to perform self-tests on
>> untested algorithms and functional blocks as a background task to increase the
>> likelihood that algorithms are tested before they are needed."
>>
>> The only disadvantage I can see from a user's point of view is that he will discover a
>> broken TPM device only when he tries to use it, not already when the kernel tries to
>> load the driver. But that also applies to other devices, you will not notice a broken
>> flash drive unless you try to access the data, not from just plugging it in. And if a
>> user really cares, he is always free to execute TPM2_SelfTest via /dev/tpm*. Any
>> other objections?
> 
> Your logic to this idea sounds good to me.  The only potential problem would be if
> the kernel were ever to directly use the TPM for storing data.  It perhaps might hit
> this at an inopportune time.
> Otherwise no objections from my side, but I'm no decision maker in this area.
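
For anyone who wants to try the "execute TPM2_SelfTest via /dev/tpm*" suggestion quoted above: the 2314 in the kernel message is 0x90A, which the TPM 2.0 specification lists as TPM_RC_TESTING. The following is only a sketch of how the command could be issued by hand and how the response code could be read back. The helper names (build_selftest, response_code, describe, run_selftest) are made up for illustration, the device path is an assumption, and this is not meant to mirror what the kernel driver does internally.

```python
import struct

TPM_ST_NO_SESSIONS = 0x8001      # command tag: no authorization sessions
TPM_CC_SELF_TEST = 0x00000143    # command code of TPM2_SelfTest
TPM_RC_SUCCESS = 0x000
TPM_RC_TESTING = 0x90A           # RC_WARN (0x900) + 0x00A, decimal 2314


def build_selftest(full_test):
    """Serialize TPM2_SelfTest; full_test maps to the fullTest=YES/NO parameter."""
    params = struct.pack(">B", 1 if full_test else 0)
    header = struct.pack(">HII", TPM_ST_NO_SESSIONS, 10 + len(params), TPM_CC_SELF_TEST)
    return header + params


def response_code(response):
    """Return the responseCode field from the 10-byte TPM response header."""
    _tag, _size, code = struct.unpack_from(">HII", response, 0)
    return code


def describe(code):
    """Name the two return codes discussed in this thread (hypothetical helper)."""
    names = {TPM_RC_SUCCESS: "TPM_RC_SUCCESS", TPM_RC_TESTING: "TPM_RC_TESTING"}
    return names.get(code, "other (0x%03X)" % code)


def run_selftest(path="/dev/tpm0", full_test=False):
    """Send TPM2_SelfTest to the device (path is an assumption) and decode the reply."""
    with open(path, "r+b", buffering=0) as tpm:
        tpm.write(build_selftest(full_test))
        return describe(response_code(tpm.readall()))


print(describe(2314))  # the value from the kernel log
```

Calling run_selftest(full_test=True) as root would correspond to the fullTest=YES variant mentioned in the quoted text. Whether this particular TPM then finishes its tests is exactly the open question.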


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 11:54                             ` Paul Menzel
@ 2017-12-15 14:39                               ` Mario.Limonciello
  -1 siblings, 0 replies; 51+ messages in thread
From: Mario.Limonciello @ 2017-12-15 14:39 UTC (permalink / raw)
  To: pmenzel, Alexander.Steffen, jgg
  Cc: linux-integrity, linux-kernel, rafael.j.wysocki, len.brown

> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Sent: Friday, December 15, 2017 5:54 AM
> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Rafael J.
> Wysocki <rafael.j.wysocki@intel.com>; Len Brown <len.brown@intel.com>
> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> occurred continue selftest`
> 
> [Adding Rafael and Len as they, to my knowledge, also use or have
> access to a Dell XPS 13 9360. With latest Linux master do you get TPM
> self-test errors, when cold starting the system without the power supply
> plugged in?]
> 
> Dear Mario, dear Alexander,
> 
> 
> the added line breaks to the quoted parts really mess up the citation.
> Can we please try to use MUAs avoiding that, or fixing that manually?

I don't know what you mean.  I think this is directed at Alexander?
If this is directed to me I can't change mail clients, sorry.

<snip>

> >
> > Yes it's required for the TPM model/vendor that is used in the XPS model that
> > Paul has.  If you try to run the upgrade without clearing it the firmware will
> > reject the upgrade.
> 
> Mario, thank you for your quick reaction.
> 
> […]
> 
> 1.  Can you reproduce this issue too?

I haven't seen this, but if this is a regression I also have not run anything
later than 4.15-rc1 right now.

> 2.  How do I find out, what TPM firmware version is installed?

fwupd will tell you.  Documentation (and code) here:
https://github.com/hughsie/fwupd/tree/master/plugins/dell

> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix

The TPM in the XPS is a discrete TPM that is treated separately from the system
firmware payload.  It supports both a "1.2" and a "2.0" firmware. 

It is independently flashed from a separate TPM payload that is distributed
on LVFS.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 11:54                             ` Paul Menzel
@ 2017-12-15 14:54                               ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-15 14:54 UTC (permalink / raw)
  To: pmenzel, mario.limonciello, jgg
  Cc: linux-integrity, linux-kernel, rafael.j.wysocki, len.brown

> [Adding Rafael and Len as they, to my knowledge, also use or have
> access to a Dell XPS 13 9360. With latest Linux master do you get TPM
> self-test errors, when cold starting the system without the power supply
> plugged in?]
> 
> Dear Mario, dear Alexander,
> 
> 
> the added line breaks to the quoted parts really mess up the citation.
> Can we please try to use MUAs avoiding that, or fixing that manually?

Sorry, I'm not sure whether my company has a way for me to avoid using Outlook ;-) But if there are any configuration changes to make it behave better, I will gladly apply them. Do you know of any documentation on this? All I found so far either is already applied or was outdated.

I'll remove some of the less relevant quoted parts, so that this is less of an issue.

> >>>>> To be clear, this issue is not reproducible during every start. (But
> >>>>> that was the same before.)
> 
> I think I found out how to reproduce the issue. Cold start the system
> without the power supply connected.
> 
> >>>> Thanks for testing. Now you are in the unlucky situation that your TPM was
> >>>> probably always broken, but old kernels did not detect that and used it anyway.
> 
> Just to clarify, I do not know if the TPM could ever be used. I believe
> the module loaded but the user space tools (tpm2_version or so) always
> returned an error in my tests.

Interesting. So maybe it is not a bug in your TPM's firmware, but really a single defective TPM? Can you try to figure that out? That is, when using an older kernel in the cold start scenario, can you execute any useful commands on your TPM successfully?

> >>> Something that Paul can consider is to upgrade the TPM firmware if it's not
> >>> already upgraded.  Since the launch of XPS 9360 there was at least one TPM
> >>> firmware update issued.  It has been posted to LVFS and can be upgraded using
> >>> fwupd/fwupdate.
> >>> Note: If your TPM is currently owned you will need to go into BIOS setup to
> >>> clear it first before upgrading.
> >>
> >> I'm not familiar with the specific TPM in your model, but according to the
> >> log it is a TPM 2.0, which does not really carry over the owner concept of
> >> a TPM 1.2. Is clearing it still necessary for an upgrade then?
> >
> > Yes it's required for the TPM model/vendor that is used in the XPS model that
> > Paul has.  If you try to run the upgrade without clearing it the firmware will
> > reject the upgrade.
> 
> Mario, thank you for your quick reaction.
> 
> […]
> 
> 1.  Can you reproduce this issue too?
> 2.  How do I find out, what TPM firmware version is installed?

If you get the driver loaded, you can ask the TPM (TPM2_GetCapability for TPM_PT_FIRMWARE_VERSION_1 and TPM_PT_FIRMWARE_VERSION_2):

python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); f.write(b"\x80\x01\x00\x00\x00\x16\x00\x00\x01z\x00\x00\x00\x06\x00\x00\x01\x0b\x00\x00\x00\x02"); print(f.readall())'
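
The byte string in that one-liner is a plain TPM2_GetCapability command. As a hedged illustration of its layout, the sketch below rebuilds the same 22 bytes field by field with struct. The constant names follow the TPM 2.0 specification; build_get_capability is a hypothetical helper name used only here.

```python
import struct

TPM_ST_NO_SESSIONS = 0x8001             # command tag: no authorization sessions
TPM_CC_GET_CAPABILITY = 0x0000017A      # command code of TPM2_GetCapability
TPM_CAP_TPM_PROPERTIES = 0x00000006     # capability group to query
TPM_PT_FIRMWARE_VERSION_1 = 0x0000010B  # first property to return


def build_get_capability(capability, first_property, count):
    """Serialize TPM2_GetCapability: 10-byte header plus three 32-bit parameters."""
    params = struct.pack(">III", capability, first_property, count)
    header = struct.pack(">HII", TPM_ST_NO_SESSIONS, 10 + len(params), TPM_CC_GET_CAPABILITY)
    return header + params


# Ask for two properties starting at TPM_PT_FIRMWARE_VERSION_1, which also
# covers TPM_PT_FIRMWARE_VERSION_2 (0x10C) because the two are adjacent.
command = build_get_capability(TPM_CAP_TPM_PROPERTIES, TPM_PT_FIRMWARE_VERSION_1, 2)
print(command.hex())
```

The "z" in the one-liner is simply the byte 0x7a, the low byte of the command code 0x17A.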

> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix
> the issue.

You've got a firmware from the future? ;-)

Alexander

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 14:39                               ` Mario.Limonciello
@ 2017-12-15 15:10                                 ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-15 15:10 UTC (permalink / raw)
  To: Mario Limonciello, Alexander Steffen, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, rafael.j.wysocki, len.brown

[-- Attachment #1: Type: text/plain, Size: 3784 bytes --]

Dear Mario,


On 12/15/17 15:39, Mario.Limonciello@dell.com wrote:
>> -----Original Message-----
>> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
>> Sent: Friday, December 15, 2017 5:54 AM
>> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
>> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Rafael J.
>> Wysocki <rafael.j.wysocki@intel.com>; Len Brown <len.brown@intel.com>
>> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
>> occurred continue selftest`

[…]

>> the added line breaks to the quoted parts really mess up the citation.
>> Can we please try to use MUAs avoiding that, or fixing that manually?
> 
> I don't know what you mean.  I think this is directed at Alexander?
> If this is directed to me I can't change mail clients, sorry.

I think it started in Alexander’s reply (Message-ID:
<10b81a727ba940889095fa4bb29d0863@infineon.com>) that line breaks were
added to quotes. Yours only adds a long “Original Message” header.

> <snip>
>
>>> Yes it's required for the TPM model/vendor that is used in the XPS model that
>>> Paul has.  If you try to run the upgrade without clearing it the firmware will
>>> reject the upgrade.
>>
>> Mario, thank you for your quick reaction.
>>
>> […]
>>
>> 1.  Can you reproduce this issue too?
> 
> I haven't seen this, but if this is a regression I also have not run anything
> later than 4.15-rc1 right now.

Well as far as I understood it, it’s not a regression, and there is now 
just better error reporting. Did you ever get the TPM to work?

>> 2.  How do I find out, what TPM firmware version is installed?
> 
> fwupd will tell you.  Documentation (and code) here:
> https://github.com/hughsie/fwupd/tree/master/plugins/dell

Unfortunately, the TPM is not listed by fwupd 0.7.0-0ubuntu4.3 in the Ubuntu
16.04.3 LTS installed by Dell.

```
$ fwupdmgr get-devices
ro__sys_devices_pci0000_00_0000_00_02_0
   Guid:                 3ec3df3a-2290-56e5-9d2f-eda62e9ab50b
   Provider:             Udev
   Flags:                internal|locked
   DeviceVendor:         Intel Corporation
   Created:              2017-12-15
   Trusted:              none

UEFI-5ffdbc0d-f340-441c-a803-8439c8c0ae10-dev0
   Guid:                 5ffdbc0d-f340-441c-a803-8439c8c0ae10
   DisplayName:          XPS 13 9360
   Provider:             UEFI
   Flags:                internal|allow-offline|require-ac
   Version:              0.2.4.2
   VersionLowest:        0.2.4.2
   Created:              2017-12-15
   Trusted:              none

usb:00:05
   Guid:                 87c78d19-a3ed-5778-9b69-8eb701529940
   DisplayName:          Integrated_Webcam_HD
   Provider:             USB
   Flags:                none
   Version:              99.24
   Created:              2017-12-15
   Trusted:              none

usb:00:04
   Guid:                 0f15c153-cc04-589b-8886-aba87f98918d
   DisplayName:          Touchscreen
   Provider:             USB
   Flags:                none
   Version:              17.17
   Created:              2017-12-15
   Trusted:              none

```

>> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix

[The date is December 12th, 2017.]

> The TPM in the XPS is a discrete TPM that is treated separately from the system
> firmware payload.  It supports both a "1.2" and a "2.0" firmware.
> 
> It is independently flashed from a separate TPM payload that is distributed
> on LVFS.

It looks like I am out of luck with Ubuntu 16.04.3 [1].


Kind regards,

Paul


[1] https://github.com/hughsie/fwupd/issues/301#issuecomment-342164366


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 15:10                                 ` Paul Menzel
@ 2017-12-15 15:24                                   ` Mario.Limonciello
  -1 siblings, 0 replies; 51+ messages in thread
From: Mario.Limonciello @ 2017-12-15 15:24 UTC (permalink / raw)
  To: pmenzel, Alexander.Steffen, jgg
  Cc: linux-integrity, linux-kernel, rafael.j.wysocki, len.brown

> -----Original Message-----
> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> Sent: Friday, December 15, 2017 9:11 AM
> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org;
> rafael.j.wysocki@intel.com; len.brown@intel.com
> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> occurred continue selftest`
> 
> Dear Mario,
> 
> 
> On 12/15/17 15:39, Mario.Limonciello@dell.com wrote:
> >> -----Original Message-----
> >> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
> >> Sent: Friday, December 15, 2017 5:54 AM
> >> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
> >> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
> >> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Rafael J.
> >> Wysocki <rafael.j.wysocki@intel.com>; Len Brown <len.brown@intel.com>
> >> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
> >> occurred continue selftest`
> 
> […]
> 
> >> the added line breaks to the quoted parts really mess up the citation.
> >> Can we please try to use MUAs avoiding that, or fixing that manually?
> >
> > I don't know what you mean.  I think this is directed at Alexander?
> > If this is directed to me I can't change mail clients, sorry.
> 
> I think it started in Alexander’s reply (Message-ID:
> <10b81a727ba940889095fa4bb29d0863@infineon.com>) that line breaks were
> added to quotes. Yours only adds a long “Original Message” header.
> 
File->Options->Mail->Replies and Forwards
Change "When Replying to a message"
to "Prefix each line of the original message"

That should help.

> > <snip>
> >
> >>> Yes it's required for the TPM model/vendor that is used in the XPS model that
> >>> Paul has.  If you try to run the upgrade without clearing it the firmware will
> >>> reject the upgrade.
> >>
> >> Mario, thank you for your quick reaction.
> >>
> >> […]
> >>
> >> 1.  Can you reproduce this issue too?
> >
> > I haven't seen this, but if this is a regression I also have not run anything
> > later than 4.15-rc1 right now.
> 
> Well as far as I understood it, it’s not a regression, and there is now
> just better error reporting. Did you ever get the TPM to work?

I don't personally use a TPM with Linux on the XPS 9360, but TPM was 
tested by our partners when the XPS 9360 was enabled for Ubuntu.

> 
> >> 2.  How do I find out, what TPM firmware version is installed?
> >
> > fwupd will tell you.  Documentation (and code) here:
> > https://github.com/hughsie/fwupd/tree/master/plugins/dell
> 
> Unfortunately, the TPM is not listed by fwupd 0.7.0-0ubuntu4.3 in the Ubuntu
> 16.04.3 LTS installed by Dell.

If you're unwilling to upgrade to a newer userspace, you can write
a simple application that uses the new dell-smbios interface from 4.15 to look this up.

Most of the work is already done (see
https://github.com/torvalds/linux/blob/master/tools/wmi/dell-smbios-example.c#L179)

You'll just need to add some parsing around the output (see fwupd code) to
tell the version.  If you do this, please feel free to submit it to platform-x86; it may be
useful for someone else in this sample application too.

Given you don't have a newer fwupd on your system you won't be able to "easily" flash
the newer TPM firmware.

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 14:54                               ` Alexander.Steffen
@ 2017-12-15 15:26                                 ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-15 15:26 UTC (permalink / raw)
  To: Alexander Steffen, Mario Limonciello, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, rafael.j.wysocki, len.brown

[-- Attachment #1: Type: text/plain, Size: 4274 bytes --]

Dear Alexander,


On 12/15/17 15:54, Alexander.Steffen@infineon.com wrote:

[…]

>> the added line breaks to the quoted parts really mess up the citation.
>> Can we please try to use MUAs avoiding that, or fixing that manually?
> 
> Sorry, I'm not sure whether my company has a way for me to avoid using Outlook ;-) But if there are any configuration changes to make it behave better, I will gladly apply them. Do you know of any documentation on this? All I found so far either is already applied or was outdated.

No idea, but lines in quotes should probably not be touched or re-wrapped, at
least not without adding the right quoting level on the next line.

> I'll remove some of the less relevant quoted parts, so that this is less of an issue.
> 
>>>>>>> To be clear, this issue is not reproducible during every start. (But
>>>>>>> that was the same before.)
>>
>> I think I found out how to reproduce the issue. Cold start the system
>> without the power supply connected.
>>
>>>>>> Thanks for testing. Now you are in the unlucky situation that your TPM was
>>>>>> probably always broken, but old kernels did not detect that and used it anyway.
>>
>> Just to clarify, I do not know if the TPM could ever be used. I believe
>> the module loaded but the user space tools (tpm2_version or so) always
>> returned an error in my tests.
> 
> Interesting. So maybe it is not a bug in your TPM's firmware, but really a single defective TPM? Can you try to figure that out? That is, when using an older kernel in the cold start scenario, can you execute any useful commands on your TPM successfully?

```
$ uname -a
Linux Ixpees 4.10.0-42-generic #46~16.04.1-Ubuntu SMP Mon Dec 4 15:57:59 
UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
$ more /proc/version
Linux version 4.10.0-42-generic (buildd@lgw01-amd64-007) (gcc version
5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.5) ) #46~16.04.1-Ubuntu SMP
Mon Dec 4 15:57:59 UTC 2017
$ dmesg | grep tpm
[    0.999122] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
$ sudo tpm_version
Tspi_Context_Connect failed: 0x00003011 - layer=tsp, code=0011 (17), 
Communication failure
$ tpm_version --version
tpm_version version: 1.3.8
```

>>>>> Something that Paul can consider is to upgrade the TPM firmware if it's not
>>>>> already upgraded.  Since the launch of XPS 9360 there was at least one TPM
>>>>> firmware update issued.  It has been posted to LVFS and can be upgraded using
>>>>> fwupd/fwupdate.
>>>>> Note: If your TPM is currently owned you will need to go into BIOS setup to
>>>>> clear it first before upgrading.
>>>>
>>>> I'm not familiar with the specific TPM in your model, but according to the
>>>> log it is a TPM 2.0, which does not really carry over the owner concept of
>>>> a TPM 1.2. Is clearing it still necessary for an upgrade then?
>>>
>>> Yes it's required for the TPM model/vendor that is used in the XPS model that
>>> Paul has.  If you try to run the upgrade without clearing it the firmware will
>>> reject the upgrade.
>>
>> Mario, thank you for your quick reaction.
>>
>> […]
>>
>> 1.  Can you reproduce this issue too?
>> 2.  How do I find out, what TPM firmware version is installed?
> 
> If you get the driver loaded, you can ask the TPM (TPM2_GetCapability for TPM_PT_FIRMWARE_VERSION_1 and TPM_PT_FIRMWARE_VERSION_2):
> 
> python3 -c 'f=open("/dev/tpm0", "r+b", buffering=0); f.write(b"\x80\x01\x00\x00\x00\x16\x00\x00\x01z\x00\x00\x00\x06\x00\x00\x01\x0b\x00\x00\x00\x02"); print(f.readall())'

```
$ sudo python3
Python 3.5.2 (default, Nov 23 2017, 16:37:01)
[GCC 5.4.0 20160609] on linux
Type "help", "copyright", "credits" or "license" for more information.
 >>> f=open("/dev/tpm0", "r+b", buffering=0)
 >>> f.write(b"\x80\x01\x00\x00\x00\x16\x00\x00\x01z\x00\x00\x00\x06\x00\x00\x01\x0b\x00\x00\x00\x02")
22
 >>> print(f.readall())
b'\x80\x01\x00\x00\x00#\x00\x00\x00\x00\x01\x00\x00\x00\x06\x00\x00\x00\x02\x00\x00\x01\x0b\x00\x01\x00\x03\x00\x00\x01\x0c\x00\x00\x00\x01'
```
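
For reference, the response above can be decoded by hand. The following is only a sketch, assuming the standard TPM 2.0 response layout (a 10-byte header, then moreData, the capability, and a counted list of property/value pairs). The helper name `parse_get_capability` is made up for illustration, and how the two 32-bit values map to a vendor version string is vendor-specific and not decoded here.

```python
import struct

# Response bytes copied from the Python session above.
RESPONSE = (b'\x80\x01\x00\x00\x00#\x00\x00\x00\x00\x01\x00\x00\x00\x06'
            b'\x00\x00\x00\x02\x00\x00\x01\x0b\x00\x01\x00\x03'
            b'\x00\x00\x01\x0c\x00\x00\x00\x01')

TPM_PT_FIRMWARE_VERSION_1 = 0x10B
TPM_PT_FIRMWARE_VERSION_2 = 0x10C


def parse_get_capability(resp):
    """Decode a TPM2_GetCapability response for TPM properties.

    Hypothetical helper: returns a dict mapping property id to value.
    """
    # Header: tag (2 bytes), total size (4 bytes), response code (4 bytes).
    tag, size, rc = struct.unpack_from(">HII", resp, 0)
    if size != len(resp):
        raise ValueError("size field does not match buffer length")
    if rc != 0:
        raise ValueError("TPM returned response code 0x%x" % rc)
    # Body: moreData (1 byte), capability (4 bytes), property count (4 bytes).
    more_data, capability, count = struct.unpack_from(">BII", resp, 10)
    props = {}
    for i in range(count):
        # Each entry is a 4-byte property id followed by a 4-byte value.
        prop, value = struct.unpack_from(">II", resp, 19 + 8 * i)
        props[prop] = value
    return props


for prop, value in sorted(parse_get_capability(RESPONSE).items()):
    print("0x%03x = 0x%08x" % (prop, value))
```

With the bytes above, this gives 0x00010003 for TPM_PT_FIRMWARE_VERSION_1 and 0x00000001 for TPM_PT_FIRMWARE_VERSION_2.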

>> 3.  Updating to the firmware 2.4.2 from December 17th, 2017 didn’t fix
>> the issue.
> 
> You've got a firmware from the future? ;-)

Oops, right. It’s from December 12th, 2017. ;-)


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-15 15:24                                   ` Mario.Limonciello
@ 2017-12-15 15:38                                     ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-15 15:38 UTC (permalink / raw)
  To: Mario Limonciello, Alexander Steffen, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, Rafael J. Wysocki, Len Brown,
	Thorsten Leemhuis

[-- Attachment #1: Type: text/plain, Size: 3455 bytes --]

Dear Mario,


On 12/15/17 16:24, Mario.Limonciello@dell.com wrote:
>> -----Original Message-----
>> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
>> Sent: Friday, December 15, 2017 9:11 AM
>> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
>> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
>> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org;
>> rafael.j.wysocki@intel.com; len.brown@intel.com
>> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
>> occurred continue selftest`
>>
>> Dear Mario,
>>
>>
>> On 12/15/17 15:39, Mario.Limonciello@dell.com wrote:
>>>> -----Original Message-----
>>>> From: Paul Menzel [mailto:pmenzel@molgen.mpg.de]
>>>> Sent: Friday, December 15, 2017 5:54 AM
>>>> To: Limonciello, Mario <Mario_Limonciello@Dell.com>; Alexander Steffen
>>>> <Alexander.Steffen@infineon.com>; Jason Gunthorpe <jgg@ziepe.ca>
>>>> Cc: linux-integrity@vger.kernel.org; linux-kernel@vger.kernel.org; Rafael J.
>>>> Wysocki <rafael.j.wysocki@intel.com>; Len Brown <len.brown@intel.com>
>>>> Subject: Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314)
>>>> occurred continue selftest`

[…]

>>>> Mario, thank you for your quick reaction.
>>>>
>>>> […]
>>>>
>>>> 1.  Can you reproduce this issue too?
>>>
>>> I haven't seen this, but if this is a regression I also have not run anything
>>> later than 4.15-rc1 right now.
>>
>> Well as far as I understood it, it’s not a regression, and there is now
>> just better error reporting. Did you ever get the TPM to work?
> 
> I don't personally use a TPM with Linux on the XPS 9360, but TPM was
> tested by our partners when the XPS 9360 was enabled for Ubuntu.

It’d be great to know how it was tested, as currently it doesn’t work 
here. I included the output in my reply to Alexander just now.

>>>> 2.  How do I find out, what TPM firmware version is installed?
>>>
>>> fwupd will tell you.  Documentation (and code) here:
>>> https://github.com/hughsie/fwupd/tree/master/plugins/dell
>>
>> Unfortunately it’s not listed with fwupd 0.7.0-0ubuntu4.3 in Ubuntu
>> 16.04.3 LTS installed by Dell.
> 
> If you're unwilling to upgrade to a newer userspace, you can write
> a simple application that uses the new dell-smbios interface from 4.15 to look this up.

Is the newer userspace officially supported by Dell? I would really 
like to avoid wasting time with Dell support again because I am not 
using their installed OS. Sometimes Dell support even claims that the 
installed Ubuntu flavor is not supported by Dell.

> Most of the work is already done (see
> https://github.com/torvalds/linux/blob/master/tools/wmi/dell-smbios-example.c#L179)
> 
> You'll just need to add some parsing around the output (see fwupd code) to
> tell the version.  If you do this, please feel free to submit it to platform-x86; it may be useful
> for someone else in this sample application too.
> 
> Given you don't have a newer fwupd on your system you won't be able to "easily" flash
> the newer TPM firmware.

I assume you don’t make these decisions, but please relay to your 
superiors that working functionality in the recommended/shipped OS is 
really what customers expect when buying a device with GNU/Linux 
support. In my opinion, Dell and Canonical really need to step up there.


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-11 12:54               ` Paul Menzel
@ 2017-12-21 13:36                 ` Mimi Zohar
  -1 siblings, 0 replies; 51+ messages in thread
From: Mimi Zohar @ 2017-12-21 13:36 UTC (permalink / raw)
  To: Paul Menzel, Jason Gunthorpe
  Cc: Alexander.Steffen, linux-integrity, linux-kernel

Hi Paul,

On Mon, 2017-12-11 at 13:54 +0100, Paul Menzel wrote:
> Dear Jason,
> 
> 
> On 12/08/17 17:18, Jason Gunthorpe wrote:
> > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > 
> >> I have no access to the system right now, but want to point out, that the
> >> log was created by `journalctl -k`, so I do not know if that messes with the
> >> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > 
> > It is a good question, I don't know. If your kernel isn't set up to
> > timestamp messages then the journalstamp will certainly be garbage.
> > 
> > No idea why you wouldn't see the messages in dmesg, if they are not in
> > dmesg they couldn't get into the journal
> 
> It looks like I was running an older Linux kernel version, when running 
> `dmesg`. Sorry for the noise. Here are the messages with the Linux 
> kernel time stamps, showing that the delays work correctly.
> 
> ```
> $ uname -a
> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3 
> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> $ sudo dmesg | grep TPM
> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl 
> 00000001 AMI  00000000)
> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> [    3.734808] tpm tpm0: TPM self test failed
> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> ```
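
As a side note on the delays mentioned above, the gaps between the retries can be read off the kernel time stamps. This is only a sketch: the variable names are hypothetical, and the time stamps are copied from the quoted log. It assumes the last gap ends at the "self test failed" line.

```python
import re

# tpm0 lines copied from the quoted dmesg output.
LOG = """\
[    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
[    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
[    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
[    3.734808] tpm tpm0: TPM self test failed
"""

# Extract the time stamp (seconds since boot) from every tpm0 line.
stamps = [float(s) for s in
          re.findall(r"^\[\s*(\d+\.\d+)\] tpm tpm0:", LOG, re.M)]

# Gap between consecutive messages, in seconds.
deltas = [b - a for a, b in zip(stamps, stamps[1:])]

# Ratio of each gap to the previous one.
ratios = [b / a for a, b in zip(deltas, deltas[1:])]

for delta in deltas:
    print("%7.1f ms" % (delta * 1000))
print(["%.2f" % r for r in ratios])
```

The gaps come out at roughly 31, 51, 92, 172, 332, 652 and 1280 ms. Each is a bit less than double the previous one, which would be consistent with a doubling retry delay plus a roughly constant per-attempt overhead.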

I've sort of been following this thread, but just want to make sure
that once the self test is/was fixed, that you aren't seeing the IMA
message.

Assuming this is fixed, could someone provide the commit that fixes
it?

thanks,

Mimi

^ permalink raw reply	[flat|nested] 51+ messages in thread

* RE: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-21 13:36                 ` Mimi Zohar
@ 2017-12-22 14:00                   ` Alexander.Steffen
  -1 siblings, 0 replies; 51+ messages in thread
From: Alexander.Steffen @ 2017-12-22 14:00 UTC (permalink / raw)
  To: zohar, pmenzel, jgg; +Cc: linux-integrity, linux-kernel

> Hi Paul,
> 
> On Mon, 2017-12-11 at 13:54 +0100, Paul Menzel wrote:
> > Dear Jason,
> >
> >
> > On 12/08/17 17:18, Jason Gunthorpe wrote:
> > > On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
> > >
> > >> I have no access to the system right now, but want to point out, that the
> > >> log was created by `journalctl -k`, so I do not know if that messes with the
> > >> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
> > >> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
> > >> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
> > >
> > > It is a good question, I don't know. If your kernel isn't set up to
> > > timestamp messages then the journalstamp will certainly be garbage.
> > >
> > > No idea why you wouldn't see the messages in dmesg, if they are not in
> > > dmesg they couldn't get into the journal
> >
> > It looks like I was running an older Linux kernel version, when running
> > `dmesg`. Sorry for the noise. Here are the messages with the Linux
> > kernel time stamps, showing that the delays work correctly.
> >
> > ```
> > $ uname -a
> > Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
> > 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
> > $ sudo dmesg | grep TPM
> > [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
> > 00000001 AMI  00000000)
> > [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
> > [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
> > [    3.734808] tpm tpm0: TPM self test failed
> > [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
> > ```
> 
> I've sort of been following this thread, but just want to make sure
> that once the self test is/was fixed, that you aren't seeing the IMA
> message.
> 
> Assuming this is fixed, could someone provide the commit that fixes
> it?

I don't think we've found a solution yet. There might be a firmware upgrade that changes that TPM's behavior. Or maybe my latest patch helps? https://patchwork.kernel.org/patch/10130535/
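
For reference, the 2314 in the log above is the decimal form of 0x90A, which the TPM 2.0 specification lists as TPM_RC_TESTING, a warning meaning that the TPM is still running its self-tests. A minimal sketch of that decoding follows; the function name and the one-entry table are only for illustration and cover nothing beyond the code seen in this thread.

```python
RC_FMT1 = 0x080  # bit 7: set for format-one (parameter/handle/session) codes
RC_WARN = 0x900  # base value of the TPM 2.0 warning codes

# Deliberately tiny table: only the warning seen in this thread.
WARNINGS = {0x0A: "TPM_RC_TESTING"}


def describe_rc(rc):
    """Return a short description of a TPM 2.0 response code."""
    if rc & RC_FMT1:
        return "0x%03x: format-one code (not decoded here)" % rc
    if (rc & RC_WARN) == RC_WARN:
        # The low seven bits carry the warning number.
        name = WARNINGS.get(rc & 0x7F, "(unknown)")
        return "0x%03x: warning %s" % (rc, name)
    return "0x%03x: format-zero error (not decoded here)" % rc


print(describe_rc(2314))
```

This prints `0x90a: warning TPM_RC_TESTING` for the value the kernel logs.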

Alexander

^ permalink raw reply	[flat|nested] 51+ messages in thread

* Re: [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest`
  2017-12-22 14:00                   ` Alexander.Steffen
@ 2017-12-22 14:08                     ` Paul Menzel
  -1 siblings, 0 replies; 51+ messages in thread
From: Paul Menzel @ 2017-12-22 14:08 UTC (permalink / raw)
  To: Alexander Steffen, Mimi Zohar, Jason Gunthorpe
  Cc: linux-integrity, linux-kernel, Mario Limonciello, Thorsten Leemhuis

[-- Attachment #1: Type: text/plain, Size: 3014 bytes --]

Dear Alexander, dear Mimi,


On 12/22/17 15:00, Alexander.Steffen@infineon.com wrote:

>> On Mon, 2017-12-11 at 13:54 +0100, Paul Menzel wrote:
>>> Dear Jason,
>>>
>>>
>>> On 12/08/17 17:18, Jason Gunthorpe wrote:
>>>> On Fri, Dec 08, 2017 at 05:07:39PM +0100, Paul Menzel wrote:
>>>>
>>>>> I have no access to the system right now, but want to point out, that the
>>>>> log was created by `journalctl -k`, so I do not know if that messes with the
>>>>> time stamps. I checked the output of `dmesg` but didn’t see the TPM error
>>>>> messages in the output – only `tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE,
>>>>> rev-id 4)`. Do I need to pass a different error message to `dmesg`?
>>>>
>>>> It is a good question, I don't know. If your kernel isn't set up to
>>>> timestamp messages then the journalstamp will certainly be garbage.
>>>>
>>>> No idea why you wouldn't see the messages in dmesg, if they are not in
>>>> dmesg they couldn't get into the journal
>>>
>>> It looks like I was running an older Linux kernel version, when running
>>> `dmesg`. Sorry for the noise. Here are the messages with the Linux
>>> kernel time stamps, showing that the delays work correctly.
>>>
>>> ```
>>> $ uname -a
>>> Linux Ixpees 4.15.0-041500rc2-generic #201712031230 SMP Sun Dec 3
>>> 17:32:03 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux
>>> $ sudo dmesg | grep TPM
>>> [    0.000000] ACPI: TPM2 0x000000006F332168 000034 (v03        Tpm2Tabl
>>> 00000001 AMI  00000000)
>>> [    1.114355] tpm_tis MSFT0101:00: 2.0 TPM (device-id 0xFE, rev-id 4)
>>> [    1.125250] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    1.156645] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    1.208053] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    1.299640] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    1.471223] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    1.802819] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    2.454320] tpm tpm0: A TPM error (2314) occurred continue selftest
>>> [    3.734808] tpm tpm0: TPM self test failed
>>> [    3.759675] ima: No TPM chip found, activating TPM-bypass! (rc=-19)
>>> ```
>>
>> I've sort of been following this thread, but just want to make sure
>> that once the self test is/was fixed, that you aren't seeing the IMA
>> message.
>>
>> Assuming this is fixed, could someone provide the commit that fixes
>> it?
> 
> I don't think we've found a solution yet.

Correct, it’s not fixed yet to my knowledge.

> There might be a firmware upgrade that changes that TPM's behavior.

Indeed, but I am unable to update without losing support from Dell. 
Maybe some Dell XPS 13 9360 user is able to test the TPM functionality 
over the holidays.

> Or maybe my latest patch helps? https://patchwork.kernel.org/patch/10130535/

I won’t have access to the device until next year.


Kind regards,

Paul


[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 5174 bytes --]

^ permalink raw reply	[flat|nested] 51+ messages in thread

end of thread, other threads:[~2017-12-22 14:08 UTC | newest]

Thread overview: 51+ messages
2017-12-06 12:34 [Regression 4.15-rc2] New messages `tpm tpm0: A TPM error (2314) occurred continue selftest` Paul Menzel
2017-12-06 16:40 ` Jason Gunthorpe
2017-12-07 15:56 ` Alexander.Steffen
2017-12-07 18:37   ` Jason Gunthorpe
2017-12-08 12:14     ` Alexander.Steffen
2017-12-08 15:56       ` Jason Gunthorpe
2017-12-08 16:07         ` Paul Menzel
2017-12-08 16:18           ` Jason Gunthorpe
2017-12-11 12:54             ` Paul Menzel
2017-12-11 16:08               ` Alexander.Steffen
2017-12-14 10:33                 ` Paul Menzel
2017-12-14 12:20                   ` Alexander.Steffen
2017-12-14 14:15                     ` Mario.Limonciello
2017-12-14 16:12                       ` Alexander.Steffen
2017-12-14 19:43                         ` Mario.Limonciello
2017-12-15 11:54                           ` Paul Menzel
2017-12-15 14:39                             ` Mario.Limonciello
2017-12-15 15:10                               ` Paul Menzel
2017-12-15 15:24                                 ` Mario.Limonciello
2017-12-15 15:38                                   ` Paul Menzel
2017-12-15 14:54                             ` Alexander.Steffen
2017-12-15 15:26                               ` Paul Menzel
2017-12-21 13:36               ` Mimi Zohar
2017-12-22 14:00                 ` Alexander.Steffen
2017-12-22 14:08                   ` Paul Menzel
2017-12-08 16:17         ` Mimi Zohar
