All of lore.kernel.org
 help / color / mirror / Atom feed
* 2.6.37.1 s2disk regression (TPM)
@ 2011-02-20 10:13 Jiri Slaby
  2011-02-20 10:44 ` Rafael J. Wysocki
  2011-02-20 10:44 ` Rafael J. Wysocki
  0 siblings, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:13 UTC (permalink / raw)
  To: stefanb
  Cc: linux-pm, srajiv, stable, Linux kernel mailing list, debora, tpmdd-devel

Hi,

I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
[10974.074587] Suspending console(s) (use no_console_suspend to debug)
[10974.103073] tpm_tis 00:0c: Operation Timed out
[10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
[10974.103095] PM: Device 00:0c failed to freeze: error -62

2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
returned from TPM) for testing.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:13 2.6.37.1 s2disk regression (TPM) Jiri Slaby
@ 2011-02-20 10:44 ` Rafael J. Wysocki
  2011-02-20 10:46   ` Jiri Slaby
  2011-02-20 10:46   ` Jiri Slaby
  2011-02-20 10:44 ` Rafael J. Wysocki
  1 sibling, 2 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 10:44 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, linux-pm, srajiv, stable, Linux kernel mailing list,
	debora, tpmdd-devel

On Sunday, February 20, 2011, Jiri Slaby wrote:
> Hi,
> 
> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> [10974.103073] tpm_tis 00:0c: Operation Timed out
> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> 
> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> returned from TPM) for testing.

Yes, this has been confirmed to cause suspend regressions to happen.

Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:13 2.6.37.1 s2disk regression (TPM) Jiri Slaby
  2011-02-20 10:44 ` Rafael J. Wysocki
@ 2011-02-20 10:44 ` Rafael J. Wysocki
  1 sibling, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 10:44 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, tpmdd-devel,
	linux-pm, stable

On Sunday, February 20, 2011, Jiri Slaby wrote:
> Hi,
> 
> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> [10974.103073] tpm_tis 00:0c: Operation Timed out
> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> 
> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> returned from TPM) for testing.

Yes, this has been confirmed to cause suspend regressions to happen.

Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:44 ` Rafael J. Wysocki
@ 2011-02-20 10:46   ` Jiri Slaby
  2011-02-20 10:51     ` Rafael J. Wysocki
  2011-02-20 10:51     ` Rafael J. Wysocki
  2011-02-20 10:46   ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: stefanb, linux-pm, srajiv, stable, Linux kernel mailing list,
	debora, tpmdd-devel

On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
>> Hi,
>>
>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
>> [10974.103073] tpm_tis 00:0c: Operation Timed out
>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
>>
>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
>> returned from TPM) for testing.
> 
> Yes, this has been confirmed to cause suspend regressions to happen

OK, the revert works for me too... Are there any fixes?

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:44 ` Rafael J. Wysocki
  2011-02-20 10:46   ` Jiri Slaby
@ 2011-02-20 10:46   ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, tpmdd-devel,
	linux-pm, stable

On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
>> Hi,
>>
>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
>> [10974.103073] tpm_tis 00:0c: Operation Timed out
>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
>>
>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
>> returned from TPM) for testing.
> 
> Yes, this has been confirmed to cause suspend regressions to happen

OK, the revert works for me too... Are there any fixes?

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:46   ` Jiri Slaby
@ 2011-02-20 10:51     ` Rafael J. Wysocki
  2011-02-20 10:57       ` [REVERT request stable-2.6.36/37] " Jiri Slaby
                         ` (3 more replies)
  2011-02-20 10:51     ` Rafael J. Wysocki
  1 sibling, 4 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 10:51 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, linux-pm, srajiv, stable, Linux kernel mailing list,
	debora, tpmdd-devel

On Sunday, February 20, 2011, Jiri Slaby wrote:
> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> > On Sunday, February 20, 2011, Jiri Slaby wrote:
> >> Hi,
> >>
> >> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> >> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> >> [10974.103073] tpm_tis 00:0c: Operation Timed out
> >> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> >> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> >>
> >> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> >> returned from TPM) for testing.
> > 
> > Yes, this has been confirmed to cause suspend regressions to happen
> 
> OK, the revert works for me too... Are there any fixes?

No, and the author and maintainer have not been responding.  If that contiunes,
I'll simply ask Linus to revert it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:46   ` Jiri Slaby
  2011-02-20 10:51     ` Rafael J. Wysocki
@ 2011-02-20 10:51     ` Rafael J. Wysocki
  1 sibling, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 10:51 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, tpmdd-devel,
	linux-pm, stable

On Sunday, February 20, 2011, Jiri Slaby wrote:
> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> > On Sunday, February 20, 2011, Jiri Slaby wrote:
> >> Hi,
> >>
> >> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> >> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> >> [10974.103073] tpm_tis 00:0c: Operation Timed out
> >> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> >> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> >>
> >> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> >> returned from TPM) for testing.
> > 
> > Yes, this has been confirmed to cause suspend regressions to happen
> 
> OK, the revert works for me too... Are there any fixes?

No, and the author and maintainer have not been responding.  If that contiunes,
I'll simply ask Linus to revert it.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [REVERT request stable-2.6.36/37] Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:51     ` Rafael J. Wysocki
  2011-02-20 10:57       ` [REVERT request stable-2.6.36/37] " Jiri Slaby
@ 2011-02-20 10:57       ` Jiri Slaby
  2011-02-20 16:50         ` [stable] " Greg KH
  2011-02-20 16:50         ` Greg KH
  2011-02-20 11:48       ` Rafael J. Wysocki
  2011-02-20 11:48       ` Rafael J. Wysocki
  3 siblings, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: stefanb, linux-pm, srajiv, stable, Linux kernel mailing list,
	debora, tpmdd-devel

On 02/20/2011 11:51 AM, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
>> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
>>> On Sunday, February 20, 2011, Jiri Slaby wrote:
>>>> Hi,
>>>>
>>>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
>>>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
>>>> [10974.103073] tpm_tis 00:0c: Operation Timed out
>>>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
>>>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
>>>>
>>>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
>>>> returned from TPM) for testing.
>>>
>>> Yes, this has been confirmed to cause suspend regressions to happen
>>
>> OK, the revert works for me too... Are there any fixes?
> 
> No, and the author and maintainer have not been responding.  If that contiunes,
> I'll simply ask Linus to revert it.

Ok then, stable team, please revert that from 2.6.37 stable.

I guess the same problem is in 2.6.36, so reverting it in 2.6.36 should
be done too...

thanks,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* [REVERT request stable-2.6.36/37] Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:51     ` Rafael J. Wysocki
@ 2011-02-20 10:57       ` Jiri Slaby
  2011-02-20 10:57       ` Jiri Slaby
                         ` (2 subsequent siblings)
  3 siblings, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, tpmdd-devel,
	linux-pm, stable

On 02/20/2011 11:51 AM, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
>> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
>>> On Sunday, February 20, 2011, Jiri Slaby wrote:
>>>> Hi,
>>>>
>>>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
>>>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
>>>> [10974.103073] tpm_tis 00:0c: Operation Timed out
>>>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
>>>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
>>>>
>>>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
>>>> returned from TPM) for testing.
>>>
>>> Yes, this has been confirmed to cause suspend regressions to happen
>>
>> OK, the revert works for me too... Are there any fixes?
> 
> No, and the author and maintainer have not been responding.  If that contiunes,
> I'll simply ask Linus to revert it.

Ok then, stable team, please revert that from 2.6.37 stable.

I guess the same problem is in 2.6.36, so reverting it in 2.6.36 should
be done too...

thanks,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:51     ` Rafael J. Wysocki
                         ` (2 preceding siblings ...)
  2011-02-20 11:48       ` Rafael J. Wysocki
@ 2011-02-20 11:48       ` Rafael J. Wysocki
  2011-02-21 15:30         ` Rajiv Andrade
  2011-02-21 15:30         ` Rajiv Andrade
  3 siblings, 2 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 11:48 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, linux-pm, srajiv, stable, Linux kernel mailing list,
	debora, Linus Torvalds

On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
> > On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> > > On Sunday, February 20, 2011, Jiri Slaby wrote:
> > >> Hi,
> > >>
> > >> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> > >> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> > >> [10974.103073] tpm_tis 00:0c: Operation Timed out
> > >> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> > >> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> > >>
> > >> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> > >> returned from TPM) for testing.
> > > 
> > > Yes, this has been confirmed to cause suspend regressions to happen
> > 
> > OK, the revert works for me too... Are there any fixes?
> 
> No, and the author and maintainer have not been responding.  If that contiunes,
> I'll simply ask Linus to revert it.

BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems to be
completely broken:

@@ -577,9 +577,11 @@ duration:
        if (rc)
                return;
 
-       if (be32_to_cpu(tpm_cmd.header.out.return_code)
-           != 3 * sizeof(u32))
+       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
+           be32_to_cpu(tpm_cmd.header.out.length)
+           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 * sizeof(u32))
                return;
+
        duration_cap = &tpm_cmd.params.getcap_out.cap.duration;
        chip->vendor.duration[TPM_SHORT] =
            usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));

Namely, either the old code always returned as a result of the conditional
being removed, or the new code will always return as a result of
the (... != 0) check.  I wonder if there's supposed to be (... == 0) instead?
[And why not to simply use 4*sizeof(u32) FWIW?]

Anyway, it looks like a good revert candidate to me.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:51     ` Rafael J. Wysocki
  2011-02-20 10:57       ` [REVERT request stable-2.6.36/37] " Jiri Slaby
  2011-02-20 10:57       ` Jiri Slaby
@ 2011-02-20 11:48       ` Rafael J. Wysocki
  2011-02-20 11:48       ` Rafael J. Wysocki
  3 siblings, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-20 11:48 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, linux-pm,
	Linus Torvalds, stable

On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
> On Sunday, February 20, 2011, Jiri Slaby wrote:
> > On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> > > On Sunday, February 20, 2011, Jiri Slaby wrote:
> > >> Hi,
> > >>
> > >> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> > >> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> > >> [10974.103073] tpm_tis 00:0c: Operation Timed out
> > >> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> > >> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> > >>
> > >> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> > >> returned from TPM) for testing.
> > > 
> > > Yes, this has been confirmed to cause suspend regressions to happen
> > 
> > OK, the revert works for me too... Are there any fixes?
> 
> No, and the author and maintainer have not been responding.  If that contiunes,
> I'll simply ask Linus to revert it.

BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems to be
completely broken:

@@ -577,9 +577,11 @@ duration:
        if (rc)
                return;
 
-       if (be32_to_cpu(tpm_cmd.header.out.return_code)
-           != 3 * sizeof(u32))
+       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
+           be32_to_cpu(tpm_cmd.header.out.length)
+           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 * sizeof(u32))
                return;
+
        duration_cap = &tpm_cmd.params.getcap_out.cap.duration;
        chip->vendor.duration[TPM_SHORT] =
            usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));

Namely, either the old code always returned as a result of the conditional
being removed, or the new code will always return as a result of
the (... != 0) check.  I wonder if there's supposed to be (... == 0) instead?
[And why not to simply use 4*sizeof(u32) FWIW?]

Anyway, it looks like a good revert candidate to me.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [stable] [REVERT request stable-2.6.36/37] Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:57       ` Jiri Slaby
  2011-02-20 16:50         ` [stable] " Greg KH
@ 2011-02-20 16:50         ` Greg KH
  1 sibling, 0 replies; 48+ messages in thread
From: Greg KH @ 2011-02-20 16:50 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rafael J. Wysocki, stefanb, srajiv, Linux kernel mailing list,
	debora, tpmdd-devel, linux-pm, stable

On Sun, Feb 20, 2011 at 11:57:16AM +0100, Jiri Slaby wrote:
> On 02/20/2011 11:51 AM, Rafael J. Wysocki wrote:
> > On Sunday, February 20, 2011, Jiri Slaby wrote:
> >> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> >>> On Sunday, February 20, 2011, Jiri Slaby wrote:
> >>>> Hi,
> >>>>
> >>>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> >>>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> >>>> [10974.103073] tpm_tis 00:0c: Operation Timed out
> >>>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> >>>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> >>>>
> >>>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> >>>> returned from TPM) for testing.
> >>>
> >>> Yes, this has been confirmed to cause suspend regressions to happen
> >>
> >> OK, the revert works for me too... Are there any fixes?
> > 
> > No, and the author and maintainer have not been responding.  If that contiunes,
> > I'll simply ask Linus to revert it.
> 
> Ok then, stable team, please revert that from 2.6.37 stable.

Ok, I'll do that for the next .37-stable release.

> I guess the same problem is in 2.6.36, so reverting it in 2.6.36 should
> be done too...

.36-stable is now end-of-life :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: [stable] [REVERT request stable-2.6.36/37] Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 10:57       ` Jiri Slaby
@ 2011-02-20 16:50         ` Greg KH
  2011-02-20 16:50         ` Greg KH
  1 sibling, 0 replies; 48+ messages in thread
From: Greg KH @ 2011-02-20 16:50 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, srajiv, Linux kernel mailing list, debora, tpmdd-devel,
	linux-pm, stable

On Sun, Feb 20, 2011 at 11:57:16AM +0100, Jiri Slaby wrote:
> On 02/20/2011 11:51 AM, Rafael J. Wysocki wrote:
> > On Sunday, February 20, 2011, Jiri Slaby wrote:
> >> On 02/20/2011 11:44 AM, Rafael J. Wysocki wrote:
> >>> On Sunday, February 20, 2011, Jiri Slaby wrote:
> >>>> Hi,
> >>>>
> >>>> I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
> >>>> [10974.074587] Suspending console(s) (use no_console_suspend to debug)
> >>>> [10974.103073] tpm_tis 00:0c: Operation Timed out
> >>>> [10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
> >>>> [10974.103095] PM: Device 00:0c failed to freeze: error -62
> >>>>
> >>>> 2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
> >>>> returned from TPM) for testing.
> >>>
> >>> Yes, this has been confirmed to cause suspend regressions to happen
> >>
> >> OK, the revert works for me too... Are there any fixes?
> > 
> > No, and the author and maintainer have not been responding.  If that contiunes,
> > I'll simply ask Linus to revert it.
> 
> Ok then, stable team, please revert that from 2.6.37 stable.

Ok, I'll do that for the next .37-stable release.

> I guess the same problem is in 2.6.36, so reverting it in 2.6.36 should
> be done too...

.36-stable is now end-of-life :(

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 11:48       ` Rafael J. Wysocki
  2011-02-21 15:30         ` Rajiv Andrade
@ 2011-02-21 15:30         ` Rajiv Andrade
  2011-02-21 16:34           ` Jiri Slaby
                             ` (3 more replies)
  1 sibling, 4 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 15:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, stefanb, linux-pm, stable, Linux kernel mailing list,
	debora, Linus Torvalds, preining

On 02/20/2011 08:48 AM, Rafael J. Wysocki wrote:

> On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
>> No, and the author and maintainer have not been responding.  If that contiunes,I'll simply ask Linus to revert it.
Sorry, but you sent the email this Friday, I didn't catch it in time and I wasn't working during the weekend.

> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems to be
> completely broken:
>
> @@ -577,9 +577,11 @@ duration:
>          if (rc)
>                  return;
>
> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
> -           != 3 * sizeof(u32))
> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
> +           be32_to_cpu(tpm_cmd.header.out.length)
> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 * sizeof(u32))
>                  return;
> +
>          duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>          chip->vendor.duration[TPM_SHORT] =
>              usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>
> Namely, either the old code always returned as a result of the conditional
> being removed, or the new code will always return as a result of
> the (... != 0) check.  I wonder if there's supposed to be (... == 0) instead?
The previous code was checking the wrong field of the TPM returned buffer, probably
due an old commit that incorporated the tpm_cmd strucuture, it should check if the return code
is != 0, which if true, means that the command didn't succeed. The output length check should be
just a sanity check, so indeed the logical operator should be&&  instead. Although it should also be
fixed, I don't think this is the cause, since in case the timeout retrieval from the TPM fail,
the device driver falls back to default values, which has been working before this commit.

> [And why not to simply use 4*sizeof(u32) FWIW?]
I can't see why, I'll update it.

The failure for this specific board then sounds to be due the TPM returning inconsistent timeout values.
Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?

Rajiv



^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-20 11:48       ` Rafael J. Wysocki
@ 2011-02-21 15:30         ` Rajiv Andrade
  2011-02-21 15:30         ` Rajiv Andrade
  1 sibling, 0 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 15:30 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jiri Slaby, stefanb, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/20/2011 08:48 AM, Rafael J. Wysocki wrote:

> On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
>> No, and the author and maintainer have not been responding.  If that contiunes,I'll simply ask Linus to revert it.
Sorry, but you sent the email this Friday, I didn't catch it in time and I wasn't working during the weekend.

> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems to be
> completely broken:
>
> @@ -577,9 +577,11 @@ duration:
>          if (rc)
>                  return;
>
> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
> -           != 3 * sizeof(u32))
> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
> +           be32_to_cpu(tpm_cmd.header.out.length)
> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 * sizeof(u32))
>                  return;
> +
>          duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>          chip->vendor.duration[TPM_SHORT] =
>              usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>
> Namely, either the old code always returned as a result of the conditional
> being removed, or the new code will always return as a result of
> the (... != 0) check.  I wonder if there's supposed to be (... == 0) instead?
The previous code was checking the wrong field of the TPM returned buffer, probably
due an old commit that incorporated the tpm_cmd strucuture, it should check if the return code
is != 0, which if true, means that the command didn't succeed. The output length check should be
just a sanity check, so indeed the logical operator should be&&  instead. Although it should also be
fixed, I don't think this is the cause, since in case the timeout retrieval from the TPM fail,
the device driver falls back to default values, which has been working before this commit.

> [And why not to simply use 4*sizeof(u32) FWIW?]
I can't see why, I'll update it.

The failure for this specific board then sounds to be due the TPM returning inconsistent timeout values.
Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?

Rajiv

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 15:30         ` Rajiv Andrade
@ 2011-02-21 16:34           ` Jiri Slaby
  2011-02-21 16:57             ` Linus Torvalds
                               ` (3 more replies)
  2011-02-21 16:34           ` Jiri Slaby
                             ` (2 subsequent siblings)
  3 siblings, 4 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 16:34 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

Linus, please stop garbling my name. I'm not Sladby (2nd time you
committed that) and not even Juri.

On 02/21/2011 04:30 PM, Rajiv Andrade wrote:
> On 02/20/2011 08:48 AM, Rafael J. Wysocki wrote:
> 
>> On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
>>> No, and the author and maintainer have not been responding.  If that
>>> contiunes,I'll simply ask Linus to revert it.
> Sorry, but you sent the email this Friday, I didn't catch it in time and
> I wasn't working during the weekend.
> 
>> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems
>> to be
>> completely broken:
>> @@ -577,9 +577,11 @@ duration:
>>          if (rc)
>>                  return;
>>
>> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
>> -           != 3 * sizeof(u32))
>> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>> +           be32_to_cpu(tpm_cmd.header.out.length)
>> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 *
>> sizeof(u32))
>>                  return;
>> +
>>          duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>>          chip->vendor.duration[TPM_SHORT] =
>>              usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>>
>> Namely, either the old code always returned as a result of the
>> conditional
>> being removed, or the new code will always return as a result of
>> the (... != 0) check.  I wonder if there's supposed to be (... == 0)
>> instead?
> The previous code was checking the wrong field of the TPM returned
> buffer, probably
> due an old commit that incorporated the tpm_cmd strucuture, it should
> check if the return code
> is != 0, which if true, means that the command didn't succeed. The
> output length check should be
> just a sanity check, so indeed the logical operator should be&& 
> instead.

No it should not. You want 'if (wrong_retcode OR wrong_len) die;'. IOW I
don't see what exactly is wrong on the 'if'. I think it's correct.
(Given the old test was incorrect.)

There has to be another problem which caused my regression. And since it
reports "Operation Timed out", the former default timeout values worked
for me, the ones read from TPM do not.

I'm not at that machine now, what are the usual timeouts in usecs? The
use of conversed jiffies seem bogus. If the usecs are so low (or HZ so
high on some configs) that the conversion returns 1 jiffie,
wait_event_interruptible_timeout in wait_for_stat will return 0 when:
* 1 jiffie passes without change in status (proper timeout)
* status changed, but also the timer ticked once meanwhile, i.e. we
scheduled a moment before timer tick

But this is only a theory so far. What about this (wrapped, just dropped
by mouse), I may try it when I'm back to the machine?
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -201,7 +201,7 @@ static int wait_for_stat(struct tpm_chip *chip, u8
mask, unsigned long timeout,
                                                      ((tpm_tis_status
                                                        (chip) & mask) ==
                                                       mask), timeout);
-               if (rc > 0)
+               if (rc > 0 || (tpm_tis_status(chip) & mask) == mask)
                        return 0;
        } else {
                stop = jiffies + timeout;

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 15:30         ` Rajiv Andrade
  2011-02-21 16:34           ` Jiri Slaby
@ 2011-02-21 16:34           ` Jiri Slaby
  2011-02-22  5:39           ` Norbert Preining
  2011-02-22  5:39           ` Norbert Preining
  3 siblings, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 16:34 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: stefanb, Linux kernel mailing list, debora, preining, linux-pm,
	Linus Torvalds, stable

Linus, please stop garbling my name. I'm not Sladby (2nd time you
committed that) and not even Juri.

On 02/21/2011 04:30 PM, Rajiv Andrade wrote:
> On 02/20/2011 08:48 AM, Rafael J. Wysocki wrote:
> 
>> On Sunday, February 20, 2011, Rafael J. Wysocki wrote:
>>> No, and the author and maintainer have not been responding.  If that
>>> contiunes,I'll simply ask Linus to revert it.
> Sorry, but you sent the email this Friday, I didn't catch it in time and
> I wasn't working during the weekend.
> 
>> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems
>> to be
>> completely broken:
>> @@ -577,9 +577,11 @@ duration:
>>          if (rc)
>>                  return;
>>
>> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
>> -           != 3 * sizeof(u32))
>> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>> +           be32_to_cpu(tpm_cmd.header.out.length)
>> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 *
>> sizeof(u32))
>>                  return;
>> +
>>          duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>>          chip->vendor.duration[TPM_SHORT] =
>>              usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>>
>> Namely, either the old code always returned as a result of the
>> conditional
>> being removed, or the new code will always return as a result of
>> the (... != 0) check.  I wonder if there's supposed to be (... == 0)
>> instead?
> The previous code was checking the wrong field of the TPM returned
> buffer, probably
> due an old commit that incorporated the tpm_cmd strucuture, it should
> check if the return code
> is != 0, which if true, means that the command didn't succeed. The
> output length check should be
> just a sanity check, so indeed the logical operator should be&& 
> instead.

No it should not. You want 'if (wrong_retcode OR wrong_len) die;'. IOW I
don't see what exactly is wrong on the 'if'. I think it's correct.
(Given the old test was incorrect.)

There has to be another problem which caused my regression. And since it
reports "Operation Timed out", the former default timeout values worked
for me, the ones read from TPM do not.

I'm not at that machine now, what are the usual timeouts in usecs? The
use of conversed jiffies seem bogus. If the usecs are so low (or HZ so
high on some configs) that the conversion returns 1 jiffie,
wait_event_interruptible_timeout in wait_for_stat will return 0 when:
* 1 jiffie passes without change in status (proper timeout)
* status changed, but also the timer ticked once meanwhile, i.e. we
scheduled a moment before timer tick

But this is only a theory so far. What about this (wrapped, just dropped
by mouse), I may try it when I'm back to the machine?
--- a/drivers/char/tpm/tpm_tis.c
+++ b/drivers/char/tpm/tpm_tis.c
@@ -201,7 +201,7 @@ static int wait_for_stat(struct tpm_chip *chip, u8
mask, unsigned long timeout,
                                                      ((tpm_tis_status
                                                        (chip) & mask) ==
                                                       mask), timeout);
-               if (rc > 0)
+               if (rc > 0 || (tpm_tis_status(chip) & mask) == mask)
                        return 0;
        } else {
                stop = jiffies + timeout;

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 16:34           ` Jiri Slaby
@ 2011-02-21 16:57             ` Linus Torvalds
  2011-02-21 16:57             ` Linus Torvalds
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 48+ messages in thread
From: Linus Torvalds @ 2011-02-21 16:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, preining

2011/2/21 Jiri Slaby <jirislaby@gmail.com>:
> Linus, please stop garbling my name. I'm not Sladby (2nd time you
> committed that) and not even Juri.

I think the Juri was just my fingers being off. The Sladby was real,
though. I don't know why I think your name has a 'd' in it. Sorry.

I tend to try to copy-paste complicated names to avoid typo's, but
yours is short enough that I think I can write it. I obviously can't.

                    Linus

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 16:34           ` Jiri Slaby
  2011-02-21 16:57             ` Linus Torvalds
@ 2011-02-21 16:57             ` Linus Torvalds
  2011-02-21 17:12             ` Rajiv Andrade
  2011-02-21 17:12             ` Rajiv Andrade
  3 siblings, 0 replies; 48+ messages in thread
From: Linus Torvalds @ 2011-02-21 16:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, Rajiv Andrade, Linux kernel mailing list, debora,
	preining, linux-pm, stable

2011/2/21 Jiri Slaby <jirislaby@gmail.com>:
> Linus, please stop garbling my name. I'm not Sladby (2nd time you
> committed that) and not even Juri.

I think the Juri was just my fingers being off. The Sladby was real,
though. I don't know why I think your name has a 'd' in it. Sorry.

I tend to try to copy-paste complicated names to avoid typo's, but
yours is short enough that I think I can write it. I obviously can't.

                    Linus

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 16:34           ` Jiri Slaby
                               ` (2 preceding siblings ...)
  2011-02-21 17:12             ` Rajiv Andrade
@ 2011-02-21 17:12             ` Rajiv Andrade
  2011-02-21 20:39               ` Jiri Slaby
  2011-02-21 20:39               ` Jiri Slaby
  3 siblings, 2 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 17:12 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 01:34 PM, Jiri Slaby wrote:

>>> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems
>>> to be
>>> completely broken:
>>> @@ -577,9 +577,11 @@ duration:
>>>           if (rc)
>>>                   return;
>>>
>>> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
>>> -           != 3 * sizeof(u32))
>>> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>>> +           be32_to_cpu(tpm_cmd.header.out.length)
>>> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 *
>>> sizeof(u32))
>>>                   return;
>>> +
>>>           duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>>>           chip->vendor.duration[TPM_SHORT] =
>>>               usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>>>
>>> Namely, either the old code always returned as a result of the
>>> conditional
>>> being removed, or the new code will always return as a result of
>>> the (... != 0) check.  I wonder if there's supposed to be (... == 0)
>>> instead?
>> The previous code was checking the wrong field of the TPM returned
>> buffer, probably
>> due an old commit that incorporated the tpm_cmd strucuture, it should
>> check if the return code
>> is != 0, which if true, means that the command didn't succeed. The
>> output length check should be
>> just a sanity check, so indeed the logical operator should be&&
>> instead.
> No it should not. You want 'if (wrong_retcode OR wrong_len) die;'. IOW I
> don't see what exactly is wrong on the 'if'. I think it's correct.
> (Given the old test was incorrect.)
Exactly, I read it too fast and inverted the logic while writing the response, missing De Morgan's.

> There has to be another problem which caused my regression. And since it
> reports "Operation Timed out", the former default timeout values worked
> for me, the ones read from TPM do not.
Yes, it's highly due inconsistent timeout values reported by the TPM as I mentioned, my working timeouts are:
3020000 4510000 181000000

> I'm not at that machine now, what are the usual timeouts in usecs? The
> use of conversed jiffies seem bogus. If the usecs are so low (or HZ so
> high on some configs) that the conversion returns 1 jiffie,
> wait_event_interruptible_timeout in wait_for_stat will return 0 when:
> * 1 jiffie passes without change in status (proper timeout)
> * status changed, but also the timer ticked once meanwhile, i.e. we
> scheduled a moment before timer tick
>
> But this is only a theory so far. What about this (wrapped, just dropped
> by mouse), I may try it when I'm back to the machine?
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -201,7 +201,7 @@ static int wait_for_stat(struct tpm_chip *chip, u8
> mask, unsigned long timeout,
>                                                        ((tpm_tis_status
>                                                          (chip)&  mask) ==
>                                                         mask), timeout);
> -               if (rc>  0)
> +               if (rc>  0 || (tpm_tis_status(chip)&  mask) == mask)
>                          return 0;
>          } else {
>                  stop = jiffies + timeout;
>
This will work for small timeout values discrepancies, but not if, for example,
the TPM is returning the values in msecs instead of usecs, which should affect the
result when issuing commands that last longer. That's why I'd like to see
the failing timeout values first.

Rajiv


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 16:34           ` Jiri Slaby
  2011-02-21 16:57             ` Linus Torvalds
  2011-02-21 16:57             ` Linus Torvalds
@ 2011-02-21 17:12             ` Rajiv Andrade
  2011-02-21 17:12             ` Rajiv Andrade
  3 siblings, 0 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 17:12 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, Linux kernel mailing list, debora, preining, linux-pm,
	Linus Torvalds, stable

On 02/21/2011 01:34 PM, Jiri Slaby wrote:

>>> BTW, the first hunk from that commit in drivers/char/tpm/tpm.c seems
>>> to be
>>> completely broken:
>>> @@ -577,9 +577,11 @@ duration:
>>>           if (rc)
>>>                   return;
>>>
>>> -       if (be32_to_cpu(tpm_cmd.header.out.return_code)
>>> -           != 3 * sizeof(u32))
>>> +       if (be32_to_cpu(tpm_cmd.header.out.return_code) != 0 ||
>>> +           be32_to_cpu(tpm_cmd.header.out.length)
>>> +           != sizeof(tpm_cmd.header.out) + sizeof(u32) + 3 *
>>> sizeof(u32))
>>>                   return;
>>> +
>>>           duration_cap =&tpm_cmd.params.getcap_out.cap.duration;
>>>           chip->vendor.duration[TPM_SHORT] =
>>>               usecs_to_jiffies(be32_to_cpu(duration_cap->tpm_short));
>>>
>>> Namely, either the old code always returned as a result of the
>>> conditional
>>> being removed, or the new code will always return as a result of
>>> the (... != 0) check.  I wonder if there's supposed to be (... == 0)
>>> instead?
>> The previous code was checking the wrong field of the TPM returned
>> buffer, probably
>> due an old commit that incorporated the tpm_cmd strucuture, it should
>> check if the return code
>> is != 0, which if true, means that the command didn't succeed. The
>> output length check should be
>> just a sanity check, so indeed the logical operator should be&&
>> instead.
> No it should not. You want 'if (wrong_retcode OR wrong_len) die;'. IOW I
> don't see what exactly is wrong on the 'if'. I think it's correct.
> (Given the old test was incorrect.)
Exactly, I read it too fast and inverted the logic while writing the response, missing De Morgan's.

> There has to be another problem which caused my regression. And since it
> reports "Operation Timed out", the former default timeout values worked
> for me, the ones read from TPM do not.
Yes, it's highly due inconsistent timeout values reported by the TPM as I mentioned, my working timeouts are:
3020000 4510000 181000000

> I'm not at that machine now, what are the usual timeouts in usecs? The
> use of conversed jiffies seem bogus. If the usecs are so low (or HZ so
> high on some configs) that the conversion returns 1 jiffie,
> wait_event_interruptible_timeout in wait_for_stat will return 0 when:
> * 1 jiffie passes without change in status (proper timeout)
> * status changed, but also the timer ticked once meanwhile, i.e. we
> scheduled a moment before timer tick
>
> But this is only a theory so far. What about this (wrapped, just dropped
> by mouse), I may try it when I'm back to the machine?
> --- a/drivers/char/tpm/tpm_tis.c
> +++ b/drivers/char/tpm/tpm_tis.c
> @@ -201,7 +201,7 @@ static int wait_for_stat(struct tpm_chip *chip, u8
> mask, unsigned long timeout,
>                                                        ((tpm_tis_status
>                                                          (chip)&  mask) ==
>                                                         mask), timeout);
> -               if (rc>  0)
> +               if (rc>  0 || (tpm_tis_status(chip)&  mask) == mask)
>                          return 0;
>          } else {
>                  stop = jiffies + timeout;
>
This will work for small timeout values discrepancies, but not if, for example,
the TPM is returning the values in msecs instead of usecs, which should affect the
result when issuing commands that last longer. That's why I'd like to see
the failing timeout values first.

Rajiv

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 17:12             ` Rajiv Andrade
@ 2011-02-21 20:39               ` Jiri Slaby
  2011-02-21 21:29                 ` Stefan Berger
  2011-02-21 21:29                 ` Stefan Berger
  2011-02-21 20:39               ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 20:39 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>> There has to be another problem which caused my regression. And since it
>> reports "Operation Timed out", the former default timeout values worked
>> for me, the ones read from TPM do not.
> Yes, it's highly due inconsistent timeout values reported by the TPM as
> I mentioned, my working timeouts are:
> 3020000 4510000 181000000

1000000 2000 150000

Actually the first one from HW is 1. This is one is HZ after correction
in get_timeout. So perhaps it is in ms, yes.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 17:12             ` Rajiv Andrade
  2011-02-21 20:39               ` Jiri Slaby
@ 2011-02-21 20:39               ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 20:39 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: stefanb, Linux kernel mailing list, debora, preining, linux-pm,
	Linus Torvalds, stable

On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>> There has to be another problem which caused my regression. And since it
>> reports "Operation Timed out", the former default timeout values worked
>> for me, the ones read from TPM do not.
> Yes, it's highly due inconsistent timeout values reported by the TPM as
> I mentioned, my working timeouts are:
> 3020000 4510000 181000000

1000000 2000 150000

Actually the first one from HW is 1. This is one is HZ after correction
in get_timeout. So perhaps it is in ms, yes.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 20:39               ` Jiri Slaby
@ 2011-02-21 21:29                 ` Stefan Berger
  2011-02-21 21:44                   ` Jiri Slaby
  2011-02-21 21:44                   ` Jiri Slaby
  2011-02-21 21:29                 ` Stefan Berger
  1 sibling, 2 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-21 21:29 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 03:39 PM, Jiri Slaby wrote:
> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>> There has to be another problem which caused my regression. And since it
>>> reports "Operation Timed out", the former default timeout values worked
>>> for me, the ones read from TPM do not.
>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>> I mentioned, my working timeouts are:
>> 3020000 4510000 181000000
> 1000000 2000 150000
>
> Actually the first one from HW is 1. This is one is HZ after correction
> in get_timeout. So perhaps it is in ms, yes.

Following the specs, the timeouts are supposed to be in microseconds and 
ascending order for short, medium and long duration. Of course, if the 
device returns wrong timeouts, the command isn't going to succeed, 
failing the suspend in this case. Nevertheless, I think we need the 
patch I put in but at the same time we'll need a work-around for devices 
like this. There was one person stating that this patch fixed are 
problem on his machine. Can you find the 'caps' entry in /sys 
(/sys/devices/pnp0/00:06) and let us know its content?

Thanks and regards,
     Stefan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 20:39               ` Jiri Slaby
  2011-02-21 21:29                 ` Stefan Berger
@ 2011-02-21 21:29                 ` Stefan Berger
  1 sibling, 0 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-21 21:29 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/21/2011 03:39 PM, Jiri Slaby wrote:
> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>> There has to be another problem which caused my regression. And since it
>>> reports "Operation Timed out", the former default timeout values worked
>>> for me, the ones read from TPM do not.
>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>> I mentioned, my working timeouts are:
>> 3020000 4510000 181000000
> 1000000 2000 150000
>
> Actually the first one from HW is 1. This is one is HZ after correction
> in get_timeout. So perhaps it is in ms, yes.

Following the specs, the timeouts are supposed to be in microseconds and 
ascending order for short, medium and long duration. Of course, if the 
device returns wrong timeouts, the command isn't going to succeed, 
failing the suspend in this case. Nevertheless, I think we need the 
patch I put in but at the same time we'll need a work-around for devices 
like this. There was one person stating that this patch fixed are 
problem on his machine. Can you find the 'caps' entry in /sys 
(/sys/devices/pnp0/00:06) and let us know its content?

Thanks and regards,
     Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 21:29                 ` Stefan Berger
@ 2011-02-21 21:44                   ` Jiri Slaby
  2011-02-21 22:07                     ` Rajiv Andrade
  2011-02-21 22:07                     ` Rajiv Andrade
  2011-02-21 21:44                   ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 21:44 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 10:29 PM, Stefan Berger wrote:
> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>> There has to be another problem which caused my regression. And
>>>> since it
>>>> reports "Operation Timed out", the former default timeout values worked
>>>> for me, the ones read from TPM do not.
>>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>>> I mentioned, my working timeouts are:
>>> 3020000 4510000 181000000
>> 1000000 2000 150000
>>
>> Actually the first one from HW is 1. This is one is HZ after correction
>> in get_timeout. So perhaps it is in ms, yes.
> 
> Following the specs, the timeouts are supposed to be in microseconds and
> ascending order for short, medium and long duration. Of course, if the
> device returns wrong timeouts, the command isn't going to succeed,
> failing the suspend in this case. Nevertheless, I think we need the
> patch I put in but at the same time we'll need a work-around for devices
> like this.

Yes, the patch is correct per se. But as it breaks bunch of machines it
cannot go in now. The rule is no regressions.

After you have the workaround it should go into the next rc1 after that.
Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
dmidecode output? Or are you going to base it solely on TPM
manufacturer/version?

> There was one person stating that this patch fixed are
> problem on his machine. 

Yes, I can see that. But we have to live with our mistakes.

> Can you find the 'caps' entry in /sys
> (/sys/devices/pnp0/00:06) and let us know its content?

TPM is at 00:0c:
Manufacturer: 0x49465800
TCG version: 1.2
Firmware version: 1.0

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 21:29                 ` Stefan Berger
  2011-02-21 21:44                   ` Jiri Slaby
@ 2011-02-21 21:44                   ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 21:44 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/21/2011 10:29 PM, Stefan Berger wrote:
> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>> There has to be another problem which caused my regression. And
>>>> since it
>>>> reports "Operation Timed out", the former default timeout values worked
>>>> for me, the ones read from TPM do not.
>>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>>> I mentioned, my working timeouts are:
>>> 3020000 4510000 181000000
>> 1000000 2000 150000
>>
>> Actually the first one from HW is 1. This is one is HZ after correction
>> in get_timeout. So perhaps it is in ms, yes.
> 
> Following the specs, the timeouts are supposed to be in microseconds and
> ascending order for short, medium and long duration. Of course, if the
> device returns wrong timeouts, the command isn't going to succeed,
> failing the suspend in this case. Nevertheless, I think we need the
> patch I put in but at the same time we'll need a work-around for devices
> like this.

Yes, the patch is correct per se. But as it breaks bunch of machines it
cannot go in now. The rule is no regressions.

After you have the workaround it should go into the next rc1 after that.
Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
dmidecode output? Or are you going to base it solely on TPM
manufacturer/version?

> There was one person stating that this patch fixed are
> problem on his machine. 

Yes, I can see that. But we have to live with our mistakes.

> Can you find the 'caps' entry in /sys
> (/sys/devices/pnp0/00:06) and let us know its content?

TPM is at 00:0c:
Manufacturer: 0x49465800
TCG version: 1.2
Firmware version: 1.0

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 21:44                   ` Jiri Slaby
  2011-02-21 22:07                     ` Rajiv Andrade
@ 2011-02-21 22:07                     ` Rajiv Andrade
  2011-02-21 22:10                       ` Jiri Slaby
  2011-02-21 22:10                       ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 22:07 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Stefan Berger, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 06:44 PM, Jiri Slaby wrote:
> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>> There has to be another problem which caused my regression. And
>>>>> since it
>>>>> reports "Operation Timed out", the former default timeout values worked
>>>>> for me, the ones read from TPM do not.
>>>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>>>> I mentioned, my working timeouts are:
>>>> 3020000 4510000 181000000
>>> 1000000 2000 150000
>>>
>>> Actually the first one from HW is 1. This is one is HZ after correction
>>> in get_timeout. So perhaps it is in ms, yes.
>> Following the specs, the timeouts are supposed to be in microseconds and
>> ascending order for short, medium and long duration. Of course, if the
>> device returns wrong timeouts, the command isn't going to succeed,
>> failing the suspend in this case. Nevertheless, I think we need the
>> patch I put in but at the same time we'll need a work-around for devices
>> like this.
> Yes, the patch is correct per se. But as it breaks bunch of machines it
> cannot go in now. The rule is no regressions.
>
> After you have the workaround it should go into the next rc1 after that.
> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
> dmidecode output? Or are you going to base it solely on TPM
> manufacturer/version
It's more reliable to base the workaround on the values themselves, instead of the TPM's ID, since
we don't know whether other models will behave similarly.

It should be fine then to extend the existing workaround for short timeouts to the medium and long ones.

Rajiv


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 21:44                   ` Jiri Slaby
@ 2011-02-21 22:07                     ` Rajiv Andrade
  2011-02-21 22:07                     ` Rajiv Andrade
  1 sibling, 0 replies; 48+ messages in thread
From: Rajiv Andrade @ 2011-02-21 22:07 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Stefan Berger, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/21/2011 06:44 PM, Jiri Slaby wrote:
> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>> There has to be another problem which caused my regression. And
>>>>> since it
>>>>> reports "Operation Timed out", the former default timeout values worked
>>>>> for me, the ones read from TPM do not.
>>>> Yes, it's highly due inconsistent timeout values reported by the TPM as
>>>> I mentioned, my working timeouts are:
>>>> 3020000 4510000 181000000
>>> 1000000 2000 150000
>>>
>>> Actually the first one from HW is 1. This is one is HZ after correction
>>> in get_timeout. So perhaps it is in ms, yes.
>> Following the specs, the timeouts are supposed to be in microseconds and
>> ascending order for short, medium and long duration. Of course, if the
>> device returns wrong timeouts, the command isn't going to succeed,
>> failing the suspend in this case. Nevertheless, I think we need the
>> patch I put in but at the same time we'll need a work-around for devices
>> like this.
> Yes, the patch is correct per se. But as it breaks bunch of machines it
> cannot go in now. The rule is no regressions.
>
> After you have the workaround it should go into the next rc1 after that.
> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
> dmidecode output? Or are you going to base it solely on TPM
> manufacturer/version
It's more reliable to base the workaround on the values themselves, instead of the TPM's ID, since
we don't know whether other models will behave similarly.

It should be fine then to extend the existing workaround for short timeouts to the medium and long ones.

Rajiv

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:07                     ` Rajiv Andrade
@ 2011-02-21 22:10                       ` Jiri Slaby
  2011-02-21 22:17                         ` Rafael J. Wysocki
                                           ` (3 more replies)
  2011-02-21 22:10                       ` Jiri Slaby
  1 sibling, 4 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 22:10 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Stefan Berger, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>> There has to be another problem which caused my regression. And
>>>>>> since it
>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>> worked
>>>>>> for me, the ones read from TPM do not.
>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>> TPM as
>>>>> I mentioned, my working timeouts are:
>>>>> 3020000 4510000 181000000
>>>> 1000000 2000 150000
>>>>
>>>> Actually the first one from HW is 1. This is one is HZ after correction
>>>> in get_timeout. So perhaps it is in ms, yes.
>>> Following the specs, the timeouts are supposed to be in microseconds and
>>> ascending order for short, medium and long duration. Of course, if the
>>> device returns wrong timeouts, the command isn't going to succeed,
>>> failing the suspend in this case. Nevertheless, I think we need the
>>> patch I put in but at the same time we'll need a work-around for devices
>>> like this.
>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>> cannot go in now. The rule is no regressions.
>>
>> After you have the workaround it should go into the next rc1 after that.
>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>> dmidecode output? Or are you going to base it solely on TPM
>> manufacturer/version
> It's more reliable to base the workaround on the values themselves,
> instead of the TPM's ID, since
> we don't know whether other models will behave similarly.

As I wrote, you may base it on dmi data.

> It should be fine then to extend the existing workaround for short
> timeouts to the medium and long ones.

OK, but how will you guess the values?

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:07                     ` Rajiv Andrade
  2011-02-21 22:10                       ` Jiri Slaby
@ 2011-02-21 22:10                       ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-21 22:10 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Stefan Berger, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>> There has to be another problem which caused my regression. And
>>>>>> since it
>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>> worked
>>>>>> for me, the ones read from TPM do not.
>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>> TPM as
>>>>> I mentioned, my working timeouts are:
>>>>> 3020000 4510000 181000000
>>>> 1000000 2000 150000
>>>>
>>>> Actually the first one from HW is 1. This is one is HZ after correction
>>>> in get_timeout. So perhaps it is in ms, yes.
>>> Following the specs, the timeouts are supposed to be in microseconds and
>>> ascending order for short, medium and long duration. Of course, if the
>>> device returns wrong timeouts, the command isn't going to succeed,
>>> failing the suspend in this case. Nevertheless, I think we need the
>>> patch I put in but at the same time we'll need a work-around for devices
>>> like this.
>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>> cannot go in now. The rule is no regressions.
>>
>> After you have the workaround it should go into the next rc1 after that.
>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>> dmidecode output? Or are you going to base it solely on TPM
>> manufacturer/version
> It's more reliable to base the workaround on the values themselves,
> instead of the TPM's ID, since
> we don't know whether other models will behave similarly.

As I wrote, you may base it on dmi data.

> It should be fine then to extend the existing workaround for short
> timeouts to the medium and long ones.

OK, but how will you guess the values?

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:10                       ` Jiri Slaby
  2011-02-21 22:17                         ` Rafael J. Wysocki
@ 2011-02-21 22:17                         ` Rafael J. Wysocki
  2011-02-22  0:42                         ` Stefan Berger
  2011-02-22  0:42                         ` Stefan Berger
  3 siblings, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-21 22:17 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Stefan Berger, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On Monday, February 21, 2011, Jiri Slaby wrote:
> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
> > On 02/21/2011 06:44 PM, Jiri Slaby wrote:
> >> On 02/21/2011 10:29 PM, Stefan Berger wrote:
> >>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
> >>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
> >>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
> >>>>>> There has to be another problem which caused my regression. And
> >>>>>> since it
> >>>>>> reports "Operation Timed out", the former default timeout values
> >>>>>> worked
> >>>>>> for me, the ones read from TPM do not.
> >>>>> Yes, it's highly due inconsistent timeout values reported by the
> >>>>> TPM as
> >>>>> I mentioned, my working timeouts are:
> >>>>> 3020000 4510000 181000000
> >>>> 1000000 2000 150000
> >>>>
> >>>> Actually the first one from HW is 1. This is one is HZ after correction
> >>>> in get_timeout. So perhaps it is in ms, yes.
> >>> Following the specs, the timeouts are supposed to be in microseconds and
> >>> ascending order for short, medium and long duration. Of course, if the
> >>> device returns wrong timeouts, the command isn't going to succeed,
> >>> failing the suspend in this case. Nevertheless, I think we need the
> >>> patch I put in but at the same time we'll need a work-around for devices
> >>> like this.
> >> Yes, the patch is correct per se. But as it breaks bunch of machines it
> >> cannot go in now. The rule is no regressions.
> >>
> >> After you have the workaround it should go into the next rc1 after that.
> >> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
> >> dmidecode output? Or are you going to base it solely on TPM
> >> manufacturer/version
> > It's more reliable to base the workaround on the values themselves,
> > instead of the TPM's ID, since
> > we don't know whether other models will behave similarly.
> 
> As I wrote, you may base it on dmi data.

In which case this report will have to be taken into account too:
http://marc.info/?l=linux-acpi&m=129796038509311&w=4

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:10                       ` Jiri Slaby
@ 2011-02-21 22:17                         ` Rafael J. Wysocki
  2011-02-21 22:17                         ` Rafael J. Wysocki
                                           ` (2 subsequent siblings)
  3 siblings, 0 replies; 48+ messages in thread
From: Rafael J. Wysocki @ 2011-02-21 22:17 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Stefan Berger, Rajiv Andrade, Linux kernel mailing list, debora,
	preining, linux-pm, Linus Torvalds, stable

On Monday, February 21, 2011, Jiri Slaby wrote:
> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
> > On 02/21/2011 06:44 PM, Jiri Slaby wrote:
> >> On 02/21/2011 10:29 PM, Stefan Berger wrote:
> >>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
> >>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
> >>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
> >>>>>> There has to be another problem which caused my regression. And
> >>>>>> since it
> >>>>>> reports "Operation Timed out", the former default timeout values
> >>>>>> worked
> >>>>>> for me, the ones read from TPM do not.
> >>>>> Yes, it's highly due inconsistent timeout values reported by the
> >>>>> TPM as
> >>>>> I mentioned, my working timeouts are:
> >>>>> 3020000 4510000 181000000
> >>>> 1000000 2000 150000
> >>>>
> >>>> Actually the first one from HW is 1. This is one is HZ after correction
> >>>> in get_timeout. So perhaps it is in ms, yes.
> >>> Following the specs, the timeouts are supposed to be in microseconds and
> >>> ascending order for short, medium and long duration. Of course, if the
> >>> device returns wrong timeouts, the command isn't going to succeed,
> >>> failing the suspend in this case. Nevertheless, I think we need the
> >>> patch I put in but at the same time we'll need a work-around for devices
> >>> like this.
> >> Yes, the patch is correct per se. But as it breaks bunch of machines it
> >> cannot go in now. The rule is no regressions.
> >>
> >> After you have the workaround it should go into the next rc1 after that.
> >> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
> >> dmidecode output? Or are you going to base it solely on TPM
> >> manufacturer/version
> > It's more reliable to base the workaround on the values themselves,
> > instead of the TPM's ID, since
> > we don't know whether other models will behave similarly.
> 
> As I wrote, you may base it on dmi data.

In which case this report will have to be taken into account too:
http://marc.info/?l=linux-acpi&m=129796038509311&w=4

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:10                       ` Jiri Slaby
  2011-02-21 22:17                         ` Rafael J. Wysocki
  2011-02-21 22:17                         ` Rafael J. Wysocki
@ 2011-02-22  0:42                         ` Stefan Berger
  2011-02-22  8:41                           ` Jiri Slaby
  2011-02-22  8:41                           ` Jiri Slaby
  2011-02-22  0:42                         ` Stefan Berger
  3 siblings, 2 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-22  0:42 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/21/2011 05:10 PM, Jiri Slaby wrote:
> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>> There has to be another problem which caused my regression. And
>>>>>>> since it
>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>> worked
>>>>>>> for me, the ones read from TPM do not.
>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>> TPM as
>>>>>> I mentioned, my working timeouts are:
>>>>>> 3020000 4510000 181000000
>>>>> 1000000 2000 150000
>>>>>
>>>>> Actually the first one from HW is 1. This is one is HZ after correction
>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>> Following the specs, the timeouts are supposed to be in microseconds and
>>>> ascending order for short, medium and long duration. Of course, if the
>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>> patch I put in but at the same time we'll need a work-around for devices
>>>> like this.
>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>> cannot go in now. The rule is no regressions.
>>>
>>> After you have the workaround it should go into the next rc1 after that.
>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>> dmidecode output? Or are you going to base it solely on TPM
>>> manufacturer/version
>> It's more reliable to base the workaround on the values themselves,
>> instead of the TPM's ID, since
>> we don't know whether other models will behave similarly.
> As I wrote, you may base it on dmi data.
>
>> It should be fine then to extend the existing workaround for short
>> timeouts to the medium and long ones.
> OK, but how will you guess the values?
One way of doing it would be to at least make sure that the timeouts are

short < medium < long

and if that's not true, as in the case of your TPM, set the timeouts to 
0 and have Rajiv's work-around kick in  OR we assign the same high 
values to the timeouts explicily that Rajiv's work-around is using right 
now. Of course there could be another type of bad TPM firmware out there 
where all values are in ascending order but given in ms and cause 
time-outs -- but I would wait for someone to point that out since I am 
not aware of such a device.

Using the manufacturer, firmware version etc. to distinguish would 
probably open a can of worms...

    Stefan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 22:10                       ` Jiri Slaby
                                           ` (2 preceding siblings ...)
  2011-02-22  0:42                         ` Stefan Berger
@ 2011-02-22  0:42                         ` Stefan Berger
  3 siblings, 0 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-22  0:42 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/21/2011 05:10 PM, Jiri Slaby wrote:
> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>> There has to be another problem which caused my regression. And
>>>>>>> since it
>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>> worked
>>>>>>> for me, the ones read from TPM do not.
>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>> TPM as
>>>>>> I mentioned, my working timeouts are:
>>>>>> 3020000 4510000 181000000
>>>>> 1000000 2000 150000
>>>>>
>>>>> Actually the first one from HW is 1. This is one is HZ after correction
>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>> Following the specs, the timeouts are supposed to be in microseconds and
>>>> ascending order for short, medium and long duration. Of course, if the
>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>> patch I put in but at the same time we'll need a work-around for devices
>>>> like this.
>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>> cannot go in now. The rule is no regressions.
>>>
>>> After you have the workaround it should go into the next rc1 after that.
>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>> dmidecode output? Or are you going to base it solely on TPM
>>> manufacturer/version
>> It's more reliable to base the workaround on the values themselves,
>> instead of the TPM's ID, since
>> we don't know whether other models will behave similarly.
> As I wrote, you may base it on dmi data.
>
>> It should be fine then to extend the existing workaround for short
>> timeouts to the medium and long ones.
> OK, but how will you guess the values?
One way of doing it would be to at least make sure that the timeouts are

short < medium < long

and if that's not true, as in the case of your TPM, set the timeouts to 
0 and have Rajiv's work-around kick in  OR we assign the same high 
values to the timeouts explicily that Rajiv's work-around is using right 
now. Of course there could be another type of bad TPM firmware out there 
where all values are in ascending order but given in ms and cause 
time-outs -- but I would wait for someone to point that out since I am 
not aware of such a device.

Using the manufacturer, firmware version etc. to distinguish would 
probably open a can of worms...

    Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 15:30         ` Rajiv Andrade
                             ` (2 preceding siblings ...)
  2011-02-22  5:39           ` Norbert Preining
@ 2011-02-22  5:39           ` Norbert Preining
  2011-02-22  8:42             ` Jiri Slaby
  2011-02-22  8:42             ` Jiri Slaby
  3 siblings, 2 replies; 48+ messages in thread
From: Norbert Preining @ 2011-02-22  5:39 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Rafael J. Wysocki, Jiri Slaby, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds

Hi everyone,

sorry for late reply, was night over here in Japan.

On Mo, 21 Feb 2011, Rajiv Andrade wrote:
> Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?

I don't have any of these files:
$ ls /sys/devices/pnp0/*
/sys/devices/pnp0/uevent

/sys/devices/pnp0/00:00:
id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:01:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:02:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:03:
driver	       id     options  resources  subsystem
firmware_node  nvram  power    rtc	  uevent

/sys/devices/pnp0/00:04:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:05:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:06:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:07:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:08:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:09:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:0a:
active	driver	       id	owned  pubek	  temp_deactivated
cancel	enabled        misc	pcrs   resources  uevent
caps	firmware_node  options	power  subsystem

/sys/devices/pnp0/power:
async		      runtime_status	      wakeup_count
autosuspend_delay_ms  runtime_suspended_time  wakeup_hit_count
control		      runtime_usage	      wakeup_last_time_ms
runtime_active_kids   wakeup		      wakeup_max_time_ms
runtime_active_time   wakeup_active	      wakeup_total_time_ms
runtime_enabled       wakeup_active_count
$

running git kernel 6f576d57f1 (were the commit is reverted)

More I cannot offer ... let me know what else I can provide.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining            preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan                                 TeX Live & Debian Developer
DSA: 0x09C5B094   fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
The major difference between a thing that might go wrong
and a thing that cannot possibly go wrong is that when a
thing that cannot possibly go wrong goes wrong it usually
turns out to be impossible to get at or repair.
                 --- One of the laws of computers and programming revealed.
                 --- Douglas Adams, The Hitchhikers Guide to the Galaxy

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-21 15:30         ` Rajiv Andrade
  2011-02-21 16:34           ` Jiri Slaby
  2011-02-21 16:34           ` Jiri Slaby
@ 2011-02-22  5:39           ` Norbert Preining
  2011-02-22  5:39           ` Norbert Preining
  3 siblings, 0 replies; 48+ messages in thread
From: Norbert Preining @ 2011-02-22  5:39 UTC (permalink / raw)
  To: Rajiv Andrade
  Cc: Jiri Slaby, stefanb, Linux kernel mailing list, debora, linux-pm,
	Linus Torvalds, stable

Hi everyone,

sorry for late reply, was night over here in Japan.

On Mo, 21 Feb 2011, Rajiv Andrade wrote:
> Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?

I don't have any of these files:
$ ls /sys/devices/pnp0/*
/sys/devices/pnp0/uevent

/sys/devices/pnp0/00:00:
id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:01:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:02:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:03:
driver	       id     options  resources  subsystem
firmware_node  nvram  power    rtc	  uevent

/sys/devices/pnp0/00:04:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:05:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:06:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:07:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:08:
driver	firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:09:
firmware_node  id  options  power  resources  subsystem  uevent

/sys/devices/pnp0/00:0a:
active	driver	       id	owned  pubek	  temp_deactivated
cancel	enabled        misc	pcrs   resources  uevent
caps	firmware_node  options	power  subsystem

/sys/devices/pnp0/power:
async		      runtime_status	      wakeup_count
autosuspend_delay_ms  runtime_suspended_time  wakeup_hit_count
control		      runtime_usage	      wakeup_last_time_ms
runtime_active_kids   wakeup		      wakeup_max_time_ms
runtime_active_time   wakeup_active	      wakeup_total_time_ms
runtime_enabled       wakeup_active_count
$

running git kernel 6f576d57f1 (were the commit is reverted)

More I cannot offer ... let me know what else I can provide.

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining            preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan                                 TeX Live & Debian Developer
DSA: 0x09C5B094   fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
The major difference between a thing that might go wrong
and a thing that cannot possibly go wrong is that when a
thing that cannot possibly go wrong goes wrong it usually
turns out to be impossible to get at or repair.
                 --- One of the laws of computers and programming revealed.
                 --- Douglas Adams, The Hitchhikers Guide to the Galaxy

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  0:42                         ` Stefan Berger
  2011-02-22  8:41                           ` Jiri Slaby
@ 2011-02-22  8:41                           ` Jiri Slaby
  2011-02-22 11:57                             ` Stefan Berger
  2011-02-22 11:57                             ` Stefan Berger
  1 sibling, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22  8:41 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/22/2011 01:42 AM, Stefan Berger wrote:
> On 02/21/2011 05:10 PM, Jiri Slaby wrote:
>> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>>> There has to be another problem which caused my regression. And
>>>>>>>> since it
>>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>>> worked
>>>>>>>> for me, the ones read from TPM do not.
>>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>>> TPM as
>>>>>>> I mentioned, my working timeouts are:
>>>>>>> 3020000 4510000 181000000
>>>>>> 1000000 2000 150000
>>>>>>
>>>>>> Actually the first one from HW is 1. This is one is HZ after
>>>>>> correction
>>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>>> Following the specs, the timeouts are supposed to be in
>>>>> microseconds and
>>>>> ascending order for short, medium and long duration. Of course, if the
>>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>>> patch I put in but at the same time we'll need a work-around for
>>>>> devices
>>>>> like this.
>>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>>> cannot go in now. The rule is no regressions.
>>>>
>>>> After you have the workaround it should go into the next rc1 after
>>>> that.
>>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>>> dmidecode output? Or are you going to base it solely on TPM
>>>> manufacturer/version
>>> It's more reliable to base the workaround on the values themselves,
>>> instead of the TPM's ID, since
>>> we don't know whether other models will behave similarly.
>> As I wrote, you may base it on dmi data.
>>
>>> It should be fine then to extend the existing workaround for short
>>> timeouts to the medium and long ones.
>> OK, but how will you guess the values?
> One way of doing it would be to at least make sure that the timeouts are
> 
> short < medium < long
> 
> and if that's not true, as in the case of your TPM, set the timeouts to
> 0 and have Rajiv's work-around kick in  OR we assign the same high
> values to the timeouts explicily that Rajiv's work-around is using right
> now. Of course there could be another type of bad TPM firmware out there
> where all values are in ascending order but given in ms and cause
> time-outs -- but I would wait for someone to point that out since I am
> not aware of such a device.

Note that it is in ascending order (1 2000 150000). As I wrote the first
timeout (1) is replaced by one HZ in get_timeouts.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  0:42                         ` Stefan Berger
@ 2011-02-22  8:41                           ` Jiri Slaby
  2011-02-22  8:41                           ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22  8:41 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/22/2011 01:42 AM, Stefan Berger wrote:
> On 02/21/2011 05:10 PM, Jiri Slaby wrote:
>> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>>> There has to be another problem which caused my regression. And
>>>>>>>> since it
>>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>>> worked
>>>>>>>> for me, the ones read from TPM do not.
>>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>>> TPM as
>>>>>>> I mentioned, my working timeouts are:
>>>>>>> 3020000 4510000 181000000
>>>>>> 1000000 2000 150000
>>>>>>
>>>>>> Actually the first one from HW is 1. This is one is HZ after
>>>>>> correction
>>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>>> Following the specs, the timeouts are supposed to be in
>>>>> microseconds and
>>>>> ascending order for short, medium and long duration. Of course, if the
>>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>>> patch I put in but at the same time we'll need a work-around for
>>>>> devices
>>>>> like this.
>>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>>> cannot go in now. The rule is no regressions.
>>>>
>>>> After you have the workaround it should go into the next rc1 after
>>>> that.
>>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>>> dmidecode output? Or are you going to base it solely on TPM
>>>> manufacturer/version
>>> It's more reliable to base the workaround on the values themselves,
>>> instead of the TPM's ID, since
>>> we don't know whether other models will behave similarly.
>> As I wrote, you may base it on dmi data.
>>
>>> It should be fine then to extend the existing workaround for short
>>> timeouts to the medium and long ones.
>> OK, but how will you guess the values?
> One way of doing it would be to at least make sure that the timeouts are
> 
> short < medium < long
> 
> and if that's not true, as in the case of your TPM, set the timeouts to
> 0 and have Rajiv's work-around kick in  OR we assign the same high
> values to the timeouts explicily that Rajiv's work-around is using right
> now. Of course there could be another type of bad TPM firmware out there
> where all values are in ascending order but given in ms and cause
> time-outs -- but I would wait for someone to point that out since I am
> not aware of such a device.

Note that it is in ascending order (1 2000 150000). As I wrote the first
timeout (1) is replaced by one HZ in get_timeouts.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  5:39           ` Norbert Preining
@ 2011-02-22  8:42             ` Jiri Slaby
  2011-02-22  9:13               ` Norbert Preining
  2011-02-22  9:13               ` Norbert Preining
  2011-02-22  8:42             ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22  8:42 UTC (permalink / raw)
  To: Norbert Preining
  Cc: Rajiv Andrade, Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds

On 02/22/2011 06:39 AM, Norbert Preining wrote:
> Hi everyone,
> 
> sorry for late reply, was night over here in Japan.
> 
> On Mo, 21 Feb 2011, Rajiv Andrade wrote:
>> Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?
> 
> I don't have any of these files:
> $ ls /sys/devices/pnp0/*
> /sys/devices/pnp0/uevent
> 
> /sys/devices/pnp0/00:00:
> id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:01:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:02:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:03:
> driver	       id     options  resources  subsystem
> firmware_node  nvram  power    rtc	  uevent
> 
> /sys/devices/pnp0/00:04:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:05:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:06:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:07:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:08:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:09:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:0a:
> active	driver	       id	owned  pubek	  temp_deactivated
> cancel	enabled        misc	pcrs   resources  uevent
> caps	firmware_node  options	power  subsystem
> 
> /sys/devices/pnp0/power:
> async		      runtime_status	      wakeup_count
> autosuspend_delay_ms  runtime_suspended_time  wakeup_hit_count
> control		      runtime_usage	      wakeup_last_time_ms
> runtime_active_kids   wakeup		      wakeup_max_time_ms
> runtime_active_time   wakeup_active	      wakeup_total_time_ms
> runtime_enabled       wakeup_active_count
> $
> 
> running git kernel 6f576d57f1 (were the commit is reverted)

Ok, run some older kernel where this is not reverted (.38-rc5, 2.6.37.1)
or revert the revert :).

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  5:39           ` Norbert Preining
  2011-02-22  8:42             ` Jiri Slaby
@ 2011-02-22  8:42             ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22  8:42 UTC (permalink / raw)
  To: Norbert Preining
  Cc: stefanb, Rajiv Andrade, Linux kernel mailing list, debora,
	linux-pm, Linus Torvalds, stable

On 02/22/2011 06:39 AM, Norbert Preining wrote:
> Hi everyone,
> 
> sorry for late reply, was night over here in Japan.
> 
> On Mo, 21 Feb 2011, Rajiv Andrade wrote:
>> Norbert, can 'cat /sys/devices/pnp0/00\:0*/timeouts' and send the output?
> 
> I don't have any of these files:
> $ ls /sys/devices/pnp0/*
> /sys/devices/pnp0/uevent
> 
> /sys/devices/pnp0/00:00:
> id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:01:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:02:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:03:
> driver	       id     options  resources  subsystem
> firmware_node  nvram  power    rtc	  uevent
> 
> /sys/devices/pnp0/00:04:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:05:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:06:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:07:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:08:
> driver	firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:09:
> firmware_node  id  options  power  resources  subsystem  uevent
> 
> /sys/devices/pnp0/00:0a:
> active	driver	       id	owned  pubek	  temp_deactivated
> cancel	enabled        misc	pcrs   resources  uevent
> caps	firmware_node  options	power  subsystem
> 
> /sys/devices/pnp0/power:
> async		      runtime_status	      wakeup_count
> autosuspend_delay_ms  runtime_suspended_time  wakeup_hit_count
> control		      runtime_usage	      wakeup_last_time_ms
> runtime_active_kids   wakeup		      wakeup_max_time_ms
> runtime_active_time   wakeup_active	      wakeup_total_time_ms
> runtime_enabled       wakeup_active_count
> $
> 
> running git kernel 6f576d57f1 (were the commit is reverted)

Ok, run some older kernel where this is not reverted (.38-rc5, 2.6.37.1)
or revert the revert :).

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  8:42             ` Jiri Slaby
@ 2011-02-22  9:13               ` Norbert Preining
  2011-02-22  9:13               ` Norbert Preining
  1 sibling, 0 replies; 48+ messages in thread
From: Norbert Preining @ 2011-02-22  9:13 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Rafael J. Wysocki, stefanb, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds

On Di, 22 Feb 2011, Jiri Slaby wrote:
> Ok, run some older kernel where this is not reverted (.38-rc5, 2.6.37.1)
> or revert the revert :).

1000000 4000 152000

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining            preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan                                 TeX Live & Debian Developer
DSA: 0x09C5B094   fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HARBLEDOWN (vb.)
To manoeuvre a double mattress down a winding staircase.
			--- Douglas Adams, The Meaning of Liff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  8:42             ` Jiri Slaby
  2011-02-22  9:13               ` Norbert Preining
@ 2011-02-22  9:13               ` Norbert Preining
  1 sibling, 0 replies; 48+ messages in thread
From: Norbert Preining @ 2011-02-22  9:13 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: stefanb, Rajiv Andrade, Linux kernel mailing list, debora,
	linux-pm, Linus Torvalds, stable

On Di, 22 Feb 2011, Jiri Slaby wrote:
> Ok, run some older kernel where this is not reverted (.38-rc5, 2.6.37.1)
> or revert the revert :).

1000000 4000 152000

Best wishes

Norbert
------------------------------------------------------------------------
Norbert Preining            preining@{jaist.ac.jp, logic.at, debian.org}
JAIST, Japan                                 TeX Live & Debian Developer
DSA: 0x09C5B094   fp: 14DF 2E6C 0307 BE6D AD76  A9C0 D2BF 4AA3 09C5 B094
------------------------------------------------------------------------
HARBLEDOWN (vb.)
To manoeuvre a double mattress down a winding staircase.
			--- Douglas Adams, The Meaning of Liff

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  8:41                           ` Jiri Slaby
  2011-02-22 11:57                             ` Stefan Berger
@ 2011-02-22 11:57                             ` Stefan Berger
  2011-02-22 17:39                               ` Jiri Slaby
  2011-02-22 17:39                               ` Jiri Slaby
  1 sibling, 2 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-22 11:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/22/2011 03:41 AM, Jiri Slaby wrote:
> On 02/22/2011 01:42 AM, Stefan Berger wrote:
>> On 02/21/2011 05:10 PM, Jiri Slaby wrote:
>>> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>>>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>>>> There has to be another problem which caused my regression. And
>>>>>>>>> since it
>>>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>>>> worked
>>>>>>>>> for me, the ones read from TPM do not.
>>>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>>>> TPM as
>>>>>>>> I mentioned, my working timeouts are:
>>>>>>>> 3020000 4510000 181000000
>>>>>>> 1000000 2000 150000
>>>>>>>
>>>>>>> Actually the first one from HW is 1. This is one is HZ after
>>>>>>> correction
>>>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>>>> Following the specs, the timeouts are supposed to be in
>>>>>> microseconds and
>>>>>> ascending order for short, medium and long duration. Of course, if the
>>>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>>>> patch I put in but at the same time we'll need a work-around for
>>>>>> devices
>>>>>> like this.
>>>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>>>> cannot go in now. The rule is no regressions.
>>>>>
>>>>> After you have the workaround it should go into the next rc1 after
>>>>> that.
>>>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>>>> dmidecode output? Or are you going to base it solely on TPM
>>>>> manufacturer/version
>>>> It's more reliable to base the workaround on the values themselves,
>>>> instead of the TPM's ID, since
>>>> we don't know whether other models will behave similarly.
>>> As I wrote, you may base it on dmi data.
>>>
>>>> It should be fine then to extend the existing workaround for short
>>>> timeouts to the medium and long ones.
>>> OK, but how will you guess the values?
>> One way of doing it would be to at least make sure that the timeouts are
>>
>> short<  medium<  long
>>
>> and if that's not true, as in the case of your TPM, set the timeouts to
>> 0 and have Rajiv's work-around kick in  OR we assign the same high
>> values to the timeouts explicily that Rajiv's work-around is using right
>> now. Of course there could be another type of bad TPM firmware out there
>> where all values are in ascending order but given in ms and cause
>> time-outs -- but I would wait for someone to point that out since I am
>> not aware of such a device.
> Note that it is in ascending order (1 2000 150000). As I wrote the first
> timeout (1) is replaced by one HZ in get_timeouts.
The forthcoming patch will simply also adapt the other 2 values and 
multiply them by 1000. The reason for the suspend failure is the 2nd 
timeout with TPM_SaveState command being of medium duration.

There will be a 2nd patch for re-enabling the TPM's interrupts that the 
BIOS may (this may be BIOS-dependent) have disabled while sending a 
command (TPM_Startup) to the TPM upon resume and having used polling 
mode and leaving it with the interrupts disabled.

I'd appreciate it if you tested both of them.

    Stefan


^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22  8:41                           ` Jiri Slaby
@ 2011-02-22 11:57                             ` Stefan Berger
  2011-02-22 11:57                             ` Stefan Berger
  1 sibling, 0 replies; 48+ messages in thread
From: Stefan Berger @ 2011-02-22 11:57 UTC (permalink / raw)
  To: Jiri Slaby
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/22/2011 03:41 AM, Jiri Slaby wrote:
> On 02/22/2011 01:42 AM, Stefan Berger wrote:
>> On 02/21/2011 05:10 PM, Jiri Slaby wrote:
>>> On 02/21/2011 11:07 PM, Rajiv Andrade wrote:
>>>> On 02/21/2011 06:44 PM, Jiri Slaby wrote:
>>>>> On 02/21/2011 10:29 PM, Stefan Berger wrote:
>>>>>> On 02/21/2011 03:39 PM, Jiri Slaby wrote:
>>>>>>> On 02/21/2011 06:12 PM, Rajiv Andrade wrote:
>>>>>>>> On 02/21/2011 01:34 PM, Jiri Slaby wrote:
>>>>>>>>> There has to be another problem which caused my regression. And
>>>>>>>>> since it
>>>>>>>>> reports "Operation Timed out", the former default timeout values
>>>>>>>>> worked
>>>>>>>>> for me, the ones read from TPM do not.
>>>>>>>> Yes, it's highly due inconsistent timeout values reported by the
>>>>>>>> TPM as
>>>>>>>> I mentioned, my working timeouts are:
>>>>>>>> 3020000 4510000 181000000
>>>>>>> 1000000 2000 150000
>>>>>>>
>>>>>>> Actually the first one from HW is 1. This is one is HZ after
>>>>>>> correction
>>>>>>> in get_timeout. So perhaps it is in ms, yes.
>>>>>> Following the specs, the timeouts are supposed to be in
>>>>>> microseconds and
>>>>>> ascending order for short, medium and long duration. Of course, if the
>>>>>> device returns wrong timeouts, the command isn't going to succeed,
>>>>>> failing the suspend in this case. Nevertheless, I think we need the
>>>>>> patch I put in but at the same time we'll need a work-around for
>>>>>> devices
>>>>>> like this.
>>>>> Yes, the patch is correct per se. But as it breaks bunch of machines it
>>>>> cannot go in now. The rule is no regressions.
>>>>>
>>>>> After you have the workaround it should go into the next rc1 after
>>>>> that.
>>>>> Do you plan to add a dmi-based quirk? Or, IOW do you want me to attach
>>>>> dmidecode output? Or are you going to base it solely on TPM
>>>>> manufacturer/version
>>>> It's more reliable to base the workaround on the values themselves,
>>>> instead of the TPM's ID, since
>>>> we don't know whether other models will behave similarly.
>>> As I wrote, you may base it on dmi data.
>>>
>>>> It should be fine then to extend the existing workaround for short
>>>> timeouts to the medium and long ones.
>>> OK, but how will you guess the values?
>> One way of doing it would be to at least make sure that the timeouts are
>>
>> short<  medium<  long
>>
>> and if that's not true, as in the case of your TPM, set the timeouts to
>> 0 and have Rajiv's work-around kick in  OR we assign the same high
>> values to the timeouts explicily that Rajiv's work-around is using right
>> now. Of course there could be another type of bad TPM firmware out there
>> where all values are in ascending order but given in ms and cause
>> time-outs -- but I would wait for someone to point that out since I am
>> not aware of such a device.
> Note that it is in ascending order (1 2000 150000). As I wrote the first
> timeout (1) is replaced by one HZ in get_timeouts.
The forthcoming patch will simply also adapt the other 2 values and 
multiply them by 1000. The reason for the suspend failure is the 2nd 
timeout with TPM_SaveState command being of medium duration.

There will be a 2nd patch for re-enabling the TPM's interrupts that the 
BIOS may (this may be BIOS-dependent) have disabled while sending a 
command (TPM_Startup) to the TPM upon resume and having used polling 
mode and leaving it with the interrupts disabled.

I'd appreciate it if you tested both of them.

    Stefan

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22 11:57                             ` Stefan Berger
@ 2011-02-22 17:39                               ` Jiri Slaby
  2011-02-22 17:39                               ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22 17:39 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Rafael J. Wysocki, linux-pm, stable,
	Linux kernel mailing list, debora, Linus Torvalds, preining

On 02/22/2011 12:57 PM, Stefan Berger wrote:
> The forthcoming patch will simply also adapt the other 2 values and
> multiply them by 1000. The reason for the suspend failure is the 2nd
> timeout with TPM_SaveState command being of medium duration.
> 
> There will be a 2nd patch for re-enabling the TPM's interrupts that the
> BIOS may (this may be BIOS-dependent) have disabled while sending a
> command (TPM_Startup) to the TPM upon resume and having used polling
> mode and leaving it with the interrupts disabled.
> 
> I'd appreciate it if you tested both of them.

I didn't get any. Could you add a pointer to them?

BTW. there is another guy who hit this. He has:
/sys/devices/pnp0/00:02/timeouts:1000000 2000 150000

like I do.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* Re: 2.6.37.1 s2disk regression (TPM)
  2011-02-22 11:57                             ` Stefan Berger
  2011-02-22 17:39                               ` Jiri Slaby
@ 2011-02-22 17:39                               ` Jiri Slaby
  1 sibling, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-22 17:39 UTC (permalink / raw)
  To: Stefan Berger
  Cc: Rajiv Andrade, Linux kernel mailing list, debora, preining,
	linux-pm, Linus Torvalds, stable

On 02/22/2011 12:57 PM, Stefan Berger wrote:
> The forthcoming patch will simply also adapt the other 2 values and
> multiply them by 1000. The reason for the suspend failure is the 2nd
> timeout with TPM_SaveState command being of medium duration.
> 
> There will be a 2nd patch for re-enabling the TPM's interrupts that the
> BIOS may (this may be BIOS-dependent) have disabled while sending a
> command (TPM_Startup) to the TPM upon resume and having used polling
> mode and leaving it with the interrupts disabled.
> 
> I'd appreciate it if you tested both of them.

I didn't get any. Could you add a pointer to them?

BTW. there is another guy who hit this. He has:
/sys/devices/pnp0/00:02/timeouts:1000000 2000 150000

like I do.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

* 2.6.37.1 s2disk regression (TPM)
@ 2011-02-20 10:13 Jiri Slaby
  0 siblings, 0 replies; 48+ messages in thread
From: Jiri Slaby @ 2011-02-20 10:13 UTC (permalink / raw)
  To: stefanb
  Cc: srajiv, Linux kernel mailing list, debora, tpmdd-devel, linux-pm, stable

Hi,

I'm unable to hibernate 2.6.37.1 unless I rmmod tpm_tis:
[10974.074587] Suspending console(s) (use no_console_suspend to debug)
[10974.103073] tpm_tis 00:0c: Operation Timed out
[10974.103089] legacy_suspend(): pnp_bus_suspend+0x0/0xa0 returns -62
[10974.103095] PM: Device 00:0c failed to freeze: error -62

2.6.37 worked fine. Going to revert 9b29050f8f7 (tpm_tis: Use timeouts
returned from TPM) for testing.

regards,
-- 
js

^ permalink raw reply	[flat|nested] 48+ messages in thread

end of thread, other threads:[~2011-02-22 17:39 UTC | newest]

Thread overview: 48+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-02-20 10:13 2.6.37.1 s2disk regression (TPM) Jiri Slaby
2011-02-20 10:44 ` Rafael J. Wysocki
2011-02-20 10:46   ` Jiri Slaby
2011-02-20 10:51     ` Rafael J. Wysocki
2011-02-20 10:57       ` [REVERT request stable-2.6.36/37] " Jiri Slaby
2011-02-20 10:57       ` Jiri Slaby
2011-02-20 16:50         ` [stable] " Greg KH
2011-02-20 16:50         ` Greg KH
2011-02-20 11:48       ` Rafael J. Wysocki
2011-02-20 11:48       ` Rafael J. Wysocki
2011-02-21 15:30         ` Rajiv Andrade
2011-02-21 15:30         ` Rajiv Andrade
2011-02-21 16:34           ` Jiri Slaby
2011-02-21 16:57             ` Linus Torvalds
2011-02-21 16:57             ` Linus Torvalds
2011-02-21 17:12             ` Rajiv Andrade
2011-02-21 17:12             ` Rajiv Andrade
2011-02-21 20:39               ` Jiri Slaby
2011-02-21 21:29                 ` Stefan Berger
2011-02-21 21:44                   ` Jiri Slaby
2011-02-21 22:07                     ` Rajiv Andrade
2011-02-21 22:07                     ` Rajiv Andrade
2011-02-21 22:10                       ` Jiri Slaby
2011-02-21 22:17                         ` Rafael J. Wysocki
2011-02-21 22:17                         ` Rafael J. Wysocki
2011-02-22  0:42                         ` Stefan Berger
2011-02-22  8:41                           ` Jiri Slaby
2011-02-22  8:41                           ` Jiri Slaby
2011-02-22 11:57                             ` Stefan Berger
2011-02-22 11:57                             ` Stefan Berger
2011-02-22 17:39                               ` Jiri Slaby
2011-02-22 17:39                               ` Jiri Slaby
2011-02-22  0:42                         ` Stefan Berger
2011-02-21 22:10                       ` Jiri Slaby
2011-02-21 21:44                   ` Jiri Slaby
2011-02-21 21:29                 ` Stefan Berger
2011-02-21 20:39               ` Jiri Slaby
2011-02-21 16:34           ` Jiri Slaby
2011-02-22  5:39           ` Norbert Preining
2011-02-22  5:39           ` Norbert Preining
2011-02-22  8:42             ` Jiri Slaby
2011-02-22  9:13               ` Norbert Preining
2011-02-22  9:13               ` Norbert Preining
2011-02-22  8:42             ` Jiri Slaby
2011-02-20 10:51     ` Rafael J. Wysocki
2011-02-20 10:46   ` Jiri Slaby
2011-02-20 10:44 ` Rafael J. Wysocki
2011-02-20 10:13 Jiri Slaby

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.