* Odd hang on suspend and shutdown
@ 2015-07-05 23:20 Linus Torvalds
       [not found] ` <CA+55aFy1x6iwbdV8WfR+wawj_1+PxJ+P-Js=EVqD9ZsQetSNJA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-05 23:20 UTC (permalink / raw)
  To: Tomas Winkler, Greg Kroah-Hartman; +Cc: Linux API, Samuel Ortiz

On my Sony VAIO Pro 11 laptop, commit c93b76b34b4d ("mei: bus: report
also uuid in module alias") seems to cause problems at suspend and
shutdown.

In particular, reverting just the oneliner to drivers/nfc/pn544/mei.c
seems to fix things for me. The bisection was a pain, because it took
me forever to realize that it was that one-liner that caused it: I had
initially undone that one line simply because it didn't compile with
it in place (complaints about MEI_NFC_UUID not being constant). So I
continued to bisect with the fix unintentionally in place.

So just removing the MEI_NFC_UUID entry from the pn544_mei_tbl[] array
initialization makes things work for me again.

The symptoms are just a hard hang at suspend or shutdown.

This is a pretty regular Intel-only laptop, running Fedora 22.

Any ideas?

                    Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found] ` <CA+55aFy1x6iwbdV8WfR+wawj_1+PxJ+P-Js=EVqD9ZsQetSNJA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-05 23:24   ` Linus Torvalds
  2015-07-06 13:26   ` Winkler, Tomas
  2015-07-06 16:07   ` Samuel Ortiz
  2 siblings, 0 replies; 21+ messages in thread
From: Linus Torvalds @ 2015-07-05 23:24 UTC (permalink / raw)
  To: Tomas Winkler, Greg Kroah-Hartman; +Cc: Linux API, Samuel Ortiz

On Sun, Jul 5, 2015 at 4:20 PM, Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
>
> So just removing the MEI_NFC_UUID entry from the pn544_mei_tbl[] array
> initialization makes things work for me again.

Side note: the fact that the initial commit didn't even compile, and
that this seems to be an API change, _and_ that it breaks things for a
laptop of mine makes me suspect that the answer is "revert the crap as
an obvious regression, and let's make sure it never resurfaces".

But I'm willing to entertain sane alternatives. Not for very long,
though, because I hate having known breakage on one of my machines.

                     Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Odd hang on suspend and shutdown
       [not found] ` <CA+55aFy1x6iwbdV8WfR+wawj_1+PxJ+P-Js=EVqD9ZsQetSNJA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-07-05 23:24   ` Linus Torvalds
@ 2015-07-06 13:26   ` Winkler, Tomas
       [not found]     ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E1F3F-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-07-06 16:07   ` Samuel Ortiz
  2 siblings, 1 reply; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-06 13:26 UTC (permalink / raw)
  To: Linus Torvalds, Greg Kroah-Hartman; +Cc: Linux API, Samuel Ortiz


> 
> On my Sony VAIO Pro 11 laptop, commit c93b76b34b4d ("mei: bus: report
> also uuid in module alias") seems to cause problems at suspend and
> shutdown.
> 
> In particular, reverting just the oneliner to drivers/nfc/pn544/mei.c
> seems to fix things for me. The bisection was a pain, because it took
> me forever to realize that it was that one-liner that caused it: I had
> initially undone that one line simply because it didn't compile with
> it in place (complaints about MEI_NFC_UUID not being constant). So I
> continued to bisect with the fix unintentionally in place.
> 
> So just removing the MEI_NFC_UUID entry from the pn544_mei_tbl[] array
> initialization makes things work for me again.
> 
> The symptoms are just a hard hang at suspend or shutdown.
> 
> This is a pretty regular Intel-only laptop, running Fedora 22.
> 
> Any ideas?
> 
>                     Linus


I'm looking into this (not sure I can get the exact platform to test with).

There was a misordering in how the patches were applied, which may confuse bisection even more.
The fix for that was provided in this patch:

In commit b144ce2d37619e05afdb0a15676500d76a64b1be
Author: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date:   Wed May 27 17:17:27 2015 -0700

    mei: fix up uuid matching


We had a suspend issue that was resolved by the patch below, but it's not related to NFC at all.
NFC under Haswell should not actually be operational; I'm not sure if Samuel is planning to support it at all.
We still need to figure out what caused that regression.
 
commit 3dc196eae1db548f05e53e5875ff87b8ff79f249
Author: Alexander Usyskin <alexander.usyskin@intel.com>
Date:   Sat Jun 13 08:51:17 2015 +0300

    mei: me: wait for power gating exit confirmation

    Fix the hbm power gating state machine so it will wait till it receives
    confirmation interrupt for the PG_ISOLATION_EXIT message.

Thanks
Tomas



^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]     ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E1F3F-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-07-06 15:28       ` Linus Torvalds
  0 siblings, 0 replies; 21+ messages in thread
From: Linus Torvalds @ 2015-07-06 15:28 UTC (permalink / raw)
  To: Winkler, Tomas; +Cc: Greg Kroah-Hartman, Linux API, Samuel Ortiz

On Mon, Jul 6, 2015 at 6:26 AM, Winkler, Tomas <tomas.winkler-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>
> We had  suspend issue that was resolved by a patch below

That patch is in 4.2-rc1, and that's what I see the problem with. So
it doesn't fix anything for me, I'm afraid.

                      Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found] ` <CA+55aFy1x6iwbdV8WfR+wawj_1+PxJ+P-Js=EVqD9ZsQetSNJA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-07-05 23:24   ` Linus Torvalds
  2015-07-06 13:26   ` Winkler, Tomas
@ 2015-07-06 16:07   ` Samuel Ortiz
       [not found]     ` <20150706160706.GA22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
  2 siblings, 1 reply; 21+ messages in thread
From: Samuel Ortiz @ 2015-07-06 16:07 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Sun, Jul 05, 2015 at 04:20:02PM -0700, Linus Torvalds wrote:
> On my Sony VAIO Pro 11 laptop, commit c93b76b34b4d ("mei: bus: report
> also uuid in module alias") seems to cause problems at suspend and
> shutdown.
> 
> In particular, reverting just the oneliner to drivers/nfc/pn544/mei.c
> seems to fix things for me. The bisection was a pain, because it took
> me forever to realize that it was that one-liner that caused it: I had
> initially undone that one line simply because it didn't compile with
> it in place (complaints about MEI_NFC_UUID not being constant). So I
> continued to bisect with the fix unintentionally in place.
> 
> So just removing the MEI_NFC_UUID entry from the pn544_mei_tbl[] array
> initialization makes things work for me again.
I suppose you were not seeing this issue on e.g. a 4.1 kernel (assuming
this is not a brand new laptop) ?
If that's the case, is any of the pn544* kernel modules loaded when
running on top of a good kernel ?

Cheers,
Samuel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]     ` <20150706160706.GA22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
@ 2015-07-06 16:49       ` Linus Torvalds
       [not found]         ` <CA+55aFxfENpyWSDxgaQhB1mBHE6zs=w=Mc0VEQLgdwSaCMX5eQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-06 16:49 UTC (permalink / raw)
  To: Samuel Ortiz; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 6, 2015 at 9:07 AM, Samuel Ortiz <sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
>
> I suppose you were not seeing this issue on e.g. a 4.1 kernel (assuming
> this is not a brand new laptop) ?

Nope, 4.1 is fine. This is my travel laptop for the last almost two
years. Best laptop I've ever had (I like them small and light).

> If that's the case, is any of the pn544* kernel modules loaded when
> running on top of a good kernel ?

So it doesn't show up in lsmod, but presumably the module got loaded
and then errored out or something. The mei and mei_me modules are
loaded, but no pn544_mei.

I'm not even sure why it's in my configuration, because I tend to try
to keep those small, and my config was generated (long ago) with "make
localmodconfig", but maybe there's something non-obvious that brought
it in. Presumably that pn544 module was loaded at _some_ point in the
past.

I just tested, and loading it manually with "modprobe pn544_mei"
doesn't seem to do anything bad.  I can still suspend with a good
kernel. So it's not just about loading the module, there's some other
interaction going on.

                     Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]         ` <CA+55aFxfENpyWSDxgaQhB1mBHE6zs=w=Mc0VEQLgdwSaCMX5eQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-06 16:59           ` Samuel Ortiz
       [not found]             ` <20150706165959.GB22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
  2015-07-06 18:59           ` Winkler, Tomas
  2015-07-06 22:31           ` Samuel Ortiz
  2 siblings, 1 reply; 21+ messages in thread
From: Samuel Ortiz @ 2015-07-06 16:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 06, 2015 at 09:49:51AM -0700, Linus Torvalds wrote:
> I just tested, and loading it manually with "modprobe pn544_mei"
> doesn't seem to do anything bad.  I can still suspend with a good
> kernel. So it's not just about loading the module, there's some other
> interaction going on.
Ok, if you can suspend after this modprobe we're looking at some
potential mei bus or mei core issues. I'll revamp one of my HSW boxes and
will try to reproduce that issue on a 4.2-rc1 kernel.
I'll keep you posted.

Cheers,
Samuel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]             ` <20150706165959.GB22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
@ 2015-07-06 17:29               ` Linus Torvalds
       [not found]                 ` <CA+55aFwM8c3Zm0t0cEnbGS0d0rinzGwnX9uSxqKjff8K=ATaRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-06 17:29 UTC (permalink / raw)
  To: Samuel Ortiz; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 6, 2015 at 9:59 AM, Samuel Ortiz <sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
>
> Ok, if you can suspend after this modprobe we're looking at some
> potential mei bus or mei core issues.

Just looking at the mei/bus.c code to match the driver, and it looks just odd.

Before the commit that breaks for me, it used to have this loop:

-       while (id->name[0]) {

and now it has

+       while (uuid_le_cmp(NULL_UUID_LE, uuid_le_cast(id->uuid))) {

which seems to mean that together with my change to make a working
kernel (just removing the UUID entirely) means that now the
device_driver isn't matched against anything at all any more.

So I'm not sure "modprobe pn544_mei" ends up actually triggering
anything at all with my change to make it not lock up for me.

                  Linus
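
[Illustrative aside on the loop change quoted above. This is a minimal user-space sketch, not the kernel code: struct demo_id, the demo_* helpers and both tables are hypothetical stand-ins for the real id-table entry type and pn544_mei_tbl[]. The UUID bytes are the NFC client UUID that appears in the meclients dump later in this thread, laid out in what is assumed to be the little-endian field order. It only shows why a table whose entries carry no UUID looks empty to a walk that stops at the first all-zero UUID.]

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's id-table entry; not the real layout. */
struct demo_id {
	char name[32];
	unsigned char uuid[16];
};

/* All-zero UUID, playing the role of NULL_UUID_LE. */
static const unsigned char demo_null_uuid[16] = { 0 };

/* Old-style walk: an entry with an empty name terminates the table. */
size_t demo_count_by_name(const struct demo_id *id)
{
	size_t n = 0;

	while (id->name[0]) {
		n++;
		id++;
	}
	return n;
}

/* New-style walk: an entry with an all-zero UUID terminates the table. */
size_t demo_count_by_uuid(const struct demo_id *id)
{
	size_t n = 0;

	while (memcmp(demo_null_uuid, id->uuid, sizeof(id->uuid)) != 0) {
		n++;
		id++;
	}
	return n;
}

/* Table as in the workaround: name kept, UUID entry removed. */
const struct demo_id demo_tbl_no_uuid[] = {
	{ .name = "pn544" },
	{ .name = "" },	/* terminator */
};

/* Same table with the UUID filled in. */
const struct demo_id demo_tbl_with_uuid[] = {
	{ .name = "pn544",
	  .uuid = { 0x78, 0x7a, 0xb1, 0x0b, 0x8e, 0x2a, 0x50, 0x4c,
		    0x94, 0xd4, 0x50, 0x26, 0x67, 0x23, 0x77, 0x5c } },
	{ .name = "" },	/* terminator: empty name and all-zero UUID */
};
```

[With these definitions, demo_count_by_name() sees one entry in either table, while demo_count_by_uuid() sees one entry only in demo_tbl_with_uuid and none in demo_tbl_no_uuid, since the very first entry there already looks like the terminator.]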

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Odd hang on suspend and shutdown
       [not found]         ` <CA+55aFxfENpyWSDxgaQhB1mBHE6zs=w=Mc0VEQLgdwSaCMX5eQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-07-06 16:59           ` Samuel Ortiz
@ 2015-07-06 18:59           ` Winkler, Tomas
       [not found]             ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E3474-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  2015-07-06 22:31           ` Samuel Ortiz
  2 siblings, 1 reply; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-06 18:59 UTC (permalink / raw)
  To: Linus Torvalds, Samuel Ortiz; +Cc: Greg Kroah-Hartman, Linux API

> >
> > I suppose you were not seeing this issue on e.g. a 4.1 kernel (assuming
> > this is not a brand new laptop) ?
> 
> Nope, 4.1 is fine. This is my travel laptop for the last almost two
> years. Best laptop I've ever had (I like them small and light).

If this is two years old, that's probably not HSW. Can you send out the exact HW info? The PCI device IDs are of most interest.

> 
> > If that's the case, is any of the pn544* kernel modules loaded when
> > running on top of a good kernel ?
> 
> So it doesn't show up in lsmod, but presumably the module got loaded
> and then errored out or something. The mei and mei_me modules are
> loaded, but no pn544_mei.

This should be loaded only if you have NFC on the board, or it bailed out more cleanly before.
There is a file under debugfs, mei0/meclients; if you can send out the dump, we can see what we support on that platform.
> 
> I'm not even sure why it's in my configuration, because I tend to try
> to keep those small, and my config was generated (long ago) with "make
> localmodconfig", but maybe there's something non-obvious that brought
> it in. Presumably that pn544 module was loaded at _some_ point in the
> past.
> 
Frankly, if you are not using any of the pro or NFC you don't really need that loaded.

> I just tested, and loading it manually with "modprobe pn544_mei"
> doesn't seem to do anything bad.  I can still suspend with a good
> kernel. So it's not just about loading the module, there's some other
> interaction going on.

The question is whether the module gets bound to the NFC ME client.

Thanks
Tomas

 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Odd hang on suspend and shutdown
       [not found]                 ` <CA+55aFwM8c3Zm0t0cEnbGS0d0rinzGwnX9uSxqKjff8K=ATaRA-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-06 19:18                   ` Winkler, Tomas
       [not found]                     ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E34A6-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-06 19:18 UTC (permalink / raw)
  To: Linus Torvalds, Samuel Ortiz; +Cc: Greg Kroah-Hartman, Linux API

> On Mon, Jul 6, 2015 at 9:59 AM, Samuel Ortiz <sameo@linux.intel.com> wrote:
> >
> > Ok, if you can suspend after this modprobe we're looking at some
> > potential mei bus or mei core issues.

> 
> Just looking at the mei/bus.c code to match the driver, and it looks just odd.

The major part of the bus was rejected by Greg and I'm still waiting for him to like me back and review the resend.
> 
> Before the commit that breaks for me, it used to have this loop:
> 
> -       while (id->name[0]) {
> 
> and now it has
> 
> +       while (uuid_le_cmp(NULL_UUID_LE, uuid_le_cast(id->uuid))) {

This is the major change in the bus: instead of inventing names for each device,
we do PnP by matching against UUIDs, which are fetched from the parent mei device at initialization.
It looks like there is some conflict between the UUID and the driver (pn544) that now binds to it and probably didn't before.
I'll be smarter when we have the HW info.
 
> 
> which seems to mean that together with my change to make a working
> kernel (just removing the UUID entirely) means that now the
> device_driver isn't matched against anything at all any more.
>
> So I'm not sure "modprobe pn544_mei" ends up actually triggering
> anything at all with my change to make it not lock up for me.

If you removed the UUID it would just be loaded, I guess

Thanks
Tomas


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]             ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E3474-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-07-06 19:40               ` Linus Torvalds
       [not found]                 ` <CA+55aFxjuufbJpn+MngBzO3QyaYQm_ZNsgQY3VqPpguVpzOk6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-06 19:40 UTC (permalink / raw)
  To: Winkler, Tomas; +Cc: Samuel Ortiz, Greg Kroah-Hartman, Linux API

On Mon, Jul 6, 2015 at 11:59 AM, Winkler, Tomas <tomas.winkler-ral2JQCrhuEAvxtiuMwx3w@public.gmane.org> wrote:
>>
>> Nope, 4.1 is fine. This is my travel laptop for the last almost two
>> years. Best laptop I've ever had (I like them small and light).
>
> If this two years old that's probably not HSW, can you send out the exact HW info, mostly pci device ids are in interest.

Oh, it's Haswell. It's an i5-4200U. And no, it's not quite two years
old, but it's from fall 2013.

And it's literally all Intel (series 8 mobile chipset) with an Intel
7260 wireless. A whole lot of 8086 there:

  [torvalds@vaio ~]$ lspci -n
  00:00.0 0600: 8086:0a04 (rev 09)
  00:02.0 0300: 8086:0a16 (rev 09)
  00:03.0 0403: 8086:0a0c (rev 09)
  00:14.0 0c03: 8086:9c31 (rev 04)
  00:16.0 0780: 8086:9c3a (rev 04)
  00:1b.0 0403: 8086:9c20 (rev 04)
  00:1c.0 0604: 8086:9c14 (rev e4)
  00:1c.3 0604: 8086:9c16 (rev e4)
  00:1d.0 0c03: 8086:9c26 (rev 04)
  00:1f.0 0601: 8086:9c43 (rev 04)
  00:1f.2 0106: 8086:9c03 (rev 04)
  00:1f.3 0c05: 8086:9c22 (rev 04)
  01:00.0 0280: 8086:08b1 (rev 6b)

> This should be loaded only if you have NFC on the board or it bailed out more cleanly before.
> There is a file under debugfs mei0/meclients , if you can send out the dump we can see what
> we support on that platform.

  [torvalds@vaio ~]$ sudo cat /sys/kernel/debug/mei0/meclients
    |id|fix|         UUID                       |con|msg len|sb|refc|
   0|42|  0|b638ab7e-94e2-4ea2-a552-d1c54b627f04|  1|   2048| 0|   2|
   1|41|  0|fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1|  3|  15360| 0|   2|
   2|40|  0|3c4852d6-d47b-4f46-b05e-b5edc1aa440e|  1|   4096| 0|   2|
   3|39|  0|0bb17a78-2a8e-4c50-94d4-50266723775c|  1|    312| 0|   4|
   4|38|  0|d2de1625-382d-417d-48a4-efabba8a1206|  5|    312| 0|   2|
   5|37|  0|3c4852d6-d47b-4f46-b05e-b5edc1aa430a|  1|   4096| 0|   2|
   6|36|  0|f908627d-13bf-4a04-b91f-a64e9245323d|  1|  12388| 0|   2|
   7|35|  0|8c2f4425-77d6-4755-aca3-891fdbc66a58|  1|   1328| 0|   2|
   8|34|  0|3893448c-eab6-4f4c-b23c-57c2c4658dfc|  2|     64| 0|   2|
   9|33|  0|309dcde8-ccb1-4062-8f78-600115a34327|  1|    512| 0|   2|
  10|32|  0|8e6a6715-9abc-4043-88ef-9e39c6f63e0f|  2|   2048| 0|   2|
  11| 8|  1|bf3cb4da-4045-4f9b-838d-8cbcfb21a107|  0|   1328| 1|   2|
  12| 7|  1|55213584-9a29-4916-badf-0fb7ed682aeb|  0|   2048| 1|   2|
  13| 5|  1|fa8f55e8-ab22-42dd-b916-7dce39002574|  1|   4096| 1|   2|
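
[Illustrative aside: a small sketch of how one might pick a client out of a meclients dump like the one above by its UUID. The helper name demo_me_id_for_uuid is hypothetical, and the column layout (index, ME client id, fixed flag, UUID, ...) is assumed purely from this dump rather than from any documented format.]

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the ME client id (second column) of a meclients row whose UUID
 * column equals want_uuid, or -1 if the line is a header, malformed, or
 * carries a different UUID. The layout is assumed from the dump above.
 */
int demo_me_id_for_uuid(const char *line, const char *want_uuid)
{
	int idx, me_id, fixed;
	char uuid[37];	/* 36 characters plus the terminating NUL */

	/* Expected shape: "   3|39|  0|<36-char uuid>|..." */
	if (sscanf(line, " %d|%d| %d|%36[0-9a-f-]",
		   &idx, &me_id, &fixed, uuid) != 4)
		return -1;

	return strcmp(uuid, want_uuid) == 0 ? me_id : -1;
}
```

[Fed the row for 0bb17a78-2a8e-4c50-94d4-50266723775c from the dump, this returns that row's id column (39); the header line and rows with other UUIDs yield -1.]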

> Frankly, if you are not using any of the pro or NFC you don't really need that loaded.

Yeah. Except I will not be releasing a kernel that I can tell is buggy
on my own hardware, so it really doesn't matter.

So that commit gets reverted unless somebody figures out what's wrong.

If *I* see problems from my very limited hardware testing, I guarantee
that others will see them too. And others will not be able to
necessarily bisect and test things, they'll just say "it hangs" and
maybe never try Linux any more than that.

So bugs I can see myself end up being stuff that I revert pretty much
immediately. I don't run odd hardware.

> The question if the module get bound to the NFC me client?

.. and how would that show up?

              Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: Odd hang on suspend and shutdown
       [not found]                 ` <CA+55aFxjuufbJpn+MngBzO3QyaYQm_ZNsgQY3VqPpguVpzOk6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-06 21:12                   ` Winkler, Tomas
  2015-07-06 22:33                   ` Samuel Ortiz
  1 sibling, 0 replies; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-06 21:12 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Samuel Ortiz, Greg Kroah-Hartman, Linux API

> >> Nope, 4.1 is fine. This is my travel laptop for the last almost two
> >> years. Best laptop I've ever had (I like them small and light).
> >
> > If this two years old that's probably not HSW, can you send out the exact HW
> info, mostly pci device ids are in interest.
> 
> Oh, it's Haswell. It's a i5-4200u. And no, it's not quite two years
> old, but it's from fall-2013.

Okay, I have a similar machine; I will dig into it tomorrow morning.
> 
> And it's literally all Intel (series 8 mobile chipset) with a intel
> 7260 wireless. A whole lot of 8086 there:
> 
>   [torvalds@vaio ~]$ lspci -n
>   00:00.0 0600: 8086:0a04 (rev 09)
>   00:02.0 0300: 8086:0a16 (rev 09)
>   00:03.0 0403: 8086:0a0c (rev 09)
>   00:14.0 0c03: 8086:9c31 (rev 04)
>   00:16.0 0780: 8086:9c3a (rev 04)
>   00:1b.0 0403: 8086:9c20 (rev 04)
>   00:1c.0 0604: 8086:9c14 (rev e4)
>   00:1c.3 0604: 8086:9c16 (rev e4)
>   00:1d.0 0c03: 8086:9c26 (rev 04)
>   00:1f.0 0601: 8086:9c43 (rev 04)
>   00:1f.2 0106: 8086:9c03 (rev 04)
>   00:1f.3 0c05: 8086:9c22 (rev 04)
>   01:00.0 0280: 8086:08b1 (rev 6b)
> 
> > This should be loaded only if you have NFC on the board or it bailed out more
> cleanly before.
> > There is a file under debugfs mei0/meclients , if you can send out the dump we
> can see what
> > we support on that platform.
> 
>   [torvalds@vaio ~]$ sudo cat /sys/kernel/debug/mei0/meclients
>     |id|fix|         UUID                       |con|msg len|sb|refc|
>    0|42|  0|b638ab7e-94e2-4ea2-a552-d1c54b627f04|  1|   2048| 0|   2|
>    1|41|  0|fbf6fcf1-96cf-4e2e-a6a6-1bab8cbe36b1|  3|  15360| 0|   2|
>    2|40|  0|3c4852d6-d47b-4f46-b05e-b5edc1aa440e|  1|   4096| 0|   2|
>    3|39|  0|0bb17a78-2a8e-4c50-94d4-50266723775c|  1|    312| 0|   4|
The above is NFC.

> 
> > Frankly, if you are not using any of the pro or NFC you don't really need that
> loaded.
> 
> Yeah. Except I will not be releasing a kernel that I can tell is buggy
> on my own hardware, so it really doesn't matter.
> So that commit gets reverted unless somebody figures out what's wrong.

Oh no, there wasn't any intention to leave it unaddressed; I hope very much that someone is actually using it.

> 
> If *I* see problems from my very limited hardware testing, I guarantee
> that others will see them too. And others will not be able to
> necessarily bisect and test things, they'll just say "it hangs" and
> maybe never try Linux any more than that.
> 
> So bugs I can see myself end up being stuff that I revert pretty much
> immediately. I don't run odd hardware.

I hope we find the issue quickly now; the quick fix would be to enable NFC only on Ivy Bridge.

>
> > The question if the module get bound to the NFC me client?
> 
> .. and how would that show up?
> 
I think that, without installing the whole neard stack, it would just be a matter of watching udev events or enabling dynamic debug in the kernel for the mei_phy and pn544 modules.
What I know for sure is that NFC in its current state is just not functional on HSW, so it should bail out at some point; of course, it should not cause a regression either. It should be working on Ivy Bridge.

Tomas 


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]                     ` <5B8DA87D05A7694D9FA63FD143655C1B3D3E34A6-Jy8z56yoSI8MvF1YICWikbfspsVTdybXVpNB7YpNyf8@public.gmane.org>
@ 2015-07-06 22:17                       ` Greg Kroah-Hartman
       [not found]                         ` <20150706221736.GA3135-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Greg Kroah-Hartman @ 2015-07-06 22:17 UTC (permalink / raw)
  To: Winkler, Tomas; +Cc: Linus Torvalds, Samuel Ortiz, Linux API

On Mon, Jul 06, 2015 at 07:18:15PM +0000, Winkler, Tomas wrote:
> > On Mon, Jul 6, 2015 at 9:59 AM, Samuel Ortiz <sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
> > >
> > > Ok, if you can suspend after this modprobe we're looking at some
> > > potential mei bus or mei core issues.
> 
> > 
> > Just looking at the mei/bus.c code to match the driver, and it looks just odd.
> 
> The major part of the bus was rejected by Greg and I'm still waiting
> for him to like me back and review the resend.

That's 4.3-rc1 work, not for 4.2, so if this is breaking boxes, I'm with
Linus, it should just be reverted for now.

I can't duplicate this here with my laptop, it's a few years old as
well, but no problems, so this must be device specific somehow.

thanks

greg k-h

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]         ` <CA+55aFxfENpyWSDxgaQhB1mBHE6zs=w=Mc0VEQLgdwSaCMX5eQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-07-06 16:59           ` Samuel Ortiz
  2015-07-06 18:59           ` Winkler, Tomas
@ 2015-07-06 22:31           ` Samuel Ortiz
       [not found]             ` <20150706223133.GC22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
  2 siblings, 1 reply; 21+ messages in thread
From: Samuel Ortiz @ 2015-07-06 22:31 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 06, 2015 at 09:49:51AM -0700, Linus Torvalds wrote:
> On Mon, Jul 6, 2015 at 9:07 AM, Samuel Ortiz <sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
> >
> > I suppose you were not seeing this issue on e.g. a 4.1 kernel (assuming
> > this is not a brand new laptop) ?
> 
> Nope, 4.1 is fine. This is my travel laptop for the last almost two
> years. Best laptop I've ever had (I like them small and light).
> 
> > If that's the case, is any of the pn544* kernel modules loaded when
> > running on top of a good kernel ?
> 
> So it doesn't show up in lsmod, but presumably the module got loaded
> and then errored out or something. The mei and mei_me modules are
> loaded, but no pn544_mei.
Oh, by 'good kernel' I meant a 4.1 one for example, sorry for the
confusion.
Not one where you removed the UUID from pn544/mei.c, because that will
essentially make sure the driver is never probed.

What I'm trying to rule out is that you never saw that issue on previous
kernels because the pn544 driver was not loaded at all, but this:

> I'm not even sure why it's in my configuration, because I tend to try
> to keep those small, and my config was generated (long ago) with "make
> localmodconfig", but maybe there's something non-obvious that brought
> it in. Presumably that pn544 module was loaded at _some_ point in the
> past.
seems to prove it has already been automatically loaded (and thus probed
because the loading happened as the NFC device was found on the MEI
bus).

The pn544 mei support was added with 3.10 iirc, so starting from there
your pn544* modules have been most likely automatically loaded.

Cheers,
Samuel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]                 ` <CA+55aFxjuufbJpn+MngBzO3QyaYQm_ZNsgQY3VqPpguVpzOk6A-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  2015-07-06 21:12                   ` Winkler, Tomas
@ 2015-07-06 22:33                   ` Samuel Ortiz
  1 sibling, 0 replies; 21+ messages in thread
From: Samuel Ortiz @ 2015-07-06 22:33 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Winkler, Tomas, Greg Kroah-Hartman, Linux API

On Mon, Jul 06, 2015 at 12:40:10PM -0700, Linus Torvalds wrote:
> > The question if the module get bound to the NFC me client?
> 
> .. and how would that show up?
If the pn544 probed correctly you will see a /sys/class/nfc/nfc0 entry.

Cheers,
Samuel.
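
[Illustrative aside: a minimal sketch of the check described above, i.e. testing whether the /sys/class/nfc/nfc0 entry exists. The helper name demo_path_exists is hypothetical, and the sketch only tests for the presence of a path; it says nothing about whether the driver behind it is functional.]

```c
#include <sys/stat.h>

/* Return 1 if path exists (any file type), 0 otherwise. */
int demo_path_exists(const char *path)
{
	struct stat st;

	return stat(path, &st) == 0;
}
```

[A caller would pass "/sys/class/nfc/nfc0" and treat a return of 1 as "the pn544 probe registered an NFC device".]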

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]                         ` <20150706221736.GA3135-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
@ 2015-07-06 22:38                           ` Samuel Ortiz
  0 siblings, 0 replies; 21+ messages in thread
From: Samuel Ortiz @ 2015-07-06 22:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: Winkler, Tomas, Linus Torvalds, Linux API

On Mon, Jul 06, 2015 at 03:17:36PM -0700, Greg Kroah-Hartman wrote:
> > The major part of the bus was rejected by Greg and I'm still waiting
> > for him to like me back and review the resend.
> 
> That's 4.3-rc1 work, not for 4.2, so if this is breaking boxes, I'm with
> Linus, it should just be reverted for now.
> 
> I can't duplicate this here with my laptop, it's a few years old as
> well, but no problems, so this must be device specific somehow.
It's very much device specific, as only a few HSW-based SKUs have NFC
enabled through the ME.

Cheers,
Samuel.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]             ` <20150706223133.GC22015-nKCvNrh56OoJmsy6czSMtA@public.gmane.org>
@ 2015-07-06 23:06               ` Linus Torvalds
       [not found]                 ` <CA+55aFwsrywW4fEt33BZFPSu5F0eAFB5YAY-zHUZ8-BLv1VcDg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-06 23:06 UTC (permalink / raw)
  To: Samuel Ortiz; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 6, 2015 at 3:31 PM, Samuel Ortiz <sameo-VuQAYsv1563Yd54FQh9/CA@public.gmane.org> wrote:
>
> The pn544 mei support was added with 3.10 iirc, so starting from there
> your pn544* modules have been most likely automatically loaded.

I no longer had a 4.1 kernel (with all the bisecting I had pruned away
old kernel builds), but booting into a distro kernel shows that in
4.0.6, the pn544 modules do get loaded, and suspend/resume works fine.

.. and that also explains why I had it enabled in my .config. It would
have been picked up by "make localmodconfig".

> If the pn544 probed correctly you will see a /sys/class/nfc/nfc0 entry.

Yes, that shows up in 4.0.6. It does not show up in my modified
4.2-rc1 kernel that dropped the UUID.

I'll go and test the plain revert case.

               Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]                 ` <CA+55aFwsrywW4fEt33BZFPSu5F0eAFB5YAY-zHUZ8-BLv1VcDg-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-07  0:04                   ` Linus Torvalds
       [not found]                     ` <CA+55aFxo064xa5m3KPjSJxQCrDL3H1QwuEcpCKMg0TL8C0aedQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
  0 siblings, 1 reply; 21+ messages in thread
From: Linus Torvalds @ 2015-07-07  0:04 UTC (permalink / raw)
  To: Samuel Ortiz; +Cc: Tomas Winkler, Greg Kroah-Hartman, Linux API

On Mon, Jul 6, 2015 at 4:06 PM, Linus Torvalds
<torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
>
> I'll go and test the plain revert case.

So the "plain" revert ended up being more than a single revert, just
because there were multiple changes in that area. But after doing
this:

    git revert b144ce2d3761
    git revert 1d3ff76721fb
    git revert be9b720a0ccb
    git revert 007d64eb2232
    git revert dbac993f6a6d
    git revert d4b78c7290dd
    git revert c93b76b34b4d

I have a working kernel that loads the pn544 thing automatically, and
shows it in /sys/class/nfc/, and suspend works.

So I'm back to the pre-merge-window behavior.

I'd be happy to do a more minimal revert, but the above was the
trivial "revert without any conflicts" set.

              Linus

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: Odd hang on suspend and shutdown
       [not found]                     ` <CA+55aFxo064xa5m3KPjSJxQCrDL3H1QwuEcpCKMg0TL8C0aedQ-JsoAwUIsXosN+BqQ9rBEUg@public.gmane.org>
@ 2015-07-07  4:06                       ` Greg Kroah-Hartman
       [not found]                         ` <20150707040630.GB25900-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
  2015-07-07 16:32                         ` Winkler, Tomas
  0 siblings, 2 replies; 21+ messages in thread
From: Greg Kroah-Hartman @ 2015-07-07  4:06 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Samuel Ortiz, Tomas Winkler, Linux API

On Mon, Jul 06, 2015 at 05:04:18PM -0700, Linus Torvalds wrote:
> On Mon, Jul 6, 2015 at 4:06 PM, Linus Torvalds
> <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> >
> > I'll go and test the plain revert case.
> 
> So the "plain" revert ended up being more than a single revert, just
> because there were multiple changes in that area. But after doing
> this:
> 
>     git revert b144ce2d3761
>     git revert 1d3ff76721fb
>     git revert be9b720a0ccb
>     git revert 007d64eb2232
>     git revert dbac993f6a6d
>     git revert d4b78c7290dd
>     git revert c93b76b34b4d
> 
> I have a working kernel that loads the pn544 thing automatically, and
> shows it in /sys/class/nfc/, and suspend works.
> 
> So I'm back to the pre-merge-window behavior.

I have no objection to just reverting all of these to get back to a
working system.

thanks,

greg k-h


* RE: Odd hang on suspend and shutdown
       [not found]                         ` <20150707040630.GB25900-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
@ 2015-07-07 14:38                           ` Winkler, Tomas
  0 siblings, 0 replies; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-07 14:38 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Linus Torvalds; +Cc: Samuel Ortiz, Linux API



> Subject: Re: Odd hang on suspend and shutdown
> 
> On Mon, Jul 06, 2015 at 05:04:18PM -0700, Linus Torvalds wrote:
> > On Mon, Jul 6, 2015 at 4:06 PM, Linus Torvalds
> > <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> > >
> > > I'll go and test the plain revert case.
> >
> > So the "plain" revert ended up being more than a single revert, just
> > because there were multiple changes in that area. But after doing
> > this:
> >
> >     git revert b144ce2d3761
> >     git revert 1d3ff76721fb
> >     git revert be9b720a0ccb
> >     git revert 007d64eb2232
> >     git revert dbac993f6a6d
> >     git revert d4b78c7290dd
> >     git revert c93b76b34b4d
> >
> > I have a working kernel that loads the pn544 thing automatically, and
> > shows it in /sys/class/nfc/, and suspend works.
> >
> > So I'm back to the pre-merge-window behavior.
> 
> I have no objection to just reverting all of these to get back to a
> working system.
I was able to reproduce the issue.
The bug was introduced by commit be9b720a0ccba096d669bc86634f900b82b9bf71.
I hope to localize the actual root cause soon so we don't have to revert all the patches.

Thanks
Tomas


* RE: Odd hang on suspend and shutdown
  2015-07-07  4:06                       ` Greg Kroah-Hartman
       [not found]                         ` <20150707040630.GB25900-U8xfFu+wG4EAvxtiuMwx3w@public.gmane.org>
@ 2015-07-07 16:32                         ` Winkler, Tomas
  1 sibling, 0 replies; 21+ messages in thread
From: Winkler, Tomas @ 2015-07-07 16:32 UTC (permalink / raw)
  To: Greg Kroah-Hartman, Linus Torvalds; +Cc: Samuel Ortiz, Linux API



> -----Original Message-----
> From: Winkler, Tomas
> Sent: Tuesday, July 07, 2015 17:39
> To: 'Greg Kroah-Hartman'; Linus Torvalds
> Cc: Samuel Ortiz; Linux API
> Subject: RE: Odd hang on suspend and shutdown
> 
> 
> 
> > Subject: Re: Odd hang on suspend and shutdown
> >
> > On Mon, Jul 06, 2015 at 05:04:18PM -0700, Linus Torvalds wrote:
> > > On Mon, Jul 6, 2015 at 4:06 PM, Linus Torvalds
> > > <torvalds-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> wrote:
> > > >
> > > > I'll go and test the plain revert case.
> > >
> > > So the "plain" revert ended up being more than a single revert, just
> > > because there were multiple changes in that area. But after doing
> > > this:
> > >
> > >     git revert b144ce2d3761
> > >     git revert 1d3ff76721fb
> > >     git revert be9b720a0ccb
> > >     git revert 007d64eb2232
> > >     git revert dbac993f6a6d
> > >     git revert d4b78c7290dd
> > >     git revert c93b76b34b4d
> > >
> > > I have a working kernel that loads the pn544 thing automatically, and
> > > shows it in /sys/class/nfc/, and suspend works.
> > >
> > > So I'm back to the pre-merge-window behavior.
> >
> > I have no objection to just reverting all of these to get back to a
> > working system.
> I was able to reproduce the issue.
> The bug was introduced by commit be9b720a0ccba096d669bc86634f900b82b9bf71.
> I hope to localize the actual root cause soon so we don't have to revert all
> the patches.

There is a device lock taken twice when calling mei_cl_disable_device()...
Thanks
Tomas
 


