* 3.12: kernel panic when resuming from suspend to RAM (x86_64)
@ 2013-11-17  9:42 Francis Moreau
  2013-11-17 13:25 ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-17  9:42 UTC (permalink / raw)
  To: LKML

Hello,

I recently acquired a new laptop. After installing Arch Linux, which
ships a 3.12 kernel, I have had trouble every time I resume from
suspend to RAM.

The behaviour is as follows: the laptop resumes correctly and my session
seems to be restored, but after I type a command in the terminal I get a
black screen and the fan becomes noisy, which seems to indicate that the
CPUs are running at full load.

Today I got a different behaviour: after resuming I got a kernel panic.
I managed to take a picture of the laptop screen: http://imgur.com/f5uWFTY

Could anybody help me sort this out?

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17  9:42 3.12: kernel panic when resuming from suspend to RAM (x86_64) Francis Moreau
@ 2013-11-17 13:25 ` Borislav Petkov
  2013-11-17 15:50   ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-17 13:25 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML

On Sun, Nov 17, 2013 at 10:42:05AM +0100, Francis Moreau wrote:
> Today I got a different behaviour, after resuming I got a
> kernel panic. I could take a picture of the laptop screen:
> http://imgur.com/f5uWFTY

Does archlinux ship the upstream kernel or do they have patches on top?
If the latter, try reproducing this panic with the upstream
kernel 3.12.

In general, how reliably can you reproduce the kernel panic with the
upstream 3.12 kernel?

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 13:25 ` Borislav Petkov
@ 2013-11-17 15:50   ` Francis Moreau
  2013-11-17 16:01     ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-17 15:50 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML

On 17/11/2013 14:25, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 10:42:05AM +0100, Francis Moreau wrote:
>> Today I got a different behaviour, after resuming I got a
>> kernel panic. I could take a picture of the laptop screen:
>> http://imgur.com/f5uWFTY
> 
> Does archlinux ship the upstream kernel or do they have patches on top?
> If "yes" to the last one, try reproducing this panic with the upstream
> kernel 3.12.

AFAIK, the kernel has 2 simple patches on top of the vanilla one.
They're both trivial and can't be related to this issue.

You can have a look at them here:
https://wiki.archlinux.org/index.php/Kernels#Official_packages

> 
> In general, how reliably can you reproduce the kernel panic with the
> upstream 3.12 kernel?
> 

Assuming that I'm running an upstream kernel, it's almost 100%
reproducible.

Thanks.


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 15:50   ` Francis Moreau
@ 2013-11-17 16:01     ` Borislav Petkov
  2013-11-17 18:02       ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-17 16:01 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML

On Sun, Nov 17, 2013 at 04:50:23PM +0100, Francis Moreau wrote:
> AFAIK, the kernel has 2 simple patches on top of the vanilla one.
> They're both trivial and can't be related to this issue.
> 
> You can have a look at them here:
> https://wiki.archlinux.org/index.php/Kernels#Official_packages

Ok.

> Assuming that I'm running an upstream kernel, it's almost 100%
> reproducible.

Is there any chance you can catch the whole oops, esp. keep the Code:
line complete?

Also, can you do:

$ objdump -d vmlinux | less

then search for 'call_timer_fn' and paste the whole function somewhere.
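
If paging through the whole disassembly is unwieldy, the search step can be scripted. What follows is only a minimal Python sketch, assuming the output of `objdump -d vmlinux` has been saved as text. `extract_function` is a hypothetical helper name, and `some_other_fn` / `next_fn` in the sample are made-up neighbours. It relies on objdump's usual layout, where each function starts with an `<address> <name>:` line and ends at a blank line.

```python
import re


def extract_function(disasm: str, name: str) -> str:
    """Return the objdump listing of one function, or '' if it is absent."""
    # Function headers look like: "ffffffff8106f590 <call_timer_fn>:"
    header = re.compile(r"^[0-9a-f]+ <" + re.escape(name) + r">:$")
    collected = []
    inside = False
    for line in disasm.splitlines():
        if not inside:
            if header.match(line.strip()):
                inside = True
                collected.append(line)
        elif line.strip() == "":
            break  # a blank line ends the function's listing
        else:
            collected.append(line)
    return "\n".join(collected)


# Tiny stand-in for real objdump output (neighbouring symbols are invented).
SAMPLE = """\
ffffffff8106f580 <some_other_fn>:
ffffffff8106f580:\tc3\tretq

ffffffff8106f590 <call_timer_fn>:
ffffffff8106f590:\te8 2b b2 48 00\tcallq  ffffffff814fa7c0 <__fentry__>
ffffffff8106f595:\t55\tpush   %rbp

ffffffff8106f600 <next_fn>:
"""

print(extract_function(SAMPLE, "call_timer_fn"))
```

An empty result means the symbol is not in the listing at all, which is what one would expect from a stripped image.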

Also, can you catch a full dmesg and upload that somewhere too?

Thanks.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 16:01     ` Borislav Petkov
@ 2013-11-17 18:02       ` Francis Moreau
  2013-11-17 19:53         ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-17 18:02 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML

On Sun, Nov 17, 2013 at 5:01 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Nov 17, 2013 at 04:50:23PM +0100, Francis Moreau wrote:
>> AFAIK, the kernel has 2 simple patches on top of the vanilla one.
>> They're both trivial and can't be related to this issue.
>>
>> You can have a look at them here:
>> https://wiki.archlinux.org/index.php/Kernels#Official_packages
>
> Ok.
>
>> Assuming that I'm running an upstream kernel, it's almost 100%
>> reproducible.
>
> Is there any chance you can catch the whole oops, esp. keep the Code:
> line complete?

Sorry, I didn't take the original picture large enough, and getting
this kernel panic is pretty hard since the kernel usually just displays
the black screen.

>
> Also, can you do:
>
> $ objdump -d vmlinux | less
>
> then search for 'call_timer_fn' and paste the whole function somewhere.

I can't find any trace of this function in the dump...

>
> Also, can you catch a full dmesg and upload that somewhere too?

http://paste.debian.net/66294/

Thanks.
-- 
Francis


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 18:02       ` Francis Moreau
@ 2013-11-17 19:53         ` Borislav Petkov
  2013-11-17 20:49           ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-17 19:53 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML

On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
> Sorry I haven't taken the original picture large enough, and getting
> this kernel panic is pretty hard since the kernel usually displays the
> black screen.

Ok, just try to make a readable picture of the whole line, next time you
trigger it.

> I can't find any traces of this function in the dump...

Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
official archlinux kernel? If so, where can I get it from?

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 19:53         ` Borislav Petkov
@ 2013-11-17 20:49           ` Francis Moreau
  2013-11-17 22:06             ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-17 20:49 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML

On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>> Sorry I haven't taken the original picture large enough, and getting
>> this kernel panic is pretty hard since the kernel usually displays the
>> black screen.
>
> Ok, just try to make a readable picture of the whole line, next time you
> trigger it.
>
>> I can't find any traces of this function in the dump...
>
> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
> official archlinux kernel? If so, where can I get it from?

Yes, you can download the binary package from:
https://www.archlinux.org/packages/core/x86_64/linux/

The binary package is a tar archive, so it is pretty straightforward to
unpack the vmlinux file (the actual filename is vmlinuz-linux).

Thanks for your help.
-- 
Francis


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 20:49           ` Francis Moreau
@ 2013-11-17 22:06             ` Borislav Petkov
  2013-11-17 22:34               ` Rafael J. Wysocki
                                 ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: Borislav Petkov @ 2013-11-17 22:06 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
> On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> > On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
> >> Sorry I haven't taken the original picture large enough, and getting
> >> this kernel panic is pretty hard since the kernel usually displays the
> >> black screen.
> >
> > Ok, just try to make a readable picture of the whole line, next time you
> > trigger it.
> >
> >> I can't find any traces of this function in the dump...
> >
> > Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
> > official archlinux kernel? If so, where can I get it from?
> 
> Yes, you can download the bin package from :
> https://www.archlinux.org/packages/core/x86_64/linux/
> 
> The bin package is a tar archive, so it is pretty straightforward to
> unpack the vmlinux file (the actual filename is vmlinuz-linux).

Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
which is this:

ffffffff8106f590 <call_timer_fn>:
ffffffff8106f590:       e8 2b b2 48 00          callq  ffffffff814fa7c0 <__fentry__>
ffffffff8106f595:       55                      push   %rbp
ffffffff8106f596:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
ffffffff8106f59d:       00 00 
ffffffff8106f59f:       48 89 e5                mov    %rsp,%rbp
ffffffff8106f5a2:       41 57                   push   %r15
ffffffff8106f5a4:       49 89 d7                mov    %rdx,%r15
ffffffff8106f5a7:       41 56                   push   %r14
ffffffff8106f5a9:       49 89 f6                mov    %rsi,%r14
ffffffff8106f5ac:       41 55                   push   %r13
ffffffff8106f5ae:       41 54                   push   %r12
ffffffff8106f5b0:       49 89 fc                mov    %rdi,%r12
ffffffff8106f5b3:       53                      push   %rbx
ffffffff8106f5b4:       44 8b a8 44 e0 ff ff    mov    -0x1fbc(%rax),%r13d
ffffffff8106f5bb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
ffffffff8106f5c0:       4c 89 ff                mov    %r15,%rdi
ffffffff8106f5c3:       41 ff d6                callq  *%r14			<--- faulting insn
ffffffff8106f5c6:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
ffffffff8106f5cb:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
ffffffff8106f5d2:       00 00 
ffffffff8106f5d4:       44 39 a8 44 e0 ff ff    cmp    %r13d,-0x1fbc(%rax)

and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
as in the photo. Thus, the CALL instruction tries to call the timer
function 'fn' which we pass as an argument to call_timer_fn.
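
For anyone who wants to double-check the arithmetic, here is a small Python sketch (not something taken from the oops itself). It adds the 0x33 offset to the start address shown in the listing and decodes the three instruction bytes `41 ff d6` by hand. `decode_indirect_call` is a hypothetical helper that only handles this one encoding form (REX prefix with the B bit set, opcode FF /2, register operand).

```python
FUNC_START = 0xFFFFFFFF8106F590  # address of call_timer_fn in the listing
OFFSET = 0x33                    # from "call_timer_fn+0x33"

rip = FUNC_START + OFFSET
print(hex(rip))  # 0xffffffff8106f5c3


def decode_indirect_call(insn: bytes) -> str:
    """Decode 'REX.B FF /2' with a register operand; reject anything else."""
    if len(insn) != 3:
        raise ValueError("expected exactly 3 bytes")
    rex, opcode, modrm = insn
    if rex & 0xF0 != 0x40 or not (rex & 0x01):
        raise ValueError("expected a REX prefix with the B bit set")
    if opcode != 0xFF:
        raise ValueError("expected opcode 0xFF")
    mod = modrm >> 6          # 3 means the operand is a register
    reg = (modrm >> 3) & 0x7  # opcode extension; 2 selects near indirect CALL
    rm = modrm & 0x7          # low three bits of the register number
    if mod != 3 or reg != 2:
        raise ValueError("not a register-indirect near call")
    # REX.B supplies the fourth bit of the register number: 6 + 8 = 14.
    return f"callq *%r{rm + 8}"


print(decode_indirect_call(bytes.fromhex("41 ff d6")))  # callq *%r14
```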

However, the address we're trying to call in %r14 is garbage:
0x455300323d504544 and not in canonical form, causing the #GP.
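
To make "canonical form" concrete: a rough Python sketch of the check, assuming 48-bit virtual addresses (the usual x86_64 setup with 4-level paging). `is_canonical` and its `va_bits` parameter are illustrative names, not kernel API. An address is canonical when bits 63..47 are all copies of bit 47.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63..(va_bits-1) of a 64-bit address are all 0 or all 1."""
    top = addr >> (va_bits - 1)                 # bit 47 and everything above
    all_ones = (1 << (64 - va_bits + 1)) - 1    # 17 one-bits for va_bits=48
    return top == 0 or top == all_ones


print(is_canonical(0xFFFFFFFF8106F5C3))  # True: a normal kernel text address
print(is_canonical(0x455300323D504544))  # False: the value found in %r14
```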

So basically what happens is suspend to RAM corrupts something
containing one or more timer functions and we end up calling crap after
resume.

If you want to debug this further, you could try playing through
Documentation/power/basic-pm-debugging.txt and see whether suspend to
disk works. There's also a section 2 which talks about testing suspend
to RAM which could be of help.
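
As a rough illustration of what that document describes: a Python sketch that selects a test level in /sys/power/pm_test and then requests a suspend. It assumes a kernel built with CONFIG_PM_DEBUG, root privileges, and the level names as I remember them from that file; `run_pm_test` and its `sysfs_power` parameter are hypothetical names used only here. With a test level set, the kernel is expected to perform only part of the suspend sequence, wait a few seconds and resume.

```python
from pathlib import Path

# Test levels as listed in basic-pm-debugging.txt, shallowest first.
PM_TEST_LEVELS = ["freezer", "devices", "platform", "processors", "core"]


def run_pm_test(level: str,
                sysfs_power: Path = Path("/sys/power"),
                state: str = "mem") -> None:
    """Select a pm_test level, then request the given sleep state."""
    if level not in PM_TEST_LEVELS:
        raise ValueError(f"unknown pm_test level: {level!r}")
    # Equivalent to: echo <level> > /sys/power/pm_test
    (sysfs_power / "pm_test").write_text(level)
    # Equivalent to: echo mem > /sys/power/state
    (sysfs_power / "state").write_text(state)


# Example (needs root; walk the levels from shallow to deep):
# for level in PM_TEST_LEVELS:
#     run_pm_test(level)
```

Going through the levels one by one should show which stage of the suspend path first misbehaves.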

But let me add Rafael and Thomas - they should have much better ideas
than me.

Guys, thread starts here:
http://marc.info/?l=linux-kernel&m=138468134321335

HTH.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:06             ` Borislav Petkov
@ 2013-11-17 22:34               ` Rafael J. Wysocki
  2013-11-17 22:46                 ` Borislav Petkov
  2013-11-18 12:20                 ` Francis Moreau
  2013-11-18  0:33               ` Kevin Easton
  2013-11-18 12:19               ` Francis Moreau
  2 siblings, 2 replies; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-17 22:34 UTC (permalink / raw)
  To: Borislav Petkov, Francis Moreau; +Cc: LKML, Thomas Gleixner, Linux PM list

On Sunday, November 17, 2013 11:06:12 PM Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
> > On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
> > > On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
> > >> Sorry I haven't taken the original picture large enough, and getting
> > >> this kernel panic is pretty hard since the kernel usually displays the
> > >> black screen.
> > >
> > > Ok, just try to make a readable picture of the whole line, next time you
> > > trigger it.
> > >
> > >> I can't find any traces of this function in the dump...
> > >
> > > Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
> > > official archlinux kernel? If so, where can I get it from?
> > 
> > Yes, you can download the bin package from :
> > https://www.archlinux.org/packages/core/x86_64/linux/
> > 
> > The bin package is a tar archive, so it is pretty straightforward to
> > unpack the vmlinux file (the actual filename is vmlinuz-linux).
> 
> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
> which is this:
> 
> ffffffff8106f590 <call_timer_fn>:
> ffffffff8106f590:       e8 2b b2 48 00          callq  ffffffff814fa7c0 <__fentry__>
> ffffffff8106f595:       55                      push   %rbp
> ffffffff8106f596:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
> ffffffff8106f59d:       00 00 
> ffffffff8106f59f:       48 89 e5                mov    %rsp,%rbp
> ffffffff8106f5a2:       41 57                   push   %r15
> ffffffff8106f5a4:       49 89 d7                mov    %rdx,%r15
> ffffffff8106f5a7:       41 56                   push   %r14
> ffffffff8106f5a9:       49 89 f6                mov    %rsi,%r14
> ffffffff8106f5ac:       41 55                   push   %r13
> ffffffff8106f5ae:       41 54                   push   %r12
> ffffffff8106f5b0:       49 89 fc                mov    %rdi,%r12
> ffffffff8106f5b3:       53                      push   %rbx
> ffffffff8106f5b4:       44 8b a8 44 e0 ff ff    mov    -0x1fbc(%rax),%r13d
> ffffffff8106f5bb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> ffffffff8106f5c0:       4c 89 ff                mov    %r15,%rdi
> ffffffff8106f5c3:       41 ff d6                callq  *%r14			<--- faulting insn
> ffffffff8106f5c6:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> ffffffff8106f5cb:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
> ffffffff8106f5d2:       00 00 
> ffffffff8106f5d4:       44 39 a8 44 e0 ff ff    cmp    %r13d,-0x1fbc(%rax)
> 
> and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
> as in the photo. Thus, the CALL instruction tries to call the timer
> function 'fn' which we pass as an argument to call_timer_fn.
> 
> However, the address we're trying to call in %r14 is garbage:
> 0x455300323d504544 and not in canonical form, causing the #GP.
> 
> So basically what happens is suspend to RAM corrupts something
> containing one or more timer functions and we end up calling crap after
> resume.
> 
> If you want to debug this further, you could try playing through
> Documentation/power/basic-pm-debugging.txt and see whether suspend to
> disk works. There's also a section 2 which talks about testing suspend
> to RAM which could be of help.
> 
> But let me add Rafael and Thomas - they should have much better ideas
> than me.
> 
> Guys, thread starts here:
> http://marc.info/?l=linux-kernel&m=138468134321335

This looks like a softirq bug to me (and related to cpuidle).

I'm wondering if that happens with any of the older kernels or just 3.12?

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:34               ` Rafael J. Wysocki
@ 2013-11-17 22:46                 ` Borislav Petkov
  2013-11-18 12:21                   ` Francis Moreau
  2013-11-18 12:20                 ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-17 22:46 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Francis Moreau, LKML, Thomas Gleixner, Linux PM list

On Sun, Nov 17, 2013 at 11:34:20PM +0100, Rafael J. Wysocki wrote:
> This looks like a softirq bug to me (and related to cpuidle).

Reportedly, it happens right after resume from RAM. Francis, is that
correct?

> I'm wondering if that happens with any of the older kernels or just
> 3.12?

That could be helpful, yeah.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:06             ` Borislav Petkov
  2013-11-17 22:34               ` Rafael J. Wysocki
@ 2013-11-18  0:33               ` Kevin Easton
  2013-11-18  1:04                 ` Borislav Petkov
  2013-11-18 12:19               ` Francis Moreau
  2 siblings, 1 reply; 63+ messages in thread
From: Kevin Easton @ 2013-11-18  0:33 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Francis Moreau, LKML, Rafael J. Wysocki, Thomas Gleixner

On Sun, Nov 17, 2013 at 11:06:12PM +0100, Borislav Petkov wrote:
> and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
> as in the photo. Thus, the CALL instruction tries to call the timer
> function 'fn' which we pass as an argument to call_timer_fn.
> 
> However, the address we're trying to call in %r14 is garbage:
> 0x455300323d504544 and not in canonical form, causing the #GP.

That's part of an ASCII string, "DEP=2\0SE", so if that looks familiar
to anyone (part of a kernel command line?) it might give a clue to
where the timer callback pointer is being overwritten.
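
The decoding is easy to reproduce. A minimal Python sketch: x86 is little-endian, so the lowest byte of the register value is the first character in memory.

```python
value = 0x455300323D504544  # the bogus function pointer from %r14

# Lay the 64-bit value out the way it sits in memory on little-endian x86.
raw = value.to_bytes(8, byteorder="little")

print(raw)  # b'DEP=2\x00SE'
```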

> So basically what happens is suspend to RAM corrupts something
> containing one or more timer functions and we end up calling crap after
> resume.

    - Kevin


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-18  0:33               ` Kevin Easton
@ 2013-11-18  1:04                 ` Borislav Petkov
  2013-11-18  2:43                   ` Kevin Easton
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-18  1:04 UTC (permalink / raw)
  To: Kevin Easton; +Cc: Francis Moreau, LKML, Rafael J. Wysocki, Thomas Gleixner

On Mon, Nov 18, 2013 at 11:33:25AM +1100, Kevin Easton wrote:
> That's part of an ASCII string, "DEP=2\0SE", so if that looks familiar
> to anyone (part of a kernel command line?) it might give a clue to
> where the timer callback pointer is being overwritten.

Yeah, it could be anything, maybe even part of dmesg although nothing
matches "DEP" in there.

And command line is:

[    0.000000] Command line: BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=b588f212-e32e-4914-9841-5246e0bd7632 rw quiet

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-18  1:04                 ` Borislav Petkov
@ 2013-11-18  2:43                   ` Kevin Easton
  0 siblings, 0 replies; 63+ messages in thread
From: Kevin Easton @ 2013-11-18  2:43 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Francis Moreau, LKML, Rafael J. Wysocki, Thomas Gleixner

On Mon, Nov 18, 2013 at 02:04:28AM +0100, Borislav Petkov wrote:
> On Mon, Nov 18, 2013 at 11:33:25AM +1100, Kevin Easton wrote:
> > That's part of an ASCII string, "DEP=2\0SE", so if that looks familiar
> > to anyone (part of a kernel command line?) it might give a clue to
> > where the timer callback pointer is being overwritten.
> 
> Yeah, it could be anything, maybe even part of dmesg although nothing
> matches "DEP" in there.

We actually have the following long as well, from the .data member.
Pasting them together gives "DEP=2\0SEQNUM=223" which looks like udev.
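
A sketch of the pasting step in Python, with one caveat: the numeric value of the second word is not quoted anywhere in this thread, so `SECOND` below is a stand-in built backwards from the string above rather than read from the oops. Splitting on the NUL byte then gives two KEY=VALUE entries.

```python
import struct

FIRST = 0x455300323D504544                       # the .function value (%r14)
SECOND = int.from_bytes(b"QNUM=223", "little")   # assumed .data value (stand-in)

# Two adjacent little-endian 64-bit words, as they would sit in memory.
blob = struct.pack("<QQ", FIRST, SECOND)
print(blob)  # b'DEP=2\x00SEQNUM=223'

# NUL-separated KEY=VALUE strings.
entries = [entry.decode("ascii") for entry in blob.split(b"\0")]
print(entries)  # ['DEP=2', 'SEQNUM=223']
```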

    - Kevin 


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:06             ` Borislav Petkov
  2013-11-17 22:34               ` Rafael J. Wysocki
  2013-11-18  0:33               ` Kevin Easton
@ 2013-11-18 12:19               ` Francis Moreau
  2013-11-18 13:32                 ` Borislav Petkov
  2 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-18 12:19 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

Hello Borislav,

On 17/11/2013 23:06, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
>> On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
>>> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>>>> Sorry I haven't taken the original picture large enough, and getting
>>>> this kernel panic is pretty hard since the kernel usually displays the
>>>> black screen.
>>>
>>> Ok, just try to make a readable picture of the whole line, next time you
>>> trigger it.
>>>
>>>> I can't find any traces of this function in the dump...
>>>
>>> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
>>> official archlinux kernel? If so, where can I get it from?
>>
>> Yes, you can download the bin package from :
>> https://www.archlinux.org/packages/core/x86_64/linux/
>>
>> The bin package is a tar archive, so it is pretty straightforward to
>> unpack the vmlinux file (the actual filename is vmlinuz-linux).
> 
> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
> which is this:
> 
> ffffffff8106f590 <call_timer_fn>:
> ffffffff8106f590:       e8 2b b2 48 00          callq  ffffffff814fa7c0 <__fentry__>
> ffffffff8106f595:       55                      push   %rbp
> ffffffff8106f596:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
> ffffffff8106f59d:       00 00 
> ffffffff8106f59f:       48 89 e5                mov    %rsp,%rbp
> ffffffff8106f5a2:       41 57                   push   %r15
> ffffffff8106f5a4:       49 89 d7                mov    %rdx,%r15
> ffffffff8106f5a7:       41 56                   push   %r14
> ffffffff8106f5a9:       49 89 f6                mov    %rsi,%r14
> ffffffff8106f5ac:       41 55                   push   %r13
> ffffffff8106f5ae:       41 54                   push   %r12
> ffffffff8106f5b0:       49 89 fc                mov    %rdi,%r12
> ffffffff8106f5b3:       53                      push   %rbx
> ffffffff8106f5b4:       44 8b a8 44 e0 ff ff    mov    -0x1fbc(%rax),%r13d
> ffffffff8106f5bb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> ffffffff8106f5c0:       4c 89 ff                mov    %r15,%rdi
> ffffffff8106f5c3:       41 ff d6                callq  *%r14			<--- faulting insn
> ffffffff8106f5c6:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
> ffffffff8106f5cb:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
> ffffffff8106f5d2:       00 00 
> ffffffff8106f5d4:       44 39 a8 44 e0 ff ff    cmp    %r13d,-0x1fbc(%rax)
> 
> and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
> as in the photo. Thus, the CALL instruction tries to call the timer
> function 'fn' which we pass as an argument to call_timer_fn.
> 
> However, the address we're trying to call in %r14 is garbage:
> 0x455300323d504544 and not in canonical form, causing the #GP.
> 

Thanks for digging this out !

Just out of curiosity, running "objdump -D" doesn't seem to show the
same thing here. How did you get a dump with function names, for example?

> So basically what happens is suspend to RAM corrupts something
> containing one or more timer functions and we end up calling crap after
> resume.
> 
> If you want to debug this further, you could try playing through
> Documentation/power/basic-pm-debugging.txt and see whether suspend to
> disk works. There's also a section 2 which talks about testing suspend
> to RAM which could be of help.

The thing is that I'd like to avoid oopsing my kernel so that I don't
corrupt my filesystem.

Thanks


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:34               ` Rafael J. Wysocki
  2013-11-17 22:46                 ` Borislav Petkov
@ 2013-11-18 12:20                 ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-18 12:20 UTC (permalink / raw)
  To: Rafael J. Wysocki, Borislav Petkov; +Cc: LKML, Thomas Gleixner, Linux PM list

On 17/11/2013 23:34, Rafael J. Wysocki wrote:
> On Sunday, November 17, 2013 11:06:12 PM Borislav Petkov wrote:
>> On Sun, Nov 17, 2013 at 09:49:40PM +0100, Francis Moreau wrote:
>>> On Sun, Nov 17, 2013 at 8:53 PM, Borislav Petkov <bp@alien8.de> wrote:
>>>> On Sun, Nov 17, 2013 at 07:02:21PM +0100, Francis Moreau wrote:
>>>>> Sorry I haven't taken the original picture large enough, and getting
>>>>> this kernel panic is pretty hard since the kernel usually displays the
>>>>> black screen.
>>>>
>>>> Ok, just try to make a readable picture of the whole line, next time you
>>>> trigger it.
>>>>
>>>>> I can't find any traces of this function in the dump...
>>>>
>>>> Hmm, strange. Can you upload the whole vmlinux somewhere? Or is this the
>>>> official archlinux kernel? If so, where can I get it from?
>>>
>>> Yes, you can download the bin package from :
>>> https://www.archlinux.org/packages/core/x86_64/linux/
>>>
>>> The bin package is a tar archive, so it is pretty straightforward to
>>> unpack the vmlinux file (the actual filename is vmlinuz-linux).
>>
>> Ok, here's what I was able to see: rIP points to call_timer_fn+0x33
>> which is this:
>>
>> ffffffff8106f590 <call_timer_fn>:
>> ffffffff8106f590:       e8 2b b2 48 00          callq  ffffffff814fa7c0 <__fentry__>
>> ffffffff8106f595:       55                      push   %rbp
>> ffffffff8106f596:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
>> ffffffff8106f59d:       00 00 
>> ffffffff8106f59f:       48 89 e5                mov    %rsp,%rbp
>> ffffffff8106f5a2:       41 57                   push   %r15
>> ffffffff8106f5a4:       49 89 d7                mov    %rdx,%r15
>> ffffffff8106f5a7:       41 56                   push   %r14
>> ffffffff8106f5a9:       49 89 f6                mov    %rsi,%r14
>> ffffffff8106f5ac:       41 55                   push   %r13
>> ffffffff8106f5ae:       41 54                   push   %r12
>> ffffffff8106f5b0:       49 89 fc                mov    %rdi,%r12
>> ffffffff8106f5b3:       53                      push   %rbx
>> ffffffff8106f5b4:       44 8b a8 44 e0 ff ff    mov    -0x1fbc(%rax),%r13d
>> ffffffff8106f5bb:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>> ffffffff8106f5c0:       4c 89 ff                mov    %r15,%rdi
>> ffffffff8106f5c3:       41 ff d6                callq  *%r14			<--- faulting insn
>> ffffffff8106f5c6:       0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)
>> ffffffff8106f5cb:       65 48 8b 04 25 70 c7    mov    %gs:0xc770,%rax
>> ffffffff8106f5d2:       00 00 
>> ffffffff8106f5d4:       44 39 a8 44 e0 ff ff    cmp    %r13d,-0x1fbc(%rax)
>>
>> and the virtual address in rIP is ffffffff8106f5c3, i.e. the same one
>> as in the photo. Thus, the CALL instruction tries to call the timer
>> function 'fn' which we pass as an argument to call_timer_fn.
>>
>> However, the address we're trying to call in %r14 is garbage:
>> 0x455300323d504544 and not in canonical form, causing the #GP.
>>
>> So basically what happens is suspend to RAM corrupts something
>> containing one or more timer functions and we end up calling crap after
>> resume.
>>
>> If you want to debug this further, you could try playing through
>> Documentation/power/basic-pm-debugging.txt and see whether suspend to
>> disk works. There's also a section 2 which talks about testing suspend
>> to RAM which could be of help.
>>
>> But let me add Rafael and Thomas - they should have much better ideas
>> than me.
>>
>> Guys, thread starts here:
>> http://marc.info/?l=linux-kernel&m=138468134321335
> 
> This looks like a softirq bug to me (and related to cpuidle).
> 
> I'm wondering if that happens with any of the older kernels or just 3.12?
> 

I can try to find an older kernel package tonight and see whether it
happens there too.



* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-17 22:46                 ` Borislav Petkov
@ 2013-11-18 12:21                   ` Francis Moreau
  0 siblings, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-18 12:21 UTC (permalink / raw)
  To: Borislav Petkov, Rafael J. Wysocki; +Cc: LKML, Thomas Gleixner, Linux PM list

On 17/11/2013 23:46, Borislav Petkov wrote:
> On Sun, Nov 17, 2013 at 11:34:20PM +0100, Rafael J. Wysocki wrote:
>> This looks like a softirq bug to me (and related to cpuidle).
> 
> Reportedly, it happens right after resume from RAM. Francis, is that
> correct?

Yes, that's correct. I haven't been hit by this issue otherwise.



* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-18 12:19               ` Francis Moreau
@ 2013-11-18 13:32                 ` Borislav Petkov
  2013-11-19 10:01                   ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-18 13:32 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On Mon, Nov 18, 2013 at 01:19:28PM +0100, Francis Moreau wrote:
> Just out of curiosity, running "objdump -D" doesn't seem to show the
> same thing here. How did you get such dump with function names for
> example ?

There's another, non-stripped vmlinux in the kernel package:

$ objdump -D usr/src/linux-3.12.0-1-ARCH/vmlinux | less

> The thing is that I'd like to avoid oopsing my kernel so that I don't
> corrupt my filesystem.

Then debugging this thing would be very hard, if not impossible but it
is your decision at the end of the day...

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-18 13:32                 ` Borislav Petkov
@ 2013-11-19 10:01                   ` Francis Moreau
  2013-11-19 10:15                     ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-19 10:01 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On 11/18/2013 02:32 PM, Borislav Petkov wrote:
> On Mon, Nov 18, 2013 at 01:19:28PM +0100, Francis Moreau wrote:
>> Just out of curiosity, running "objdump -D" doesn't seem to show the
>> same thing here. How did you get such dump with function names for
>> example ?
> 
> There's another, non-stripped vmlinux in the kernel package:
> 
> $ objdump -D usr/src/linux-3.12.0-1-ARCH/vmlinux | less
> 

Oh I see, thanks.

>> The thing is that I'd like to avoid oopsing my kernel so that I don't
>> corrupt my filesystem.
> 
> Then debugging this thing would be very hard, if not impossible but it
> is your decision at the end of the day...
> 

This issue is really annoying so I'll try to debug it.

I think the easiest way to do it is to install a minimal system on a USB
stick and try to reproduce the issue there first, in order to preserve my
system.

Then I'll check whether a previous kernel version is free of this issue
and, if so, I'll do a git-bisect session.

I can't think of a quicker way to do that, although using git-bisect
(which implies several kernel builds) is a PITA.

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-19 10:01                   ` Francis Moreau
@ 2013-11-19 10:15                     ` Borislav Petkov
  2013-11-20  9:45                       ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-19 10:15 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On Tue, Nov 19, 2013 at 11:01:14AM +0100, Francis Moreau wrote:
> I think the easiest way to do it is to install a minimal system on a
> USB stick and try to reproduce first in order to preserve my system.

Yep, sounds simple enough.

> Then I'll try to see if this issue exists in a previous kernel version
> and if so, I'll do a git-bisect session.
>
> I can't find a quicker way to do that although using git-bisect (which
> implies several kernel builds) is a PITA.

You can start with a coarse bisect by testing the major kernel versions
first, i.e. 3.11, 3.10, 3.9 ... and once you find good and bad, then you
can do the git-bisect thing.
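
The coarse pass can be scripted roughly as follows. This is only a sketch: coarse_bisect and the test command passed to it are hypothetical, and the test is assumed to build and boot the given release, run a suspend/resume cycle, and exit 0 if the kernel survived.

```shell
# coarse_bisect TEST TAG...: TAGs are listed newest first. TEST is a
# hypothetical command that exits 0 when the given release resumes cleanly.
# Prints the first good release and the oldest bad release seen before it.
coarse_bisect() {
    test_cmd=$1; shift
    bad=
    for tag in "$@"; do
        if "$test_cmd" "$tag"; then
            echo "good=$tag bad=$bad"
            return 0
        fi
        bad=$tag            # still failing: remember it and go further back
    done
    echo "no good release found; oldest bad=$bad" >&2
    return 1
}
```

With the resulting pair, the fine-grained search would then start with "git bisect start <bad> <good>".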

Also, you can check for BIOS updates for your machine and if there are,
check their changelogs whether they fix something suspend-related.

Good luck!

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-19 10:15                     ` Borislav Petkov
@ 2013-11-20  9:45                       ` Francis Moreau
  2013-11-20 11:15                         ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-20  9:45 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

Hello Borislav,

On 11/19/2013 11:15 AM, Borislav Petkov wrote:
> On Tue, Nov 19, 2013 at 11:01:14AM +0100, Francis Moreau wrote:
>> I think the easiest way to do it is to install a minimal system on a
>> USB stick and try to reproduce first in order to preserve my system.
> 
> Yep, sounds simple enough.
> 
>> Then I'll try to see if this issue exists in a previous kernel version
>> and if so, I'll do a git-bisect session.
>>
>> I can't find a quicker way to do that although using git-bisect (which
>> implies several kernel builds) is a PITA.
> 
> You can start with a coarse bisect by testing the major kernel versions
> first, i.e. 3.11, 3.10, 3.9 ... and once you find good and bad, then you
> can do the git-bisect thing.
> 

Unfortunately the bisect session didn't give any conclusive results: I
couldn't be sure whether a specific revision was good or bad because the
bug wasn't reproducible every time.
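
One common way to cope with an intermittent failure during bisection is to call a revision good only after several clean runs. A minimal sketch, assuming a hypothetical test command that exits non-zero when the resume goes wrong; survives_n_times is an illustrative name:

```shell
# survives_n_times N CMD [ARG...]: run CMD up to N times. Returns 1 ("bad")
# on the first failing run and 0 ("good") only if all N runs pass.
survives_n_times() {
    n=$1; shift
    i=0
    while [ "$i" -lt "$n" ]; do
        "$@" || return 1
        i=$((i + 1))
    done
    return 0
}
```

The exit codes match what "git bisect run" expects (0 means good, 1 to 127 except 125 means bad), so a wrapper script built around this could drive the bisection. The number of repetitions is a judgment call that depends on how often the bug shows up.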

But I got a different kernel oops on my stripped-down system that may
give us a clue: http://imgur.com/zdCknbY

Does this help ?

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-20  9:45                       ` Francis Moreau
@ 2013-11-20 11:15                         ` Borislav Petkov
  2013-11-21  8:22                           ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-20 11:15 UTC (permalink / raw)
  To: Francis Moreau; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On Wed, Nov 20, 2013 at 10:45:05AM +0100, Francis Moreau wrote:
> Unfortunately the bisect session didn't give any positive results: I
> couldn't be sure if a specific revision was good or bad because the
> bug wasn't reproductible every time.
>
> But I got a different kernel oops on my stripped system that may give
> us a clue: http://imgur.com/zdCknbY
>
> Does this help ?

Unfortunately, this is the second oops:

"Oops: 0000 [#2] ..."

The first has scrolled off but I can see the RIP: ioread32+0x40 and the
code must be:

ffffffff812a1e40 <ioread32>:
ffffffff812a1e40:       48 81 ff ff ff 03 00    cmp    $0x3ffff,%rdi
ffffffff812a1e47:       77 37                   ja     ffffffff812a1e80 <ioread32+0x40>

...

ffffffff812a1e77:       66 0f 1f 84 00 00 00    nopw   0x0(%rax,%rax,1)
ffffffff812a1e7e:       00 00 
ffffffff812a1e80:       8b 07                   mov    (%rdi),%eax			<--- faulting insn
ffffffff812a1e82:       c3                      retq   
ffffffff812a1e83:       66 66 66 66 2e 0f 1f    data32 data32 data32 nopw %cs:0x0(%rax,%rax,1)
ffffffff812a1e8a:       84 00 00 00 00 00

and judging by the instruction, that's addr in %rdi which we try to read
and I'd guess %rdi contains garbage after resume.

IOW, this looks like another corruption that happens when you suspend to
ram.

I asked you already but you didn't say:

"Also, you can check for BIOS updates for your machine and if there are,
check their changelogs whether they fix something suspend-related."

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-20 11:15                         ` Borislav Petkov
@ 2013-11-21  8:22                           ` Francis Moreau
  2013-11-21 10:12                             ` Borislav Petkov
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-21  8:22 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: LKML, Rafael J. Wysocki, Thomas Gleixner

On 11/20/2013 12:15 PM, Borislav Petkov wrote:
> On Wed, Nov 20, 2013 at 10:45:05AM +0100, Francis Moreau wrote:
>> Unfortunately the bisect session didn't give any positive results: I
>> couldn't be sure if a specific revision was good or bad because the
>> bug wasn't reproductible every time.
>>
>> But I got a different kernel oops on my stripped system that may give
>> us a clue: http://imgur.com/zdCknbY
>>
>> Does this help ?
> 
> Unfortunately, this is the second oops:
> 
[...]
> 
> and judging by the instruction, that's addr in %rdi which we try to read
> and I'd guess %rdi contains garbage after resume.
> 
> IOW, this looks like another corruption that happens when you suspend to
> ram.

Hmm, I think it's more than that, because if I remove both the
rtsx_pci_ms and memstick modules, then suspending and resuming doesn't
oops anymore.
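
For anyone repeating this experiment, a small check of which suspect modules are actually loaded before suspending can help. This is a sketch under assumptions: loaded_suspects is a hypothetical helper, and the file it reads is expected to be in /proc/modules format (module name in the first column).

```shell
# loaded_suspects MODULES_FILE NAME...: print each NAME that is listed as a
# loaded module in MODULES_FILE (normally /proc/modules).
loaded_suspects() {
    file=$1; shift
    for mod in "$@"; do
        awk -v m="$mod" '$1 == m { found = 1 } END { exit !found }' "$file" \
            && echo "$mod"
    done
    return 0
}
```

Running "loaded_suspects /proc/modules rtsx_pci_ms memstick" before each suspend shows whether the test really ran without them. Unloading would be done as root with rmmod, presumably rtsx_pci_ms first since it appears to depend on memstick.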

Also, I took a look at the changes between v3.11 and v3.12 in this area,
and some of them look related to the issue I'm facing:

$ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
09fd867 mfd: rtsx: Copyright modifications
eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual extra_init_hw
5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
773ccdf mfd: rtsx: Read vendor setting from config space

I'll try to bisect those changes once more tonight to see whether one of
those commits is the culprit.

> 
> I asked you already but you didn't say:
> 
> "Also, you can check for BIOS updates for your machine and if there are,
> check their changelogs whether they fix something suspend-related."
> 

Sorry for not answering the first time, but yes, my BIOS is up to date.

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-21  8:22                           ` Francis Moreau
@ 2013-11-21 10:12                             ` Borislav Petkov
  2013-11-21 11:17                               ` Jingoo Han
  0 siblings, 1 reply; 63+ messages in thread
From: Borislav Petkov @ 2013-11-21 10:12 UTC (permalink / raw)
  To: Francis Moreau
  Cc: LKML, Rafael J. Wysocki, Thomas Gleixner, Samuel Ortiz, Wei WANG,
	Chris Ball, Jingoo Han

On Thu, Nov 21, 2013 at 09:22:02AM +0100, Francis Moreau wrote:
> Hmm, I think it's more than that because if I'm removing both
> rtsx_pci_ms and memstick modules, then suspending and resuming doesn't
> oops anymore.

Interesting - so this is a good datapoint. Sounds like those modules
can't stomach suspend-to-ram. So let's add a couple more people for
comment.

To the newly-CCed, guys, thread starts here:

http://marc.info/?l=linux-kernel&m=138468134321335

> Also I took a look at the changes between v3.11 and v3.12 in this area
> and those changes match the issue I'm facing:
> 
> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
> 09fd867 mfd: rtsx: Copyright modifications
> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
> extra_init_hw
> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
> 773ccdf mfd: rtsx: Read vendor setting from config space
> 
> I'll try to bisect one more time those changes tonight to see if I can
> find out if one of those commits is the culprit.

Good idea.

> Sorry for not answering the first time, but yes my bios is uptodate.

Ok, good.

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-21 10:12                             ` Borislav Petkov
@ 2013-11-21 11:17                               ` Jingoo Han
  2013-11-21 13:07                                 ` Francis Moreau
  2013-11-22  7:43                                 ` Francis Moreau
  0 siblings, 2 replies; 63+ messages in thread
From: Jingoo Han @ 2013-11-21 11:17 UTC (permalink / raw)
  To: 'Borislav Petkov', 'Francis Moreau', 'Wei WANG'
  Cc: 'LKML', 'Rafael J. Wysocki',
	'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball', 'Jingoo Han'

On Thursday, November 21, 2013 7:13 PM, Borislav Petkov wrote:
> On Thu, Nov 21, 2013 at 09:22:02AM +0100, Francis Moreau wrote:
> > Hmm, I think it's more than that because if I'm removing both
> > rtsx_pci_ms and memstick modules, then suspending and resuming doesn't
> > oops anymore.
> 
> Interesting - so this is a good datapoint. Sounds like those modules
> can't stomach suspend-to-ram. So let's add a couple more people for
> comment.

Thank you for inviting me. :-)
It is very interesting.

> 
> To the newly-CCed, guys, thread starts here:
> 
> http://marc.info/?l=linux-kernel&m=138468134321335
> 
> > Also I took a look at the changes between v3.11 and v3.12 in this area
> > and those changes match the issue I'm facing:
> >
> > $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
> > 09fd867 mfd: rtsx: Copyright modifications
> > eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
> > 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
> > extra_init_hw
> > 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
> > 773ccdf mfd: rtsx: Read vendor setting from config space

In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in the Realtek PCIe
card reader driver may be causing the kernel panic.

I think that the commit "mfd: rtsx: Configure to enter a deeper
power-saving mode in S3" may be the culprit.

Francis Moreau,
Could you let us know the exact model of the rts52xx PCIe card reader
in your laptop? According to the commit, Wei WANG added the S3 mode
to the rts5227 and rts5249 drivers.

Wei Wang,
Could you confirm that the S3 mode works properly on rts52xx PCIe
card reader devices?

Best regards,
Jingoo Han

> >
> > I'll try to bisect one more time those changes tonight to see if I can
> > find out if one of those commits is the culprit.
> 
> Good idea.
> 
> > Sorry for not answering the first time, but yes my bios is uptodate.
> 
> Ok, good.
> 
> --
> Regards/Gruss,
>     Boris.
> 
> Sent from a fat crate under my desk. Formatting is fine.
> --


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-21 11:17                               ` Jingoo Han
@ 2013-11-21 13:07                                 ` Francis Moreau
  2013-11-22  7:43                                 ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-21 13:07 UTC (permalink / raw)
  To: Jingoo Han, 'Borislav Petkov', 'Wei WANG'
  Cc: 'LKML', 'Rafael J. Wysocki',
	'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball'

Hi,

On 11/21/2013 12:17 PM, Jingoo Han wrote:
[...]
> 
> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
> reader driver may make the kernel panic.
> 
> I think that the commit "mfd: rtsx: Configure to enter a deeper
> power-saving mode in S3" may be the culprit.

I'll do the check tonight.

> 
> Francis Moreau,
> Let us know the exact model name of rts52xx PCIe card reader
> in your laptop? According to the commit, Wei WANG added S3 mode
> to rts5227 driver ,and rts5249 driver.

03:00.0 Unassigned class [ff00]: Realtek Semiconductor Co., Ltd. Device [10ec:5289] (rev 01)
	Subsystem: CLEVO/KAPOK Computer Device [1558:0540]
	Flags: bus master, fast devsel, latency 0, IRQ 40
	Memory at f7200000 (32-bit, non-prefetchable) [size=64K]
	Capabilities: [40] Power Management version 3
	Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [70] Express Endpoint, MSI 00
	Capabilities: [b0] MSI-X: Enable- Count=1 Masked-
	Capabilities: [d0] Vital Product Data
	Capabilities: [100] Advanced Error Reporting
	Capabilities: [140] Virtual Channel
	Capabilities: [160] Device Serial Number 00-00-00-00-00-00-00-00
	Kernel driver in use: rtsx_pci
	Kernel modules: rtsx_pci

Thanks,


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-21 11:17                               ` Jingoo Han
  2013-11-21 13:07                                 ` Francis Moreau
@ 2013-11-22  7:43                                 ` Francis Moreau
  2013-11-22  9:57                                   ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-22  7:43 UTC (permalink / raw)
  To: Jingoo Han, 'Borislav Petkov', 'Wei WANG'
  Cc: 'LKML', 'Rafael J. Wysocki',
	'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball'

On 21/11/2013 12:17, Jingoo Han wrote:
[...]
>>
>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>> and those changes match the issue I'm facing:
>>>
>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>> 09fd867 mfd: rtsx: Copyright modifications
>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>> extra_init_hw
>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>> 773ccdf mfd: rtsx: Read vendor setting from config space
> 
> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
> reader driver may make the kernel panic.
> 
> I think that the commit "mfd: rtsx: Configure to enter a deeper
> power-saving mode in S3" may be the culprit.

Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
also reverted 7140812 and 5947c16, but that didn't improve anything.

The good news is that I managed to put together a "light" kernel
configuration which is faster to build and, more importantly, the bug
now seems to be almost 100% reproducible.

So I'll try to do another git-bisect session later.

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22  7:43                                 ` Francis Moreau
@ 2013-11-22  9:57                                   ` Francis Moreau
  2013-11-22 12:54                                     ` Rafael J. Wysocki
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-22  9:57 UTC (permalink / raw)
  To: rafael.j.wysocki
  Cc: Jingoo Han, 'Borislav Petkov', 'Wei WANG',
	'LKML', 'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball'

On 22/11/2013 08:43, Francis Moreau wrote:
> On 21/11/2013 12:17, Jingoo Han wrote:
> [...]
>>>
>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>> and those changes match the issue I'm facing:
>>>>
>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>> extra_init_hw
>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>
>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>> reader driver may make the kernel panic.
>>
>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>> power-saving mode in S3" may be the culprit.
> 
> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
> also reverted 7140812, 5947c16 but it didn't improve anything.
> 
> The good news is that I managed to have a "light" kernel configuration
> which is faster to build and more important it seems that the bug is
> almost 100% reproductible now.
> 
> So I'll try to do another git-bisect session later.

So after bisecting the v3.11..v3.12 range, git bisect told me:

the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0

Merge branch 'acpi-processor'

    * acpi-processor:
      ACPI / processor: Acquire writer lock to update CPU maps
      ACPI / processor: Remove acpi_processor_get_limit_info()

The two commits brought in by the merge are not the culprits, because
after resetting HEAD to "ACPI / processor: Acquire writer lock to update
CPU maps" the issue no longer occurs.

At this point I'm not sure how to bisect further.
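
When git bisect stops on a merge, one generic way forward is to test each parent of the merge and then bisect along whichever side is bad. A sketch with hypothetical helper names, assuming the merge base of the two parents is known to be good:

```shell
# merge_parents MERGE: print both parents of a merge commit, one per line,
# first parent (the branch merged into) first, then the merged branch.
merge_parents() {
    git rev-parse "$1^1" "$1^2"
}

# bisect_parent MERGE N: start a bisection between the merge base of the two
# parents (assumed good) and parent number N (1 or 2, found to be bad).
bisect_parent() {
    base=$(git merge-base "$1^1" "$1^2") || return 1
    git bisect start "$1^$2" "$base"
}
```

If both parents turn out to be good while the merge itself is bad, the problem comes from the combination of the two sides and this approach does not narrow it down any further.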

Hope that helps.

Thanks

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22  9:57                                   ` Francis Moreau
@ 2013-11-22 12:54                                     ` Rafael J. Wysocki
  2013-11-22 21:36                                       ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-22 12:54 UTC (permalink / raw)
  To: Francis Moreau
  Cc: rafael.j.wysocki, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Thomas Gleixner',
	'Samuel Ortiz', 'Chris Ball'

On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
> On 22/11/2013 08:43, Francis Moreau wrote:
> > On 21/11/2013 12:17, Jingoo Han wrote:
> > [...]
> >>>
> >>>> Also I took a look at the changes between v3.11 and v3.12 in this area
> >>>> and those changes match the issue I'm facing:
> >>>>
> >>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
> >>>> 09fd867 mfd: rtsx: Copyright modifications
> >>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
> >>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
> >>>> extra_init_hw
> >>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
> >>>> 773ccdf mfd: rtsx: Read vendor setting from config space
> >>
> >> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
> >> reader driver may make the kernel panic.
> >>
> >> I think that the commit "mfd: rtsx: Configure to enter a deeper
> >> power-saving mode in S3" may be the culprit.
> > 
> > Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
> > also reverted 7140812, 5947c16 but it didn't improve anything.
> > 
> > The good news is that I managed to have a "light" kernel configuration
> > which is faster to build and more important it seems that the bug is
> > almost 100% reproductible now.
> > 
> > So I'll try to do another git-bisect session later.
> 
> So after bisecting between v3.11..v3.12 range, git bisect told me:
> 
> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
> 
> Merge branch 'acpi-processor'
> 
>     * acpi-processor:
>       ACPI / processor: Acquire writer lock to update CPU maps
>       ACPI / processor: Remove acpi_processor_get_limit_info()
> 
> The two commits brought by the merge are not the culprits because
> reseting HEAD on "ACPI / processor: Acquire writer lock to update CPU
> maps" doesn't have the issue anymore.
> 
> At that point I'm not sure how to bisect futher.

Does the second parent of this merge (that is, 8462d9df9d50) have the problem?

Rafael


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22 12:54                                     ` Rafael J. Wysocki
@ 2013-11-22 21:36                                       ` Francis Moreau
  2013-11-22 22:08                                         ` Rafael J. Wysocki
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-22 21:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: rafael.j.wysocki, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Thomas Gleixner',
	'Samuel Ortiz', 'Chris Ball'

On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>> On 22/11/2013 08:43, Francis Moreau wrote:
>>> On 21/11/2013 12:17, Jingoo Han wrote:
>>> [...]
>>>>>
>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>> and those changes match the issue I'm facing:
>>>>>>
>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>>>> extra_init_hw
>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>
>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>>>> reader driver may make the kernel panic.
>>>>
>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>> power-saving mode in S3" may be the culprit.
>>>
>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>> also reverted 7140812, 5947c16 but it didn't improve anything.
>>>
>>> The good news is that I managed to have a "light" kernel configuration
>>> which is faster to build and more important it seems that the bug is
>>> almost 100% reproductible now.
>>>
>>> So I'll try to do another git-bisect session later.
>>
>> So after bisecting between v3.11..v3.12 range, git bisect told me:
>>
>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>
>> Merge branch 'acpi-processor'
>>
>>     * acpi-processor:
>>       ACPI / processor: Acquire writer lock to update CPU maps
>>       ACPI / processor: Remove acpi_processor_get_limit_info()
>>
>> The two commits brought by the merge are not the culprits because
>> reseting HEAD on "ACPI / processor: Acquire writer lock to update CPU
>> maps" doesn't have the issue anymore.
>>
>> At that point I'm not sure how to bisect futher.
> 
> Does the second parent of this merge (that is, 8462d9df9d50) have the problem?
> 

Yes it does.

OK, I've finally managed to find the bad commit:
ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
over system PM transitions

I verified that the parent commit doesn't have the problem.

Rafael, you're the man now ;)

Thanks



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22 21:36                                       ` Francis Moreau
@ 2013-11-22 22:08                                         ` Rafael J. Wysocki
  2013-11-22 22:27                                           ` Thomas Gleixner
  2013-11-24  9:42                                           ` Francis Moreau
  0 siblings, 2 replies; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-22 22:08 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Jingoo Han, 'Borislav Petkov', 'Wei WANG',
	'LKML', 'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball'

On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
> On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
> > On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
> >> On 22/11/2013 08:43, Francis Moreau wrote:
> >>> On 21/11/2013 12:17, Jingoo Han wrote:
> >>> [...]
> >>>>>
> >>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
> >>>>>> and those changes match the issue I'm facing:
> >>>>>>
> >>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
> >>>>>> 09fd867 mfd: rtsx: Copyright modifications
> >>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
> >>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
> >>>>>> extra_init_hw
> >>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
> >>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
> >>>>
> >>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
> >>>> reader driver may make the kernel panic.
> >>>>
> >>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
> >>>> power-saving mode in S3" may be the culprit.
> >>>
> >>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
> >>> also reverted 7140812, 5947c16 but it didn't improve anything.
> >>>
> >>> The good news is that I managed to have a "light" kernel configuration
> >>> which is faster to build and more important it seems that the bug is
> >>> almost 100% reproductible now.
> >>>
> >>> So I'll try to do another git-bisect session later.
> >>
> >> So after bisecting between v3.11..v3.12 range, git bisect told me:
> >>
> >> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
> >>
> >> Merge branch 'acpi-processor'
> >>
> >>     * acpi-processor:
> >>       ACPI / processor: Acquire writer lock to update CPU maps
> >>       ACPI / processor: Remove acpi_processor_get_limit_info()
> >>
> >> The two commits brought by the merge are not the culprits because
> >> reseting HEAD on "ACPI / processor: Acquire writer lock to update CPU
> >> maps" doesn't have the issue anymore.
> >>
> >> At that point I'm not sure how to bisect futher.
> > 
> > Does the second parent of this merge (that is, 8462d9df9d50) have the problem?
> > 
> 
> Yes it does.
> 
> Ok, I've finally managed to find out the bad commit:
> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
> over system PM transitions
> 
> I verified that the parent commit doesn't have the problem.

Interesting.

> Rafael, you're the man now ;)

I kind of don't see how that commit may result in behavior that you
described earlier in the thread.

You get a memory corruption that seems to have started to happen because
we're holding an additional lock over suspend/resume now.  Something's fishy
on that machine and we need to figure out what it is.

Please file a bug at bugzilla.kernel.org against ACPI and assign it to me.
Please put all of the relevant info in there and attach the output of dmesg
after a fresh boot and the output of acpidump from the affected machine to
the bug entry.

Thanks,
Rafael


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22 22:08                                         ` Rafael J. Wysocki
@ 2013-11-22 22:27                                           ` Thomas Gleixner
  2013-11-24  9:39                                             ` Francis Moreau
  2013-11-24  9:42                                           ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: Thomas Gleixner @ 2013-11-22 22:27 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Francis Moreau, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Samuel Ortiz',
	'Chris Ball'

On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
> > Ok, I've finally managed to find out the bad commit:
> > ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
> > over system PM transitions
> > 
> > I verified that the parent commit doesn't have the problem.
> 
> Interesting.
> 
> > Rafael, you're the man now ;)
> 
> I kind of don't see how that commit may result in behavior that you
> described earlier in the thread.
> 
> You get a memory corruption that seems to have started to happen because
> we're holding an additional lock over suspend resume now.  Something's fishy
> on that machine and we need to figure out what it is.

The hiccup happens in the timer softirq.

@Francis: Did you try enabling DEBUG_OBJECTS.*? If not, please give it
	  a try.
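
DEBUG_OBJECTS.* refers to the kernel's debugobjects Kconfig options. As a rough aid, the sketch below reports which of them a given .config does not enable. The helper name is hypothetical, and the list of options is an assumption about which ones matter for a timer or work-item lifetime bug; it is not exhaustive.

```shell
# missing_debug_objects CONFIG_FILE: print every debugobjects option from
# the list below that is not set to "y" in the given kernel .config.
missing_debug_objects() {
    for opt in CONFIG_DEBUG_OBJECTS CONFIG_DEBUG_OBJECTS_FREE \
               CONFIG_DEBUG_OBJECTS_TIMERS CONFIG_DEBUG_OBJECTS_WORK; do
        grep -q "^$opt=y\$" "$1" || echo "$opt"
    done
    return 0
}
```

Running "missing_debug_objects .config" in the build tree before compiling shows what still has to be switched on; an empty result means all four are enabled.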

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22 22:27                                           ` Thomas Gleixner
@ 2013-11-24  9:39                                             ` Francis Moreau
  2013-11-24 13:31                                               ` Borislav Petkov
  2013-11-24 21:06                                               ` Rafael J. Wysocki
  0 siblings, 2 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-24  9:39 UTC (permalink / raw)
  To: Thomas Gleixner, Rafael J. Wysocki
  Cc: Jingoo Han, 'Borislav Petkov', 'Wei WANG',
	'LKML', 'Samuel Ortiz', 'Chris Ball'

[-- Attachment #1: Type: text/plain, Size: 4541 bytes --]

Hello Thomas

On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>> Ok, I've finally managed to find out the bad commit:
>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>> over system PM transitions
>>>
>>> I verified that the parent commit doesn't have the problem.
>>
>> Interesting.
>>
>>> Rafael, you're the man now ;)
>>
>> I kind of don't see how that commit may result in behavior that you
>> described earlier in the thread.
>>
>> You get a memory corruption that seems to have started to happen because
>> we're holding an additional lock over suspend resume now.  Something's fishy
>> on that machine and we need to figure out what it is.
> 
> The hickup happens in the timer softirq.
> 
> @Francis: Did you try to enable DEBUG_OBJECTS.*. If not please give it
> 	  a try.

This looks like it was a good idea.

The kernel now outputs the following traces after resuming.

[   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260 debug_print_object+0x83/0xa0()
[   26.973932] ODEBUG: free active (active state 0) object type: timer_list hint: delayed_work_timer_fn+0x0/0x20
[   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel battery thermal wmi evdev mei_me video mei button mperf processor serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
[   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted 3.11.0-rc2-ARCH #64
[   26.974014] Hardware name: CLEVO CO.                        W55xEU                        /W55xEU                          , BIOS 4.6.5 03/05/2013
[   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
[   26.974020]  0000000000000009 ffff880407d0da18 ffffffff81459fe9 ffff880407d0da60
[   26.974023]  ffff880407d0da50 ffffffff8104dc7d ffff880407fad488 ffffffff81836fc0
[   26.974025]  ffffffff81701358 ffffffff81afef70 0000000000000003 ffff880407d0dab0
[   26.974027] Call Trace:
[   26.974031]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
[   26.974043]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
[   26.974044]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
[   26.974047]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
[   26.974050]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
[   26.974053]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
[   26.974059]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[   26.974062]  [<ffffffff81262619>] debug_check_no_obj_freed+0x19/0x20
[   26.974065]  [<ffffffff8116f861>] kfree+0x191/0x210
[   26.974069]  [<ffffffff813819e0>] ? pcibios_disable_device+0x20/0x30
[   26.974072]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[   26.974075]  [<ffffffffa008e959>] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[   26.974079]  [<ffffffff8128004b>] pci_device_remove+0x3b/0xb0
[   26.974092]  [<ffffffff8132c92f>] __device_release_driver+0x7f/0xf0
[   26.974094]  [<ffffffff8132c9c3>] device_release_driver+0x23/0x30
[   26.974096]  [<ffffffff8132c194>] bus_remove_device+0xf4/0x170
[   26.974098]  [<ffffffff81328c55>] device_del+0x135/0x1d0
[   26.974108]  [<ffffffff8127ae24>] pci_stop_bus_device+0x94/0xa0
[   26.974110]  [<ffffffff8127af32>] pci_stop_and_remove_bus_device+0x12/0x20
[   26.974113]  [<ffffffff81297466>] disable_slot+0x76/0xd0
[   26.974115]  [<ffffffff81297568>] acpiphp_check_bridge+0xa8/0xd0
[   26.974118]  [<ffffffff81297c8a>] hotplug_event+0xfa/0x210
[   26.974120]  [<ffffffff81297dc7>] hotplug_event_work+0x27/0x60
[   26.974123]  [<ffffffff8106c178>] process_one_work+0x178/0x470
[   26.974125]  [<ffffffff8106cb91>] worker_thread+0x121/0x3a0
[   26.974127]  [<ffffffff8106ca70>] ? manage_workers.isra.21+0x2b0/0x2b0
[   26.974130]  [<ffffffff81073a50>] kthread+0xc0/0xd0
[   26.974132]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[   26.974135]  [<ffffffff814688ec>] ret_from_fork+0x7c/0xb0
[   26.974137]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[   26.974139] ---[ end trace 0895c2e7925b5485 ]---

Also, the kernel doesn't panic anymore.

I'm also attaching the dmesg output captured with CONFIG_DEBUG_KOBJECT and
CONFIG_DEBUG_OBJECT* enabled.
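
For anyone reproducing this, here is a sketch of a .config fragment that
turns on the checks behind the trace above. The option names come from
lib/Kconfig.debug; whether exactly this set was used for the attached
dmesg is an assumption.

```
# debugobjects core (lifetime tracking of kernel objects)
CONFIG_DEBUG_OBJECTS=y
# check for objects that are still active when their memory is freed
CONFIG_DEBUG_OBJECTS_FREE=y
# track timer_list objects (the "object type: timer_list" in the warning)
CONFIG_DEBUG_OBJECTS_TIMERS=y
# track work_struct objects
CONFIG_DEBUG_OBJECTS_WORK=y
# enable the checks at boot without needing a command line switch
CONFIG_DEBUG_OBJECTS_ENABLE_DEFAULT=1
# verbose kobject lifetime messages
CONFIG_DEBUG_KOBJECT=y
```

With DEBUG_OBJECTS_FREE set, kfree() goes through
debug_check_no_obj_freed(), which is the path visible in the call trace.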

Thanks.

[-- Attachment #2: dmesg-with-debug-objects.txt.gz --]
[-- Type: application/gzip, Size: 62316 bytes --]

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-22 22:08                                         ` Rafael J. Wysocki
  2013-11-22 22:27                                           ` Thomas Gleixner
@ 2013-11-24  9:42                                           ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-24  9:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Jingoo Han, 'Borislav Petkov', 'Wei WANG',
	'LKML', 'Thomas Gleixner', 'Samuel Ortiz',
	'Chris Ball'

Hello Rafael,

On 11/22/2013 11:08 PM, Rafael J. Wysocki wrote:
> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>> On 11/22/2013 01:54 PM, Rafael J. Wysocki wrote:
>>> On Friday, November 22, 2013 10:57:25 AM Francis Moreau wrote:
>>>> Le 22/11/2013 08:43, Francis Moreau a écrit :
>>>>> Le 21/11/2013 12:17, Jingoo Han a écrit :
>>>>> [...]
>>>>>>>
>>>>>>>> Also I took a look at the changes between v3.11 and v3.12 in this area
>>>>>>>> and those changes match the issue I'm facing:
>>>>>>>>
>>>>>>>> $ git log --oneline v3.11..v3.12 -- drivers/mfd/rtsx_pcr.c
>>>>>>>> 09fd867 mfd: rtsx: Copyright modifications
>>>>>>>> eb891c6 mfd: rtsx: Configure to enter a deeper power-saving mode in S3
>>>>>>>> 7140812 mfd: rtsx: Move some actions from rtsx_pci_init_hw to individual
>>>>>>>> extra_init_hw
>>>>>>>> 5947c16 mfd: rtsx: Add shutdown callback in rtsx_pci_driver
>>>>>>>> 773ccdf mfd: rtsx: Read vendor setting from config space
>>>>>>
>>>>>> In my opinion, rtsx_pci_resume()/rtsx_pci_suspend() in realtek PCIe card
>>>>>> reader driver may make the kernel panic.
>>>>>>
>>>>>> I think that the commit "mfd: rtsx: Configure to enter a deeper
>>>>>> power-saving mode in S3" may be the culprit.
>>>>>
>>>>> Unfortunately no, reverting this commit on top of v3.12 doesn't help. I
>>>>> also reverted 7140812, 5947c16 but it didn't improve anything.
>>>>>
>>>>> The good news is that I managed to have a "light" kernel configuration
>>>>> which is faster to build and, more importantly, the bug seems to be
>>>>> almost 100% reproducible now.
>>>>>
>>>>> So I'll try to do another git-bisect session later.
>>>>
>>>> So after bisecting the v3.11..v3.12 range, git bisect told me:
>>>>
>>>> the first bad commit is 551f5c74e17ba9257cdc35bf657ee448cad2d5b0
>>>>
>>>> Merge branch 'acpi-processor'
>>>>
>>>>     * acpi-processor:
>>>>       ACPI / processor: Acquire writer lock to update CPU maps
>>>>       ACPI / processor: Remove acpi_processor_get_limit_info()
>>>>
>>>> The two commits brought by the merge are not the culprits because
>>>> resetting HEAD to "ACPI / processor: Acquire writer lock to update CPU
>>>> maps" no longer shows the issue.
>>>>
>>>> At that point I'm not sure how to bisect further.
>>>
>>> Does the second parent of this merge (that is, 8462d9df9d50) have the problem?
>>>
>>
>> Yes it does.
>>
>> Ok, I've finally managed to find out the bad commit:
>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>> over system PM transitions
>>
>> I verified that the parent commit doesn't have the problem.
> 
> Interesting.
> 
>> Rafael, you're the man now ;)
> 
> I kind of don't see how that commit may result in behavior that you
> described earlier in the thread.
> 
> You get a memory corruption that seems to have started to happen because
> we're holding an additional lock over suspend resume now.  Something's fishy
> on that machine and we need to figure out what it is.
> 
> Please file a bug at bugzilla.kernel.org against ACPI and assign it to me.
> Please put all of the relevant info in there and attach the output of dmesg
> after a fresh boot and the output of acpidump from the affected machine to
> the bug entry.
> 

I just sent a new report with DEBUG_OBJECTS enabled, which produced
some interesting traces.

If nothing can be found in them, I'll file the bug report.

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-24  9:39                                             ` Francis Moreau
@ 2013-11-24 13:31                                               ` Borislav Petkov
  2013-11-24 21:06                                               ` Rafael J. Wysocki
  1 sibling, 0 replies; 63+ messages in thread
From: Borislav Petkov @ 2013-11-24 13:31 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Thomas Gleixner, Rafael J. Wysocki, Jingoo Han,
	'Wei WANG', 'LKML', 'Samuel Ortiz',
	'Chris Ball'

On Sun, Nov 24, 2013 at 10:39:20AM +0100, Francis Moreau wrote:
> This looks like it was a good idea.
> 
> The kernel now outputs the following traces after resuming.
> 
> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
> debug_print_object+0x83/0xa0()
> [   26.973932] ODEBUG: free active (active state 0) object type:
> timer_list hint: delayed_work_timer_fn+0x0/0x20

Just a stab in the dark, does the below fix it?

--
diff --git a/drivers/mfd/rtsx_pcr.c b/drivers/mfd/rtsx_pcr.c
index 11e20afbdcac..e65a12dd6e20 100644
--- a/drivers/mfd/rtsx_pcr.c
+++ b/drivers/mfd/rtsx_pcr.c
@@ -1228,8 +1228,8 @@ static void rtsx_pci_remove(struct pci_dev *pcidev)
 
 	pcr->remove_pci = true;
 
-	cancel_delayed_work(&pcr->carddet_work);
-	cancel_delayed_work(&pcr->idle_work);
+	cancel_delayed_work_sync(&pcr->carddet_work);
+	cancel_delayed_work_sync(&pcr->idle_work);
 
 	mfd_remove_devices(&pcidev->dev);
 

-- 
Regards/Gruss,
    Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-24  9:39                                             ` Francis Moreau
  2013-11-24 13:31                                               ` Borislav Petkov
@ 2013-11-24 21:06                                               ` Rafael J. Wysocki
  2013-11-25  7:42                                                 ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-24 21:06 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Samuel Ortiz',
	'Chris Ball'

On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
> Hello Thomas
> 
> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
> > On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
> >> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
> >>> Ok, I've finally managed to find out the bad commit:
> >>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
> >>> over system PM transitions
> >>>
> >>> I verified that the parent commit doesn't have the problem.
> >>
> >> Interesting.
> >>
> >>> Rafael, you're the man now ;)
> >>
> >> I kind of don't see how that commit may result in behavior that you
> >> described earlier in the thread.
> >>
> >> You get a memory corruption that seems to have started to happen because
> >> we're holding an additional lock over suspend resume now.  Something's fishy
> >> on that machine and we need to figure out what it is.
> > 
> > The hiccup happens in the timer softirq.
> > 
> > @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
> > 	  a try.
> 
> This looks like it was a good idea.
> 
> The kernel now outputs the following traces after resuming.
> 
> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
> debug_print_object+0x83/0xa0()
> [   26.973932] ODEBUG: free active (active state 0) object type:
> timer_list hint: delayed_work_timer_fn+0x0/0x20
> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
> battery thermal wmi evdev mei_me video mei button mperf processor
> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
> 3.11.0-rc2-ARCH #64
> [   26.974014] Hardware name: CLEVO CO.                        W55xEU
>                        /W55xEU                          , BIOS 4.6.5
> 03/05/2013
> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
> [   26.974020]  0000000000000009 ffff880407d0da18 ffffffff81459fe9
> ffff880407d0da60
> [   26.974023]  ffff880407d0da50 ffffffff8104dc7d ffff880407fad488
> ffffffff81836fc0
> [   26.974025]  ffffffff81701358 ffffffff81afef70 0000000000000003
> ffff880407d0dab0
> [   26.974027] Call Trace:
> [   26.974031]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
> [   26.974043]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
> [   26.974044]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
> [   26.974047]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
> [   26.974050]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
> [   26.974053]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
> [   26.974059]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0
> [rtsx_pci]

So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
you've bisected it is removed as well, but that happens during resume, so
rtsx_pci_resume() is likely not called in that case.

I bet that there's a bug either in rtsx_pci_remove() or in rtsx_pci_resume().
The latter definitely should check if the device is actually still present
before scheduling the delayed work, but then Boris' patch should take care
of that anyway.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-24 21:06                                               ` Rafael J. Wysocki
@ 2013-11-25  7:42                                                 ` Francis Moreau
  2013-11-25 10:47                                                   ` Rafael J. Wysocki
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-25  7:42 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Samuel Ortiz',
	'Chris Ball'

On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>> Hello Thomas
>>
>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>> Ok, I've finally managed to find out the bad commit:
>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>> over system PM transitions
>>>>>
>>>>> I verified that the parent commit doesn't have the problem.
>>>>
>>>> Interesting.
>>>>
>>>>> Rafael, you're the man now ;)
>>>>
>>>> I kind of don't see how that commit may result in behavior that you
>>>> described earlier in the thread.
>>>>
>>>> You get a memory corruption that seems to have started to happen because
>>>> we're holding an additional lock over suspend resume now.  Something's fishy
>>>> on that machine and we need to figure out what it is.
>>>
>>> The hiccup happens in the timer softirq.
>>>
>>> @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
>>> 	  a try.
>>
>> This looks like it was a good idea.
>>
>> The kernel now outputs the following traces after resuming.
>>
>> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>> debug_print_object+0x83/0xa0()
>> [   26.973932] ODEBUG: free active (active state 0) object type:
>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>> battery thermal wmi evdev mei_me video mei button mperf processor
>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>> 3.11.0-rc2-ARCH #64
>> [   26.974014] Hardware name: CLEVO CO.                        W55xEU
>>                        /W55xEU                          , BIOS 4.6.5
>> 03/05/2013
>> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>> [   26.974020]  0000000000000009 ffff880407d0da18 ffffffff81459fe9
>> ffff880407d0da60
>> [   26.974023]  ffff880407d0da50 ffffffff8104dc7d ffff880407fad488
>> ffffffff81836fc0
>> [   26.974025]  ffffffff81701358 ffffffff81afef70 0000000000000003
>> ffff880407d0dab0
>> [   26.974027] Call Trace:
>> [   26.974031]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
>> [   26.974043]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
>> [   26.974044]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
>> [   26.974047]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
>> [   26.974050]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
>> [   26.974053]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
>> [   26.974059]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0
>> [rtsx_pci]
> 
> So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
> you've bisected it is removed as well, but that happens during resume, so
> rtsx_pci_resume() is likely not called in that case.

I'm not sure I understand your point.

> 
> I bet that there's a bug either in rtsx_pci_remove() or in rtsx_pci_resume().
> The latter definitely should check if the device is actually still present
> before scheduling the delayed work, but then Boris' patch should take care
> of that anyway.
> 

With Boris' patch applied, I still have the problem.

Thanks.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-25  7:42                                                 ` Francis Moreau
@ 2013-11-25 10:47                                                   ` Rafael J. Wysocki
  2013-11-29  8:28                                                     ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-25 10:47 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Borislav Petkov',
	'Wei WANG', 'LKML', 'Samuel Ortiz',
	'Chris Ball'

On Monday, November 25, 2013 08:42:21 AM Francis Moreau wrote:
> On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
> > On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
> >> Hello Thomas
> >>
> >> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
> >>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
> >>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
> >>>>> Ok, I've finally managed to find out the bad commit:
> >>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
> >>>>> over system PM transitions
> >>>>>
> >>>>> I verified that the parent commit doesn't have the problem.
> >>>>
> >>>> Interesting.
> >>>>
> >>>>> Rafael, you're the man now ;)
> >>>>
> >>>> I kind of don't see how that commit may result in behavior that you
> >>>> described earlier in the thread.
> >>>>
> >>>> You get a memory corruption that seems to have started to happen because
> >>>> we're holding an additional lock over suspend resume now.  Something's fishy
> >>>> on that machine and we need to figure out what it is.
> >>>
> >>> The hiccup happens in the timer softirq.
> >>>
> >>> @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
> >>> 	  a try.
> >>
> >> This looks like it was a good idea.
> >>
> >> The kernel now outputs the following traces after resuming.
> >>
> >> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
> >> debug_print_object+0x83/0xa0()
> >> [   26.973932] ODEBUG: free active (active state 0) object type:
> >> timer_list hint: delayed_work_timer_fn+0x0/0x20
> >> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
> >> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
> >> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
> >> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
> >> battery thermal wmi evdev mei_me video mei button mperf processor
> >> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
> >> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
> >> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
> >> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
> >> 3.11.0-rc2-ARCH #64
> >> [   26.974014] Hardware name: CLEVO CO.                        W55xEU
> >>                        /W55xEU                          , BIOS 4.6.5
> >> 03/05/2013
> >> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
> >> [   26.974020]  0000000000000009 ffff880407d0da18 ffffffff81459fe9
> >> ffff880407d0da60
> >> [   26.974023]  ffff880407d0da50 ffffffff8104dc7d ffff880407fad488
> >> ffffffff81836fc0
> >> [   26.974025]  ffffffff81701358 ffffffff81afef70 0000000000000003
> >> ffff880407d0dab0
> >> [   26.974027] Call Trace:
> >> [   26.974031]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
> >> [   26.974043]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
> >> [   26.974044]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
> >> [   26.974047]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
> >> [   26.974050]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
> >> [   26.974053]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
> >> [   26.974059]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0
> >> [rtsx_pci]
> > 
> > So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
> > you've bisected it is removed as well, but that happens during resume, so
> > rtsx_pci_resume() is likely not called in that case.
> 
> I'm not sure I understand your point.

The problem is that without the commit you've bisected, the whole removal of
rtsx_pcr is likely done *before* the PM core calls resume callbacks of
device drivers (although only incidentally, because it very well may be
done in parallel with that).  However, after that commit the removal is only
done after the resume callbacks have been called, which means that the device
is not physically present when rtsx_pci_resume() is called.  Of course,
it may not be physically present at that point anyway, so rtsx_pci_resume()
should have taken that into consideration already, but it doesn't from what
I can tell.
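
The presence check asked for above can be pictured with a small user-space
model in C.  It is only a sketch: struct toy_pci_dev and the toy_* helpers
are hypothetical names, not kernel APIs, and the only hardware fact relied
on is that configuration reads from an absent PCI function return all ones.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a PCI function; not a kernel structure. */
struct toy_pci_dev {
	bool plugged;        /* is the hardware still physically there? */
	uint16_t vendor_id;  /* value a config read returns while plugged */
	bool work_scheduled; /* models a pending delayed work item */
};

/* Config reads from an absent function come back as all ones. */
static uint16_t toy_read_vendor(const struct toy_pci_dev *dev)
{
	return dev->plugged ? dev->vendor_id : 0xFFFF;
}

/* Resume callback that refuses to arm any work for a vanished device. */
static int toy_resume(struct toy_pci_dev *dev)
{
	if (toy_read_vendor(dev) == 0xFFFF)
		return -ENODEV;

	dev->work_scheduled = true;
	return 0;
}
```

In this model a resume on an unplugged device returns -ENODEV and leaves
nothing queued, so a later remove has no pending work to trip over.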

I'll try to prepare a debug patch for you later today.

Thanks!

-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-25 10:47                                                   ` Rafael J. Wysocki
@ 2013-11-29  8:28                                                     ` Francis Moreau
  2013-11-29  9:02                                                       ` Thomas Gleixner
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-11-29  8:28 UTC (permalink / raw)
  To: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball'
  Cc: Rafael J. Wysocki, Thomas Gleixner, 'Borislav Petkov',
	'LKML'

Hello,

On 11/25/2013 11:47 AM, Rafael J. Wysocki wrote:
> On Monday, November 25, 2013 08:42:21 AM Francis Moreau wrote:
>> On 11/24/2013 10:06 PM, Rafael J. Wysocki wrote:
>>> On Sunday, November 24, 2013 10:39:20 AM Francis Moreau wrote:
>>>> Hello Thomas
>>>>
>>>> On 11/22/2013 11:27 PM, Thomas Gleixner wrote:
>>>>> On Fri, 22 Nov 2013, Rafael J. Wysocki wrote:
>>>>>> On Friday, November 22, 2013 10:36:23 PM Francis Moreau wrote:
>>>>>>> Ok, I've finally managed to find out the bad commit:
>>>>>>> ad07277e82dedabacc52c82746633680a3187d25: ACPI / PM: Hold acpi_scan_lock
>>>>>>> over system PM transitions
>>>>>>>
>>>>>>> I verified that the parent commit doesn't have the problem.
>>>>>>
>>>>>> Interesting.
>>>>>>
>>>>>>> Rafael, you're the man now ;)
>>>>>>
>>>>>> I kind of don't see how that commit may result in behavior that you
>>>>>> described earlier in the thread.
>>>>>>
>>>>>> You get a memory corruption that seems to have started to happen because
>>>>>> we're holding an additional lock over suspend resume now.  Something's fishy
>>>>>> on that machine and we need to figure out what it is.
>>>>>
>>>>> The hiccup happens in the timer softirq.
>>>>>
>>>>> @Francis: Did you try to enable DEBUG_OBJECTS.*? If not, please give it
>>>>> 	  a try.
>>>>
>>>> This looks like it was a good idea.
>>>>
>>>> The kernel now outputs the following traces after resuming.
>>>>
>>>> [   26.973928] WARNING: CPU: 0 PID: 4 at lib/debugobjects.c:260
>>>> debug_print_object+0x83/0xa0()
>>>> [   26.973932] ODEBUG: free active (active state 0) object type:
>>>> timer_list hint: delayed_work_timer_fn+0x0/0x20
>>>> [   26.973972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
>>>> rtsx_pci_ms coretemp memstick kvm_intel i2c_i801 iTCO_wdt
>>>> iTCO_vendor_support i915 i2c_algo_bit intel_agp intel_gtt drm_kms_helper
>>>> r8169 drm kvm mii agpgart i2c_core lpc_ich ac shpchp crc32c_intel
>>>> battery thermal wmi evdev mei_me video mei button mperf processor
>>>> serio_raw microcode ext4 crc16 mbcache jbd2 sr_mod cdrom sd_mod
>>>> usb_storage rtsx_pci_sdmmc mmc_core ahci libahci libata ehci_pci
>>>> ehci_hcd xhci_hcd scsi_mod rtsx_pci usbcore usb_common
>>>> [   26.974013] CPU: 0 PID: 4 Comm: kworker/0:0 Not tainted
>>>> 3.11.0-rc2-ARCH #64
>>>> [   26.974014] Hardware name: CLEVO CO.                        W55xEU
>>>>                        /W55xEU                          , BIOS 4.6.5
>>>> 03/05/2013
>>>> [   26.974019] Workqueue: kacpi_hotplug hotplug_event_work
>>>> [   26.974020]  0000000000000009 ffff880407d0da18 ffffffff81459fe9
>>>> ffff880407d0da60
>>>> [   26.974023]  ffff880407d0da50 ffffffff8104dc7d ffff880407fad488
>>>> ffffffff81836fc0
>>>> [   26.974025]  ffffffff81701358 ffffffff81afef70 0000000000000003
>>>> ffff880407d0dab0
>>>> [   26.974027] Call Trace:
>>>> [   26.974031]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
>>>> [   26.974043]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
>>>> [   26.974044]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
>>>> [   26.974047]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
>>>> [   26.974050]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
>>>> [   26.974053]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
>>>> [   26.974059]  [<ffffffffa008e959>] ? rtsx_pci_remove+0x119/0x1d0
>>>> [rtsx_pci]
>>>
>>> So a device driven by rtsx_pcr.c is removed after resume.  Without the commit
>>> you've bisected it is removed as well, but that happens during resume, so
>>> rtsx_pci_resume() is likely not called in that case.
>>
>> I'm not sure I understand your point.
> 
> The problem is that without the commit you've bisected, the whole removal of
> rtsx_pcr is likely done *before* the PM core calls resume callbacks of
> device drivers (although only incidentally, because it very well may be
> done in parallel with that).  However, after that commit the removal is only
> done after the resume callbacks have been called, which means that the device
> is not physically present when rtsx_pci_resume() is called.  Of course,
> it may not be physically present at that point anyway, so rtsx_pci_resume()
> should have taken that into consideration already, but it doesn't from what
> I can tell.
> 

Since it seems to be related to the rtsx driver or its upper layer, could
the folks involved in this area have a look at this issue, please?

Thank you


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-29  8:28                                                     ` Francis Moreau
@ 2013-11-29  9:02                                                       ` Thomas Gleixner
  2013-11-30 15:07                                                         ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Thomas Gleixner @ 2013-11-29  9:02 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

On Fri, 29 Nov 2013, Francis Moreau wrote:
> Since it seems to be related to the rtsx driver or its upper layer, could
> the folks involved in this area have a look at this issue, please?

I'm not involved, but looking at the debug objects backtrace it's
related to the delayed work in rtsx.

Does the untested patch below cure the issue?

Thanks,

	tglx

Index: linux-2.6/drivers/mfd/rtsx_pcr.c
===================================================================
--- linux-2.6.orig/drivers/mfd/rtsx_pcr.c
+++ linux-2.6/drivers/mfd/rtsx_pcr.c
@@ -1227,15 +1227,15 @@ static void rtsx_pci_remove(struct pci_d
 	struct rtsx_pcr *pcr = handle->pcr;
 
 	pcr->remove_pci = true;
+	free_irq(pcr->irq, (void *)pcr);
 
-	cancel_delayed_work(&pcr->carddet_work);
-	cancel_delayed_work(&pcr->idle_work);
+	cancel_delayed_work_sync(&pcr->carddet_work);
+	cancel_delayed_work_sync(&pcr->idle_work);
 
 	mfd_remove_devices(&pcidev->dev);
 
 	dma_free_coherent(&(pcr->pci->dev), RTSX_RESV_BUF_LEN,
 			pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
-	free_irq(pcr->irq, (void *)pcr);
 	if (pcr->msi_en)
 		pci_disable_msi(pcr->pci);
 	iounmap(pcr->remap_addr);
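
One way to read the reordering in the patch above is as a toy user-space
model in C.  This is a sketch under an assumption, namely that the
driver's interrupt handler can schedule the delayed work; struct toy_pcr
and the toy_* functions are hypothetical and only mimic the ordering of
the calls, not their real behaviour.

```c
#include <stdbool.h>

/* Hypothetical model of the driver state; not the real struct rtsx_pcr. */
struct toy_pcr {
	bool irq_registered; /* handler still attached to the IRQ line */
	bool work_pending;   /* a delayed work timer is armed */
};

/* An interrupt only reaches the handler while it is registered. */
static void toy_irq_fires(struct toy_pcr *pcr)
{
	if (pcr->irq_registered)
		pcr->work_pending = true; /* handler schedules delayed work */
}

static void toy_cancel_work(struct toy_pcr *pcr) { pcr->work_pending = false; }
static void toy_free_irq(struct toy_pcr *pcr) { pcr->irq_registered = false; }

/* Old order: cancel, then a late interrupt, then free_irq, then free.
 * Returns true if the object would be freed with work still armed. */
static bool toy_remove_old_order(struct toy_pcr *pcr)
{
	toy_cancel_work(pcr);
	toy_irq_fires(pcr);       /* re-arms the work after the cancel */
	toy_free_irq(pcr);
	return pcr->work_pending; /* state at the point of kfree() */
}

/* New order: free_irq first, so a late interrupt cannot re-arm anything. */
static bool toy_remove_new_order(struct toy_pcr *pcr)
{
	toy_free_irq(pcr);
	toy_irq_fires(pcr);       /* ignored, no handler registered */
	toy_cancel_work(pcr);
	return pcr->work_pending;
}
```

In the model the old order ends with work still armed at free time, which
corresponds to the "free active ... timer_list" report, while the new
order does not.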
 

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-29  9:02                                                       ` Thomas Gleixner
@ 2013-11-30 15:07                                                         ` Francis Moreau
  2013-11-30 20:17                                                           ` Rafael J. Wysocki
  2013-12-02 10:49                                                           ` Thomas Gleixner
  0 siblings, 2 replies; 63+ messages in thread
From: Francis Moreau @ 2013-11-30 15:07 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

Hello Thomas,

Sorry for the delay.

On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> On Fri, 29 Nov 2013, Francis Moreau wrote:
>> Since it seems to be related to the rtsx driver or its upper layer, could
>> the folks involved in this area have a look at this issue, please?
> 
> I'm not involved, but looking at the debug objects backtrace it's
> related to the delayed work in rtsx.
> 
> Does the untested patch below cure the issue?
> 

It seems it does, since I can't see the debug object trace anymore.
However, I can see this now:

[   64.498270] irq 16: nobody cared (try booting with the "irqpoll" option)
[   64.498314] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc2-ARCH #65
[   64.498316] Hardware name: CLEVO CO.                        W55xEU                       /W55xEU                          , BIOS 4.6.5 03/05/2013
[   64.498317]  ffff8804078bd38c ffff88041e203e48 ffffffff81459fe9 ffff8804078bd300
[   64.498320]  ffff88041e203e70 ffffffff810d8632 ffff8804078bd300 0000000000000010
[   64.498322]  0000000000000000 ffff88041e203eb0 ffffffff810d8a58 ffffffff8136a882
[   64.498324] Call Trace:
[   64.498325]  <IRQ>  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
[   64.498334]  [<ffffffff810d8632>] __report_bad_irq+0x32/0xd0
[   64.498337]  [<ffffffff810d8a58>] note_interrupt+0x138/0x1f0
[   64.498340]  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
[   64.498343]  [<ffffffff810d6439>] handle_irq_event_percpu+0xf9/0x250
[   64.498345]  [<ffffffff810d65cd>] handle_irq_event+0x3d/0x60
[   64.498347]  [<ffffffff810d95ca>] handle_fasteoi_irq+0x5a/0x100
[   64.498350]  [<ffffffff81004a6e>] handle_irq+0x1e/0x30
[   64.498353]  [<ffffffff8146aafd>] do_IRQ+0x4d/0xc0
[   64.498355]  [<ffffffff8146116d>] common_interrupt+0x6d/0x6d
[   64.498356]  <EOI>  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
[   64.498360]  [<ffffffff8136a878>] ? cpuidle_enter_state+0x48/0xc0
[   64.498362]  [<ffffffff8136a9b9>] cpuidle_idle_call+0xc9/0x280
[   64.498365]  [<ffffffff8100bf6e>] arch_cpu_idle+0xe/0x30
[   64.498368]  [<ffffffff810a1287>] cpu_startup_entry+0x257/0x2d0
[   64.498370]  [<ffffffff8144d404>] rest_init+0x84/0x90
[   64.498373]  [<ffffffff818d9ee1>] start_kernel+0x414/0x420
[   64.498375]  [<ffffffff818d98d6>] ? repair_env_string+0x5c/0x5c
[   64.498377]  [<ffffffff818d9120>] ? early_idt_handlers+0x120/0x120
[   64.498379]  [<ffffffff818d95be>] x86_64_start_reservations+0x2a/0x2c
[   64.498381]  [<ffffffff818d96c8>] x86_64_start_kernel+0x108/0x117
[   64.498382] handlers:
[   64.498402] [<ffffffffa00168f0>] usb_hcd_irq [usbcore]
[   64.498422] Disabling IRQ #16

So I don't think it completely solves the problem, but it's a good start.

Thank you.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-30 15:07                                                         ` Francis Moreau
@ 2013-11-30 20:17                                                           ` Rafael J. Wysocki
  2013-12-01 10:11                                                             ` Francis Moreau
  2013-12-01 19:26                                                             ` Francis Moreau
  2013-12-02 10:49                                                           ` Thomas Gleixner
  1 sibling, 2 replies; 63+ messages in thread
From: Rafael J. Wysocki @ 2013-11-30 20:17 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Samuel Ortiz', 'Chris Ball',
	'Borislav Petkov', 'LKML'

On Saturday, November 30, 2013 04:07:36 PM Francis Moreau wrote:
> Hello Thomas,
> 
> Sorry for the delay.
> 
> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> > On Fri, 29 Nov 2013, Francis Moreau wrote:
> >> Since it seems to be related to rtsx driver or its upper layer, could
> >> the folks involved in this area have a look to this issue please ?
> > 
> > I'm not involved, but looking at the debug objects backtrace it's
> > related to the delayed work in rtsx.
> > 
> > Does the untested patch below cure the issue?
> > 
> 
> It seems it does, since I can't see the debug object trace anymore;
> however, I can see this now:

So Thomas' patch should be applied to the rtsx driver.

> [   64.498270] irq 16: nobody cared (try booting with the "irqpoll" option)
> [   64.498314] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc2-ARCH #65
> [   64.498316] Hardware name: CLEVO CO.                        W55xEU                       /W55xEU                          , BIOS 4.6.5 03/05/2013
> [   64.498317]  ffff8804078bd38c ffff88041e203e48 ffffffff81459fe9 ffff8804078bd300
> [   64.498320]  ffff88041e203e70 ffffffff810d8632 ffff8804078bd300 0000000000000010
> [   64.498322]  0000000000000000 ffff88041e203eb0 ffffffff810d8a58 ffffffff8136a882
> [   64.498324] Call Trace:
> [   64.498325]  <IRQ>  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
> [   64.498334]  [<ffffffff810d8632>] __report_bad_irq+0x32/0xd0
> [   64.498337]  [<ffffffff810d8a58>] note_interrupt+0x138/0x1f0
> [   64.498340]  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
> [   64.498343]  [<ffffffff810d6439>] handle_irq_event_percpu+0xf9/0x250
> [   64.498345]  [<ffffffff810d65cd>] handle_irq_event+0x3d/0x60
> [   64.498347]  [<ffffffff810d95ca>] handle_fasteoi_irq+0x5a/0x100
> [   64.498350]  [<ffffffff81004a6e>] handle_irq+0x1e/0x30
> [   64.498353]  [<ffffffff8146aafd>] do_IRQ+0x4d/0xc0
> [   64.498355]  [<ffffffff8146116d>] common_interrupt+0x6d/0x6d
> [   64.498356]  <EOI>  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
> [   64.498360]  [<ffffffff8136a878>] ? cpuidle_enter_state+0x48/0xc0
> [   64.498362]  [<ffffffff8136a9b9>] cpuidle_idle_call+0xc9/0x280
> [   64.498365]  [<ffffffff8100bf6e>] arch_cpu_idle+0xe/0x30
> [   64.498368]  [<ffffffff810a1287>] cpu_startup_entry+0x257/0x2d0
> [   64.498370]  [<ffffffff8144d404>] rest_init+0x84/0x90
> [   64.498373]  [<ffffffff818d9ee1>] start_kernel+0x414/0x420
> [   64.498375]  [<ffffffff818d98d6>] ? repair_env_string+0x5c/0x5c
> [   64.498377]  [<ffffffff818d9120>] ? early_idt_handlers+0x120/0x120
> [   64.498379]  [<ffffffff818d95be>] x86_64_start_reservations+0x2a/0x2c
> [   64.498381]  [<ffffffff818d96c8>] x86_64_start_kernel+0x108/0x117
> [   64.498382] handlers:
> [   64.498402] [<ffffffffa00168f0>] usb_hcd_irq [usbcore]
> [   64.498422] Disabling IRQ #16
> 
> So I don't think it completely solves the problem, but it's a good start.

That issue may or may not be related.

If your system survives resume (I guess it does?), can you please send
/proc/interrupts before and after the first suspend/resume cycle?

Rafael


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-30 20:17                                                           ` Rafael J. Wysocki
@ 2013-12-01 10:11                                                             ` Francis Moreau
  2013-12-01 19:26                                                             ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-12-01 10:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Samuel Ortiz', 'Chris Ball',
	'Borislav Petkov', 'LKML'

On 11/30/2013 09:17 PM, Rafael J. Wysocki wrote:
> On Saturday, November 30, 2013 04:07:36 PM Francis Moreau wrote:
>> Hello Thomas,
>>
>> Sorry for the delay.
>>
>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>> the folks involved in this area have a look to this issue please ?
>>>
>>> I'm not involved, but looking at the debug objects backtrace it's
>>> related to the delayed work in rtsx.
>>>
>>> Does the untested patch below cure the issue?
>>>
>>
>> It seems it does, since I can't see the debug object trace anymore;
>> however, I can see this now:
> 
> So Thomas' patch should be applied to the rtsx driver.
> 
>> [   64.498270] irq 16: nobody cared (try booting with the "irqpoll" option)
>> [   64.498314] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.11.0-rc2-ARCH #65
>> [   64.498316] Hardware name: CLEVO CO.                        W55xEU                       /W55xEU                          , BIOS 4.6.5 03/05/2013
>> [   64.498317]  ffff8804078bd38c ffff88041e203e48 ffffffff81459fe9 ffff8804078bd300
>> [   64.498320]  ffff88041e203e70 ffffffff810d8632 ffff8804078bd300 0000000000000010
>> [   64.498322]  0000000000000000 ffff88041e203eb0 ffffffff810d8a58 ffffffff8136a882
>> [   64.498324] Call Trace:
>> [   64.498325]  <IRQ>  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
>> [   64.498334]  [<ffffffff810d8632>] __report_bad_irq+0x32/0xd0
>> [   64.498337]  [<ffffffff810d8a58>] note_interrupt+0x138/0x1f0
>> [   64.498340]  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
>> [   64.498343]  [<ffffffff810d6439>] handle_irq_event_percpu+0xf9/0x250
>> [   64.498345]  [<ffffffff810d65cd>] handle_irq_event+0x3d/0x60
>> [   64.498347]  [<ffffffff810d95ca>] handle_fasteoi_irq+0x5a/0x100
>> [   64.498350]  [<ffffffff81004a6e>] handle_irq+0x1e/0x30
>> [   64.498353]  [<ffffffff8146aafd>] do_IRQ+0x4d/0xc0
>> [   64.498355]  [<ffffffff8146116d>] common_interrupt+0x6d/0x6d
>> [   64.498356]  <EOI>  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
>> [   64.498360]  [<ffffffff8136a878>] ? cpuidle_enter_state+0x48/0xc0
>> [   64.498362]  [<ffffffff8136a9b9>] cpuidle_idle_call+0xc9/0x280
>> [   64.498365]  [<ffffffff8100bf6e>] arch_cpu_idle+0xe/0x30
>> [   64.498368]  [<ffffffff810a1287>] cpu_startup_entry+0x257/0x2d0
>> [   64.498370]  [<ffffffff8144d404>] rest_init+0x84/0x90
>> [   64.498373]  [<ffffffff818d9ee1>] start_kernel+0x414/0x420
>> [   64.498375]  [<ffffffff818d98d6>] ? repair_env_string+0x5c/0x5c
>> [   64.498377]  [<ffffffff818d9120>] ? early_idt_handlers+0x120/0x120
>> [   64.498379]  [<ffffffff818d95be>] x86_64_start_reservations+0x2a/0x2c
>> [   64.498381]  [<ffffffff818d96c8>] x86_64_start_kernel+0x108/0x117
>> [   64.498382] handlers:
>> [   64.498402] [<ffffffffa00168f0>] usb_hcd_irq [usbcore]
>> [   64.498422] Disabling IRQ #16
>>
>> So I don't think it completely solves the problem, but it's a good start.
> 
> That issue may or may not be related.
> 
> If your system survives resume (I guess it does?), 

My system has survived resume ever since the DEBUG_OBJECTS facility was
activated.

> can you please send
> /proc/interrupts before and after the first suspend/resume cycle?
> 

Sure, I will do that later today.


Thanks for your help.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-30 20:17                                                           ` Rafael J. Wysocki
  2013-12-01 10:11                                                             ` Francis Moreau
@ 2013-12-01 19:26                                                             ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-12-01 19:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Samuel Ortiz', 'Chris Ball',
	'Borislav Petkov', 'LKML'

[-- Attachment #1: Type: text/plain, Size: 243 bytes --]

On 11/30/2013 09:17 PM, Rafael J. Wysocki wrote:

[...]

> If your system survives resume (I guess it does?), can you please send
> /proc/interrupts before and after the first suspend/resume cycle?
> 

Please find both dumps attached.

Thanks

[-- Attachment #2: after.irq --]
[-- Type: text/plain, Size: 2877 bytes --]

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  0:         18          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:        202          2          0          0          6          6          1          0   IO-APIC-edge      i8042
  9:        381         11          7          2         28        121          3          7   IO-APIC-fasteoi   acpi
 12:         17          1          0          1          2          1          0          0   IO-APIC-edge      i8042
 16:      99993          0          0          1          2          4          0          1   IO-APIC-fasteoi   ehci_hcd:usb3
 23:         50          4          0          0          6          0          0          1   IO-APIC-fasteoi   ehci_hcd:usb4
 41:      10082        499        229        182       7653        435        112        137   PCI-MSI-edge      xhci_hcd
 42:        973          1         32         65         20        126         17        102   PCI-MSI-edge      ahci
 43:         26          0          0          0          0          0          0          0   PCI-MSI-edge      mei_me
 45:         21         33          0          0          6          1          0          0   PCI-MSI-edge      i915
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:       2071       1279        951       1023       1177        764        638        700   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:         56         79         36         21         17         43         39         38   IRQ work interrupts
RTR:         12          0          0          0          0          0          0          0   APIC ICR read retries
RES:       2033       2711       2288       1925       2039       1390        916       1161   Rescheduling interrupts
CAL:        419        438        479        466        491        512        502        477   Function call interrupts
TLB:         63          2          3         11          1          7          5          0   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          4          4          4          4          4          4          4          4   Machine check polls
ERR:          0
MIS:          0

[-- Attachment #3: before.irq --]
[-- Type: text/plain, Size: 2999 bytes --]

           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7       
  0:         18          0          0          0          0          0          0          0   IO-APIC-edge      timer
  1:        100          2          0          0          6          6          1          0   IO-APIC-edge      i8042
  9:        179         11          7          2         26        118          3          6   IO-APIC-fasteoi   acpi
 12:          9          1          0          1          2          1          0          0   IO-APIC-edge      i8042
 16:         25          0          0          1          2          4          0          1   IO-APIC-fasteoi   ehci_hcd:usb3
 23:         30          0          0          0          6          0          0          1   IO-APIC-fasteoi   ehci_hcd:usb4
 40:          6          7          3          0          0          1          4          2   PCI-MSI-edge      rtsx_pci
 41:       7196        491        228        181       7448        425        109        137   PCI-MSI-edge      xhci_hcd
 42:        929          1         32         65         20        125         17        102   PCI-MSI-edge      ahci
 43:         15          0          0          0          0          7          2          0   PCI-MSI-edge      mei_me
 45:         14         33          0          0          5          1          0          0   PCI-MSI-edge      i915
NMI:          0          0          0          0          0          0          0          0   Non-maskable interrupts
LOC:       1331       1141        847        911       1104        673        577        642   Local timer interrupts
SPU:          0          0          0          0          0          0          0          0   Spurious interrupts
PMI:          0          0          0          0          0          0          0          0   Performance monitoring interrupts
IWI:         46         76         31         17         11         42         32         35   IRQ work interrupts
RTR:          6          0          0          0          0          0          0          0   APIC ICR read retries
RES:       1863       2342       2049       1827       1931       1329        685       1104   Rescheduling interrupts
CAL:        418        422        463        450        487        496        496        460   Function call interrupts
TLB:          2          2          3         11          1          7          5          0   TLB shootdowns
TRM:          0          0          0          0          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0          0          0          0          0   Machine check exceptions
MCP:          2          2          2          2          2          2          2          2   Machine check polls
ERR:          0
MIS:          0
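
The two dumps above are easier to compare once each row is reduced to a
per-IRQ total. What follows is a minimal sketch in Python and was not posted
in the thread: it sums the per-CPU columns of two /proc/interrupts snapshots
and lists the rows by how much they grew. The function names
(parse_interrupts, diff_interrupts) are made up for this illustration, and
the sample data is an abridged copy of three rows from the attachments.

```python
def parse_interrupts(text):
    """Map each row label (e.g. "16", "NMI") to (total_count, description)."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())          # header row: CPU0 CPU1 ...
    table = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()
        counts = []
        for field in fields[:ncpus]:       # leading run of per-CPU counters
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def diff_interrupts(before_text, after_text):
    """Return (delta, label, before_total, after_total, description) rows,
    largest increase first. A total of None means the row was absent."""
    before = parse_interrupts(before_text)
    after = parse_interrupts(after_text)
    rows = []
    for label in before.keys() | after.keys():
        b_total, b_desc = before.get(label, (None, ""))
        a_total, a_desc = after.get(label, (None, ""))
        delta = (a_total or 0) - (b_total or 0)
        rows.append((delta, label, b_total, a_total, a_desc or b_desc))
    rows.sort(key=lambda r: (-r[0], r[1]))
    return rows


# Abridged sample data: three rows copied from the attached dumps.
BEFORE = """
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
 16:         25          0          0          1          2          4          0          1   IO-APIC-fasteoi   ehci_hcd:usb3
 40:          6          7          3          0          0          1          4          2   PCI-MSI-edge      rtsx_pci
 41:       7196        491        228        181       7448        425        109        137   PCI-MSI-edge      xhci_hcd
"""

AFTER = """
           CPU0       CPU1       CPU2       CPU3       CPU4       CPU5       CPU6       CPU7
 16:      99993          0          0          1          2          4          0          1   IO-APIC-fasteoi   ehci_hcd:usb3
 41:      10082        499        229        182       7653        435        112        137   PCI-MSI-edge      xhci_hcd
"""

for delta, label, b_total, a_total, desc in diff_interrupts(BEFORE, AFTER):
    print(f"{label:>4} {b_total!s:>8} -> {a_total!s:>8}  ({delta:+d})  {desc}")
```

On this abridged data, the IRQ 16 row (ehci_hcd:usb3) goes from a total of 33
to 100001 across the suspend/resume cycle, and the rtsx_pci row appears only
in the before dump.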

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-11-30 15:07                                                         ` Francis Moreau
  2013-11-30 20:17                                                           ` Rafael J. Wysocki
@ 2013-12-02 10:49                                                           ` Thomas Gleixner
  2013-12-02 11:20                                                             ` Thomas Gleixner
  1 sibling, 1 reply; 63+ messages in thread
From: Thomas Gleixner @ 2013-12-02 10:49 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

On Sat, 30 Nov 2013, Francis Moreau wrote:
> Hello Thomas,
> 
> Sorry for the delay.
> 
> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> > On Fri, 29 Nov 2013, Francis Moreau wrote:
> >> Since it seems to be related to rtsx driver or its upper layer, could
> >> the folks involved in this area have a look to this issue please ?
> > 
> > I'm not involved, but looking at the debug objects backtrace it's
> > related to the delayed work in rtsx.
> > 
> > Does the untested patch below cure the issue?
> > 
> 
> It seems it does, since I can't see the debug object trace anymore;
> however, I can see this now:

<SNIP>
 
> So I don't think it completely solves the problem, but it's a good start.

I kinda expected that, but I wanted to confirm my suspicion, that the
interrupt hits after the delayed work is canceled and just requeues it
again, which then leads to an armed timer being freed further down.

I'm not familiar with that driver and I leave the final fixup to the
driver maintainers. It's enough data for them to figure out the real
solution.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-02 10:49                                                           ` Thomas Gleixner
@ 2013-12-02 11:20                                                             ` Thomas Gleixner
  2013-12-03  8:14                                                               ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: Thomas Gleixner @ 2013-12-02 11:20 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

On Mon, 2 Dec 2013, Thomas Gleixner wrote:
> On Sat, 30 Nov 2013, Francis Moreau wrote:
> > Hello Thomas,
> > 
> > Sorry for the delay.
> > 
> > On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> > > On Fri, 29 Nov 2013, Francis Moreau wrote:
> > >> Since it seems to be related to rtsx driver or its upper layer, could
> > >> the folks involved in this area have a look to this issue please ?
> > > 
> > > I'm not involved, but looking at the debug objects backtrace it's
> > > related to the delayed work in rtsx.
> > > 
> > > Does the untested patch below cure the issue?
> > > 
> > 
> > It seems it does, since I can't see the debug object trace anymore;
> > however, I can see this now:
> 
> <SNIP>
>  
> > So I don't think it completely solves the problem, but it's a good start.
> 
> I kinda expected that, but I wanted to confirm my suspicion, that the
> interrupt hits after the delayed work is canceled and just requeues it
> again, which then leads to an armed timer being freed further down.
> 
> I'm not familiar with that driver and I leave the final fixup to the
> driver maintainers. It's enough data for them to figure out the real
> solution.

Just had a quick look and the obvious solution is to disable the
interrupts at the device level _BEFORE_ doing anything else in the
teardown path. Updated patch below. That should avoid the nobody cared
splat on the other irq line.

Thanks,

	tglx

Index: linux-2.6/drivers/mfd/rtsx_pcr.c
===================================================================
--- linux-2.6.orig/drivers/mfd/rtsx_pcr.c
+++ linux-2.6/drivers/mfd/rtsx_pcr.c
@@ -1228,8 +1228,14 @@ static void rtsx_pci_remove(struct pci_d
 
 	pcr->remove_pci = true;
 
-	cancel_delayed_work(&pcr->carddet_work);
-	cancel_delayed_work(&pcr->idle_work);
+	/* Disable interrupts at the pcr level */
+	spin_lock_irq(&pcr->lock);
+	rtsx_pci_writel(pcr, RTSX_BIER, 0);
+	pcr->bier = 0;
+	spin_unlock_irq(&pcr->lock);
+
+	cancel_delayed_work_sync(&pcr->carddet_work);
+	cancel_delayed_work_sync(&pcr->idle_work);
 
 	mfd_remove_devices(&pcidev->dev);
 


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-02 11:20                                                             ` Thomas Gleixner
@ 2013-12-03  8:14                                                               ` Francis Moreau
  2013-12-09 19:33                                                                 ` Francis Moreau
  2013-12-09 22:17                                                                 ` Samuel Ortiz
  0 siblings, 2 replies; 63+ messages in thread
From: Francis Moreau @ 2013-12-03  8:14 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

Hello Thomas,

On 12/02/2013 12:20 PM, Thomas Gleixner wrote:
> On Mon, 2 Dec 2013, Thomas Gleixner wrote:
>> On Sat, 30 Nov 2013, Francis Moreau wrote:
>>> Hello Thomas,
>>>
>>> Sorry for the delay.
>>>
>>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>>> the folks involved in this area have a look to this issue please ?
>>>>
>>>> I'm not involved, but looking at the debug objects backtrace it's
>>>> related to the delayed work in rtsx.
>>>>
>>>> Does the untested patch below cure the issue?
>>>>
>>>
>>> It seems it does, since I can't see the debug object trace anymore;
>>> however, I can see this now:
>>
>> <SNIP>
>>  
>>> So I don't think it completely solves the problem, but it's a good start.
>>
>> I kinda expected that, but I wanted to confirm my suspicion, that the
>> interrupt hits after the delayed work is canceled and just requeues it
>> again, which then leads to an armed timer being freed further down.
>>
>> I'm not familiar with that driver and I leave the final fixup to the
>> driver maintainers. It's enough data for them to figure out the real
>> solution.
> 
> Just had a quick look and the obvious solution is to disable the
> interrupts at the device level _BEFORE_ doing anything else in the
> teardown path. Updated patch below. That should avoid the nobody cared
> splat on the other irq line.
> 

Yes it does.

Now that you have done the hard work, I hope the driver's maintainer/developer
will take care of this issue.

Thank you.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-03  8:14                                                               ` Francis Moreau
@ 2013-12-09 19:33                                                                 ` Francis Moreau
  2013-12-09 22:27                                                                   ` Samuel Ortiz
  2013-12-09 22:17                                                                 ` Samuel Ortiz
  1 sibling, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-12-09 19:33 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Jingoo Han, 'Wei WANG', 'Samuel Ortiz',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML'

On 12/03/2013 09:14 AM, Francis Moreau wrote:
> Hello Thomas,
> 
> On 12/02/2013 12:20 PM, Thomas Gleixner wrote:
>> On Mon, 2 Dec 2013, Thomas Gleixner wrote:
>>> On Sat, 30 Nov 2013, Francis Moreau wrote:
>>>> Hello Thomas,
>>>>
>>>> Sorry for the delay.
>>>>
>>>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
>>>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
>>>>>> Since it seems to be related to rtsx driver or its upper layer, could
>>>>>> the folks involved in this area have a look to this issue please ?
>>>>>
>>>>> I'm not involved, but looking at the debug objects backtrace it's
>>>>> related to the delayed work in rtsx.
>>>>>
>>>>> Does the untested patch below cure the issue?
>>>>>
>>>>
>>>> It seems it does, since I can't see the debug object trace anymore;
>>>> however, I can see this now:
>>>
>>> <SNIP>
>>>  
>>>> So I don't think it completely solves the problem, but it's a good start.
>>>
>>> I kinda expected that, but I wanted to confirm my suspicion, that the
>>> interrupt hits after the delayed work is canceled and just requeues it
>>> again, which then leads to an armed timer being freed further down.
>>>
>>> I'm not familiar with that driver and I leave the final fixup to the
>>> driver maintainers. It's enough data for them to figure out the real
>>> solution.
>>
>> Just had a quick look and the obvious solution is to disable the
>> interrupts at the device level _BEFORE_ doing anything else in the
>> teardown path. Updated patch below. That should avoid the nobody cared
>> splat on the other irq line.
>>
> 
> Yes it does.
> 
> Now that you have done the hard work, I hope the driver's maintainer/developer
> will take care of this issue.
> 

Unfortunately he/she doesn't seem to care.

Moreover, I've now been hit by this:

[  241.003324] INFO: task kworker/u16:4:108 blocked for more than 120 seconds.
[  241.003331]       Not tainted 3.12.2-1-ARCH #1
[  241.003332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  241.003335] kworker/u16:4   D ffff880405bc8000     0   108      2 0x00000000
[  241.003355] Workqueue: kmemstick memstick_check [memstick]
[  241.003358]  ffff880405bc3c90 0000000000000046 00000000000144c0 ffff880405bc3fd8
[  241.003362]  ffff880405bc3fd8 00000000000144c0 ffff880405bc8000 ffff880405bc3c68
[  241.003366]  ffffffff814ef57c ffff880405bc3fd8 0000000000000286 0000000000000000
[  241.003370] Call Trace:
[  241.003380]  [<ffffffff814ef57c>] ? schedule_timeout+0x13c/0x290
[  241.003385]  [<ffffffff8106f590>] ? detach_if_pending+0x120/0x120
[  241.003388]  [<ffffffff8106f590>] ? detach_if_pending+0x120/0x120
[  241.003392]  [<ffffffff814f2e79>] schedule+0x29/0x70
[  241.003396]  [<ffffffff814ef659>] schedule_timeout+0x219/0x290
[  241.003401]  [<ffffffff8129a4d1>] ? vsnprintf+0x1e1/0x680
[  241.003405]  [<ffffffff814f2213>] wait_for_common+0xd3/0x180
[  241.003411]  [<ffffffff81095100>] ? wake_up_process+0x40/0x40
[  241.003414]  [<ffffffff814f22dd>] wait_for_completion+0x1d/0x20
[  241.003419]  [<ffffffffa061334a>] memstick_set_rw_addr+0x4a/0x50 [memstick]
[  241.003424]  [<ffffffffa061388e>] memstick_check+0x10e/0x370 [memstick]
[  241.003429]  [<ffffffff8107daf7>] process_one_work+0x167/0x450
[  241.003432]  [<ffffffff8107e501>] worker_thread+0x121/0x3a0
[  241.003436]  [<ffffffff8107e3e0>] ? manage_workers.isra.23+0x2b0/0x2b0
[  241.003441]  [<ffffffff81084e90>] kthread+0xc0/0xd0
[  241.003446]  [<ffffffff81084dd0>] ? kthread_create_on_node+0x120/0x120
[  241.003450]  [<ffffffff814fc33c>] ret_from_fork+0x7c/0xb0
[  241.003454]  [<ffffffff81084dd0>] ? kthread_create_on_node+0x120/0x120

looks like a different issue.

I have already blacklisted this driver; maybe it's time to mark it as broken?

Thanks.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-03  8:14                                                               ` Francis Moreau
  2013-12-09 19:33                                                                 ` Francis Moreau
@ 2013-12-09 22:17                                                                 ` Samuel Ortiz
  2013-12-10  1:39                                                                   ` wwang
  2013-12-10 10:49                                                                   ` Francis Moreau
  1 sibling, 2 replies; 63+ messages in thread
From: Samuel Ortiz @ 2013-12-09 22:17 UTC (permalink / raw)
  To: Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Francis,

Adding Lee to the Cc list.

On Tue, Dec 03, 2013 at 09:14:14AM +0100, Francis Moreau wrote:
> Now that you have done the hard work, I hope the driver's maintainer/developer
> will take care of this issue.
I applied Thomas' patch to mfd-fixes.
Thanks a lot to you and Thomas for that.

Cheers,
Samuel.

-- 
Intel Open Source Technology Centre
http://oss.intel.com/

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-09 19:33                                                                 ` Francis Moreau
@ 2013-12-09 22:27                                                                   ` Samuel Ortiz
  0 siblings, 0 replies; 63+ messages in thread
From: Samuel Ortiz @ 2013-12-09 22:27 UTC (permalink / raw)
  To: Francis Moreau, Wei WANG
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Francis,

On Mon, Dec 09, 2013 at 08:33:32PM +0100, Francis Moreau wrote:
> On 12/03/2013 09:14 AM, Francis Moreau wrote:
> > Hello Thomas,
> > 
> > On 12/02/2013 12:20 PM, Thomas Gleixner wrote:
> >> On Mon, 2 Dec 2013, Thomas Gleixner wrote:
> >>> On Sat, 30 Nov 2013, Francis Moreau wrote:
> >>>> Hello Thomas,
> >>>>
> >>>> Sorry for the delay.
> >>>>
> >>>> On 11/29/2013 10:02 AM, Thomas Gleixner wrote:
> >>>>> On Fri, 29 Nov 2013, Francis Moreau wrote:
> >>>>>> Since it seems to be related to rtsx driver or its upper layer, could
> >>>>>> the folks involved in this area have a look to this issue please ?
> >>>>>
> >>>>> I'm not involved, but looking at the debug objects backtrace it's
> >>>>> related to the delayed work in rtsx.
> >>>>>
> >>>>> Does the untested patch below cure the issue?
> >>>>>
> >>>>
> >>>> It seems it does, since I can't see the debug object trace anymore;
> >>>> however, I can see this now:
> >>>
> >>> <SNIP>
> >>>  
> >>>> So I don't think it completely solves the problem, but it's a good start.
> >>>
> >>> I kinda expected that, but I wanted to confirm my suspicion, that the
> >>> interrupt hits after the delayed work is canceled and just requeues it
> >>> again, which then leads to an armed timer being freed further down.
> >>>
> >>> I'm not familiar with that driver and I leave the final fixup to the
> >>> driver maintainers. It's enough data for them to figure out the real
> >>> solution.
> >>
> >> Just had a quick look and the obvious solution is to disable the
> >> interrupts at the device level _BEFORE_ doing anything else in the
> >> teardown path. Updated patch below. That should avoid the nobody cared
> >> splat on the other irq line.
> >>
> > 
> > Yes it does.
> > 
> > Now that you have done the hard work, I hope the driver's maintainer/developer
> > will take care of this issue.
> > 
> 
> Unfortunately he/she doesn't seem to care.
> 
> Moreover, I've now been hit by this:
> 
> [  241.003324] INFO: task kworker/u16:4:108 blocked for more than 120 seconds.
> [  241.003331]       Not tainted 3.12.2-1-ARCH #1
> [  241.003332] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [  241.003335] kworker/u16:4   D ffff880405bc8000     0   108      2 0x00000000
> [  241.003355] Workqueue: kmemstick memstick_check [memstick]
> [  241.003358]  ffff880405bc3c90 0000000000000046 00000000000144c0 ffff880405bc3fd8
> [  241.003362]  ffff880405bc3fd8 00000000000144c0 ffff880405bc8000 ffff880405bc3c68
> [  241.003366]  ffffffff814ef57c ffff880405bc3fd8 0000000000000286 0000000000000000
> [  241.003370] Call Trace:
> [  241.003380]  [<ffffffff814ef57c>] ? schedule_timeout+0x13c/0x290
> [  241.003385]  [<ffffffff8106f590>] ? detach_if_pending+0x120/0x120
> [  241.003388]  [<ffffffff8106f590>] ? detach_if_pending+0x120/0x120
> [  241.003392]  [<ffffffff814f2e79>] schedule+0x29/0x70
> [  241.003396]  [<ffffffff814ef659>] schedule_timeout+0x219/0x290
> [  241.003401]  [<ffffffff8129a4d1>] ? vsnprintf+0x1e1/0x680
> [  241.003405]  [<ffffffff814f2213>] wait_for_common+0xd3/0x180
> [  241.003411]  [<ffffffff81095100>] ? wake_up_process+0x40/0x40
> [  241.003414]  [<ffffffff814f22dd>] wait_for_completion+0x1d/0x20
> [  241.003419]  [<ffffffffa061334a>] memstick_set_rw_addr+0x4a/0x50 [memstick]
> [  241.003424]  [<ffffffffa061388e>] memstick_check+0x10e/0x370 [memstick]
> [  241.003429]  [<ffffffff8107daf7>] process_one_work+0x167/0x450
> [  241.003432]  [<ffffffff8107e501>] worker_thread+0x121/0x3a0
> [  241.003436]  [<ffffffff8107e3e0>] ? manage_workers.isra.23+0x2b0/0x2b0
> [  241.003441]  [<ffffffff81084e90>] kthread+0xc0/0xd0
> [  241.003446]  [<ffffffff81084dd0>] ? kthread_create_on_node+0x120/0x120
> [  241.003450]  [<ffffffff814fc33c>] ret_from_fork+0x7c/0xb0
> [  241.003454]  [<ffffffff81084dd0>] ? kthread_create_on_node+0x120/0x120
> 
> looks like a different issue.
Indeed. I assume you don't see that issue on the resume path?
Wei, is that something you've ever seen with the rtsx memstick driver?

Cheers,
Samuel.

-- 
Intel Open Source Technology Centre
http://oss.intel.com/

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-09 22:17                                                                 ` Samuel Ortiz
@ 2013-12-10  1:39                                                                   ` wwang
  2013-12-10  1:56                                                                     ` micky
  2013-12-10 10:49                                                                   ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: wwang @ 2013-12-10  1:39 UTC (permalink / raw)
  To: Samuel Ortiz, Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones, micky

On 12/10/2013 06:17 AM, Samuel Ortiz wrote:
> Hi Francis,
>
> Adding Lee to the Cc list.
>
> On Tue, Dec 03, 2013 at 09:14:14AM +0100, Francis Moreau wrote:
>> Now that you did the hard work, I hope driver's maintainer/developper
>> will care about this issue.
> I applied Thomas' patch to mfd-fixes.
> Thanks a lot to you and Thomas for that.
>
> Cheers,
> Samuel.
>

Hi Samuel:

Adding Micky to the list; he is now responsible for maintaining this driver.

We can't reproduce this issue on our platform, so it is a little
difficult for us to pin down the cause. We have tested Thomas' patch and
it works fine on our platform, but we think it may not be complete.
Micky will send his patch later, which is based on Thomas' patch.

BR,
Wei




^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-10  1:39                                                                   ` wwang
@ 2013-12-10  1:56                                                                     ` micky
  2013-12-10  8:29                                                                       ` Samuel Ortiz
                                                                                         ` (2 more replies)
  0 siblings, 3 replies; 63+ messages in thread
From: micky @ 2013-12-10  1:56 UTC (permalink / raw)
  To: wwang, Samuel Ortiz, Francis Moreau
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Francis:
On 12/10/2013 09:39 AM, wwang wrote:
> which is based on Thomas' patch. 

Could you help us test this patch? It disables the IRQ during suspend.

 From 6b2bd6d85780bfd8d4fe5289aee1b09dd655d2d4 Mon Sep 17 00:00:00 2001
From: Micky Ching <micky_ching@realsil.com.cn>
Date: Thu, 5 Dec 2013 16:44:19 +0800
Subject: [PATCH] mfd: rtsx: fix pci remove panic while resuming

Under some conditions, rtsx_pci is removed while resuming from suspend.
If a card insert/remove interrupt is triggered during the removal, the
kernel panics, because the card detect work reads a PCI register of a
device that no longer exists.

Signed-off-by: Micky Ching <micky_ching@realsil.com.cn>
---
  drivers/mfd/rtsx_pcr.c |   21 ++++++++++++++++-----
  1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/mfd/rtsx_pcr.c b/drivers/mfd/rtsx_pcr.c
index 11e20af..efdd9b9 100644
--- a/drivers/mfd/rtsx_pcr.c
+++ b/drivers/mfd/rtsx_pcr.c
@@ -1228,14 +1228,14 @@ static void rtsx_pci_remove(struct pci_dev *pcidev)

      pcr->remove_pci = true;

-    cancel_delayed_work(&pcr->carddet_work);
-    cancel_delayed_work(&pcr->idle_work);
+    cancel_delayed_work_sync(&pcr->carddet_work);
+    cancel_delayed_work_sync(&pcr->idle_work);

      mfd_remove_devices(&pcidev->dev);

      dma_free_coherent(&(pcr->pci->dev), RTSX_RESV_BUF_LEN,
              pcr->rtsx_resv_buf, pcr->rtsx_resv_buf_addr);
-    free_irq(pcr->irq, (void *)pcr);
+    free_irq(pcr->irq, pcr);
      if (pcr->msi_en)
          pci_disable_msi(pcr->pci);
      iounmap(pcr->remap_addr);
@@ -1268,8 +1268,13 @@ static int rtsx_pci_suspend(struct pci_dev *pcidev, pm_message_t state)
      handle = pci_get_drvdata(pcidev);
      pcr = handle->pcr;

-    cancel_delayed_work(&pcr->carddet_work);
-    cancel_delayed_work(&pcr->idle_work);
+    spin_lock_irq(&pcr->lock);
+    rtsx_pci_writel(pcr, RTSX_BIER, 0);
+    pcr->bier = 0;
+    spin_unlock_irq(&pcr->lock);
+    cancel_delayed_work_sync(&pcr->carddet_work);
+    cancel_delayed_work_sync(&pcr->idle_work);
+    free_irq(pcr->irq, pcr);

      mutex_lock(&pcr->pcr_mutex);

@@ -1295,6 +1300,12 @@ static int rtsx_pci_resume(struct pci_dev *pcidev)
      handle = pci_get_drvdata(pcidev);
      pcr = handle->pcr;

+    ret = rtsx_pci_acquire_irq(pcr);
+    if (ret < 0)
+        return ret;
+    synchronize_irq(pcr->irq);
+    rtsx_pci_enable_bus_int(pcr);
+
      mutex_lock(&pcr->pcr_mutex);

      pci_set_power_state(pcidev, PCI_D0);
-- 
1.7.9.5
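
The switch from cancel_delayed_work() to cancel_delayed_work_sync() in
the patch above matters because the former does not wait for a callback
that is already running, while the latter returns only once it has
finished. The following is a minimal userspace sketch of that
difference, using pthreads as a stand-in for the workqueue. All names in
it (fake_dev, carddet_like_work, cancel_nowait, cancel_and_wait) are
hypothetical and are not part of the rtsx driver or the kernel API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical stand-in for per-device driver state (not struct rtsx_pcr). */
struct fake_dev {
    pthread_t worker;
    atomic_bool started;   /* the callback has begun executing */
    atomic_bool finished;  /* the callback has returned */
};

static void sleep_ms(long ms)
{
    struct timespec ts = { .tv_sec = ms / 1000,
                           .tv_nsec = (ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);
}

/* Plays the role of a work callback that is already running at cancel time. */
static void *carddet_like_work(void *arg)
{
    struct fake_dev *dev = arg;

    atomic_store(&dev->started, true);
    sleep_ms(100);                       /* pretend to poll the hardware */
    atomic_store(&dev->finished, true);  /* last access to *dev */
    return NULL;
}

/* Like cancel_delayed_work(): returns without waiting for a running callback. */
static void cancel_nowait(struct fake_dev *dev)
{
    (void)dev;
}

/* Like cancel_delayed_work_sync(): returns only after the callback is done. */
static void cancel_and_wait(struct fake_dev *dev)
{
    pthread_join(dev->worker, NULL);
}

/* Returns true if the callback had finished by the time cancel returned. */
static bool callback_done_after_cancel(bool sync)
{
    struct fake_dev dev;
    bool done;
    int rc;

    atomic_init(&dev.started, false);
    atomic_init(&dev.finished, false);
    rc = pthread_create(&dev.worker, NULL, carddet_like_work, &dev);
    assert(rc == 0);
    while (!atomic_load(&dev.started))
        sleep_ms(1);                     /* wait until the callback is mid-flight */

    if (sync)
        cancel_and_wait(&dev);
    else
        cancel_nowait(&dev);
    done = atomic_load(&dev.finished);   /* would freeing dev be safe here? */

    if (!sync)
        pthread_join(dev.worker, NULL);  /* clean up before dev leaves scope */
    return done;
}
```

Calling callback_done_after_cancel(true) reports that the callback has
completed, so freeing the device state afterwards would be safe. With
false, the callback is normally still in flight when the cancel helper
returns, which is the window in which a remove path could free memory
the callback is about to touch. Build with -pthread on toolchains that
need it.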


-- 
Best Regards
Micky.


^ permalink raw reply related	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-10  1:56                                                                     ` micky
@ 2013-12-10  8:29                                                                       ` Samuel Ortiz
  2014-01-10  7:26                                                                         ` Francis Moreau
  2013-12-10 10:50                                                                       ` Francis Moreau
  2013-12-17  8:03                                                                       ` Francis Moreau
  2 siblings, 1 reply; 63+ messages in thread
From: Samuel Ortiz @ 2013-12-10  8:29 UTC (permalink / raw)
  To: micky
  Cc: wwang, Francis Moreau, Thomas Gleixner, Jingoo Han,
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Micky,

On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
> Hi Francis:
> On 12/10/2013 09:39 AM, wwang wrote:
> >which is based on Thomas' patch.
> 
> Can you help us test this patch, we disable irq while suspend here.
I already pushed a patch from Thomas to mfd-fixes that seems to fix the
resume breakage:

https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032

Cheers,
Samuel.

--
Intel Open Source Technology Centre
http://oss.intel.com/

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-09 22:17                                                                 ` Samuel Ortiz
  2013-12-10  1:39                                                                   ` wwang
@ 2013-12-10 10:49                                                                   ` Francis Moreau
  1 sibling, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-12-10 10:49 UTC (permalink / raw)
  To: Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Wei WANG',
	'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi,

On 12/09/2013 11:17 PM, Samuel Ortiz wrote:
> Hi Francis,
> 
> Adding Lee to the Cc list.
> 
> On Tue, Dec 03, 2013 at 09:14:14AM +0100, Francis Moreau wrote:
>> Now that you did the hard work, I hope driver's maintainer/developper
>> will care about this issue.
> I applied Thomas' patch to mfd-fixes.
> Thanks a lot to you and Thomas for that.

Please, don't forget to propagate the fix to the affected stable trees.

Thanks.


^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-10  1:56                                                                     ` micky
  2013-12-10  8:29                                                                       ` Samuel Ortiz
@ 2013-12-10 10:50                                                                       ` Francis Moreau
  2013-12-17  8:03                                                                       ` Francis Moreau
  2 siblings, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2013-12-10 10:50 UTC (permalink / raw)
  To: micky, wwang, Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi,

On 12/10/2013 02:56 AM, micky wrote:
> Hi Francis:
> On 12/10/2013 09:39 AM, wwang wrote:
>> which is based on Thomas' patch. 
> 
> Can you help us test this patch, we disable irq while suspend here.

I'll give it a try tonight.

Thanks.

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-10  1:56                                                                     ` micky
  2013-12-10  8:29                                                                       ` Samuel Ortiz
  2013-12-10 10:50                                                                       ` Francis Moreau
@ 2013-12-17  8:03                                                                       ` Francis Moreau
  2013-12-18  4:05                                                                         ` micky
  2 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-12-17  8:03 UTC (permalink / raw)
  To: micky, wwang, Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi,

On 12/10/2013 02:56 AM, micky wrote:
> Hi Francis:
> On 12/10/2013 09:39 AM, wwang wrote:
>> which is based on Thomas' patch. 
> 
> Can you help us test this patch, we disable irq while suspend here.

This patch doesn't seem to help; it still oopses:

[   29.843910] ------------[ cut here ]------------
[   29.843917] WARNING: CPU: 0 PID: 53 at lib/debugobjects.c:260
debug_print_object+0x83/0xa0()
[   29.843921] ODEBUG: free active (active state 0) object type:
timer_list hint: delayed_work_timer_fn+0x0/0x20
[   29.843972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
coretemp kvm_intel kvm rtsx_pci_ms i915 i2c_algo_bit intel_agp intel_gtt
memstick iTCO_wdt drm_kms_helper crc32c_intel video drm r8169 mei_me mii
thermal agpgart mei wmi iTCO_vendor_support ac i2c_i801 i2c_core battery
evdev button shpchp lpc_ich mperf processor serio_raw microcode ext4
crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage rtsx_pci_sdmmc
mmc_core ahci libahci libata scsi_mod ehci_pci xhci_hcd ehci_hcd
rtsx_pci usbcore usb_common
[   29.844004] CPU: 0 PID: 53 Comm: kworker/0:1 Not tainted
3.11.0-rc2-ARCH #66
[   29.844006] Hardware name: CLEVO CO.                        W55xEU
                       /W55xEU                          , BIOS 4.6.5
03/05/2013
[   29.844010] Workqueue: kacpi_hotplug hotplug_event_work
[   29.844012]  0000000000000009 ffff880407a95a18 ffffffff81459fe9
ffff880407a95a60
[   29.844014]  ffff880407a95a50 ffffffff8104dc7d ffff880406b896b8
ffffffff81836fc0
[   29.844017]  ffffffff81701358 ffffffff81b2f9b0 0000000000000003
ffff880407a95ab0
[   29.844019] Call Trace:
[   29.844024]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
[   29.844027]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
[   29.844029]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
[   29.844032]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
[   29.844034]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
[   29.844037]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
[   29.844044]  [<ffffffffa00d8989>] ? rtsx_pci_remove+0x119/0x1d0
[rtsx_pci]
[   29.844046]  [<ffffffff81262619>] debug_check_no_obj_freed+0x19/0x20
[   29.844049]  [<ffffffff8116f861>] kfree+0x191/0x210
[   29.844054]  [<ffffffff813819e0>] ? pcibios_disable_device+0x20/0x30
[   29.844066]  [<ffffffffa00d8989>] ? rtsx_pci_remove+0x119/0x1d0
[rtsx_pci]
[   29.844071]  [<ffffffffa00d8989>] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
[   29.844075]  [<ffffffff8128004b>] pci_device_remove+0x3b/0xb0
[   29.844079]  [<ffffffff8132c92f>] __device_release_driver+0x7f/0xf0
[   29.844082]  [<ffffffff8132c9c3>] device_release_driver+0x23/0x30
[   29.844084]  [<ffffffff8132c194>] bus_remove_device+0xf4/0x170
[   29.844087]  [<ffffffff81328c55>] device_del+0x135/0x1d0
[   29.844089]  [<ffffffff8127ae24>] pci_stop_bus_device+0x94/0xa0
[   29.844091]  [<ffffffff8127af32>]
pci_stop_and_remove_bus_device+0x12/0x20
[   29.844094]  [<ffffffff81297466>] disable_slot+0x76/0xd0
[   29.844096]  [<ffffffff81297568>] acpiphp_check_bridge+0xa8/0xd0
[   29.844099]  [<ffffffff81297c8a>] hotplug_event+0xfa/0x210
[   29.844101]  [<ffffffff81297dc7>] hotplug_event_work+0x27/0x60
[   29.844104]  [<ffffffff8106c178>] process_one_work+0x178/0x470
[   29.844106]  [<ffffffff8106cb91>] worker_thread+0x121/0x3a0
[   29.844109]  [<ffffffff8106ca70>] ? manage_workers.isra.21+0x2b0/0x2b0
[   29.844111]  [<ffffffff81073a50>] kthread+0xc0/0xd0
[   29.844114]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[   29.844117]  [<ffffffff814688ec>] ret_from_fork+0x7c/0xb0
[   29.844119]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
[   29.844120] ---[ end trace ed9751fe6c0cd9e3 ]---
[   29.844137] kobject: '0000:03:00.0' (ffff880407a010a8):
kobject_uevent_env
[   29.844150] kobject: '0000:03:00.0' (ffff880407a010a8):
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.0'
[   29.844162] kobject: '0000:03:00.0' (ffff880407a010a8): kobject_cleanup
[   29.844164] kobject: '0000:03:00.0' (ffff880407a010a8): calling ktype
release
[   29.844166] kobject: '0000:03:00.0': free name
[   29.844367] kobject: 'rx-0' (ffff8804067ae010): kobject_cleanup
[   29.844370] kobject: 'rx-0' (ffff8804067ae010): auto cleanup 'remove'
event
[   29.844371] kobject: 'rx-0' (ffff8804067ae010): kobject_uevent_env
[   29.844374] kobject: 'rx-0' (ffff8804067ae010): fill_kobj_path: path
= '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2/queues/rx-0'
[   29.844379] kobject: 'rx-0' (ffff8804067ae010): auto cleanup kobject_del
[   29.844383] kobject: 'rx-0' (ffff8804067ae010): calling ktype release
[   29.844384] kobject: 'rx-0': free name
[   29.844389] kobject: 'tx-0' (ffff880407205e18): kobject_cleanup
[   29.844390] kobject: 'tx-0' (ffff880407205e18): auto cleanup 'remove'
event
[   29.844391] kobject: 'tx-0' (ffff880407205e18): kobject_uevent_env
[   29.844393] kobject: 'tx-0' (ffff880407205e18): fill_kobj_path: path
= '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2/queues/tx-0'
[   29.844396] kobject: 'tx-0' (ffff880407205e18): auto cleanup kobject_del
[   29.844398] kobject: 'tx-0' (ffff880407205e18): calling ktype release
[   29.844399] kobject: 'tx-0': free name
[   29.844400] kobject: 'queues' (ffff880406216c78): kobject_cleanup
[   29.844401] kobject: 'queues' (ffff880406216c78): auto cleanup
kobject_del
[   29.844403] kobject: 'queues' (ffff880406216c78): calling ktype release
[   29.844404] kobject: 'queues' (ffff880406216c78): kset_release
[   29.844405] kobject: 'queues': free name
[   29.844438] kobject: 'enp3s0f2' (ffff880406be2410): kobject_uevent_env
[   29.844440] kobject: 'enp3s0f2' (ffff880406be2410): fill_kobj_path:
path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2'
[   29.844445] kobject: 'net' (ffff880406216cc0): kobject_cleanup
[   29.844446] kobject: 'net' (ffff880406216cc0): auto cleanup kobject_del
[   29.844447] kobject: 'net' (ffff880406216cc0): calling ktype release
[   29.844448] kobject: 'net': free name
[   29.890009] kobject: '44' (ffff880407327408): kobject_cleanup
[   29.890014] kobject: '44' (ffff880407327408): calling ktype release
[   29.890015] kobject: '44': free name
[   29.890018] kobject: 'msi_irqs' (ffff880406216d38): kobject_cleanup
[   29.890019] kobject: 'msi_irqs' (ffff880406216d38): auto cleanup
kobject_del
[   29.890022] kobject: 'msi_irqs' (ffff880406216d38): calling ktype release
[   29.890024] kobject: 'msi_irqs' (ffff880406216d38): kset_release
[   29.890026] kobject: 'msi_irqs': free name
[   29.890121] kobject: 'enp3s0f2' (ffff880406be2410): kobject_cleanup
[   29.890123] kobject: 'enp3s0f2' (ffff880406be2410): calling ktype release
[   29.890127] kobject: 'enp3s0f2': free name
[   29.899887] kobject: '0000:03:00.2' (ffff880407a020a8):
kobject_uevent_env
[   29.899892] kobject: '0000:03:00.2' (ffff880407a020a8):
fill_kobj_path: path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2'
[   29.899914] kobject: '0000:03:00.2' (ffff880407a020a8): kobject_cleanup
[   29.899915] kobject: '0000:03:00.2' (ffff880407a020a8): calling ktype
release
[   29.899918] kobject: '0000:03:00.2': free name
[   29.899966] kobject: 'device:10' (ffff8804079731f0): kobject_uevent_env
[   29.899969] kobject: 'device:10' (ffff8804079731f0): fill_kobj_path:
path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:10'
[   29.899976] ACPI: Device does not support D3cold
[   29.899978] kobject: 'device:10' (ffff8804079731f0): kobject_cleanup
[   29.899979] kobject: 'device:10' (ffff8804079731f0): calling ktype
release
[   29.899982] kobject: 'device:10': free name
[   29.900025] kobject: 'device:11' (ffff8804079739f0): kobject_uevent_env
[   29.900028] kobject: 'device:11' (ffff8804079739f0): fill_kobj_path:
path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:11'
[   29.900033] ACPI: Device does not support D3cold
[   29.900035] kobject: 'device:11' (ffff8804079739f0): kobject_cleanup
[   29.900036] kobject: 'device:11' (ffff8804079739f0): calling ktype
release
[   29.900038] kobject: 'device:11': free name
[   29.900105] kobject: 'device:12' (ffff8804079741f0): kobject_uevent_env
[   29.900106] kobject: 'device:12' (ffff8804079741f0): fill_kobj_path:
path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:12'
[   29.900112] ACPI: Device does not support D3cold
[   29.900114] kobject: 'device:12' (ffff8804079741f0): kobject_cleanup
[   29.900114] kobject: 'device:12' (ffff8804079741f0): calling ktype
release
[   29.900117] kobject: 'device:12': free name
[   29.900163] kobject: 'device:14' (ffff8804079751f0): kobject_uevent_env
[   29.900164] kobject: 'device:14' (ffff8804079751f0): fill_kobj_path:
path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:13/device:14'
[   29.900171] ACPI: Device does not support D3cold
[   29.900174] kobject: 'device:14' (ffff8804079751f0): kobject_cleanup
[   29.900175] kobject: 'device:14' (ffff8804079751f0): calling ktype
release
[   29.900178] kobject: 'device:14': free name
[   30.404233] irq 16: nobody cared (try booting with the "irqpoll" option)
[   30.404272] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W
3.11.0-rc2-ARCH #66
[   30.404274] Hardware name: CLEVO CO.                        W55xEU
                       /W55xEU                          , BIOS 4.6.5
03/05/2013
[   30.404275]  ffff8804078bd38c ffff88041e203e48 ffffffff81459fe9
ffff8804078bd300
[   30.404278]  ffff88041e203e70 ffffffff810d8632 ffff8804078bd300
0000000000000010
[   30.404280]  0000000000000000 ffff88041e203eb0 ffffffff810d8a58
ffffffff8136a882
[   30.404282] Call Trace:
[   30.404284]  <IRQ>  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
[   30.404293]  [<ffffffff810d8632>] __report_bad_irq+0x32/0xd0
[   30.404296]  [<ffffffff810d8a58>] note_interrupt+0x138/0x1f0
[   30.404299]  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
[   30.404302]  [<ffffffff810d6439>] handle_irq_event_percpu+0xf9/0x250
[   30.404304]  [<ffffffff810d65cd>] handle_irq_event+0x3d/0x60
[   30.404306]  [<ffffffff810d95ca>] handle_fasteoi_irq+0x5a/0x100
[   30.404309]  [<ffffffff81004a6e>] handle_irq+0x1e/0x30
[   30.404312]  [<ffffffff8146aafd>] do_IRQ+0x4d/0xc0
[   30.404314]  [<ffffffff8146116d>] common_interrupt+0x6d/0x6d
[   30.404315]  <EOI>  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
[   30.404319]  [<ffffffff8136a878>] ? cpuidle_enter_state+0x48/0xc0
[   30.404321]  [<ffffffff8136a9b9>] cpuidle_idle_call+0xc9/0x280
[   30.404325]  [<ffffffff8100bf6e>] arch_cpu_idle+0xe/0x30
[   30.404328]  [<ffffffff810a1287>] cpu_startup_entry+0x257/0x2d0
[   30.404330]  [<ffffffff8144d404>] rest_init+0x84/0x90
[   30.404333]  [<ffffffff818d9ee1>] start_kernel+0x414/0x420
[   30.404335]  [<ffffffff818d98d6>] ? repair_env_string+0x5c/0x5c
[   30.404337]  [<ffffffff818d9120>] ? early_idt_handlers+0x120/0x120
[   30.404339]  [<ffffffff818d95be>] x86_64_start_reservations+0x2a/0x2c
[   30.404342]  [<ffffffff818d96c8>] x86_64_start_kernel+0x108/0x117
[   30.404343] handlers:
[   30.404363] [<ffffffffa00168f0>] usb_hcd_irq [usbcore]
[   30.404383] Disabling IRQ #16
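
For readers unfamiliar with the first warning in the log above: the
"ODEBUG: free active ... timer_list hint: delayed_work_timer_fn" message
comes from the debugobjects check that runs in kfree(), and it suggests
that the memory being freed in rtsx_pci_remove() still contained an
armed timer belonging to a delayed work. The following is a rough
userspace sketch of that kind of check, not the kernel implementation.
The tracker and the fake_pcr/fake_timer types are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TRACKED 8

/* Hypothetical tracker: addresses of objects that are currently "active". */
static const void *active[MAX_TRACKED];

/* Mark an object active; returns false if the tracker is full. */
static bool obj_activate(const void *obj)
{
    for (size_t i = 0; i < MAX_TRACKED; i++) {
        if (active[i] == NULL) {
            active[i] = obj;
            return true;
        }
    }
    return false;
}

/* Mark an object inactive (a no-op if it was not tracked). */
static void obj_deactivate(const void *obj)
{
    for (size_t i = 0; i < MAX_TRACKED; i++)
        if (active[i] == obj)
            active[i] = NULL;
}

/* Count active objects that live inside the block about to be freed. */
static int active_objs_in_range(const void *start, size_t size)
{
    uintptr_t lo = (uintptr_t)start, hi = lo + size;
    int hits = 0;

    for (size_t i = 0; i < MAX_TRACKED; i++) {
        uintptr_t p = (uintptr_t)active[i];

        if (active[i] != NULL && p >= lo && p < hi)
            hits++;
    }
    return hits;
}

/* Hypothetical device state with an embedded timer, as a delayed work has. */
struct fake_timer { int expires; };
struct fake_pcr {
    int irq;
    struct fake_timer carddet_timer;
};

/* Returns how many warnings a free of the device state would trigger. */
static int warnings_on_free(bool cancel_first)
{
    struct fake_pcr pcr = { 0 };
    int hits;

    if (!obj_activate(&pcr.carddet_timer))   /* delayed work queued */
        return -1;
    if (cancel_first)
        obj_deactivate(&pcr.carddet_timer);  /* cancelled before the free */
    hits = active_objs_in_range(&pcr, sizeof(pcr));
    obj_deactivate(&pcr.carddet_timer);      /* keep the demo tracker clean */
    return hits;
}
```

In this sketch, warnings_on_free(false) finds one active object inside
the block, which corresponds to the warning in the log, while
warnings_on_free(true) finds none.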

^ permalink raw reply	[flat|nested] 63+ messages in thread

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-17  8:03                                                                       ` Francis Moreau
@ 2013-12-18  4:05                                                                         ` micky
  2013-12-18  8:12                                                                           ` Francis Moreau
  0 siblings, 1 reply; 63+ messages in thread
From: micky @ 2013-12-18  4:05 UTC (permalink / raw)
  To: Francis Moreau, wwang, Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi:

It seems that the card reader was removed during suspend or resume, is
that right? Or did you remove it by hand?
With Thomas' patch applied, do the card reader and its driver still
exist after resume?
If not, I would also like to know which function is called first,
rtsx_pci_resume or rtsx_pci_remove. Can you determine that?
Also, IRQ 16 does not seem to be handled by the rtsx_pci driver, so with
Thomas' patch, does anything still go wrong?

Sorry for so many questions, but the answers may help us find the bug. Thanks.
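
As background to the IRQ 16 remark: the "irq 16: nobody cared" message
is printed by the generic IRQ layer when almost every recent interrupt
on a line went unhandled, after which the line is disabled. The only
handler listed in the log is usb_hcd_irq. The following is a simplified
sketch of that accounting. The thresholds (more than 99,900 unhandled
out of 100,000) are quoted from memory of kernel/irq/spurious.c and
should be checked against the kernel version in use. The real code also
takes timing into account, which is omitted here, and every name in the
sketch is hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define WINDOW          100000u  /* interrupts per accounting window (assumed) */
#define UNHANDLED_LIMIT  99900u  /* unhandled count that trips the check (assumed) */

/* Hypothetical per-line bookkeeping, loosely modelled on struct irq_desc. */
struct irq_line {
    unsigned int count;      /* interrupts seen in the current window */
    unsigned int unhandled;  /* of those, how many no handler claimed */
    bool disabled;           /* set once the line is given up on */
};

/* Account for one interrupt; 'handled' is false if every handler declined it. */
static void note_irq(struct irq_line *line, bool handled)
{
    if (line->disabled)
        return;
    if (!handled)
        line->unhandled++;
    if (++line->count < WINDOW)
        return;

    /* End of window: disable the line if nearly nothing was handled. */
    if (line->unhandled > UNHANDLED_LIMIT)
        line->disabled = true;
    line->count = 0;
    line->unhandled = 0;
}

/*
 * Feed one full window of interrupts. Every handled_every-th interrupt is
 * claimed by a handler; 0 means none are claimed at all.
 */
static bool line_gets_disabled(unsigned int handled_every)
{
    struct irq_line line = { 0 };

    for (unsigned int i = 0; i < WINDOW; i++) {
        bool handled = handled_every != 0 && i % handled_every == 0;

        note_irq(&line, handled);
    }
    return line.disabled;
}
```

A storm in which no handler claims anything, line_gets_disabled(0), ends
with the line disabled, whereas a line on which every second interrupt
is handled, line_gets_disabled(2), stays enabled.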

Best Regards.

On 12/17/2013 04:03 PM, Francis Moreau wrote:
> Hi,
>
> On 12/10/2013 02:56 AM, micky wrote:
>> Hi Francis:
>> On 12/10/2013 09:39 AM, wwang wrote:
>>> which is based on Thomas' patch.
>> Can you help us test this patch, we disable irq while suspend here.
> This patch doesn't seem to help, it still oops:
>
> [   29.843910] ------------[ cut here ]------------
> [   29.843917] WARNING: CPU: 0 PID: 53 at lib/debugobjects.c:260
> debug_print_object+0x83/0xa0()
> [   29.843921] ODEBUG: free active (active state 0) object type:
> timer_list hint: delayed_work_timer_fn+0x0/0x20
> [   29.843972] Modules linked in: x86_pkg_temp_thermal intel_powerclamp
> coretemp kvm_intel kvm rtsx_pci_ms i915 i2c_algo_bit intel_agp intel_gtt
> memstick iTCO_wdt drm_kms_helper crc32c_intel video drm r8169 mei_me mii
> thermal agpgart mei wmi iTCO_vendor_support ac i2c_i801 i2c_core battery
> evdev button shpchp lpc_ich mperf processor serio_raw microcode ext4
> crc16 mbcache jbd2 sr_mod cdrom sd_mod usb_storage rtsx_pci_sdmmc
> mmc_core ahci libahci libata scsi_mod ehci_pci xhci_hcd ehci_hcd
> rtsx_pci usbcore usb_common
> [   29.844004] CPU: 0 PID: 53 Comm: kworker/0:1 Not tainted
> 3.11.0-rc2-ARCH #66
> [   29.844006] Hardware name: CLEVO CO.                        W55xEU
>                         /W55xEU                          , BIOS 4.6.5
> 03/05/2013
> [   29.844010] Workqueue: kacpi_hotplug hotplug_event_work
> [   29.844012]  0000000000000009 ffff880407a95a18 ffffffff81459fe9
> ffff880407a95a60
> [   29.844014]  ffff880407a95a50 ffffffff8104dc7d ffff880406b896b8
> ffffffff81836fc0
> [   29.844017]  ffffffff81701358 ffffffff81b2f9b0 0000000000000003
> ffff880407a95ab0
> [   29.844019] Call Trace:
> [   29.844024]  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
> [   29.844027]  [<ffffffff8104dc7d>] warn_slowpath_common+0x7d/0xa0
> [   29.844029]  [<ffffffff8104dcec>] warn_slowpath_fmt+0x4c/0x50
> [   29.844032]  [<ffffffff81261433>] debug_print_object+0x83/0xa0
> [   29.844034]  [<ffffffff8106b820>] ? queue_work_on+0x50/0x50
> [   29.844037]  [<ffffffff81261c2b>] __debug_check_no_obj_freed+0x1fb/0x240
> [   29.844044]  [<ffffffffa00d8989>] ? rtsx_pci_remove+0x119/0x1d0
> [rtsx_pci]
> [   29.844046]  [<ffffffff81262619>] debug_check_no_obj_freed+0x19/0x20
> [   29.844049]  [<ffffffff8116f861>] kfree+0x191/0x210
> [   29.844054]  [<ffffffff813819e0>] ? pcibios_disable_device+0x20/0x30
> [   29.844066]  [<ffffffffa00d8989>] ? rtsx_pci_remove+0x119/0x1d0
> [rtsx_pci]
> [   29.844071]  [<ffffffffa00d8989>] rtsx_pci_remove+0x119/0x1d0 [rtsx_pci]
> [   29.844075]  [<ffffffff8128004b>] pci_device_remove+0x3b/0xb0
> [   29.844079]  [<ffffffff8132c92f>] __device_release_driver+0x7f/0xf0
> [   29.844082]  [<ffffffff8132c9c3>] device_release_driver+0x23/0x30
> [   29.844084]  [<ffffffff8132c194>] bus_remove_device+0xf4/0x170
> [   29.844087]  [<ffffffff81328c55>] device_del+0x135/0x1d0
> [   29.844089]  [<ffffffff8127ae24>] pci_stop_bus_device+0x94/0xa0
> [   29.844091]  [<ffffffff8127af32>]
> pci_stop_and_remove_bus_device+0x12/0x20
> [   29.844094]  [<ffffffff81297466>] disable_slot+0x76/0xd0
> [   29.844096]  [<ffffffff81297568>] acpiphp_check_bridge+0xa8/0xd0
> [   29.844099]  [<ffffffff81297c8a>] hotplug_event+0xfa/0x210
> [   29.844101]  [<ffffffff81297dc7>] hotplug_event_work+0x27/0x60
> [   29.844104]  [<ffffffff8106c178>] process_one_work+0x178/0x470
> [   29.844106]  [<ffffffff8106cb91>] worker_thread+0x121/0x3a0
> [   29.844109]  [<ffffffff8106ca70>] ? manage_workers.isra.21+0x2b0/0x2b0
> [   29.844111]  [<ffffffff81073a50>] kthread+0xc0/0xd0
> [   29.844114]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
> [   29.844117]  [<ffffffff814688ec>] ret_from_fork+0x7c/0xb0
> [   29.844119]  [<ffffffff81073990>] ? kthread_create_on_node+0x120/0x120
> [   29.844120] ---[ end trace ed9751fe6c0cd9e3 ]---
> [   29.844137] kobject: '0000:03:00.0' (ffff880407a010a8):
> kobject_uevent_env
> [   29.844150] kobject: '0000:03:00.0' (ffff880407a010a8):
> fill_kobj_path: path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.0'
> [   29.844162] kobject: '0000:03:00.0' (ffff880407a010a8): kobject_cleanup
> [   29.844164] kobject: '0000:03:00.0' (ffff880407a010a8): calling ktype
> release
> [   29.844166] kobject: '0000:03:00.0': free name
> [   29.844367] kobject: 'rx-0' (ffff8804067ae010): kobject_cleanup
> [   29.844370] kobject: 'rx-0' (ffff8804067ae010): auto cleanup 'remove'
> event
> [   29.844371] kobject: 'rx-0' (ffff8804067ae010): kobject_uevent_env
> [   29.844374] kobject: 'rx-0' (ffff8804067ae010): fill_kobj_path: path
> = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2/queues/rx-0'
> [   29.844379] kobject: 'rx-0' (ffff8804067ae010): auto cleanup kobject_del
> [   29.844383] kobject: 'rx-0' (ffff8804067ae010): calling ktype release
> [   29.844384] kobject: 'rx-0': free name
> [   29.844389] kobject: 'tx-0' (ffff880407205e18): kobject_cleanup
> [   29.844390] kobject: 'tx-0' (ffff880407205e18): auto cleanup 'remove'
> event
> [   29.844391] kobject: 'tx-0' (ffff880407205e18): kobject_uevent_env
> [   29.844393] kobject: 'tx-0' (ffff880407205e18): fill_kobj_path: path
> = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2/queues/tx-0'
> [   29.844396] kobject: 'tx-0' (ffff880407205e18): auto cleanup kobject_del
> [   29.844398] kobject: 'tx-0' (ffff880407205e18): calling ktype release
> [   29.844399] kobject: 'tx-0': free name
> [   29.844400] kobject: 'queues' (ffff880406216c78): kobject_cleanup
> [   29.844401] kobject: 'queues' (ffff880406216c78): auto cleanup
> kobject_del
> [   29.844403] kobject: 'queues' (ffff880406216c78): calling ktype release
> [   29.844404] kobject: 'queues' (ffff880406216c78): kset_release
> [   29.844405] kobject: 'queues': free name
> [   29.844438] kobject: 'enp3s0f2' (ffff880406be2410): kobject_uevent_env
> [   29.844440] kobject: 'enp3s0f2' (ffff880406be2410): fill_kobj_path:
> path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2/net/enp3s0f2'
> [   29.844445] kobject: 'net' (ffff880406216cc0): kobject_cleanup
> [   29.844446] kobject: 'net' (ffff880406216cc0): auto cleanup kobject_del
> [   29.844447] kobject: 'net' (ffff880406216cc0): calling ktype release
> [   29.844448] kobject: 'net': free name
> [   29.890009] kobject: '44' (ffff880407327408): kobject_cleanup
> [   29.890014] kobject: '44' (ffff880407327408): calling ktype release
> [   29.890015] kobject: '44': free name
> [   29.890018] kobject: 'msi_irqs' (ffff880406216d38): kobject_cleanup
> [   29.890019] kobject: 'msi_irqs' (ffff880406216d38): auto cleanup
> kobject_del
> [   29.890022] kobject: 'msi_irqs' (ffff880406216d38): calling ktype release
> [   29.890024] kobject: 'msi_irqs' (ffff880406216d38): kset_release
> [   29.890026] kobject: 'msi_irqs': free name
> [   29.890121] kobject: 'enp3s0f2' (ffff880406be2410): kobject_cleanup
> [   29.890123] kobject: 'enp3s0f2' (ffff880406be2410): calling ktype release
> [   29.890127] kobject: 'enp3s0f2': free name
> [   29.899887] kobject: '0000:03:00.2' (ffff880407a020a8):
> kobject_uevent_env
> [   29.899892] kobject: '0000:03:00.2' (ffff880407a020a8):
> fill_kobj_path: path = '/devices/pci0000:00/0000:00:1c.3/0000:03:00.2'
> [   29.899914] kobject: '0000:03:00.2' (ffff880407a020a8): kobject_cleanup
> [   29.899915] kobject: '0000:03:00.2' (ffff880407a020a8): calling ktype
> release
> [   29.899918] kobject: '0000:03:00.2': free name
> [   29.899966] kobject: 'device:10' (ffff8804079731f0): kobject_uevent_env
> [   29.899969] kobject: 'device:10' (ffff8804079731f0): fill_kobj_path:
> path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:10'
> [   29.899976] ACPI: Device does not support D3cold
> [   29.899978] kobject: 'device:10' (ffff8804079731f0): kobject_cleanup
> [   29.899979] kobject: 'device:10' (ffff8804079731f0): calling ktype
> release
> [   29.899982] kobject: 'device:10': free name
> [   29.900025] kobject: 'device:11' (ffff8804079739f0): kobject_uevent_env
> [   29.900028] kobject: 'device:11' (ffff8804079739f0): fill_kobj_path:
> path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:11'
> [   29.900033] ACPI: Device does not support D3cold
> [   29.900035] kobject: 'device:11' (ffff8804079739f0): kobject_cleanup
> [   29.900036] kobject: 'device:11' (ffff8804079739f0): calling ktype
> release
> [   29.900038] kobject: 'device:11': free name
> [   29.900105] kobject: 'device:12' (ffff8804079741f0): kobject_uevent_env
> [   29.900106] kobject: 'device:12' (ffff8804079741f0): fill_kobj_path:
> path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:0f/device:12'
> [   29.900112] ACPI: Device does not support D3cold
> [   29.900114] kobject: 'device:12' (ffff8804079741f0): kobject_cleanup
> [   29.900114] kobject: 'device:12' (ffff8804079741f0): calling ktype
> release
> [   29.900117] kobject: 'device:12': free name
> [   29.900163] kobject: 'device:14' (ffff8804079751f0): kobject_uevent_env
> [   29.900164] kobject: 'device:14' (ffff8804079751f0): fill_kobj_path:
> path = '/devices/LNXSYSTM:00/device:00/PNP0A08:00/device:13/device:14'
> [   29.900171] ACPI: Device does not support D3cold
> [   29.900174] kobject: 'device:14' (ffff8804079751f0): kobject_cleanup
> [   29.900175] kobject: 'device:14' (ffff8804079751f0): calling ktype
> release
> [   29.900178] kobject: 'device:14': free name
> [   30.404233] irq 16: nobody cared (try booting with the "irqpoll" option)
> [   30.404272] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W
> 3.11.0-rc2-ARCH #66
> [   30.404274] Hardware name: CLEVO CO.                        W55xEU
>                         /W55xEU                          , BIOS 4.6.5
> 03/05/2013
> [   30.404275]  ffff8804078bd38c ffff88041e203e48 ffffffff81459fe9
> ffff8804078bd300
> [   30.404278]  ffff88041e203e70 ffffffff810d8632 ffff8804078bd300
> 0000000000000010
> [   30.404280]  0000000000000000 ffff88041e203eb0 ffffffff810d8a58
> ffffffff8136a882
> [   30.404282] Call Trace:
> [   30.404284]  <IRQ>  [<ffffffff81459fe9>] dump_stack+0x54/0x8d
> [   30.404293]  [<ffffffff810d8632>] __report_bad_irq+0x32/0xd0
> [   30.404296]  [<ffffffff810d8a58>] note_interrupt+0x138/0x1f0
> [   30.404299]  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
> [   30.404302]  [<ffffffff810d6439>] handle_irq_event_percpu+0xf9/0x250
> [   30.404304]  [<ffffffff810d65cd>] handle_irq_event+0x3d/0x60
> [   30.404306]  [<ffffffff810d95ca>] handle_fasteoi_irq+0x5a/0x100
> [   30.404309]  [<ffffffff81004a6e>] handle_irq+0x1e/0x30
> [   30.404312]  [<ffffffff8146aafd>] do_IRQ+0x4d/0xc0
> [   30.404314]  [<ffffffff8146116d>] common_interrupt+0x6d/0x6d
> [   30.404315]  <EOI>  [<ffffffff8136a882>] ? cpuidle_enter_state+0x52/0xc0
> [   30.404319]  [<ffffffff8136a878>] ? cpuidle_enter_state+0x48/0xc0
> [   30.404321]  [<ffffffff8136a9b9>] cpuidle_idle_call+0xc9/0x280
> [   30.404325]  [<ffffffff8100bf6e>] arch_cpu_idle+0xe/0x30
> [   30.404328]  [<ffffffff810a1287>] cpu_startup_entry+0x257/0x2d0
> [   30.404330]  [<ffffffff8144d404>] rest_init+0x84/0x90
> [   30.404333]  [<ffffffff818d9ee1>] start_kernel+0x414/0x420
> [   30.404335]  [<ffffffff818d98d6>] ? repair_env_string+0x5c/0x5c
> [   30.404337]  [<ffffffff818d9120>] ? early_idt_handlers+0x120/0x120
> [   30.404339]  [<ffffffff818d95be>] x86_64_start_reservations+0x2a/0x2c
> [   30.404342]  [<ffffffff818d96c8>] x86_64_start_kernel+0x108/0x117
> [   30.404343] handlers:
> [   30.404363] [<ffffffffa00168f0>] usb_hcd_irq [usbcore]
> [   30.404383] Disabling IRQ #16
> .
>


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-18  4:05                                                                         ` micky
@ 2013-12-18  8:12                                                                           ` Francis Moreau
  2013-12-20  1:30                                                                             ` micky
  0 siblings, 1 reply; 63+ messages in thread
From: Francis Moreau @ 2013-12-18  8:12 UTC (permalink / raw)
  To: micky, wwang, Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

On 12/18/2013 05:05 AM, micky wrote:
> Hi:
> 
> It seems that the card reader was removed during suspend or resume, is
> that right? Or did you remove it by hand?

Yes, during a suspend/resume cycle.

> With Thomas' patch applied, do the card reader and its driver still
> exist after resume?

I'm not sure but IIRC it's still loaded in the kernel after resuming.

> If not, I also want to know which function is called first,
> rtsx_pci_resume or rtsx_pci_remove. Can you determine that?
> Also, IRQ 16 does not seem to be handled by the rtsx_pci driver, so with
> Thomas' patch, is anything still going wrong?
> 

No idea, I'm simply an unfortunate user of that driver.

Isn't the information you're asking for already available in the
previous posts?

Thanks.

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-18  8:12                                                                           ` Francis Moreau
@ 2013-12-20  1:30                                                                             ` micky
  2013-12-20  2:28                                                                               ` Jingoo Han
  0 siblings, 1 reply; 63+ messages in thread
From: micky @ 2013-12-20  1:30 UTC (permalink / raw)
  To: Francis Moreau, wwang, Samuel Ortiz
  Cc: Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Francis,

We are trying to set up the same environment as yours, so it may take
some time to solve this problem, possibly until next month.

Best Regards.

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-20  1:30                                                                             ` micky
@ 2013-12-20  2:28                                                                               ` Jingoo Han
  0 siblings, 0 replies; 63+ messages in thread
From: Jingoo Han @ 2013-12-20  2:28 UTC (permalink / raw)
  To: 'micky', 'Francis Moreau', 'wwang',
	'Samuel Ortiz'
  Cc: 'Thomas Gleixner', 'Chris Ball',
	'Rafael J. Wysocki', 'Borislav Petkov',
	'LKML', 'Lee Jones', 'Jingoo Han'

On Friday, December 20, 2013 10:31 AM, micky wrote:
> 
> Hi Francis,
> 
> We are trying to set up the same environment as yours, so it may take
> some time to solve this problem, possibly until next month.

Yes, in this case the problem needs to be reproduced first.
Then it can be debugged iteratively.

Best regards,
Jingoo Han


* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2013-12-10  8:29                                                                       ` Samuel Ortiz
@ 2014-01-10  7:26                                                                         ` Francis Moreau
  2014-01-10  9:16                                                                           ` micky
  2014-01-10  9:52                                                                           ` Samuel Ortiz
  0 siblings, 2 replies; 63+ messages in thread
From: Francis Moreau @ 2014-01-10  7:26 UTC (permalink / raw)
  To: Samuel Ortiz
  Cc: micky, wwang, Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi.

On 12/10/2013 09:29 AM, Samuel Ortiz wrote:
> Hi Micky,
> 
> On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
>> Hi Francis:
>> On 12/10/2013 09:39 AM, wwang wrote:
>>> which is based on Thomas' patch.
>>
>> Can you help us test this patch? We disable the IRQ during suspend here.
> I already pushed a patch from Thomas to mfd-fixes that seems to fix the
> resume breakage:
> 
> https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032
> 

I still can't see any trace of this fix in Linus' tree.

Shouldn't this get merged before 3.13 is out?

Thanks

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2014-01-10  7:26                                                                         ` Francis Moreau
@ 2014-01-10  9:16                                                                           ` micky
  2014-01-10  9:52                                                                           ` Samuel Ortiz
  1 sibling, 0 replies; 63+ messages in thread
From: micky @ 2014-01-10  9:16 UTC (permalink / raw)
  To: Francis Moreau, Samuel Ortiz
  Cc: wwang, Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi,
On 01/10/2014 03:26 PM, Francis Moreau wrote:
> I still can't see any trace of this fix in Linus' tree.
>
> Shouldn't this get merged before 3.13 is out?
>
> Thanks
Good, I think it is good to merge. Thanks.

Best Regards.
micky.

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2014-01-10  7:26                                                                         ` Francis Moreau
  2014-01-10  9:16                                                                           ` micky
@ 2014-01-10  9:52                                                                           ` Samuel Ortiz
  2014-01-10 10:07                                                                             ` Francis Moreau
  1 sibling, 1 reply; 63+ messages in thread
From: Samuel Ortiz @ 2014-01-10  9:52 UTC (permalink / raw)
  To: Francis Moreau
  Cc: micky, wwang, Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

Hi Francis,

On Fri, Jan 10, 2014 at 08:26:13AM +0100, Francis Moreau wrote:
> Hi.
> 
> On 12/10/2013 09:29 AM, Samuel Ortiz wrote:
> > Hi Micky,
> > 
> > On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
> >> Hi Francis:
> >> On 12/10/2013 09:39 AM, wwang wrote:
> >>> which is based on Thomas' patch.
> >>
> >> Can you help us test this patch? We disable the IRQ during suspend here.
> > I already pushed a patch from Thomas to mfd-fixes that seems to fix the
> > resume breakage:
> > 
> > https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032
> > 
> 
> I still can't see any trace of this fix in Linus' tree.
> 
> Shouldn't this get merged before 3.13 is out?
Yes, it should. I just sent a pull request to Linus for that.

Cheers,
Samuel.

-- 
Intel Open Source Technology Centre
http://oss.intel.com/

* Re: 3.12: kernel panic when resuming from suspend to RAM (x86_64)
  2014-01-10  9:52                                                                           ` Samuel Ortiz
@ 2014-01-10 10:07                                                                             ` Francis Moreau
  0 siblings, 0 replies; 63+ messages in thread
From: Francis Moreau @ 2014-01-10 10:07 UTC (permalink / raw)
  To: Samuel Ortiz
  Cc: micky, wwang, Thomas Gleixner, Jingoo Han, 'Chris Ball',
	Rafael J. Wysocki, 'Borislav Petkov', 'LKML',
	Lee Jones

On 01/10/2014 10:52 AM, Samuel Ortiz wrote:
> Hi Francis,
> 
> On Fri, Jan 10, 2014 at 08:26:13AM +0100, Francis Moreau wrote:
>> Hi.
>>
>> On 12/10/2013 09:29 AM, Samuel Ortiz wrote:
>>> Hi Micky,
>>>
>>> On Tue, Dec 10, 2013 at 09:56:48AM +0800, micky wrote:
>>>> Hi Francis:
>>>> On 12/10/2013 09:39 AM, wwang wrote:
>>>>> which is based on Thomas' patch.
>>>>
>>>> Can you help us test this patch? We disable the IRQ during suspend here.
>>> I already pushed a patch from Thomas to mfd-fixes that seems to fix the
>>> resume breakage:
>>>
>>> https://git.kernel.org/cgit/linux/kernel/git/sameo/mfd-fixes.git/commit/?id=19e49e445e198197c5e243f92d333d076e23d032
>>>
>>
>> I still can't see any trace of this fix in Linus' tree.
>>
>> Shouldn't this get merged before 3.13 is out?
> Yes, it should. I just sent a pull request to Linus for that.

Thanks. You might consider sending this to the 3.12 stable tree as well.


Thread overview: 63+ messages
2013-11-17  9:42 3.12: kernel panic when resuming from suspend to RAM (x86_64) Francis Moreau
2013-11-17 13:25 ` Borislav Petkov
2013-11-17 15:50   ` Francis Moreau
2013-11-17 16:01     ` Borislav Petkov
2013-11-17 18:02       ` Francis Moreau
2013-11-17 19:53         ` Borislav Petkov
2013-11-17 20:49           ` Francis Moreau
2013-11-17 22:06             ` Borislav Petkov
2013-11-17 22:34               ` Rafael J. Wysocki
2013-11-17 22:46                 ` Borislav Petkov
2013-11-18 12:21                   ` Francis Moreau
2013-11-18 12:20                 ` Francis Moreau
2013-11-18  0:33               ` Kevin Easton
2013-11-18  1:04                 ` Borislav Petkov
2013-11-18  2:43                   ` Kevin Easton
2013-11-18 12:19               ` Francis Moreau
2013-11-18 13:32                 ` Borislav Petkov
2013-11-19 10:01                   ` Francis Moreau
2013-11-19 10:15                     ` Borislav Petkov
2013-11-20  9:45                       ` Francis Moreau
2013-11-20 11:15                         ` Borislav Petkov
2013-11-21  8:22                           ` Francis Moreau
2013-11-21 10:12                             ` Borislav Petkov
2013-11-21 11:17                               ` Jingoo Han
2013-11-21 13:07                                 ` Francis Moreau
2013-11-22  7:43                                 ` Francis Moreau
2013-11-22  9:57                                   ` Francis Moreau
2013-11-22 12:54                                     ` Rafael J. Wysocki
2013-11-22 21:36                                       ` Francis Moreau
2013-11-22 22:08                                         ` Rafael J. Wysocki
2013-11-22 22:27                                           ` Thomas Gleixner
2013-11-24  9:39                                             ` Francis Moreau
2013-11-24 13:31                                               ` Borislav Petkov
2013-11-24 21:06                                               ` Rafael J. Wysocki
2013-11-25  7:42                                                 ` Francis Moreau
2013-11-25 10:47                                                   ` Rafael J. Wysocki
2013-11-29  8:28                                                     ` Francis Moreau
2013-11-29  9:02                                                       ` Thomas Gleixner
2013-11-30 15:07                                                         ` Francis Moreau
2013-11-30 20:17                                                           ` Rafael J. Wysocki
2013-12-01 10:11                                                             ` Francis Moreau
2013-12-01 19:26                                                             ` Francis Moreau
2013-12-02 10:49                                                           ` Thomas Gleixner
2013-12-02 11:20                                                             ` Thomas Gleixner
2013-12-03  8:14                                                               ` Francis Moreau
2013-12-09 19:33                                                                 ` Francis Moreau
2013-12-09 22:27                                                                   ` Samuel Ortiz
2013-12-09 22:17                                                                 ` Samuel Ortiz
2013-12-10  1:39                                                                   ` wwang
2013-12-10  1:56                                                                     ` micky
2013-12-10  8:29                                                                       ` Samuel Ortiz
2014-01-10  7:26                                                                         ` Francis Moreau
2014-01-10  9:16                                                                           ` micky
2014-01-10  9:52                                                                           ` Samuel Ortiz
2014-01-10 10:07                                                                             ` Francis Moreau
2013-12-10 10:50                                                                       ` Francis Moreau
2013-12-17  8:03                                                                       ` Francis Moreau
2013-12-18  4:05                                                                         ` micky
2013-12-18  8:12                                                                           ` Francis Moreau
2013-12-20  1:30                                                                             ` micky
2013-12-20  2:28                                                                               ` Jingoo Han
2013-12-10 10:49                                                                   ` Francis Moreau
2013-11-24  9:42                                           ` Francis Moreau
