linux-kernel.vger.kernel.org archive mirror
* 3.0-rc1: powerpc hangs at Kernel virtual memory layout
@ 2011-05-31 23:50 Christian Kujau
  2011-06-01  0:25 ` Benjamin Herrenschmidt
  2011-06-02  0:16 ` Christian Kujau
  0 siblings, 2 replies; 19+ messages in thread
From: Christian Kujau @ 2011-05-31 23:50 UTC (permalink / raw)
  To: LKML; +Cc: linux ppc dev

Hi,

trying to boot 3.0-rc1 on powerpc32 only progresses until:

  > Kernel virtual memory layout:
  >   * 0xfffcf000..0xfffff000  : fixmap

And then the system hangs, does not respond to keyboard (sysrq does not 
seem to work on this PowerBook G4). But after a while the system reboots 
itself, so I guess the machine panicked but did not print anything on the 
screen.

Full messages (picture), config & (working) dmesg:

   http://nerdbynature.de/bits/3.0-rc1/

I'm currently trying to bisect this, so far I have:

----------------------
git bisect start
# good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
git bisect good 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf
# bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
git bisect bad 55922c9d1b84b89cb946c777fddccb3247e7df2c
# bad: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
git bisect bad c44dead70a841d90ddc01968012f323c33217c9e
# bad: [d93515611bbc70c2fe4db232e5feb448ed8e4cc9] macvlan: fix panic if lowerdev in a bond
git bisect bad d93515611bbc70c2fe4db232e5feb448ed8e4cc9
----------------------

Any ideas?

Thanks,
Christian.
-- 
BOFH excuse #263:

It's stuck in the Web.


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-05-31 23:50 3.0-rc1: powerpc hangs at Kernel virtual memory layout Christian Kujau
@ 2011-06-01  0:25 ` Benjamin Herrenschmidt
  2011-06-01  0:48   ` Christian Kujau
  2011-06-02  0:16 ` Christian Kujau
  1 sibling, 1 reply; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-01  0:25 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, linux ppc dev

On Tue, 2011-05-31 at 16:50 -0700, Christian Kujau wrote:
> Hi,
> 
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
> 
>   > Kernel virtual memory layout:
>   >   * 0xfffcf000..0xfffff000  : fixmap
> 
> And then the system hangs, does not respond to keyboard (sysrq does not 
> seem to work on this PowerBook G4). But after a while the system reboots 
> itself, so I guess the machine panicked but did not print anything on the 
> screen.
> 
> Full messages (picture), config & (working) dmesg:
> 
>    http://nerdbynature.de/bits/3.0-rc1/
> 
> I'm currently trying to bisect this, so far I have:

Hrm, I had it working on a pair of powerbooks yesterday. Can you try
something like "udbg-immortal" on your kernel command line to see if
that makes a difference in the output ?

Cheers,
Ben.

> ----------------------
> git bisect start
> # good: [61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf] Linux 2.6.39
> git bisect good 61c4f2c81c61f73549928dfd9f3e8f26aa36a8cf
> # bad: [55922c9d1b84b89cb946c777fddccb3247e7df2c] Linux 3.0-rc1
> git bisect bad 55922c9d1b84b89cb946c777fddccb3247e7df2c
> # bad: [c44dead70a841d90ddc01968012f323c33217c9e] Merge branch 'usb-next' 
> of git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb-2.6
> git bisect bad c44dead70a841d90ddc01968012f323c33217c9e
> # bad: [d93515611bbc70c2fe4db232e5feb448ed8e4cc9] macvlan: fix panic if 
> lowerdev in a bond
> git bisect bad d93515611bbc70c2fe4db232e5feb448ed8e4cc9
> ----------------------
> 
> Any ideas?
> 
> Thanks,
> Christian.




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-01  0:25 ` Benjamin Herrenschmidt
@ 2011-06-01  0:48   ` Christian Kujau
  2011-06-01  1:08     ` Christian Kujau
  2011-06-01  3:02     ` Christian Kujau
  0 siblings, 2 replies; 19+ messages in thread
From: Christian Kujau @ 2011-06-01  0:48 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: LKML, linux ppc dev

On Wed, 1 Jun 2011 at 10:25, Benjamin Herrenschmidt wrote:
> Hrm, I had it working on a pair of powerbooks yesterday. Can you try
> something like "udbg-immortal" on your kernel command line to see if
> that makes a difference in the output ?

I'll try in a minute.

In the meantime, "git bisect" behaves kinda weird, I don't know what went 
wrong here:

 $ git bisect start
 $ git bisect good         # Linux 2.6.39
 $ git bisect bad v3.0-rc1 # Linux 3.0-rc1
 $ git bisect bad          # c44dead70a...
 $ git bisect bad          # d93515611b..

...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2. 
Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?

Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/
-- 
BOFH excuse #383:

Your processor has taken a ride to Heaven's Gate on the UFO behind Hale-Bopp's comet.


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-01  0:48   ` Christian Kujau
@ 2011-06-01  1:08     ` Christian Kujau
  2011-06-01  3:02     ` Christian Kujau
  1 sibling, 0 replies; 19+ messages in thread
From: Christian Kujau @ 2011-06-01  1:08 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: LKML, linux ppc dev

On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> On Wed, 1 Jun 2011 at 10:25, Benjamin Herrenschmidt wrote:
> > Hrm, I had it working on a pair of powerbooks yesterday. Can you try
> > something like "udbg-immortal" on your kernel command line to see if
> > that makes a difference in the output ?
> 
> I'll try in a minute.

Wow, it really did make a difference:

  http://nerdbynature.de/bits/3.0-rc1/
  * linux-3.0_powerpc_2.jpg
  * linux-3.0_powerpc_2.mp4 (only a few(!) seconds long,
    best to view with the slider in VLC or QuickTime, to
    get at least a grasp of what led to linux-3.0_powerpc_2.jpg)

Thanks,
Christian.
-- 
BOFH excuse #45:

virus attack, luser responsible


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-01  0:48   ` Christian Kujau
  2011-06-01  1:08     ` Christian Kujau
@ 2011-06-01  3:02     ` Christian Kujau
  2011-06-01  3:49       ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 19+ messages in thread
From: Christian Kujau @ 2011-06-01  3:02 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: LKML, linux ppc dev, Linus Torvalds

(Cc'ing Linus)

On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> In the meantime, "git bisect" behaves kinda weird, I don't know what went 
> wrong here:
> 
>  $ git bisect start
>  $ git bisect good         # Linux 2.6.39
>  $ git bisect bad v3.0-rc1 # Linux 3.0-rc1
>  $ git bisect bad          # c44dead70a...
>  $ git bisect bad          # d93515611b..
> 
> ...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2. 
> Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?

Hm, I tried again, from a clean v3.0-rc1 (git reset --hard), but after the
2nd "git bisect bad" I'm at 2.6.39-rc2 again - while I /should/ be somewhere
in between v2.6.39..v3.0-rc1, right?

Help, please!
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/
-- 
BOFH excuse #54:

Evil dogs hypnotised the night shift


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-01  3:02     ` Christian Kujau
@ 2011-06-01  3:49       ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-01  3:49 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, linux ppc dev, Linus Torvalds

On Tue, 2011-05-31 at 20:02 -0700, Christian Kujau wrote:
> (Cc'ing Linus)
> 
> On Tue, 31 May 2011 at 17:48, Christian Kujau wrote:
> > In the meantime, "git bisect" behaves kinda weird, I don't know what went 
> > wrong here:
> > 
> >  $ git bisect start
> >  $ git bisect good         # Linux 2.6.39
> >  $ git bisect bad v3.0-rc1 # Linux 3.0-rc1
> >  $ git bisect bad          # c44dead70a...
> >  $ git bisect bad          # d93515611b..
> > 
> > ...yet the ./Makefile shows[0] that I'm already way behind: 2.6.39-rc2. 
> > Maybe "git bisect" got confused with that whole 2.6.x -> 3.0 renaming?
> 
> Hm, I tried again, from a clean v3.0-rc1 (git reset --hard), but after the
> 2nd "git bisect bad" I'm at 2.6.39-rc2 again - while I /should/ be somewhere
> in between v2.6.39..v3.0-rc1, right?

Kernel version is totally irrelevant when bisecting. You are not walking
through a linear series of patches but a complex tree of merges which
might have forked off different versions in the first place.

Cheers,
Ben.
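
To illustrate the point about merges, here is a minimal Python sketch of how bisection picks its candidates: every commit reachable from the bad revision but not from the good one. The commit names, the graph shape and the Makefile strings are invented for this example and are not the real kernel history. A topic branch that forked at 2.6.39-rc2 and was merged after 2.6.39 ends up in the candidate set, even though its Makefile still reports the older version.

```python
# Toy commit graph (hypothetical, NOT the real kernel history).
# Each commit lists its parents and the version its Makefile would report.
commits = {
    "rc1":      {"parents": [],                     "makefile": "2.6.39-rc1"},
    "rc2":      {"parents": ["rc1"],                "makefile": "2.6.39-rc2"},
    "topic1":   {"parents": ["rc2"],                "makefile": "2.6.39-rc2"},  # branch forked at -rc2
    "topic2":   {"parents": ["topic1"],             "makefile": "2.6.39-rc2"},
    "v2.6.39":  {"parents": ["rc2"],                "makefile": "2.6.39"},
    "merge":    {"parents": ["v2.6.39", "topic2"],  "makefile": "2.6.39"},
    "v3.0-rc1": {"parents": ["merge"],              "makefile": "3.0.0-rc1"},
}


def ancestors(commit):
    """Return the commit itself plus everything reachable through parent links."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(commits[current]["parents"])
    return seen


def bisect_candidates(good, bad):
    """Commits that may hold the regression: reachable from bad, not from good."""
    return ancestors(bad) - ancestors(good)


candidates = bisect_candidates("v2.6.39", "v3.0-rc1")
for name in sorted(candidates):
    print(name, "-> Makefile says", commits[name]["makefile"])
```

In this model, testing topic1 or topic2 means building a tree whose Makefile says 2.6.39-rc2, although both commits sit between the good and the bad revision as far as the commit graph is concerned.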




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-05-31 23:50 3.0-rc1: powerpc hangs at Kernel virtual memory layout Christian Kujau
  2011-06-01  0:25 ` Benjamin Herrenschmidt
@ 2011-06-02  0:16 ` Christian Kujau
  2011-06-02  0:47   ` Benjamin Herrenschmidt
                     ` (3 more replies)
  1 sibling, 4 replies; 19+ messages in thread
From: Christian Kujau @ 2011-06-02  0:16 UTC (permalink / raw)
  To: LKML; +Cc: linux ppc dev, zajec5, linville, benh

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
> 
>   > Kernel virtual memory layout:
>   >   * 0xfffcf000..0xfffff000  : fixmap

After hours (and hours!) of git-bisecting, it said:

-----------------------
ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
commit ccc7c28af205888798b51b6cbc0b557ac1170a49
Author: Rafał Miłecki <zajec5@gmail.com>
Date:   Fri Apr 1 13:26:52 2011 +0200

    ssb: pci: implement serdes workaround
    
    Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
    Signed-off-by: John W. Linville <linville@tuxdriver.com>
-----------------------

When I reverted this one from the git-bisected tree, the box continued to
boot (until it got stuck again during IDE/CDROM init, but that may be a
different story). I'll try to revert this from a vanilla 3.0-rc1 and see
if it helps.

Thanks,
Christian.

Full git-bisect log: http://nerdbynature.de/bits/3.0-rc1/

> And then the system hangs, does not respond to keyboard (sysrq does not 
> seem to work on this PowerBook G4). But after a while the system reboots 
> itself, so I guess the machine panicked but did not print anything on the 
> screen.
> 
> Full messages (picture), config & (working) dmesg:
> 
>    http://nerdbynature.de/bits/3.0-rc1/
> 
-- 
BOFH excuse #406:

Bad cafeteria food landed all the sysadmins in the hospital.


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  0:16 ` Christian Kujau
@ 2011-06-02  0:47   ` Benjamin Herrenschmidt
  2011-06-02  2:57   ` Benjamin Herrenschmidt
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-02  0:47 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, linux ppc dev, zajec5, linville

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <zajec5@gmail.com>
> Date:   Fri Apr 1 13:26:52 2011 +0200
> 
>     ssb: pci: implement serdes workaround
>     
>     Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------
> 
> When I reverted this one from the git-bisected tree, the box continued to
> boot (until it got stuck again during IDE/CDROM init, but that may be a
> different story). I'll try to revert this from a vanilla 3.0-rc1 and see
> if it helps.

Thanks. I'll have a look later today. As for the IDE/CDROM init, have
you tried the very latest Linus snapshot ? Does that still happen ?
What kind of error do you observe ?

There was some time during the 3.0 merge window process when interrupts
were broken on some PowerBooks, but that should be fixed now.

Cheers,
Ben.




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  0:16 ` Christian Kujau
  2011-06-02  0:47   ` Benjamin Herrenschmidt
@ 2011-06-02  2:57   ` Benjamin Herrenschmidt
  2011-06-02  3:06     ` Christian Kujau
                       ` (2 more replies)
  2011-06-02  6:00   ` Rafał Miłecki
  2011-06-02  6:07   ` Rafał Miłecki
  3 siblings, 3 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-02  2:57 UTC (permalink / raw)
  To: Christian Kujau, linville; +Cc: LKML, linux ppc dev, zajec5

On Wed, 2011-06-01 at 17:16 -0700, Christian Kujau wrote:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> > trying to boot 3.0-rc1 on powerpc32 only progresses until:
> > 
> >   > Kernel virtual memory layout:
> >   >   * 0xfffcf000..0xfffff000  : fixmap
> 
> After hours (and hours!) of git-bisecting, it said:
> 
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <zajec5@gmail.com>
> Date:   Fri Apr 1 13:26:52 2011 +0200
> 
>     ssb: pci: implement serdes workaround
>     
>     Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
>     Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------

Ok, thanks a lot, It looks rather trivial actually: That new workaround
is PCIe specific but is called unconditionally, and will do bad things
on non-PCIe implementations.

John, care to send the patch below to Linus ASAP ? I could reproduce and
verify it fixes it. Thanks !

ssb: pci: Don't call PCIe specific workarounds on PCI cores

Otherwise it can/will crash....

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
index 82feb34..eddf1b9 100644
--- a/drivers/ssb/driver_pcicore.c
+++ b/drivers/ssb/driver_pcicore.c
@@ -540,7 +540,8 @@ void ssb_pcicore_init(struct ssb_pcicore *pc)
 		ssb_pcicore_init_clientmode(pc);
 
 	/* Additional always once-executed workarounds */
-	ssb_pcicore_serdes_workaround(pc);
+	if (dev->id.coreid == SSB_DEV_PCIE)
+		ssb_pcicore_serdes_workaround(pc);
 	/* TODO: ASPM */
 	/* TODO: Clock Request Update */
 }
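
The effect of the guard added by this patch can be sketched in a small userspace model. This is only an illustration in Python, not kernel code: the `Core` class, the `CORE_PCI` / `CORE_PCIE` identifiers and the exception are hypothetical stand-ins, and the assumption that the workaround touches registers which only exist on PCIe cores is inferred from the discussion in this thread.

```python
# Userspace model of "only run the PCIe workaround on PCIe cores".
# All names below are hypothetical stand-ins, not the ssb driver's API.
CORE_PCI = "pci"
CORE_PCIE = "pcie"


class Core:
    def __init__(self, coreid):
        self.coreid = coreid
        self.serdes_touched = False


def serdes_workaround(core):
    """Stand-in for a PCIe-only register sequence.

    Assumption: on a plain PCI core the access is invalid, which is
    modelled here as an exception instead of a machine check.
    """
    if core.coreid != CORE_PCIE:
        raise RuntimeError("invalid access: no SerDes on a %s core" % core.coreid)
    core.serdes_touched = True


def init_unconditional(core):
    # Behaviour before the patch: the workaround always runs.
    serdes_workaround(core)


def init_guarded(core):
    # Behaviour after the patch: skip the workaround on non-PCIe cores.
    if core.coreid == CORE_PCIE:
        serdes_workaround(core)


for coreid in (CORE_PCI, CORE_PCIE):
    core = Core(coreid)
    init_guarded(core)
    print(coreid, "guarded init ok, workaround applied:", core.serdes_touched)
```

Calling `init_unconditional(Core(CORE_PCI))` in this model raises, which mirrors the failure on a machine without PCIe, while `init_guarded` leaves a PCI core untouched.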




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  2:57   ` Benjamin Herrenschmidt
@ 2011-06-02  3:06     ` Christian Kujau
  2011-06-02  4:27     ` Christian Kujau
  2011-06-10 22:54     ` Christian Kujau
  2 siblings, 0 replies; 19+ messages in thread
From: Christian Kujau @ 2011-06-02  3:06 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linville, LKML, linux ppc dev, zajec5

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> on non-PCIe implementations.

Indeed. This PowerBook G4 does not have PCIe, yet the whole SSB thingy gets
enabled in my .config somehow. Thanks for the quick fix, I tried to revert 
ccc7c28af2... from Linus' current tree, but I had to rip out some more to 
make it compile.

I'll try your fix in a minute and get back to you with those cdrom init 
problems as well.

Thanks,
Christian.
-- 
BOFH excuse #166:

/pub/lunch


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  2:57   ` Benjamin Herrenschmidt
  2011-06-02  3:06     ` Christian Kujau
@ 2011-06-02  4:27     ` Christian Kujau
  2011-06-02  7:33       ` Benjamin Herrenschmidt
  2011-06-10 22:54     ` Christian Kujau
  2 siblings, 1 reply; 19+ messages in thread
From: Christian Kujau @ 2011-06-02  4:27 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linville, LKML, linux ppc dev, zajec5

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> Ok, thanks a lot, It looks rather trivial actually: That new workaround
> is PCIe specific but is called unconditionally, and will do bad things
> on non-PCIe implementations.

OK, with your patch applied to Linus' latest git tree the machine 
continues to boot. Also, with the latest tree, the "machine is stuck after 
ide-cd init" problem[0] went away.

For this particular problem and patch, feel free to add:

Tested-by: Christian Kujau <lists@nerdbynature.de>

However, shortly after boot and logging in to the box remotely, the box did
not respond any more. I'm not sure if these are related to those SSB/PCIe 
changes, but somehow I hope they are - bisecting those would take much 
longer, as it's not an "instant" death:

 * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck1.jpg
 * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck2.jpg

This is what an OCR program made of it:

irq euent stamp: 185804850
hardirqs last enabled at (185904849): [<c04005b0>] _raw_spin_unlock_irqrestore+0x40/0x?e
hardirqs last disabled at (185904850): [<c00120b8>] reenable_mmu+0x24/0x78
Softirqs last enabled at (185892414): [<c000fe8c>] call_do_softirq+0x14/0x24
softirqs last disabled at (18589240?): [<c000fe8c>] call_do_softirq+0x14/0x24
NIP: e04005b4 LR: e04005b0 CTR: 00000000
REGS: ef92be10 TRHP: 0901 Not tainted (3.0.0-rel-00049-g1fa?b6a-dirtg)
MSB: 00009032 <EE.ME.IR.DR> CR: 42002084
TRSK = ef8d0000[38B] ’kuorker/0:2’ THREAD:
GPR00: c04005b0 ef92bec0 efBd0000 00000001
GPR08: 00000000 0b14aed0 0049a306 00030600
HIP [c01005b1] _rau_spin_unlock_irqrestore+0x44/0x?c
LR [c04005b0] _rau_spin_unlock_irqrestore+0x40/0x?c
Call Trace:
[ef92bec0] [c04005b0] _raw_spin_unlock_irqrestore+0x40/0x?c (unreliable)
[ef92bed0] [c029c504] flush_tu_ldisc+0x121/0x230
[ef92bf10] [c001c86c] process_one_uork+0x1c1/0x4cB
[ef92bfS0] [c004efac] worker_thread+0x1?8/0x3c1
[ef92bf90] [c0051148] kthread+0x81/0x88
[ef92hff0] [c0810390] kernel_thread+0x1c/0x68

XER: 20000000
ef92a000 ef8d0660 00000006 00000000 18614000 22002088
Instruction dump:
??? 93e1060c ?c9f23?B 38800001 90010011 4bc6e9a9 ?fc3i`3?8 4be61a69
?3e08080 11820021 1bc6b515 ?fe00124
B8c16008 ?c0803a6 83c1000c

Well, the picture is way better :-\

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1-cdrom.jpg
-- 
BOFH excuse #399:

We are a 100% Microsoft Shop.


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  0:16 ` Christian Kujau
  2011-06-02  0:47   ` Benjamin Herrenschmidt
  2011-06-02  2:57   ` Benjamin Herrenschmidt
@ 2011-06-02  6:00   ` Rafał Miłecki
  2011-06-02  6:07   ` Rafał Miłecki
  3 siblings, 0 replies; 19+ messages in thread
From: Rafał Miłecki @ 2011-06-02  6:00 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, linux ppc dev, linville, benh

2011/6/2 Christian Kujau <lists@nerdbynature.de>:
> On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
>> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>>
>>   > Kernel virtual memory layout:
>>   >   * 0xfffcf000..0xfffff000  : fixmap
>
> After hours (and hours!) of git-bisecting, it said:
>
> -----------------------
> ccc7c28af205888798b51b6cbc0b557ac1170a49 is the first bad commit
> commit ccc7c28af205888798b51b6cbc0b557ac1170a49
> Author: Rafał Miłecki <zajec5@gmail.com>
> Date:   Fri Apr 1 13:26:52 2011 +0200
>
>    ssb: pci: implement serdes workaround
>
>    Signed-off-by: Rafał Miłecki <zajec5@gmail.com>
>    Signed-off-by: John W. Linville <linville@tuxdriver.com>
> -----------------------

I'm sorry for the problem :(

Patch was already sent yesterday, I've even CCed linuxppc-dev:
[RFT][PATCH 3.0] ssb: fix PCI(e) driver regression causing oops on PCI cards

-- 
Rafał


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  0:16 ` Christian Kujau
                     ` (2 preceding siblings ...)
  2011-06-02  6:00   ` Rafał Miłecki
@ 2011-06-02  6:07   ` Rafał Miłecki
  2011-06-02  6:16     ` Christian Kujau
  3 siblings, 1 reply; 19+ messages in thread
From: Rafał Miłecki @ 2011-06-02  6:07 UTC (permalink / raw)
  To: Christian Kujau; +Cc: LKML, linux ppc dev, linville, benh

On Tue, 31 May 2011 at 16:50, Christian Kujau wrote:
> trying to boot 3.0-rc1 on powerpc32 only progresses until:
>
>   > Kernel virtual memory layout:
>   >   * 0xfffcf000..0xfffff000  : fixmap

The weird thing is that:

1) You didn't see (like Andres):
Machine check in kernel mode.
 Caused by (from SRR1=149030): Transfer error ack signal
 Oops: Machine check, sig: 7 [#1]
But, OK, maybe machine check requires something additional in kernel,
I don't know...

2) You didn't see SSB messages
This is confusing. You should see SSB messages that appear before my
invalid read happens. Did you somehow disable most of the important
logs, or something? Having ssb messages at the end of the hung boot would
directly point you to the ssb module.

-- 
Rafał


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  6:07   ` Rafał Miłecki
@ 2011-06-02  6:16     ` Christian Kujau
  0 siblings, 0 replies; 19+ messages in thread
From: Christian Kujau @ 2011-06-02  6:16 UTC (permalink / raw)
  To: Rafał Miłecki; +Cc: LKML, linux ppc dev, linville, benh

On Thu, 2 Jun 2011 at 08:07, Rafał Miłecki wrote:
> 1) You didn't see (like Andres):
> Machine check in kernel mode.
>  Caused by (from SRR1=149030): Transfer error ack signal
>  Oops: Machine check, sig: 7 [#1]
> But, OK, maybe machine check requires something additional in kernel,
> I don't know...
> 
> 2) You didn't see SSB messages
> This is confusing. You should see SSB messages that appear before my
> invalid read happens. Did you somehow disable most of the important
> logs, or something? Having ssb messages at the end of the hung boot would
> directly point you to the ssb module.

BenH advised to boot with udbg-immortal and out came:

 http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.jpg
 http://nerdbynature.de/bits/3.0-rc1/linux-3.0_powerpc_2.mp4
 (watch it at very slow speed, as it's only 3sec long)

I've enabled[0] FB_NVIDIA and during normal booting the screen flickers 
after the "... : fixmap" message and the screen clears and is filled again 
from the top - maybe the messages would've been there if booted w/o the 
framebuffer enabled.

Right now I'm happy that Ben's fix helped to get past this message, but 
the system remains unusable[1] with the latest -git, but more debugging
has to wait until tomorrow...

Thanks,
Christian.

[0] http://nerdbynature.de/bits/3.0-rc1/config-2.6.39.txt
[1] https://lkml.org/lkml/2011/6/2/6
-- 
BOFH excuse #230:

Lusers learning curve appears to be fractal


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  4:27     ` Christian Kujau
@ 2011-06-02  7:33       ` Benjamin Herrenschmidt
  2011-06-06  2:11         ` Christian Kujau
  0 siblings, 1 reply; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-02  7:33 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linville, LKML, linux ppc dev, zajec5

On Wed, 2011-06-01 at 21:27 -0700, Christian Kujau wrote:
> On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> > Ok, thanks a lot, It looks rather trivial actually: That new workaround
> > is PCIe specific but is called unconditionally, and will do bad things
> > on non-PCIe implementations.
> 
> OK, with your patch applied to Linus' latest git tree the machine 
> continues to boot. Also, with the latest tree, the "machine is stuck after 
> ide-cd init" problem[0] went away.
> 
> For this particular problem and patch, feel free to add:
> 
> Tested-by: Christian Kujau <lists@nerdbynature.de>
> 
> However, shortly after boot and logging in to the box remotely, the box did
> not respond any more. I'm not sure if these are related to those SSB/PCIe 
> changes, but somehow I hope they are - bisecting those would take much 
> longer, as it's not an "instant" death:
> 
>  * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck1.jpg
>  * http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1_stuck2.jpg
> 
> This is what an OCR program made of it:

I think this is another problem that I'm in the middle of trying to
figure out.

It -looks- to me that something goes wrong in the tty code when a large
file is piped through a pty, causing the kernel to hang for minutes in
the workqueue / ldisc flush code. I've just sent an initial report to
Alan Cox about it and am currently bisecting it.

Cheers,
Ben.

> irq euent stamp: 185804850
> hardirqs last enabled at (185904849): [<c04005b0>] _raw_spin_unlock_irqrestore+0x40/0x?e
> hardirqs last disabled at (185904850): [<c00120b8>] reenable_mmu+0x24/0x78
> Softirqs last enabled at (185892414): [<c000fe8c>] call_do_softirq+0x14/0x24
> softirqs last disabled at (18589240?): [<c000fe8c>] call_do_softirq+0x14/0x24
> NIP: e04005b4 LR: e04005b0 CTR: 00000000
> REGS: ef92be10 TRHP: 0901 Not tainted (3.0.0-rel-00049-g1fa?b6a-dirtg)
> MSB: 00009032 <EE.ME.IR.DR> CR: 42002084
> TRSK = ef8d0000[38B] ’kuorker/0:2’ THREAD:
> GPR00: c04005b0 ef92bec0 efBd0000 00000001
> GPR08: 00000000 0b14aed0 0049a306 00030600
> HIP [c01005b1] _rau_spin_unlock_irqrestore+0x44/0x?c
> LR [c04005b0] _rau_spin_unlock_irqrestore+0x40/0x?c
> Call Trace:
> [ef92bec0] [c04005b0] _raw_spin_unlock_irqrestore+0x40/0x?c (unreliable)
> [ef92bed0] [c029c504] flush_tu_ldisc+0x121/0x230
> [ef92bf10] [c001c86c] process_one_uork+0x1c1/0x4cB
> [ef92bfS0] [c004efac] worker_thread+0x1?8/0x3c1
> [ef92bf90] [c0051148] kthread+0x81/0x88
> [ef92hff0] [c0810390] kernel_thread+0x1c/0x68
> 
> XER: 20000000
> ef92a000 ef8d0660 00000006 00000000 18614000 22002088
> Instruction dump:
> ??? 93e1060c ?c9f23?B 38800001 90010011 4bc6e9a9 ?fc3i`3?8 4be61a69
> ?3e08080 11820021 1bc6b515 ?fe00124
> B8c16008 ?c0803a6 83c1000c
> 
> Well, the picture is way better :-\
> 
> Thanks,
> Christian.
> 
> [0] http://nerdbynature.de/bits/3.0-rc1/linux-3.0-rc1-cdrom.jpg




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  7:33       ` Benjamin Herrenschmidt
@ 2011-06-06  2:11         ` Christian Kujau
  2011-06-06  3:46           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 19+ messages in thread
From: Christian Kujau @ 2011-06-06  2:11 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linville, LKML, linux ppc dev, zajec5

On Thu, 2 Jun 2011 at 17:33, Benjamin Herrenschmidt wrote:
> It -looks- to me that something goes wrong in the tty code when a large
> file is piped through a pty, causing the kernel to hang for minutes in
> the workqueue / ldisc flush code. I've just sent an initial report to
> Alan Cox about it and am currently bisecting it.

This was the "tty vs workqueue oddities" thread, right? FWIW, 
55db4c64eddf37 ("Revert "tty: make receive_buf() return the amout of bytes 
received"") seems to have fixed it on this powerpc machine as well.

With your "ssb: pci: Don't call PCIe specific workarounds on PCI cores" 
patch applied, powerpc32 seems to be quite happy with 3.0-rc1+

Thanks,
Christian.
-- 
BOFH excuse #382:

Someone was smoking in the computer room and set off the halon systems.


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-06  2:11         ` Christian Kujau
@ 2011-06-06  3:46           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 19+ messages in thread
From: Benjamin Herrenschmidt @ 2011-06-06  3:46 UTC (permalink / raw)
  To: Christian Kujau; +Cc: linville, LKML, linux ppc dev, zajec5

On Sun, 2011-06-05 at 19:11 -0700, Christian Kujau wrote:
> On Thu, 2 Jun 2011 at 17:33, Benjamin Herrenschmidt wrote:
> > It -looks- to me that something goes wrong in the tty code when a large
> > file is piped through a pty, causing the kernel to hang for minutes in
> > the workqueue / ldisc flush code. I've just sent an initial report to
> > Alan Cox about it and am currently bisecting it.
> 
> This was the "tty vs workqueue oddities" thread, right? FWIW, 
> 55db4c64eddf37 ("Revert "tty: make receive_buf() return the amout of bytes 
> received"") seems to have fixed it on this powerpc machine as well.

Yup.

> With your "ssb: pci: Don't call PCIe specific workarounds on PCI cores" 
> patch applied, powerpc32 seems to be quite happy with 3.0-rc1+

Good :-)

Cheers,
Ben.




* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-02  2:57   ` Benjamin Herrenschmidt
  2011-06-02  3:06     ` Christian Kujau
  2011-06-02  4:27     ` Christian Kujau
@ 2011-06-10 22:54     ` Christian Kujau
  2011-06-10 22:59       ` Rafał Miłecki
  2 siblings, 1 reply; 19+ messages in thread
From: Christian Kujau @ 2011-06-10 22:54 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linville, LKML, linux ppc dev, zajec5

On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
> John, care to send the patch below to Linus ASAP ? I could reproduce and
> verify it fixes it. Thanks !
> 
> ssb: pci: Don't call PCIe specific workarounds on PCI cores
> 
> Otherwise it can/will crash....

The patch did not make it into -rc2, it's not in today's git tree either, 
AFAICS. Can anyone push this, please?

Thanks,
Christian.

> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> ---
> 
> diff --git a/drivers/ssb/driver_pcicore.c b/drivers/ssb/driver_pcicore.c
> index 82feb34..eddf1b9 100644
> --- a/drivers/ssb/driver_pcicore.c
> +++ b/drivers/ssb/driver_pcicore.c
> @@ -540,7 +540,8 @@ void ssb_pcicore_init(struct ssb_pcicore *pc)
>  		ssb_pcicore_init_clientmode(pc);
>  
>  	/* Additional always once-executed workarounds */
> -	ssb_pcicore_serdes_workaround(pc);
> +	if (dev->id.coreid == SSB_DEV_PCIE)
> +		ssb_pcicore_serdes_workaround(pc);
>  	/* TODO: ASPM */
>  	/* TODO: Clock Request Update */
>  }
> 
-- 
BOFH excuse #312:

incompatible bit-registration operators


* Re: 3.0-rc1: powerpc hangs at Kernel virtual memory layout
  2011-06-10 22:54     ` Christian Kujau
@ 2011-06-10 22:59       ` Rafał Miłecki
  0 siblings, 0 replies; 19+ messages in thread
From: Rafał Miłecki @ 2011-06-10 22:59 UTC (permalink / raw)
  To: Christian Kujau; +Cc: Benjamin Herrenschmidt, linville, LKML, linux ppc dev

2011/6/11 Christian Kujau <lists@nerdbynature.de>:
> On Thu, 2 Jun 2011 at 12:57, Benjamin Herrenschmidt wrote:
>> John, care to send the patch below to Linus ASAP ? I could reproduce and
>> verify it fixes it. Thanks !
>>
>> ssb: pci: Don't call PCIe specific workarounds on PCI cores
>>
>> Otherwise it can/will crash....
>
> The patch did not make it into -rc2, it's not in today's git tree either,
> AFAICS. Can anyone push this, please?

Yeah, I noticed it wasn't in the pull for rc2. I pinged John, he told
me to just wait.

Patch was taken with the recent pull, it should go into rc3.

-- 
Rafał


Thread overview: 19+ messages
2011-05-31 23:50 3.0-rc1: powerpc hangs at Kernel virtual memory layout Christian Kujau
2011-06-01  0:25 ` Benjamin Herrenschmidt
2011-06-01  0:48   ` Christian Kujau
2011-06-01  1:08     ` Christian Kujau
2011-06-01  3:02     ` Christian Kujau
2011-06-01  3:49       ` Benjamin Herrenschmidt
2011-06-02  0:16 ` Christian Kujau
2011-06-02  0:47   ` Benjamin Herrenschmidt
2011-06-02  2:57   ` Benjamin Herrenschmidt
2011-06-02  3:06     ` Christian Kujau
2011-06-02  4:27     ` Christian Kujau
2011-06-02  7:33       ` Benjamin Herrenschmidt
2011-06-06  2:11         ` Christian Kujau
2011-06-06  3:46           ` Benjamin Herrenschmidt
2011-06-10 22:54     ` Christian Kujau
2011-06-10 22:59       ` Rafał Miłecki
2011-06-02  6:00   ` Rafał Miłecki
2011-06-02  6:07   ` Rafał Miłecki
2011-06-02  6:16     ` Christian Kujau
