linuxppc-dev.lists.ozlabs.org archive mirror
* G5 Quad hangs early on 4.20.2 / 5.0-rc2+
@ 2019-01-15 22:49 Tobias Ulmer
  2019-01-16  1:15 ` Benjamin Herrenschmidt
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: Tobias Ulmer @ 2019-01-15 22:49 UTC (permalink / raw)
  To: linuxppc-dev

Hi,

both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.

Surely I'm not the first to run into this, but I couldn't find any
discussion or bug report. Sorry if you're already aware.

You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
the watchdog triggers a reboot:

https://i.imgur.com/UiCVRuG.jpg

If I had to make an uneducated guess, it seems to boot into the same
codepath twice (mpic was already initialized, then it starts again right
after smp bringup). Maybe on a second CPU?

To narrow it down a little, my last known good was 4.18.9

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-15 22:49 G5 Quad hangs early on 4.20.2 / 5.0-rc2+ Tobias Ulmer
@ 2019-01-16  1:15 ` Benjamin Herrenschmidt
  2019-01-17  9:42   ` Tobias Ulmer
  2019-01-16  6:56 ` Michael Ellerman
  2019-01-18  2:22 ` Paul Mackerras
  2 siblings, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2019-01-16  1:15 UTC (permalink / raw)
  To: Tobias Ulmer, linuxppc-dev

On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
> Hi,
> 
> both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> 
> Surely I'm not the first to run into this, but I couldn't find any
> discussion or bug report. Sorry if you're already aware.
> 
> You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> the watchdog triggers a reboot:
> 
> https://i.imgur.com/UiCVRuG.jpg
> 
> If I had to make an uneducated guess, it seems to boot into the same
> codepath twice (mpic was already initialized, then it starts again right
> after smp bringup). Maybe on a second CPU?
> 
> To narrow it down a little, my last known good was 4.18.9

I don't think it's an MPIC related problem but it does appear to hang
about when interrupts get turned on.

I have one of these critters in the office, but I'm working remotely
this week so I won't be able to dig into this until next week.

It might help if you could bisect in the meantime.

Cheers,
Ben.



* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-15 22:49 G5 Quad hangs early on 4.20.2 / 5.0-rc2+ Tobias Ulmer
  2019-01-16  1:15 ` Benjamin Herrenschmidt
@ 2019-01-16  6:56 ` Michael Ellerman
  2019-01-18  2:22 ` Paul Mackerras
  2 siblings, 0 replies; 10+ messages in thread
From: Michael Ellerman @ 2019-01-16  6:56 UTC (permalink / raw)
  To: Tobias Ulmer, linuxppc-dev

Tobias Ulmer <tobiasu@tmux.org> writes:
> Hi,
>
> both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
>
> Surely I'm not the first to run into this, but I couldn't find any
> discussion or bug report. Sorry if you're already aware.

Actually you're probably the first to hit it :)

I have a quad but it never gets to openfirmware and I haven't had time
to look into why.

My iMac G5 and dual G5 are booting OK though.

> You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> the watchdog triggers a reboot:
>
> https://i.imgur.com/UiCVRuG.jpg
>
> If I had to make an uneducated guess, it seems to boot into the same
> codepath twice (mpic was already initialized, then it starts again right
> after smp bringup). Maybe on a second CPU?

Can you compare to a working boot log, does that give us any clues?

Can you post your full .config for the non-working case?

cheers

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-16  1:15 ` Benjamin Herrenschmidt
@ 2019-01-17  9:42   ` Tobias Ulmer
  2019-01-17 13:38     ` Michael Ellerman
  2019-01-17 22:32     ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 10+ messages in thread
From: Tobias Ulmer @ 2019-01-17  9:42 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev

On Wed, Jan 16, 2019 at 12:15:14PM +1100, Benjamin Herrenschmidt wrote:
> On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
> > Hi,
> > 
> > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> > 
> > Surely I'm not the first to run into this, but I couldn't find any
> > discussion or bug report. Sorry if you're already aware.
> > 
> > You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> > the watchdog triggers a reboot:
> > 
> > https://i.imgur.com/UiCVRuG.jpg
> > 
> > If I had to make an uneducated guess, it seems to boot into the same
> > codepath twice (mpic was already initialized, then it starts again right
> > after smp bringup). Maybe on a second CPU?
> > 
> > To narrow it down a little, my last known good was 4.18.9
> 
> I don't think it's an MPIC related problem but it does appear to hang
> about when interrupts get turned on.

When they get turned on for the second time, for some reason. You can see the
end of the first time just on top of the screen.

It repeats part of the startup initialization right after it's done with
smp bringup.

> 
> I have one of these critters in the office, but I'm working remotely
> this week so I won't be able to dig into this until next week.
> 
> It might help if you could bisect in the meantime.

I'm bisecting it now, but it's slow going since I don't have much time
to babysit the machine. The problem shows up somewhere between v4.19 and
v4.20.

> 
> Cheers,
> Ben.
> 
> 

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-17  9:42   ` Tobias Ulmer
@ 2019-01-17 13:38     ` Michael Ellerman
  2019-01-17 22:32     ` Benjamin Herrenschmidt
  1 sibling, 0 replies; 10+ messages in thread
From: Michael Ellerman @ 2019-01-17 13:38 UTC (permalink / raw)
  To: Tobias Ulmer, Benjamin Herrenschmidt; +Cc: linuxppc-dev

Tobias Ulmer <tobiasu@tmux.org> writes:
> On Wed, Jan 16, 2019 at 12:15:14PM +1100, Benjamin Herrenschmidt wrote:
>> On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
>> > Hi,
>> > 
>> > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
>> > 
>> > Surely I'm not the first to run into this, but I couldn't find any
>> > discussion or bug report. Sorry if you're already aware.
>> > 
>> > You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
>> > the watchdog triggers a reboot:
>> > 
>> > https://i.imgur.com/UiCVRuG.jpg
>> > 
>> > If I had to make an uneducated guess, it seems to boot into the same
>> > codepath twice (mpic was already initialized, then it starts again right
>> > after smp bringup). Maybe on a second CPU?
>> > 
>> > To narrow it down a little, my last known good was 4.18.9
>> 
>> I don't think it's an MPIC related problem but it does appear to hang
>> about when interrupts get turned on.
>
> When they get turned on for the second time, for some reason. You can see the
> end of the first time just on top of the screen.
>
> It repeats part of the startup initialization right after it's done with
> smp bringup.
>
>> 
>> I have one of these critters in the office, but I'm working remotely
>> this week so I won't be able to dig into this until next week.
>> 
>> It might help if you could bisect in the meantime.
>
> I'm bisecting it now, but it's slow going since I don't have much time
> to babysit the machine. The problem shows up somewhere between v4.19 and
> v4.20.

Can you try:

58cfbac25b1f ("Revert "selftests/powerpc: Fix out-of-tree build errors"")

If that's bad and v4.19 is good that would isolate it to about 222
commits.

cheers

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-17  9:42   ` Tobias Ulmer
  2019-01-17 13:38     ` Michael Ellerman
@ 2019-01-17 22:32     ` Benjamin Herrenschmidt
  2019-01-18  0:51       ` Tobias Ulmer
  1 sibling, 1 reply; 10+ messages in thread
From: Benjamin Herrenschmidt @ 2019-01-17 22:32 UTC (permalink / raw)
  To: Tobias Ulmer; +Cc: linuxppc-dev

On Thu, 2019-01-17 at 10:42 +0100, Tobias Ulmer wrote:
> On Wed, Jan 16, 2019 at 12:15:14PM +1100, Benjamin Herrenschmidt wrote:
> > On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
> > > Hi,
> > > 
> > > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> > > 
> > > Surely I'm not the first to run into this, but I couldn't find any
> > > discussion or bug report. Sorry if you're already aware.
> > > 
> > > You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> > > the watchdog triggers a reboot:
> > > 
> > > https://i.imgur.com/UiCVRuG.jpg
> > > 
> > > If I had to make an uneducated guess, it seems to boot into the same
> > > codepath twice (mpic was already initialized, then it starts again right
> > > after smp bringup). Maybe on a second CPU?
> > > 
> > > To narrow it down a little, my last known good was 4.18.9
> > 
> > I don't think it's an MPIC related problem but it does appear to hang
> > about when interrupts get turned on.
> 
> When they get turned on for the second time, for some reason. You can see the
> end of the first time just on top of the screen.

No, that top of screen init is something else.

> It repeats part of the startup initialization right after it's done with
> smp bringup.

That's just the BootX console handing over to the main console and
replaying the messages, I think.

> > I have one of these critters in the office, but I'm working remotely
> > this week so I won't be able to dig into this until next week.
> > 
> > It might help if you could bisect in the meantime.
> 
> I'm bisecting it now, but it's slow going since I don't have much time
> to babysit the machine. The problem shows up somewhere between v4.19 and
> v4.20.

Ok, thanks.

I'll be back on monday or tuesday, let me know where you got up to then
and I'll take it from there. Also email me your .config please.

Cheers,
Ben.

> > Cheers,
> > Ben.
> > 
> > 


* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-17 22:32     ` Benjamin Herrenschmidt
@ 2019-01-18  0:51       ` Tobias Ulmer
  2019-01-18 10:09         ` Mathieu Malaterre
  0 siblings, 1 reply; 10+ messages in thread
From: Tobias Ulmer @ 2019-01-18  0:51 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: linuxppc-dev, Rob Herring

On Fri, Jan 18, 2019 at 09:32:07AM +1100, Benjamin Herrenschmidt wrote:
> On Thu, 2019-01-17 at 10:42 +0100, Tobias Ulmer wrote:
> > On Wed, Jan 16, 2019 at 12:15:14PM +1100, Benjamin Herrenschmidt wrote:
> > > On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
> > > > Hi,
> > > > 
> > > > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> > > > 
> > > > Surely I'm not the first to run into this, but I couldn't find any
> > > > discussion or bug report. Sorry if you're already aware.
> > > > 
> > > > You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> > > > the watchdog triggers a reboot:
> > > > 
> > > > https://i.imgur.com/UiCVRuG.jpg
> > > > 
...
> I'll be back on monday or tuesday, let me know where you got up to then
> and I'll take it from there. Also email me your .config please.


Hi,

this was caused by 5c63e407aaabb0464236cfc6279a2d79aede7073
(fbdev: Convert to using %pOFn instead of device_node.name)

name can't be NULL in offb_init_fb: there's a printk and a call to
offb_init_palette_hacks, where name is matched against OF device names.

With the partial revert below, both 5.0-rc2+ and 4.20 are happy.

diff --git a/drivers/video/fbdev/offb.c b/drivers/video/fbdev/offb.c
index 31f769d67195..77c0a2f45b3b 100644
--- a/drivers/video/fbdev/offb.c
+++ b/drivers/video/fbdev/offb.c
@@ -419,13 +419,9 @@ static void __init offb_init_fb(const char *name,
 	var = &info->var;
 	info->par = par;
 
-	if (name) {
-		strcpy(fix->id, "OFfb ");
-		strncat(fix->id, name, sizeof(fix->id) - sizeof("OFfb "));
-		fix->id[sizeof(fix->id) - 1] = '\0';
-	} else
-		snprintf(fix->id, sizeof(fix->id), "OFfb %pOFn", dp);
-
+	strcpy(fix->id, "OFfb ");
+	strncat(fix->id, name, sizeof(fix->id) - sizeof("OFfb "));
+	fix->id[sizeof(fix->id) - 1] = '\0';
 
 	var->xres = var->xres_virtual = width;
 	var->yres = var->yres_virtual = height;
@@ -648,7 +644,7 @@ static void __init offb_init_nodriver(struct device_node *dp, int no_real_node)
 		/* kludge for valkyrie */
 		if (strcmp(dp->name, "valkyrie") == 0)
 			address += 0x1000;
-		offb_init_fb(no_real_node ? "bootx" : NULL,
+		offb_init_fb(no_real_node ? "bootx" : dp->name,
 			     width, height, depth, pitch, address,
 			     foreign_endian, no_real_node ? NULL : dp);
 	}
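
A minimal userspace sketch of the failure mode described above, not the
kernel code itself. It assumes that the later users of name (the printk and
the matching in offb_init_palette_hacks) dereference it unconditionally.
struct fake_node, matches_aty(), resolve_name() and build_id() are
hypothetical stand-ins invented for illustration; only the "OFfb " prefix,
the "bootx" fallback and the idea of matching against OF device names come
from the patch.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for struct device_node; only the name matters here. */
struct fake_node {
	const char *name;
};

/*
 * Hypothetical stand-in for the name matching done in
 * offb_init_palette_hacks(): returns 1 if name starts with "ATY".
 * Passing NULL would be undefined behaviour, which is why the caller
 * must never hand in a NULL name.
 */
static int matches_aty(const char *name)
{
	return strncmp(name, "ATY", 3) == 0;
}

/* Resolve the name once, so every later user sees a non-NULL string. */
static const char *resolve_name(const struct fake_node *dp, int no_real_node)
{
	if (no_real_node || dp == NULL || dp->name == NULL)
		return "bootx";
	return dp->name;
}

/* Build "OFfb <name>", truncated to fit; snprintf is used for brevity. */
static void build_id(char *id, size_t len, const char *name)
{
	snprintf(id, len, "OFfb %s", name);
}
```

The point of the sketch is the ordering: the name is resolved to a real
string before anything formats or compares it, which is what passing
dp->name instead of NULL restores in the partial revert.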

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-15 22:49 G5 Quad hangs early on 4.20.2 / 5.0-rc2+ Tobias Ulmer
  2019-01-16  1:15 ` Benjamin Herrenschmidt
  2019-01-16  6:56 ` Michael Ellerman
@ 2019-01-18  2:22 ` Paul Mackerras
  2019-01-18  9:44   ` Tobias Ulmer
  2 siblings, 1 reply; 10+ messages in thread
From: Paul Mackerras @ 2019-01-18  2:22 UTC (permalink / raw)
  To: Tobias Ulmer; +Cc: linuxppc-dev

On Tue, Jan 15, 2019 at 11:49:45PM +0100, Tobias Ulmer wrote:
> Hi,
> 
> both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.

Interesting, I just built v4.20 with g5_defconfig for my quad and it
booted just fine.  I'll try Linus' latest.

It could be config-related; could you send me your .config (off-list)?

Paul.

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-18  2:22 ` Paul Mackerras
@ 2019-01-18  9:44   ` Tobias Ulmer
  0 siblings, 0 replies; 10+ messages in thread
From: Tobias Ulmer @ 2019-01-18  9:44 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linuxppc-dev

On Fri, Jan 18, 2019 at 01:22:41PM +1100, Paul Mackerras wrote:
> On Tue, Jan 15, 2019 at 11:49:45PM +0100, Tobias Ulmer wrote:
> > Hi,
> > 
> > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> 
> Interesting, I just built v4.20 with g5_defconfig for my quad and it
> booted just fine.  I'll try Linus' latest.
> 
> It could be config-related; could you send me your .config (off-list)?
> 
> Paul.

dmesg: http://ix.io/1yAB
config: http://ix.io/1yAC

Have fun :)

* Re: G5 Quad hangs early on 4.20.2 / 5.0-rc2+
  2019-01-18  0:51       ` Tobias Ulmer
@ 2019-01-18 10:09         ` Mathieu Malaterre
  0 siblings, 0 replies; 10+ messages in thread
From: Mathieu Malaterre @ 2019-01-18 10:09 UTC (permalink / raw)
  To: Tobias Ulmer; +Cc: Rob Herring, linuxppc-dev

On Fri, Jan 18, 2019 at 1:53 AM Tobias Ulmer <tobiasu@tmux.org> wrote:
>
> On Fri, Jan 18, 2019 at 09:32:07AM +1100, Benjamin Herrenschmidt wrote:
> > On Thu, 2019-01-17 at 10:42 +0100, Tobias Ulmer wrote:
> > > On Wed, Jan 16, 2019 at 12:15:14PM +1100, Benjamin Herrenschmidt wrote:
> > > > On Tue, 2019-01-15 at 23:49 +0100, Tobias Ulmer wrote:
> > > > > Hi,
> > > > >
> > > > > both the latest stable 4.20.2 and 5.0 rc2+ hang early on the G5 Quad.
> > > > >
> > > > > Surely I'm not the first to run into this, but I couldn't find any
> > > > > discussion or bug report. Sorry if you're already aware.
> > > > >
> > > > > You can see it hang here (5.0 rc2+, 4.20.2 is nearly identical) until
> > > > > the watchdog triggers a reboot:
> > > > >
> > > > > https://i.imgur.com/UiCVRuG.jpg
> > > > >
> ...
> > I'll be back on monday or tuesday, let me know where you got up to then
> > and I'll take it from there. Also email me your .config please.
>
>
> Hi,
>
> this was caused by 5c63e407aaabb0464236cfc6279a2d79aede7073
> (fbdev: Convert to using %pOFn instead of device_node.name)

See:

https://patchwork.kernel.org/patch/10750931/



> name can't be NULL in offb_init_fb: there's a printk and a call to
> offb_init_palette_hacks, where name is matched against OF device names.
>
> With the partial revert below, both 5.0-rc2+ and 4.20 are happy.
>
> diff --git a/drivers/video/fbdev/offb.c b/drivers/video/fbdev/offb.c
> index 31f769d67195..77c0a2f45b3b 100644
> --- a/drivers/video/fbdev/offb.c
> +++ b/drivers/video/fbdev/offb.c
> @@ -419,13 +419,9 @@ static void __init offb_init_fb(const char *name,
>         var = &info->var;
>         info->par = par;
>
> -       if (name) {
> -               strcpy(fix->id, "OFfb ");
> -               strncat(fix->id, name, sizeof(fix->id) - sizeof("OFfb "));
> -               fix->id[sizeof(fix->id) - 1] = '\0';
> -       } else
> -               snprintf(fix->id, sizeof(fix->id), "OFfb %pOFn", dp);
> -
> +       strcpy(fix->id, "OFfb ");
> +       strncat(fix->id, name, sizeof(fix->id) - sizeof("OFfb "));
> +       fix->id[sizeof(fix->id) - 1] = '\0';
>
>         var->xres = var->xres_virtual = width;
>         var->yres = var->yres_virtual = height;
> @@ -648,7 +644,7 @@ static void __init offb_init_nodriver(struct device_node *dp, int no_real_node)
>                 /* kludge for valkyrie */
>                 if (strcmp(dp->name, "valkyrie") == 0)
>                         address += 0x1000;
> -               offb_init_fb(no_real_node ? "bootx" : NULL,
> +               offb_init_fb(no_real_node ? "bootx" : dp->name,
>                              width, height, depth, pitch, address,
>                              foreign_endian, no_real_node ? NULL : dp);
>         }

Thread overview: 10+ messages
2019-01-15 22:49 G5 Quad hangs early on 4.20.2 / 5.0-rc2+ Tobias Ulmer
2019-01-16  1:15 ` Benjamin Herrenschmidt
2019-01-17  9:42   ` Tobias Ulmer
2019-01-17 13:38     ` Michael Ellerman
2019-01-17 22:32     ` Benjamin Herrenschmidt
2019-01-18  0:51       ` Tobias Ulmer
2019-01-18 10:09         ` Mathieu Malaterre
2019-01-16  6:56 ` Michael Ellerman
2019-01-18  2:22 ` Paul Mackerras
2019-01-18  9:44   ` Tobias Ulmer
