linux-kernel.vger.kernel.org archive mirror
* Fedora's latest gcc produces unbootable kernels
@ 2007-12-01 14:42 Pierre Ossman
  2007-12-01 17:47 ` Pierre Ossman
  0 siblings, 1 reply; 10+ messages in thread
From: Pierre Ossman @ 2007-12-01 14:42 UTC (permalink / raw)
  To: LKML

The latest GCC in Fedora rawhide contains a serious bug (or provokes a latent one in the kernel) that makes every kernel it builds unbootable. The kernel just locks up halfway through init. Kernels that previously worked fine now all show the same symptom; even RH's own kernels exhibit it. The kernel built on Nov 24th works, the one built on Nov 26th doesn't. gcc was updated on the 26th, 14 hours earlier.

The last message printed is:

isapnp: Scanning for PnP cards...

Comparing with the working kernel, the boot log should continue like this:

isapnp: Scanning for PnP cards...
Switched to high resolution mode on CPU 0
isapnp: No Plug & Play device found

Any ideas on how I can work around this? I'm rather unproductive when I can't build working kernels.. :/

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-01 14:42 Fedora's latest gcc produces unbootable kernels Pierre Ossman
@ 2007-12-01 17:47 ` Pierre Ossman
  2007-12-01 18:37   ` Bill Davidsen
  2007-12-01 20:20   ` Pierre Ossman
  0 siblings, 2 replies; 10+ messages in thread
From: Pierre Ossman @ 2007-12-01 17:47 UTC (permalink / raw)
  To: LKML; +Cc: jakub

On Sat, 1 Dec 2007 15:42:23 +0100
Pierre Ossman <drzeus-list@drzeus.cx> wrote:

> The latest GCC in Fedora rawhide contains some serious bug (or provokes a latent one in the kernel) that makes every kernel built unbootable. It just locks up halfway through the init. Kernels that previously worked fine all now experience the same symptom. Even RH's own kernels exhibit this. The kernel built Nov 24th works, Nov 26th doesn't. gcc was updated 26th, 14 hours earlier.
> 

Digging a bit further, it is indeed the high-res stuff (the first missing message) that hangs. If I hard-code the kernel to be non-high-res capable, it boots, but timekeeping is horribly broken.

Anyway, hopefully this means I'll soon have the object file that gets miscompiled. Jakub also pointed me to an older gcc RPM so that I can produce an object file with that as well and see what differs.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-01 17:47 ` Pierre Ossman
@ 2007-12-01 18:37   ` Bill Davidsen
  2007-12-01 20:11     ` Pierre Ossman
  2007-12-01 20:20   ` Pierre Ossman
  1 sibling, 1 reply; 10+ messages in thread
From: Bill Davidsen @ 2007-12-01 18:37 UTC (permalink / raw)
  To: Pierre Ossman; +Cc: LKML, jakub

Pierre Ossman wrote:
> On Sat, 1 Dec 2007 15:42:23 +0100
> Pierre Ossman <drzeus-list@drzeus.cx> wrote:
> 
>> The latest GCC in Fedora rawhide contains some serious bug (or provokes a latent one in the kernel) that makes every kernel built unbootable. It just locks up halfway through the init. Kernels that previously worked fine all now experience the same symptom. Even RH's own kernels exhibit this. The kernel built Nov 24th works, Nov 26th doesn't. gcc was updated 26th, 14 hours earlier.
>>
> 
> Digging a bit further, it is indeed the high-res stuff (the first missing message) that hangs. If I hard code the kernel to just be non-high-res capable, it boots, but time keeping is horribly broken.
> 
> Anyway, hopefully this means I'll soon have the object file that gets miscompiled. Jakub also pointed me to an older gcc RPM so that I can produce an object file with that as well and see what differs.
> 
If you are referring to the "compat" RPMs, be aware that they use the 
current headers, which is a good or bad thing depending on what you want 
to do. If you want to build old software, you get to keep a down-rev 
virtual machine to do it right :-(

-- 
Bill Davidsen <davidsen@tmr.com>
   "We have more to fear from the bungling of the incompetent than from
the machinations of the wicked."  - from Slashdot


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-01 18:37   ` Bill Davidsen
@ 2007-12-01 20:11     ` Pierre Ossman
  0 siblings, 0 replies; 10+ messages in thread
From: Pierre Ossman @ 2007-12-01 20:11 UTC (permalink / raw)
  To: Bill Davidsen; +Cc: LKML, jakub

On Sat, 01 Dec 2007 13:37:44 -0500
Bill Davidsen <davidsen@tmr.com> wrote:

> If you are referring to the "compat" RPMs, be aware that they use the 
> current headers, which is a good or bad thing depending on what you want 
> to do. If you want to build old software, you get to keep a down-rev 
> virtual machine to do it right :-(
> 

Nah. The previous gcc package is the one shipped with Fedora 8. So I could just grab that one (plus cpp and libgomp) and downgrade.

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-01 17:47 ` Pierre Ossman
  2007-12-01 18:37   ` Bill Davidsen
@ 2007-12-01 20:20   ` Pierre Ossman
  2007-12-03  8:17     ` Thomas Gleixner
  1 sibling, 1 reply; 10+ messages in thread
From: Pierre Ossman @ 2007-12-01 20:20 UTC (permalink / raw)
  To: Pierre Ossman; +Cc: LKML, jakub, Thomas Gleixner


On Sat, 1 Dec 2007 18:47:52 +0100
Pierre Ossman <drzeus-list@drzeus.cx> wrote:

> On Sat, 1 Dec 2007 15:42:23 +0100
> Pierre Ossman <drzeus-list@drzeus.cx> wrote:
> 
> > The latest GCC in Fedora rawhide contains some serious bug (or provokes a latent one in the kernel) that makes every kernel built unbootable. It just locks up halfway through the init. Kernels that previously worked fine all now experience the same symptom. Even RH's own kernels exhibit this. The kernel built Nov 24th works, Nov 26th doesn't. gcc was updated 26th, 14 hours earlier.
> > 
> 
> Digging a bit further, it is indeed the high-res stuff (the first missing message) that hangs. If I hard code the kernel to just be non-high-res capable, it boots, but time keeping is horribly broken.
> 
> Anyway, hopefully this means I'll soon have the object file that gets miscompiled. Jakub also pointed me to an older gcc RPM so that I can produce an object file with that as well and see what differs.
> 

I've now pinpointed where it hangs. In fact it doesn't hang; it gets stuck in an infinite loop in tick_setup_sched_timer():

	for (;;) {
		hrtimer_forward(&ts->sched_timer, now, tick_period);
		hrtimer_start(&ts->sched_timer, ts->sched_timer.expires,
			      HRTIMER_MODE_ABS);
		/* Check, if the timer was already in the past */
		if (hrtimer_active(&ts->sched_timer))
			break;
		now = ktime_get();
	}

I've added Thomas as cc as this is his domain, so perhaps he has some idea what the compiler does wrong here. I've also included the two object files (one good, one bad). HEAD is v2.6.24-rc3.
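
To make the failure mode concrete, here is a rough userspace model of that loop. It is only a sketch: the model_* names are made up for illustration and are not kernel APIs, and the real hrtimer code is considerably more involved. The assumption is that the start step refuses to queue a timer whose expiry is already in the past, so if the forward step fails to move the expiry ahead of "now", the loop can never break out.

```c
#include <stdint.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
struct model_timer {
	int64_t expires;	/* absolute expiry time in ns */
	int active;		/* nonzero once the timer is queued */
};

/* Advance expires in whole intervals until it lies after now. */
static void model_forward(struct model_timer *t, int64_t now, int64_t interval)
{
	if (t->expires <= now) {
		int64_t missed = (now - t->expires) / interval + 1;

		t->expires += missed * interval;
	}
}

/* Queue the timer only if its expiry is still in the future. */
static void model_start(struct model_timer *t, int64_t now)
{
	t->active = t->expires > now;
}

/*
 * Same shape as the loop quoted above. Returns the number of iterations
 * needed, or -1 if the loop did not terminate within max_iter iterations.
 * A nonzero broken_forward models a forward step that fails to update
 * expires, which is an assumption about how a miscompile could show up.
 */
static int model_setup(struct model_timer *t, int64_t now, int64_t interval,
		       int broken_forward, int max_iter)
{
	int i;

	for (i = 1; i <= max_iter; i++) {
		if (!broken_forward)
			model_forward(t, now, interval);
		model_start(t, now);
		/* Check if the timer was already in the past */
		if (t->active)
			return i;
		now += 1;	/* stands in for now = ktime_get() */
	}
	return -1;
}
```

With a working forward step the model leaves the loop on the first pass; with the broken one the expiry stays in the past and the iteration cap is the only thing that stops it.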

Rgds
-- 
     -- Pierre Ossman

  Linux kernel, MMC maintainer        http://www.kernel.org
  PulseAudio, core developer          http://pulseaudio.org
  rdesktop, core developer          http://www.rdesktop.org

[-- Attachment #2: tick-sched.bad --]
[-- Type: application/octet-stream, Size: 64104 bytes --]

[-- Attachment #3: tick-sched.good --]
[-- Type: application/octet-stream, Size: 64116 bytes --]


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-01 20:20   ` Pierre Ossman
@ 2007-12-03  8:17     ` Thomas Gleixner
  2007-12-03  8:58       ` Jakub Jelinek
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2007-12-03  8:17 UTC (permalink / raw)
  To: Pierre Ossman; +Cc: Pierre Ossman, LKML, jakub, Thomas Gleixner

> On Sat, 1 Dec 2007 18:47:52 +0100
> Pierre Ossman <drzeus-list@drzeus.cx> wrote:
>>> The latest GCC in Fedora rawhide contains some serious bug (or
>>> provokes a latent one in the kernel) that makes every kernel built
>>> unbootable. It just locks up halfway through the init. Kernels that
>>> previously worked fine all now experience the same symptom. Even RH's
>>> own kernels exhibit this. The kernel built Nov 24th works, Nov 26th
>>> doesn't. gcc was updated 26th, 14 hours earlier.
>>>
>>
>> Digging a bit further, it is indeed the high-res stuff (the first
>> missing message) that hangs. If I hard code the kernel to just be
>> non-high-res capable, it boots, but time keeping is horribly broken.
>>
>> Anyway, hopefully this means I'll soon have the object file that gets
>> miscompiled. Jakub also pointed me to an older gcc RPM so that I can
>> produce an object file with that as well and see what differs.
>>
>
> I've now pinpointed where it hangs. And it doesn't hang in fact. It gets
> stuck in an infinite loop in tick_setup_sched_timer():
>
> 	for (;;) {
> 		hrtimer_forward(&ts->sched_timer, now, tick_period);
> 		hrtimer_start(&ts->sched_timer, ts->sched_timer.expires,
> 			      HRTIMER_MODE_ABS);
> 		/* Check, if the timer was already in the past */
> 		if (hrtimer_active(&ts->sched_timer))
> 			break;
> 		now = ktime_get();
> 	}
>
> I've added Thomas as cc as this is his domain, so perhaps he has some idea
> what the compiler does wrong here. I've also included the two object files
> (one good, one bad). HEAD is v2.6.24-rc3.

I looked at the disassembly but I cannot spot the problem.

I think the real problem is somewhere else. Likely candidates are
hrtimer_forward() or hrtimer_start() - in that order.

Thanks,

      tglx

P.S.: I have restricted network access today, so I cannot reproduce it myself.



* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-03  8:17     ` Thomas Gleixner
@ 2007-12-03  8:58       ` Jakub Jelinek
  2007-12-03 11:34         ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Jelinek @ 2007-12-03  8:58 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Pierre Ossman, LKML, Thomas Gleixner

On Mon, Dec 03, 2007 at 09:17:22AM +0100, Thomas Gleixner wrote:
> I looked at the disassembly but I can not spot the problem.
> 
> I think the real problem is somewhere else. Likely candidates are
> hrtimer_forward() or hrtimer_start() - in that order.

This should hopefully be fixed in the latest Fedora gcc.  The problem was in code like:
typedef union { long long int s; } U;
typedef struct { U u; } S;

void foo (S *s, long long int x, unsigned long int y)
{
  s->u = ({ (U) { .s = s->u.s + x * y }; });
}

Here, a backport of a recent optimization of mine sets the LHS object
directly from the compound literal's initializer rather than forcing
creation of a temporary object (the compound literal).  Without that
optimization gcc handles initializers from compound literals terribly, and
hrtimer uses them just about everywhere (why can't ktime.h, for
#if BITS_PER_LONG == 64 || defined(CONFIG_KTIME_SCALAR), just use a scalar
rather than a union with a scalar in it??).  Unfortunately the gimplifier
had some bugs in the case where the initializer references (or at least
might reference) parts of the LHS object.  Fixed by backporting 2 Ada
bugfixes for the gimplifier from GCC trunk (Ada was hitting those bugs even
without this compound literal optimization).
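
Why a self-referencing initializer is dangerous for this kind of optimization is easier to see with two fields than with one. The sketch below is an illustration only, not the reduced test case above and not actual gcc output: struct pair and both functions are made-up names. In C the compound literal is a separate unnamed object, so every read of the old value happens before anything is stored back. Writing the fields in place one at a time, without making sure the reads come first, gives a different result.

```c
struct pair {
	long long a;
	long long b;
};

/*
 * Correct semantics: the compound literal is a distinct unnamed object,
 * so p->b and p->a are both read before *p is overwritten.
 */
static void swap_fields(struct pair *p)
{
	*p = (struct pair){ .a = p->b, .b = p->a };
}

/*
 * Hand-written model of a naive in-place initialization of the LHS:
 * the second statement reads a value the first one already clobbered.
 */
static void swap_fields_in_place(struct pair *p)
{
	p->a = p->b;
	p->b = p->a;	/* reads the new a, not the original one */
}
```

Starting from { 1, 2 }, the first function yields { 2, 1 }, while the in-place version yields { 2, 2 }.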

	Jakub


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-03  8:58       ` Jakub Jelinek
@ 2007-12-03 11:34         ` Thomas Gleixner
  2007-12-03 11:51           ` Jakub Jelinek
  0 siblings, 1 reply; 10+ messages in thread
From: Thomas Gleixner @ 2007-12-03 11:34 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Thomas Gleixner, Pierre Ossman, LKML

On Mon, 3 Dec 2007, Jakub Jelinek wrote:

> On Mon, Dec 03, 2007 at 09:17:22AM +0100, Thomas Gleixner wrote:
> > I looked at the disassembly but I can not spot the problem.
> > 
> > I think the real problem is somewhere else. Likely candidates are
> > hrtimer_forward() or hrtimer_start() - in that order.
> 
> Should be hopefully fixed in latest Fedora gcc.  The problem was in code like
> typedef union { long long int s; } U;
> typedef struct { U u; } S;
> 
> void foo (S *s, long long int x, unsigned long int y)
> {
>   s->u = ({ (U) { .s = s->u.s + x * y }; });
> }
> 
> where a backport of a recent optimization of mine, without which gcc handles
> terribly initializers from compound literals (which is something hrtimer
> uses just everywhere - why can't ktime.h for #if BITS_PER_LONG == 64 || defined(CONFIG_KTIME_SCALAR)
> just use a scalar rather than union with a scalar in it??),

Of course just to annoy you :)

Seriously, we want the same code/initializers for both the scalar and the
sec/nsec case. That's where the union comes from.
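
A simplified userspace sketch of that idea, with made-up model_* names and a fixed field order (the real ktime.h layout depends on configuration and endianness, which this does not try to reproduce): because both variants are unions with a tv64 member, one initializer spelling is valid for either of them.

```c
#include <stdint.h>

/* Scalar variant: a union that only holds the 64-bit nanosecond count. */
typedef union {
	int64_t tv64;
} model_scalar_ktime_t;

/* Split variant: the same 64 bits are also visible as a sec/nsec pair. */
typedef union {
	int64_t tv64;
	struct {
		int32_t nsec, sec;
	} tv;
} model_split_ktime_t;

/* One initializer spelling that is valid for either layout. */
#define MODEL_KTIME_INIT(ns) { .tv64 = (ns) }

static model_scalar_ktime_t model_scalar_zero(void)
{
	return (model_scalar_ktime_t)MODEL_KTIME_INIT(0);
}

static model_split_ktime_t model_split_zero(void)
{
	return (model_split_ktime_t)MODEL_KTIME_INIT(0);
}
```

Code that only touches tv64 compiles unchanged against both types, which is the property the union buys.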

Thanks,

	tglx


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-03 11:34         ` Thomas Gleixner
@ 2007-12-03 11:51           ` Jakub Jelinek
  2007-12-03 12:03             ` Thomas Gleixner
  0 siblings, 1 reply; 10+ messages in thread
From: Jakub Jelinek @ 2007-12-03 11:51 UTC (permalink / raw)
  To: Thomas Gleixner; +Cc: Thomas Gleixner, Pierre Ossman, LKML

On Mon, Dec 03, 2007 at 12:34:17PM +0100, Thomas Gleixner wrote:
> Of course just to annoy you :)

What matters is not whether I'm annoyed about this, but whether gcc is
able to generate decent code with it.  And especially with a union it
is not, at least through all the tree SSA passes.  You already have a lot of
the details hidden in ktime.h accessor inlines, so I don't think it would be
hard to add one or two more.

Anyway, even just using typedef struct ktime { s64 tv64; } ktime_t; could
make things better in case you have just one field.  Unlike unions, structs
can be (and in this case most likely will be) scalarized by SRA, so
half of tree SSA passes will see it as integral var and will be able to
perform optimizations on it.
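
Roughly what I have in mind, as a userspace sketch only: int64_t stands in for s64, and the model_* accessor names are hypothetical, not existing ktime.h helpers. Whether SRA actually scalarizes this depends on the gcc version and flags, so treat that part as an expectation rather than a guarantee.

```c
#include <stdint.h>

/* Single-field struct instead of a union around the scalar. */
typedef struct model_ktime {
	int64_t tv64;
} model_ktime_t;

/* Build a value from a nanosecond count. */
static inline model_ktime_t model_ktime_from_ns(int64_t ns)
{
	return (model_ktime_t){ .tv64 = ns };
}

/* Read the nanosecond count back out. */
static inline int64_t model_ktime_to_ns(model_ktime_t kt)
{
	return kt.tv64;
}

/* Add nanoseconds without callers touching the field directly. */
static inline model_ktime_t model_ktime_add_ns(model_ktime_t kt, int64_t ns)
{
	return model_ktime_from_ns(model_ktime_to_ns(kt) + ns);
}
```

As long as users go through the accessors, the representation behind them can change without touching the callers.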

	Jakub


* Re: Fedora's latest gcc produces unbootable kernels
  2007-12-03 11:51           ` Jakub Jelinek
@ 2007-12-03 12:03             ` Thomas Gleixner
  0 siblings, 0 replies; 10+ messages in thread
From: Thomas Gleixner @ 2007-12-03 12:03 UTC (permalink / raw)
  To: Jakub Jelinek; +Cc: Thomas Gleixner, Pierre Ossman, LKML

On Mon, 3 Dec 2007, Jakub Jelinek wrote:

> On Mon, Dec 03, 2007 at 12:34:17PM +0100, Thomas Gleixner wrote:
> > Of course just to annoy you :)
> 
> It doesn't matter whether I'm annoyed about this or not, but whether gcc is
> able to generate decent code with it or not.  And especially with union it
> is not, at least through all the tree ssa passes.  You already have a lot of
> the details hidden in ktime.h accessor inlines, so I don't think it would be
> hard to add further one or two.
> 
> Anyway, even just using typedef struct ktime { s64 tv64; } ktime_t; could
> make things better in case you have just one field.  Unlike unions, structs
> can be (and in this case most likely will be) scalarized by SRA, so
> half of tree SSA passes will see it as integral var and will be able to
> perform optimizations on it.

Makes sense. I'll look into fixing that.

Thanks,

	tglx

