linux-kernel.vger.kernel.org archive mirror
* 2.6.34-rc2 - crash on shutdown
@ 2010-03-23  9:02 David R
  2010-03-23 12:02 ` Clemens Ladisch
  0 siblings, 1 reply; 7+ messages in thread
From: David R @ 2010-03-23  9:02 UTC (permalink / raw)
  To: linux-kernel

I recently upgraded my kernel to the .34 rc, and I'm getting the  
following crash on shutdown (see attached console image)

(I never ran rc1 so the problem may well be present there too).

I can follow up with bootup dmesg and config later this evening if required.

Cheers
David


[-- Attachment #2: shutdown.jpg --]
[-- Type: image/jpeg, Size: 73239 bytes --]


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23  9:02 2.6.34-rc2 - crash on shutdown David R
@ 2010-03-23 12:02 ` Clemens Ladisch
  2010-03-23 13:31   ` Stephane Eranian
  0 siblings, 1 reply; 7+ messages in thread
From: Clemens Ladisch @ 2010-03-23 12:02 UTC (permalink / raw)
  To: David R; +Cc: Stephane Eranian, linux-kernel

David R wrote:
> I recently upgraded my kernel to the .34 rc, and I'm getting the  
> following crash on shutdown (see attached console image)

Me too, also in amd_pmu_cpu_offline().

The only pointer access in this function is cpuhw->amd_nb, but
I don't see any obvious bugs.


Regards,
Clemens


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23 12:02 ` Clemens Ladisch
@ 2010-03-23 13:31   ` Stephane Eranian
  2010-03-23 13:52     ` Clemens Ladisch
  0 siblings, 1 reply; 7+ messages in thread
From: Stephane Eranian @ 2010-03-23 13:31 UTC (permalink / raw)
  To: Clemens Ladisch; +Cc: David R, linux-kernel

On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
> David R wrote:
>> I recently upgraded my kernel to the .34 rc, and I'm getting the
>> following crash on shutdown (see attached console image)
>
> Me too, also in amd_pmu_cpu_offline().
>
> The only pointer access in this function is cpuhw->amd_nb, but
> I don't see any obvious bugs.
>
>
I reported a problem with the AMD initialization just last week.
There is an issue with amd_pmu_cpu_online() which gets called
too early, and thus fails. That leaves some bogus state and causes
a crash in amd_pmu_cpu_offline().

I proposed a fix which was rejected. The alternative involves moving
some of the CPU initialization code (on AMD) to an earlier position, i.e.,
one that would be executed before the CPU_STARTED notifier. Nobody
has proposed anything else so far.


> Regards,
> Clemens
>



-- 
Stephane Eranian  | EMEA Software Engineering
Google France | 38 avenue de l'Opéra | 75002 Paris
Tel : +33 (0) 1 42 68 53 00
This email may be confidential or privileged. If you received this
communication by mistake, please
don't forward it to anyone else, please erase all copies and
attachments, and please let me know that
it went to the wrong person. Thanks


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23 13:31   ` Stephane Eranian
@ 2010-03-23 13:52     ` Clemens Ladisch
  2010-03-23 22:18       ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Clemens Ladisch @ 2010-03-23 13:52 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: David R, linux-kernel

Stephane Eranian wrote:
> On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
> > The only pointer access in this function is cpuhw->amd_nb, but
> > I don't see any obvious bugs.
> 
> I reported a problem with the AMD initialization just last week.
> There is an issue with amd_pmu_cpu_online() which gets called
> too early, and thus fails. That leaves some bogus state and causes
> a crash in amd_pmu_cpu_offline().
> 
> I proposed a fix which was rejected. The alternative involves moving
> some of the CPU initialization code (on AMD) to an earlier position, i.e.,
> one that would be executed before the CPU_STARTED notifier. Nobody
> has proposed anything else so far.

I don't know about the early bootmem stuff, but regardless of this issue,
if amd_pmu_cpu_online() can fail, then amd_pmu_cpu_offline() must be able
to handle this without blowing up.  Something like this (untested):

Signed-off-by: Clemens Ladisch <clemens@ladisch.de>

--- a/arch/x86/kernel/cpu/perf_event_amd.c
+++ b/arch/x86/kernel/cpu/perf_event_amd.c
@@ -324,17 +324,17 @@ static void amd_pmu_cpu_online(int cpu)
 	if (boot_cpu_data.x86_max_cores < 2)
 		return;
 
+	cpu1 = &per_cpu(cpu_hw_events, cpu);
+	cpu1->amd_nb = NULL;
+
 	/*
 	 * function may be called too early in the
 	 * boot process, in which case nb_id is bogus
 	 */
 	nb_id = amd_get_nb_id(cpu);
 	if (nb_id == BAD_APICID)
 		return;
 
-	cpu1 = &per_cpu(cpu_hw_events, cpu);
-	cpu1->amd_nb = NULL;
-
 	raw_spin_lock(&amd_nb_lock);
 
 	for_each_online_cpu(i) {
@@ -370,7 +370,7 @@ static void amd_pmu_cpu_offline(int cpu)
 
 	raw_spin_lock(&amd_nb_lock);
 
-	if (--cpuhw->amd_nb->refcnt == 0)
+	if (cpuhw->amd_nb && --cpuhw->amd_nb->refcnt == 0)
 		kfree(cpuhw->amd_nb);
 
 	cpuhw->amd_nb = NULL;
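
The two changes in the patch above can be modelled outside the kernel. This is
only a minimal user-space sketch: struct nb_model, struct cpu_model,
model_cpu_online(), model_cpu_offline() and live_nbs are hypothetical names
made up for illustration, malloc()/free() stand in for the kernel allocator,
and locking and sharing of one northbridge between CPUs are left out. It
shows the pointer being cleared before the early return, and the offline path
tolerating a NULL pointer.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel structures, not the real definitions. */
struct nb_model {
	int refcnt;
};

struct cpu_model {
	struct nb_model *nb;	/* models cpuhw->amd_nb */
};

/* Number of nb_model objects currently allocated (illustration only). */
static int live_nbs;

/*
 * Models the online path. id_valid == 0 stands for the "called too
 * early" case in which the northbridge id is bogus.
 */
static void model_cpu_online(struct cpu_model *cpu, int id_valid)
{
	assert(cpu != NULL);

	/* Clear the pointer first, so an early return leaves defined state. */
	cpu->nb = NULL;

	if (!id_valid)
		return;

	cpu->nb = malloc(sizeof(*cpu->nb));
	if (!cpu->nb)
		return;

	cpu->nb->refcnt = 1;
	live_nbs++;
}

/* Models the offline path: must cope with online having bailed out. */
static void model_cpu_offline(struct cpu_model *cpu)
{
	assert(cpu != NULL);

	/* Only touch the refcount if online actually attached something. */
	if (cpu->nb && --cpu->nb->refcnt == 0) {
		free(cpu->nb);
		live_nbs--;
	}

	cpu->nb = NULL;
}
```

Calling model_cpu_online(&c, 0) followed by model_cpu_offline(&c) is the
sequence that corresponds to the reported shutdown crash; in this model it
is a no-op, because the pointer is NULL by the time the offline path runs.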


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23 13:52     ` Clemens Ladisch
@ 2010-03-23 22:18       ` Rafael J. Wysocki
  2010-03-23 22:40         ` Stephane Eranian
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2010-03-23 22:18 UTC (permalink / raw)
  To: Clemens Ladisch; +Cc: Stephane Eranian, David R, linux-kernel

On Tuesday 23 March 2010, Clemens Ladisch wrote:
> Stephane Eranian wrote:
> > On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
> > > The only pointer access in this function is cpuhw->amd_nb, but
> > > I don't see any obvious bugs.
> > 
> > I reported a problem with the AMD initialization just last week.
> > There is an issue with amd_pmu_cpu_online() which gets called
> > too early, and thus fails. That leaves some bogus state and causes
> > a crash in amd_pmu_cpu_offline().
> > 
> > I proposed a fix which was rejected. The alternative involves moving
> > some of the CPU initialization code (on AMD) to an earlier position, i.e.,
> > one that would be executed before the CPU_STARTED notifier. Nobody
> > has proposed anything else so far.
> 
> I don't know about the early bootmem stuff, but regardless of this issue,
> if amd_pmu_cpu_online() can fail, then amd_pmu_cpu_offline() must be able
> to handle this without blowing up.  Something like this (untested):

I guess we handle that already:

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a90110c61073eab95d1986322693c2b9a8a6a5f6

Rafael


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23 22:18       ` Rafael J. Wysocki
@ 2010-03-23 22:40         ` Stephane Eranian
  2010-03-23 23:27           ` Rafael J. Wysocki
  0 siblings, 1 reply; 7+ messages in thread
From: Stephane Eranian @ 2010-03-23 22:40 UTC (permalink / raw)
  To: Rafael J. Wysocki; +Cc: Clemens Ladisch, David R, linux-kernel

On Tue, Mar 23, 2010 at 11:18 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Tuesday 23 March 2010, Clemens Ladisch wrote:
>> Stephane Eranian wrote:
>> > On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
>> > > The only pointer access in this function is cpuhw->amd_nb, but
>> > > I don't see any obvious bugs.
>> >
>> > I reported a problem with the AMD initialization just last week.
>> > There is an issue with amd_pmu_cpu_online() which gets called
>> > too early, and thus fails. That leaves some bogus state and causes
>> > a crash in amd_pmu_cpu_offline().
>> >
>> > I proposed a fix which was rejected. The alternative involves moving
>> > some of the CPU initialization code (on AMD) to an earlier position, i.e.,
>> > one that would be executed before the CPU_STARTED notifier. Nobody
>> > has proposed anything else so far.
>>
>> I don't know about the early bootmem stuff, but regardless of this issue,
>> if amd_pmu_cpu_online() can fail, then amd_pmu_cpu_offline() must be able
>> to handle this without blowing up.  Something like this (untested):
>
> I guess we handle that already:
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a90110c61073eab95d1986322693c2b9a8a6a5f6
>
Ok, the fix avoids the crash but perf_events support for AMD is still broken.

The root of the problem is elsewhere as I pointed out last week. Peter proposed
a patch today and I think this would be enough to avoid the crash and have
perf_events working again on AMD.


* Re: 2.6.34-rc2 - crash on shutdown
  2010-03-23 22:40         ` Stephane Eranian
@ 2010-03-23 23:27           ` Rafael J. Wysocki
  0 siblings, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2010-03-23 23:27 UTC (permalink / raw)
  To: Stephane Eranian; +Cc: Clemens Ladisch, David R, linux-kernel

On Tuesday 23 March 2010, Stephane Eranian wrote:
> On Tue, Mar 23, 2010 at 11:18 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> > On Tuesday 23 March 2010, Clemens Ladisch wrote:
> >> Stephane Eranian wrote:
> >> > On Tue, Mar 23, 2010 at 1:02 PM, Clemens Ladisch <clemens@ladisch.de> wrote:
> >> > > The only pointer access in this function is cpuhw->amd_nb, but
> >> > > I don't see any obvious bugs.
> >> >
> >> > I reported a problem with the AMD initialization just last week.
> >> > There is an issue with amd_pmu_cpu_online() which gets called
> >> > too early, and thus fails. That leaves some bogus state and causes
> >> > a crash in amd_pmu_cpu_offline().
> >> >
> >> > I proposed a fix which was rejected. The alternative involves moving
> >> > some of the CPU initialization code (on AMD) to an earlier position, i.e.,
> >> > one that would be executed before the CPU_STARTED notifier. Nobody
> >> > has proposed anything else so far.
> >>
> >> I don't know about the early bootmem stuff, but regardless of this issue,
> >> if amd_pmu_cpu_online() can fail, then amd_pmu_cpu_offline() must be able
> >> to handle this without blowing up.  Something like this (untested):
> >
> > I guess we handle that already:
> >
> > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=a90110c61073eab95d1986322693c2b9a8a6a5f6
> >
> Ok, the fix avoids the crash but perf_events support for AMD is still broken.
> 
> The root of the problem is elsewhere as I pointed out last week. Peter proposed
> a patch today and I think this would be enough to avoid the crash and have
> perf_events working again on AMD.

Yes, I saw Peter's patch.

Thanks,
Rafael
