* 3.11-rc2: panic in __rdmsr_on_cpu
@ 2013-07-25 22:32 Ilia Mirkin
  2013-07-26 11:59 ` Ilia Mirkin
  0 siblings, 1 reply; 4+ messages in thread
From: Ilia Mirkin @ 2013-07-25 22:32 UTC
  To: linux-kernel

Hi,

I just built a 3.11-rc2 kernel (+ a few patches, but nothing
arch-related), and I saw the following: http://i.imgur.com/dCTqOyR.jpg

The rough transcription is

Call Trace:
<IRQ>
generic_smp_call_function_single_interrupt
smp_call_function_single_interrupt
call_function_single_interrupt
<EOI>
? default_idle
? default_idle
arch_cpu_idle
cpu_startup_entry
rest_init
start_kernel
? repair_env_string
x86_64_start_reservations
x86_64_start_kernel
Code: ... cc 81 8b 0f <0f> 32 48 c1 e2 20 89 c0 ...
RIP: __rdmsr_on_cpu+0x2e/0x44
Kernel panic - not syncing: Fatal exception in interrupt

A 3.10-rc7 kernel booted just fine. Is this likely a real issue? Or
perhaps a mis-build of some sort?
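
A note on reading the Code: line, offered as a sketch rather than a full disassembly: the byte in angle brackets is where RIP pointed, and 0f 32 is the x86 encoding of rdmsr. The bytes after it (48 c1 e2 20, then 89 c0) decode to shl $0x20,%rdx and mov %eax,%eax, which fits the usual merging of the EDX:EAX result into one 64-bit value. The user-space C below only mirrors the opcode check and that merge. The function names are made up for illustration and are not kernel APIs, and the final OR is an assumption, since the Code: line is truncated before it.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true if the bytes at the faulting RIP begin with
 * 0f 32, the x86 encoding of rdmsr. */
static int starts_with_rdmsr(const uint8_t *code, size_t len)
{
    return len >= 2 && code[0] == 0x0f && code[1] == 0x32;
}

/* Hypothetical helper: merge the two 32-bit halves that rdmsr leaves in
 * EDX (high) and EAX (low). The OR is assumed; it is not visible in the
 * truncated Code: line. */
static uint64_t merge_edx_eax(uint32_t edx, uint32_t eax)
{
    return ((uint64_t)edx << 32) | eax;
}

/* Bytes from the trace above, starting at the marked <0f>. */
static const uint8_t code_at_rip[] = {
    0x0f, 0x32, 0x48, 0xc1, 0xe2, 0x20, 0x89, 0xc0
};
```

With code_at_rip as defined, starts_with_rdmsr(code_at_rip, sizeof code_at_rip) is nonzero, which is consistent with the RIP landing inside __rdmsr_on_cpu.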

Thanks for any advice,

  -ilia


* Re: 3.11-rc2: panic in __rdmsr_on_cpu
  2013-07-25 22:32 3.11-rc2: panic in __rdmsr_on_cpu Ilia Mirkin
@ 2013-07-26 11:59 ` Ilia Mirkin
  2013-07-26 13:15   ` Ilia Mirkin
  0 siblings, 1 reply; 4+ messages in thread
From: Ilia Mirkin @ 2013-07-26 11:59 UTC
  To: linux-kernel

FWIW this is repeatable. I did a clean build (make clean && make) and
I still see the same thing. I have a Core i7-920 CPU; I'm not sure what
other information would be relevant. I'd love to avoid a bisect, so
some likely candidates would be most welcome.

Thanks,

  -ilia


* Re: 3.11-rc2: panic in __rdmsr_on_cpu
  2013-07-26 11:59 ` Ilia Mirkin
@ 2013-07-26 13:15   ` Ilia Mirkin
  2013-07-26 15:00     ` Srinivas Pandruvada
  0 siblings, 1 reply; 4+ messages in thread
From: Ilia Mirkin @ 2013-07-26 13:15 UTC
  To: linux-kernel, Srinivas Pandruvada, Zhang Rui

Aha, figured it out. I had enabled "X86 package temperature thermal
driver" = Y, which caused my Core i7-920 to produce the above trace on
boot. Glancing over the code, should this:

                if (!cpu_has(c, X86_FEATURE_DTHERM) &&
                                        !cpu_has(c, X86_FEATURE_PTS))
                        return -ENODEV;

perhaps be

                if (!cpu_has(c, X86_FEATURE_DTHERM) ||
                                        !cpu_has(c, X86_FEATURE_PTS))
                        return -ENODEV;

i.e. are both of those features required, or just one of them? My CPU
has DTHERM but not PTS, according to /proc/cpuinfo.
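
To make the difference concrete, here is a minimal user-space sketch of the two conditions. It is not the driver code: the booleans stand in for cpu_has(c, X86_FEATURE_DTHERM) and cpu_has(c, X86_FEATURE_PTS), and the function names probe_with_and and probe_with_or are made up for illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* Check as quoted first: give up only when BOTH features are missing. */
static int probe_with_and(bool has_dtherm, bool has_pts)
{
    if (!has_dtherm && !has_pts)
        return -ENODEV;
    return 0;   /* setup would continue */
}

/* Proposed check: give up when EITHER feature is missing. */
static int probe_with_or(bool has_dtherm, bool has_pts)
{
    if (!has_dtherm || !has_pts)
        return -ENODEV;
    return 0;   /* setup would continue */
}
```

For a CPU with DTHERM but not PTS, probe_with_and(true, false) returns 0, so setup continues, while probe_with_or(true, false) returns -ENODEV. The two forms agree when both features are present or both are absent.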

  -ilia


* Re: 3.11-rc2: panic in __rdmsr_on_cpu
  2013-07-26 13:15   ` Ilia Mirkin
@ 2013-07-26 15:00     ` Srinivas Pandruvada
  0 siblings, 0 replies; 4+ messages in thread
From: Srinivas Pandruvada @ 2013-07-26 15:00 UTC
  To: Ilia Mirkin; +Cc: linux-kernel, Zhang Rui

This is already fixed, and the fix is in Linus's mainline tree. See
commit f3ed0a17f0292300b3caca32d823ecd32554a667.

Thanks for the analysis; you are correct.

Thanks,
Srinivas
