All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
@ 2009-11-09  4:21 Yong Wang
  2009-11-09  7:08 ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Yong Wang @ 2009-11-09  4:21 UTC (permalink / raw)
  To: Ingo Molnar, H. Peter Anvin, Thomas Gleixner
  Cc: Suresh Siddha, Arjan van de Ven, linux-kernel

Changes since v2:
Incorporate Ingo's comments to get BSP's thermal init value in a
'boot-CPU-only' function.

---

On platforms where bios handles the thermal monitor interrupt,
APIC_LVTTHMR on each logical CPU is programmed to generate a SMI and OS
can't touch it.

Unfortunately AP bringup sequence using INIT-SIPI-SIPI clear all
the LVT entries except the mask bit. Essentially this results in
all LVT entries including the thermal monitoring interrupt set to masked
(clearing the bios programmed value for APIC_LVTTHMR).

And this leads to kernel take over the thermal monitoring interrupt
on AP's but not on BSP (leaving the bios programmed value only on BSP).

As a result of this, we have seen system hangs when the thermal
monitoring interrupt is generated.

Fix this by reading the initial value of thermal LVT entry on BSP
and if bios has taken over the control, then program the same value
on all AP's and leave the thermal monitoring interrupt control
on all the logical cpu's to the bios.

Signed-off-by: Yong Wang <yong.y.wang@intel.com>
Cc: stable@kernel.org
---
 arch/x86/include/asm/apic.h              |    6 ++++++
 arch/x86/kernel/cpu/mcheck/therm_throt.c |   27 ++++++++++++++++++++++++++-
 arch/x86/kernel/setup.c                  |    1 +
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h
index 474d80d..97ea588 100644
--- a/arch/x86/include/asm/apic.h
+++ b/arch/x86/include/asm/apic.h
@@ -594,4 +594,10 @@ static inline physid_mask_t default_apicid_to_cpu_present(int phys_apicid)
 extern u8 cpu_2_logical_apicid[NR_CPUS];
 #endif
 
+#ifdef CONFIG_X86_THERMAL_VECTOR
+extern void get_bsp_lvtthmr_init(void);
+#else
+static inline void get_bsp_lvtthmr_init(void) { }
+#endif
+
 #endif /* _ASM_X86_APIC_H */
diff --git a/arch/x86/kernel/cpu/mcheck/therm_throt.c b/arch/x86/kernel/cpu/mcheck/therm_throt.c
index b3a1dba..55c5995 100644
--- a/arch/x86/kernel/cpu/mcheck/therm_throt.c
+++ b/arch/x86/kernel/cpu/mcheck/therm_throt.c
@@ -49,6 +49,8 @@ static DEFINE_PER_CPU(struct thermal_state, thermal_state);
 
 static atomic_t therm_throt_en	= ATOMIC_INIT(0);
 
+static u32 lvtthmr_init __read_mostly;
+
 #ifdef CONFIG_SYSFS
 #define define_therm_throt_sysdev_one_ro(_name)				\
 	static SYSDEV_ATTR(_name, 0444, therm_throt_sysdev_show_##_name, NULL)
@@ -254,6 +256,16 @@ asmlinkage void smp_thermal_interrupt(struct pt_regs *regs)
 	ack_APIC_irq();
 }
 
+void get_bsp_lvtthmr_init(void)
+{
+	/*
+	 * This function is only called on boot CPU. Save the init thermal
+	 * LVT value on BSP and use that value to restore APs' thermal LVT
+	 * entry BIOS programmed later
+	 */
+	lvtthmr_init = apic_read(APIC_LVTTHMR);
+}
+
 void intel_init_thermal(struct cpuinfo_x86 *c)
 {
 	unsigned int cpu = smp_processor_id();
@@ -270,7 +282,20 @@ void intel_init_thermal(struct cpuinfo_x86 *c)
 	 * since it might be delivered via SMI already:
 	 */
 	rdmsr(MSR_IA32_MISC_ENABLE, l, h);
-	h = apic_read(APIC_LVTTHMR);
+
+	/*
+	 * The initial value of thermal LVT entries on all APs always reads
+	 * 0x10000 because APs are woken up by BSP issuing INIT-SIPI-SIPI
+	 * sequence to them and LVT registers are reset to 0s except for
+	 * the mask bits which are set to 1s when APs receive INIT IPI.
+	 * Always restore the value that BIOS has programmed on AP based on
+	 * BSP's info we saved since BIOS is always setting the same value
+	 * for all threads/cores
+	 */
+	apic_write(APIC_LVTTHMR, lvtthmr_init);
+
+	h = lvtthmr_init;
+
 	if ((l & MSR_IA32_MISC_ENABLE_TM1) && (h & APIC_DM_SMI)) {
 		printk(KERN_DEBUG
 		       "CPU%d: Thermal monitoring handled by SMI\n", cpu);
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index e09f0e2..2323771 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1002,6 +1002,7 @@ void __init setup_arch(char **cmdline_p)
 
 	init_apic_mappings();
 	ioapic_init_mappings();
+	get_bsp_lvtthmr_init();
 
 	/* need to wait for io_apic is mapped */
 	probe_nr_irqs_gsi();

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
  2009-11-09  4:21 [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value Yong Wang
@ 2009-11-09  7:08 ` Ingo Molnar
  2009-11-09  7:25   ` Yong Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2009-11-09  7:08 UTC (permalink / raw)
  To: Yong Wang
  Cc: H. Peter Anvin, Thomas Gleixner, Suresh Siddha, Arjan van de Ven,
	linux-kernel


* Yong Wang <yong.y.wang@linux.intel.com> wrote:

> Changes since v2:
> Incorporate Ingo's comments to get BSP's thermal init value in a
> 'boot-CPU-only' function.

> diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> index e09f0e2..2323771 100644
> --- a/arch/x86/kernel/setup.c
> +++ b/arch/x86/kernel/setup.c
> @@ -1002,6 +1002,7 @@ void __init setup_arch(char **cmdline_p)
>  
>  	init_apic_mappings();
>  	ioapic_init_mappings();
> +	get_bsp_lvtthmr_init();
>  
>  	/* need to wait for io_apic is mapped */
>  	probe_nr_irqs_gsi();

Ok - it's almost good in this form - it would be nice to name the new 
function something more generic, like mcheck_intel_therm_init(), and 
call it from arch/x86/kernel/cpu/mcheck/mce.c's mcheck_init() function.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
  2009-11-09  7:08 ` Ingo Molnar
@ 2009-11-09  7:25   ` Yong Wang
  2009-11-09  7:59     ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Yong Wang @ 2009-11-09  7:25 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: H. Peter Anvin, Thomas Gleixner, Suresh Siddha, Arjan van de Ven,
	linux-kernel

On Mon, Nov 09, 2009 at 08:08:44AM +0100, Ingo Molnar wrote:
> 
> * Yong Wang <yong.y.wang@linux.intel.com> wrote:
> 
> > Changes since v2:
> > Incorporate Ingo's comments to get BSP's thermal init value in a
> > 'boot-CPU-only' function.
> 
> > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > index e09f0e2..2323771 100644
> > --- a/arch/x86/kernel/setup.c
> > +++ b/arch/x86/kernel/setup.c
> > @@ -1002,6 +1002,7 @@ void __init setup_arch(char **cmdline_p)
> >  
> >  	init_apic_mappings();
> >  	ioapic_init_mappings();
> > +	get_bsp_lvtthmr_init();
> >  
> >  	/* need to wait for io_apic is mapped */
> >  	probe_nr_irqs_gsi();
> 
> Ok - it's almost good in this form - it would be nice to name the new 
> function something more generic, like mcheck_intel_therm_init(), and 
> call it from arch/x86/kernel/cpu/mcheck/mce.c's mcheck_init() function.
> 

I just checked that arch/x86/kernel/cpu/mcheck/mce.c's mcheck_init()
will not only run on BSP but also on APs. I put get_bsp_lvtthmr_init()
right behind init_apic_mappings() because init_apic_mappings() will
setup the fixmap for LAPIC so that I can call apic_read from inside
get_bsp_lvtthmr_init().

I can rename get_bsp_lvtthmr_init() to mcheck_intel_therm_init() to make
it more generic as we might add more stuff into it in the future.

Thanks for your comments.
-Yong

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
  2009-11-09  7:25   ` Yong Wang
@ 2009-11-09  7:59     ` Ingo Molnar
  2009-11-09  8:31       ` Yong Wang
  0 siblings, 1 reply; 6+ messages in thread
From: Ingo Molnar @ 2009-11-09  7:59 UTC (permalink / raw)
  To: Yong Wang, Borislav Petkov
  Cc: H. Peter Anvin, Thomas Gleixner, Suresh Siddha, Arjan van de Ven,
	linux-kernel


* Yong Wang <yong.y.wang@linux.intel.com> wrote:

> On Mon, Nov 09, 2009 at 08:08:44AM +0100, Ingo Molnar wrote:
> > 
> > * Yong Wang <yong.y.wang@linux.intel.com> wrote:
> > 
> > > Changes since v2:
> > > Incorporate Ingo's comments to get BSP's thermal init value in a
> > > 'boot-CPU-only' function.
> > 
> > > diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
> > > index e09f0e2..2323771 100644
> > > --- a/arch/x86/kernel/setup.c
> > > +++ b/arch/x86/kernel/setup.c
> > > @@ -1002,6 +1002,7 @@ void __init setup_arch(char **cmdline_p)
> > >  
> > >  	init_apic_mappings();
> > >  	ioapic_init_mappings();
> > > +	get_bsp_lvtthmr_init();
> > >  
> > >  	/* need to wait for io_apic is mapped */
> > >  	probe_nr_irqs_gsi();
> > 
> > Ok - it's almost good in this form - it would be nice to name the new 
> > function something more generic, like mcheck_intel_therm_init(), and 
> > call it from arch/x86/kernel/cpu/mcheck/mce.c's mcheck_init() function.
> > 
> 
> I just checked that arch/x86/kernel/cpu/mcheck/mce.c's mcheck_init() 
> will not only run on BSP but also on APs. I put get_bsp_lvtthmr_init() 
> right behind init_apic_mappings() because init_apic_mappings() will 
> setup the fixmap for LAPIC so that I can call apic_read from inside 
> get_bsp_lvtthmr_init().

That's true in Linus's tree but Boris Petkov fixed the MCE init 
functions naming mess in the x86 tree, in these commits:

 b33a636: x86, mce: Add a global MCE init helper
 5e09954: x86, mce: Fix up MCE naming nomenclature

Please base your patch on:

  http://people.redhat.com/mingo/tip.git/README

to have the latest MCE code. So if you stick that init function into 
mcheck_init() in -tip it should be called once per bootup only. (I have 
not checked the other boot dependencies, please do that.)

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
  2009-11-09  7:59     ` Ingo Molnar
@ 2009-11-09  8:31       ` Yong Wang
  2009-11-09  9:06         ` Ingo Molnar
  0 siblings, 1 reply; 6+ messages in thread
From: Yong Wang @ 2009-11-09  8:31 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Suresh Siddha,
	Arjan van de Ven, linux-kernel

On Mon, Nov 09, 2009 at 08:59:57AM +0100, Ingo Molnar wrote:
> 
> That's true in Linus's tree but Boris Petkov fixed the MCE init 
> functions naming mess in the x86 tree, in these commits:
> 
>  b33a636: x86, mce: Add a global MCE init helper
>  5e09954: x86, mce: Fix up MCE naming nomenclature
> 
> Please base your patch on:
> 
>   http://people.redhat.com/mingo/tip.git/README
> 
> to have the latest MCE code. So if you stick that init function into 
> mcheck_init() in -tip it should be called once per bootup only. (I have 
> not checked the other boot dependencies, please do that.)
> 

By the time mcheck_init() executes, intel_init_thermal() has finished
running on BSP. Any suggestions on how to resolve this issue?

Thanks for your insights!
-Yong

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value
  2009-11-09  8:31       ` Yong Wang
@ 2009-11-09  9:06         ` Ingo Molnar
  0 siblings, 0 replies; 6+ messages in thread
From: Ingo Molnar @ 2009-11-09  9:06 UTC (permalink / raw)
  To: Yong Wang
  Cc: Borislav Petkov, H. Peter Anvin, Thomas Gleixner, Suresh Siddha,
	Arjan van de Ven, linux-kernel


* Yong Wang <yong.y.wang@linux.intel.com> wrote:

> On Mon, Nov 09, 2009 at 08:59:57AM +0100, Ingo Molnar wrote:
> > 
> > That's true in Linus's tree but Boris Petkov fixed the MCE init 
> > functions naming mess in the x86 tree, in these commits:
> > 
> >  b33a636: x86, mce: Add a global MCE init helper
> >  5e09954: x86, mce: Fix up MCE naming nomenclature
> > 
> > Please base your patch on:
> > 
> >   http://people.redhat.com/mingo/tip.git/README
> > 
> > to have the latest MCE code. So if you stick that init function into 
> > mcheck_init() in -tip it should be called once per bootup only. (I have 
> > not checked the other boot dependencies, please do that.)
> 
> By the time mcheck_init() executes, intel_init_thermal() has finished 
> running on BSP. Any suggestions on how to resolve this issue?

it's an early_initcall() right now - if that's not early enough then 
please call it explicitly from setup.c like you did it with the thermal 
init function from your previous patch. (and remove the early_initcall() 
line)

	Ingo

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2009-11-09  9:06 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-11-09  4:21 [PATCH v3] x86: under bios control, restore AP's APIC_LVTTHMR to the BSP value Yong Wang
2009-11-09  7:08 ` Ingo Molnar
2009-11-09  7:25   ` Yong Wang
2009-11-09  7:59     ` Ingo Molnar
2009-11-09  8:31       ` Yong Wang
2009-11-09  9:06         ` Ingo Molnar

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.