linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
@ 2019-08-01  6:24 Daniel Drake
  2019-08-01  7:16 ` Aubrey Li
  2019-08-01  7:27 ` Thomas Gleixner
  0 siblings, 2 replies; 27+ messages in thread
From: Daniel Drake @ 2019-08-01  6:24 UTC (permalink / raw)
  To: Thomas Gleixner, x86, aubrey.li, Ingo Molnar, H . Peter Anvin
  Cc: Linux Kernel, Endless Linux Upstreaming Team

Hi,

Working with a new consumer laptop based on AMD R7-3700U, we are
seeing a kernel panic during early boot (before the display
initializes). It's a new product and there is no previous known
working kernel version (tested 5.0, 5.2 and current linus master).

We may have also seen this problem on a MiniPC based on AMD APU 7010
from another vendor, but we don't have it in hands right now to
confirm that it's the exact same crash.

earlycon shows the details: a NULL dereference under
setup_boot_APIC_clock(), which actually happens in
calibrate_APIC_clock():

    /* Replace the global interrupt handler */
    real_handler = global_clock_event->event_handler;
    global_clock_event->event_handler = lapic_cal_handler;

global_clock_event is NULL here. This is a "reduced hardware" ACPI
platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
avoiding the usual codepaths that would set up global_clock_event.

I tried the obvious:
 if (!global_clock_event)
    return -1;

However I'm probably missing part of the big picture here, as this
only makes boot fail later on. It continues til the next point that
something leads to schedule(), such as a driver calling msleep() or
mark_readonly() calling rcu_barrier(), etc. Then it hangs.

Is something missing in terms of timer setup here? Suggestions appreciated...

Thanks
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  6:24 setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
@ 2019-08-01  7:16 ` Aubrey Li
  2019-08-01  7:35   ` Thomas Gleixner
  2019-08-01  9:00   ` setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
  2019-08-01  7:27 ` Thomas Gleixner
  1 sibling, 2 replies; 27+ messages in thread
From: Aubrey Li @ 2019-08-01  7:16 UTC (permalink / raw)
  To: Daniel Drake
  Cc: Thomas Gleixner, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, Aug 1, 2019 at 2:26 PM Daniel Drake <drake@endlessm.com> wrote:
>
> Hi,
>
> Working with a new consumer laptop based on AMD R7-3700U, we are
> seeing a kernel panic during early boot (before the display
> initializes). It's a new product and there is no previous known
> working kernel version (tested 5.0, 5.2 and current linus master).
>
> We may have also seen this problem on a MiniPC based on AMD APU 7010
> from another vendor, but we don't have it in hands right now to
> confirm that it's the exact same crash.
>
> earlycon shows the details: a NULL dereference under
> setup_boot_APIC_clock(), which actually happens in
> calibrate_APIC_clock():
>
>     /* Replace the global interrupt handler */
>     real_handler = global_clock_event->event_handler;
>     global_clock_event->event_handler = lapic_cal_handler;
>
> global_clock_event is NULL here. This is a "reduced hardware" ACPI
> platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
> avoiding the usual codepaths that would set up global_clock_event.
>
IIRC, acpi_generic_reduced_hw_init() avoids initializing PIT, the status of
this legacy device is unknown in ACPI hw-reduced mode.

> I tried the obvious:
>  if (!global_clock_event)
>     return -1;
>
No, the platform needs a global clock event, can you turn on some other
clock source on your platform, like HPET?

Thanks,
-Aubrey

> However I'm probably missing part of the big picture here, as this
> only makes boot fail later on. It continues til the next point that
> something leads to schedule(), such as a driver calling msleep() or
> mark_readonly() calling rcu_barrier(), etc. Then it hangs.
>
> Is something missing in terms of timer setup here? Suggestions appreciated...
>
> Thanks
> Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  6:24 setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
  2019-08-01  7:16 ` Aubrey Li
@ 2019-08-01  7:27 ` Thomas Gleixner
  1 sibling, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-01  7:27 UTC (permalink / raw)
  To: Daniel Drake
  Cc: x86, aubrey.li, Ingo Molnar, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team

Daniel,

On Thu, 1 Aug 2019, Daniel Drake wrote:
> Working with a new consumer laptop based on AMD R7-3700U, we are
> seeing a kernel panic during early boot (before the display
> initializes). It's a new product and there is no previous known
> working kernel version (tested 5.0, 5.2 and current linus master).
> 
> We may have also seen this problem on a MiniPC based on AMD APU 7010
> from another vendor, but we don't have it in hands right now to
> confirm that it's the exact same crash.
> 
> earlycon shows the details: a NULL dereference under
> setup_boot_APIC_clock(), which actually happens in
> calibrate_APIC_clock():
> 
>     /* Replace the global interrupt handler */
>     real_handler = global_clock_event->event_handler;
>     global_clock_event->event_handler = lapic_cal_handler;
> 
> global_clock_event is NULL here. This is a "reduced hardware" ACPI
> platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
> avoiding the usual codepaths that would set up global_clock_event.
> 
> I tried the obvious:
>  if (!global_clock_event)
>     return -1;
> 
> However I'm probably missing part of the big picture here, as this
> only makes boot fail later on. It continues til the next point that
> something leads to schedule(), such as a driver calling msleep() or
> mark_readonly() calling rcu_barrier(), etc. Then it hangs.
> 
> Is something missing in terms of timer setup here? Suggestions
> appreciated...

So that trips over the problem that there is no timer to calibrate against
and the LAPIC freuency is obviously unknown.

How is the kernel supposed to figure that out?

The only possible option in that case is to use RTC, but we have no support
for this at all.

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  7:16 ` Aubrey Li
@ 2019-08-01  7:35   ` Thomas Gleixner
  2019-08-01  7:56     ` Aubrey Li
  2019-08-01  9:00   ` setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
  1 sibling, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-01  7:35 UTC (permalink / raw)
  To: Aubrey Li
  Cc: Daniel Drake, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, 1 Aug 2019, Aubrey Li wrote:
> On Thu, Aug 1, 2019 at 2:26 PM Daniel Drake <drake@endlessm.com> wrote:
> > global_clock_event is NULL here. This is a "reduced hardware" ACPI
> > platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
> > avoiding the usual codepaths that would set up global_clock_event.
> >
> IIRC, acpi_generic_reduced_hw_init() avoids initializing PIT, the status of
> this legacy device is unknown in ACPI hw-reduced mode.
> 
> > I tried the obvious:
> >  if (!global_clock_event)
> >     return -1;
> >
> No, the platform needs a global clock event, can you turn on some other

Wrong. The kernel boots perfectly fine without a global clock event. But
for that the TSC and LAPIC frequency must be known.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  7:35   ` Thomas Gleixner
@ 2019-08-01  7:56     ` Aubrey Li
  2019-08-01  8:13       ` Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Aubrey Li @ 2019-08-01  7:56 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Daniel Drake, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, Aug 1, 2019 at 3:35 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Thu, 1 Aug 2019, Aubrey Li wrote:
> > On Thu, Aug 1, 2019 at 2:26 PM Daniel Drake <drake@endlessm.com> wrote:
> > > global_clock_event is NULL here. This is a "reduced hardware" ACPI
> > > platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
> > > avoiding the usual codepaths that would set up global_clock_event.
> > >
> > IIRC, acpi_generic_reduced_hw_init() avoids initializing PIT, the status of
> > this legacy device is unknown in ACPI hw-reduced mode.
> >
> > > I tried the obvious:
> > >  if (!global_clock_event)
> > >     return -1;
> > >
> > No, the platform needs a global clock event, can you turn on some other
>
> Wrong. The kernel boots perfectly fine without a global clock event. But
> for that the TSC and LAPIC frequency must be known.

I think LAPIC fast calibrate is only supported on intel platform, while
Daniel's box is an AMD platform. That's why lapic_init_clockevent() failed
and fall into the code path which needs a global clock event.

Thanks,
-Aubrey

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  7:56     ` Aubrey Li
@ 2019-08-01  8:13       ` Thomas Gleixner
  2019-08-01  8:21         ` Li, Aubrey
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-01  8:13 UTC (permalink / raw)
  To: Aubrey Li
  Cc: Daniel Drake, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, 1 Aug 2019, Aubrey Li wrote:
> On Thu, Aug 1, 2019 at 3:35 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Thu, 1 Aug 2019, Aubrey Li wrote:
> > > On Thu, Aug 1, 2019 at 2:26 PM Daniel Drake <drake@endlessm.com> wrote:
> > > > global_clock_event is NULL here. This is a "reduced hardware" ACPI
> > > > platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
> > > > avoiding the usual codepaths that would set up global_clock_event.
> > > >
> > > IIRC, acpi_generic_reduced_hw_init() avoids initializing PIT, the status of
> > > this legacy device is unknown in ACPI hw-reduced mode.
> > >
> > > > I tried the obvious:
> > > >  if (!global_clock_event)
> > > >     return -1;
> > > >
> > > No, the platform needs a global clock event, can you turn on some other
> >
> > Wrong. The kernel boots perfectly fine without a global clock event. But
> > for that the TSC and LAPIC frequency must be known.
> 
> I think LAPIC fast calibrate is only supported on intel platform, while
> Daniel's box is an AMD platform. That's why lapic_init_clockevent() failed
> and fall into the code path which needs a global clock event.

We know that.

The point is that it does not matter which vendor a CPU comes from. The
kernel does support legacyless boot when the frequencies are known. Whether
that's currently possible on that particular CPU is a different question.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  8:13       ` Thomas Gleixner
@ 2019-08-01  8:21         ` Li, Aubrey
  2019-08-01 10:13           ` Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Li, Aubrey @ 2019-08-01  8:21 UTC (permalink / raw)
  To: Thomas Gleixner, Aubrey Li
  Cc: Daniel Drake, x86, Ingo Molnar, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team

On 2019/8/1 16:13, Thomas Gleixner wrote:
> On Thu, 1 Aug 2019, Aubrey Li wrote:
>> On Thu, Aug 1, 2019 at 3:35 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>>>
>>> On Thu, 1 Aug 2019, Aubrey Li wrote:
>>>> On Thu, Aug 1, 2019 at 2:26 PM Daniel Drake <drake@endlessm.com> wrote:
>>>>> global_clock_event is NULL here. This is a "reduced hardware" ACPI
>>>>> platform so acpi_generic_reduced_hw_init() has set timer_init to NULL,
>>>>> avoiding the usual codepaths that would set up global_clock_event.
>>>>>
>>>> IIRC, acpi_generic_reduced_hw_init() avoids initializing PIT, the status of
>>>> this legacy device is unknown in ACPI hw-reduced mode.
>>>>
>>>>> I tried the obvious:
>>>>>  if (!global_clock_event)
>>>>>     return -1;
>>>>>
>>>> No, the platform needs a global clock event, can you turn on some other
>>>
>>> Wrong. The kernel boots perfectly fine without a global clock event. But
>>> for that the TSC and LAPIC frequency must be known.
>>
>> I think LAPIC fast calibrate is only supported on intel platform, while
>> Daniel's box is an AMD platform. That's why lapic_init_clockevent() failed
>> and fall into the code path which needs a global clock event.
> 
> We know that.
> 
> The point is that it does not matter which vendor a CPU comes from. The
> kernel does support legacyless boot when the frequencies are known. Whether
> that's currently possible on that particular CPU is a different question.
> 
Yeah, I should specify, Daniel, your platform needs a global clock event, ;-)

Thanks,
-Aubrey

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  7:16 ` Aubrey Li
  2019-08-01  7:35   ` Thomas Gleixner
@ 2019-08-01  9:00   ` Daniel Drake
  2019-08-01 10:17     ` Thomas Gleixner
  1 sibling, 1 reply; 27+ messages in thread
From: Daniel Drake @ 2019-08-01  9:00 UTC (permalink / raw)
  To: Aubrey Li
  Cc: Thomas Gleixner, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, Aug 1, 2019 at 3:16 PM Aubrey Li <aubrey.intel@gmail.com> wrote:
> No, the platform needs a global clock event, can you turn on some other
> clock source on your platform, like HPET?

Thanks Audrey and Thomas for the quick hints!

I double checked under Windows - it seems to be using a HPET there.
Also there is the HPET ACPI table. So I think this is the right angle
to look at.

Under Linux, hpet_legacy_clockevent_register() is the function where
global_clock_event can be set to HPET.

However, the only way this can be called is from hpet_enable().

hpet_enable() is called from 2 places:
 1. From hpet_time_init(). This is the default x86 timer_init that
acpi_generic_reduced_hw_init() took out of action here.
 2. From hpet_late_init(). However that function is only called late,
after calibrate_APIC_clock() has already crashed the kernel. Also,
even if moved earlier it would also not call hpet_enable() here
because the ACPI HPET table parsing has already populated
hpet_address.

I tried slotting in a call to hpet_enable() at an earlier point
regardless, but I still end up with the kernel hanging later during
boot, probably because irq0 fails to be setup and this error is hit:
    if (setup_irq(0, &irq0))
        pr_info("Failed to register legacy timer interrupt\n");

I'll go deeper into that; further hints welcome too.

Thanks
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  8:21         ` Li, Aubrey
@ 2019-08-01 10:13           ` Thomas Gleixner
  2019-08-01 15:44             ` Lendacky, Thomas
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-01 10:13 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Aubrey Li, Daniel Drake, x86, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team, Tom Lendacky

On Thu, 1 Aug 2019, Li, Aubrey wrote:
> On 2019/8/1 16:13, Thomas Gleixner wrote:
> > The point is that it does not matter which vendor a CPU comes from. The
> > kernel does support legacyless boot when the frequencies are known. Whether
> > that's currently possible on that particular CPU is a different question.
> > 
> Yeah, I should specify, Daniel, your platform needs a global clock event, ;-)

Care to look at the manuals before making assumptions?

  2.1.9 Timers

   Each core includes the following timers. These timers do not vary in
   frequency regardless of the current P-state or C-state.

   * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
     P0 Pstate. See Core::X86::Msr::PStateDef.

   * The APIC timer (Core::X86::Apic::TimerInitialCount and
     Core::X86::Apic::TimerCurrentCount), which increments at the rate of
     2xCLKIN; the APIC timer may increment in units of between 1 and 8.

The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
whether this is subject to overclocking. If so then it should be possible
to figure that out somehow. Tom?

Thanks,

	tglx


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01  9:00   ` setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
@ 2019-08-01 10:17     ` Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-01 10:17 UTC (permalink / raw)
  To: Daniel Drake
  Cc: Aubrey Li, x86, Li, Aubrey, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On Thu, 1 Aug 2019, Daniel Drake wrote:
> On Thu, Aug 1, 2019 at 3:16 PM Aubrey Li <aubrey.intel@gmail.com> wrote:
> However, the only way this can be called is from hpet_enable().
> 
> hpet_enable() is called from 2 places:
>  1. From hpet_time_init(). This is the default x86 timer_init that
> acpi_generic_reduced_hw_init() took out of action here.
>  2. From hpet_late_init(). However that function is only called late,
> after calibrate_APIC_clock() has already crashed the kernel. Also,
> even if moved earlier it would also not call hpet_enable() here
> because the ACPI HPET table parsing has already populated
> hpet_address.
> 
> I tried slotting in a call to hpet_enable() at an earlier point
> regardless, but I still end up with the kernel hanging later during
> boot, probably because irq0 fails to be setup and this error is hit:
>     if (setup_irq(0, &irq0))
>         pr_info("Failed to register legacy timer interrupt\n");

Right. The thing also lacks PIT :)

So there are two options:

   1) Make sure the HPET is parsed somehow even with the reduced stuff

   2) Make the clock frequency detection work.

#1 is a trainwreck

#2 is something we really want to have anyway. See the other reply. I cc'ed
   Tom there, he should be able to give us the missing link.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01 10:13           ` Thomas Gleixner
@ 2019-08-01 15:44             ` Lendacky, Thomas
  2019-08-08 20:36               ` Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Lendacky, Thomas @ 2019-08-01 15:44 UTC (permalink / raw)
  To: Thomas Gleixner, Li, Aubrey
  Cc: Aubrey Li, Daniel Drake, x86, Ingo Molnar, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team

On 8/1/19 5:13 AM, Thomas Gleixner wrote:
> On Thu, 1 Aug 2019, Li, Aubrey wrote:
>> On 2019/8/1 16:13, Thomas Gleixner wrote:
>>> The point is that it does not matter which vendor a CPU comes from. The
>>> kernel does support legacyless boot when the frequencies are known. Whether
>>> that's currently possible on that particular CPU is a different question.
>>>
>> Yeah, I should specify, Daniel, your platform needs a global clock event, ;-)
> 
> Care to look at the manuals before making assumptions?
> 
>   2.1.9 Timers
> 
>    Each core includes the following timers. These timers do not vary in
>    frequency regardless of the current P-state or C-state.
> 
>    * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
>      P0 Pstate. See Core::X86::Msr::PStateDef.
> 
>    * The APIC timer (Core::X86::Apic::TimerInitialCount and
>      Core::X86::Apic::TimerCurrentCount), which increments at the rate of
>      2xCLKIN; the APIC timer may increment in units of between 1 and 8.
> 
> The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
> whether this is subject to overclocking. If so then it should be possible
> to figure that out somehow. Tom?

Let me check with the hardware folks and I'll get back to you.

Thanks,
Tom

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-01 15:44             ` Lendacky, Thomas
@ 2019-08-08 20:36               ` Thomas Gleixner
  2019-08-08 21:01                 ` Lendacky, Thomas
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-08 20:36 UTC (permalink / raw)
  To: Lendacky, Thomas
  Cc: Li, Aubrey, Aubrey Li, Daniel Drake, x86, Ingo Molnar,
	H . Peter Anvin, Linux Kernel, Endless Linux Upstreaming Team

Tom,

On Thu, 1 Aug 2019, Lendacky, Thomas wrote:
> On 8/1/19 5:13 AM, Thomas Gleixner wrote:
> >   2.1.9 Timers
> > 
> >    Each core includes the following timers. These timers do not vary in
> >    frequency regardless of the current P-state or C-state.
> > 
> >    * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
> >      P0 Pstate. See Core::X86::Msr::PStateDef.
> > 
> >    * The APIC timer (Core::X86::Apic::TimerInitialCount and
> >      Core::X86::Apic::TimerCurrentCount), which increments at the rate of
> >      2xCLKIN; the APIC timer may increment in units of between 1 and 8.
> > 
> > The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
> > whether this is subject to overclocking. If so then it should be possible
> > to figure that out somehow. Tom?
> 
> Let me check with the hardware folks and I'll get back to you.

any update on this? The problem seems to come in from several sides now.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-08 20:36               ` Thomas Gleixner
@ 2019-08-08 21:01                 ` Lendacky, Thomas
  2019-08-08 21:08                   ` Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Lendacky, Thomas @ 2019-08-08 21:01 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Li, Aubrey, Aubrey Li, Daniel Drake, x86, Ingo Molnar,
	H . Peter Anvin, Linux Kernel, Endless Linux Upstreaming Team

Hi Thomas,

On 8/8/19 3:36 PM, Thomas Gleixner wrote:
> Tom,
> 
> On Thu, 1 Aug 2019, Lendacky, Thomas wrote:
>> On 8/1/19 5:13 AM, Thomas Gleixner wrote:
>>>    2.1.9 Timers
>>>
>>>     Each core includes the following timers. These timers do not vary in
>>>     frequency regardless of the current P-state or C-state.
>>>
>>>     * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
>>>       P0 Pstate. See Core::X86::Msr::PStateDef.
>>>
>>>     * The APIC timer (Core::X86::Apic::TimerInitialCount and
>>>       Core::X86::Apic::TimerCurrentCount), which increments at the rate of
>>>       2xCLKIN; the APIC timer may increment in units of between 1 and 8.
>>>
>>> The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
>>> whether this is subject to overclocking. If so then it should be possible
>>> to figure that out somehow. Tom?
>>
>> Let me check with the hardware folks and I'll get back to you.
> 
> any update on this? The problem seems to come in from several sides now.

Yes, sort of. There are two ways to overclock and it all depends on which
one was used. If the overclocking is done by changing the multipliers,
then that 100MHz clock will still be 100MHz. But if the overclocking is
done by increasing the input clock, then that 100MHz clock will also
increase.

I was trying to get a bit more clarification on this before replying, but
it can be detected in software. The base clock is 100MHz, so read the P0
multiplier and the TSC should be counting at P0 * 100MHz. If you calibrate
the speed of the TSC with the HPET you can see what speed the TSC is
counting at. If you divide the TSC delta from the HPET calibration by the
P0 multiplier you will either get 100MHz if there is no overclocking or if
the multiplier method of overclocking was used, otherwise you'll get a
higher value if the input clock method was used. Either way, that should
give you the APIC clock speed based on a starting assumption of 100MHz.

I'm not all that familiar with this stuff, but I think I translated what
was told to me properly. I'm also not sure if this method can be used at
the point in the code this is all happening. Let me know if it makes sense
or not.

Thanks,
Tom

> 
> Thanks,
> 
> 	tglx
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-08 21:01                 ` Lendacky, Thomas
@ 2019-08-08 21:08                   ` Thomas Gleixner
  2019-08-08 21:36                     ` Lendacky, Thomas
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-08 21:08 UTC (permalink / raw)
  To: Lendacky, Thomas
  Cc: Li, Aubrey, Aubrey Li, Daniel Drake, x86, Ingo Molnar,
	H . Peter Anvin, Linux Kernel, Endless Linux Upstreaming Team

Tom,

On Thu, 8 Aug 2019, Lendacky, Thomas wrote:
> On 8/8/19 3:36 PM, Thomas Gleixner wrote:
> > On Thu, 1 Aug 2019, Lendacky, Thomas wrote:
> >> On 8/1/19 5:13 AM, Thomas Gleixner wrote:
> >>>    2.1.9 Timers
> >>>
> >>>     Each core includes the following timers. These timers do not vary in
> >>>     frequency regardless of the current P-state or C-state.
> >>>
> >>>     * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
> >>>       P0 Pstate. See Core::X86::Msr::PStateDef.
> >>>
> >>>     * The APIC timer (Core::X86::Apic::TimerInitialCount and
> >>>       Core::X86::Apic::TimerCurrentCount), which increments at the rate of
> >>>       2xCLKIN; the APIC timer may increment in units of between 1 and 8.
> >>>
> >>> The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
> >>> whether this is subject to overclocking. If so then it should be possible
> >>> to figure that out somehow. Tom?
> >>
> >> Let me check with the hardware folks and I'll get back to you.
> > 
> > any update on this? The problem seems to come in from several sides now.
> 
> Yes, sort of. There are two ways to overclock and it all depends on which
> one was used. If the overclocking is done by changing the multipliers,
> then that 100MHz clock will still be 100MHz. But if the overclocking is
> done by increasing the input clock, then that 100MHz clock will also
> increase.
> 
> I was trying to get a bit more clarification on this before replying, but
> it can be detected in software. The base clock is 100MHz, so read the P0
> multiplier and the TSC should be counting at P0 * 100MHz. If you calibrate
> the speed of the TSC with the HPET you can see what speed the TSC is
> counting at. If you divide the TSC delta from the HPET calibration by the
> P0 multiplier you will either get 100MHz if there is no overclocking or if
> the multiplier method of overclocking was used, otherwise you'll get a
> higher value if the input clock method was used. Either way, that should
> give you the APIC clock speed based on a starting assumption of 100MHz.

The problem is that we have no HPET on those machines ....

I think I can get away without having HPET and PIT and do some smart stuff
with the pm timer for that stuff. I'll look at it tomorrow with brain
actually awake.

Thanks,

	tglx






^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms
  2019-08-08 21:08                   ` Thomas Gleixner
@ 2019-08-08 21:36                     ` Lendacky, Thomas
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Lendacky, Thomas @ 2019-08-08 21:36 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Li, Aubrey, Aubrey Li, Daniel Drake, x86, Ingo Molnar,
	H . Peter Anvin, Linux Kernel, Endless Linux Upstreaming Team

Hi Thomas,

On 8/8/19 4:08 PM, Thomas Gleixner wrote:
> Tom,
> 
> On Thu, 8 Aug 2019, Lendacky, Thomas wrote:
>> On 8/8/19 3:36 PM, Thomas Gleixner wrote:
>>> On Thu, 1 Aug 2019, Lendacky, Thomas wrote:
>>>> On 8/1/19 5:13 AM, Thomas Gleixner wrote:
>>>>>     2.1.9 Timers
>>>>>
>>>>>      Each core includes the following timers. These timers do not vary in
>>>>>      frequency regardless of the current P-state or C-state.
>>>>>
>>>>>      * Core::X86::Msr::TSC; the TSC increments at the rate specified by the
>>>>>        P0 Pstate. See Core::X86::Msr::PStateDef.
>>>>>
>>>>>      * The APIC timer (Core::X86::Apic::TimerInitialCount and
>>>>>        Core::X86::Apic::TimerCurrentCount), which increments at the rate of
>>>>>        2xCLKIN; the APIC timer may increment in units of between 1 and 8.
>>>>>
>>>>> The Ryzens use a 100MHz input clock for the APIC normally, but I'm not sure
>>>>> whether this is subject to overclocking. If so then it should be possible
>>>>> to figure that out somehow. Tom?
>>>>
>>>> Let me check with the hardware folks and I'll get back to you.
>>>
>>> any update on this? The problem seems to come in from several sides now.
>>
>> Yes, sort of. There are two ways to overclock and it all depends on which
>> one was used. If the overclocking is done by changing the multipliers,
>> then that 100MHz clock will still be 100MHz. But if the overclocking is
>> done by increasing the input clock, then that 100MHz clock will also
>> increase.
>>
>> I was trying to get a bit more clarification on this before replying, but
>> it can be detected in software. The base clock is 100MHz, so read the P0
>> multiplier and the TSC should be counting at P0 * 100MHz. If you calibrate
>> the speed of the TSC with the HPET you can see what speed the TSC is
>> counting at. If you divide the TSC delta from the HPET calibration by the
>> P0 multiplier you will either get 100MHz if there is no overclocking or if
>> the multiplier method of overclocking was used, otherwise you'll get a
>> higher value if the input clock method was used. Either way, that should
>> give you the APIC clock speed based on a starting assumption of 100MHz.
> 
> The problem is that we have no HPET on those machines ....

Sorry about that...  I interpreted the email[1] that said the HPET ACPI
table was present, incorrectly. I get it now, for hardware-reduced ACPI
you can't depend on that table to be present.

Thanks,
Tom

[1] https://lore.kernel.org/lkml/CAD8Lp452GdoL-Bt7rSP=u3RKEZ2H3qm3LvKfe=cCsjP0biG_sQ@mail.gmail.com/
 
> 
> I think I can get away without having HPET and PIT and do some smart stuff
> with the pm timer for that stuff. I'll look at it tomorrow with brain
> actually awake.
> 
> Thanks,
> 
> 	tglx
> 
> 
> 
> 
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-08 21:36                     ` Lendacky, Thomas
@ 2019-08-09 12:54                       ` Thomas Gleixner
  2019-08-09 12:59                         ` Jiri Slaby
                                           ` (4 more replies)
  0 siblings, 5 replies; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-09 12:54 UTC (permalink / raw)
  To: Lendacky, Thomas
  Cc: Li, Aubrey, Daniel Drake, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.

Some Ryzens machines do not advertize legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.

As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.

Switch the calibration loop to a non interrupt based one, which polls
either TSC (frequency known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.

Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
---

Note: Only lightly tested, but of course not on an affected machine.

      If that works reliably, then this needs some exhaustive testing
      on a wide range of systems so we can risk backports to stable
      kernels.

---
 arch/x86/kernel/apic/apic.c |   70 +++++++++++++++++++++++++++++++++-----------
 include/linux/acpi_pmtmr.h  |   10 ++++++
 2 files changed, 64 insertions(+), 16 deletions(-)

--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -851,7 +851,8 @@ bool __init apic_needs_pit(void)
 static int __init calibrate_APIC_clock(void)
 {
 	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
-	void (*real_handler)(struct clock_event_device *dev);
+	u64 tsc_perj = 0, tsc_start = 0;
+	unsigned long jif_start;
 	unsigned long deltaj;
 	long delta, deltatsc;
 	int pm_referenced = 0;
@@ -878,29 +879,65 @@ static int __init calibrate_APIC_clock(v
 	apic_printk(APIC_VERBOSE, "Using local APIC timer interrupts.\n"
 		    "calibrating APIC timer ...\n");
 
-	local_irq_disable();
-
-	/* Replace the global interrupt handler */
-	real_handler = global_clock_event->event_handler;
-	global_clock_event->event_handler = lapic_cal_handler;
+	/*
+	 * There are platforms w/o global clockevent devices. Instead of
+	 * making the calibration conditional on that, use a polling based
+	 * approach everywhere.
+	 */
 
+	local_irq_disable();
 	/*
 	 * Setup the APIC counter to maximum. There is no way the lapic
 	 * can underflow in the 100ms detection time frame
 	 */
 	__setup_APIC_LVTT(0xffffffff, 0, 0);
 
-	/* Let the interrupts run */
-	local_irq_enable();
+	/*
+	 * Methods to terminate the calibration loop:
+	 *  1) Global clockevent if available (jiffies)
+	 *  2) TSC if available and frequency is known
+	 */
+	jif_start = READ_ONCE(jiffies);
+
+	if (tsc_khz) {
+		tsc_start = rdtsc();
+		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
+	}
+
+	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
+		/*
+		 * Enable interrupts so the tick can fire, if a global
+		 * clockevent device is available
+		 */
+		local_irq_enable();
 
-	while (lapic_cal_loops <= LAPIC_CAL_LOOPS)
-		cpu_relax();
+		/* Wait for a tick to elapse */
+		while (1) {
+			if (tsc_khz) {
+				u64 tsc_now = rdtsc();
+				if ((tsc_now - tsc_start) >= tsc_perj) {
+					tsc_start += tsc_perj;
+					break;
+				}
+			} else {
+				unsigned long jif_now = READ_ONCE(jiffies);
+
+				if (time_after(jif_now, jif_start)) {
+					jif_start = jif_now;
+					break;
+				}
+			}
+			cpu_relax();
+		}
+
+		/* Invoke the calibration routine */
+		local_irq_disable();
+		lapic_cal_handler(NULL);
+		local_irq_enable();
+	}
 
 	local_irq_disable();
 
-	/* Restore the real event handler */
-	global_clock_event->event_handler = real_handler;
-
 	/* Build delta t1-t2 as apic timer counts down */
 	delta = lapic_cal_t1 - lapic_cal_t2;
 	apic_printk(APIC_VERBOSE, "... lapic delta = %ld\n", delta);
@@ -943,10 +980,11 @@ static int __init calibrate_APIC_clock(v
 	levt->features &= ~CLOCK_EVT_FEAT_DUMMY;
 
 	/*
-	 * PM timer calibration failed or not turned on
-	 * so lets try APIC timer based calibration
+	 * PM timer calibration failed or not turned on so lets try APIC
+	 * timer based calibration, if a global clockevent device is
+	 * available.
 	 */
-	if (!pm_referenced) {
+	if (!pm_referenced && global_clock_event) {
 		apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
 
 		/*
--- a/include/linux/acpi_pmtmr.h
+++ b/include/linux/acpi_pmtmr.h
@@ -18,6 +18,11 @@
 extern u32 acpi_pm_read_verified(void);
 extern u32 pmtmr_ioport;
 
+static inline bool acpi_pm_timer_available(void)
+{
+	return pmtmr_ioport != 0;
+}
+
 static inline u32 acpi_pm_read_early(void)
 {
 	if (!pmtmr_ioport)
@@ -28,6 +33,11 @@ static inline u32 acpi_pm_read_early(voi
 
 #else
 
+static inline bool acpi_pm_timer_available(void)
+{
+	return false;
+}
+
 static inline u32 acpi_pm_read_early(void)
 {
 	return 0;

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
@ 2019-08-09 12:59                         ` Jiri Slaby
  2019-08-09 13:58                         ` Lendacky, Thomas
                                           ` (3 subsequent siblings)
  4 siblings, 0 replies; 27+ messages in thread
From: Jiri Slaby @ 2019-08-09 12:59 UTC (permalink / raw)
  To: Thomas Gleixner, Lendacky, Thomas
  Cc: Li, Aubrey, Daniel Drake, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team

On 09. 08. 19, 14:54, Thomas Gleixner wrote:
> Some newer machines do not advertise legacy timers. The kernel can handle
> that situation if the TSC and the CPU frequency are enumerated by CPUID or
> MSRs and the CPU supports TSC deadline timer. If the CPU does not support
> TSC deadline timer the local APIC timer frequency has to be known as well.
> 
> Some Ryzens machines do not advertize legacy timers, but there is no
> reliable way to determine the bus frequency which feeds the local APIC
> timer when the machine allows overclocking of that frequency.
> 
> As there is no legacy timer the local APIC timer calibration crashes due to
> a NULL pointer dereference when accessing the not installed global clock
> event device.
> 
> Switch the calibration loop to a non interrupt based one, which polls
> either TSC (frequency known) or jiffies. The latter requires a global
> clockevent. As the machines which do not have a global clockevent installed
> have a known TSC frequency this is a non issue. For older machines where
> TSC frequency is not known, there is no known case where the legacy timers
> do not exist as that would have been reported long ago.
> 
> Reported-by: Daniel Drake <drake@endlessm.com>
> Reported-by: Jiri Slaby <jslaby@suse.cz>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> 
> Note: Only lightly tested, but of course not on an affected machine.

Thanks, I will make them test the patch and let you know.

-- 
js
suse labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
  2019-08-09 12:59                         ` Jiri Slaby
@ 2019-08-09 13:58                         ` Lendacky, Thomas
  2019-08-09 19:40                           ` Thomas Gleixner
  2019-08-12  6:16                         ` Daniel Drake
                                           ` (2 subsequent siblings)
  4 siblings, 1 reply; 27+ messages in thread
From: Lendacky, Thomas @ 2019-08-09 13:58 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Li, Aubrey, Daniel Drake, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

On 8/9/19 7:54 AM, Thomas Gleixner wrote:
> Some newer machines do not advertise legacy timers. The kernel can handle
> that situation if the TSC and the CPU frequency are enumerated by CPUID or
> MSRs and the CPU supports TSC deadline timer. If the CPU does not support
> TSC deadline timer the local APIC timer frequency has to be known as well.
> 
> Some Ryzens machines do not advertize legacy timers, but there is no
> reliable way to determine the bus frequency which feeds the local APIC
> timer when the machine allows overclocking of that frequency.
> 
> As there is no legacy timer the local APIC timer calibration crashes due to
> a NULL pointer dereference when accessing the not installed global clock
> event device.
> 
> Switch the calibration loop to a non interrupt based one, which polls
> either TSC (frequency known) or jiffies. The latter requires a global
> clockevent. As the machines which do not have a global clockevent installed
> have a known TSC frequency this is a non issue. For older machines where
> TSC frequency is not known, there is no known case where the legacy timers
> do not exist as that would have been reported long ago.
> 
> Reported-by: Daniel Drake <drake@endlessm.com>
> Reported-by: Jiri Slaby <jslaby@suse.cz>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> 
> Note: Only lightly tested, but of course not on an affected machine.
> 
>        If that works reliably, then this needs some exhaustive testing
>        on a wide range of systems so we can risk backports to stable
>        kernels.
> 
> ---
>   arch/x86/kernel/apic/apic.c |   70 +++++++++++++++++++++++++++++++++-----------
>   include/linux/acpi_pmtmr.h  |   10 ++++++
>   2 files changed, 64 insertions(+), 16 deletions(-)
> 
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -851,7 +851,8 @@ bool __init apic_needs_pit(void)
>   static int __init calibrate_APIC_clock(void)
>   {
>   	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
> -	void (*real_handler)(struct clock_event_device *dev);
> +	u64 tsc_perj = 0, tsc_start = 0;
> +	unsigned long jif_start;
>   	unsigned long deltaj;
>   	long delta, deltatsc;
>   	int pm_referenced = 0;
> @@ -878,29 +879,65 @@ static int __init calibrate_APIC_clock(v
>   	apic_printk(APIC_VERBOSE, "Using local APIC timer interrupts.\n"
>   		    "calibrating APIC timer ...\n");
>   
> -	local_irq_disable();
> -
> -	/* Replace the global interrupt handler */
> -	real_handler = global_clock_event->event_handler;
> -	global_clock_event->event_handler = lapic_cal_handler;
> +	/*
> +	 * There are platforms w/o global clockevent devices. Instead of
> +	 * making the calibration conditional on that, use a polling based
> +	 * approach everywhere.
> +	 */
>   
> +	local_irq_disable();
>   	/*
>   	 * Setup the APIC counter to maximum. There is no way the lapic
>   	 * can underflow in the 100ms detection time frame
>   	 */
>   	__setup_APIC_LVTT(0xffffffff, 0, 0);
>   
> -	/* Let the interrupts run */
> -	local_irq_enable();
> +	/*
> +	 * Methods to terminate the calibration loop:
> +	 *  1) Global clockevent if available (jiffies)
> +	 *  2) TSC if available and frequency is known
> +	 */
> +	jif_start = READ_ONCE(jiffies);
> +
> +	if (tsc_khz) {
> +		tsc_start = rdtsc();
> +		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> +	}
> +
> +	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
> +		/*
> +		 * Enable interrupts so the tick can fire, if a global
> +		 * clockevent device is available
> +		 */
> +		local_irq_enable();

Just a nit, but you end up doing this at the bottom of the loop, so you
could move this invocation to just before the loop and avoid doing two
local_irq_enable() calls in succession after the first iteration.

Thanks,
Tom

>   
> -	while (lapic_cal_loops <= LAPIC_CAL_LOOPS)
> -		cpu_relax();
> +		/* Wait for a tick to elapse */
> +		while (1) {
> +			if (tsc_khz) {
> +				u64 tsc_now = rdtsc();
> +				if ((tsc_now - tsc_start) >= tsc_perj) {
> +					tsc_start += tsc_perj;
> +					break;
> +				}
> +			} else {
> +				unsigned long jif_now = READ_ONCE(jiffies);
> +
> +				if (time_after(jif_now, jif_start)) {
> +					jif_start = jif_now;
> +					break;
> +				}
> +			}
> +			cpu_relax();
> +		}
> +
> +		/* Invoke the calibration routine */
> +		local_irq_disable();
> +		lapic_cal_handler(NULL);
> +		local_irq_enable();
> +	}
>   
>   	local_irq_disable();
>   
> -	/* Restore the real event handler */
> -	global_clock_event->event_handler = real_handler;
> -
>   	/* Build delta t1-t2 as apic timer counts down */
>   	delta = lapic_cal_t1 - lapic_cal_t2;
>   	apic_printk(APIC_VERBOSE, "... lapic delta = %ld\n", delta);
> @@ -943,10 +980,11 @@ static int __init calibrate_APIC_clock(v
>   	levt->features &= ~CLOCK_EVT_FEAT_DUMMY;
>   
>   	/*
> -	 * PM timer calibration failed or not turned on
> -	 * so lets try APIC timer based calibration
> +	 * PM timer calibration failed or not turned on so lets try APIC
> +	 * timer based calibration, if a global clockevent device is
> +	 * available.
>   	 */
> -	if (!pm_referenced) {
> +	if (!pm_referenced && global_clock_event) {
>   		apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
>   
>   		/*
> --- a/include/linux/acpi_pmtmr.h
> +++ b/include/linux/acpi_pmtmr.h
> @@ -18,6 +18,11 @@
>   extern u32 acpi_pm_read_verified(void);
>   extern u32 pmtmr_ioport;
>   
> +static inline bool acpi_pm_timer_available(void)
> +{
> +	return pmtmr_ioport != 0;
> +}
> +
>   static inline u32 acpi_pm_read_early(void)
>   {
>   	if (!pmtmr_ioport)
> @@ -28,6 +33,11 @@ static inline u32 acpi_pm_read_early(voi
>   
>   #else
>   
> +static inline bool acpi_pm_timer_available(void)
> +{
> +	return false;
> +}
> +
>   static inline u32 acpi_pm_read_early(void)
>   {
>   	return 0;
> 

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 13:58                         ` Lendacky, Thomas
@ 2019-08-09 19:40                           ` Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-09 19:40 UTC (permalink / raw)
  To: Lendacky, Thomas
  Cc: Li, Aubrey, Daniel Drake, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

On Fri, 9 Aug 2019, Lendacky, Thomas wrote:
> On 8/9/19 7:54 AM, Thomas Gleixner wrote:
> > +	local_irq_disable();
> >   	/*
> >   	 * Setup the APIC counter to maximum. There is no way the lapic
> >   	 * can underflow in the 100ms detection time frame
> >   	 */
> >   	__setup_APIC_LVTT(0xffffffff, 0, 0);
> >   
> > -	/* Let the interrupts run */
> > -	local_irq_enable();
> > +	/*
> > +	 * Methods to terminate the calibration loop:
> > +	 *  1) Global clockevent if available (jiffies)
> > +	 *  2) TSC if available and frequency is known
> > +	 */
> > +	jif_start = READ_ONCE(jiffies);
> > +
> > +	if (tsc_khz) {
> > +		tsc_start = rdtsc();
> > +		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> > +	}
> > +
> > +	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
> > +		/*
> > +		 * Enable interrupts so the tick can fire, if a global
> > +		 * clockevent device is available
> > +		 */
> > +		local_irq_enable();
> 
> Just a nit, but you end up doing this at the bottom of the loop, so you
> could move this invocation to just before the loop and avoid doing two
> local_irq_enable() calls in succession after the first iteration.

Indeed. Lets see how the reports go. That change is a nobrainer.

Thanks,

	tglx

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
  2019-08-09 12:59                         ` Jiri Slaby
  2019-08-09 13:58                         ` Lendacky, Thomas
@ 2019-08-12  6:16                         ` Daniel Drake
  2019-08-12  7:03                           ` Jiri Slaby
  2019-08-15  5:02                           ` Daniel Drake
  2019-08-12  7:46                         ` Li, Aubrey
  2019-08-19 10:37                         ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
  4 siblings, 2 replies; 27+ messages in thread
From: Daniel Drake @ 2019-08-12  6:16 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Lendacky, Thomas, Li, Aubrey, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

On Fri, Aug 9, 2019 at 8:54 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> Some newer machines do not advertise legacy timers. The kernel can handle
> that situation if the TSC and the CPU frequency are enumerated by CPUID or
> MSRs and the CPU supports TSC deadline timer. If the CPU does not support
> TSC deadline timer the local APIC timer frequency has to be known as well.
>
> Some Ryzens machines do not advertize legacy timers, but there is no
> reliable way to determine the bus frequency which feeds the local APIC
> timer when the machine allows overclocking of that frequency.
>
> As there is no legacy timer the local APIC timer calibration crashes due to
> a NULL pointer dereference when accessing the not installed global clock
> event device.
>
> Switch the calibration loop to a non interrupt based one, which polls
> either TSC (frequency known) or jiffies. The latter requires a global
> clockevent. As the machines which do not have a global clockevent installed
> have a known TSC frequency this is a non issue. For older machines where
> TSC frequency is not known, there is no known case where the legacy timers
> do not exist as that would have been reported long ago.

This solves the problem I described in the thread:
    setup_boot_APIC_clock() NULL dereference during early boot on
reduced hardware platforms

Thanks for your quick support!

> Note: Only lightly tested, but of course not on an affected machine.
>
>       If that works reliably, then this needs some exhaustive testing
>       on a wide range of systems so we can risk backports to stable
>       kernels.

I can do a bit of testing on other platforms too. Are there any
specific tests I should run, other than checking that the system boots
and doesn't have any timer watchdog complaints in the log?

Thanks
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-12  6:16                         ` Daniel Drake
@ 2019-08-12  7:03                           ` Jiri Slaby
  2019-08-15  5:02                           ` Daniel Drake
  1 sibling, 0 replies; 27+ messages in thread
From: Jiri Slaby @ 2019-08-12  7:03 UTC (permalink / raw)
  To: Daniel Drake, Thomas Gleixner
  Cc: Lendacky, Thomas, Li, Aubrey, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team

On 12. 08. 19, 8:16, Daniel Drake wrote:
> On Fri, Aug 9, 2019 at 8:54 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>> Some newer machines do not advertise legacy timers. The kernel can handle
>> that situation if the TSC and the CPU frequency are enumerated by CPUID or
>> MSRs and the CPU supports TSC deadline timer. If the CPU does not support
>> TSC deadline timer the local APIC timer frequency has to be known as well.
>>
>> Some Ryzens machines do not advertize legacy timers, but there is no
>> reliable way to determine the bus frequency which feeds the local APIC
>> timer when the machine allows overclocking of that frequency.
>>
>> As there is no legacy timer the local APIC timer calibration crashes due to
>> a NULL pointer dereference when accessing the not installed global clock
>> event device.
>>
>> Switch the calibration loop to a non interrupt based one, which polls
>> either TSC (frequency known) or jiffies. The latter requires a global
>> clockevent. As the machines which do not have a global clockevent installed
>> have a known TSC frequency this is a non issue. For older machines where
>> TSC frequency is not known, there is no known case where the legacy timers
>> do not exist as that would have been reported long ago.
> 
> This solves the problem I described in the thread:
>     setup_boot_APIC_clock() NULL dereference during early boot on
> reduced hardware platforms

So it does for the openSUSE user:
http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12

=========
After installing that build of the kernel from your OBS home project,
that did
more than just fix the issue with the APIC timer screwing up. I now have
all 4
cores/8 threads available.

I do see some errors from the ACPI layer that do indicate that there are
some
areas of the BIOS from HP that are buggy, but at this time, the machine
seems
to be working without issue.
=========

dmesg here:
http://bugzilla.opensuse.org/attachment.cgi?id=813577

thanks,
-- 
js
suse labs

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
                                           ` (2 preceding siblings ...)
  2019-08-12  6:16                         ` Daniel Drake
@ 2019-08-12  7:46                         ` Li, Aubrey
  2019-08-12 12:24                           ` Thomas Gleixner
  2019-08-19 10:37                         ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
  4 siblings, 1 reply; 27+ messages in thread
From: Li, Aubrey @ 2019-08-12  7:46 UTC (permalink / raw)
  To: Thomas Gleixner, Lendacky, Thomas
  Cc: Daniel Drake, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

On 2019/8/9 20:54, Thomas Gleixner wrote:
> Some newer machines do not advertise legacy timers. The kernel can handle
> that situation if the TSC and the CPU frequency are enumerated by CPUID or
> MSRs and the CPU supports TSC deadline timer. If the CPU does not support
> TSC deadline timer the local APIC timer frequency has to be known as well.
> 
> Some Ryzens machines do not advertize legacy timers, but there is no
> reliable way to determine the bus frequency which feeds the local APIC
> timer when the machine allows overclocking of that frequency.

Are these platforms are all ACPI HW-reduced platform?

> 
> As there is no legacy timer the local APIC timer calibration crashes due to
> a NULL pointer dereference when accessing the not installed global clock
> event device.
> 
> Switch the calibration loop to a non interrupt based one, which polls
> either TSC (frequency known) or jiffies. The latter requires a global
> clockevent. As the machines which do not have a global clockevent installed
> have a known TSC frequency this is a non issue. For older machines where
> TSC frequency is not known, there is no known case where the legacy timers
> do not exist as that would have been reported long ago.
> 
> Reported-by: Daniel Drake <drake@endlessm.com>
> Reported-by: Jiri Slaby <jslaby@suse.cz>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> ---
> 
> Note: Only lightly tested, but of course not on an affected machine.
> 
>       If that works reliably, then this needs some exhaustive testing
>       on a wide range of systems so we can risk backports to stable
>       kernels.
> 
> ---
>  arch/x86/kernel/apic/apic.c |   70 +++++++++++++++++++++++++++++++++-----------
>  include/linux/acpi_pmtmr.h  |   10 ++++++
>  2 files changed, 64 insertions(+), 16 deletions(-)
> 
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -851,7 +851,8 @@ bool __init apic_needs_pit(void)
>  static int __init calibrate_APIC_clock(void)
>  {
>  	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
> -	void (*real_handler)(struct clock_event_device *dev);
> +	u64 tsc_perj = 0, tsc_start = 0;
> +	unsigned long jif_start;
>  	unsigned long deltaj;
>  	long delta, deltatsc;
>  	int pm_referenced = 0;
> @@ -878,29 +879,65 @@ static int __init calibrate_APIC_clock(v
>  	apic_printk(APIC_VERBOSE, "Using local APIC timer interrupts.\n"
>  		    "calibrating APIC timer ...\n");
>  
> -	local_irq_disable();
> -
> -	/* Replace the global interrupt handler */
> -	real_handler = global_clock_event->event_handler;
> -	global_clock_event->event_handler = lapic_cal_handler;
> +	/*
> +	 * There are platforms w/o global clockevent devices. Instead of
> +	 * making the calibration conditional on that, use a polling based
> +	 * approach everywhere.
> +	 */
>  
> +	local_irq_disable();
>  	/*
>  	 * Setup the APIC counter to maximum. There is no way the lapic
>  	 * can underflow in the 100ms detection time frame
>  	 */
>  	__setup_APIC_LVTT(0xffffffff, 0, 0);
>  
> -	/* Let the interrupts run */
> -	local_irq_enable();
> +	/*
> +	 * Methods to terminate the calibration loop:
> +	 *  1) Global clockevent if available (jiffies)
> +	 *  2) TSC if available and frequency is known
> +	 */
> +	jif_start = READ_ONCE(jiffies);
> +
> +	if (tsc_khz) {
> +		tsc_start = rdtsc();
> +		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> +	}
> +
> +	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {

Is this loop still meaningful, can we just invoke the handler twice
before and after the tick?

Thanks,
-Aubrey

> +		/*
> +		 * Enable interrupts so the tick can fire, if a global
> +		 * clockevent device is available
> +		 */
> +		local_irq_enable();
>  
> -	while (lapic_cal_loops <= LAPIC_CAL_LOOPS)
> -		cpu_relax();
> +		/* Wait for a tick to elapse */
> +		while (1) {
> +			if (tsc_khz) {
> +				u64 tsc_now = rdtsc();
> +				if ((tsc_now - tsc_start) >= tsc_perj) {
> +					tsc_start += tsc_perj;
> +					break;
> +				}
> +			} else {
> +				unsigned long jif_now = READ_ONCE(jiffies);
> +
> +				if (time_after(jif_now, jif_start)) {
> +					jif_start = jif_now;
> +					break;
> +				}
> +			}
> +			cpu_relax();
> +		}
> +
> +		/* Invoke the calibration routine */
> +		local_irq_disable();
> +		lapic_cal_handler(NULL);
> +		local_irq_enable();
> +	}
>  
>  	local_irq_disable();
>  
> -	/* Restore the real event handler */
> -	global_clock_event->event_handler = real_handler;
> -
>  	/* Build delta t1-t2 as apic timer counts down */
>  	delta = lapic_cal_t1 - lapic_cal_t2;
>  	apic_printk(APIC_VERBOSE, "... lapic delta = %ld\n", delta);
> @@ -943,10 +980,11 @@ static int __init calibrate_APIC_clock(v
>  	levt->features &= ~CLOCK_EVT_FEAT_DUMMY;
>  
>  	/*
> -	 * PM timer calibration failed or not turned on
> -	 * so lets try APIC timer based calibration
> +	 * PM timer calibration failed or not turned on so lets try APIC
> +	 * timer based calibration, if a global clockevent device is
> +	 * available.
>  	 */
> -	if (!pm_referenced) {
> +	if (!pm_referenced && global_clock_event) {
>  		apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
>  
>  		/*
> --- a/include/linux/acpi_pmtmr.h
> +++ b/include/linux/acpi_pmtmr.h
> @@ -18,6 +18,11 @@
>  extern u32 acpi_pm_read_verified(void);
>  extern u32 pmtmr_ioport;
>  
> +static inline bool acpi_pm_timer_available(void)
> +{
> +	return pmtmr_ioport != 0;
> +}
> +
>  static inline u32 acpi_pm_read_early(void)
>  {
>  	if (!pmtmr_ioport)
> @@ -28,6 +33,11 @@ static inline u32 acpi_pm_read_early(voi
>  
>  #else
>  
> +static inline bool acpi_pm_timer_available(void)
> +{
> +	return false;
> +}
> +
>  static inline u32 acpi_pm_read_early(void)
>  {
>  	return 0;
> 


^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-12  7:46                         ` Li, Aubrey
@ 2019-08-12 12:24                           ` Thomas Gleixner
  2019-08-12 12:59                             ` Aubrey Li
  0 siblings, 1 reply; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-12 12:24 UTC (permalink / raw)
  To: Li, Aubrey
  Cc: Lendacky, Thomas, Daniel Drake, x86, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team, Jiri Slaby

On Mon, 12 Aug 2019, Li, Aubrey wrote:
> On 2019/8/9 20:54, Thomas Gleixner wrote:
> > +	local_irq_disable();
> >  	/*
> >  	 * Setup the APIC counter to maximum. There is no way the lapic
> >  	 * can underflow in the 100ms detection time frame
> >  	 */
> >  	__setup_APIC_LVTT(0xffffffff, 0, 0);
> >  
> > -	/* Let the interrupts run */
> > -	local_irq_enable();
> > +	/*
> > +	 * Methods to terminate the calibration loop:
> > +	 *  1) Global clockevent if available (jiffies)
> > +	 *  2) TSC if available and frequency is known
> > +	 */
> > +	jif_start = READ_ONCE(jiffies);
> > +
> > +	if (tsc_khz) {
> > +		tsc_start = rdtsc();
> > +		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> > +	}
> > +
> > +	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
> 
> Is this loop still meaningful, can we just invoke the handler twice
> before and after the tick?

And that solves what?

> Thanks,
> -Aubrey

<Remove tons of useless quote>

Can you please trim your replies?

Thanks,

	tglx
	

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-12 12:24                           ` Thomas Gleixner
@ 2019-08-12 12:59                             ` Aubrey Li
  2019-08-12 17:11                               ` Thomas Gleixner
  0 siblings, 1 reply; 27+ messages in thread
From: Aubrey Li @ 2019-08-12 12:59 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Li, Aubrey, Lendacky, Thomas, Daniel Drake, x86, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team, Jiri Slaby

On Mon, Aug 12, 2019 at 8:25 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, 12 Aug 2019, Li, Aubrey wrote:
> > On 2019/8/9 20:54, Thomas Gleixner wrote:
> > > +   local_irq_disable();
> > >     /*
> > >      * Setup the APIC counter to maximum. There is no way the lapic
> > >      * can underflow in the 100ms detection time frame
> > >      */
> > >     __setup_APIC_LVTT(0xffffffff, 0, 0);
> > >
> > > -   /* Let the interrupts run */
> > > -   local_irq_enable();
> > > +   /*
> > > +    * Methods to terminate the calibration loop:
> > > +    *  1) Global clockevent if available (jiffies)
> > > +    *  2) TSC if available and frequency is known
> > > +    */
> > > +   jif_start = READ_ONCE(jiffies);
> > > +
> > > +   if (tsc_khz) {
> > > +           tsc_start = rdtsc();
> > > +           tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> > > +   }
> > > +
> > > +   while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
> >
> > Is this loop still meaningful, can we just invoke the handler twice
> > before and after the tick?
>
> And that solves what?
>

I meant, can we do this one time?
- lapic_cal_t1 = read APIC counter
- /* Wait for a tick to elapse */
- lapic_cal_t2 = read APIC counter

I'm not clear why we still need this loop, to use the
existing lapic_cal_handler()?

Thanks,
-Aubrey

^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-12 12:59                             ` Aubrey Li
@ 2019-08-12 17:11                               ` Thomas Gleixner
  0 siblings, 0 replies; 27+ messages in thread
From: Thomas Gleixner @ 2019-08-12 17:11 UTC (permalink / raw)
  To: Aubrey Li
  Cc: Li, Aubrey, Lendacky, Thomas, Daniel Drake, x86, H . Peter Anvin,
	Linux Kernel, Endless Linux Upstreaming Team, Jiri Slaby

On Mon, 12 Aug 2019, Aubrey Li wrote:

> On Mon, Aug 12, 2019 at 8:25 PM Thomas Gleixner <tglx@linutronix.de> wrote:
> >
> > On Mon, 12 Aug 2019, Li, Aubrey wrote:
> > > On 2019/8/9 20:54, Thomas Gleixner wrote:
> > > > +   local_irq_disable();
> > > >     /*
> > > >      * Setup the APIC counter to maximum. There is no way the lapic
> > > >      * can underflow in the 100ms detection time frame
> > > >      */
> > > >     __setup_APIC_LVTT(0xffffffff, 0, 0);
> > > >
> > > > -   /* Let the interrupts run */
> > > > -   local_irq_enable();
> > > > +   /*
> > > > +    * Methods to terminate the calibration loop:
> > > > +    *  1) Global clockevent if available (jiffies)
> > > > +    *  2) TSC if available and frequency is known
> > > > +    */
> > > > +   jif_start = READ_ONCE(jiffies);
> > > > +
> > > > +   if (tsc_khz) {
> > > > +           tsc_start = rdtsc();
> > > > +           tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
> > > > +   }
> > > > +
> > > > +   while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
> > >
> > > Is this loop still meaningful, can we just invoke the handler twice
> > > before and after the tick?
> >
> > And that solves what?
> >
> I meant, can we do this one time?
> - lapic_cal_t1 = read APIC counter
> - /* Wait for a tick to elapse */
> - lapic_cal_t2 = read APIC counter

Sure, but how does this work with randomly broken hardware, e.g. PIT
running at the wrong frequency/

The calibration code is trying to verify against as many and as reliable
references and it served us well so far.
 
> I'm not clear why we still need this loop, to use the
> existing lapic_cal_handler()?

A single tick is way too small to get a proper calibration. Sure, this can
be optimized by avoiding the loop and have a longer delay, but you
definitely want to use the rest of the calibration code as is.

Aside of that this was the minial fix I came up with which might be
suitable for backporting. These platforms seem to come out of the woods
right now, so we definitely want support for them in LTS kernels as well.

Thanks,

	tglx



^ permalink raw reply	[flat|nested] 27+ messages in thread

* Re: [PATCH] x86/apic: Handle missing global clockevent gracefully
  2019-08-12  6:16                         ` Daniel Drake
  2019-08-12  7:03                           ` Jiri Slaby
@ 2019-08-15  5:02                           ` Daniel Drake
  1 sibling, 0 replies; 27+ messages in thread
From: Daniel Drake @ 2019-08-15  5:02 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Lendacky, Thomas, Li, Aubrey, x86, H . Peter Anvin, Linux Kernel,
	Endless Linux Upstreaming Team, Jiri Slaby

On Mon, Aug 12, 2019 at 2:16 PM Daniel Drake <drake@endlessm.com> wrote:
> I can do a bit of testing on other platforms too. Are there any
> specific tests I should run, other than checking that the system boots
> and doesn't have any timer watchdog complaints in the log?

Tested this on 2 AMD platforms that were not affected by the crash here.
In addition to confirming that they boot fine without timer complaints
in the logs, I checked the calibrate_APIC_clock() result before and
after this patch. I repeated each test twice.

Asus E402YA (AMD E2-7015)
Before: 99811, 99811
After: 99812, 99812

Acer Aspire A315-21G (AMD A9-9420e)
Before: 99811, 99811
After: 99807, 99820

Those new numbers seem very close to the previous ones and I didn't
observe any problems.

Thanks
Daniel

^ permalink raw reply	[flat|nested] 27+ messages in thread

* [tip:x86/urgent] x86/apic: Handle missing global clockevent gracefully
  2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
                                           ` (3 preceding siblings ...)
  2019-08-12  7:46                         ` Li, Aubrey
@ 2019-08-19 10:37                         ` tip-bot for Thomas Gleixner
  4 siblings, 0 replies; 27+ messages in thread
From: tip-bot for Thomas Gleixner @ 2019-08-19 10:37 UTC (permalink / raw)
  To: linux-tip-commits; +Cc: drake, mingo, hpa, linux-kernel, tglx, jslaby

Commit-ID:  f897e60a12f0b9146357780d317879bce2a877dc
Gitweb:     https://git.kernel.org/tip/f897e60a12f0b9146357780d317879bce2a877dc
Author:     Thomas Gleixner <tglx@linutronix.de>
AuthorDate: Fri, 9 Aug 2019 14:54:07 +0200
Committer:  Thomas Gleixner <tglx@linutronix.de>
CommitDate: Mon, 19 Aug 2019 12:34:07 +0200

x86/apic: Handle missing global clockevent gracefully

Some newer machines do not advertise legacy timers. The kernel can handle
that situation if the TSC and the CPU frequency are enumerated by CPUID or
MSRs and the CPU supports TSC deadline timer. If the CPU does not support
TSC deadline timer the local APIC timer frequency has to be known as well.

Some Ryzens machines do not advertize legacy timers, but there is no
reliable way to determine the bus frequency which feeds the local APIC
timer when the machine allows overclocking of that frequency.

As there is no legacy timer the local APIC timer calibration crashes due to
a NULL pointer dereference when accessing the not installed global clock
event device.

Switch the calibration loop to a non interrupt based one, which polls
either TSC (if frequency is known) or jiffies. The latter requires a global
clockevent. As the machines which do not have a global clockevent installed
have a known TSC frequency this is a non issue. For older machines where
TSC frequency is not known, there is no known case where the legacy timers
do not exist as that would have been reported long ago.

Reported-by: Daniel Drake <drake@endlessm.com>
Reported-by: Jiri Slaby <jslaby@suse.cz>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Daniel Drake <drake@endlessm.com>
Cc: stable@vger.kernel.org
Link: https://lkml.kernel.org/r/alpine.DEB.2.21.1908091443030.21433@nanos.tec.linutronix.de
Link: http://bugzilla.opensuse.org/show_bug.cgi?id=1142926#c12
---
 arch/x86/kernel/apic/apic.c | 68 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 53 insertions(+), 15 deletions(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index f5291362da1a..aa5495d0f478 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -722,7 +722,7 @@ static __initdata unsigned long lapic_cal_pm1, lapic_cal_pm2;
 static __initdata unsigned long lapic_cal_j1, lapic_cal_j2;
 
 /*
- * Temporary interrupt handler.
+ * Temporary interrupt handler and polled calibration function.
  */
 static void __init lapic_cal_handler(struct clock_event_device *dev)
 {
@@ -851,7 +851,8 @@ bool __init apic_needs_pit(void)
 static int __init calibrate_APIC_clock(void)
 {
 	struct clock_event_device *levt = this_cpu_ptr(&lapic_events);
-	void (*real_handler)(struct clock_event_device *dev);
+	u64 tsc_perj = 0, tsc_start = 0;
+	unsigned long jif_start;
 	unsigned long deltaj;
 	long delta, deltatsc;
 	int pm_referenced = 0;
@@ -878,28 +879,64 @@ static int __init calibrate_APIC_clock(void)
 	apic_printk(APIC_VERBOSE, "Using local APIC timer interrupts.\n"
 		    "calibrating APIC timer ...\n");
 
+	/*
+	 * There are platforms w/o global clockevent devices. Instead of
+	 * making the calibration conditional on that, use a polling based
+	 * approach everywhere.
+	 */
 	local_irq_disable();
 
-	/* Replace the global interrupt handler */
-	real_handler = global_clock_event->event_handler;
-	global_clock_event->event_handler = lapic_cal_handler;
-
 	/*
 	 * Setup the APIC counter to maximum. There is no way the lapic
 	 * can underflow in the 100ms detection time frame
 	 */
 	__setup_APIC_LVTT(0xffffffff, 0, 0);
 
-	/* Let the interrupts run */
+	/*
+	 * Methods to terminate the calibration loop:
+	 *  1) Global clockevent if available (jiffies)
+	 *  2) TSC if available and frequency is known
+	 */
+	jif_start = READ_ONCE(jiffies);
+
+	if (tsc_khz) {
+		tsc_start = rdtsc();
+		tsc_perj = div_u64((u64)tsc_khz * 1000, HZ);
+	}
+
+	/*
+	 * Enable interrupts so the tick can fire, if a global
+	 * clockevent device is available
+	 */
 	local_irq_enable();
 
-	while (lapic_cal_loops <= LAPIC_CAL_LOOPS)
-		cpu_relax();
+	while (lapic_cal_loops <= LAPIC_CAL_LOOPS) {
+		/* Wait for a tick to elapse */
+		while (1) {
+			if (tsc_khz) {
+				u64 tsc_now = rdtsc();
+				if ((tsc_now - tsc_start) >= tsc_perj) {
+					tsc_start += tsc_perj;
+					break;
+				}
+			} else {
+				unsigned long jif_now = READ_ONCE(jiffies);
 
-	local_irq_disable();
+				if (time_after(jif_now, jif_start)) {
+					jif_start = jif_now;
+					break;
+				}
+			}
+			cpu_relax();
+		}
 
-	/* Restore the real event handler */
-	global_clock_event->event_handler = real_handler;
+		/* Invoke the calibration routine */
+		local_irq_disable();
+		lapic_cal_handler(NULL);
+		local_irq_enable();
+	}
+
+	local_irq_disable();
 
 	/* Build delta t1-t2 as apic timer counts down */
 	delta = lapic_cal_t1 - lapic_cal_t2;
@@ -943,10 +980,11 @@ static int __init calibrate_APIC_clock(void)
 	levt->features &= ~CLOCK_EVT_FEAT_DUMMY;
 
 	/*
-	 * PM timer calibration failed or not turned on
-	 * so lets try APIC timer based calibration
+	 * PM timer calibration failed or not turned on so lets try APIC
+	 * timer based calibration, if a global clockevent device is
+	 * available.
 	 */
-	if (!pm_referenced) {
+	if (!pm_referenced && global_clock_event) {
 		apic_printk(APIC_VERBOSE, "... verify APIC timer\n");
 
 		/*

^ permalink raw reply	[flat|nested] 27+ messages in thread

end of thread, other threads:[~2019-08-19 10:37 UTC | newest]

Thread overview: 27+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-01  6:24 setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
2019-08-01  7:16 ` Aubrey Li
2019-08-01  7:35   ` Thomas Gleixner
2019-08-01  7:56     ` Aubrey Li
2019-08-01  8:13       ` Thomas Gleixner
2019-08-01  8:21         ` Li, Aubrey
2019-08-01 10:13           ` Thomas Gleixner
2019-08-01 15:44             ` Lendacky, Thomas
2019-08-08 20:36               ` Thomas Gleixner
2019-08-08 21:01                 ` Lendacky, Thomas
2019-08-08 21:08                   ` Thomas Gleixner
2019-08-08 21:36                     ` Lendacky, Thomas
2019-08-09 12:54                       ` [PATCH] x86/apic: Handle missing global clockevent gracefully Thomas Gleixner
2019-08-09 12:59                         ` Jiri Slaby
2019-08-09 13:58                         ` Lendacky, Thomas
2019-08-09 19:40                           ` Thomas Gleixner
2019-08-12  6:16                         ` Daniel Drake
2019-08-12  7:03                           ` Jiri Slaby
2019-08-15  5:02                           ` Daniel Drake
2019-08-12  7:46                         ` Li, Aubrey
2019-08-12 12:24                           ` Thomas Gleixner
2019-08-12 12:59                             ` Aubrey Li
2019-08-12 17:11                               ` Thomas Gleixner
2019-08-19 10:37                         ` [tip:x86/urgent] " tip-bot for Thomas Gleixner
2019-08-01  9:00   ` setup_boot_APIC_clock() NULL dereference during early boot on reduced hardware platforms Daniel Drake
2019-08-01 10:17     ` Thomas Gleixner
2019-08-01  7:27 ` Thomas Gleixner

This is a public inbox, see mirroring instructions
on how to clone and mirror all data and code used for this inbox