All of lore.kernel.org
 help / color / mirror / Atom feed
* PIC probing code from e179f6914152 failing
@ 2023-10-18 18:50 Mario Limonciello
  2023-10-18 22:50 ` Thomas Gleixner
  0 siblings, 1 reply; 23+ messages in thread
From: Mario Limonciello @ 2023-10-18 18:50 UTC (permalink / raw)
  To: Hans de Goede, kys, tglx, hpa; +Cc: x86, LKML, Petkov, Borislav

Hi,

Recently an issue was reported to Bugzilla [1] that the Keyboard (IRQ 1) 
and GPIO controller (IRQ 7) weren't working properly on two separate 
Lenovo machines.  The issues are unique to Linux.

In digging through them, they happen because Lenovo didn't set up the 
PIC in the BIOS.
Specifically the PIC probing code introduced by e179f6914152 ("x86, irq, 
pic: Probe for legacy PIC and set legacy_pic appropriately") expects 
that the BIOS sets up the PIC and uses that assertion to let Linux set 
it up.

One of the reporters confirmed that reverting e179f6914152 fixes the 
issue.  Keyboard and GPIO controller both work properly.

I wanted to ask if we can please revert that and come up with a 
different solution for kexec with HyperV.
Can guests instead perhaps detect in early boot code they're running in 
an EFI based hypervisor and explicitly set 'legacy_pic = &null_legacy_pic;'?

[1] https://bugzilla.kernel.org/show_bug.cgi?id=218003

Thanks,

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-18 18:50 PIC probing code from e179f6914152 failing Mario Limonciello
@ 2023-10-18 22:50 ` Thomas Gleixner
  2023-10-19 16:39   ` David Lazăr
  2023-10-19 21:20   ` Mario Limonciello
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-18 22:50 UTC (permalink / raw)
  To: Mario Limonciello, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On Wed, Oct 18 2023 at 13:50, Mario Limonciello wrote:
> Recently an issue was reported to Bugzilla [1] that the Keyboard (IRQ 1) 
> and GPIO controller (IRQ 7) weren't working properly on two separate 
> Lenovo machines.  The issues are unique to Linux.
>
> In digging through them, they happen because Lenovo didn't set up the 
> PIC in the BIOS.
> Specifically the PIC probing code introduced by e179f6914152 ("x86, irq, 
> pic: Probe for legacy PIC and set legacy_pic appropriately") expects 
> that the BIOS sets up the PIC and uses that assertion to let Linux set 
> it up.
>
> One of the reporters confirmed that reverting e179f6914152 fixes the 
> issue.  Keyboard and GPIO controller both work properly.
>
> I wanted to ask if we can please revert that and come up with a 
> different solution for kexec with HyperV.
> Can guests instead perhaps detect in early boot code they're running in 
> an EFI based hypervisor and explicitly set 'legacy_pic = &null_legacy_pic;'?

No. This detection mechanism prevents PIC usage also in other
scenarios.

It's perfectly valid code and the assumption that you can read back what
you wrote to the master IMR is entirely correct. If that's not the case
then there is no PIC, the BIOS has disabled some parts of the legacy
block or did not initialize it.

Letting the kernel blindly assume that there is always a PIC is just
wrong. Worse, it puts the burden on everyone else to sprinkle
"legacy_pic = null_pic;" all over the place with dubious
conditions. That's exactly what the commit in question avoided.

So no, we are not going back there.

We could obviously change the probe() function to issue a full PIC
initialization sequence before reading a known written value
back. That's basically what the revert causes to happen via
legacy_pic->init().

But I fundamentally hate to do that because forcing the init sequence
just to make presumably broken code which has some dubious dependencies
on the PIC or on nr_legacy_irqs > 0 happy is not really a solution when
the PIC is actually not needed at all. For anything modern where all
legacy interrupts are routed to the IO/APIC the PIC does not make any
sense whatsoever.

We rather go and fix the real underlying problem.

The kernel can handle the legacy interrupts perfectly fine through the
IOAPIC. There is no hard requirement for the PIC at all except for
really old systems which had the timer interrupt wired to the PIC and
therefore required to route the timer interrupt through the PIC instead
of the IO/APIC or did not provide routing entries for the IO/APIC. See
the horrible hackery in check_timer() and the grossly misnomed
init_IO_APIC_traps().

I just took a random machine, forced the NULL PIC and added
'apic=verbose' to the kernel command line and magically all the legacy
interrupts are set up via IO/APIC despite the NULL PIC and therefore 0
preallocated legacy interrupts.

  apic 0 pin 0 not connected
 IOAPIC[0]: Preconfigured routing entry (0-1 -> IRQ 1 Level:0 ActiveLow:0)
 IOAPIC[0]: Preconfigured routing entry (0-2 -> IRQ 2 Level:0 ActiveLow:0)
 IOAPIC[0]: Preconfigured routing entry (0-3 -> IRQ 3 Level:0 ActiveLow:0)
 ...

which is identical to the output with PIC enabled. That debug message is
emitted from mp_irqdomain_alloc() which is invoked via the PNP resource
management code.

Now /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3        
  1:          0         56          0          0    IO-APIC   1-edge      i8042
  4:        442          0          0          0    IO-APIC   4-edge      ttyS0
  8:          0          0          0          0    IO-APIC   8-edge      rtc0
  9:          0          0          0          0    IO-APIC   9-fasteoi   acpi
 12:          0          0        122          0    IO-APIC  12-edge      i8042

Keyboard and serial are working, see?

The only interrupt which does not work is interrupt 0 because nothing
allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
solvable problem.

That machine does not even need the timer interrupt because it has a
usable APIC and TSC deadline timer, so no APIC timer calibration
required. The same is true for CPUs which do not have a TSC deadline
timer, but enumerate the APIC frequency via CPUID or MSRs.

But that brings up an interesting question. How are those affected
machines even reaching a state where the user notices that just the
keyboard and the GPIO are not working? Why?

The CPUID/MSR APIC frequency enumeration is Intel specific and
everything else depends on a working timer interrupt to calibrate the
APIC timer frequency. So AMD CPUs require the timer interrupt to
work. The only explanation why this "works" in that null PIC case is
that the PIT/HPET interrupt is actually wired to pin 0, but that's
something to be determined...

Can I please get the following information from an affected machine:

  1) dmesg with 'apic=verbose' on the command line
  2) /proc/interrupts
  3) /sys/kernel/debug/irq/irqs/{0..15}

  #3 requires CONFIG_GENERIC_IRQ_DEBUGFS to be set.

Two versions of that please:

  1) Unpatched kernel
  2) Patched kernel

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-18 22:50 ` Thomas Gleixner
@ 2023-10-19 16:39   ` David Lazăr
  2023-10-19 21:20   ` Mario Limonciello
  1 sibling, 0 replies; 23+ messages in thread
From: David Lazăr @ 2023-10-19 16:39 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Mario Limonciello, Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov

[-- Attachment #1: Type: text/plain, Size: 661 bytes --]


--- On Thu, 19 Oct 2023, Thomas Gleixner wrote:
> Can I please get the following information from an affected machine:
> 
>   1) dmesg with 'apic=verbose' on the command line
>   2) /proc/interrupts
>   3) /sys/kernel/debug/irq/irqs/{0..15}
> 
>   #3 requires CONFIG_GENERIC_IRQ_DEBUGFS to be set.
> 
> Two versions of that please:
> 
>   1) Unpatched kernel
>   2) Patched kernel

Since they're a lot of files, I've uploaded them in a .tar.gz[0]
attached to the bug[1].

[0] https://bugzilla.kernel.org/attachment.cgi?id=305266&action=edit
[1] https://bugzilla.kernel.org/show_bug.cgi?id=218003

Hope this helps.

Cheers,
-=[david]=-

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-18 22:50 ` Thomas Gleixner
  2023-10-19 16:39   ` David Lazăr
@ 2023-10-19 21:20   ` Mario Limonciello
  2023-10-20  3:43     ` Mario Limonciello
                       ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-19 21:20 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On 10/18/2023 17:50, Thomas Gleixner wrote:
> On Wed, Oct 18 2023 at 13:50, Mario Limonciello wrote:
>> Recently an issue was reported to Bugzilla [1] that the Keyboard (IRQ 1)
>> and GPIO controller (IRQ 7) weren't working properly on two separate
>> Lenovo machines.  The issues are unique to Linux.
>>
>> In digging through them, they happen because Lenovo didn't set up the
>> PIC in the BIOS.
>> Specifically the PIC probing code introduced by e179f6914152 ("x86, irq,
>> pic: Probe for legacy PIC and set legacy_pic appropriately") expects
>> that the BIOS sets up the PIC and uses that assertion to let Linux set
>> it up.
>>
>> One of the reporters confirmed that reverting e179f6914152 fixes the
>> issue.  Keyboard and GPIO controller both work properly.
>>
>> I wanted to ask if we can please revert that and come up with a
>> different solution for kexec with HyperV.
>> Can guests instead perhaps detect in early boot code they're running in
>> an EFI based hypervisor and explicitly set 'legacy_pic = &null_legacy_pic;'?
> 
> No. This detection mechanism prevents PIC usage also in other
> scenarios.
> 
> It's perfectly valid code and the assumption that you can read back what
> you wrote to the master IMR is entirely correct. If that's not the case
> then there is no PIC, the BIOS has disabled some parts of the legacy
> block or did not initialize it.
> 
> Letting the kernel blindly assume that there is always a PIC is just
> wrong. Worse, it puts the burden on everyone else to sprinkle
> "legacy_pic = null_pic;" all over the place with dubious
> conditions. That's exactly what the commit in question avoided.
> 
> So no, we are not going back there.
> 
> We could obviously change the probe() function to issue a full PIC
> initialization sequence before reading a known written value
> back. That's basically what the revert causes to happen via
> legacy_pic->init().
> 
> But I fundamentally hate to do that because forcing the init sequence
> just to make presumably broken code which has some dubious dependencies
> on the PIC or on nr_legacy_irqs > 0 happy is not really a solution when
> the PIC is actually not needed at all. For anything modern where all
> legacy interrupts are routed to the IO/APIC the PIC does not make any
> sense whatsoever.
> 
> We rather go and fix the real underlying problem.

Looking at the logs from David and also trying to mock up the failure on 
my side I have a few findings I'll share, please agree or disagree with 
them.

> 
> The kernel can handle the legacy interrupts perfectly fine through the
> IOAPIC. There is no hard requirement for the PIC at all except for
> really old systems which had the timer interrupt wired to the PIC and
> therefore required to route the timer interrupt through the PIC instead
> of the IO/APIC or did not provide routing entries for the IO/APIC. See
> the horrible hackery in check_timer() and the grossly misnomed
> init_IO_APIC_traps().
> 
> I just took a random machine, forced the NULL PIC and added
> 'apic=verbose' to the kernel command line and magically all the legacy
> interrupts are set up via IO/APIC despite the NULL PIC and therefore 0
> preallocated legacy interrupts.
> 
>    apic 0 pin 0 not connected
>   IOAPIC[0]: Preconfigured routing entry (0-1 -> IRQ 1 Level:0 ActiveLow:0)
>   IOAPIC[0]: Preconfigured routing entry (0-2 -> IRQ 2 Level:0 ActiveLow:0)
>   IOAPIC[0]: Preconfigured routing entry (0-3 -> IRQ 3 Level:0 ActiveLow:0)
>   ...
> 
> which is identical to the output with PIC enabled. That debug message is
> emitted from mp_irqdomain_alloc() which is invoked via the PNP resource
> management code.
> 
> Now /proc/interrupts:
> 
>             CPU0       CPU1       CPU2       CPU3
>    1:          0         56          0          0    IO-APIC   1-edge      i8042
>    4:        442          0          0          0    IO-APIC   4-edge      ttyS0
>    8:          0          0          0          0    IO-APIC   8-edge      rtc0
>    9:          0          0          0          0    IO-APIC   9-fasteoi   acpi
>   12:          0          0        122          0    IO-APIC  12-edge      i8042
> 
> Keyboard and serial are working, see?
> 
> The only interrupt which does not work is interrupt 0 because nothing
> allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
> solvable problem.

 From David's logs I can see that the timer interrupt gets wired up to 
IRQ2 instead of IRQ0.

IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 2 Level:0 ActiveLow:0)

In my hacked up forcing NULL pic case this fixes that:

diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index 43c1c24e934b..885687e64e4e 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -424,7 +424,7 @@ static int legacy_pic_probe(void)
  }

  struct legacy_pic null_legacy_pic = {
-       .nr_legacy_irqs = 0,
+       .nr_legacy_irqs = 1,
         .chip = &dummy_irq_chip,
         .mask = legacy_pic_uint_noop,
         .unmask = legacy_pic_uint_noop,

I think it's cleaner than changing all the places that use 
nr_legacy_irqs().  On my side this makes:

IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 0 Level:0 ActiveLow:0)

> 
> That machine does not even need the timer interrupt because it has a
> usable APIC and TSC deadline timer, so no APIC timer calibration
> required. The same is true for CPUs which do not have a TSC deadline
> timer, but enumerate the APIC frequency via CPUID or MSRs.

Don't you need it for things like rtcwake to be able to work?

> 
> But that brings up an interesting question. How are those affected
> machines even reaching a state where the user notices that just the
> keyboard and the GPIO are not working? Why?

So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to 
try to discover the IRQ to use.

This calls acpi_irq_get() which isn't implemented on x86 (hardcodes 
-EINVAL).

I can "work around it" by:

diff --git a/drivers/base/platform.c b/drivers/base/platform.c
index 76bfcba25003..2b4b436c65d8 100644
--- a/drivers/base/platform.c
+++ b/drivers/base/platform.c
@@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device 
*dev, unsigned int num)
         }

         r = platform_get_resource(dev, IORESOURCE_IRQ, num);
-       if (has_acpi_companion(&dev->dev)) {
+       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
+            has_acpi_companion(&dev->dev)) {
                 if (r && r->flags & IORESOURCE_DISABLED) {
                         ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
                         if (ret)

but the resource that is returned from the next hunk has the resource 
flags set wrong in the NULL pic case:

NULL case:
r: AMDI0030:00 flags: 0x30000418
PIC case:
r: AMDI0030:00 flags: 0x418

IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET

This then later the GPIO controller interrupts are not actually working.
For example the attn pin for my I2C touchpad doesn't work.

Checking the debugfs with my "work around" in place I can see a few 
things set up differently:

NULL case:
handler:  handle_edge_irq
dstate:   0x3740c208
             IRQ_TYPE_LEVEL_LOW
             IRQD_ACTIVATED
             IRQD_IRQ_STARTED
             IRQD_SINGLE_TARGET
             IRQD_MOVE_PCNTXT
             IRQD_AFFINITY_ON_ACTIVATE
             IRQD_CAN_RESERVE
             IRQD_WAKEUP_STATE
             IRQD_DEFAULT_TRIGGER_SET
             IRQD_HANDLE_ENFORCE_IRQCTX

PIC case:
handler:  handle_fasteoi_irq
dstate:   0x3740e208
             IRQ_TYPE_LEVEL_LOW
             IRQD_LEVEL
             IRQD_ACTIVATED
             IRQD_IRQ_STARTED
             IRQD_SINGLE_TARGET
             IRQD_MOVE_PCNTXT
             IRQD_AFFINITY_ON_ACTIVATE
             IRQD_CAN_RESERVE
             IRQD_WAKEUP_STATE
             IRQD_DEFAULT_TRIGGER_SET
             IRQD_HANDLE_ENFORCE_IRQCTX

I guess something related to the callpath for mp_register_handler().

Maybe this is the same reason for the keyboard not working right.

> 
> The CPUID/MSR APIC frequency enumeration is Intel specific and
> everything else depends on a working timer interrupt to calibrate the
> APIC timer frequency. So AMD CPUs require the timer interrupt to
> work. The only explanation why this "works" in that null PIC case is
> that the PIT/HPET interrupt is actually wired to pin 0, but that's
> something to be determined...
> 
> Can I please get the following information from an affected machine:
> 
>    1) dmesg with 'apic=verbose' on the command line
>    2) /proc/interrupts
>    3) /sys/kernel/debug/irq/irqs/{0..15}
> 
>    #3 requires CONFIG_GENERIC_IRQ_DEBUGFS to be set.
> 
> Two versions of that please:
> 
>    1) Unpatched kernel
>    2) Patched kernel
> 
> Thanks,
> 
>          tglx


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-19 21:20   ` Mario Limonciello
@ 2023-10-20  3:43     ` Mario Limonciello
  2023-10-20 15:16     ` Hans de Goede
  2023-10-23 15:59     ` Thomas Gleixner
  2 siblings, 0 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-20  3:43 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa, dlazar
  Cc: x86, LKML, Borislav Petkov

On 10/19/2023 16:20, Mario Limonciello wrote:
> On 10/18/2023 17:50, Thomas Gleixner wrote:
>> On Wed, Oct 18 2023 at 13:50, Mario Limonciello wrote:
>>> Recently an issue was reported to Bugzilla [1] that the Keyboard (IRQ 1)
>>> and GPIO controller (IRQ 7) weren't working properly on two separate
>>> Lenovo machines.  The issues are unique to Linux.
>>>
>>> In digging through them, they happen because Lenovo didn't set up the
>>> PIC in the BIOS.
>>> Specifically the PIC probing code introduced by e179f6914152 ("x86, irq,
>>> pic: Probe for legacy PIC and set legacy_pic appropriately") expects
>>> that the BIOS sets up the PIC and uses that assertion to let Linux set
>>> it up.
>>>
>>> One of the reporters confirmed that reverting e179f6914152 fixes the
>>> issue.  Keyboard and GPIO controller both work properly.
>>>
>>> I wanted to ask if we can please revert that and come up with a
>>> different solution for kexec with HyperV.
>>> Can guests instead perhaps detect in early boot code they're running in
>>> an EFI based hypervisor and explicitly set 'legacy_pic = 
>>> &null_legacy_pic;'?
>>
>> No. This detection mechanism prevents PIC usage also in other
>> scenarios.
>>
>> It's perfectly valid code and the assumption that you can read back what
>> you wrote to the master IMR is entirely correct. If that's not the case
>> then there is no PIC, the BIOS has disabled some parts of the legacy
>> block or did not initialize it.
>>
>> Letting the kernel blindly assume that there is always a PIC is just
>> wrong. Worse, it puts the burden on everyone else to sprinkle
>> "legacy_pic = null_pic;" all over the place with dubious
>> conditions. That's exactly what the commit in question avoided.
>>
>> So no, we are not going back there.
>>
>> We could obviously change the probe() function to issue a full PIC
>> initialization sequence before reading a known written value
>> back. That's basically what the revert causes to happen via
>> legacy_pic->init().
>>
>> But I fundamentally hate to do that because forcing the init sequence
>> just to make presumably broken code which has some dubious dependencies
>> on the PIC or on nr_legacy_irqs > 0 happy is not really a solution when
>> the PIC is actually not needed at all. For anything modern where all
>> legacy interrupts are routed to the IO/APIC the PIC does not make any
>> sense whatsoever.
>>
>> We rather go and fix the real underlying problem.
> 
> Looking at the logs from David and also trying to mock up the failure on 
> my side I have a few findings I'll share, please agree or disagree with 
> them.
> 
>>
>> The kernel can handle the legacy interrupts perfectly fine through the
>> IOAPIC. There is no hard requirement for the PIC at all except for
>> really old systems which had the timer interrupt wired to the PIC and
>> therefore required to route the timer interrupt through the PIC instead
>> of the IO/APIC or did not provide routing entries for the IO/APIC. See
>> the horrible hackery in check_timer() and the grossly misnomed
>> init_IO_APIC_traps().
>>
>> I just took a random machine, forced the NULL PIC and added
>> 'apic=verbose' to the kernel command line and magically all the legacy
>> interrupts are set up via IO/APIC despite the NULL PIC and therefore 0
>> preallocated legacy interrupts.
>>
>>    apic 0 pin 0 not connected
>>   IOAPIC[0]: Preconfigured routing entry (0-1 -> IRQ 1 Level:0 
>> ActiveLow:0)
>>   IOAPIC[0]: Preconfigured routing entry (0-2 -> IRQ 2 Level:0 
>> ActiveLow:0)
>>   IOAPIC[0]: Preconfigured routing entry (0-3 -> IRQ 3 Level:0 
>> ActiveLow:0)
>>   ...
>>
>> which is identical to the output with PIC enabled. That debug message is
>> emitted from mp_irqdomain_alloc() which is invoked via the PNP resource
>> management code.
>>
>> Now /proc/interrupts:
>>
>>             CPU0       CPU1       CPU2       CPU3
>>    1:          0         56          0          0    IO-APIC   
>> 1-edge      i8042
>>    4:        442          0          0          0    IO-APIC   
>> 4-edge      ttyS0
>>    8:          0          0          0          0    IO-APIC   
>> 8-edge      rtc0
>>    9:          0          0          0          0    IO-APIC   
>> 9-fasteoi   acpi
>>   12:          0          0        122          0    IO-APIC  
>> 12-edge      i8042
>>
>> Keyboard and serial are working, see?
>>
>> The only interrupt which does not work is interrupt 0 because nothing
>> allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
>> solvable problem.
> 
>  From David's logs I can see that the timer interrupt gets wired up to 
> IRQ2 instead of IRQ0.
> 
> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 2 Level:0 ActiveLow:0)
> 
> In my hacked up forcing NULL pic case this fixes that:
> 
> diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
> index 43c1c24e934b..885687e64e4e 100644
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -424,7 +424,7 @@ static int legacy_pic_probe(void)
>   }
> 
>   struct legacy_pic null_legacy_pic = {
> -       .nr_legacy_irqs = 0,
> +       .nr_legacy_irqs = 1,
>          .chip = &dummy_irq_chip,
>          .mask = legacy_pic_uint_noop,
>          .unmask = legacy_pic_uint_noop,
> 
> I think it's cleaner than changing all the places that use 
> nr_legacy_irqs().  On my side this makes:
> 
> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 0 Level:0 ActiveLow:0)
> 
>>
>> That machine does not even need the timer interrupt because it has a
>> usable APIC and TSC deadline timer, so no APIC timer calibration
>> required. The same is true for CPUs which do not have a TSC deadline
>> timer, but enumerate the APIC frequency via CPUID or MSRs.
> 
> Don't you need it for things like rtcwake to be able to work?
> 
>>
>> But that brings up an interesting question. How are those affected
>> machines even reaching a state where the user notices that just the
>> keyboard and the GPIO are not working? Why?
> 
> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to 
> try to discover the IRQ to use.
> 
> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes 
> -EINVAL).
> 
> I can "work around it" by:
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 76bfcba25003..2b4b436c65d8 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device 
> *dev, unsigned int num)
>          }
> 
>          r = platform_get_resource(dev, IORESOURCE_IRQ, num);
> -       if (has_acpi_companion(&dev->dev)) {
> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
> +            has_acpi_companion(&dev->dev)) {
>                  if (r && r->flags & IORESOURCE_DISABLED) {
>                          ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, 
> r);
>                          if (ret)
> 
> but the resource that is returned from the next hunk has the resource 
> flags set wrong in the NULL pic case:
> 
> NULL case:
> r: AMDI0030:00 flags: 0x30000418
> PIC case:
> r: AMDI0030:00 flags: 0x418
> 
> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
> 
> This then later the GPIO controller interrupts are not actually working.
> For example the attn pin for my I2C touchpad doesn't work.
> 
> Checking the debugfs with my "work around" in place I can see a few 
> things set up differently:
> 
> NULL case:
> handler:  handle_edge_irq
> dstate:   0x3740c208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_ACTIVATED
>              IRQD_IRQ_STARTED
>              IRQD_SINGLE_TARGET
>              IRQD_MOVE_PCNTXT
>              IRQD_AFFINITY_ON_ACTIVATE
>              IRQD_CAN_RESERVE
>              IRQD_WAKEUP_STATE
>              IRQD_DEFAULT_TRIGGER_SET
>              IRQD_HANDLE_ENFORCE_IRQCTX
> 
> PIC case:
> handler:  handle_fasteoi_irq
> dstate:   0x3740e208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_LEVEL
>              IRQD_ACTIVATED
>              IRQD_IRQ_STARTED
>              IRQD_SINGLE_TARGET
>              IRQD_MOVE_PCNTXT
>              IRQD_AFFINITY_ON_ACTIVATE
>              IRQD_CAN_RESERVE
>              IRQD_WAKEUP_STATE
>              IRQD_DEFAULT_TRIGGER_SET
>              IRQD_HANDLE_ENFORCE_IRQCTX
> 
> I guess something related to the callpath for mp_register_handler().
> 
> Maybe this is the same reason for the keyboard not working right.
> 
>>
>> The CPUID/MSR APIC frequency enumeration is Intel specific and
>> everything else depends on a working timer interrupt to calibrate the
>> APIC timer frequency. So AMD CPUs require the timer interrupt to
>> work. The only explanation why this "works" in that null PIC case is
>> that the PIT/HPET interrupt is actually wired to pin 0, but that's
>> something to be determined...
>>
>> Can I please get the following information from an affected machine:
>>
>>    1) dmesg with 'apic=verbose' on the command line
>>    2) /proc/interrupts
>>    3) /sys/kernel/debug/irq/irqs/{0..15}
>>
>>    #3 requires CONFIG_GENERIC_IRQ_DEBUGFS to be set.
>>
>> Two versions of that please:
>>
>>    1) Unpatched kernel
>>    2) Patched kernel
>>
>> Thanks,
>>
>>          tglx
> 

At least on my system with forcing NULL PIC I've come up with this 
series to make everything work.

David, can you please have a try with it on your system?

https://lore.kernel.org/linux-kernel/20231020033806.88063-1-mario.limonciello@amd.com/#t

Thanks,

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-19 21:20   ` Mario Limonciello
  2023-10-20  3:43     ` Mario Limonciello
@ 2023-10-20 15:16     ` Hans de Goede
  2023-10-20 17:13       ` Mario Limonciello
  2023-10-23 15:59     ` Thomas Gleixner
  2 siblings, 1 reply; 23+ messages in thread
From: Hans de Goede @ 2023-10-20 15:16 UTC (permalink / raw)
  To: Mario Limonciello, Thomas Gleixner, kys, hpa; +Cc: x86, LKML, Borislav Petkov

Hi Mario,

On 10/19/23 23:20, Mario Limonciello wrote:
> On 10/18/2023 17:50, Thomas Gleixner wrote:

<snip>

>> But that brings up an interesting question. How are those affected
>> machines even reaching a state where the user notices that just the
>> keyboard and the GPIO are not working? Why?
> 
> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to try to discover the IRQ to use.
> 
> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes -EINVAL).
> 
> I can "work around it" by:
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 76bfcba25003..2b4b436c65d8 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
>         }
> 
>         r = platform_get_resource(dev, IORESOURCE_IRQ, num);
> -       if (has_acpi_companion(&dev->dev)) {
> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
> +            has_acpi_companion(&dev->dev)) {
>                 if (r && r->flags & IORESOURCE_DISABLED) {
>                         ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
>                         if (ret)
> 
> but the resource that is returned from the next hunk has the resource flags set wrong in the NULL pic case:
> 
> NULL case:
> r: AMDI0030:00 flags: 0x30000418
> PIC case:
> r: AMDI0030:00 flags: 0x418
> 
> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
> 
> This then later the GPIO controller interrupts are not actually working.
> For example the attn pin for my I2C touchpad doesn't work.

Right the issue is that with the legacy-pic path disabled /
with nr_legacy_irqs() returning 0 them there is no mapping
added for the Legacy ISA IRQs which causes this problem.

My hack to set nr_legacy_irqs to 16 also for the NULL PIC from:
https://bugzilla.kernel.org/show_bug.cgi?id=218003

Does cause the Legacy ISA IRQ mappings to get added and makes
the GPIO controller actually work, as can be seen from:

https://bugzilla.kernel.org/attachment.cgi?id=305241&action=edit

Which is a dmesg with that hack and it does NOT have this error:

[    0.276113] amd_gpio AMDI0030:00: error -EINVAL: IRQ index 0 not found
[    0.278464] amd_gpio: probe of AMDI0030:00 failed with error -22

and the reporter also reports the touchpad works with this patch.

As Thomas already said the legayc PIC really is not necessary,
but what is still necessary on these laptops with the legacy PIC
not initialized is to have the Legacy ISA IRQ mappings added
by the kernel itself since these are missing from the MADT
(if I have my ACPI/IOAPIC terminology correct).

This quick hack (which is the one from the working dmesg)
does this:

--- a/arch/x86/kernel/i8259.c	
+++ a/arch/x86/kernel/i8259.c	
@@ -394,7 +394,7 @@ static int legacy_pic_probe(void)
 }
 
 struct legacy_pic null_legacy_pic = {
-	.nr_legacy_irqs = 0,
+	.nr_legacy_irqs = NR_IRQS_LEGACY,
 	.chip = &dummy_irq_chip,
 	.mask = legacy_pic_uint_noop,
 	.unmask = legacy_pic_uint_noop,

But I believe this will break things when there are actually
non legacy ISA IRQs / GSI-s using GSI numbers < NR_IRQS_LEGACY

Thomas, I'm not at all familiar with this area of the kernel,
but would checking if the MADT defines any non ISA GSIs under
16 and if NOT use nr_legacy_irqs = NR_IRQS_LEGACY for the
NULL PIC be an option?

Or maybe some sort of DMI (sys_vendor == Lenovo) quirk to
set nr_legacy_irqs = NR_IRQS_LEGACY for the NULL PIC ?

Regards,

Hans




^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-20 15:16     ` Hans de Goede
@ 2023-10-20 17:13       ` Mario Limonciello
  0 siblings, 0 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-20 17:13 UTC (permalink / raw)
  To: Hans de Goede, Thomas Gleixner, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On 10/20/2023 10:16, Hans de Goede wrote:
> Hi Mario,
> 
> On 10/19/23 23:20, Mario Limonciello wrote:
>> On 10/18/2023 17:50, Thomas Gleixner wrote:
> 
> <snip>
> 
>>> But that brings up an interesting question. How are those affected
>>> machines even reaching a state where the user notices that just the
>>> keyboard and the GPIO are not working? Why?
>>
>> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to try to discover the IRQ to use.
>>
>> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes -EINVAL).
>>
>> I can "work around it" by:
>>
>> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
>> index 76bfcba25003..2b4b436c65d8 100644
>> --- a/drivers/base/platform.c
>> +++ b/drivers/base/platform.c
>> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device *dev, unsigned int num)
>>          }
>>
>>          r = platform_get_resource(dev, IORESOURCE_IRQ, num);
>> -       if (has_acpi_companion(&dev->dev)) {
>> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
>> +            has_acpi_companion(&dev->dev)) {
>>                  if (r && r->flags & IORESOURCE_DISABLED) {
>>                          ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
>>                          if (ret)
>>
>> but the resource that is returned from the next hunk has the resource flags set wrong in the NULL pic case:
>>
>> NULL case:
>> r: AMDI0030:00 flags: 0x30000418
>> PIC case:
>> r: AMDI0030:00 flags: 0x418
>>
>> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
>>
>> This then later the GPIO controller interrupts are not actually working.
>> For example the attn pin for my I2C touchpad doesn't work.
> 
> Right the issue is that with the legacy-pic path disabled /
> with nr_legacy_irqs() returning 0 them there is no mapping
> added for the Legacy ISA IRQs which causes this problem.
> 
> My hack to set nr_legacy_irqs to 16 also for the NULL PIC from:
> https://bugzilla.kernel.org/show_bug.cgi?id=218003
> 
> Does cause the Legacy ISA IRQ mappings to get added and makes
> the GPIO controller actually work, as can be seen from:
> 
> https://bugzilla.kernel.org/attachment.cgi?id=305241&action=edit
> 
> Which is a dmesg with that hack and it does NOT have this error:
> 
> [    0.276113] amd_gpio AMDI0030:00: error -EINVAL: IRQ index 0 not found
> [    0.278464] amd_gpio: probe of AMDI0030:00 failed with error -22
> 
> and the reporter also reports the touchpad works with this patch.
> 
> As Thomas already said the legayc PIC really is not necessary,
> but what is still necessary on these laptops with the legacy PIC
> not initialized is to have the Legacy ISA IRQ mappings added
> by the kernel itself since these are missing from the MADT
> (if I have my ACPI/IOAPIC terminology correct).

They're not missing, the problem is that the ioapic code doesn't
let it get updated because of what I see as an extra nr_legacy_irqs()
check.

The series I posted I believe fixes this issue.

> 
> This quick hack (which is the one from the working dmesg)
> does this:
> 
> --- a/arch/x86/kernel/i8259.c	
> +++ a/arch/x86/kernel/i8259.c	
> @@ -394,7 +394,7 @@ static int legacy_pic_probe(void)
>   }
>   
>   struct legacy_pic null_legacy_pic = {
> -	.nr_legacy_irqs = 0,
> +	.nr_legacy_irqs = NR_IRQS_LEGACY,
>   	.chip = &dummy_irq_chip,
>   	.mask = legacy_pic_uint_noop,
>   	.unmask = legacy_pic_uint_noop,
> 
> But I believe this will break things when there are actually
> non legacy ISA IRQs / GSI-s using GSI numbers < NR_IRQS_LEGACY
> 
> Thomas, I'm not at all familiar with this area of the kernel,
> but would checking if the MADT defines any non ISA GSIs under
> 16 and if NOT use nr_legacy_irqs = NR_IRQS_LEGACY for the
> NULL PIC be an option?
> 
> Or maybe some sort of DMI (sys_vendor == Lenovo) quirk to
> set nr_legacy_irqs = NR_IRQS_LEGACY for the NULL PIC ?
> 

I'd prefer we don't do this.
As tglx pointed out there is an underlying bug and we shouldn't paper 
over it with quirks.

My guess at what he doesn't see this issue on his system is that the 
default preconfigured IOAPIC mappings (polarity and triggering) happen 
to match the values that would have been programmed from _CRS.

That's not the case here.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-19 21:20   ` Mario Limonciello
  2023-10-20  3:43     ` Mario Limonciello
  2023-10-20 15:16     ` Hans de Goede
@ 2023-10-23 15:59     ` Thomas Gleixner
  2023-10-23 16:17       ` Mario Limonciello
  2023-10-25  9:23       ` Thomas Gleixner
  2 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-23 15:59 UTC (permalink / raw)
  To: Mario Limonciello, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On Thu, Oct 19 2023 at 16:20, Mario Limonciello wrote:
> On 10/18/2023 17:50, Thomas Gleixner wrote:
>> The only interrupt which does not work is interrupt 0 because nothing
>> allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
>> solvable problem.
>
>  From David's logs I can see that the timer interrupt gets wired up to 
> IRQ2 instead of IRQ0.

Sure, but that's not really a problem. Nothing needs the timer
interrupt in principle.

> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 2 Level:0 ActiveLow:0)
>
> In my hacked up forcing NULL pic case this fixes that:
>
> diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
> index 43c1c24e934b..885687e64e4e 100644
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -424,7 +424,7 @@ static int legacy_pic_probe(void)
>   }
>
>   struct legacy_pic null_legacy_pic = {
> -       .nr_legacy_irqs = 0,
> +       .nr_legacy_irqs = 1,
>          .chip = &dummy_irq_chip,
>          .mask = legacy_pic_uint_noop,
>          .unmask = legacy_pic_uint_noop,
>
> I think it's cleaner than changing all the places that use 
> nr_legacy_irqs().

No. It's not cleaner. It's a hack and you still need to audit all places
which depend on nr_legacy_irqs(). Also why '1'? You could as well use
'16', no?

> On my side this makes:
>
> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 0 Level:0
> ActiveLow:0)

Sure, but that can be achieved by other means in a clean way as
well. Can we please focus on analyzing the underlying problems instead
of trying random hacks? The timer part is well understood already.

>> That machine does not even need the timer interrupt because it has a
>> usable APIC and TSC deadline timer, so no APIC timer calibration
>> required. The same is true for CPUs which do not have a TSC deadline
>> timer, but enumerate the APIC frequency via CPUID or MSRs.
>
> Don't you need it for things like rtcwake to be able to work?

Timer != RTC.

The RTC interrupt is separate (IRQ 8), but in the case of this system it
is using the HPET-RTC emulation which fails to initialize because
interrupt 0 is not available.

>> But that brings up an interesting question. How are those affected
>> machines even reaching a state where the user notices that just the
>> keyboard and the GPIO are not working? Why?
>
> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to 
> try to discover the IRQ to use.
>
> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes 
> -EINVAL).
>
> I can "work around it" by:
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 76bfcba25003..2b4b436c65d8 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device 
> *dev, unsigned int num)
>          }
>
>          r = platform_get_resource(dev, IORESOURCE_IRQ, num);
> -       if (has_acpi_companion(&dev->dev)) {
> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
> +            has_acpi_companion(&dev->dev)) {
>                  if (r && r->flags & IORESOURCE_DISABLED) {
>                          ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
>                          if (ret)

So why is acpi_irq_get() reached when the PIC is disabled, but not when
the PIC is enabled? Because of the below:

> but the resource that is returned from the next hunk ?

next hunk? The resource is returned by platform_get_resource() above, no?

> has the resource flags set wrong in the NULL pic case:
>
> NULL case:
> r: AMDI0030:00 flags: 0x30000418
> PIC case:
> r: AMDI0030:00 flags: 0x418
>
> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET

So the real question is WHY are the DISABLED/UNSET flags not set in the
PIC case?

> NULL case:
> handler:  handle_edge_irq
> dstate:   0x3740c208
>              IRQ_TYPE_LEVEL_LOW
>
> PIC case:
> handler:  handle_fasteoi_irq
> dstate:   0x3740e208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_LEVEL
>
> I guess something related to the callpath for mp_register_handler().

Guessing is not helpful.

There is a difference in how the allocation info is set up when legacy
PIC is enabled, but that does not explain the above resource flag
difference.

As there is no override for IRQ7:

[    0.011415] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.011417] Int: type 0, pol 0, trig 0, bus 00, IRQ 00, APIC ID 20, APIC INT 02
[    0.011418] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[    0.011419] Int: type 0, pol 3, trig 3, bus 00, IRQ 09, APIC ID 20, APIC INT 09
...
[    0.011425] Int: type 0, pol 0, trig 0, bus 00, IRQ 07, APIC ID 20, APIC INT 07

the initial setup of the IOAPIC interrupt is edge, while the initial
setup of the legacy PIC is level. But that gets changed later to edge
when the IOAPIC is initialized.

I'm not seeing the magic which make the above different yet, though I'm
100% sure by now that this "works" definitely not by design. It just
works by pure luck.

Now when platform_get_irq_optional() sets the trigger type via
irqd_set_trigger_type() it just sets LEVEL_LOW, but does not change the
handler and does not set IRQD_LEVEL. It does neither change the IO/APIC
pin setup. This happens because the IOAPIC interrupt chip does not
implement an irq_set_type() callback.

IOW the whole machinery depends on magic setup ordering vs. PIC and pure
luck. Adding the callback is not rocket science, but while it should
make the interrupt work it still does not explain the magic "working"
when the legacy PIC is enabled.

Let me sit down and add a pile of debug printks to all the relevant
places as we really need to understand the underlying magic effects of
legacy PIC first.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-23 15:59     ` Thomas Gleixner
@ 2023-10-23 16:17       ` Mario Limonciello
  2023-10-23 17:50         ` Thomas Gleixner
  2023-10-25  9:23       ` Thomas Gleixner
  1 sibling, 1 reply; 23+ messages in thread
From: Mario Limonciello @ 2023-10-23 16:17 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On 10/23/2023 10:59, Thomas Gleixner wrote:
> On Thu, Oct 19 2023 at 16:20, Mario Limonciello wrote:
>> On 10/18/2023 17:50, Thomas Gleixner wrote:
>>> The only interrupt which does not work is interrupt 0 because nothing
>>> allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
>>> solvable problem.
>>
>>   From David's logs I can see that the timer interrupt gets wired up to
>> IRQ2 instead of IRQ0.
> 
> Sure, but that's not really a problem. Nothing needs the timer
> interrupt in principle.
> 
>> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 2 Level:0 ActiveLow:0)
>>
>> In my hacked up forcing NULL pic case this fixes that:
>>
>> diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
>> index 43c1c24e934b..885687e64e4e 100644
>> --- a/arch/x86/kernel/i8259.c
>> +++ b/arch/x86/kernel/i8259.c
>> @@ -424,7 +424,7 @@ static int legacy_pic_probe(void)
>>    }
>>
>>    struct legacy_pic null_legacy_pic = {
>> -       .nr_legacy_irqs = 0,
>> +       .nr_legacy_irqs = 1,
>>           .chip = &dummy_irq_chip,
>>           .mask = legacy_pic_uint_noop,
>>           .unmask = legacy_pic_uint_noop,
>>
>> I think it's cleaner than changing all the places that use
>> nr_legacy_irqs().
> 
> No. It's not cleaner. It's a hack and you still need to audit all places
> which depend on nr_legacy_irqs(). Also why '1'? You could as well use
> '16', no? >
>> On my side this makes:
>>
>> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 0 Level:0
>> ActiveLow:0)
> 
> Sure, but that can be achieved by other means in a clean way as
> well. Can we please focus on analyzing the underlying problems instead
> of trying random hacks? The timer part is well understood already.
> 
>>> That machine does not even need the timer interrupt because it has a
>>> usable APIC and TSC deadline timer, so no APIC timer calibration
>>> required. The same is true for CPUs which do not have a TSC deadline
>>> timer, but enumerate the APIC frequency via CPUID or MSRs.
>>
>> Don't you need it for things like rtcwake to be able to work?
> 
> Timer != RTC.
> 
> The RTC interrupt is separate (IRQ 8), but in the case of this system it
> is using the HPET-RTC emulation which fails to initialize because
> interrupt 0 is not available.

That's exactly why I allocated 1 IRQ for IRQ 0.

> 
>>> But that brings up an interesting question. How are those affected
>>> machines even reaching a state where the user notices that just the
>>> keyboard and the GPIO are not working? Why?
>>
>> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to
>> try to discover the IRQ to use.
>>
>> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes
>> -EINVAL).
>>
>> I can "work around it" by:
>>
>> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
>> index 76bfcba25003..2b4b436c65d8 100644
>> --- a/drivers/base/platform.c
>> +++ b/drivers/base/platform.c
>> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device
>> *dev, unsigned int num)
>>           }
>>
>>           r = platform_get_resource(dev, IORESOURCE_IRQ, num);
>> -       if (has_acpi_companion(&dev->dev)) {
>> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
>> +            has_acpi_companion(&dev->dev)) {
>>                   if (r && r->flags & IORESOURCE_DISABLED) {
>>                           ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, r);
>>                           if (ret)
> 
> So why is acpi_irq_get() reached when the PIC is disabled, but not when
> the PIC is enabled? Because of the below:
> 
>> but the resource that is returned from the next hunk ?
> 
> next hunk? The resource is returned by platform_get_resource() above, no?
> 
>> has the resource flags set wrong in the NULL pic case:
>>
>> NULL case:
>> r: AMDI0030:00 flags: 0x30000418
>> PIC case:
>> r: AMDI0030:00 flags: 0x418
>>
>> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
> 
> So the real question is WHY are the DISABLED/UNSET flags not set in the
> PIC case?
> 
>> NULL case:
>> handler:  handle_edge_irq
>> dstate:   0x3740c208
>>               IRQ_TYPE_LEVEL_LOW
>>
>> PIC case:
>> handler:  handle_fasteoi_irq
>> dstate:   0x3740e208
>>               IRQ_TYPE_LEVEL_LOW
>>               IRQD_LEVEL
>>
>> I guess something related to the callpath for mp_register_handler().
> 
> Guessing is not helpful.
> 
> There is a difference in how the allocation info is set up when legacy
> PIC is enabled, but that does not explain the above resource flag
> difference.

I did a pile of printks and that's how I realized it's because of the 
missing call to mp_register_handler() which is dependent upon what 
appeared to me to be a superfluous number of legacy IRQs check (patch 1 
in my solution).

> 
> As there is no override for IRQ7:
> 
> [    0.011415] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [    0.011417] Int: type 0, pol 0, trig 0, bus 00, IRQ 00, APIC ID 20, APIC INT 02
> [    0.011418] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
> [    0.011419] Int: type 0, pol 3, trig 3, bus 00, IRQ 09, APIC ID 20, APIC INT 09
> ...
> [    0.011425] Int: type 0, pol 0, trig 0, bus 00, IRQ 07, APIC ID 20, APIC INT 07
> 
> the initial setup of the IOAPIC interrupt is edge, while the initial
> setup of the legacy PIC is level. But that gets changed later to edge
> when the IOAPIC is initialized.
> 
> I'm not seeing the magic which make the above different yet, though I'm
> 100% sure by now that this "works" definitely not by design. It just
> works by pure luck.
> 
> Now when platform_get_irq_optional() sets the trigger type via
> irqd_set_trigger_type() it just sets LEVEL_LOW, but does not change the
> handler and does not set IRQD_LEVEL. It does neither change the IO/APIC
> pin setup. This happens because the IOAPIC interrupt chip does not
> implement an irq_set_type() callback.
> 
> IOW the whole machinery depends on magic setup ordering vs. PIC and pure
> luck. Adding the callback is not rocket science, but while it should
> make the interrupt work it still does not explain the magic "working"
> when the legacy PIC is enabled.
> 
> Let me sit down and add a pile of debug printks to all the relevant
> places as we really need to understand the underlying magic effects of
> legacy PIC first.
> 

OK let's see if you come up with different conclusions.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-23 16:17       ` Mario Limonciello
@ 2023-10-23 17:50         ` Thomas Gleixner
  2023-10-23 17:59           ` Mario Limonciello
  0 siblings, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-23 17:50 UTC (permalink / raw)
  To: Mario Limonciello, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On Mon, Oct 23 2023 at 11:17, Mario Limonciello wrote:
> On 10/23/2023 10:59, Thomas Gleixner wrote:
>>> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
>> 
>> So the real question is WHY are the DISABLED/UNSET flags not set in the
>> PIC case?

Do you have an answer for this?
 
>>> NULL case:
>>> handler:  handle_edge_irq
>>> dstate:   0x3740c208
>>>               IRQ_TYPE_LEVEL_LOW
>>>
>>> PIC case:
>>> handler:  handle_fasteoi_irq
>>> dstate:   0x3740e208
>>>               IRQ_TYPE_LEVEL_LOW
>>>               IRQD_LEVEL
>>>
>>> I guess something related to the callpath for mp_register_handler().
>> 
>> Guessing is not helpful.
>> 
>> There is a difference in how the allocation info is set up when legacy
>> PIC is enabled, but that does not explain the above resource flag
>> difference.
>
> I did a pile of printks and that's how I realized it's because of the 
> missing call to mp_register_handler() which is dependent upon what 
> appeared to me to be a superfluous number of legacy IRQs check (patch 1 
> in my solution).

What exactly is superfluous about these legacy checks?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-23 17:50         ` Thomas Gleixner
@ 2023-10-23 17:59           ` Mario Limonciello
  0 siblings, 0 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-23 17:59 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa; +Cc: x86, LKML, Borislav Petkov

On 10/23/2023 12:50, Thomas Gleixner wrote:
> On Mon, Oct 23 2023 at 11:17, Mario Limonciello wrote:
>> On 10/23/2023 10:59, Thomas Gleixner wrote:
>>>> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
>>>
>>> So the real question is WHY are the DISABLED/UNSET flags not set in the
>>> PIC case?
> 
> Do you have an answer for this?

Here's the problematic call path:

acpi_dev_get_irqresource()
->acpi_register_gsi()
->->__acpi_register_gsi() [ Which is acpi_register_gsi_ioapic() ]
->->->mp_map_gsi_to_irq()
->->->->mp_map_pin_to_irq()
->->->->->mp_check_pin_attr()

In the legacy PIC programmed case this function can overwrite level and 
active when acpi_register_gsi() is called.

Without the change I made in the NULL PIC case it can't.
So the resources get disabled by acpi_dev_get_irqresource().

>   
>>>> NULL case:
>>>> handler:  handle_edge_irq
>>>> dstate:   0x3740c208
>>>>                IRQ_TYPE_LEVEL_LOW
>>>>
>>>> PIC case:
>>>> handler:  handle_fasteoi_irq
>>>> dstate:   0x3740e208
>>>>                IRQ_TYPE_LEVEL_LOW
>>>>                IRQD_LEVEL
>>>>
>>>> I guess something related to the callpath for mp_register_handler().
>>>
>>> Guessing is not helpful.
>>>
>>> There is a difference in how the allocation info is set up when legacy
>>> PIC is enabled, but that does not explain the above resource flag
>>> difference.
>>
>> I did a pile of printks and that's how I realized it's because of the
>> missing call to mp_register_handler() which is dependent upon what
>> appeared to me to be a superfluous number of legacy IRQs check (patch 1
>> in my solution).
> 
> What exactly is superfluous about these legacy checks?
> 
> Thanks,
> 
>          tglx

acpi_dev_get_irqresource() tries to set up to match what's in _CRS.
If acpi_register_gsi() fails, the resource can't get setup.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-23 15:59     ` Thomas Gleixner
  2023-10-23 16:17       ` Mario Limonciello
@ 2023-10-25  9:23       ` Thomas Gleixner
  2023-10-25 14:41         ` Mario Limonciello
  1 sibling, 1 reply; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-25  9:23 UTC (permalink / raw)
  To: Mario Limonciello, Hans de Goede, kys, hpa
  Cc: x86, LKML, Borislav Petkov, Rafael J. Wysocki

On Mon, Oct 23 2023 at 17:59, Thomas Gleixner wrote:
> On Thu, Oct 19 2023 at 16:20, Mario Limonciello wrote:
>>   struct legacy_pic null_legacy_pic = {
>> -       .nr_legacy_irqs = 0,
>> +       .nr_legacy_irqs = 1,
>>          .chip = &dummy_irq_chip,
>>          .mask = legacy_pic_uint_noop,
>>          .unmask = legacy_pic_uint_noop,
>>
>> I think it's cleaner than changing all the places that use 
>> nr_legacy_irqs().
>
> No. It's not cleaner. It's a hack and you still need to audit all places
> which depend on nr_legacy_irqs(). Also why '1'? You could as well use
> '16', no?

So I sat down and did a thorough analysis of legacy PIC dependencies.

Unfortunately this is an unholy mess and sprinkled all over the place,
so there is no trivial way to resolve this quickly. This needs a proper
overhaul to decouple the actual PIC driver selection from the fact that
the kernel runs on a i8259 equipped hardware and therefore needs to
honour the legacy PNP overrides etc.

The probing itself is to stay in order to avoid sprinkling weird
conditions and NULL PIC selections all over the place.

It could be argued that the probe function should try to initialize the
PIC, but that's overkill for scenarios where the PIC does not exist.

Though it turns out that ACPI/MADT is helpful here because the MADT
header has a flags field which denotes in bit 0, whether the system has
a 8259 setup or not.

This allows to override the probe for now until we actually resolved the
dependency problems in a clean way.

Untested patch below.

Thanks,

        tglx
---
--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -69,6 +69,8 @@ struct legacy_pic {
 	void (*make_irq)(unsigned int irq);
 };
 
+void legacy_pic_pcat_compat(void);
+
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
 		pr_debug("Local APIC address 0x%08x\n", madt->address);
 	}
 
+	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
+		legacy_pic_pcat_compat();
+
 	/* ACPI 6.3 and newer support the online capable bit. */
 	if (acpi_gbl_FADT.header.revision > 6 ||
 	    (acpi_gbl_FADT.header.revision == 6 &&
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -32,6 +32,7 @@
  */
 static void init_8259A(int auto_eoi);
 
+static bool pcat_compat __ro_after_init;
 static int i8259A_auto_eoi;
 DEFINE_RAW_SPINLOCK(i8259A_lock);
 
@@ -299,15 +300,32 @@ static void unmask_8259A(void)
 
 static int probe_8259A(void)
 {
+	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
 	unsigned long flags;
-	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
-	unsigned char new_val;
+
+	/*
+	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
+	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
+	 * fails.
+	 *
+	 * Right now this causes problems as quite some code depends on
+	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
+	 * when the system has an IO/APIC because then PIC is not required
+	 * at all, except for really old machines where the timer interrupt
+	 * must be routed through the PIC. So just pretend that the PIC is
+	 * there and let legacy_pic->init() initialize it for nothing.
+	 *
+	 * Alternatively this could just try to initialize the PIC and
+	 * repeat the probe, but for cases where there is no PIC that's
+	 * just pointless.
+	 */
+	if (pcat_compat)
+		return nr_legacy_irqs();
+
 	/*
-	 * Check to see if we have a PIC.
-	 * Mask all except the cascade and read
-	 * back the value we just wrote. If we don't
-	 * have a PIC, we will read 0xff as opposed to the
-	 * value we wrote.
+	 * Check to see if we have a PIC.  Mask all except the cascade and
+	 * read back the value we just wrote. If we don't have a PIC, we
+	 * will read 0xff as opposed to the value we wrote.
 	 */
 	raw_spin_lock_irqsave(&i8259A_lock, flags);
 
@@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
 
 	return 0;
 }
-
 device_initcall(i8259A_init_ops);
+
+void __init legacy_pic_pcat_compat(void)
+{
+	pcat_compat = true;
+}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-25  9:23       ` Thomas Gleixner
@ 2023-10-25 14:41         ` Mario Limonciello
  2023-10-25 15:25           ` Mario Limonciello
  2023-10-25 15:25           ` David Lazar
  0 siblings, 2 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-25 14:41 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa, dlazar
  Cc: x86, LKML, Borislav Petkov, Rafael J. Wysocki

On 10/25/2023 04:23, Thomas Gleixner wrote:
> On Mon, Oct 23 2023 at 17:59, Thomas Gleixner wrote:
>> On Thu, Oct 19 2023 at 16:20, Mario Limonciello wrote:
>>>    struct legacy_pic null_legacy_pic = {
>>> -       .nr_legacy_irqs = 0,
>>> +       .nr_legacy_irqs = 1,
>>>           .chip = &dummy_irq_chip,
>>>           .mask = legacy_pic_uint_noop,
>>>           .unmask = legacy_pic_uint_noop,
>>>
>>> I think it's cleaner than changing all the places that use
>>> nr_legacy_irqs().
>>
>> No. It's not cleaner. It's a hack and you still need to audit all places
>> which depend on nr_legacy_irqs(). Also why '1'? You could as well use
>> '16', no?
> 
> So I sat down and did a thorough analysis of legacy PIC dependencies.
> 
> Unfortunately this is an unholy mess and sprinkled all over the place,
> so there is no trivial way to resolve this quickly. This needs a proper
> overhaul to decouple the actual PIC driver selection from the fact that
> the kernel runs on a i8259 equipped hardware and therefore needs to
> honour the legacy PNP overrides etc.
> 
> The probing itself is to stay in order to avoid sprinkling weird
> conditions and NULL PIC selections all over the place.
> 
> It could be argued that the probe function should try to initialize the
> PIC, but that's overkill for scenarios where the PIC does not exist.
> 
> Though it turns out that ACPI/MADT is helpful here because the MADT
> header has a flags field which denotes in bit 0, whether the system has
> a 8259 setup or not.
> 
> This allows to override the probe for now until we actually resolved the
> dependency problems in a clean way.
> 
> Untested patch below.

+David from the bugzilla.

I checked his acpidump and I do think this will work for him.

[024h 0036   4]           Local Apic Address : FEE00000
[028h 0040   4]        Flags (decoded below) : 00000001
                          PC-AT Compatibility : 1


David - can you see if the below helps your hardware?

> 
> Thanks,
> 
>          tglx
> ---
> --- a/arch/x86/include/asm/i8259.h
> +++ b/arch/x86/include/asm/i8259.h
> @@ -69,6 +69,8 @@ struct legacy_pic {
>   	void (*make_irq)(unsigned int irq);
>   };
>   
> +void legacy_pic_pcat_compat(void);
> +
>   extern struct legacy_pic *legacy_pic;
>   extern struct legacy_pic null_legacy_pic;
>   
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
>   		pr_debug("Local APIC address 0x%08x\n", madt->address);
>   	}
>   
> +	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
> +		legacy_pic_pcat_compat();
> +
>   	/* ACPI 6.3 and newer support the online capable bit. */
>   	if (acpi_gbl_FADT.header.revision > 6 ||
>   	    (acpi_gbl_FADT.header.revision == 6 &&
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -32,6 +32,7 @@
>    */
>   static void init_8259A(int auto_eoi);
>   
> +static bool pcat_compat __ro_after_init;
>   static int i8259A_auto_eoi;
>   DEFINE_RAW_SPINLOCK(i8259A_lock);
>   
> @@ -299,15 +300,32 @@ static void unmask_8259A(void)
>   
>   static int probe_8259A(void)
>   {
> +	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
>   	unsigned long flags;
> -	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> -	unsigned char new_val;
> +
> +	/*
> +	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
> +	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
> +	 * fails.
> +	 *
> +	 * Right now this causes problems as quite some code depends on
> +	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
> +	 * when the system has an IO/APIC because then PIC is not required
> +	 * at all, except for really old machines where the timer interrupt
> +	 * must be routed through the PIC. So just pretend that the PIC is
> +	 * there and let legacy_pic->init() initialize it for nothing.
> +	 *
> +	 * Alternatively this could just try to initialize the PIC and
> +	 * repeat the probe, but for cases where there is no PIC that's
> +	 * just pointless.
> +	 */
> +	if (pcat_compat)
> +		return nr_legacy_irqs();
> +
>   	/*
> -	 * Check to see if we have a PIC.
> -	 * Mask all except the cascade and read
> -	 * back the value we just wrote. If we don't
> -	 * have a PIC, we will read 0xff as opposed to the
> -	 * value we wrote.
> +	 * Check to see if we have a PIC.  Mask all except the cascade and
> +	 * read back the value we just wrote. If we don't have a PIC, we
> +	 * will read 0xff as opposed to the value we wrote.
>   	 */
>   	raw_spin_lock_irqsave(&i8259A_lock, flags);
>   
> @@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
>   
>   	return 0;
>   }
> -
>   device_initcall(i8259A_init_ops);
> +
> +void __init legacy_pic_pcat_compat(void)
> +{
> +	pcat_compat = true;
> +}
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-25 14:41         ` Mario Limonciello
@ 2023-10-25 15:25           ` Mario Limonciello
  2023-10-25 15:25           ` David Lazar
  1 sibling, 0 replies; 23+ messages in thread
From: Mario Limonciello @ 2023-10-25 15:25 UTC (permalink / raw)
  To: Thomas Gleixner, Hans de Goede, kys, hpa, dlazar
  Cc: x86, LKML, Borislav Petkov, Rafael J. Wysocki

On 10/25/2023 09:41, Mario Limonciello wrote:
> On 10/25/2023 04:23, Thomas Gleixner wrote:
>> On Mon, Oct 23 2023 at 17:59, Thomas Gleixner wrote:
>>> On Thu, Oct 19 2023 at 16:20, Mario Limonciello wrote:
>>>>    struct legacy_pic null_legacy_pic = {
>>>> -       .nr_legacy_irqs = 0,
>>>> +       .nr_legacy_irqs = 1,
>>>>           .chip = &dummy_irq_chip,
>>>>           .mask = legacy_pic_uint_noop,
>>>>           .unmask = legacy_pic_uint_noop,
>>>>
>>>> I think it's cleaner than changing all the places that use
>>>> nr_legacy_irqs().
>>>
>>> No. It's not cleaner. It's a hack and you still need to audit all places
>>> which depend on nr_legacy_irqs(). Also why '1'? You could as well use
>>> '16', no?
>>
>> So I sat down and did a thorough analysis of legacy PIC dependencies.
>>
>> Unfortunately this is an unholy mess and sprinkled all over the place,
>> so there is no trivial way to resolve this quickly. This needs a proper
>> overhaul to decouple the actual PIC driver selection from the fact that
>> the kernel runs on a i8259 equipped hardware and therefore needs to
>> honour the legacy PNP overrides etc.
>>
>> The probing itself is to stay in order to avoid sprinkling weird
>> conditions and NULL PIC selections all over the place.
>>
>> It could be argued that the probe function should try to initialize the
>> PIC, but that's overkill for scenarios where the PIC does not exist.
>>
>> Though it turns out that ACPI/MADT is helpful here because the MADT
>> header has a flags field which denotes in bit 0, whether the system has
>> a 8259 setup or not.
>>
>> This allows to override the probe for now until we actually resolved the
>> dependency problems in a clean way.
>>
>> Untested patch below.
> 
> +David from the bugzilla.
> 
> I checked his acpidump and I do think this will work for him.
> 
> [024h 0036   4]           Local Apic Address : FEE00000
> [028h 0040   4]        Flags (decoded below) : 00000001
>                           PC-AT Compatibility : 1
> 
> 
> David - can you see if the below helps your hardware?

FYI, David confirmed this works for fixing his hardware, thanks.

https://bugzilla.kernel.org/show_bug.cgi?id=218003#c84

> 
>>
>> Thanks,
>>
>>          tglx
>> ---
>> --- a/arch/x86/include/asm/i8259.h
>> +++ b/arch/x86/include/asm/i8259.h
>> @@ -69,6 +69,8 @@ struct legacy_pic {
>>       void (*make_irq)(unsigned int irq);
>>   };
>> +void legacy_pic_pcat_compat(void);
>> +
>>   extern struct legacy_pic *legacy_pic;
>>   extern struct legacy_pic null_legacy_pic;
>> --- a/arch/x86/kernel/acpi/boot.c
>> +++ b/arch/x86/kernel/acpi/boot.c
>> @@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
>>           pr_debug("Local APIC address 0x%08x\n", madt->address);
>>       }
>> +    if (madt->flags & ACPI_MADT_PCAT_COMPAT)
>> +        legacy_pic_pcat_compat();
>> +
>>       /* ACPI 6.3 and newer support the online capable bit. */
>>       if (acpi_gbl_FADT.header.revision > 6 ||
>>           (acpi_gbl_FADT.header.revision == 6 &&
>> --- a/arch/x86/kernel/i8259.c
>> +++ b/arch/x86/kernel/i8259.c
>> @@ -32,6 +32,7 @@
>>    */
>>   static void init_8259A(int auto_eoi);
>> +static bool pcat_compat __ro_after_init;
>>   static int i8259A_auto_eoi;
>>   DEFINE_RAW_SPINLOCK(i8259A_lock);
>> @@ -299,15 +300,32 @@ static void unmask_8259A(void)
>>   static int probe_8259A(void)
>>   {
>> +    unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
>>       unsigned long flags;
>> -    unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
>> -    unsigned char new_val;
>> +
>> +    /*
>> +     * If MADT has the PCAT_COMPAT flag set, then do not bother probing
>> +     * for the PIC. Some BIOSes leave the PIC uninitialized and probing
>> +     * fails.
>> +     *
>> +     * Right now this causes problems as quite some code depends on
>> +     * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
>> +     * when the system has an IO/APIC because then PIC is not required
>> +     * at all, except for really old machines where the timer interrupt
>> +     * must be routed through the PIC. So just pretend that the PIC is
>> +     * there and let legacy_pic->init() initialize it for nothing.
>> +     *
>> +     * Alternatively this could just try to initialize the PIC and
>> +     * repeat the probe, but for cases where there is no PIC that's
>> +     * just pointless.
>> +     */
>> +    if (pcat_compat)
>> +        return nr_legacy_irqs();
>> +
>>       /*
>> -     * Check to see if we have a PIC.
>> -     * Mask all except the cascade and read
>> -     * back the value we just wrote. If we don't
>> -     * have a PIC, we will read 0xff as opposed to the
>> -     * value we wrote.
>> +     * Check to see if we have a PIC.  Mask all except the cascade and
>> +     * read back the value we just wrote. If we don't have a PIC, we
>> +     * will read 0xff as opposed to the value we wrote.
>>        */
>>       raw_spin_lock_irqsave(&i8259A_lock, flags);
>> @@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
>>       return 0;
>>   }
>> -
>>   device_initcall(i8259A_init_ops);
>> +
>> +void __init legacy_pic_pcat_compat(void)
>> +{
>> +    pcat_compat = true;
>> +}
>>
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-25 14:41         ` Mario Limonciello
  2023-10-25 15:25           ` Mario Limonciello
@ 2023-10-25 15:25           ` David Lazar
  2023-10-25 17:31             ` Thomas Gleixner
  1 sibling, 1 reply; 23+ messages in thread
From: David Lazar @ 2023-10-25 15:25 UTC (permalink / raw)
  To: Mario Limonciello, Thomas Gleixner
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov, Rafael J. Wysocki

--- On Wed, 25 Oct 2023, Mario Limonciello wrote:
> David - can you see if the below helps your hardware?

The keyboard and mouse work fine with Thomas' patch.

I've uploaded the debug info to the bug:

https://bugzilla.kernel.org/attachment.cgi?id=305291&action=edit

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-25 15:25           ` David Lazar
@ 2023-10-25 17:31             ` Thomas Gleixner
  2023-10-25 17:37               ` Rafael J. Wysocki
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
  0 siblings, 2 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-25 17:31 UTC (permalink / raw)
  To: David Lazar, Mario Limonciello
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov, Rafael J. Wysocki

On Wed, Oct 25 2023 at 17:25, David Lazar wrote:
> --- On Wed, 25 Oct 2023, Mario Limonciello wrote:
>> David - can you see if the below helps your hardware?
>
> The keyboard and mouse work fine with Thomas' patch.
>
> I've uploaded the debug info to the bug:
>
> https://bugzilla.kernel.org/attachment.cgi?id=305291&action=edit

Let me write a changelog then. Unless Rafael has any objections to that
approach.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: PIC probing code from e179f6914152 failing
  2023-10-25 17:31             ` Thomas Gleixner
@ 2023-10-25 17:37               ` Rafael J. Wysocki
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
  1 sibling, 0 replies; 23+ messages in thread
From: Rafael J. Wysocki @ 2023-10-25 17:37 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: David Lazar, Mario Limonciello, Hans de Goede, kys, hpa, x86,
	LKML, Borislav Petkov, Rafael J. Wysocki

On Wed, Oct 25, 2023 at 7:31 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, Oct 25 2023 at 17:25, David Lazar wrote:
> > --- On Wed, 25 Oct 2023, Mario Limonciello wrote:
> >> David - can you see if the below helps your hardware?
> >
> > The keyboard and mouse work fine with Thomas' patch.
> >
> > I've uploaded the debug info to the bug:
> >
> > https://bugzilla.kernel.org/attachment.cgi?id=305291&action=edit
>
> Let me write a changelog then. Unless Rafael has any objections to that
> approach.

I don't have any, thanks!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, 
  2023-10-25 17:31             ` Thomas Gleixner
  2023-10-25 17:37               ` Rafael J. Wysocki
@ 2023-10-25 21:04               ` Thomas Gleixner
  2023-10-25 22:11                 ` Mario Limonciello
                                   ` (3 more replies)
  1 sibling, 4 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-25 21:04 UTC (permalink / raw)
  To: David Lazar, Mario Limonciello
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov, Rafael J. Wysocki

David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.

Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.

Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.

In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.

The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.

One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.

Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.

Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.

Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Cc: stable@vger.kernel.org
Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003
---
---
 arch/x86/include/asm/i8259.h |    2 ++
 arch/x86/kernel/acpi/boot.c  |    3 +++
 arch/x86/kernel/i8259.c      |   38 ++++++++++++++++++++++++++++++--------
 3 files changed, 35 insertions(+), 8 deletions(-)

--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -69,6 +69,8 @@ struct legacy_pic {
 	void (*make_irq)(unsigned int irq);
 };
 
+void legacy_pic_pcat_compat(void);
+
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
 		pr_debug("Local APIC address 0x%08x\n", madt->address);
 	}
 
+	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
+		legacy_pic_pcat_compat();
+
 	/* ACPI 6.3 and newer support the online capable bit. */
 	if (acpi_gbl_FADT.header.revision > 6 ||
 	    (acpi_gbl_FADT.header.revision == 6 &&
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -32,6 +32,7 @@
  */
 static void init_8259A(int auto_eoi);
 
+static bool pcat_compat __ro_after_init;
 static int i8259A_auto_eoi;
 DEFINE_RAW_SPINLOCK(i8259A_lock);
 
@@ -299,15 +300,32 @@ static void unmask_8259A(void)
 
 static int probe_8259A(void)
 {
+	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
 	unsigned long flags;
-	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
-	unsigned char new_val;
+
+	/*
+	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
+	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
+	 * fails.
+	 *
+	 * Right now this causes problems as quite some code depends on
+	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
+	 * when the system has an IO/APIC because then PIC is not required
+	 * at all, except for really old machines where the timer interrupt
+	 * must be routed through the PIC. So just pretend that the PIC is
+	 * there and let legacy_pic->init() initialize it for nothing.
+	 *
+	 * Alternatively this could just try to initialize the PIC and
+	 * repeat the probe, but for cases where there is no PIC that's
+	 * just pointless.
+	 */
+	if (pcat_compat)
+		return nr_legacy_irqs();
+
 	/*
-	 * Check to see if we have a PIC.
-	 * Mask all except the cascade and read
-	 * back the value we just wrote. If we don't
-	 * have a PIC, we will read 0xff as opposed to the
-	 * value we wrote.
+	 * Check to see if we have a PIC.  Mask all except the cascade and
+	 * read back the value we just wrote. If we don't have a PIC, we
+	 * will read 0xff as opposed to the value we wrote.
 	 */
 	raw_spin_lock_irqsave(&i8259A_lock, flags);
 
@@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
 
 	return 0;
 }
-
 device_initcall(i8259A_init_ops);
+
+void __init legacy_pic_pcat_compat(void)
+{
+	pcat_compat = true;
+}

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re:
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
@ 2023-10-25 22:11                 ` Mario Limonciello
  2023-10-26  9:27                   ` Re: Thomas Gleixner
  2023-10-26  8:17                 ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility Hans de Goede
                                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 23+ messages in thread
From: Mario Limonciello @ 2023-10-25 22:11 UTC (permalink / raw)
  To: Thomas Gleixner, David Lazar
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov,
	Rafael J. Wysocki, Linux kernel regressions list

On 10/25/2023 16:04, Thomas Gleixner wrote:
> David and a few others reported that on certain newer systems some legacy
> interrupts fail to work correctly.
> 
> Debugging revealed that the BIOS of these systems leaves the legacy PIC in
> uninitialized state which makes the PIC detection fail and the kernel
> switches to a dummy implementation.
> 
> Unfortunately this fallback causes quite some code to fail as it depends on
> checks for the number of legacy PIC interrupts or the availability of the
> real PIC.
> 
> In theory there is no reason to use the PIC on any modern system when
> IO/APIC is available, but the dependencies on the related checks cannot be
> resolved trivially and on short notice. This needs lots of analysis and
> rework.
> 
> The PIC detection has been added to avoid quirky checks and force selection
> of the dummy implementation all over the place, especially in VM guest
> scenarios. So it's not an option to revert the relevant commit as that
> would break a lot of other scenarios.
> 
> One solution would be to try to initialize the PIC on detection fail and
> retry the detection, but that puts the burden on everything which does not
> have a PIC.
> 
> Fortunately the ACPI/MADT table header has a flag field, which advertises
> in bit 0 that the system is PCAT compatible, which means it has a legacy
> 8259 PIC.
> 
> Evaluate that bit and if set avoid the detection routine and keep the real
> PIC installed, which then gets initialized (for nothing) and makes the rest
> of the code with all the dependencies work again.
> 
> Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
> Reported-by: David Lazar <dlazar@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: David Lazar <dlazar@gmail.com>
> Cc: stable@vger.kernel.org
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003

s/Link/Closes/

Presumably you will add a proper subject when this is committed?

With adding title and fixing that tag:

Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

> ---
> ---
>   arch/x86/include/asm/i8259.h |    2 ++
>   arch/x86/kernel/acpi/boot.c  |    3 +++
>   arch/x86/kernel/i8259.c      |   38 ++++++++++++++++++++++++++++++--------
>   3 files changed, 35 insertions(+), 8 deletions(-)
> 
> --- a/arch/x86/include/asm/i8259.h
> +++ b/arch/x86/include/asm/i8259.h
> @@ -69,6 +69,8 @@ struct legacy_pic {
>   	void (*make_irq)(unsigned int irq);
>   };
>   
> +void legacy_pic_pcat_compat(void);
> +
>   extern struct legacy_pic *legacy_pic;
>   extern struct legacy_pic null_legacy_pic;
>   
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
>   		pr_debug("Local APIC address 0x%08x\n", madt->address);
>   	}
>   
> +	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
> +		legacy_pic_pcat_compat();
> +
>   	/* ACPI 6.3 and newer support the online capable bit. */
>   	if (acpi_gbl_FADT.header.revision > 6 ||
>   	    (acpi_gbl_FADT.header.revision == 6 &&
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -32,6 +32,7 @@
>    */
>   static void init_8259A(int auto_eoi);
>   
> +static bool pcat_compat __ro_after_init;
>   static int i8259A_auto_eoi;
>   DEFINE_RAW_SPINLOCK(i8259A_lock);
>   
> @@ -299,15 +300,32 @@ static void unmask_8259A(void)
>   
>   static int probe_8259A(void)
>   {
> +	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
>   	unsigned long flags;
> -	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> -	unsigned char new_val;
> +
> +	/*
> +	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
> +	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
> +	 * fails.
> +	 *
> +	 * Right now this causes problems as quite some code depends on
> +	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
> +	 * when the system has an IO/APIC because then PIC is not required
> +	 * at all, except for really old machines where the timer interrupt
> +	 * must be routed through the PIC. So just pretend that the PIC is
> +	 * there and let legacy_pic->init() initialize it for nothing.
> +	 *
> +	 * Alternatively this could just try to initialize the PIC and
> +	 * repeat the probe, but for cases where there is no PIC that's
> +	 * just pointless.
> +	 */
> +	if (pcat_compat)
> +		return nr_legacy_irqs();
> +
>   	/*
> -	 * Check to see if we have a PIC.
> -	 * Mask all except the cascade and read
> -	 * back the value we just wrote. If we don't
> -	 * have a PIC, we will read 0xff as opposed to the
> -	 * value we wrote.
> +	 * Check to see if we have a PIC.  Mask all except the cascade and
> +	 * read back the value we just wrote. If we don't have a PIC, we
> +	 * will read 0xff as opposed to the value we wrote.
>   	 */
>   	raw_spin_lock_irqsave(&i8259A_lock, flags);
>   
> @@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
>   
>   	return 0;
>   }
> -
>   device_initcall(i8259A_init_ops);
> +
> +void __init legacy_pic_pcat_compat(void)
> +{
> +	pcat_compat = true;
> +}


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
  2023-10-25 22:11                 ` Mario Limonciello
@ 2023-10-26  8:17                 ` Hans de Goede
  2023-10-26  9:39                 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
  2023-10-27 18:46                 ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 23+ messages in thread
From: Hans de Goede @ 2023-10-26  8:17 UTC (permalink / raw)
  To: Thomas Gleixner, David Lazar, Mario Limonciello
  Cc: kys, hpa, x86, LKML, Borislav Petkov, Rafael J. Wysocki

Hi,

On 10/25/23 23:04, Thomas Gleixner wrote:
> David and a few others reported that on certain newer systems some legacy
> interrupts fail to work correctly.
> 
> Debugging revealed that the BIOS of these systems leaves the legacy PIC in
> uninitialized state which makes the PIC detection fail and the kernel
> switches to a dummy implementation.
> 
> Unfortunately this fallback causes quite some code to fail as it depends on
> checks for the number of legacy PIC interrupts or the availability of the
> real PIC.
> 
> In theory there is no reason to use the PIC on any modern system when
> IO/APIC is available, but the dependencies on the related checks cannot be
> resolved trivially and on short notice. This needs lots of analysis and
> rework.
> 
> The PIC detection has been added to avoid quirky checks and force selection
> of the dummy implementation all over the place, especially in VM guest
> scenarios. So it's not an option to revert the relevant commit as that
> would break a lot of other scenarios.
> 
> One solution would be to try to initialize the PIC on detection fail and
> retry the detection, but that puts the burden on everything which does not
> have a PIC.
> 
> Fortunately the ACPI/MADT table header has a flag field, which advertises
> in bit 0 that the system is PCAT compatible, which means it has a legacy
> 8259 PIC.
> 
> Evaluate that bit and if set avoid the detection routine and keep the real
> PIC installed, which then gets initialized (for nothing) and makes the rest
> of the code with all the dependencies work again.
> 
> Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
> Reported-by: David Lazar <dlazar@gmail.com>
> Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: David Lazar <dlazar@gmail.com>
> Cc: stable@vger.kernel.org
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003

Thank you for this fix / workaround.

The patch looks good to me:

Reviewed-by: Hans de Goede <hdegoede@redhat.com>

Regards,

Hans




> ---
> ---
>  arch/x86/include/asm/i8259.h |    2 ++
>  arch/x86/kernel/acpi/boot.c  |    3 +++
>  arch/x86/kernel/i8259.c      |   38 ++++++++++++++++++++++++++++++--------
>  3 files changed, 35 insertions(+), 8 deletions(-)
> 
> --- a/arch/x86/include/asm/i8259.h
> +++ b/arch/x86/include/asm/i8259.h
> @@ -69,6 +69,8 @@ struct legacy_pic {
>  	void (*make_irq)(unsigned int irq);
>  };
>  
> +void legacy_pic_pcat_compat(void);
> +
>  extern struct legacy_pic *legacy_pic;
>  extern struct legacy_pic null_legacy_pic;
>  
> --- a/arch/x86/kernel/acpi/boot.c
> +++ b/arch/x86/kernel/acpi/boot.c
> @@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct
>  		pr_debug("Local APIC address 0x%08x\n", madt->address);
>  	}
>  
> +	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
> +		legacy_pic_pcat_compat();
> +
>  	/* ACPI 6.3 and newer support the online capable bit. */
>  	if (acpi_gbl_FADT.header.revision > 6 ||
>  	    (acpi_gbl_FADT.header.revision == 6 &&
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -32,6 +32,7 @@
>   */
>  static void init_8259A(int auto_eoi);
>  
> +static bool pcat_compat __ro_after_init;
>  static int i8259A_auto_eoi;
>  DEFINE_RAW_SPINLOCK(i8259A_lock);
>  
> @@ -299,15 +300,32 @@ static void unmask_8259A(void)
>  
>  static int probe_8259A(void)
>  {
> +	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
>  	unsigned long flags;
> -	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> -	unsigned char new_val;
> +
> +	/*
> +	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
> +	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
> +	 * fails.
> +	 *
> +	 * Right now this causes problems as quite some code depends on
> +	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
> +	 * when the system has an IO/APIC because then PIC is not required
> +	 * at all, except for really old machines where the timer interrupt
> +	 * must be routed through the PIC. So just pretend that the PIC is
> +	 * there and let legacy_pic->init() initialize it for nothing.
> +	 *
> +	 * Alternatively this could just try to initialize the PIC and
> +	 * repeat the probe, but for cases where there is no PIC that's
> +	 * just pointless.
> +	 */
> +	if (pcat_compat)
> +		return nr_legacy_irqs();
> +
>  	/*
> -	 * Check to see if we have a PIC.
> -	 * Mask all except the cascade and read
> -	 * back the value we just wrote. If we don't
> -	 * have a PIC, we will read 0xff as opposed to the
> -	 * value we wrote.
> +	 * Check to see if we have a PIC.  Mask all except the cascade and
> +	 * read back the value we just wrote. If we don't have a PIC, we
> +	 * will read 0xff as opposed to the value we wrote.
>  	 */
>  	raw_spin_lock_irqsave(&i8259A_lock, flags);
>  
> @@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
>  
>  	return 0;
>  }
> -
>  device_initcall(i8259A_init_ops);
> +
> +void __init legacy_pic_pcat_compat(void)
> +{
> +	pcat_compat = true;
> +}
> 


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re:
  2023-10-25 22:11                 ` Mario Limonciello
@ 2023-10-26  9:27                   ` Thomas Gleixner
  0 siblings, 0 replies; 23+ messages in thread
From: Thomas Gleixner @ 2023-10-26  9:27 UTC (permalink / raw)
  To: Mario Limonciello, David Lazar
  Cc: Hans de Goede, kys, hpa, x86, LKML, Borislav Petkov,
	Rafael J. Wysocki, Linux kernel regressions list

On Wed, Oct 25 2023 at 17:11, Mario Limonciello wrote:
> On 10/25/2023 16:04, Thomas Gleixner wrote:
>> Cc: stable@vger.kernel.org
>> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218003
>
> s/Link/Closes/

Sure.

> Presumably you will add a proper subject when this is committed?

Bah, yes. I stopped replacing the subject line right after clearing it :(

> With adding title and fixing that tag:
>
> Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [tip: x86/urgent] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
  2023-10-25 22:11                 ` Mario Limonciello
  2023-10-26  8:17                 ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility Hans de Goede
@ 2023-10-26  9:39                 ` tip-bot2 for Thomas Gleixner
  2023-10-27 18:46                 ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2023-10-26  9:39 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: David Lazar, Thomas Gleixner, Hans de Goede, Mario Limonciello,
	stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     38d54ecfe293ed8bb26d05e6f0270a0aaa6656c6
Gitweb:        https://git.kernel.org/tip/38d54ecfe293ed8bb26d05e6f0270a0aaa6656c6
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 25 Oct 2023 23:04:15 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Thu, 26 Oct 2023 11:31:45 +02:00

x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility

David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.

Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.

Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.

In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.

The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.

One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.

Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.

Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.

Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218003
Link: https://lore.kernel.org/r/875y2u5s8g.ffs@tglx

---
 arch/x86/include/asm/i8259.h |  2 ++-
 arch/x86/kernel/acpi/boot.c  |  3 +++-
 arch/x86/kernel/i8259.c      | 38 +++++++++++++++++++++++++++--------
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/i8259.h b/arch/x86/include/asm/i8259.h
index 637fa1d..c715097 100644
--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -69,6 +69,8 @@ struct legacy_pic {
 	void (*make_irq)(unsigned int irq);
 };
 
+void legacy_pic_pcat_compat(void);
+
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2a0ea38..c55c0ef 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct acpi_table_header *table)
 		pr_debug("Local APIC address 0x%08x\n", madt->address);
 	}
 
+	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
+		legacy_pic_pcat_compat();
+
 	/* ACPI 6.3 and newer support the online capable bit. */
 	if (acpi_gbl_FADT.header.revision > 6 ||
 	    (acpi_gbl_FADT.header.revision == 6 &&
diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index 30a5520..c20d183 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -32,6 +32,7 @@
  */
 static void init_8259A(int auto_eoi);
 
+static bool pcat_compat __ro_after_init;
 static int i8259A_auto_eoi;
 DEFINE_RAW_SPINLOCK(i8259A_lock);
 
@@ -299,15 +300,32 @@ static void unmask_8259A(void)
 
 static int probe_8259A(void)
 {
+	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
 	unsigned long flags;
-	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
-	unsigned char new_val;
+
+	/*
+	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
+	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
+	 * fails.
+	 *
+	 * Right now this causes problems as quite some code depends on
+	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
+	 * when the system has an IO/APIC because then PIC is not required
+	 * at all, except for really old machines where the timer interrupt
+	 * must be routed through the PIC. So just pretend that the PIC is
+	 * there and let legacy_pic->init() initialize it for nothing.
+	 *
+	 * Alternatively this could just try to initialize the PIC and
+	 * repeat the probe, but for cases where there is no PIC that's
+	 * just pointless.
+	 */
+	if (pcat_compat)
+		return nr_legacy_irqs();
+
 	/*
-	 * Check to see if we have a PIC.
-	 * Mask all except the cascade and read
-	 * back the value we just wrote. If we don't
-	 * have a PIC, we will read 0xff as opposed to the
-	 * value we wrote.
+	 * Check to see if we have a PIC.  Mask all except the cascade and
+	 * read back the value we just wrote. If we don't have a PIC, we
+	 * will read 0xff as opposed to the value we wrote.
 	 */
 	raw_spin_lock_irqsave(&i8259A_lock, flags);
 
@@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
 
 	return 0;
 }
-
 device_initcall(i8259A_init_ops);
+
+void __init legacy_pic_pcat_compat(void)
+{
+	pcat_compat = true;
+}

^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [tip: x86/urgent] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility
  2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
                                   ` (2 preceding siblings ...)
  2023-10-26  9:39                 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
@ 2023-10-27 18:46                 ` tip-bot2 for Thomas Gleixner
  3 siblings, 0 replies; 23+ messages in thread
From: tip-bot2 for Thomas Gleixner @ 2023-10-27 18:46 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: David Lazar, Thomas Gleixner, Hans de Goede, Mario Limonciello,
	stable, x86, linux-kernel

The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     128b0c9781c9f2651bea163cb85e52a6c7be0f9e
Gitweb:        https://git.kernel.org/tip/128b0c9781c9f2651bea163cb85e52a6c7be0f9e
Author:        Thomas Gleixner <tglx@linutronix.de>
AuthorDate:    Wed, 25 Oct 2023 23:04:15 +02:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Fri, 27 Oct 2023 20:36:49 +02:00

x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility

David and a few others reported that on certain newer systems some legacy
interrupts fail to work correctly.

Debugging revealed that the BIOS of these systems leaves the legacy PIC in
uninitialized state which makes the PIC detection fail and the kernel
switches to a dummy implementation.

Unfortunately this fallback causes quite some code to fail as it depends on
checks for the number of legacy PIC interrupts or the availability of the
real PIC.

In theory there is no reason to use the PIC on any modern system when
IO/APIC is available, but the dependencies on the related checks cannot be
resolved trivially and on short notice. This needs lots of analysis and
rework.

The PIC detection has been added to avoid quirky checks and force selection
of the dummy implementation all over the place, especially in VM guest
scenarios. So it's not an option to revert the relevant commit as that
would break a lot of other scenarios.

One solution would be to try to initialize the PIC on detection fail and
retry the detection, but that puts the burden on everything which does not
have a PIC.

Fortunately the ACPI/MADT table header has a flag field, which advertises
in bit 0 that the system is PCAT compatible, which means it has a legacy
8259 PIC.

Evaluate that bit and if set avoid the detection routine and keep the real
PIC installed, which then gets initialized (for nothing) and makes the rest
of the code with all the dependencies work again.

Fixes: e179f6914152 ("x86, irq, pic: Probe for legacy PIC and set legacy_pic appropriately")
Reported-by: David Lazar <dlazar@gmail.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: David Lazar <dlazar@gmail.com>
Reviewed-by: Hans de Goede <hdegoede@redhat.com>
Reviewed-by: Mario Limonciello <mario.limonciello@amd.com>
Cc: stable@vger.kernel.org
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218003
Link: https://lore.kernel.org/r/875y2u5s8g.ffs@tglx

---
 arch/x86/include/asm/i8259.h |  2 ++-
 arch/x86/kernel/acpi/boot.c  |  3 +++-
 arch/x86/kernel/i8259.c      | 38 +++++++++++++++++++++++++++--------
 3 files changed, 35 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/i8259.h b/arch/x86/include/asm/i8259.h
index 637fa1d..c715097 100644
--- a/arch/x86/include/asm/i8259.h
+++ b/arch/x86/include/asm/i8259.h
@@ -69,6 +69,8 @@ struct legacy_pic {
 	void (*make_irq)(unsigned int irq);
 };
 
+void legacy_pic_pcat_compat(void);
+
 extern struct legacy_pic *legacy_pic;
 extern struct legacy_pic null_legacy_pic;
 
diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c
index 2a0ea38..c55c0ef 100644
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -148,6 +148,9 @@ static int __init acpi_parse_madt(struct acpi_table_header *table)
 		pr_debug("Local APIC address 0x%08x\n", madt->address);
 	}
 
+	if (madt->flags & ACPI_MADT_PCAT_COMPAT)
+		legacy_pic_pcat_compat();
+
 	/* ACPI 6.3 and newer support the online capable bit. */
 	if (acpi_gbl_FADT.header.revision > 6 ||
 	    (acpi_gbl_FADT.header.revision == 6 &&
diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index 30a5520..c20d183 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -32,6 +32,7 @@
  */
 static void init_8259A(int auto_eoi);
 
+static bool pcat_compat __ro_after_init;
 static int i8259A_auto_eoi;
 DEFINE_RAW_SPINLOCK(i8259A_lock);
 
@@ -299,15 +300,32 @@ static void unmask_8259A(void)
 
 static int probe_8259A(void)
 {
+	unsigned char new_val, probe_val = ~(1 << PIC_CASCADE_IR);
 	unsigned long flags;
-	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
-	unsigned char new_val;
+
+	/*
+	 * If MADT has the PCAT_COMPAT flag set, then do not bother probing
+	 * for the PIC. Some BIOSes leave the PIC uninitialized and probing
+	 * fails.
+	 *
+	 * Right now this causes problems as quite some code depends on
+	 * nr_legacy_irqs() > 0 or has_legacy_pic() == true. This is silly
+	 * when the system has an IO/APIC because then PIC is not required
+	 * at all, except for really old machines where the timer interrupt
+	 * must be routed through the PIC. So just pretend that the PIC is
+	 * there and let legacy_pic->init() initialize it for nothing.
+	 *
+	 * Alternatively this could just try to initialize the PIC and
+	 * repeat the probe, but for cases where there is no PIC that's
+	 * just pointless.
+	 */
+	if (pcat_compat)
+		return nr_legacy_irqs();
+
 	/*
-	 * Check to see if we have a PIC.
-	 * Mask all except the cascade and read
-	 * back the value we just wrote. If we don't
-	 * have a PIC, we will read 0xff as opposed to the
-	 * value we wrote.
+	 * Check to see if we have a PIC.  Mask all except the cascade and
+	 * read back the value we just wrote. If we don't have a PIC, we
+	 * will read 0xff as opposed to the value we wrote.
 	 */
 	raw_spin_lock_irqsave(&i8259A_lock, flags);
 
@@ -429,5 +447,9 @@ static int __init i8259A_init_ops(void)
 
 	return 0;
 }
-
 device_initcall(i8259A_init_ops);
+
+void __init legacy_pic_pcat_compat(void)
+{
+	pcat_compat = true;
+}

^ permalink raw reply related	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2023-10-27 18:46 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-10-18 18:50 PIC probing code from e179f6914152 failing Mario Limonciello
2023-10-18 22:50 ` Thomas Gleixner
2023-10-19 16:39   ` David Lazăr
2023-10-19 21:20   ` Mario Limonciello
2023-10-20  3:43     ` Mario Limonciello
2023-10-20 15:16     ` Hans de Goede
2023-10-20 17:13       ` Mario Limonciello
2023-10-23 15:59     ` Thomas Gleixner
2023-10-23 16:17       ` Mario Limonciello
2023-10-23 17:50         ` Thomas Gleixner
2023-10-23 17:59           ` Mario Limonciello
2023-10-25  9:23       ` Thomas Gleixner
2023-10-25 14:41         ` Mario Limonciello
2023-10-25 15:25           ` Mario Limonciello
2023-10-25 15:25           ` David Lazar
2023-10-25 17:31             ` Thomas Gleixner
2023-10-25 17:37               ` Rafael J. Wysocki
2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
2023-10-25 22:11                 ` Mario Limonciello
2023-10-26  9:27                   ` Re: Thomas Gleixner
2023-10-26  8:17                 ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility Hans de Goede
2023-10-26  9:39                 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2023-10-27 18:46                 ` tip-bot2 for Thomas Gleixner

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.