linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Mario Limonciello <mario.limonciello@amd.com>
To: Thomas Gleixner <tglx@linutronix.de>,
	Hans de Goede <hdegoede@redhat.com>,
	kys@microsoft.com, hpa@linux.intel.com, dlazar@gmail.com
Cc: x86@kernel.org, LKML <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: PIC probing code from e179f6914152 failing
Date: Thu, 19 Oct 2023 22:43:25 -0500	[thread overview]
Message-ID: <d22d9ed1-25a6-4b65-bda6-8266665a3f70@amd.com> (raw)
In-Reply-To: <e79dea49-0c07-4ca2-b359-97dd1bc579c8@amd.com>

On 10/19/2023 16:20, Mario Limonciello wrote:
> On 10/18/2023 17:50, Thomas Gleixner wrote:
>> On Wed, Oct 18 2023 at 13:50, Mario Limonciello wrote:
>>> Recently an issue was reported to Bugzilla [1] that the Keyboard (IRQ 1)
>>> and GPIO controller (IRQ 7) weren't working properly on two separate
>>> Lenovo machines.  The issues are unique to Linux.
>>>
>>> In digging through them, they happen because Lenovo didn't set up the
>>> PIC in the BIOS.
>>> Specifically the PIC probing code introduced by e179f6914152 ("x86, irq,
>>> pic: Probe for legacy PIC and set legacy_pic appropriately") expects
>>> that the BIOS sets up the PIC and uses that assertion to let Linux set
>>> it up.
>>>
>>> One of the reporters confirmed that reverting e179f6914152 fixes the
>>> issue.  Keyboard and GPIO controller both work properly.
>>>
>>> I wanted to ask if we can please revert that and come up with a
>>> different solution for kexec with HyperV.
>>> Can guests instead perhaps detect in early boot code they're running in
>>> an EFI based hypervisor and explicitly set 'legacy_pic = 
>>> &null_legacy_pic;'?
>>
>> No. This detection mechanism prevents PIC usage also in other
>> scenarios.
>>
>> It's perfectly valid code and the assumption that you can read back what
>> you wrote to the master IMR is entirely correct. If that's not the case
>> then there is no PIC, the BIOS has disabled some parts of the legacy
>> block or did not initialize it.
>>
>> Letting the kernel blindly assume that there is always a PIC is just
>> wrong. Worse, it puts the burden on everyone else to sprinkle
>> "legacy_pic = null_pic;" all over the place with dubious
>> conditions. That's exactly what the commit in question avoided.
>>
>> So no, we are not going back there.
>>
>> We could obviously change the probe() function to issue a full PIC
>> initialization sequence before reading a known written value
>> back. That's basically what the revert causes to happen via
>> legacy_pic->init().
>>
>> But I fundamentally hate to do that because forcing the init sequence
>> just to make presumably broken code which has some dubious dependencies
>> on the PIC or on nr_legacy_irqs > 0 happy is not really a solution when
>> the PIC is actually not needed at all. For anything modern where all
>> legacy interrupts are routed to the IO/APIC the PIC does not make any
>> sense whatsoever.
>>
>> We rather go and fix the real underlying problem.
> 
> Looking at the logs from David and also trying to mock up the failure on 
> my side I have a few findings I'll share, please agree or disagree with 
> them.
> 
>>
>> The kernel can handle the legacy interrupts perfectly fine through the
>> IOAPIC. There is no hard requirement for the PIC at all except for
>> really old systems which had the timer interrupt wired to the PIC and
>> therefore required to route the timer interrupt through the PIC instead
>> of the IO/APIC or did not provide routing entries for the IO/APIC. See
>> the horrible hackery in check_timer() and the grossly misnomed
>> init_IO_APIC_traps().
>>
>> I just took a random machine, forced the NULL PIC and added
>> 'apic=verbose' to the kernel command line and magically all the legacy
>> interrupts are set up via IO/APIC despite the NULL PIC and therefore 0
>> preallocated legacy interrupts.
>>
>>    apic 0 pin 0 not connected
>>   IOAPIC[0]: Preconfigured routing entry (0-1 -> IRQ 1 Level:0 
>> ActiveLow:0)
>>   IOAPIC[0]: Preconfigured routing entry (0-2 -> IRQ 2 Level:0 
>> ActiveLow:0)
>>   IOAPIC[0]: Preconfigured routing entry (0-3 -> IRQ 3 Level:0 
>> ActiveLow:0)
>>   ...
>>
>> which is identical to the output with PIC enabled. That debug message is
>> emitted from mp_irqdomain_alloc() which is invoked via the PNP resource
>> management code.
>>
>> Now /proc/interrupts:
>>
>>             CPU0       CPU1       CPU2       CPU3
>>    1:          0         56          0          0    IO-APIC   
>> 1-edge      i8042
>>    4:        442          0          0          0    IO-APIC   
>> 4-edge      ttyS0
>>    8:          0          0          0          0    IO-APIC   
>> 8-edge      rtc0
>>    9:          0          0          0          0    IO-APIC   
>> 9-fasteoi   acpi
>>   12:          0          0        122          0    IO-APIC  
>> 12-edge      i8042
>>
>> Keyboard and serial are working, see?
>>
>> The only interrupt which does not work is interrupt 0 because nothing
>> allocates interrupt 0 due to nr_legacy_irqs == 0, but that's a trivially
>> solvable problem.
> 
>  From David's logs I can see that the timer interrupt gets wired up to 
> IRQ2 instead of IRQ0.
> 
> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 2 Level:0 ActiveLow:0)
> 
> In my hacked up forcing NULL pic case this fixes that:
> 
> diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
> index 43c1c24e934b..885687e64e4e 100644
> --- a/arch/x86/kernel/i8259.c
> +++ b/arch/x86/kernel/i8259.c
> @@ -424,7 +424,7 @@ static int legacy_pic_probe(void)
>   }
> 
>   struct legacy_pic null_legacy_pic = {
> -       .nr_legacy_irqs = 0,
> +       .nr_legacy_irqs = 1,
>          .chip = &dummy_irq_chip,
>          .mask = legacy_pic_uint_noop,
>          .unmask = legacy_pic_uint_noop,
> 
> I think it's cleaner than changing all the places that use 
> nr_legacy_irqs().  On my side this makes:
> 
> IOAPIC[0]: Preconfigured routing entry (33-2 -> IRQ 0 Level:0 ActiveLow:0)
> 
>>
>> That machine does not even need the timer interrupt because it has a
>> usable APIC and TSC deadline timer, so no APIC timer calibration
>> required. The same is true for CPUs which do not have a TSC deadline
>> timer, but enumerate the APIC frequency via CPUID or MSRs.
> 
> Don't you need it for things like rtcwake to be able to work?
> 
>>
>> But that brings up an interesting question. How are those affected
>> machines even reaching a state where the user notices that just the
>> keyboard and the GPIO are not working? Why?
> 
> So the GPIO controller driver (pinctrl-amd) uses platform_get_irq() to 
> try to discover the IRQ to use.
> 
> This calls acpi_irq_get() which isn't implemented on x86 (hardcodes 
> -EINVAL).
> 
> I can "work around it" by:
> 
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 76bfcba25003..2b4b436c65d8 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -187,7 +187,8 @@ int platform_get_irq_optional(struct platform_device 
> *dev, unsigned int num)
>          }
> 
>          r = platform_get_resource(dev, IORESOURCE_IRQ, num);
> -       if (has_acpi_companion(&dev->dev)) {
> +       if (IS_ENABLED(CONFIG_ACPI_GENERIC_GSI) &&
> +            has_acpi_companion(&dev->dev)) {
>                  if (r && r->flags & IORESOURCE_DISABLED) {
>                          ret = acpi_irq_get(ACPI_HANDLE(&dev->dev), num, 
> r);
>                          if (ret)
> 
> but the resource that is returned from the next hunk has the resource 
> flags set wrong in the NULL pic case:
> 
> NULL case:
> r: AMDI0030:00 flags: 0x30000418
> PIC case:
> r: AMDI0030:00 flags: 0x418
> 
> IOW NULL pic case has IORESOURCE_DISABLED / IORESOURCE_UNSET
> 
> This then later the GPIO controller interrupts are not actually working.
> For example the attn pin for my I2C touchpad doesn't work.
> 
> Checking the debugfs with my "work around" in place I can see a few 
> things set up differently:
> 
> NULL case:
> handler:  handle_edge_irq
> dstate:   0x3740c208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_ACTIVATED
>              IRQD_IRQ_STARTED
>              IRQD_SINGLE_TARGET
>              IRQD_MOVE_PCNTXT
>              IRQD_AFFINITY_ON_ACTIVATE
>              IRQD_CAN_RESERVE
>              IRQD_WAKEUP_STATE
>              IRQD_DEFAULT_TRIGGER_SET
>              IRQD_HANDLE_ENFORCE_IRQCTX
> 
> PIC case:
> handler:  handle_fasteoi_irq
> dstate:   0x3740e208
>              IRQ_TYPE_LEVEL_LOW
>              IRQD_LEVEL
>              IRQD_ACTIVATED
>              IRQD_IRQ_STARTED
>              IRQD_SINGLE_TARGET
>              IRQD_MOVE_PCNTXT
>              IRQD_AFFINITY_ON_ACTIVATE
>              IRQD_CAN_RESERVE
>              IRQD_WAKEUP_STATE
>              IRQD_DEFAULT_TRIGGER_SET
>              IRQD_HANDLE_ENFORCE_IRQCTX
> 
> I guess something related to the callpath for mp_register_handler().
> 
> Maybe this is the same reason for the keyboard not working right.
> 
>>
>> The CPUID/MSR APIC frequency enumeration is Intel specific and
>> everything else depends on a working timer interrupt to calibrate the
>> APIC timer frequency. So AMD CPUs require the timer interrupt to
>> work. The only explanation why this "works" in that null PIC case is
>> that the PIT/HPET interrupt is actually wired to pin 0, but that's
>> something to be determined...
>>
>> Can I please get the following information from an affected machine:
>>
>>    1) dmesg with 'apic=verbose' on the command line
>>    2) /proc/interrupts
>>    3) /sys/kernel/debug/irq/irqs/{0..15}
>>
>>    #3 requires CONFIG_GENERIC_IRQ_DEBUGFS to be set.
>>
>> Two versions of that please:
>>
>>    1) Unpatched kernel
>>    2) Patched kernel
>>
>> Thanks,
>>
>>          tglx
> 

At least on my system with forcing NULL PIC I've come up with this 
series to make everything work.

David, can you please have a try with it on your system?

https://lore.kernel.org/linux-kernel/20231020033806.88063-1-mario.limonciello@amd.com/#t

Thanks,

  reply	other threads:[~2023-10-20  3:59 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-10-18 18:50 PIC probing code from e179f6914152 failing Mario Limonciello
2023-10-18 22:50 ` Thomas Gleixner
2023-10-19 16:39   ` David Lazăr
2023-10-19 21:20   ` Mario Limonciello
2023-10-20  3:43     ` Mario Limonciello [this message]
2023-10-20 15:16     ` Hans de Goede
2023-10-20 17:13       ` Mario Limonciello
2023-10-23 15:59     ` Thomas Gleixner
2023-10-23 16:17       ` Mario Limonciello
2023-10-23 17:50         ` Thomas Gleixner
2023-10-23 17:59           ` Mario Limonciello
2023-10-25  9:23       ` Thomas Gleixner
2023-10-25 14:41         ` Mario Limonciello
2023-10-25 15:25           ` Mario Limonciello
2023-10-25 15:25           ` David Lazar
2023-10-25 17:31             ` Thomas Gleixner
2023-10-25 17:37               ` Rafael J. Wysocki
2023-10-25 21:04               ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility, Thomas Gleixner
2023-10-25 22:11                 ` Mario Limonciello
2023-10-26  9:27                   ` Re: Thomas Gleixner
2023-10-26  8:17                 ` [PATCH] x86/i8259: Skip probing when ACPI/MADT advertises PCAT compatibility Hans de Goede
2023-10-26  9:39                 ` [tip: x86/urgent] " tip-bot2 for Thomas Gleixner
2023-10-27 18:46                 ` tip-bot2 for Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d22d9ed1-25a6-4b65-bda6-8266665a3f70@amd.com \
    --to=mario.limonciello@amd.com \
    --cc=bp@alien8.de \
    --cc=dlazar@gmail.com \
    --cc=hdegoede@redhat.com \
    --cc=hpa@linux.intel.com \
    --cc=kys@microsoft.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).