All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] x86/i8259: Work around buggy legacy PIC
@ 2021-05-12 21:04 Maximilian Luz
  2021-05-13  8:10 ` David Laight
  0 siblings, 1 reply; 22+ messages in thread
From: Maximilian Luz @ 2021-05-12 21:04 UTC (permalink / raw)
  To: Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: Maximilian Luz, H. Peter Anvin, Sachi King, x86, linux-kernel, stable

The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4 has
some problems on boot. For some reason it consistently does not respond
on the first try, requiring a couple more tries before it finally
responds.

This currently leads to the PIC not being properly recognized, which
prevents interrupt handling down the line. Ultimately, this also leads
to the pinctrl-amd driver failing to probe due to platform_get_irq()
returning -EINVAL for its base IRQ. That, in turn, means that several
interrupts are not available and device drivers relying on those will
defer probing indefinitely, as querying those interrupts returns
-EPROBE_DEFER.

Add a quirk table and a retry-loop to work around that.

Also switch to pr_info() due to complaints by checkpatch and add a
pr_fmt() definition for completeness.

Cc: <stable@vger.kernel.org> # 5.10+
Co-developed-by: Sachi King <nakato@nakato.io>
Signed-off-by: Sachi King <nakato@nakato.io>
Signed-off-by: Maximilian Luz <luzmaximilian@gmail.com>
---
 arch/x86/kernel/i8259.c | 51 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 46 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/i8259.c b/arch/x86/kernel/i8259.c
index 282b4ee1339f..0da757c6b292 100644
--- a/arch/x86/kernel/i8259.c
+++ b/arch/x86/kernel/i8259.c
@@ -1,4 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
+
+#define pr_fmt(fmt) "i8259: " fmt
+
 #include <linux/linkage.h>
 #include <linux/errno.h>
 #include <linux/signal.h>
@@ -16,6 +19,7 @@
 #include <linux/io.h>
 #include <linux/delay.h>
 #include <linux/pgtable.h>
+#include <linux/dmi.h>
 
 #include <linux/atomic.h>
 #include <asm/timer.h>
@@ -298,11 +302,39 @@ static void unmask_8259A(void)
 	raw_spin_unlock_irqrestore(&i8259A_lock, flags);
 }
 
+/*
+ * DMI table to identify devices with quirky probe behavior. See comment in
+ * probe_8259A() for more details.
+ */
+static const struct dmi_system_id retry_probe_quirk_table[] = {
+	{
+		.ident = "Microsoft Surface Laptop 4 (AMD)",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
+			DMI_MATCH(DMI_PRODUCT_SKU, "Surface_Laptop_4_1952:1953")
+		},
+	},
+	{}
+};
+
 static int probe_8259A(void)
 {
 	unsigned long flags;
 	unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
 	unsigned char new_val;
+	unsigned int i, imax = 1;
+
+	/*
+	 * Some systems have a legacy PIC that doesn't immediately respond
+	 * after boot. We know it's there, we know it should respond and is
+	 * required for proper interrupt handling later on, so let's try a
+	 * couple of times.
+	 */
+	if (dmi_check_system(retry_probe_quirk_table)) {
+		pr_warn("system with broken legacy PIC detected, re-trying multiple times if necessary\n");
+		imax = 10;
+	}
+
 	/*
 	 * Check to see if we have a PIC.
 	 * Mask all except the cascade and read
@@ -312,15 +344,24 @@ static int probe_8259A(void)
 	 */
 	raw_spin_lock_irqsave(&i8259A_lock, flags);
 
-	outb(0xff, PIC_SLAVE_IMR);	/* mask all of 8259A-2 */
-	outb(probe_val, PIC_MASTER_IMR);
-	new_val = inb(PIC_MASTER_IMR);
-	if (new_val != probe_val) {
-		printk(KERN_INFO "Using NULL legacy PIC\n");
+	for (i = 0; i < imax; i++) {
+		outb(0xff, PIC_SLAVE_IMR);	/* mask all of 8259A-2 */
+		outb(probe_val, PIC_MASTER_IMR);
+		new_val = inb(PIC_MASTER_IMR);
+		if (new_val == probe_val)
+			break;
+	}
+
+	if (i == imax) {
+		pr_info("using NULL legacy PIC\n");
 		legacy_pic = &null_legacy_pic;
 	}
 
 	raw_spin_unlock_irqrestore(&i8259A_lock, flags);
+
+	if (imax > 1 && i < imax)
+		pr_info("got legacy PIC after %d tries\n", i + 1);
+
 	return nr_legacy_irqs();
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-12 21:04 [PATCH] x86/i8259: Work around buggy legacy PIC Maximilian Luz
@ 2021-05-13  8:10 ` David Laight
  2021-05-13 10:11   ` Maximilian Luz
  0 siblings, 1 reply; 22+ messages in thread
From: David Laight @ 2021-05-13  8:10 UTC (permalink / raw)
  To: 'Maximilian Luz', Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

From: Maximilian Luz
> Sent: 12 May 2021 22:05
> 
> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4 has
> some problems on boot. For some reason it consistently does not respond
> on the first try, requiring a couple more tries before it finally
> responds.

That seems very strange, something else must be going on that causes the grief.
The 8259 will be built into to the one of the cpu support chips.
I can't imagine that requires anything special.

It's not as though you have a real 8259 - which even a 286 can
break the inter-cycle recovery on (with the circuit from the
application note).

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-13  8:10 ` David Laight
@ 2021-05-13 10:11   ` Maximilian Luz
  2021-05-13 10:36     ` David Laight
  2021-05-14 13:44     ` Thomas Gleixner
  0 siblings, 2 replies; 22+ messages in thread
From: Maximilian Luz @ 2021-05-13 10:11 UTC (permalink / raw)
  To: David Laight, Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

On 5/13/21 10:10 AM, David Laight wrote:
> From: Maximilian Luz
>> Sent: 12 May 2021 22:05
>>
>> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4 has
>> some problems on boot. For some reason it consistently does not respond
>> on the first try, requiring a couple more tries before it finally
>> responds.
> 
> That seems very strange, something else must be going on that causes the grief.
> The 8259 will be built into to the one of the cpu support chips.
> I can't imagine that requires anything special.

Right, it's definitely strange. Both Sachi (I imagine) and I don't know
much about these devices, so we're open for suggestions.

For reference: [1] and following comments are the discussion where we
(mostly Sachi) discovered the issue, tracking this from
platform_get_irq() returning -EINVAL to the PIC not probing. It also has
a dmesg log [2] attached, but as far as we can tell

     [    0.105820] Using NULL legacy PIC

is the only relevant line to that in there.

And lastly, if that's any help at all: The PIC device is described in
ACPI in [3]. The Surface Laptop 3 also has an AMD CPU (although a prior
generation) and has the PIC described in the exact same way (can also be
found in that repository), but doesn't exhibit that behavior (and
doesn't show the "Using NULL legacy PIC" line). I expect there's not
much you can change to that definition so that's probably irrelevant
here.

Again, I don't really know anything about these devices, so my guess
would be bad hardware revision or bad firmware revision. All I know is
that retrying seems to "fix" it.

> It's not as though you have a real 8259 - which even a 286 can
> break the inter-cycle recovery on (with the circuit from the
> application note).

Right, I imagine that's some emulation for legacy reasons?

Regards,
Max

[1]: https://github.com/linux-surface/linux-surface/issues/425#issuecomment-831921098
[2]: https://github.com/linux-surface/linux-surface/files/6421166/dmesg-2021-05-04_22-11.txt
[3]: https://github.com/linux-surface/acpidumps/blob/69d5ecc1954ea5e90829b8e33541308e7451e951/surface_laptop_4_amd/dsdt.dsl#L1201-L1221

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-13 10:11   ` Maximilian Luz
@ 2021-05-13 10:36     ` David Laight
  2021-05-14 13:01       ` Thomas Gleixner
  2021-05-14 19:41       ` Sachi King
  2021-05-14 13:44     ` Thomas Gleixner
  1 sibling, 2 replies; 22+ messages in thread
From: David Laight @ 2021-05-13 10:36 UTC (permalink / raw)
  To: 'Maximilian Luz', Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable



> -----Original Message-----
> From: Maximilian Luz <luzmaximilian@gmail.com>
> Sent: 13 May 2021 11:12
> To: David Laight <David.Laight@ACULAB.COM>; Thomas Gleixner <tglx@linutronix.de>; Ingo Molnar
> <mingo@redhat.com>; Borislav Petkov <bp@alien8.de>
> Cc: H. Peter Anvin <hpa@zytor.com>; Sachi King <nakato@nakato.io>; x86@kernel.org; linux-
> kernel@vger.kernel.org; stable@vger.kernel.org
> Subject: Re: [PATCH] x86/i8259: Work around buggy legacy PIC
> 
> On 5/13/21 10:10 AM, David Laight wrote:
> > From: Maximilian Luz
> >> Sent: 12 May 2021 22:05
> >>
> >> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4 has
> >> some problems on boot. For some reason it consistently does not respond
> >> on the first try, requiring a couple more tries before it finally
> >> responds.
> >
> > That seems very strange, something else must be going on that causes the grief.
> > The 8259 will be built into to the one of the cpu support chips.
> > I can't imagine that requires anything special.
> 
> Right, it's definitely strange. Both Sachi (I imagine) and I don't know
> much about these devices, so we're open for suggestions.

I found a copy of the datasheet (I don't seem to have the black book):

https://pdos.csail.mit.edu/6.828/2010/readings/hardware/8259A.pdf

The PC hardware has two 8259 in cascade mode.
(Cascaded using an interrupt that wasn't really using in the original
8088 PC which only had one 8259.)

I wonder if the bios has actually initialised is properly.
Some initialisation writes have to be done to set everything up.

It is also worth noting that the probe code is spectacularly crap.
It writes 0xff and then checks that 0xff is read back.
Almost anything (including a failed PCIe read to the ISA bridge)
will return 0xff and make the test pass.

It's about 35 years since I last wrote the code to initialise an 8259.
The memory cells are foggy.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 19:41       ` Sachi King
@ 2021-05-14 10:51         ` David Laight
  2021-05-14 11:58         ` Maximilian Luz
  1 sibling, 0 replies; 22+ messages in thread
From: David Laight @ 2021-05-14 10:51 UTC (permalink / raw)
  To: 'Sachi King', 'Maximilian Luz',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, x86, linux-kernel, stable

From: Sachi King
> Sent: 14 May 2021 20:41
> 
> On Thursday, May 13, 2021 8:36:27 PM AEST David Laight wrote:
> > > -----Original Message-----
> > > From: Maximilian Luz <luzmaximilian@gmail.com>
> > > Sent: 13 May 2021 11:12
> > > To: David Laight <David.Laight@ACULAB.COM>; Thomas Gleixner
> > > <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav Petkov
> > > <bp@alien8.de>
> > > Cc: H. Peter Anvin <hpa@zytor.com>; Sachi King <nakato@nakato.io>;
> > > x86@kernel.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org
> > > Subject: Re: [PATCH] x86/i8259: Work around buggy legacy PIC
> > >
> > > On 5/13/21 10:10 AM, David Laight wrote:
> > >
> > > > From: Maximilian Luz
> > > >
> > > >> Sent: 12 May 2021 22:05
> > > >>
> > > >>
> > > >>
> > > >> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4
> > > >> has
> > > >> some problems on boot. For some reason it consistently does not
> > > >> respond
> > > >> on the first try, requiring a couple more tries before it finally
> > > >> responds.
> > > >
> > > >
> > > >
> > > > That seems very strange, something else must be going on that causes the
> > > > grief.
> > > > The 8259 will be built into to the one of the cpu support
> > > > chips.
> > > > I can't imagine that requires anything special.
> > >
> > >
> > > Right, it's definitely strange. Both Sachi (I imagine) and I don't know
> > > much about these devices, so we're open for suggestions.
> >
> >
> > I found a copy of the datasheet (I don't seem to have the black book):
> >
> > https://pdos.csail.mit.edu/6.828/2010/readings/hardware/8259A.pdf
> >
> > The PC hardware has two 8259 in cascade mode.
> > (Cascaded using an interrupt that wasn't really using in the original
> > 8088 PC which only had one 8259.)
> >
> > I wonder if the bios has actually initialised is properly.
> > Some initialisation writes have to be done to set everything up.
> 
> I suspect by the displayed behaviour you are correct and that it has
> not.  I'm struggling to figure out who to talk to to see that is
> something that can be fixed in the firmware.
> 
> > It is also worth noting that the probe code is spectacularly crap.
> > It writes 0xff and then checks that 0xff is read back.
> > Almost anything (including a failed PCIe read to the ISA bridge)
> > will return 0xff and make the test pass.
> 
> I was under the impression that it wrote 0xfb, and 0xff would be
> considered a failure.

I probably misread it.
For anything like a probe you really need to write one value
and read back something different - that is hopefully reasonably unique.
OTOH just the write can trash the wrong hardware - many old PC
hardware probes were very dubious.

Although unlikely here (since the logic is all inside chip)
on a real IO bus with tristate drivers a write followed by
a read to an unknown address could easily return the written
value due to capacitance of the data bus.

> > It's about 35 years since I last wrote the code to initialise an 8259.
> > The memory cells are foggy.
> 
> I'm not sure the i8259 is needed on the device, as the interrupts
> appear to function on the device if I bypass the nr_legacy_irqs() check
> while the legacy_pic is set to the null_legacy_pic.
> 
> The null_legacy_pic however specifies having 0 irqs, and the io_apic
> does not allow us to set the pin attributes unless the pin we are
> attempting to set is less than nr_legacy_irqs.
> 
> The IOAPIC seems to take responsibility for the 0-15 interrupts on this
> specific hardware, should we maybe be ignoring the i8259 and looking
> into allowing interrupts 0-15 to be setup even when the legacy_pic is
> not available?

That woke up some memory cells.
IIRC on an SMP kernel the IOAPIC is always used in preference to the 8259.
So maybe the initialisation code is just plain wrong!

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 19:41       ` Sachi King
  2021-05-14 10:51         ` David Laight
@ 2021-05-14 11:58         ` Maximilian Luz
  2021-05-14 17:32           ` Thomas Gleixner
  1 sibling, 1 reply; 22+ messages in thread
From: Maximilian Luz @ 2021-05-14 11:58 UTC (permalink / raw)
  To: Sachi King, Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Laight
  Cc: H. Peter Anvin, x86, linux-kernel, stable

On 5/14/21 9:41 PM, Sachi King wrote:
> On Thursday, May 13, 2021 8:36:27 PM AEST David Laight wrote:
>>> -----Original Message-----
>>> From: Maximilian Luz <luzmaximilian@gmail.com>
>>> Sent: 13 May 2021 11:12
>>> To: David Laight <David.Laight@ACULAB.COM>; Thomas Gleixner
>>> <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav Petkov
>>> <bp@alien8.de>
>>> Cc: H. Peter Anvin <hpa@zytor.com>; Sachi King <nakato@nakato.io>;
>>> x86@kernel.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org
>>> Subject: Re: [PATCH] x86/i8259: Work around buggy legacy PIC
>>>
>>> On 5/13/21 10:10 AM, David Laight wrote:
>>>
>>>> From: Maximilian Luz
>>>>
>>>>> Sent: 12 May 2021 22:05
>>>>>
>>>>>
>>>>>
>>>>> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4
>>>>> has
>>>>> some problems on boot. For some reason it consistently does not
>>>>> respond
>>>>> on the first try, requiring a couple more tries before it finally
>>>>> responds.
>>>>
>>>>
>>>>
>>>> That seems very strange, something else must be going on that causes the
>>>> grief.
>>>> The 8259 will be built into to the one of the cpu support
>>>> chips.
>>>> I can't imagine that requires anything special.
>>>
>>>
>>> Right, it's definitely strange. Both Sachi (I imagine) and I don't know
>>> much about these devices, so we're open for suggestions.
>>
>>
>> I found a copy of the datasheet (I don't seem to have the black book):
>>
>> https://pdos.csail.mit.edu/6.828/2010/readings/hardware/8259A.pdf
>>
>> The PC hardware has two 8259 in cascade mode.
>> (Cascaded using an interrupt that wasn't really using in the original
>> 8088 PC which only had one 8259.)
>>
>> I wonder if the bios has actually initialised is properly.
>> Some initialisation writes have to be done to set everything up.
> 
> I suspect by the displayed behaviour you are correct and that it has
> not.  I'm struggling to figure out who to talk to to see that is
> something that can be fixed in the firmware.

I'd assume that _some_ sort of interrupt setup is done by the BIOS/UEFI.
The UEFI on those devices is fairly well-featured, with touch support
via SPI and all. Furthermore, keyboard (also supported in the device's
UEFI) is handled via a custom UART protocol. Unless they rely on polling
for all of that, I believe they'd have to set up some interrupts.

Although, as you mention later on, that could also be handled via the
IOAPIC and the PIC is actually not supposed to be used. Maybe some
legacy component that never got tested and just broke with some new
hardware/firmware revision without anyone noticing? And since Linux
still seems to rely on that, we might be the first to notice.

Regards,
Max

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-13 10:36     ` David Laight
@ 2021-05-14 13:01       ` Thomas Gleixner
  2021-05-14 13:13         ` David Laight
  2021-05-14 19:41       ` Sachi King
  1 sibling, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-14 13:01 UTC (permalink / raw)
  To: David Laight, 'Maximilian Luz', Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

David,

On Thu, May 13 2021 at 10:36, David Laight wrote:

>> -----Original Message-----
>> From: Maximilian Luz <luzmaximilian@gmail.com>
>> Sent: 13 May 2021 11:12
>> To: David Laight <David.Laight@ACULAB.COM>; Thomas Gleixner <tglx@linutronix.de>; Ingo Molnar
>> <mingo@redhat.com>; Borislav Petkov <bp@alien8.de>
>> Cc: H. Peter Anvin <hpa@zytor.com>; Sachi King <nakato@nakato.io>; x86@kernel.org; linux-
>> kernel@vger.kernel.org; stable@vger.kernel.org
>> Subject: Re: [PATCH] x86/i8259: Work around buggy legacy PIC

can you please fix your mail client and spare us the useless header
duplication in the reply?

> It is also worth noting that the probe code is spectacularly crap.
> It writes 0xff and then checks that 0xff is read back.
> Almost anything (including a failed PCIe read to the ISA bridge)
> will return 0xff and make the test pass.

        unsigned char probe_val = ~(1 << PIC_CASCADE_IR);

	outb(probe_val, PIC_MASTER_IMR);
	new_val = inb(PIC_MASTER_IMR);

How is that writing 0xFF?

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 13:01       ` Thomas Gleixner
@ 2021-05-14 13:13         ` David Laight
  2021-05-14 16:19           ` Ingo Molnar
  0 siblings, 1 reply; 22+ messages in thread
From: David Laight @ 2021-05-14 13:13 UTC (permalink / raw)
  To: 'Thomas Gleixner', 'Maximilian Luz',
	Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

From: Thomas Gleixner
> Sent: 14 May 2021 14:02
> 
> David,
> 
> On Thu, May 13 2021 at 10:36, David Laight wrote:
> 
> >> -----Original Message-----
> >> From: Maximilian Luz <luzmaximilian@gmail.com>
> 
> can you please fix your mail client and spare us the useless header
> duplication in the reply?

I have to delete them by hand - must have forgotten, I can't fix outlook :-)

> > It is also worth noting that the probe code is spectacularly crap.
> > It writes 0xff and then checks that 0xff is read back.
> > Almost anything (including a failed PCIe read to the ISA bridge)
> > will return 0xff and make the test pass.
> 
>         unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> 
> 	outb(probe_val, PIC_MASTER_IMR);
> 	new_val = inb(PIC_MASTER_IMR);
> 
> How is that writing 0xFF?

Sorry I misread the code and diagnostic output.

In any case writing a value and expecting the same value back
isn't exactly a high-quality probe.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-13 10:11   ` Maximilian Luz
  2021-05-13 10:36     ` David Laight
@ 2021-05-14 13:44     ` Thomas Gleixner
  2021-05-14 16:12       ` David Laight
  1 sibling, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-14 13:44 UTC (permalink / raw)
  To: Maximilian Luz, David Laight, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

Max,

On Thu, May 13 2021 at 12:11, Maximilian Luz wrote:
> And lastly, if that's any help at all: The PIC device is described in
> ACPI in [3]. The Surface Laptop 3 also has an AMD CPU (although a prior
> generation) and has the PIC described in the exact same way (can also be
> found in that repository), but doesn't exhibit that behavior (and
> doesn't show the "Using NULL legacy PIC" line). I expect there's not
> much you can change to that definition so that's probably irrelevant
> here.
>
> Again, I don't really know anything about these devices, so my guess
> would be bad hardware revision or bad firmware revision. All I know is
> that retrying seems to "fix" it.

That might be a a power optimization thing.

Except for older systems the PIC is not really required for IOAPiC to
work. But there is some historical code which makes assumptions. We can
change that, but that needs some careful thoughts.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* RE: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 13:44     ` Thomas Gleixner
@ 2021-05-14 16:12       ` David Laight
  2021-05-14 17:28         ` H. Peter Anvin
  0 siblings, 1 reply; 22+ messages in thread
From: David Laight @ 2021-05-14 16:12 UTC (permalink / raw)
  To: 'Thomas Gleixner', Maximilian Luz, Ingo Molnar, Borislav Petkov
  Cc: H. Peter Anvin, Sachi King, x86, linux-kernel, stable

From: Thomas Gleixner
> Sent: 14 May 2021 14:45
> 
> Max,
> 
> On Thu, May 13 2021 at 12:11, Maximilian Luz wrote:
> > And lastly, if that's any help at all: The PIC device is described in
> > ACPI in [3]. The Surface Laptop 3 also has an AMD CPU (although a prior
> > generation) and has the PIC described in the exact same way (can also be
> > found in that repository), but doesn't exhibit that behavior (and
> > doesn't show the "Using NULL legacy PIC" line). I expect there's not
> > much you can change to that definition so that's probably irrelevant
> > here.
> >
> > Again, I don't really know anything about these devices, so my guess
> > would be bad hardware revision or bad firmware revision. All I know is
> > that retrying seems to "fix" it.
> 
> That might be a a power optimization thing.
> 
> Except for older systems the PIC is not really required for IOAPiC to
> work. But there is some historical code which makes assumptions. We can
> change that, but that needs some careful thoughts.

A more interesting probe would be:
- Write some value to register 1 - the mask.
- Write 9 to register zero (selects interrupt in service register).
- Read register 0 - should be zero since we aren't in as ISR.
- Read register 1 - should get the mask back.
You can also write 8 to register 0, reads then return the pending interrupts.
Their might be pending interrupts - so that value can't be checked.

But if reads start returning the last written value you might only
have capacitors on the data bus.

The required initialisation registers are pretty fixed for the PC hardware.
But finding the values requires a bit of work.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 13:13         ` David Laight
@ 2021-05-14 16:19           ` Ingo Molnar
  0 siblings, 0 replies; 22+ messages in thread
From: Ingo Molnar @ 2021-05-14 16:19 UTC (permalink / raw)
  To: David Laight
  Cc: 'Thomas Gleixner', 'Maximilian Luz',
	Ingo Molnar, Borislav Petkov, H. Peter Anvin, Sachi King, x86,
	linux-kernel, stable


* David Laight <David.Laight@ACULAB.COM> wrote:

> > > It is also worth noting that the probe code is spectacularly crap.
> > > It writes 0xff and then checks that 0xff is read back.
> > > Almost anything (including a failed PCIe read to the ISA bridge)
> > > will return 0xff and make the test pass.
> > 
> >         unsigned char probe_val = ~(1 << PIC_CASCADE_IR);
> > 
> > 	outb(probe_val, PIC_MASTER_IMR);
> > 	new_val = inb(PIC_MASTER_IMR);
> > 
> > How is that writing 0xFF?
> 
> Sorry I misread the code and diagnostic output.
> 
> In any case writing a value and expecting the same value back
> isn't exactly a high-quality probe.

It's not, and it's not intended to be: 0x21 is a well-known port nobody was 
crazy enough to override yet, so that probe basically filters out the 
"there is nothing at that port, at all" case, which would normally return 
0xff, or in a few weird cases 0x00 perhaps.

Writing something inbetween those values and getting the same value back 
tells us that something functional occupies that well-known IO-port, 
pretending to be a i8259 PIC.

Which is what we wanted to know, given the context.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 16:12       ` David Laight
@ 2021-05-14 17:28         ` H. Peter Anvin
  0 siblings, 0 replies; 22+ messages in thread
From: H. Peter Anvin @ 2021-05-14 17:28 UTC (permalink / raw)
  To: David Laight, 'Thomas Gleixner',
	Maximilian Luz, Ingo Molnar, Borislav Petkov
  Cc: Sachi King, x86, linux-kernel, stable

On 5/14/21 9:12 AM, David Laight wrote:
> 
> A more interesting probe would be:
> - Write some value to register 1 - the mask.
> - Write 9 to register zero (selects interrupt in service register).
> - Read register 0 - should be zero since we aren't in as ISR.
> - Read register 1 - should get the mask back.
> You can also write 8 to register 0, reads then return the pending interrupts.
> Their might be pending interrupts - so that value can't be checked.
> 
> But if reads start returning the last written value you might only
> have capacitors on the data bus.

What data bus? These things haven't been on a physical parallel bus for 
ages.

> The required initialisation registers are pretty fixed for the PC hardware.
> But finding the values requires a bit of work.
> 
> 	David

And you always risk activating new bugs.

Since this appears to be a specific platform advertising the wrong 
answer in firmware, this is better handled as a quirk.

	-hpa


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 11:58         ` Maximilian Luz
@ 2021-05-14 17:32           ` Thomas Gleixner
  2021-05-14 17:35             ` H. Peter Anvin
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-14 17:32 UTC (permalink / raw)
  To: Maximilian Luz, Sachi King, Ingo Molnar, Borislav Petkov, David Laight
  Cc: H. Peter Anvin, x86, linux-kernel, stable

On Fri, May 14 2021 at 13:58, Maximilian Luz wrote:
> On 5/14/21 9:41 PM, Sachi King wrote:
> I'd assume that _some_ sort of interrupt setup is done by the BIOS/UEFI.
> The UEFI on those devices is fairly well-featured, with touch support
> via SPI and all. Furthermore, keyboard (also supported in the device's
> UEFI) is handled via a custom UART protocol. Unless they rely on polling
> for all of that, I believe they'd have to set up some interrupts.

Polling would be truly surprising.

> Although, as you mention later on, that could also be handled via the
> IOAPIC and the PIC is actually not supposed to be used. Maybe some
> legacy component that never got tested and just broke with some new
> hardware/firmware revision without anyone noticing? And since Linux
> still seems to rely on that, we might be the first to notice.

That's a valid assumption. As I said, we can make IOAPIC work even w/o
PIC. I'll have a look how much PIC assumptions are still around.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 17:32           ` Thomas Gleixner
@ 2021-05-14 17:35             ` H. Peter Anvin
  2021-05-14 22:47               ` Maximilian Luz
  0 siblings, 1 reply; 22+ messages in thread
From: H. Peter Anvin @ 2021-05-14 17:35 UTC (permalink / raw)
  To: Thomas Gleixner, Maximilian Luz, Sachi King, Ingo Molnar,
	Borislav Petkov, David Laight
  Cc: x86, linux-kernel, stable

On 5/14/21 10:32 AM, Thomas Gleixner wrote:
> 
> That's a valid assumption. As I said, we can make IOAPIC work even w/o
> PIC. I'll have a look how much PIC assumptions are still around.
> 

As far as I read, the problem isn't actually the absence of a PIC (we 
definitely boot systems without PICs all the time now), but rather that 
the PIC is advertised in ACPI but is buggy or absent; a similar platform 
with different firmware doesn't have problem.

If my understanding of the thread is correct, it's quirk fodder.

	-hpa


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-13 10:36     ` David Laight
  2021-05-14 13:01       ` Thomas Gleixner
@ 2021-05-14 19:41       ` Sachi King
  2021-05-14 10:51         ` David Laight
  2021-05-14 11:58         ` Maximilian Luz
  1 sibling, 2 replies; 22+ messages in thread
From: Sachi King @ 2021-05-14 19:41 UTC (permalink / raw)
  To: 'Maximilian Luz',
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, David Laight
  Cc: H. Peter Anvin, x86, linux-kernel, stable

On Thursday, May 13, 2021 8:36:27 PM AEST David Laight wrote:
> > -----Original Message-----
> > From: Maximilian Luz <luzmaximilian@gmail.com>
> > Sent: 13 May 2021 11:12
> > To: David Laight <David.Laight@ACULAB.COM>; Thomas Gleixner
> > <tglx@linutronix.de>; Ingo Molnar <mingo@redhat.com>; Borislav Petkov
> > <bp@alien8.de>
> > Cc: H. Peter Anvin <hpa@zytor.com>; Sachi King <nakato@nakato.io>;
> > x86@kernel.org; linux-kernel@vger.kernel.org; stable@vger.kernel.org
> > Subject: Re: [PATCH] x86/i8259: Work around buggy legacy PIC
> > 
> > On 5/13/21 10:10 AM, David Laight wrote:
> > 
> > > From: Maximilian Luz
> > > 
> > >> Sent: 12 May 2021 22:05
> > >>
> > >>
> > >>
> > >> The legacy PIC on the AMD variant of the Microsoft Surface Laptop 4
> > >> has
> > >> some problems on boot. For some reason it consistently does not
> > >> respond
> > >> on the first try, requiring a couple more tries before it finally
> > >> responds.
> > >
> > >
> > >
> > > That seems very strange, something else must be going on that causes the
> > > grief.
> > > The 8259 will be built into to the one of the cpu support
> > > chips.
> > > I can't imagine that requires anything special.
> > 
> > 
> > Right, it's definitely strange. Both Sachi (I imagine) and I don't know
> > much about these devices, so we're open for suggestions.
> 
> 
> I found a copy of the datasheet (I don't seem to have the black book):
> 
> https://pdos.csail.mit.edu/6.828/2010/readings/hardware/8259A.pdf
> 
> The PC hardware has two 8259 in cascade mode.
> (Cascaded using an interrupt that wasn't really using in the original
> 8088 PC which only had one 8259.)
> 
> I wonder if the bios has actually initialised is properly.
> Some initialisation writes have to be done to set everything up.

I suspect by the displayed behaviour you are correct and that it has
not.  I'm struggling to figure out who to talk to to see that is
something that can be fixed in the firmware.

> It is also worth noting that the probe code is spectacularly crap.
> It writes 0xff and then checks that 0xff is read back.
> Almost anything (including a failed PCIe read to the ISA bridge)
> will return 0xff and make the test pass.

I was under the impression that it wrote 0xfb, and 0xff would be
considered a failure.

> It's about 35 years since I last wrote the code to initialise an 8259.
> The memory cells are foggy.

I'm not sure the i8259 is needed on the device, as the interrupts
appear to function on the device if I bypass the nr_legacy_irqs() check
while the legacy_pic is set to the null_legacy_pic.

The null_legacy_pic however specifies having 0 irqs, and the io_apic
does not allow us to set the pin attributes unless the pin we are
attempting to set is less than nr_legacy_irqs.

The IOAPIC seems to take responsibility for the 0-15 interrupts on this
specific hardware, should we maybe be ignoring the i8259 and looking
into allowing interrupts 0-15 to be setup even when the legacy_pic is
not available?

Cheers,
Sachi



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 17:35             ` H. Peter Anvin
@ 2021-05-14 22:47               ` Maximilian Luz
  2021-05-17 18:40                 ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Maximilian Luz @ 2021-05-14 22:47 UTC (permalink / raw)
  To: H. Peter Anvin, Thomas Gleixner, Sachi King, Ingo Molnar,
	Borislav Petkov, David Laight
  Cc: x86, linux-kernel, stable

On 5/14/21 7:35 PM, H. Peter Anvin wrote:
> On 5/14/21 10:32 AM, Thomas Gleixner wrote:
>>
>> That's a valid assumption. As I said, we can make IOAPIC work even w/o
>> PIC. I'll have a look how much PIC assumptions are still around.
>>
> 
> As far as I read, the problem isn't actually the absence of a PIC (we definitely boot systems without PICs all the time now), but rather that the PIC is advertised in ACPI but is buggy or absent; a similar platform with different firmware doesn't have problem.
> 
> If my understanding of the thread is correct, it's quirk fodder.

I believe the theory was that, while the PIC is advertised in ACPI, it
might be expected to not be used and only present for some legacy reason
(therefore untested and buggy). Which I believe led to the question
whether we shouldn't prefer IOAPIC on systems like that in general. So I
guess it comes down to how you define "systems like that". By Tomas'
comment, I guess it should be possible to implement this as "systems
that should prefer IOAPIC over legacy PIC" quirk.

I guess all modern machines should have an IOAPIC, so it might also be
preferable to expand that definition, maybe over time and with enough
testing.

If possible, I think that would be preferable to the "just retry until
it works" kind of workaround.

As Sachi mentioned in her reply:

> I'm not sure the i8259 is needed on the device, as the interrupts
> appear to function on the device if I bypass the nr_legacy_irqs() check
> while the legacy_pic is set to the null_legacy_pic.
> 
> The null_legacy_pic however specifies having 0 irqs, and the io_apic
> does not allow us to set the pin attributes unless the pin we are
> attempting to set is less than nr_legacy_irqs.
> 
> The IOAPIC seems to take responsibility for the 0-15 interrupts on this
> specific hardware, should we maybe be ignoring the i8259 and looking
> into allowing interrupts 0-15 to be setup even when the legacy_pic is
> not available?

Just my two cents, please keep in mind that I'm out of my depth here.

Regards,
Max

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-14 22:47               ` Maximilian Luz
@ 2021-05-17 18:40                 ` Thomas Gleixner
  2021-05-17 19:25                   ` Maximilian Luz
  0 siblings, 1 reply; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-17 18:40 UTC (permalink / raw)
  To: Maximilian Luz, H. Peter Anvin, Sachi King, Ingo Molnar,
	Borislav Petkov, David Laight
  Cc: x86, linux-kernel, stable

Max,

On Sat, May 15 2021 at 00:47, Maximilian Luz wrote:
> I believe the theory was that, while the PIC is advertised in ACPI, it
> might be expected to not be used and only present for some legacy reason
> (therefore untested and buggy). Which I believe led to the question
> whether we shouldn't prefer IOAPIC on systems like that in general. So I
> guess it comes down to how you define "systems like that". By Tomas'
> comment, I guess it should be possible to implement this as "systems
> that should prefer IOAPIC over legacy PIC" quirk.
>
> I guess all modern machines should have an IOAPIC, so it might also be
> preferable to expand that definition, maybe over time and with enough
> testing.

I just double checked and we actually can boot just fine without the
PIC even when it is advertised, but disfunctional.

Can you please add "apic=verbose" to the kernel command line and provide
full dmesg output for a kernel w/o your patch and one with your patch
applied?

Thanks,

        tglx



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-17 18:40                 ` Thomas Gleixner
@ 2021-05-17 19:25                   ` Maximilian Luz
  2021-05-17 23:27                     ` Thomas Gleixner
  0 siblings, 1 reply; 22+ messages in thread
From: Maximilian Luz @ 2021-05-17 19:25 UTC (permalink / raw)
  To: Thomas Gleixner, H. Peter Anvin, Sachi King, Ingo Molnar,
	Borislav Petkov, David Laight
  Cc: x86, linux-kernel, stable

On 5/17/21 8:40 PM, Thomas Gleixner wrote:
> Max,
> 
> On Sat, May 15 2021 at 00:47, Maximilian Luz wrote:
>> I believe the theory was that, while the PIC is advertised in ACPI, it
>> might be expected to not be used and only present for some legacy reason
>> (therefore untested and buggy). Which I believe led to the question
>> whether we shouldn't prefer IOAPIC on systems like that in general. So I
>> guess it comes down to how you define "systems like that". By Tomas'
>> comment, I guess it should be possible to implement this as "systems
>> that should prefer IOAPIC over legacy PIC" quirk.
>>
>> I guess all modern machines should have an IOAPIC, so it might also be
>> preferable to expand that definition, maybe over time and with enough
>> testing.
> 
> I just double checked and we actually can boot just fine without the
> PIC even when it is advertised, but disfunctional.
> 
> Can you please add "apic=verbose" to the kernel command line and provide
> full dmesg output for a kernel w/o your patch and one with your patch
> applied?

I don't actually own an affected device, but I'm sure Sachi can provide
you with that.

As far as we can tell, due to the NULL PIC being chosen nr_legacy_irqs()
returns 0. That in turn causes mp_check_pin_attr() to return false
because is_level and active_low don't seem to match the expected values.
That check is essentially ignored if nr_legacy_irqs() returns a high
enough value. I guess that might also be a firmware bug here? Not sure
where the expected values come from.

Due to this, mp_map_pin_to_irq() fails with -EBUSY which causes
acpi_register_gsi() to fail. That fails in acpi_dev_get_irqresource(),
which causes the IRQ resource to be marked as disabled.

Down the line, this then causes platform_get_irq() to return -EINVAL,
because the IRQ we're trying to get has the IORESOURCE_DISABLED bit set.

Sachi can probably walk you through this a bit better as she's the one
who tracked this down. See also [1, 2] and following comments.

Regards,
Max

[1]: https://github.com/linux-surface/linux-surface/issues/425#issuecomment-835309201
[2]: https://github.com/linux-surface/linux-surface/issues/425#issuecomment-835261784

^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-17 19:25                   ` Maximilian Luz
@ 2021-05-17 23:27                     ` Thomas Gleixner
  2021-05-18  8:28                       ` Andy Shevchenko
  2021-05-18 19:58                       ` Sachi King
  0 siblings, 2 replies; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-17 23:27 UTC (permalink / raw)
  To: Maximilian Luz
  Cc: H. Peter Anvin, Sachi King, Ingo Molnar, Borislav Petkov,
	David Laight, x86, linux-kernel, stable, Andy Shevchenko,
	Sasha Levin, Tom Lendacky

On Mon, May 17 2021 at 21:25, Maximilian Luz wrote:
> On 5/17/21 8:40 PM, Thomas Gleixner wrote:
>> Can you please add "apic=verbose" to the kernel command line and provide
>> full dmesg output for a kernel w/o your patch and one with your patch
>> applied?
>
> I don't actually own an affected device, but I'm sure Sachi can provide
> you with that.

Ok.

> As far as we can tell, due to the NULL PIC being chosen nr_legacy_irqs()
> returns 0. That in turn causes mp_check_pin_attr() to return false
> because is_level and active_low don't seem to match the expected
> values.

Ok.

> That check is essentially ignored if nr_legacy_irqs() returns a high
> enough value.

Close enough.

> I guess that might also be a firmware bug here? Not sure where the
> expected values come from.

They come from the interrupt override ACPI table and if not supplied
then irq 0-15 is preset with default values, which are type=edge and
polarity=high, i.e.  the opposite of what the failing driver wants.

The ACPI table lacks an override entry for IRQ7. I looked at one of the
dmesg files in that github thread and that has overrides:

[    0.111674] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
[    0.111681] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
[    0.111688] ACPI: IRQ0 used by override.
[    0.111692] ACPI: IRQ9 used by override.

IRQ7 should have a corresponding entry as IRQ9 has:

https://github.com/linux-surface/acpidumps/blob/4da0148744164cea0c924dab92f45842fde03177/surface_laptop_4_amd/apic.dsl#L178

          Subtable Type : 02 [Interrupt Source Override]
                 Length : 0A
                    Bus : 00
                 Source : 07
              Interrupt : 00000007
  Flags (decoded below) : 000F
               Polarity : 3
           Trigger Mode : 3

> Sachi can probably walk you through this a bit better as she's the one
> who tracked this down. See also [1, 2] and following comments.

Impressive detective work!

Sachi, can you please try the hack below to confirm the above?

It's not meant to be a solution, but it's the most trivial way to
validate this.

I'm pretty sure that Windows on Surface does not care about the PIC at
all. Whether that's on purpose to safe power or just because Windows
ignores the PIC completely by now does not matter at all. No idea how
that repeated poking on the PIC makes it come alive either and TBH, I
don't care too much about it simply because Linux is able to cope with a
missing PIC as long as the ACPI tables are correct.

I'm way too tired to think about a proper solution for that problem and
I noticed another related issue in that dmesg output:

[    0.272448] Failed to register legacy timer interrupt

It's not a problem which causes failures, but it's related to the
missing PIC.

Needs some more thoughts with brain awake...

Thanks,

        tglx
---
--- a/arch/x86/kernel/acpi/boot.c
+++ b/arch/x86/kernel/acpi/boot.c
@@ -21,6 +21,7 @@
 #include <linux/efi-bgrt.h>
 #include <linux/serial_core.h>
 #include <linux/pgtable.h>
+#include <linux/dmi.h>
 
 #include <asm/e820/api.h>
 #include <asm/irqdomain.h>
@@ -1155,6 +1156,17 @@ static void __init mp_config_acpi_legacy
 	}
 }
 
+static const struct dmi_system_id surface_quirk[] __initconst = {
+	{
+		.ident = "Microsoft Surface Laptop 4 (AMD)",
+		.matches = {
+			DMI_MATCH(DMI_SYS_VENDOR, "Microsoft Corporation"),
+			DMI_MATCH(DMI_PRODUCT_SKU, "Surface_Laptop_4_1952:1953")
+		},
+	},
+	{}
+};
+
 /*
  * Parse IOAPIC related entries in MADT
  * returns 0 on success, < 0 on error
@@ -1212,6 +1224,11 @@ static int __init acpi_parse_madt_ioapic
 		acpi_sci_ioapic_setup(acpi_gbl_FADT.sci_interrupt, 0, 0,
 				      acpi_gbl_FADT.sci_interrupt);
 
+	if (dmi_check_system(surface_quirk)) {
+		pr_warn("Surface hack: Override irq 7\n");
+		mp_override_legacy_irq(7, 3, 3, 7);
+	}
+
 	/* Fill in identity legacy mappings where no override */
 	mp_config_acpi_legacy_irqs();
 



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-17 23:27                     ` Thomas Gleixner
@ 2021-05-18  8:28                       ` Andy Shevchenko
  2021-05-18 19:58                       ` Sachi King
  1 sibling, 0 replies; 22+ messages in thread
From: Andy Shevchenko @ 2021-05-18  8:28 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Maximilian Luz, H. Peter Anvin, Sachi King, Ingo Molnar,
	Borislav Petkov, David Laight, x86, linux-kernel, stable,
	Sasha Levin, Tom Lendacky

On Tue, May 18, 2021 at 01:27:02AM +0200, Thomas Gleixner wrote:
> On Mon, May 17 2021 at 21:25, Maximilian Luz wrote:
> > On 5/17/21 8:40 PM, Thomas Gleixner wrote:
> >> Can you please add "apic=verbose" to the kernel command line and provide
> >> full dmesg output for a kernel w/o your patch and one with your patch
> >> applied?
> >
> > I don't actually own an affected device, but I'm sure Sachi can provide
> > you with that.
> 
> Ok.
> 
> > As far as we can tell, due to the NULL PIC being chosen nr_legacy_irqs()
> > returns 0. That in turn causes mp_check_pin_attr() to return false
> > because is_level and active_low don't seem to match the expected
> > values.
> 
> Ok.
> 
> > That check is essentially ignored if nr_legacy_irqs() returns a high
> > enough value.
> 
> Close enough.
> 
> > I guess that might also be a firmware bug here? Not sure where the
> > expected values come from.
> 
> They come from the interrupt override ACPI table and if not supplied
> then irq 0-15 is preset with default values, which are type=edge and
> polarity=high, i.e.  the opposite of what the failing driver wants.
> 
> The ACPI table lacks an override entry for IRQ7. I looked at one of the
> dmesg files in that github thread and that has overrides:
> 
> [    0.111674] ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
> [    0.111681] ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 low level)
> [    0.111688] ACPI: IRQ0 used by override.
> [    0.111692] ACPI: IRQ9 used by override.
> 
> IRQ7 should have a corresponding entry as IRQ9 has:
> 
> https://github.com/linux-surface/acpidumps/blob/4da0148744164cea0c924dab92f45842fde03177/surface_laptop_4_amd/apic.dsl#L178
> 
>           Subtable Type : 02 [Interrupt Source Override]
>                  Length : 0A
>                     Bus : 00
>                  Source : 07
>               Interrupt : 00000007
>   Flags (decoded below) : 000F
>                Polarity : 3
>            Trigger Mode : 3
> 
> > Sachi can probably walk you through this a bit better as she's the one
> > who tracked this down. See also [1, 2] and following comments.
> 
> Impressive detective work!
> 
> Sachi, can you please try the hack below to confirm the above?
> 
> It's not meant to be a solution, but it's the most trivial way to
> validate this.
> 
> I'm pretty sure that Windows on Surface does not care about the PIC at
> all. Whether that's on purpose to safe power or just because Windows
> ignores the PIC completely by now does not matter at all. No idea how
> that repeated poking on the PIC makes it come alive either and TBH, I
> don't care too much about it simply because Linux is able to cope with a
> missing PIC as long as the ACPI tables are correct.
> 
> I'm way too tired to think about a proper solution for that problem and
> I noticed another related issue in that dmesg output:
> 
> [    0.272448] Failed to register legacy timer interrupt
> 
> It's not a problem which causes failures, but it's related to the
> missing PIC.

But ACPI has a pretty nice means about missing legacy hardware, it's called
Hardware Reduced mode. It excludes automatically the (legacy) PIC, PIT, etc.

OTOH it excludes ACPI power chip as well. I haven't looked into this, just
share my thoughts what else can be checked. (On Intel the MID devices use
that approach)

> Needs some more thoughts with brain awake...

-- 
With Best Regards,
Andy Shevchenko



^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-18 19:58                       ` Sachi King
@ 2021-05-18 15:45                         ` Thomas Gleixner
  0 siblings, 0 replies; 22+ messages in thread
From: Thomas Gleixner @ 2021-05-18 15:45 UTC (permalink / raw)
  To: Sachi King, Maximilian Luz
  Cc: H. Peter Anvin, Ingo Molnar, Borislav Petkov, David Laight, x86,
	linux-kernel, stable, Andy Shevchenko, Sasha Levin, Tom Lendacky,
	Rafael J. Wysocki

Sachi,

On Wed, May 19 2021 at 05:58, Sachi King wrote:
> On Tuesday, May 18, 2021 9:27:02 AM AEST Thomas Gleixner wrote:
>> On Mon, May 17 2021 at 21:25, Maximilian Luz wrote:
>> > On 5/17/21 8:40 PM, Thomas Gleixner wrote:
>> >> Can you please add "apic=verbose" to the kernel command line and provide
>> >> full dmesg output for a kernel w/o your patch and one with your patch
>> >> applied?
>
> I've linked to a dmesg with "apic=verbose" below with the irq 7 override hack 
> applied.  Would you still like a copy without either of these patches
> applied?

No, that's pretty conclusive.

> What's the convention for including a dmesg on the mailing list?  I've 
> included the copy via gist as pasting it into email seems incorrect, but 
> that's also probably not the correct convention.

That's fine.

>> Sachi, can you please try the hack below to confirm the above?
>>
>> It's not meant to be a solution, but it's the most trivial way to
>> validate this.
>
> I've applied that patch and the GPIO driver loads and functions.

It's exactly what I expected. So now for a proper solution. After
talking to Raphael it seems something like that quirk is the best option
we have. I'll try to come up with something less horrible. There is some
other minor issue I noticed, which I need to look at as well.

Thanks for digging into this and for testing!

       tglx


^ permalink raw reply	[flat|nested] 22+ messages in thread

* Re: [PATCH] x86/i8259: Work around buggy legacy PIC
  2021-05-17 23:27                     ` Thomas Gleixner
  2021-05-18  8:28                       ` Andy Shevchenko
@ 2021-05-18 19:58                       ` Sachi King
  2021-05-18 15:45                         ` Thomas Gleixner
  1 sibling, 1 reply; 22+ messages in thread
From: Sachi King @ 2021-05-18 19:58 UTC (permalink / raw)
  To: Maximilian Luz, Thomas Gleixner
  Cc: H. Peter Anvin, Ingo Molnar, Borislav Petkov, David Laight, x86,
	linux-kernel, stable, Andy Shevchenko, Sasha Levin, Tom Lendacky

On Tuesday, May 18, 2021 9:27:02 AM AEST Thomas Gleixner wrote:
> On Mon, May 17 2021 at 21:25, Maximilian Luz wrote:
> > On 5/17/21 8:40 PM, Thomas Gleixner wrote:
> >> Can you please add "apic=verbose" to the kernel command line and provide
> >> full dmesg output for a kernel w/o your patch and one with your patch
> >> applied?

I've linked to a dmesg with "apic=verbose" below with the irq 7 override hack 
applied.  Would you still like a copy without either of these patches applied?

What's the convention for including a dmesg on the mailing list?  I've 
included the copy via gist as pasting it into email seems incorrect, but 
that's also probably not the correct convention.

> Sachi, can you please try the hack below to confirm the above?
>
> It's not meant to be a solution, but it's the most trivial way to
> validate this.

I've applied that patch and the GPIO driver loads and functions.

The dmesg with the patch and "apic=verbose" I've placed at:
https://gist.github.com/nakato/08c6962a540d91b6f9597303d54bffe5

Thanks,
Sachi



^ permalink raw reply	[flat|nested] 22+ messages in thread

end of thread, other threads:[~2021-05-18 15:46 UTC | newest]

Thread overview: 22+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-05-12 21:04 [PATCH] x86/i8259: Work around buggy legacy PIC Maximilian Luz
2021-05-13  8:10 ` David Laight
2021-05-13 10:11   ` Maximilian Luz
2021-05-13 10:36     ` David Laight
2021-05-14 13:01       ` Thomas Gleixner
2021-05-14 13:13         ` David Laight
2021-05-14 16:19           ` Ingo Molnar
2021-05-14 19:41       ` Sachi King
2021-05-14 10:51         ` David Laight
2021-05-14 11:58         ` Maximilian Luz
2021-05-14 17:32           ` Thomas Gleixner
2021-05-14 17:35             ` H. Peter Anvin
2021-05-14 22:47               ` Maximilian Luz
2021-05-17 18:40                 ` Thomas Gleixner
2021-05-17 19:25                   ` Maximilian Luz
2021-05-17 23:27                     ` Thomas Gleixner
2021-05-18  8:28                       ` Andy Shevchenko
2021-05-18 19:58                       ` Sachi King
2021-05-18 15:45                         ` Thomas Gleixner
2021-05-14 13:44     ` Thomas Gleixner
2021-05-14 16:12       ` David Laight
2021-05-14 17:28         ` H. Peter Anvin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.