linux-kernel.vger.kernel.org archive mirror
From: "Lendacky, Thomas" <Thomas.Lendacky@amd.com>
To: Hans de Goede <hdegoede@redhat.com>,
	Thomas Gleixner <tglx@linutronix.de>
Cc: Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>,
	Borislav Petkov <bp@alien8.de>
Subject: Re: False positive "do_IRQ: #.55 No irq handler for vector" messages on AMD ryzen based laptops
Date: Tue, 5 Mar 2019 19:31:35 +0000
Message-ID: <62f91d1a-4dc7-9628-5c87-5ffca0cd1a0f@amd.com>
In-Reply-To: <51078b59-161a-0e13-6d8d-87d37c3375f2@redhat.com>

On 3/5/19 1:19 PM, Hans de Goede wrote:
> Hi,
> 
> On 05-03-19 17:02, Hans de Goede wrote:
>> Hi,
>>
>> On 05-03-19 15:06, Lendacky, Thomas wrote:
>>> On 3/3/19 4:57 AM, Hans de Goede wrote:
>>>> Hi,
>>>>
>>>> On 21-02-19 13:30, Hans de Goede wrote:
>>>>> Hi,
>>>>>
>>>>> On 19-02-19 22:47, Lendacky, Thomas wrote:
>>>>>> On 2/19/19 3:01 PM, Thomas Gleixner wrote:
>>>>>>> Hans,
>>>>>>>
>>>>>>> On Tue, 19 Feb 2019, Hans de Goede wrote:
>>>>>>>
>>>>>>> Cc+: ACPI/AMD folks
>>>>>>>
>>>>>>>> Various people are reporting false positive "do_IRQ: #.55 No irq
>>>>>>>> handler for vector" messages on AMD Ryzen based laptops, see e.g.:
>>>>>>>>
>>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1551605
>>>>>>>>
>>>>>>>> Which contains this dmesg snippet:
>>>>>>>>
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smp: Bringing up secondary CPUs ...
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: x86: Booting SMP configuration:
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: .... node  #0, CPUs:      #1
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 1.55 No irq handler for vector
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel:  #2
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 2.55 No irq handler for vector
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel:  #3
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: do_IRQ: 3.55 No irq handler for vector
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smp: Brought up 1 node, 4 CPUs
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smpboot: Max logical packages: 1
>>>>>>>> Feb 07 20:14:29 localhost.localdomain kernel: smpboot: Total of 4 processors activated (15968.49 BogoMIPS)
>>>>>>>>
>>>>>>>> It seems that we get an IRQ for each CPU as we bring it online,
>>>>>>>> which feels to me like it is some sorta false-positive.
>>>>>>>
>>>>>>> Sigh, that looks like BIOS value add again.
>>>>>>>
>>>>>>> It's not a false positive. Something _IS_ sending a vector 55 to
>>>>>>> these CPUs for whatever reason.
>>>>>>>
>>>>>>
>>>>>> I remember seeing something like this in the past and it turned out
>>>>>> to be a BIOS issue.  BIOS was enabling the APs to interact with the
>>>>>> legacy 8259 interrupt controller when only the BSP should. During
>>>>>> POST the APs were exposed to ExtINT/INTR events as a result of the
>>>>>> mis-configuration (probably due to a UEFI timer-tick using the 8259)
>>>>>> and this left a pending ExtINT/INTR interrupt latched on the APs.
>>>>>>
>>>>>> When the APs were started by the OS, the latched ExtINT/INTR
>>>>>> interrupt is processed shortly after the OS enables interrupts. The
>>>>>> AP then queries the 8259 to identify the vector number (which is the
>>>>>> value of the 8259's ICW2 register + the IRQ level). The master 8259's
>>>>>> ICW2 was set to 0x30 and, since no interrupts are actually pending,
>>>>>> the 8259 will respond with IRQ7 (spurious interrupt) yielding a
>>>>>> vector of 0x37 or 55.
>>>>>>
>>>>>> The OS was not expecting vector 55 and printed the message.
>>>>>>
>>>>>> From the Intel Developer's Manual: Vol 3a, Section 10.5.1:
>>>>>> "Only one processor in the system should have an LVT entry
>>>>>> configured to use the ExtINT delivery mode."
>>>>>>
>>>>>> Not saying this is the problem, but very well could be.
>>>>>
>>>>> That sounds like a likely candidate, esp. also since this only happens
>>>>> once per CPU when we first online the CPU.
>>>>>
>>>>> Can you provide me with a patch with some printk-s / pr_debugs to
>>>>> test for this, then I can build a kernel with that patch added and
>>>>> we can see if your hypothesis is right.
>>>>
>>>> Ping? I like your theory, can you provide some help with debugging this
>>>> further (to prove that your theory is correct)?
>>>
>>> It's been a very long time since I dealt with this and I was only on the
>>> periphery. You might be able to print the LVT entries from the APIC and
>>> see if any of them have an un-masked ExtINT delivery mode.  You would need
>>> to do this very early before Linux modifies any values.
>>
>> I'm afraid I'm not familiar enough with the interrupt / APIC parts of
>> the kernel to do something like this myself.
>>
>>> Or you can report the issue to the OEM and have them check their BIOS
>>> code to see if they are doing this.
>>
>> I will try to go this route, but I'm not really hopeful that will
>> lead to a solution.
> 
> A similar issue is also reported here:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=1551605
> 
> There are multiple people with different vectors (so likely / possibly
> different bugs) commenting on that bug, but I just got confirmation
> that the vector 55 issue is also happening on an Acer system with an AMD
> A8 processor (I suspect a Ryzen, but that still needs to be confirmed).
> 
> So this seems to be a generic issue with (some) AMD laptops and
> not specific to one OEM.

I also see that comment 17 is for an Intel based machine, which to me
implies that it really is a BIOS issue.

Thanks,
Tom

> 
> Regards,
> 
> Hans
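
The arithmetic and the LVT check described in the quoted text above can
be sketched in a few lines of Python. This is only an illustration, not
kernel code: the function names and the sample LVT values are made up
for the example, and the bit positions (delivery mode in bits 10:8 with
ExtINT encoded as 111b, mask in bit 16) are taken from the local APIC
LVT register layout in the Intel manual. It assumes the 8259 is in 8086
mode, where ICW2 supplies the upper five vector bits and the IRQ level
the lower three.

```python
# Illustrative sketch only. Names and sample values are assumptions for
# this example; they are not kernel APIs or data from affected machines.

EXTINT_MODE = 0b111      # LVT delivery mode encoding for ExtINT
LVT_MASK_BIT = 1 << 16   # when set, the LVT entry is masked


def pic_vector(icw2: int, irq: int) -> int:
    """Vector reported by an 8259: ICW2 base (upper 5 bits) plus IRQ level."""
    return (icw2 & 0xF8) | (irq & 0x07)


def lvt_delivery_mode(lvt: int) -> int:
    """Extract the delivery mode field (bits 10:8) from an LVT register value."""
    return (lvt >> 8) & 0x7


def lvt_is_unmasked_extint(lvt: int) -> bool:
    """True if the entry uses ExtINT delivery mode and is not masked."""
    return lvt_delivery_mode(lvt) == EXTINT_MODE and not (lvt & LVT_MASK_BIT)


if __name__ == "__main__":
    # ICW2 = 0x30 and spurious IRQ7, as in the explanation above.
    vec = pic_vector(0x30, 7)
    print(f"vector = {vec:#x} ({vec})")  # prints "vector = 0x37 (55)"

    # Hypothetical LVT register values, chosen to exercise each case.
    samples = {
        "ExtINT, unmasked": 0x00000700,
        "ExtINT, masked": 0x00010700,
        "NMI, unmasked": 0x00000400,
    }
    for name, value in samples.items():
        print(f"{name}: suspicious={lvt_is_unmasked_extint(value)}")
```

On real hardware the LVT values would have to be read from each AP's
local APIC very early, before Linux reprograms them, as noted above; the
sketch only shows how such a value would be decoded once obtained.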

