linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jack Miller <jack@codezen.org>
To: Borislav Petkov <bp@suse.de>
Cc: Jack Miller <jack@codezen.org>,
	linux-kernel@vger.kernel.org, tglx@linutronix.de,
	Yazen.Ghannam@amd.com, x86@kernel.org
Subject: Re: [PATCH] x86/mce/AMD: Fix partial SMCA bank init when CPU 0 != thread 0
Date: Wed, 28 Jun 2017 12:44:17 -0500	[thread overview]
Message-ID: <CAGXwabS2UNeZ65rFjMQbKxg+e3zhYiWv_bQkYad5BfPO1LWCMA@mail.gmail.com> (raw)
In-Reply-To: <20170628092219.4df52dhwe7q3iao5@pd.tnic>

On Wed, Jun 28, 2017 at 4:22 AM, Borislav Petkov <bp@suse.de> wrote:
> On Tue, Jun 27, 2017 at 07:06:30PM -0500, Jack Miller wrote:
>> After a call to firmware SwitchBSP(),
>
> What is that and who does that?

SwitchBSP() is part of the UEFI MPServices Protocol which I believe is
an extension but it is supported by all of the firmwares I've tested
on.

In this case, I'm using a bootloader to SwitchBSP() so that hardware
thread 0 (and thus core 0) can be offlined on AMD hardware
(cpu0_hotplug unsupported). This is currently working by passing
'nomce' to the kernel, but obviously I'd prefer not to disable it.

>
>> Linux can be booted with a thread
>> that isn't the first in the system. That thread automatically becomes
>> CPU 0.
>
> Btw, you should be seeing other explosions too as a lot of code assumes
> CPU 0 is the BSP.

Actually, with 'nomce' or this patch applied the system seems to chug
along merrily, no further errors in dmesg, no further BUGs. Linux
still gets all of the topology correct (i.e. CPU 0's
core/thread/siblings are correctly identified) so really, aside from
userspace programs doing naive stuff with CPU affinity (like expecting
even,odd CPUs to be SMT pairs), I think the overall result here is
that most threads are interchangeable... except when probing certain
features like these MCA types.

>
> ...
>
>> Signed-off-by: Jack Miller <jack@codezen.org>
>> ---
>>  arch/x86/kernel/cpu/mcheck/mce_amd.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/arch/x86/kernel/cpu/mcheck/mce_amd.c b/arch/x86/kernel/cpu/mcheck/mce_amd.c
>> index 6e4a047e4b68..9d74adcf34d2 100644
>> --- a/arch/x86/kernel/cpu/mcheck/mce_amd.c
>> +++ b/arch/x86/kernel/cpu/mcheck/mce_amd.c
>> @@ -170,8 +170,8 @@ static void get_smca_bank_info(unsigned int bank)
>>       struct smca_hwid *s_hwid;
>>       u32 high, instance_id;
>>
>> -     /* Collect bank_info using CPU 0 for now. */
>> -     if (cpu)
>> +     /* Collect bank_info using hardware thread 0 for now. */
>> +     if (apic->get_apic_id(apic->read(APIC_ID)) != 0)
>
> Does
>
>         if (cpu != boot_cpu_data.cpu_index)
>                 return;
>
> work?

Unfortunately, it doesn't. That value is explicitly set to 0. Most
mechanisms operate around CPU #, which isn't very helpful if the BSP
was changed under the covers.

Alternatively, we could possibly sidestep the APIC ID uncertainty by
patching get_smca_bank_info() to fallback on reading the bank
hwid_mcatype from other online CPUs (it's already using
rdmsr_safe_on_cpu) if its own hwid_mcatype isn't valid/recognized, but
that's a more invasive patch.

Jack

  reply	other threads:[~2017-06-28 17:44 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-28  0:06 [PATCH] x86/mce/AMD: Fix partial SMCA bank init when CPU 0 != thread 0 Jack Miller
2017-06-28  9:22 ` Borislav Petkov
2017-06-28 17:44   ` Jack Miller [this message]
2017-06-28 18:00     ` Ghannam, Yazen
2017-06-28 18:53       ` Jack Miller
2017-06-28 18:58         ` Ghannam, Yazen
2017-06-29 16:22           ` Jack Miller
2017-06-29 17:58             ` Ghannam, Yazen
2017-06-28 18:16     ` Borislav Petkov
2017-06-28 18:51       ` Ghannam, Yazen
2017-06-28 18:55         ` Borislav Petkov
2017-06-29 18:08 ` [PATCH] x86/mce/AMD: Allow any CPU to initialize smca_banks array Yazen Ghannam
2017-06-30 15:57   ` Jack Miller
2017-07-17  5:19   ` Borislav Petkov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=CAGXwabS2UNeZ65rFjMQbKxg+e3zhYiWv_bQkYad5BfPO1LWCMA@mail.gmail.com \
    --to=jack@codezen.org \
    --cc=Yazen.Ghannam@amd.com \
    --cc=bp@suse.de \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tglx@linutronix.de \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).