linux-edac.vger.kernel.org archive mirror
From: Robert Richter <rrichter@amd.com>
To: Yazen Ghannam <yazen.ghannam@amd.com>
Cc: Borislav Petkov <bp@alien8.de>,
	linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	tony.luck@intel.com, x86@kernel.org, Avadhut.Naik@amd.com,
	John.Allen@amd.com
Subject: Re: [PATCH v2 07/16] x86/mce/amd: Simplify DFR handler setup
Date: Mon, 29 Apr 2024 20:34:37 +0200
Message-ID: <Zi_oPUzvCDhRVSk4@rric.localdomain>
In-Reply-To: <e0d10606-4472-4cde-b55d-34180efad42b@amd.com>

On 25.04.24 10:12:44, Yazen Ghannam wrote:
> On 4/24/2024 3:06 PM, Borislav Petkov wrote:
> > On Thu, Apr 04, 2024 at 10:13:50AM -0500, Yazen Ghannam wrote:
> >> AMD systems with the SUCCOR feature can send an APIC LVT interrupt for
> >> deferred errors. The LVT offset is 0x2 by convention, i.e. this is the
> >> default as listed in hardware documentation.
> >>
> >> However, the MCA registers may list a different LVT offset for this
> >> interrupt. The kernel should honor the value from the hardware.
> > 
> > There's this "may" thing again.
> > 
> 
> Right, I should say "the microarchitecture allows it". :)
> 
> > Is this enablement for some future hw too or do you really trust the
> > value in MSR_CU_DEF_ERR is programmed correctly in all cases?
> > 
> 
> I trust the value from hardware.
> 
> The intention here is to simplify the code for general maintenance and to make
> later patches easier.
> 
> >> Simplify the enable flow by using the hardware-provided value. Any
> >> conflicts will be caught by setup_APIC_eilvt(). Conflicts on production
> >> systems can be handled as quirks, if needed.
> > 
> > Well, which systems support succor?
> > 
> > I'd like to test this on them before we face all the quirkery. :)
> > 
> 
> All Zen/SMCA systems. I don't recall any issues in this area.
> 
> Some later Family 15h systems (Carrizo?) had it. But I don't know if it was
> used in production. It was slightly before my time.
> 
> > That area has been plagued by hw snafus if you look at
> > setup_APIC_eilvt() and talk to uncle Robert. :-P
> >
> 
> Right, I found this:
> 27afdf2008da ("apic, x86: Use BIOS settings for IBS and MCE threshold
> interrupt LVT offsets")
> 
> Which is basically the same idea: use what is in the register.
> 
> But it looks like there was an issue with IBS on Family 10h.

After looking into it for a while, I think the issue was the following:

The IBS offset was not enabled by firmware, but the MCE offset already
was (due to earlier setup). Also, MCE was (maybe) not enabled on all
CPUs, but only on one CPU per socket, whereas the IBS vector should be
enabled on all CPUs. Now, firmware allocated offset 1 for MCE (instead
of offset 0 as on K8), so the hardcoded value (offset 1 for IBS) was
already taken. Hardcoded values couldn't be used at all anyway, as that
would not have worked on K8 (for MCE). Another issue was finding the
next free offset, as you couldn't examine just the current CPU: even
if the offset was available on the current CPU, another CPU might
already have that offset in use. Yet another problem was that the
programmed offsets for MCE and IBS overlapped each other, and the
kernel had to reassign one of them (the IBS offset).
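
To illustrate the "you can't look at just the current CPU" point, here
is a rough user-space sketch. It is not the kernel's code and not what
setup_APIC_eilvt() does internally. The table eilvt[][], the enum, the
CPU count and the helpers offset_usable() and claim_offset() are all
made up for illustration. The count of four extended LVT entries is an
assumption for Family 10h.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS   4	/* hypothetical CPU count for this sketch */
#define EILVT_NR  4	/* assumed number of extended LVT entries */

/* Hypothetical owners of an extended LVT entry. */
enum eilvt_user { EILVT_FREE = 0, EILVT_MCE, EILVT_IBS };

/* Simulated per-CPU view of who owns each LVT offset. */
enum eilvt_user eilvt[NR_CPUS][EILVT_NR];

/*
 * An offset is usable for 'user' only if every CPU either has it free
 * or already has it assigned to the same user. Checking only the
 * current CPU is not enough.
 */
bool offset_usable(unsigned int offset, enum eilvt_user user)
{
	unsigned int cpu;

	assert(offset < EILVT_NR);

	for (cpu = 0; cpu < NR_CPUS; cpu++) {
		enum eilvt_user owner = eilvt[cpu][offset];

		if (owner != EILVT_FREE && owner != user)
			return false;
	}
	return true;
}

/*
 * Try the preferred offset first, then the remaining ones in order.
 * On success the offset is marked on all CPUs and returned,
 * otherwise -1 is returned.
 */
int claim_offset(unsigned int preferred, enum eilvt_user user)
{
	unsigned int i, cpu;

	for (i = 0; i < EILVT_NR; i++) {
		unsigned int offset = (preferred + i) % EILVT_NR;

		if (!offset_usable(offset, user))
			continue;

		for (cpu = 0; cpu < NR_CPUS; cpu++)
			eilvt[cpu][offset] = user;

		return (int)offset;
	}
	return -1;
}
```

In this model, if firmware has put MCE on offset 1 on a single CPU
only, asking for IBS with a preferred offset of 1 fails the all-CPU
check and falls through to offset 2, while MCE can still claim
offset 1 everywhere.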

I hope I remember all the details correctly here.

Thanks,

-Robert

Thread overview: 53+ messages
2024-04-04 15:13 [PATCH v2 00/16] MCA Updates Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 01/16] x86/mce: Define mce_setup() helpers for common and per-CPU fields Yazen Ghannam
2024-04-16 10:02   ` Borislav Petkov
2024-04-17 13:50     ` Yazen Ghannam
2024-04-22  8:13       ` Borislav Petkov
2024-04-04 15:13 ` [PATCH v2 02/16] x86/mce: Use mce_setup() helpers for apei_smca_report_x86_error() Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 03/16] x86/mce/amd: Use fixed bank number for quirks Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 04/16] x86/mce/amd: Look up bank type by IPID Yazen Ghannam
2024-04-23 17:06   ` Borislav Petkov
2024-04-23 19:16     ` Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 05/16] x86/mce/amd: Clean up SMCA configuration Yazen Ghannam
2024-04-23 19:06   ` Borislav Petkov
2024-04-23 19:32     ` Yazen Ghannam
2024-04-24  2:29       ` Borislav Petkov
2024-04-24 13:44         ` Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 06/16] x86/mce/amd: Prep DFR handler before enabling banks Yazen Ghannam
2024-04-24 18:34   ` Borislav Petkov
2024-04-25 13:31     ` Yazen Ghannam
2024-04-29 12:38       ` Borislav Petkov
2024-04-29 13:22         ` Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 07/16] x86/mce/amd: Simplify DFR handler setup Yazen Ghannam
2024-04-24 19:06   ` Borislav Petkov
2024-04-25 14:12     ` Yazen Ghannam
2024-04-29 12:59       ` Borislav Petkov
2024-04-29 13:56         ` Yazen Ghannam
2024-04-29 14:12           ` Borislav Petkov
2024-04-29 14:25             ` Yazen Ghannam
2024-04-30 13:47               ` Borislav Petkov
2024-04-29 18:34       ` Robert Richter [this message]
2024-04-30 18:06         ` Borislav Petkov
2024-05-02 16:02           ` Yazen Ghannam
2024-05-02 18:48             ` Robert Richter
2024-05-04 14:37               ` Borislav Petkov
2024-04-04 15:13 ` [PATCH v2 08/16] x86/mce/amd: Clean up enable_deferred_error_interrupt() Yazen Ghannam
2024-04-29 13:12   ` Borislav Petkov
2024-04-29 14:18     ` Yazen Ghannam
2024-05-04 14:41       ` Borislav Petkov
2024-04-04 15:13 ` [PATCH v2 09/16] x86/mce: Unify AMD THR handler with MCA Polling Yazen Ghannam
2024-04-29 13:40   ` Borislav Petkov
2024-04-29 14:36     ` Yazen Ghannam
2024-05-04 14:52       ` Borislav Petkov
2024-05-07 16:25         ` Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 10/16] x86/mce: Unify AMD DFR " Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 11/16] x86/mce: Skip AMD threshold init if no threshold banks found Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 12/16] x86/mce/amd: Support SMCA Corrected Error Interrupt Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 13/16] x86/mce: Add wrapper for struct mce to export vendor specific info Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 14/16] x86/mce, EDAC/mce_amd: Add support for new MCA_SYND{1,2} registers Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 15/16] x86/mce/apei: Handle variable register array size Yazen Ghannam
2024-04-04 15:13 ` [PATCH v2 16/16] EDAC/mce_amd: Add support for FRU Text in MCA Yazen Ghannam
2024-04-05 16:06   ` Luck, Tony
2024-04-07 13:19     ` Yazen Ghannam
2024-04-08 19:47     ` Naik, Avadhut
2024-04-08 19:57       ` Luck, Tony
