linux-kernel.vger.kernel.org archive mirror
From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	tony.luck@intel.com, x86@kernel.org,
	Smita.KoralahalliChannabasappa@amd.com, mukul.joshi@amd.com,
	alexander.deucher@amd.com, william.roche@oracle.com
Subject: Re: [PATCH 1/3] x86/MCE/AMD: Provide an "Unknown" MCA bank type
Date: Tue, 7 Dec 2021 16:28:42 +0000
Message-ID: <Ya+LukojuewlomeF@yaz-ubuntu>
In-Reply-To: <YaqXiVjNLINxwz8G@zn.tnic>

On Fri, Dec 03, 2021 at 11:17:45PM +0100, Borislav Petkov wrote:
> On Fri, Dec 03, 2021 at 02:00:15AM +0000, Yazen Ghannam wrote:
> > The AMD MCA Thresholding sysfs interface populates directories for each
> > bank and thresholding block. The name used for each directory is looked
> > up in a table of known bank types. However, new bank types won't match
> > in this list and will return NULL for the name. This will cause the
> > machinecheck sysfs interface to fail to be populated.
> > 
> > Set new and unknown MCA bank types to the "unknown" type. Also,
> > ensure that the bank's thresholding block directories have unique names.
> > This will ensure that the machinecheck sysfs interface can be
> > initialized.
> 
> What is the advantage of having a sysfs directory structure headed with
> an "unknown" entry vs not having that structure at all when the kernel
> runs on a machine for which it has not been enabled yet?
> 
> IOW, if those new banks would need additional enablement, what's the
> point of having "unknown" on older kernels which do not have any
> functionality?
> 
> IOW, how does this:
> 
> /sys/devices/system/machinecheck/machinecheck0/unknown/unknown/
> ├── error_count
> ├── interrupt_enable
> └── threshold_limit
> 
> help a user?

Yeah, I see your point.

> 
> Btw, looking at the current layout:
> 
> ...
> ├── insn_fetch
> │   └── insn_fetch
> │       ├── error_count
> │       ├── interrupt_enable
> │       └── threshold_limit
> ├── l2_cache
> │   └── l2_cache
> │       ├── error_count
> │       ├── interrupt_enable
> │       └── threshold_limit
> ...
> 
> we have those names repeated which looks wonky and useless too. I'd
> expect them to be:
> 
> ...
> ├── insn_fetch
> │   ├── error_count
> │   ├── interrupt_enable
> │   └── threshold_limit
> ├── l2_cache
> │   ├── error_count
> │   ├── interrupt_enable
> │   └── threshold_limit
> ...
> 
> Can we fix that too pls?
> 

Sure thing. But I don't think we can simply remove the second directory. The
layout is "bank"/"block". If the "block" has a special use, like DRAM ECC or L3
Cache on older systems, then it'll have a unique name. Otherwise, the block
will take the name of the bank, which is why the names repeat.

I think the more robust solution is to drop the unique names and use generic
names like "bank"/"block". A new file called "type" can be introduced into the
directory structure, and this can return the name of the bank/block. New bank
types will return "<null>" for the "type", but the directory structure should
remain the same and functional.
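
To make that concrete, here is a rough userspace model in Python (not kernel
code) of how the lookup behind the "type" file could behave. The table
contents, the numeric IDs, and the names KNOWN_BANK_TYPES, bank_type_desc and
bank_dir_name are made up for illustration; only the generic directory naming
and the "<null>" fallback come from the idea above.

```python
# Hypothetical table of known bank types, keyed by an illustrative type ID.
# The real kernel table and its IDs are not reproduced here.
KNOWN_BANK_TYPES = {
    0: "Instruction Fetch",
    1: "Northbridge",
}


def bank_type_desc(bank_type_id):
    """Return the text the proposed "type" file would show.

    Known types return their description. A type that is missing from the
    table returns "<null>" instead of failing, so the directory can still
    be created.
    """
    return KNOWN_BANK_TYPES.get(bank_type_id, "<null>")


def bank_dir_name(bank):
    """Generic directory name that does not depend on the bank type."""
    return f"bank{bank}"
```

The point of the sketch is that the directory name never depends on the
lookup succeeding, so an unrecognized bank type only affects the contents of
one file.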

I've seen this in other sysfs interfaces like cpuidle,
e.g. /sys/devices/system/cpu/cpu0/cpuidle/stateX

The "blockX/type" file is like the "stateX/desc" file. Alternatively, the
"type" file could be called "desc", since it's a description of what the bank
or block represents.

Here are a couple of examples:

/sys/devices/system/machinecheck/machinecheck0/
├── th_bank0
│   ├── type ("Instruction Fetch")
│   └── th_block0
│       ├── type ("All Errors")
│       ├── error_count
│       ├── interrupt_enable
│       └── threshold_limit
├── th_bank1
│   ├── type ("Northbridge")
│   ├── th_block0
│   │   ├── type ("DRAM Errors")
│   │   ├── error_count
│   │   ├── interrupt_enable
│   │   └── threshold_limit
│   └── th_block1
│       ├── type ("Link Errors")
│       ├── error_count
│       ├── interrupt_enable
│       └── threshold_limit
...

OR

/sys/devices/system/machinecheck/machinecheck0/thresholding
├── bank0
│   ├── desc ("Instruction Fetch")
│   └── block0
│       ├── desc ("All Errors")
│       ├── error_count
│       ├── interrupt_enable
│       └── threshold_limit
├── bank1
│   ├── desc ("Northbridge")
│   ├── block0
│   │   ├── desc ("DRAM Errors")
│   │   ├── error_count
│   │   ├── interrupt_enable
│   │   └── threshold_limit
│   └── block1
│       ├── desc ("Link Errors")
│       ├── error_count
│       ├── interrupt_enable
│       └── threshold_limit
...

I'm inclined toward the second option, since it keeps all the thresholding
functionality under a single directory.
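
As a sketch of what the second layout would mean for userspace, a tool could
enumerate banks and blocks by number and read "desc" for the human-readable
name, without knowing any bank type names in advance. This is only an
illustration in Python: the layout is the proposal above and does not exist
in any kernel, and the helper names (numbered_dirs, walk_thresholding) are
hypothetical. The root directory is passed in as a parameter.

```python
from pathlib import Path

# Per-block attribute files, as shown in the proposed layout.
ATTRS = ("error_count", "interrupt_enable", "threshold_limit")


def read_text(path):
    """Read a small sysfs-style file and strip the trailing newline."""
    return path.read_text().strip()


def numbered_dirs(parent, prefix):
    """Return subdirectories named <prefix><number>, sorted numerically."""
    found = []
    for child in parent.iterdir():
        suffix = child.name[len(prefix):]
        if child.is_dir() and child.name.startswith(prefix) and suffix.isdigit():
            found.append((int(suffix), child))
    return [child for _, child in sorted(found)]


def walk_thresholding(root):
    """Yield (bank, bank_desc, block, block_desc, attrs) for every block."""
    root = Path(root)
    for bank_dir in numbered_dirs(root, "bank"):
        bank_desc = read_text(bank_dir / "desc")
        for block_dir in numbered_dirs(bank_dir, "block"):
            attrs = {name: read_text(block_dir / name) for name in ATTRS}
            yield (bank_dir.name, bank_desc,
                   block_dir.name, read_text(block_dir / "desc"), attrs)
```

With the first option the same walk would work with "th_bank"/"th_block"
prefixes and "type" instead of "desc", but it would have to skip the other
entries that live directly under machinecheckN.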

What do you think?

Thanks,
Yazen


Thread overview: 7+ messages
2021-12-03  2:00 [PATCH 0/3] AMD SMCA Updates Yazen Ghannam
2021-12-03  2:00 ` [PATCH 1/3] x86/MCE/AMD: Provide an "Unknown" MCA bank type Yazen Ghannam
2021-12-03 22:17   ` Borislav Petkov
2021-12-07 16:28     ` Yazen Ghannam [this message]
2021-12-11 15:39       ` Borislav Petkov
2021-12-03  2:00 ` [PATCH 2/3] x86/MCE/AMD, EDAC/mce_amd: Add new SMCA Bank Types Yazen Ghannam
2021-12-03  2:00 ` [PATCH 3/3] x86/MCE/AMD, EDAC/mce_amd: Support non-uniform MCA bank type enumeration Yazen Ghannam