Linux-EDAC Archive on lore.kernel.org
 help / color / Atom feed
From: Yazen Ghannam <yazen.ghannam@amd.com>
To: Borislav Petkov <bp@alien8.de>
Cc: linux-edac@vger.kernel.org, linux-kernel@vger.kernel.org,
	tony.luck@intel.com, x86@kernel.org,
	Smita.KoralahalliChannabasappa@amd.com
Subject: Re: [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems
Date: Mon, 14 Sep 2020 14:20:39 -0500
Message-ID: <20200914192039.GA39519@yaz-nikka.amd.com> (raw)
In-Reply-To: <20200910101443.GC8357@zn.tnic>

On Thu, Sep 10, 2020 at 12:14:43PM +0200, Borislav Petkov wrote:
> On Wed, Sep 09, 2020 at 03:17:55PM -0500, Yazen Ghannam wrote:
> > We need to access specific instances of hardware registers in the
> > Northbridge or Data Fabric. The code in arch/x86/kernel/amd_nb.c does
> > this.
> 
> So you don't need the node_id - you need the northbridge/data fabric ID?
> I'm guessing NB == DF, i.e., it was NB before Zen and it is DF now.
> 
> Yes?
>

Yes, that's right.

I called it "node_id" based on the AMD documentation and what it's
called today in the Linux code. It's called other things like nb_id and
nid too.

I think we can call it something else to avoid confusion with NUMA nodes
if that'll help.

> > Package = Socket, i.e. a field replaceable unit. Socket may not be
> > useful for software, but I think it helps users identify the hardware.
> > 
> > I think the following could be changed in the documentation:
> > 
> > "In the past a socket always contained a single package (see below), but
> > with the advent of Multi Chip Modules (MCM) a socket can hold more than one
> > package."
> > 
> > Replace "package" with "die".
> 
> So first of all, we have:
> 
> "AMD nomenclature for package is 'Node'."
> 
> so we either change that because as you explain, node != package on AMD.
> 
> What you need is the ID of that northbridge or data fabric instance,
> AFAIU.
> 
> > You take multiple dies from the foundry and you "package" them together
> > into a single unit.
> 
> I think you're overloading the word "package" here and that leads to
> more confusion. Package in our definition - Linux' - is:
> 
> "Packages contain a number of cores plus shared resources, e.g. DRAM
> controller, shared caches etc." If you glue several packages together,
> you get an MCM.
> 

Yes, you're right. The AMD documentation is different, so I'll try to
stick with the Linux documentation and qualify names with "AMD" when
noting the usage by the AMD docs.

> > They could be equal depending on the system. The values are different on
> > MCM systems like Bulldozer and Naples though.
> > 
> > The functions and structures in amd_nb.c are indexed by the node_id.
> > This is done implicitly right now by using amd_get_nb_id()/cpu_llc_id.
> > But the LLC isn't always equal to the Node/Die like in Naples. So the
> > patches in this set save and explicitly use the node_id when needed.
> > 
> > What do you think?
> 
> Sounds to me that you want to ID that data fabric instance which
> logically belongs to one or multiple packages. Or can a DF a single
> package?
> 
> So let's start simple: how does a DF instance map to a logical NUMA
> node or package? Can a DF serve multiple packages?
> 

There's one DF/NB per package and it's a fixed value, i.e. it shouldn't
change based on the NUMA configuration.

Here's an example of a 2 socket Naples system with 4 packages per socket
and setup to have 1 NUMA node. The "node_id" value is the AMD NodeId
from CPUID.

CPU=0 phys_proc_id=0 node_id=0 cpu_to_node()=0
CPU=8 phys_proc_id=0 node_id=1 cpu_to_node()=0
CPU=16 phys_proc_id=0 node_id=2 cpu_to_node()=0
CPU=24 phys_proc_id=0 node_id=3 cpu_to_node()=0
CPU=32 phys_proc_id=1 node_id=4 cpu_to_node()=0
CPU=40 phys_proc_id=1 node_id=5 cpu_to_node()=0
CPU=48 phys_proc_id=1 node_id=6 cpu_to_node()=0
CPU=56 phys_proc_id=1 node_id=7 cpu_to_node()=0

> You could use the examples at the end of Documentation/x86/topology.rst
> to explain how those things play together. And remember to not think
> about the physical aspect of the hardware structure because it doesn't
> mean anything to software. All you wanna do is address the proper DF
> instance so this needs to be enumerable and properly represented by sw.
>

Yeah, I think example 4b works here. The mismatch though is with
phys_proc_id and package on AMD systems. You can see above that
phys_proc_id gives a socket number, and the AMD NodeId gives a package
number.

Should we add a note under cpuinfo_x86.phys_proc_id to make this
distinction?

> Confused?
> 
> I am.
> 
> :-)
>

Yeah, me too. :)

Thanks,
Yazen

  reply index

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-03 20:01 [PATCH v2 0/8] AMD MCA Address Translation Updates Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 1/8] x86/CPU/AMD: Save NodeId on AMD-based systems Yazen Ghannam
2020-09-09 18:06   ` Borislav Petkov
2020-09-09 20:17     ` Yazen Ghannam
2020-09-10 10:14       ` Borislav Petkov
2020-09-14 19:20         ` Yazen Ghannam [this message]
2020-09-15  8:35           ` Borislav Petkov
2020-09-16 19:51             ` Yazen Ghannam
2020-09-17 10:37               ` Borislav Petkov
2020-09-17 16:20                 ` Yazen Ghannam
2020-09-17 16:40                   ` Borislav Petkov
2020-09-17 19:44                     ` Yazen Ghannam
2020-09-17 20:10                       ` Borislav Petkov
2020-09-03 20:01 ` [PATCH v2 2/8] x86/CPU/AMD: Remove amd_get_nb_id() Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 3/8] EDAC/mce_amd: Use struct cpuinfo_x86.node_id for NodeId Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 4/8] x86/MCE/AMD: Use defines for register addresses in translation code Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 5/8] x86/MCE/AMD: Use macros to get bitfields " Yazen Ghannam
2020-09-21 13:58   ` Borislav Petkov
2020-09-03 20:01 ` [PATCH v2 6/8] x86/MCE/AMD: Drop tmp variable " Yazen Ghannam
2020-09-23  8:05   ` Borislav Petkov
2020-09-23 16:05     ` Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 7/8] x86/MCE/AMD: Group register reads " Yazen Ghannam
2020-09-03 20:01 ` [PATCH v2 8/8] x86/MCE/AMD Support new memory interleaving modes during address translation Yazen Ghannam
2020-09-23  8:20   ` Borislav Petkov
2020-09-23 16:25     ` Yazen Ghannam
2020-09-25  7:22       ` Borislav Petkov
2020-09-25 19:51         ` Yazen Ghannam
2020-09-28  9:47           ` Borislav Petkov
2020-09-28 15:53             ` Yazen Ghannam
2020-09-28 18:14               ` Borislav Petkov
2020-09-29 13:21                 ` Yazen Ghannam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200914192039.GA39519@yaz-nikka.amd.com \
    --to=yazen.ghannam@amd.com \
    --cc=Smita.KoralahalliChannabasappa@amd.com \
    --cc=bp@alien8.de \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tony.luck@intel.com \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Linux-EDAC Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linux-edac/0 linux-edac/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linux-edac linux-edac/ https://lore.kernel.org/linux-edac \
		linux-edac@vger.kernel.org
	public-inbox-index linux-edac

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.kernel.vger.linux-edac


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git