linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Muralidhara M K <muralimk@amd.com>
To: <linux-edac@vger.kernel.org>, <x86@kernel.org>
Cc: <linux-kernel@vger.kernel.org>, <bp@alien8.de>,
	<mchehab@kernel.org>, <yazen.ghannam@amd.com>, <nchatrad@amd.com>,
	Muralidhara M K <muralidhara.mk@amd.com>
Subject: [PATCH 1/5] x86/amd_nb: Add MI200 PCI IDs
Date: Mon, 15 May 2023 11:35:33 +0000	[thread overview]
Message-ID: <20230515113537.1052146-2-muralimk@amd.com> (raw)
In-Reply-To: <20230515113537.1052146-1-muralimk@amd.com>

From: Yazen Ghannam <yazen.ghannam@amd.com>

The AMD Instinct™ MI200 series accelerators are the data center GPUs.
The MI200 (Aldebaran) series of accelerator devices include Unified
Memory Controllers and a data fabric similar to those used in AMD x86
CPU products. The memory controllers report errors using MCA, though
these errors are generally handled through GPU drivers that directly
manage the accelerator device.

In some configurations, memory errors from these devices will be
reported through MCA and managed by x86 CPUs. The OS is expected to
handle these errors in similar fashion to MCA errors originating from
memory controllers on x86 CPUs. In Linux, this flow includes passing MCA
errors to a notifier chain that with handlers in the EDAC subsystem.

The AMD64 EDAC module requires information from the memory controllers
and data fabric in order to provide detailed decoding of memory errors.
The information is read from hardware registers accessed through
interfaces in the data fabric.

The accelerator data fabrics are visible to the host x86 CPUs as PCI
devices just like x86 CPU data fabrics are already. However, the
accelerator fabrics have new and unique PCI IDs.

Add PCI IDs for the MI200 (Aldebaran) series of accelerator devices in
order to enable EDAC support. The data fabrics of the accelerator
devices will be enumerated as any other fabric already supported.
System-specific implementation details will be handled within the AMD64
EDAC module.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
Co-developed-by: Muralidhara M K <muralidhara.mk@amd.com>
Signed-off-by: Muralidhara M K <muralidhara.mk@amd.com>
---
 arch/x86/kernel/amd_nb.c | 5 +++++
 include/linux/pci_ids.h  | 1 +
 2 files changed, 6 insertions(+)

diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c
index 7e331e8f3692..8fd955414b08 100644
--- a/arch/x86/kernel/amd_nb.c
+++ b/arch/x86/kernel/amd_nb.c
@@ -23,6 +23,7 @@
 #define PCI_DEVICE_ID_AMD_19H_M10H_ROOT	0x14a4
 #define PCI_DEVICE_ID_AMD_19H_M60H_ROOT	0x14d8
 #define PCI_DEVICE_ID_AMD_19H_M70H_ROOT	0x14e8
+#define PCI_DEVICE_ID_AMD_MI200_ROOT	0x14bb
 #define PCI_DEVICE_ID_AMD_17H_DF_F4	0x1464
 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F4 0x15ec
 #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F4 0x1494
@@ -37,6 +38,7 @@
 #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F4 0x14e4
 #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F4 0x14f4
 #define PCI_DEVICE_ID_AMD_19H_M78H_DF_F4 0x12fc
+#define PCI_DEVICE_ID_AMD_MI200_DF_F4	0x14d4
 
 /* Protect the PCI config register pairs used for SMN. */
 static DEFINE_MUTEX(smn_mutex);
@@ -53,6 +55,7 @@ static const struct pci_device_id amd_root_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M40H_ROOT) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_ROOT) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_ROOT) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_ROOT) },
 	{}
 };
 
@@ -81,6 +84,7 @@ static const struct pci_device_id amd_nb_misc_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M60H_DF_F3) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M70H_DF_F3) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M78H_DF_F3) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F3) },
 	{}
 };
 
@@ -101,6 +105,7 @@ static const struct pci_device_id amd_nb_link_ids[] = {
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M40H_DF_F4) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_19H_M50H_DF_F4) },
 	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) },
+	{ PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_MI200_DF_F4) },
 	{}
 };
 
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 95f33dadb2be..a99b1fcfc617 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -568,6 +568,7 @@
 #define PCI_DEVICE_ID_AMD_19H_M60H_DF_F3 0x14e3
 #define PCI_DEVICE_ID_AMD_19H_M70H_DF_F3 0x14f3
 #define PCI_DEVICE_ID_AMD_19H_M78H_DF_F3 0x12fb
+#define PCI_DEVICE_ID_AMD_MI200_DF_F3	0x14d3
 #define PCI_DEVICE_ID_AMD_CNB17H_F3	0x1703
 #define PCI_DEVICE_ID_AMD_LANCE		0x2000
 #define PCI_DEVICE_ID_AMD_LANCE_HOME	0x2001
-- 
2.25.1


  reply	other threads:[~2023-05-15 11:40 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-05-15 11:35 [PATCH 0/5] AMD64 EDAC GPU Updates Muralidhara M K
2023-05-15 11:35 ` Muralidhara M K [this message]
2023-05-31  9:42   ` [PATCH 1/5] x86/amd_nb: Add MI200 PCI IDs Borislav Petkov
2023-06-05 14:14     ` [tip: ras/core] x86/amd_nb: Re-sort and re-indent PCI defines tip-bot2 for Borislav Petkov (AMD)
2023-06-05 14:14   ` [tip: ras/core] x86/amd_nb: Add MI200 PCI IDs tip-bot2 for Yazen Ghannam
2023-05-15 11:35 ` [PATCH 2/5] x86/MCE/AMD, EDAC/mce_amd: Decode UMC_V2 ECC errors Muralidhara M K
2023-06-05 14:14   ` [tip: ras/core] " tip-bot2 for Yazen Ghannam
2023-05-15 11:35 ` [PATCH 3/5] EDAC/amd64: Document heterogeneous system enumeration Muralidhara M K
2023-06-05 14:14   ` [tip: ras/core] " tip-bot2 for Muralidhara M K
2023-05-15 11:35 ` [PATCH 4/5] EDAC/amd64: Add support for AMD heterogeneous Family 19h Model 30h-3Fh Muralidhara M K
2023-06-05 14:14   ` [tip: ras/core] " tip-bot2 for Muralidhara M K
2023-05-15 11:35 ` [PATCH 5/5] EDAC/amd64: Cache and use GPU node map Muralidhara M K
2023-06-05 14:14   ` [tip: ras/core] " tip-bot2 for Yazen Ghannam

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20230515113537.1052146-2-muralimk@amd.com \
    --to=muralimk@amd.com \
    --cc=bp@alien8.de \
    --cc=linux-edac@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mchehab@kernel.org \
    --cc=muralidhara.mk@amd.com \
    --cc=nchatrad@amd.com \
    --cc=x86@kernel.org \
    --cc=yazen.ghannam@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).