* [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates
@ 2021-12-15 15:53 Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam
Hi all,
This set adds support for AMD Family 19h Models 10h-1Fh and A0h-AFh.
This second revision includes patches 3 and 4 from the first. Patches 1
and 2 from the first set were applied.
This set is based on the following branches:
- tip/master
- ras/edac-for-next
- groeck/linux-staging/hwmon-next
The following commit in hwmon-next is needed for functional support of
this set.
49e90c39d0be ("x86/amd_nb: Add AMD Family 19h Models (10h-1Fh) and (A0h-AFh) PCI IDs")
Patch 1 is a small fixup on how some registers are checked. The patch is
functionally the same as patch 3 in set 1, but the commit message
updated to give more details.
Patch 2 adds register offset and other minor changes introduced with
these new models. The patch is updated to use a different name for a new
flag (thanks Boris for the suggestion).
Thanks,
Yazen
Yazen Ghannam (2):
EDAC/amd64: Check register values from all UMCs
EDAC/amd64: Add new register offset support and related changes
drivers/edac/amd64_edac.c | 70 ++++++++++++++++++++++++++++++++++-----
drivers/edac/amd64_edac.h | 14 ++++++++
2 files changed, 76 insertions(+), 8 deletions(-)
--
2.25.1
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
@ 2021-12-15 15:53 ` Yazen Ghannam
2021-12-15 18:01 ` Borislav Petkov
2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov
2 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam
The initial support for Unified Memory Controllers (UMCs) was added to
AMD64 EDAC for the first generation of Zen systems. These systems have
two UMCs per Die, and the code originally assumed two UMCs in various
places.
Later systems have more than two UMCs, and this assumption was fixed in
the following commits.
commit bdcee7747f5c ("EDAC/amd64: Support more than two Unified Memory Controllers")
commit d971e28e2ce4 ("EDAC/amd64: Support more than two controllers for chip selects handling")
However, the determine_memory_type() function was missed in these
changes, and two UMCs are still assumed.
Update determine_memory_type() to account for all UMCs when checking the
register values.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Link:
https://lkml.kernel.org/r/20211208174356.1997855-4-yazen.ghannam@amd.com
v1->v2:
* Was patch 3 in v1.
* Update commit message.
drivers/edac/amd64_edac.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ff29267e46a6..1df763128483 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1621,9 +1621,16 @@ static void determine_memory_type(struct amd64_pvt *pvt)
u32 dram_ctrl, dcsm;
if (pvt->umc) {
- if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
+ u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
+
+ for_each_umc(i) {
+ umc_cfg |= pvt->umc[i].umc_cfg;
+ dimm_cfg |= pvt->umc[i].dimm_cfg;
+ }
+
+ if (dimm_cfg & BIT(5))
pvt->dram_type = MEM_LRDDR4;
- else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
+ else if (dimm_cfg & BIT(4))
pvt->dram_type = MEM_RDDR4;
else
pvt->dram_type = MEM_DDR4;
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
@ 2021-12-15 15:53 ` Yazen Ghannam
2021-12-15 16:32 ` William Roche
2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov
2 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
To: linux-edac
Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam
Introduce a "family flags" bitmask that can be used to indicate any
special behavior needed on a per-family basis.
Add a flag to indicate a system uses the new register offsets introduced
with Family 19h Model 10h.
Use this flag to account for register offset changes, a new bitfield
indicating DDR5 use on a memory controller, and to set the proper number
of chip select masks.
Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Link:
https://lkml.kernel.org/r/20211208174356.1997855-5-yazen.ghannam@amd.com
v1->v2:
* Was patch 4 in v1.
* Change "has_ddr5" flag to "zn_regs_v2".
* Drop flag check helper function.
* Update determine_memory_type() to check bitfield for DDR5.
* Update code comments.
drivers/edac/amd64_edac.c | 59 +++++++++++++++++++++++++++++++++++----
drivers/edac/amd64_edac.h | 14 ++++++++++
2 files changed, 67 insertions(+), 6 deletions(-)
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 1df763128483..b7dd87636155 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -15,6 +15,31 @@ static struct msr __percpu *msrs;
static struct amd64_family_type *fam_type;
+/* Family flag helpers */
+static inline u64 get_addr_cfg(void)
+{
+ if (fam_type->flags.zn_regs_v2)
+ return UMCCH_ADDR_CFG_DDR5;
+
+ return UMCCH_ADDR_CFG;
+}
+
+static inline u64 get_addr_mask_sec(void)
+{
+ if (fam_type->flags.zn_regs_v2)
+ return UMCCH_ADDR_MASK_SEC_DDR5;
+
+ return UMCCH_ADDR_MASK_SEC;
+}
+
+static inline u64 get_dimm_cfg(void)
+{
+ if (fam_type->flags.zn_regs_v2)
+ return UMCCH_DIMM_CFG_DDR5;
+
+ return UMCCH_DIMM_CFG;
+}
+
/* Per-node stuff */
static struct ecc_settings **ecc_stngs;
@@ -1429,8 +1454,10 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
edac_dbg(1, "UMC%d x16 DIMMs present: %s\n",
i, (umc->dimm_cfg & BIT(7)) ? "yes" : "no");
- if (pvt->dram_type == MEM_LRDDR4) {
- amd_smn_read(pvt->mc_node_id, umc_base + UMCCH_ADDR_CFG, &tmp);
+ if (pvt->dram_type == MEM_LRDDR4 || pvt->dram_type == MEM_LRDDR5) {
+ amd_smn_read(pvt->mc_node_id,
+ umc_base + get_addr_cfg(),
+ &tmp);
edac_dbg(1, "UMC%d LRDIMM %dx rank multiply\n",
i, 1 << ((tmp >> 4) & 0x3));
}
@@ -1505,7 +1532,7 @@ static void prep_chip_selects(struct amd64_pvt *pvt)
for_each_umc(umc) {
pvt->csels[umc].b_cnt = 4;
- pvt->csels[umc].m_cnt = 2;
+ pvt->csels[umc].m_cnt = fam_type->flags.zn_regs_v2 ? 4 : 2;
}
} else {
@@ -1545,7 +1572,7 @@ static void read_umc_base_mask(struct amd64_pvt *pvt)
}
umc_mask_reg = get_umc_base(umc) + UMCCH_ADDR_MASK;
- umc_mask_reg_sec = get_umc_base(umc) + UMCCH_ADDR_MASK_SEC;
+ umc_mask_reg_sec = get_umc_base(umc) + get_addr_mask_sec();
for_each_chip_select_mask(cs, umc, pvt) {
mask = &pvt->csels[umc].csmasks[cs];
@@ -1628,6 +1655,20 @@ static void determine_memory_type(struct amd64_pvt *pvt)
dimm_cfg |= pvt->umc[i].dimm_cfg;
}
+ /*
+ * Check if the system supports the "DDR Type" field in UMC Config
+ * and has DDR5 DIMMs in use.
+ */
+ if (fam_type->flags.zn_regs_v2 && ((umc_cfg & GENMASK(2, 0)) == 0x1)) {
+ if (dimm_cfg & BIT(5))
+ pvt->dram_type = MEM_LRDDR5;
+ else if (dimm_cfg & BIT(4))
+ pvt->dram_type = MEM_RDDR5;
+ else
+ pvt->dram_type = MEM_DDR5;
+ return;
+ }
+
if (dimm_cfg & BIT(5))
pvt->dram_type = MEM_LRDDR4;
else if (dimm_cfg & BIT(4))
@@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
* There is one mask per DIMM, and two Chip Selects per DIMM.
* CS0 and CS1 -> DIMM0
* CS2 and CS3 -> DIMM1
+ *
+ * Systems with newer register layout have one mask per Chip Select.
*/
- dimm = csrow_nr >> 1;
+ if (fam_type->flags.zn_regs_v2)
+ dimm = csrow_nr;
+ else
+ dimm = csrow_nr >> 1;
/* Asymmetric dual-rank DIMM support. */
if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
@@ -2937,6 +2983,7 @@ static struct amd64_family_type family_types[] = {
.f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
.f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
.max_mcs = 12,
+ .flags.zn_regs_v2 = 1,
.ops = {
.early_channel_count = f17_early_channel_count,
.dbam_to_cs = f17_addr_mask_to_cs_size,
@@ -3365,7 +3412,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
umc_base = get_umc_base(i);
umc = &pvt->umc[i];
- amd_smn_read(nid, umc_base + UMCCH_DIMM_CFG, &umc->dimm_cfg);
+ amd_smn_read(nid, umc_base + get_dimm_cfg(), &umc->dimm_cfg);
amd_smn_read(nid, umc_base + UMCCH_UMC_CFG, &umc->umc_cfg);
amd_smn_read(nid, umc_base + UMCCH_SDP_CTRL, &umc->sdp_ctrl);
amd_smn_read(nid, umc_base + UMCCH_ECC_CTRL, &umc->ecc_ctrl);
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 650cab401e21..39ecb77873db 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -271,8 +271,11 @@
#define UMCCH_BASE_ADDR_SEC 0x10
#define UMCCH_ADDR_MASK 0x20
#define UMCCH_ADDR_MASK_SEC 0x28
+#define UMCCH_ADDR_MASK_SEC_DDR5 0x30
#define UMCCH_ADDR_CFG 0x30
+#define UMCCH_ADDR_CFG_DDR5 0x40
#define UMCCH_DIMM_CFG 0x80
+#define UMCCH_DIMM_CFG_DDR5 0x90
#define UMCCH_UMC_CFG 0x100
#define UMCCH_SDP_CTRL 0x104
#define UMCCH_ECC_CTRL 0x14C
@@ -477,11 +480,22 @@ struct low_ops {
unsigned cs_mode, int cs_mask_nr);
};
+struct amd64_family_flags {
+ /*
+ * Indicates that the system supports the new register offsets, etc.
+ * first introduced with Family 19h Model 10h.
+ */
+ __u64 zn_regs_v2 : 1,
+
+ __reserved : 63;
+};
+
struct amd64_family_type {
const char *ctl_name;
u16 f0_id, f1_id, f2_id, f6_id;
/* Maximum number of memory controllers per die/node. */
u8 max_mcs;
+ struct amd64_family_flags flags;
struct low_ops ops;
};
--
2.25.1
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
@ 2021-12-15 16:32 ` William Roche
2021-12-15 18:07 ` Borislav Petkov
0 siblings, 1 reply; 13+ messages in thread
From: William Roche @ 2021-12-15 16:32 UTC (permalink / raw)
To: Yazen Ghannam, linux-edac
Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa
On 15/12/2021 16:53, Yazen Ghannam wrote:
> Introduce a "family flags" bitmask that can be used to indicate any
> special behavior needed on a per-family basis.
>
> Add a flag to indicate a system uses the new register offsets introduced
> with Family 19h Model 10h.
>
> Use this flag to account for register offset changes, a new bitfield
> indicating DDR5 use on a memory controller, and to set the proper number
> of chip select masks.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> ---
> Link:
> https://lkml.kernel.org/r/20211208174356.1997855-5-yazen.ghannam@amd.com
>
> v1->v2:
> * Was patch 4 in v1.
> * Change "has_ddr5" flag to "zn_regs_v2".
> * Drop flag check helper function.
> * Update determine_memory_type() to check bitfield for DDR5.
> * Update code comments.
>
> drivers/edac/amd64_edac.c | 59 +++++++++++++++++++++++++++++++++++----
> drivers/edac/amd64_edac.h | 14 ++++++++++
> 2 files changed, 67 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 1df763128483..b7dd87636155 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -15,6 +15,31 @@ static struct msr __percpu *msrs;
>
> static struct amd64_family_type *fam_type;
>
> +/* Family flag helpers */
> +static inline u64 get_addr_cfg(void)
> +{
> + if (fam_type->flags.zn_regs_v2)
> + return UMCCH_ADDR_CFG_DDR5;
> +
> + return UMCCH_ADDR_CFG;
> +}
> +
> +static inline u64 get_addr_mask_sec(void)
> +{
> + if (fam_type->flags.zn_regs_v2)
> + return UMCCH_ADDR_MASK_SEC_DDR5;
> +
> + return UMCCH_ADDR_MASK_SEC;
> +}
> +
> +static inline u64 get_dimm_cfg(void)
> +{
> + if (fam_type->flags.zn_regs_v2)
> + return UMCCH_DIMM_CFG_DDR5;
> +
> + return UMCCH_DIMM_CFG;
> +}
> +
> /* Per-node stuff */
> static struct ecc_settings **ecc_stngs;
>
> @@ -1429,8 +1454,10 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
> edac_dbg(1, "UMC%d x16 DIMMs present: %s\n",
> i, (umc->dimm_cfg & BIT(7)) ? "yes" : "no");
>
> - if (pvt->dram_type == MEM_LRDDR4) {
> - amd_smn_read(pvt->mc_node_id, umc_base + UMCCH_ADDR_CFG, &tmp);
> + if (pvt->dram_type == MEM_LRDDR4 || pvt->dram_type == MEM_LRDDR5) {
> + amd_smn_read(pvt->mc_node_id,
> + umc_base + get_addr_cfg(),
> + &tmp);
> edac_dbg(1, "UMC%d LRDIMM %dx rank multiply\n",
> i, 1 << ((tmp >> 4) & 0x3));
> }
> @@ -1505,7 +1532,7 @@ static void prep_chip_selects(struct amd64_pvt *pvt)
>
> for_each_umc(umc) {
> pvt->csels[umc].b_cnt = 4;
> - pvt->csels[umc].m_cnt = 2;
> + pvt->csels[umc].m_cnt = fam_type->flags.zn_regs_v2 ? 4 : 2;
> }
>
> } else {
> @@ -1545,7 +1572,7 @@ static void read_umc_base_mask(struct amd64_pvt *pvt)
> }
>
> umc_mask_reg = get_umc_base(umc) + UMCCH_ADDR_MASK;
> - umc_mask_reg_sec = get_umc_base(umc) + UMCCH_ADDR_MASK_SEC;
> + umc_mask_reg_sec = get_umc_base(umc) + get_addr_mask_sec();
>
> for_each_chip_select_mask(cs, umc, pvt) {
> mask = &pvt->csels[umc].csmasks[cs];
> @@ -1628,6 +1655,20 @@ static void determine_memory_type(struct amd64_pvt *pvt)
> dimm_cfg |= pvt->umc[i].dimm_cfg;
> }
>
> + /*
> + * Check if the system supports the "DDR Type" field in UMC Config
> + * and has DDR5 DIMMs in use.
> + */
> + if (fam_type->flags.zn_regs_v2 && ((umc_cfg & GENMASK(2, 0)) == 0x1)) {
> + if (dimm_cfg & BIT(5))
> + pvt->dram_type = MEM_LRDDR5;
> + else if (dimm_cfg & BIT(4))
> + pvt->dram_type = MEM_RDDR5;
> + else
> + pvt->dram_type = MEM_DDR5;
> + return;
> + }
> +
> if (dimm_cfg & BIT(5))
> pvt->dram_type = MEM_LRDDR4;
> else if (dimm_cfg & BIT(4))
> @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> * There is one mask per DIMM, and two Chip Selects per DIMM.
> * CS0 and CS1 -> DIMM0
> * CS2 and CS3 -> DIMM1
> + *
> + * Systems with newer register layout have one mask per Chip Select.
Just a question about this comment: Can it be translated into this ?
+ * Except on systems with newer register layout where we have one Chip Select per DIMM.
> */
> - dimm = csrow_nr >> 1;
> + if (fam_type->flags.zn_regs_v2)
> + dimm = csrow_nr;
> + else
> + dimm = csrow_nr >> 1;
>
> /* Asymmetric dual-rank DIMM support. */
> if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
> @@ -2937,6 +2983,7 @@ static struct amd64_family_type family_types[] = {
> .f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
> .f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
> .max_mcs = 12,
> + .flags.zn_regs_v2 = 1,
> .ops = {
> .early_channel_count = f17_early_channel_count,
> .dbam_to_cs = f17_addr_mask_to_cs_size,
> @@ -3365,7 +3412,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
> umc_base = get_umc_base(i);
> umc = &pvt->umc[i];
>
> - amd_smn_read(nid, umc_base + UMCCH_DIMM_CFG, &umc->dimm_cfg);
> + amd_smn_read(nid, umc_base + get_dimm_cfg(), &umc->dimm_cfg);
> amd_smn_read(nid, umc_base + UMCCH_UMC_CFG, &umc->umc_cfg);
> amd_smn_read(nid, umc_base + UMCCH_SDP_CTRL, &umc->sdp_ctrl);
> amd_smn_read(nid, umc_base + UMCCH_ECC_CTRL, &umc->ecc_ctrl);
> diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> index 650cab401e21..39ecb77873db 100644
> --- a/drivers/edac/amd64_edac.h
> +++ b/drivers/edac/amd64_edac.h
> @@ -271,8 +271,11 @@
> #define UMCCH_BASE_ADDR_SEC 0x10
> #define UMCCH_ADDR_MASK 0x20
> #define UMCCH_ADDR_MASK_SEC 0x28
> +#define UMCCH_ADDR_MASK_SEC_DDR5 0x30
> #define UMCCH_ADDR_CFG 0x30
> +#define UMCCH_ADDR_CFG_DDR5 0x40
> #define UMCCH_DIMM_CFG 0x80
> +#define UMCCH_DIMM_CFG_DDR5 0x90
> #define UMCCH_UMC_CFG 0x100
> #define UMCCH_SDP_CTRL 0x104
> #define UMCCH_ECC_CTRL 0x14C
> @@ -477,11 +480,22 @@ struct low_ops {
> unsigned cs_mode, int cs_mask_nr);
> };
>
> +struct amd64_family_flags {
> + /*
> + * Indicates that the system supports the new register offsets, etc.
> + * first introduced with Family 19h Model 10h.
> + */
> + __u64 zn_regs_v2 : 1,
> +
> + __reserved : 63;
> +};
> +
> struct amd64_family_type {
> const char *ctl_name;
> u16 f0_id, f1_id, f2_id, f6_id;
> /* Maximum number of memory controllers per die/node. */
> u8 max_mcs;
> + struct amd64_family_flags flags;
> struct low_ops ops;
> };
>
Thanks in advance,
William.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates
2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
@ 2021-12-15 17:53 ` Borislav Petkov
2 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 17:53 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche
On Wed, Dec 15, 2021 at 03:53:07PM +0000, Yazen Ghannam wrote:
> The following commit in hwmon-next is needed for functional support of
> this set.
>
> 49e90c39d0be ("x86/amd_nb: Add AMD Family 19h Models (10h-1Fh) and (A0h-AFh) PCI IDs")
I'd normally want do cross-merge that one so that stuff can get tested
but am being told that those models are not shipping yet so we're not
going to break anything even if those go separately Linus-wards.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
@ 2021-12-15 18:01 ` Borislav Petkov
2021-12-16 16:08 ` Yazen Ghannam
0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 18:01 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche
On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote:
> - if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
> + u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
> +
> + for_each_umc(i) {
> + umc_cfg |= pvt->umc[i].umc_cfg;
> + dimm_cfg |= pvt->umc[i].dimm_cfg;
> + }
> +
> + if (dimm_cfg & BIT(5))
> pvt->dram_type = MEM_LRDDR4;
> - else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
> + else if (dimm_cfg & BIT(4))
You're working here under the assumption that bit 4 and 5 will have the
same value on all those UMCs.
You're probably going to say that that is how the BIOS is programming
them so they should be all the same and any other configuration is
invalid but lemme still ask about it explicitly.
And if so, this would probably need a comment above it which I can add
when applying...
Hmm?
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-15 16:32 ` William Roche
@ 2021-12-15 18:07 ` Borislav Petkov
2021-12-16 15:46 ` Yazen Ghannam
0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 18:07 UTC (permalink / raw)
To: William Roche
Cc: Yazen Ghannam, linux-edac, linux-kernel, mchehab, tony.luck,
james.morse, rric, Smita.KoralahalliChannabasappa
On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
> > @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> > * There is one mask per DIMM, and two Chip Selects per DIMM.
> > * CS0 and CS1 -> DIMM0
> > * CS2 and CS3 -> DIMM1
> > + *
> > + * Systems with newer register layout have one mask per Chip Select.
>
> Just a question about this comment: Can it be translated into this ?
>
> + * Except on systems with newer register layout where we have one Chip Select per DIMM.
Sure, but without the "we":
...
* On systems with the newer register layout there is one Chip Select per DIMM.
*/
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-15 18:07 ` Borislav Petkov
@ 2021-12-16 15:46 ` Yazen Ghannam
2021-12-16 18:43 ` William Roche
0 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 15:46 UTC (permalink / raw)
To: Borislav Petkov
Cc: William Roche, linux-edac, linux-kernel, mchehab, tony.luck,
james.morse, rric, Smita.KoralahalliChannabasappa
On Wed, Dec 15, 2021 at 07:07:17PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
> > > @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> > > * There is one mask per DIMM, and two Chip Selects per DIMM.
> > > * CS0 and CS1 -> DIMM0
> > > * CS2 and CS3 -> DIMM1
> > > + *
> > > + * Systems with newer register layout have one mask per Chip Select.
> >
> > Just a question about this comment: Can it be translated into this ?
> >
> > + * Except on systems with newer register layout where we have one Chip Select per DIMM.
>
> Sure, but without the "we":
>
> ...
> * On systems with the newer register layout there is one Chip Select per DIMM.
> */
>
Hi William,
Thanks for the suggestion, but it's not quite correct.
There are still two Chip Selects per DIMM module, i.e. the system can support
dual-rank (2R) DIMMs. Current AMD systems can support upto 2 DIMMs per Unified
Memory Controller (UMC). There are two "Address Mask" registers in each UMC,
and each register covers an entire DIMM (and by extension the two Chip Selects
available for each DIMM).
Future systems will still support upto 2 DIMMs per UMC. However, the register
space is updated so that there are now four "Address Mask" registers per UMC.
And each of these registers is now explicitly related to one of the four Chip
Selects available per UMC.
Does this help? I can update the code comments with these details.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
2021-12-15 18:01 ` Borislav Petkov
@ 2021-12-16 16:08 ` Yazen Ghannam
2021-12-30 11:36 ` Borislav Petkov
0 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 16:08 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche
On Wed, Dec 15, 2021 at 07:01:22PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote:
> > - if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
> > + u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
> > +
> > + for_each_umc(i) {
> > + umc_cfg |= pvt->umc[i].umc_cfg;
> > + dimm_cfg |= pvt->umc[i].dimm_cfg;
> > + }
> > +
> > + if (dimm_cfg & BIT(5))
> > pvt->dram_type = MEM_LRDDR4;
> > - else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
> > + else if (dimm_cfg & BIT(4))
>
> You're working here under the assumption that bit 4 and 5 will have the
> same value on all those UMCs.
>
> You're probably going to say that that is how the BIOS is programming
> them so they should be all the same and any other configuration is
> invalid but lemme still ask about it explicitly.
>
> And if so, this would probably need a comment above it which I can add
> when applying...
>
> Hmm?
>
No, that's a good question. And actually the assumption is incorrect. It is
allowed to have different DIMM types in a system though all DIMMs on a single
UMC must match.
Do you recommend a follow up patch or should this one be reworked?
Thanks,
Yazen
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-16 15:46 ` Yazen Ghannam
@ 2021-12-16 18:43 ` William Roche
2021-12-16 19:21 ` Yazen Ghannam
0 siblings, 1 reply; 13+ messages in thread
From: William Roche @ 2021-12-16 18:43 UTC (permalink / raw)
To: Yazen Ghannam, Borislav Petkov
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa
On 16/12/2021 16:46, Yazen Ghannam wrote:
> On Wed, Dec 15, 2021 at 07:07:17PM +0100, Borislav Petkov wrote:
>> On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
>>>> @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
>>>> * There is one mask per DIMM, and two Chip Selects per DIMM.
>>>> * CS0 and CS1 -> DIMM0
>>>> * CS2 and CS3 -> DIMM1
>>>> + *
>>>> + * Systems with newer register layout have one mask per Chip Select.
>>> Just a question about this comment: Can it be translated into this ?
>>>
>>> + * Except on systems with newer register layout where we have one Chip Select per DIMM.
>> Sure, but without the "we":
>>
>> ...
>> * On systems with the newer register layout there is one Chip Select per DIMM.
>> */
>>
> Hi William,
> Thanks for the suggestion, but it's not quite correct.
That's exactly what I wanted to know. Thanks.
>
> There are still two Chip Selects per DIMM module, i.e. the system can support
> dual-rank (2R) DIMMs. Current AMD systems can support upto 2 DIMMs per Unified
> Memory Controller (UMC). There are two "Address Mask" registers in each UMC,
> and each register covers an entire DIMM (and by extension the two Chip Selects
> available for each DIMM).
>
> Future systems will still support upto 2 DIMMs per UMC. However, the register
> space is updated so that there are now four "Address Mask" registers per UMC.
> And each of these registers is now explicitly related to one of the four Chip
> Selects available per UMC.
From what I understand, future systems would still support the same
number of dimms per UMC (2), the same number of Chip Select (2 per
dimm), the only thing that changes is the number of Address Mask
registers (going from 2 per UMC to 4 per UMC).
So I'm confused, we deduce 'dimm' from csrow_nr, which would be in fact
the Chip Select *masks* number (cs_mask_nr from the dbam_to_cs signature
in struct low_ops), so why are we saying and dimm=csrow_nr in the case
of the new layout, but dimm = csrow_nr / 2 in the case on the standard
layout ?
Should we indicate what this 'dimm' value really is ?
Sorry if I'm missing something very obvious here.
Thanks,
William.
> Does this help? I can update the code comments with these details.
>
> Thanks,
> Yazen
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
2021-12-16 18:43 ` William Roche
@ 2021-12-16 19:21 ` Yazen Ghannam
0 siblings, 0 replies; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 19:21 UTC (permalink / raw)
To: William Roche
Cc: Borislav Petkov, linux-edac, linux-kernel, mchehab, tony.luck,
james.morse, rric, Smita.KoralahalliChannabasappa
On Thu, Dec 16, 2021 at 07:43:55PM +0100, William Roche wrote:
...
> From what I understand, future systems would still support the same number
> of dimms per UMC (2), the same number of Chip Select (2 per dimm), the only
> thing that changes is the number of Address Mask registers (going from 2 per
> UMC to 4 per UMC).
>
> So I'm confused, we deduce 'dimm' from csrow_nr, which would be in fact the
> Chip Select *masks* number (cs_mask_nr from the dbam_to_cs signature in
> struct low_ops), so why are we saying and dimm=csrow_nr in the case of the
> new layout, but dimm = csrow_nr / 2 in the case on the standard layout ?
>
> Should we indicate what this 'dimm' value really is ?
>
> Sorry if I'm missing something very obvious here.
>
That's fair. I can rework the patch to explicitly differentiate between "dimm"
and "cs_mask_nr" here.
I think this would resolve an issue in a later debug print statement that
includes the csrow_nr and dimm.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
2021-12-16 16:08 ` Yazen Ghannam
@ 2021-12-30 11:36 ` Borislav Petkov
2022-01-05 16:12 ` Yazen Ghannam
0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-30 11:36 UTC (permalink / raw)
To: Yazen Ghannam
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche
On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote:
> No, that's a good question. And actually the assumption is incorrect. It is
> allowed to have different DIMM types in a system though all DIMMs on a single
> UMC must match.
Oh fun, really?!
So a single system can have DDR4 *and* DDR5 on the same board?!
So then that
pvt->dram_type
is insufficient to store the DIMM type for a pvt. If you have multiple
UMCs on a pvt and all have different type DIMMs, then you need the
relevant DIMM type when you dump it in sysfs...
Which then means, you need ->dram_type to be per UMC...
And also, I'm assuming the hw already enforces that DIMMs on a single
UMC must match - it simply won't boot if they don't so you don't have to
enforce that, at least.
> Do you recommend a follow up patch or should this one be reworked?
This one is insufficient, I'm afraid.
One way to address this is, you could use pvt->umc at the places where
dram_type is used and assign directly to the dimm->mtype thing. But then
you'd need a way to map each struct dimm_info *dimm to the UMC so that
you can determine the correct DIMM type.
Which would make pvt->dram_type redundant and can be removed.
Or you might have a better idea...
HTH.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
2021-12-30 11:36 ` Borislav Petkov
@ 2022-01-05 16:12 ` Yazen Ghannam
0 siblings, 0 replies; 13+ messages in thread
From: Yazen Ghannam @ 2022-01-05 16:12 UTC (permalink / raw)
To: Borislav Petkov
Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
Smita.KoralahalliChannabasappa, william.roche
On Thu, Dec 30, 2021 at 12:36:44PM +0100, Borislav Petkov wrote:
> On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote:
> > No, that's a good question. And actually the assumption is incorrect. It is
> > allowed to have different DIMM types in a system though all DIMMs on a single
> > UMC must match.
>
> Oh fun, really?!
>
> So a single system can have DDR4 *and* DDR5 on the same board?!
>
Well, I don't know about that specifically. There are some restrictions, but
you could have UDIMMs and RDIMMs of the same generation, at least.
> So then that
>
> pvt->dram_type
>
> is insufficient to store the DIMM type for a pvt. If you have multiple
> UMCs on a pvt and all have different type DIMMs, then you need the
> relevant DIMM type when you dump it in sysfs...
>
> Which then means, you need ->dram_type to be per UMC...
>
> And also, I'm assuming the hw already enforces that DIMMs on a single
> UMC must match - it simply won't boot if they don't so you don't have to
> enforce that, at least.
>
> > Do you recommend a follow up patch or should this one be reworked?
>
> This one is insufficient, I'm afraid.
>
> One way to address this is, you could use pvt->umc at the places where
> dram_type is used and assign directly to the dimm->mtype thing. But then
> you'd need a way to map each struct dimm_info *dimm to the UMC so that
> you can determine the correct DIMM type.
>
I did send a patch that did something like this.
https://lkml.kernel.org/r/20211228200615.412999-2-yazen.ghannam@amd.com
Though this got a build warning report, so I need to follow up on that.
> Which would make pvt->dram_type redundant and can be removed.
>
I kept this so as to not break legacy systems. But I'll look at it again. I
think you may be right.
Thanks,
Yazen
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2022-01-05 16:12 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
2021-12-15 18:01 ` Borislav Petkov
2021-12-16 16:08 ` Yazen Ghannam
2021-12-30 11:36 ` Borislav Petkov
2022-01-05 16:12 ` Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
2021-12-15 16:32 ` William Roche
2021-12-15 18:07 ` Borislav Petkov
2021-12-16 15:46 ` Yazen Ghannam
2021-12-16 18:43 ` William Roche
2021-12-16 19:21 ` Yazen Ghannam
2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).