linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates
@ 2021-12-15 15:53 Yazen Ghannam
  2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
                   ` (2 more replies)
  0 siblings, 3 replies; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam

Hi all,

This set adds support for AMD Family 19h Models 10h-1Fh and A0h-AFh.

This second revision includes patches 3 and 4 from the first. Patches 1
and 2 from the first set were applied.

This set is based on the following branches:
  - tip/master
  - ras/edac-for-next
  - groeck/linux-staging/hwmon-next

The following commit in hwmon-next is needed for functional support of
this set.

  49e90c39d0be ("x86/amd_nb: Add AMD Family 19h Models (10h-1Fh) and (A0h-AFh) PCI IDs")

Patch 1 is a small fixup on how some registers are checked. The patch is
functionally the same as patch 3 in set 1, but the commit message
updated to give more details.

Patch 2 adds register offset and other minor changes introduced with
these new models. The patch is updated to use a different name for a new
flag (thanks Boris for the suggestion).

Thanks,
Yazen

Yazen Ghannam (2):
  EDAC/amd64: Check register values from all UMCs
  EDAC/amd64: Add new register offset support and related changes

 drivers/edac/amd64_edac.c | 70 ++++++++++++++++++++++++++++++++++-----
 drivers/edac/amd64_edac.h | 14 ++++++++
 2 files changed, 76 insertions(+), 8 deletions(-)

-- 
2.25.1


^ permalink raw reply	[flat|nested] 13+ messages in thread

* [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
  2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
@ 2021-12-15 15:53 ` Yazen Ghannam
  2021-12-15 18:01   ` Borislav Petkov
  2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
  2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov
  2 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam

The initial support for Unified Memory Controllers (UMCs) was added to
AMD64 EDAC for the first generation of Zen systems. These systems have
two UMCs per Die, and the code originally assumed two UMCs in various
places.

Later systems have more than two UMCs, and this assumption was fixed in
the following commits.

commit bdcee7747f5c ("EDAC/amd64: Support more than two Unified Memory Controllers")
commit d971e28e2ce4 ("EDAC/amd64: Support more than two controllers for chip selects handling")

However, the determine_memory_type() function was missed in these
changes, and two UMCs are still assumed.

Update determine_memory_type() to account for all UMCs when checking the
register values.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Link:
https://lkml.kernel.org/r/20211208174356.1997855-4-yazen.ghannam@amd.com

v1->v2:
* Was patch 3 in v1.
* Update commit message.

 drivers/edac/amd64_edac.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ff29267e46a6..1df763128483 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1621,9 +1621,16 @@ static void determine_memory_type(struct amd64_pvt *pvt)
 	u32 dram_ctrl, dcsm;
 
 	if (pvt->umc) {
-		if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
+		u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
+
+		for_each_umc(i) {
+			umc_cfg  |= pvt->umc[i].umc_cfg;
+			dimm_cfg |= pvt->umc[i].dimm_cfg;
+		}
+
+		if (dimm_cfg & BIT(5))
 			pvt->dram_type = MEM_LRDDR4;
-		else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
+		else if (dimm_cfg & BIT(4))
 			pvt->dram_type = MEM_RDDR4;
 		else
 			pvt->dram_type = MEM_DDR4;
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
  2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
@ 2021-12-15 15:53 ` Yazen Ghannam
  2021-12-15 16:32   ` William Roche
  2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov
  2 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-15 15:53 UTC (permalink / raw)
  To: linux-edac
  Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche, Yazen Ghannam

Introduce a "family flags" bitmask that can be used to indicate any
special behavior needed on a per-family basis.

Add a flag to indicate a system uses the new register offsets introduced
with Family 19h Model 10h.

Use this flag to account for register offset changes, a new bitfield
indicating DDR5 use on a memory controller, and to set the proper number
of chip select masks.

Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
---
Link:
https://lkml.kernel.org/r/20211208174356.1997855-5-yazen.ghannam@amd.com

v1->v2:
* Was patch 4 in v1.
* Change "has_ddr5" flag to "zn_regs_v2".
* Drop flag check helper function.
* Update determine_memory_type() to check bitfield for DDR5.
* Update code comments.

 drivers/edac/amd64_edac.c | 59 +++++++++++++++++++++++++++++++++++----
 drivers/edac/amd64_edac.h | 14 ++++++++++
 2 files changed, 67 insertions(+), 6 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 1df763128483..b7dd87636155 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -15,6 +15,31 @@ static struct msr __percpu *msrs;
 
 static struct amd64_family_type *fam_type;
 
+/* Family flag helpers */
+static inline u64 get_addr_cfg(void)
+{
+	if (fam_type->flags.zn_regs_v2)
+		return UMCCH_ADDR_CFG_DDR5;
+
+	return UMCCH_ADDR_CFG;
+}
+
+static inline u64 get_addr_mask_sec(void)
+{
+	if (fam_type->flags.zn_regs_v2)
+		return UMCCH_ADDR_MASK_SEC_DDR5;
+
+	return UMCCH_ADDR_MASK_SEC;
+}
+
+static inline u64 get_dimm_cfg(void)
+{
+	if (fam_type->flags.zn_regs_v2)
+		return UMCCH_DIMM_CFG_DDR5;
+
+	return UMCCH_DIMM_CFG;
+}
+
 /* Per-node stuff */
 static struct ecc_settings **ecc_stngs;
 
@@ -1429,8 +1454,10 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
 		edac_dbg(1, "UMC%d x16 DIMMs present: %s\n",
 				i, (umc->dimm_cfg & BIT(7)) ? "yes" : "no");
 
-		if (pvt->dram_type == MEM_LRDDR4) {
-			amd_smn_read(pvt->mc_node_id, umc_base + UMCCH_ADDR_CFG, &tmp);
+		if (pvt->dram_type == MEM_LRDDR4 || pvt->dram_type == MEM_LRDDR5) {
+			amd_smn_read(pvt->mc_node_id,
+				     umc_base + get_addr_cfg(),
+				     &tmp);
 			edac_dbg(1, "UMC%d LRDIMM %dx rank multiply\n",
 					i, 1 << ((tmp >> 4) & 0x3));
 		}
@@ -1505,7 +1532,7 @@ static void prep_chip_selects(struct amd64_pvt *pvt)
 
 		for_each_umc(umc) {
 			pvt->csels[umc].b_cnt = 4;
-			pvt->csels[umc].m_cnt = 2;
+			pvt->csels[umc].m_cnt = fam_type->flags.zn_regs_v2 ? 4 : 2;
 		}
 
 	} else {
@@ -1545,7 +1572,7 @@ static void read_umc_base_mask(struct amd64_pvt *pvt)
 		}
 
 		umc_mask_reg = get_umc_base(umc) + UMCCH_ADDR_MASK;
-		umc_mask_reg_sec = get_umc_base(umc) + UMCCH_ADDR_MASK_SEC;
+		umc_mask_reg_sec = get_umc_base(umc) + get_addr_mask_sec();
 
 		for_each_chip_select_mask(cs, umc, pvt) {
 			mask = &pvt->csels[umc].csmasks[cs];
@@ -1628,6 +1655,20 @@ static void determine_memory_type(struct amd64_pvt *pvt)
 			dimm_cfg |= pvt->umc[i].dimm_cfg;
 		}
 
+		/*
+		 * Check if the system supports the "DDR Type" field in UMC Config
+		 * and has DDR5 DIMMs in use.
+		 */
+		if (fam_type->flags.zn_regs_v2 && ((umc_cfg & GENMASK(2, 0)) == 0x1)) {
+			if (dimm_cfg & BIT(5))
+				pvt->dram_type = MEM_LRDDR5;
+			else if (dimm_cfg & BIT(4))
+				pvt->dram_type = MEM_RDDR5;
+			else
+				pvt->dram_type = MEM_DDR5;
+			return;
+		}
+
 		if (dimm_cfg & BIT(5))
 			pvt->dram_type = MEM_LRDDR4;
 		else if (dimm_cfg & BIT(4))
@@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
 	 * There is one mask per DIMM, and two Chip Selects per DIMM.
 	 *	CS0 and CS1 -> DIMM0
 	 *	CS2 and CS3 -> DIMM1
+	 *
+	 *	Systems with newer register layout have one mask per Chip Select.
 	 */
-	dimm = csrow_nr >> 1;
+	if (fam_type->flags.zn_regs_v2)
+		dimm = csrow_nr;
+	else
+		dimm = csrow_nr >> 1;
 
 	/* Asymmetric dual-rank DIMM support. */
 	if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
@@ -2937,6 +2983,7 @@ static struct amd64_family_type family_types[] = {
 		.f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
 		.f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
 		.max_mcs = 12,
+		.flags.zn_regs_v2 = 1,
 		.ops = {
 			.early_channel_count	= f17_early_channel_count,
 			.dbam_to_cs		= f17_addr_mask_to_cs_size,
@@ -3365,7 +3412,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
 		umc_base = get_umc_base(i);
 		umc = &pvt->umc[i];
 
-		amd_smn_read(nid, umc_base + UMCCH_DIMM_CFG, &umc->dimm_cfg);
+		amd_smn_read(nid, umc_base + get_dimm_cfg(), &umc->dimm_cfg);
 		amd_smn_read(nid, umc_base + UMCCH_UMC_CFG, &umc->umc_cfg);
 		amd_smn_read(nid, umc_base + UMCCH_SDP_CTRL, &umc->sdp_ctrl);
 		amd_smn_read(nid, umc_base + UMCCH_ECC_CTRL, &umc->ecc_ctrl);
diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
index 650cab401e21..39ecb77873db 100644
--- a/drivers/edac/amd64_edac.h
+++ b/drivers/edac/amd64_edac.h
@@ -271,8 +271,11 @@
 #define UMCCH_BASE_ADDR_SEC		0x10
 #define UMCCH_ADDR_MASK			0x20
 #define UMCCH_ADDR_MASK_SEC		0x28
+#define UMCCH_ADDR_MASK_SEC_DDR5	0x30
 #define UMCCH_ADDR_CFG			0x30
+#define UMCCH_ADDR_CFG_DDR5		0x40
 #define UMCCH_DIMM_CFG			0x80
+#define UMCCH_DIMM_CFG_DDR5		0x90
 #define UMCCH_UMC_CFG			0x100
 #define UMCCH_SDP_CTRL			0x104
 #define UMCCH_ECC_CTRL			0x14C
@@ -477,11 +480,22 @@ struct low_ops {
 					 unsigned cs_mode, int cs_mask_nr);
 };
 
+struct amd64_family_flags {
+	/*
+	 * Indicates that the system supports the new register offsets, etc.
+	 * first introduced with Family 19h Model 10h.
+	 */
+	__u64 zn_regs_v2	: 1,
+
+	      __reserved	: 63;
+};
+
 struct amd64_family_type {
 	const char *ctl_name;
 	u16 f0_id, f1_id, f2_id, f6_id;
 	/* Maximum number of memory controllers per die/node. */
 	u8 max_mcs;
+	struct amd64_family_flags flags;
 	struct low_ops ops;
 };
 
-- 
2.25.1


^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
@ 2021-12-15 16:32   ` William Roche
  2021-12-15 18:07     ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: William Roche @ 2021-12-15 16:32 UTC (permalink / raw)
  To: Yazen Ghannam, linux-edac
  Cc: linux-kernel, bp, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa

On 15/12/2021 16:53, Yazen Ghannam wrote:
> Introduce a "family flags" bitmask that can be used to indicate any
> special behavior needed on a per-family basis.
>
> Add a flag to indicate a system uses the new register offsets introduced
> with Family 19h Model 10h.
>
> Use this flag to account for register offset changes, a new bitfield
> indicating DDR5 use on a memory controller, and to set the proper number
> of chip select masks.
>
> Signed-off-by: Yazen Ghannam <yazen.ghannam@amd.com>
> ---
> Link:
> https://lkml.kernel.org/r/20211208174356.1997855-5-yazen.ghannam@amd.com
>
> v1->v2:
> * Was patch 4 in v1.
> * Change "has_ddr5" flag to "zn_regs_v2".
> * Drop flag check helper function.
> * Update determine_memory_type() to check bitfield for DDR5.
> * Update code comments.
>
>   drivers/edac/amd64_edac.c | 59 +++++++++++++++++++++++++++++++++++----
>   drivers/edac/amd64_edac.h | 14 ++++++++++
>   2 files changed, 67 insertions(+), 6 deletions(-)
>
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 1df763128483..b7dd87636155 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -15,6 +15,31 @@ static struct msr __percpu *msrs;
>   
>   static struct amd64_family_type *fam_type;
>   
> +/* Family flag helpers */
> +static inline u64 get_addr_cfg(void)
> +{
> +	if (fam_type->flags.zn_regs_v2)
> +		return UMCCH_ADDR_CFG_DDR5;
> +
> +	return UMCCH_ADDR_CFG;
> +}
> +
> +static inline u64 get_addr_mask_sec(void)
> +{
> +	if (fam_type->flags.zn_regs_v2)
> +		return UMCCH_ADDR_MASK_SEC_DDR5;
> +
> +	return UMCCH_ADDR_MASK_SEC;
> +}
> +
> +static inline u64 get_dimm_cfg(void)
> +{
> +	if (fam_type->flags.zn_regs_v2)
> +		return UMCCH_DIMM_CFG_DDR5;
> +
> +	return UMCCH_DIMM_CFG;
> +}
> +
>   /* Per-node stuff */
>   static struct ecc_settings **ecc_stngs;
>   
> @@ -1429,8 +1454,10 @@ static void __dump_misc_regs_df(struct amd64_pvt *pvt)
>   		edac_dbg(1, "UMC%d x16 DIMMs present: %s\n",
>   				i, (umc->dimm_cfg & BIT(7)) ? "yes" : "no");
>   
> -		if (pvt->dram_type == MEM_LRDDR4) {
> -			amd_smn_read(pvt->mc_node_id, umc_base + UMCCH_ADDR_CFG, &tmp);
> +		if (pvt->dram_type == MEM_LRDDR4 || pvt->dram_type == MEM_LRDDR5) {
> +			amd_smn_read(pvt->mc_node_id,
> +				     umc_base + get_addr_cfg(),
> +				     &tmp);
>   			edac_dbg(1, "UMC%d LRDIMM %dx rank multiply\n",
>   					i, 1 << ((tmp >> 4) & 0x3));
>   		}
> @@ -1505,7 +1532,7 @@ static void prep_chip_selects(struct amd64_pvt *pvt)
>   
>   		for_each_umc(umc) {
>   			pvt->csels[umc].b_cnt = 4;
> -			pvt->csels[umc].m_cnt = 2;
> +			pvt->csels[umc].m_cnt = fam_type->flags.zn_regs_v2 ? 4 : 2;
>   		}
>   
>   	} else {
> @@ -1545,7 +1572,7 @@ static void read_umc_base_mask(struct amd64_pvt *pvt)
>   		}
>   
>   		umc_mask_reg = get_umc_base(umc) + UMCCH_ADDR_MASK;
> -		umc_mask_reg_sec = get_umc_base(umc) + UMCCH_ADDR_MASK_SEC;
> +		umc_mask_reg_sec = get_umc_base(umc) + get_addr_mask_sec();
>   
>   		for_each_chip_select_mask(cs, umc, pvt) {
>   			mask = &pvt->csels[umc].csmasks[cs];
> @@ -1628,6 +1655,20 @@ static void determine_memory_type(struct amd64_pvt *pvt)
>   			dimm_cfg |= pvt->umc[i].dimm_cfg;
>   		}
>   
> +		/*
> +		 * Check if the system supports the "DDR Type" field in UMC Config
> +		 * and has DDR5 DIMMs in use.
> +		 */
> +		if (fam_type->flags.zn_regs_v2 && ((umc_cfg & GENMASK(2, 0)) == 0x1)) {
> +			if (dimm_cfg & BIT(5))
> +				pvt->dram_type = MEM_LRDDR5;
> +			else if (dimm_cfg & BIT(4))
> +				pvt->dram_type = MEM_RDDR5;
> +			else
> +				pvt->dram_type = MEM_DDR5;
> +			return;
> +		}
> +
>   		if (dimm_cfg & BIT(5))
>   			pvt->dram_type = MEM_LRDDR4;
>   		else if (dimm_cfg & BIT(4))
> @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
>   	 * There is one mask per DIMM, and two Chip Selects per DIMM.
>   	 *	CS0 and CS1 -> DIMM0
>   	 *	CS2 and CS3 -> DIMM1
> +	 *
> +	 *	Systems with newer register layout have one mask per Chip Select.

Just a question about this comment: Can it be translated into this ?

+	 * Except on systems with newer register layout where we have one Chip Select per DIMM.

>   	 */
> -	dimm = csrow_nr >> 1;
> +	if (fam_type->flags.zn_regs_v2)
> +		dimm = csrow_nr;
> +	else
> +		dimm = csrow_nr >> 1;
>   
>   	/* Asymmetric dual-rank DIMM support. */
>   	if ((csrow_nr & 1) && (cs_mode & CS_ODD_SECONDARY))
> @@ -2937,6 +2983,7 @@ static struct amd64_family_type family_types[] = {
>   		.f0_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F0,
>   		.f6_id = PCI_DEVICE_ID_AMD_19H_M10H_DF_F6,
>   		.max_mcs = 12,
> +		.flags.zn_regs_v2 = 1,
>   		.ops = {
>   			.early_channel_count	= f17_early_channel_count,
>   			.dbam_to_cs		= f17_addr_mask_to_cs_size,
> @@ -3365,7 +3412,7 @@ static void __read_mc_regs_df(struct amd64_pvt *pvt)
>   		umc_base = get_umc_base(i);
>   		umc = &pvt->umc[i];
>   
> -		amd_smn_read(nid, umc_base + UMCCH_DIMM_CFG, &umc->dimm_cfg);
> +		amd_smn_read(nid, umc_base + get_dimm_cfg(), &umc->dimm_cfg);
>   		amd_smn_read(nid, umc_base + UMCCH_UMC_CFG, &umc->umc_cfg);
>   		amd_smn_read(nid, umc_base + UMCCH_SDP_CTRL, &umc->sdp_ctrl);
>   		amd_smn_read(nid, umc_base + UMCCH_ECC_CTRL, &umc->ecc_ctrl);
> diff --git a/drivers/edac/amd64_edac.h b/drivers/edac/amd64_edac.h
> index 650cab401e21..39ecb77873db 100644
> --- a/drivers/edac/amd64_edac.h
> +++ b/drivers/edac/amd64_edac.h
> @@ -271,8 +271,11 @@
>   #define UMCCH_BASE_ADDR_SEC		0x10
>   #define UMCCH_ADDR_MASK			0x20
>   #define UMCCH_ADDR_MASK_SEC		0x28
> +#define UMCCH_ADDR_MASK_SEC_DDR5	0x30
>   #define UMCCH_ADDR_CFG			0x30
> +#define UMCCH_ADDR_CFG_DDR5		0x40
>   #define UMCCH_DIMM_CFG			0x80
> +#define UMCCH_DIMM_CFG_DDR5		0x90
>   #define UMCCH_UMC_CFG			0x100
>   #define UMCCH_SDP_CTRL			0x104
>   #define UMCCH_ECC_CTRL			0x14C
> @@ -477,11 +480,22 @@ struct low_ops {
>   					 unsigned cs_mode, int cs_mask_nr);
>   };
>   
> +struct amd64_family_flags {
> +	/*
> +	 * Indicates that the system supports the new register offsets, etc.
> +	 * first introduced with Family 19h Model 10h.
> +	 */
> +	__u64 zn_regs_v2	: 1,
> +
> +	      __reserved	: 63;
> +};
> +
>   struct amd64_family_type {
>   	const char *ctl_name;
>   	u16 f0_id, f1_id, f2_id, f6_id;
>   	/* Maximum number of memory controllers per die/node. */
>   	u8 max_mcs;
> +	struct amd64_family_flags flags;
>   	struct low_ops ops;
>   };
>   
Thanks in advance,

William.



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates
  2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
  2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
  2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
@ 2021-12-15 17:53 ` Borislav Petkov
  2 siblings, 0 replies; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 17:53 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche

On Wed, Dec 15, 2021 at 03:53:07PM +0000, Yazen Ghannam wrote:
> The following commit in hwmon-next is needed for functional support of
> this set.
> 
>   49e90c39d0be ("x86/amd_nb: Add AMD Family 19h Models (10h-1Fh) and (A0h-AFh) PCI IDs")

I'd normally want do cross-merge that one so that stuff can get tested
but am being told that those models are not shipping yet so we're not
going to break anything even if those go separately Linus-wards.

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
  2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
@ 2021-12-15 18:01   ` Borislav Petkov
  2021-12-16 16:08     ` Yazen Ghannam
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 18:01 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche

On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote:
> -		if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
> +		u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
> +
> +		for_each_umc(i) {
> +			umc_cfg  |= pvt->umc[i].umc_cfg;
> +			dimm_cfg |= pvt->umc[i].dimm_cfg;
> +		}
> +
> +		if (dimm_cfg & BIT(5))
>  			pvt->dram_type = MEM_LRDDR4;
> -		else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
> +		else if (dimm_cfg & BIT(4))

You're working here under the assumption that bit 4 and 5 will have the
same value on all those UMCs.

You're probably going to say that that is how the BIOS is programming
them so they should be all the same and any other configuration is
invalid but lemme still ask about it explicitly.

And if so, this would probably need a comment above it which I can add
when applying...

Hmm?

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-15 16:32   ` William Roche
@ 2021-12-15 18:07     ` Borislav Petkov
  2021-12-16 15:46       ` Yazen Ghannam
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-15 18:07 UTC (permalink / raw)
  To: William Roche
  Cc: Yazen Ghannam, linux-edac, linux-kernel, mchehab, tony.luck,
	james.morse, rric, Smita.KoralahalliChannabasappa

On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
> > @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> >   	 * There is one mask per DIMM, and two Chip Selects per DIMM.
> >   	 *	CS0 and CS1 -> DIMM0
> >   	 *	CS2 and CS3 -> DIMM1
> > +	 *
> > +	 *	Systems with newer register layout have one mask per Chip Select.
> 
> Just a question about this comment: Can it be translated into this ?
> 
> +	 * Except on systems with newer register layout where we have one Chip Select per DIMM.

Sure, but without the "we":

	...
	* On systems with the newer register layout there is one Chip Select per DIMM.
	*/

Thx.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-15 18:07     ` Borislav Petkov
@ 2021-12-16 15:46       ` Yazen Ghannam
  2021-12-16 18:43         ` William Roche
  0 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 15:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: William Roche, linux-edac, linux-kernel, mchehab, tony.luck,
	james.morse, rric, Smita.KoralahalliChannabasappa

On Wed, Dec 15, 2021 at 07:07:17PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
> > > @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
> > >   	 * There is one mask per DIMM, and two Chip Selects per DIMM.
> > >   	 *	CS0 and CS1 -> DIMM0
> > >   	 *	CS2 and CS3 -> DIMM1
> > > +	 *
> > > +	 *	Systems with newer register layout have one mask per Chip Select.
> > 
> > Just a question about this comment: Can it be translated into this ?
> > 
> > +	 * Except on systems with newer register layout where we have one Chip Select per DIMM.
> 
> Sure, but without the "we":
> 
> 	...
> 	* On systems with the newer register layout there is one Chip Select per DIMM.
> 	*/
>

Hi William,
Thanks for the suggestion, but it's not quite correct.

There are still two Chip Selects per DIMM module, i.e. the system can support
dual-rank (2R) DIMMs. Current AMD systems can support upto 2 DIMMs per Unified
Memory Controller (UMC). There are two "Address Mask" registers in each UMC,
and each register covers an entire DIMM (and by extension the two Chip Selects
available for each DIMM).

Future systems will still support upto 2 DIMMs per UMC. However, the register
space is updated so that there are now four "Address Mask" registers per UMC.
And each of these registers is now explicitly related to one of the four Chip
Selects available per UMC.

Does this help? I can update the code comments with these details.

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
  2021-12-15 18:01   ` Borislav Petkov
@ 2021-12-16 16:08     ` Yazen Ghannam
  2021-12-30 11:36       ` Borislav Petkov
  0 siblings, 1 reply; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 16:08 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche

On Wed, Dec 15, 2021 at 07:01:22PM +0100, Borislav Petkov wrote:
> On Wed, Dec 15, 2021 at 03:53:08PM +0000, Yazen Ghannam wrote:
> > -		if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(5))
> > +		u32 umc_cfg = 0, dimm_cfg = 0, i = 0;
> > +
> > +		for_each_umc(i) {
> > +			umc_cfg  |= pvt->umc[i].umc_cfg;
> > +			dimm_cfg |= pvt->umc[i].dimm_cfg;
> > +		}
> > +
> > +		if (dimm_cfg & BIT(5))
> >  			pvt->dram_type = MEM_LRDDR4;
> > -		else if ((pvt->umc[0].dimm_cfg | pvt->umc[1].dimm_cfg) & BIT(4))
> > +		else if (dimm_cfg & BIT(4))
> 
> You're working here under the assumption that bit 4 and 5 will have the
> same value on all those UMCs.
> 
> You're probably going to say that that is how the BIOS is programming
> them so they should be all the same and any other configuration is
> invalid but lemme still ask about it explicitly.
> 
> And if so, this would probably need a comment above it which I can add
> when applying...
> 
> Hmm?
>

No, that's a good question. And actually the assumption is incorrect. It is
allowed to have different DIMM types in a system though all DIMMs on a single
UMC must match.

Do you recommend a follow up patch or should this one be reworked?

Thanks,
Yazen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-16 15:46       ` Yazen Ghannam
@ 2021-12-16 18:43         ` William Roche
  2021-12-16 19:21           ` Yazen Ghannam
  0 siblings, 1 reply; 13+ messages in thread
From: William Roche @ 2021-12-16 18:43 UTC (permalink / raw)
  To: Yazen Ghannam, Borislav Petkov
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa

On 16/12/2021 16:46, Yazen Ghannam wrote:
> On Wed, Dec 15, 2021 at 07:07:17PM +0100, Borislav Petkov wrote:
>> On Wed, Dec 15, 2021 at 05:32:27PM +0100, William Roche wrote:
>>>> @@ -2174,8 +2215,13 @@ static int f17_addr_mask_to_cs_size(struct amd64_pvt *pvt, u8 umc,
>>>>    	 * There is one mask per DIMM, and two Chip Selects per DIMM.
>>>>    	 *	CS0 and CS1 -> DIMM0
>>>>    	 *	CS2 and CS3 -> DIMM1
>>>> +	 *
>>>> +	 *	Systems with newer register layout have one mask per Chip Select.
>>> Just a question about this comment: Can it be translated into this ?
>>>
>>> +	 * Except on systems with newer register layout where we have one Chip Select per DIMM.
>> Sure, but without the "we":
>>
>> 	...
>> 	* On systems with the newer register layout there is one Chip Select per DIMM.
>> 	*/
>>
> Hi William,
> Thanks for the suggestion, but it's not quite correct.

That's exactly what I wanted to know. Thanks.

>
> There are still two Chip Selects per DIMM module, i.e. the system can support
> dual-rank (2R) DIMMs. Current AMD systems can support upto 2 DIMMs per Unified
> Memory Controller (UMC). There are two "Address Mask" registers in each UMC,
> and each register covers an entire DIMM (and by extension the two Chip Selects
> available for each DIMM).
>
> Future systems will still support upto 2 DIMMs per UMC. However, the register
> space is updated so that there are now four "Address Mask" registers per UMC.
> And each of these registers is now explicitly related to one of the four Chip
> Selects available per UMC.

 From what I understand, future systems would still support the same 
number of dimms per UMC (2), the same number of Chip Select (2 per 
dimm), the only thing that changes is the number of Address Mask 
registers (going from 2 per UMC  to  4 per UMC).

So I'm confused, we deduce 'dimm' from csrow_nr, which would be in fact 
the Chip Select *masks* number (cs_mask_nr from the dbam_to_cs signature 
in struct low_ops), so why are we saying and dimm=csrow_nr in the case 
of the new layout, but dimm = csrow_nr / 2 in the case on the standard 
layout ?

Should we indicate what this 'dimm' value really is ?

Sorry if I'm missing something very obvious here.

Thanks,
William.


> Does this help? I can update the code comments with these details.
>
> Thanks,
> Yazen

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes
  2021-12-16 18:43         ` William Roche
@ 2021-12-16 19:21           ` Yazen Ghannam
  0 siblings, 0 replies; 13+ messages in thread
From: Yazen Ghannam @ 2021-12-16 19:21 UTC (permalink / raw)
  To: William Roche
  Cc: Borislav Petkov, linux-edac, linux-kernel, mchehab, tony.luck,
	james.morse, rric, Smita.KoralahalliChannabasappa

On Thu, Dec 16, 2021 at 07:43:55PM +0100, William Roche wrote:
...
> From what I understand, future systems would still support the same number
> of dimms per UMC (2), the same number of Chip Select (2 per dimm), the only
> thing that changes is the number of Address Mask registers (going from 2 per
> UMC  to  4 per UMC).
> 
> So I'm confused, we deduce 'dimm' from csrow_nr, which would be in fact the
> Chip Select *masks* number (cs_mask_nr from the dbam_to_cs signature in
> struct low_ops), so why are we saying and dimm=csrow_nr in the case of the
> new layout, but dimm = csrow_nr / 2 in the case on the standard layout ?
> 
> Should we indicate what this 'dimm' value really is ?
> 
> Sorry if I'm missing something very obvious here.
>

That's fair. I can rework the patch to explicitly differentiate between "dimm"
and "cs_mask_nr" here.

I think this would resolve an issue in a later debug print statement that
includes the csrow_nr and dimm.

Thanks,
Yazen 

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
  2021-12-16 16:08     ` Yazen Ghannam
@ 2021-12-30 11:36       ` Borislav Petkov
  2022-01-05 16:12         ` Yazen Ghannam
  0 siblings, 1 reply; 13+ messages in thread
From: Borislav Petkov @ 2021-12-30 11:36 UTC (permalink / raw)
  To: Yazen Ghannam
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche

On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote:
> No, that's a good question. And actually the assumption is incorrect. It is
> allowed to have different DIMM types in a system though all DIMMs on a single
> UMC must match.

Oh fun, really?!

So a single system can have DDR4 *and* DDR5 on the same board?!

So then that

	pvt->dram_type

is insufficient to store the DIMM type for a pvt. If you have multiple
UMCs on a pvt and all have different type DIMMs, then you need the
relevant DIMM type when you dump it in sysfs...

Which then means, you need ->dram_type to be per UMC...

And also, I'm assuming the hw already enforces that DIMMs on a single
UMC must match - it simply won't boot if they don't so you don't have to
enforce that, at least.

> Do you recommend a follow up patch or should this one be reworked?

This one is insufficient, I'm afraid.

One way to address this is, you could use pvt->umc at the places where
dram_type is used and assign directly to the dimm->mtype thing. But then
you'd need a way to map each struct dimm_info *dimm to the UMC so that
you can determine the correct DIMM type.

Which would make pvt->dram_type redundant and can be removed.

Or you might have a better idea...

HTH.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs
  2021-12-30 11:36       ` Borislav Petkov
@ 2022-01-05 16:12         ` Yazen Ghannam
  0 siblings, 0 replies; 13+ messages in thread
From: Yazen Ghannam @ 2022-01-05 16:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: linux-edac, linux-kernel, mchehab, tony.luck, james.morse, rric,
	Smita.KoralahalliChannabasappa, william.roche

On Thu, Dec 30, 2021 at 12:36:44PM +0100, Borislav Petkov wrote:
> On Thu, Dec 16, 2021 at 04:08:18PM +0000, Yazen Ghannam wrote:
> > No, that's a good question. And actually the assumption is incorrect. It is
> > allowed to have different DIMM types in a system though all DIMMs on a single
> > UMC must match.
> 
> Oh fun, really?!
> 
> So a single system can have DDR4 *and* DDR5 on the same board?!
>

Well, I don't know about that specifically. There are some restrictions, but
you could have UDIMMs and RDIMMs of the same generation, at least.
 
> So then that
> 
> 	pvt->dram_type
> 
> is insufficient to store the DIMM type for a pvt. If you have multiple
> UMCs on a pvt and all have different type DIMMs, then you need the
> relevant DIMM type when you dump it in sysfs...
> 
> Which then means, you need ->dram_type to be per UMC...
> 
> And also, I'm assuming the hw already enforces that DIMMs on a single
> UMC must match - it simply won't boot if they don't so you don't have to
> enforce that, at least.
> 
> > Do you recommend a follow up patch or should this one be reworked?
> 
> This one is insufficient, I'm afraid.
> 
> One way to address this is, you could use pvt->umc at the places where
> dram_type is used and assign directly to the dimm->mtype thing. But then
> you'd need a way to map each struct dimm_info *dimm to the UMC so that
> you can determine the correct DIMM type.
> 

I did send a patch that did something like this.
https://lkml.kernel.org/r/20211228200615.412999-2-yazen.ghannam@amd.com

Though this got a build warning report, so I need to follow up on that.

> Which would make pvt->dram_type redundant and can be removed.
>

I kept this so as to not break legacy systems. But I'll look at it again. I
think you may be right.

Thanks,
Yazen 

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2022-01-05 16:12 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-12-15 15:53 [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 1/2] EDAC/amd64: Check register values from all UMCs Yazen Ghannam
2021-12-15 18:01   ` Borislav Petkov
2021-12-16 16:08     ` Yazen Ghannam
2021-12-30 11:36       ` Borislav Petkov
2022-01-05 16:12         ` Yazen Ghannam
2021-12-15 15:53 ` [PATCH v2 2/2] EDAC/amd64: Add new register offset support and related changes Yazen Ghannam
2021-12-15 16:32   ` William Roche
2021-12-15 18:07     ` Borislav Petkov
2021-12-16 15:46       ` Yazen Ghannam
2021-12-16 18:43         ` William Roche
2021-12-16 19:21           ` Yazen Ghannam
2021-12-15 17:53 ` [PATCH v2 0/2] AMD Family 19h Models 10h-1Fh Updates Borislav Petkov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).