All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH 0/4] EDAC/ghes: Some cleanups
@ 2020-06-16 17:27 Borislav Petkov
  2020-06-16 17:27 ` [PATCH 1/4] EDAC/ghes: Setup DIMM label from DMI and use it in error reports Borislav Petkov
                   ` (4 more replies)
  0 siblings, 5 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-16 17:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

Hi,

here are some cleanups which came up recently.

Thx.

Borislav Petkov (2):
  EDAC/ghes: Scan the system once on driver init
  EDAC: Remove edac_get_dimm_by_index()

Robert Richter (2):
  EDAC/ghes: Setup DIMM label from DMI and use it in error reports
  EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to
    ghes_pvt

 drivers/edac/ghes_edac.c | 323 +++++++++++++++++++++++----------------
 include/linux/edac.h     |  29 +---
 2 files changed, 200 insertions(+), 152 deletions(-)

-- 
2.21.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH 1/4] EDAC/ghes: Setup DIMM label from DMI and use it in error reports
  2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
@ 2020-06-16 17:27 ` Borislav Petkov
  2020-06-16 17:27 ` [PATCH 2/4] EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt Borislav Petkov
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-16 17:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

From: Robert Richter <rrichter@marvell.com>

The ghes driver reports errors with 'unknown label' even if the actual
DIMM label is known, e.g.:

 EDAC MC0: 1 CE Single-bit ECC on unknown label (node:0 card:0
   module:0 rank:1 bank:0 col:13 bit_pos:16 DIMM location:N0 DIMM_A0
   page:0x966a9b3 offset:0x0 grain:1 syndrome:0x0 - APEI location:
   node:0 card:0 module:0 rank:1 bank:0 col:13 bit_pos:16 DIMM
   location:N0 DIMM_A0 status(0x0000000000000400): Storage error in
   DRAM memory)

Fix this by using struct dimm_info's label string in error reports:

 EDAC MC0: 1 CE Single-bit ECC on N0 DIMM_A0 (node:0 card:0 module:0
   rank:1 bank:515 col:14 bit_pos:16 DIMM location:N0 DIMM_A0
   page:0x99223d8 offset:0x0 grain:1 syndrome:0x0 - APEI location:
   node:0 card:0 module:0 rank:1 bank:515 col:14 bit_pos:16 DIMM
   location:N0 DIMM_A0 status(0x0000000000000400): Storage error in
   DRAM memory)

The labels are initialized by reading the bank and device strings
from DMI. Now, the label information can also read from sysfs. E.g. a
ThunderX2 system will show the following:

  /sys/devices/system/edac/mc/mc0/dimm0/dimm_label:N0 DIMM_A0
  /sys/devices/system/edac/mc/mc0/dimm1/dimm_label:N0 DIMM_B0
  /sys/devices/system/edac/mc/mc0/dimm2/dimm_label:N0 DIMM_C0
  /sys/devices/system/edac/mc/mc0/dimm3/dimm_label:N0 DIMM_D0
  /sys/devices/system/edac/mc/mc0/dimm4/dimm_label:N0 DIMM_E0
  /sys/devices/system/edac/mc/mc0/dimm5/dimm_label:N0 DIMM_F0
  /sys/devices/system/edac/mc/mc0/dimm6/dimm_label:N0 DIMM_G0
  /sys/devices/system/edac/mc/mc0/dimm7/dimm_label:N0 DIMM_H0
  /sys/devices/system/edac/mc/mc0/dimm8/dimm_label:N1 DIMM_I0
  /sys/devices/system/edac/mc/mc0/dimm9/dimm_label:N1 DIMM_J0
  /sys/devices/system/edac/mc/mc0/dimm10/dimm_label:N1 DIMM_K0
  /sys/devices/system/edac/mc/mc0/dimm11/dimm_label:N1 DIMM_L0
  /sys/devices/system/edac/mc/mc0/dimm12/dimm_label:N1 DIMM_M0
  /sys/devices/system/edac/mc/mc0/dimm13/dimm_label:N1 DIMM_N0
  /sys/devices/system/edac/mc/mc0/dimm14/dimm_label:N1 DIMM_O0
  /sys/devices/system/edac/mc/mc0/dimm15/dimm_label:N1 DIMM_P0

Since dimm_labels can be rewritten, that label will be used in a later
error report:

  # echo foobar >/sys/devices/system/edac/mc/mc0/dimm0/dimm_label
  # # some error injection here
  # dmesg | grep foobar
  [ 751.383533] EDAC MC0: 1 CE Single-bit ECC on foobar (node:0 card:0
  module:0 rank:1 bank:259 col:3 bit_pos:16 DIMM location:N0 DIMM_A0
  page:0x8c8dc74 offset:0x0 grain:1 syndrome:0x0 - APEI location:
  node:0 card:0 module:0 rank:1 bank:259 col:3 bit_pos:16 DIMM
  location:N0 DIMM_A0 status(0x0000000000000400): Storage error in DRAM
  memory)

 [ bp: Remove curly brackets around a single if-statement in dimm_setup_label(). ]

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200528101307.23245-1-rrichter@marvell.com
---
 drivers/edac/ghes_edac.c | 35 ++++++++++++++++++++++++-----------
 1 file changed, 24 insertions(+), 11 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index cb3dab56a875..94c70c95a896 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -87,16 +87,27 @@ static void ghes_edac_count_dimms(const struct dmi_header *dh, void *arg)
 		(*num_dimm)++;
 }
 
-static int get_dimm_smbios_index(struct mem_ctl_info *mci, u16 handle)
+static struct dimm_info *find_dimm_by_handle(struct mem_ctl_info *mci, u16 handle)
 {
 	struct dimm_info *dimm;
 
 	mci_for_each_dimm(mci, dimm) {
 		if (dimm->smbios_handle == handle)
-			return dimm->idx;
+			return dimm;
 	}
 
-	return -1;
+	return NULL;
+}
+
+static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
+{
+	const char *bank = NULL, *device = NULL;
+
+	dmi_memdev_name(handle, &bank, &device);
+
+	/* both strings must be non-zero */
+	if (bank && *bank && device && *device)
+		snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
 }
 
 static void ghes_edac_dmidecode(const struct dmi_header *dh, void *arg)
@@ -179,9 +190,7 @@ static void ghes_edac_dmidecode(const struct dmi_header *dh, void *arg)
 		dimm->dtype = DEV_UNKNOWN;
 		dimm->grain = 128;		/* Likely, worse case */
 
-		/*
-		 * FIXME: It shouldn't be hard to also fill the DIMM labels
-		 */
+		dimm_setup_label(dimm, entry->handle);
 
 		if (dimm->nr_pages) {
 			edac_dbg(1, "DIMM%i: %s size = %d MB%s\n",
@@ -228,7 +237,6 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 	memset(e, 0, sizeof (*e));
 	e->error_count = 1;
 	e->grain = 1;
-	strcpy(e->label, "unknown label");
 	e->msg = pvt->msg;
 	e->other_detail = pvt->other_detail;
 	e->top_layer = -1;
@@ -345,7 +353,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 		p += sprintf(p, "bit_pos:%d ", mem_err->bit_pos);
 	if (mem_err->validation_bits & CPER_MEM_VALID_MODULE_HANDLE) {
 		const char *bank = NULL, *device = NULL;
-		int index = -1;
+		struct dimm_info *dimm;
 
 		dmi_memdev_name(mem_err->mem_dev_handle, &bank, &device);
 		if (bank != NULL && device != NULL)
@@ -354,13 +362,18 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 			p += sprintf(p, "DIMM DMI handle: 0x%.4x ",
 				     mem_err->mem_dev_handle);
 
-		index = get_dimm_smbios_index(mci, mem_err->mem_dev_handle);
-		if (index >= 0)
-			e->top_layer = index;
+		dimm = find_dimm_by_handle(mci, mem_err->mem_dev_handle);
+		if (dimm) {
+			e->top_layer = dimm->idx;
+			strcpy(e->label, dimm->label);
+		}
 	}
 	if (p > e->location)
 		*(p - 1) = '\0';
 
+	if (!*e->label)
+		strcpy(e->label, "unknown memory");
+
 	/* All other fields are mapped on e->other_detail */
 	p = pvt->other_detail;
 	p += snprintf(p, sizeof(pvt->other_detail),
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 2/4] EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt
  2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
  2020-06-16 17:27 ` [PATCH 1/4] EDAC/ghes: Setup DIMM label from DMI and use it in error reports Borislav Petkov
@ 2020-06-16 17:27 ` Borislav Petkov
  2020-06-16 17:27 ` [PATCH 3/4] EDAC/ghes: Scan the system once on driver init Borislav Petkov
                   ` (2 subsequent siblings)
  4 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-16 17:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

From: Robert Richter <rrichter@marvell.com>

The struct members list and ghes of struct ghes_edac_pvt are unused,
remove them. On that occasion, rename it to the shorter name struct
ghes_pvt.

Signed-off-by: Robert Richter <rrichter@marvell.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Link: https://lkml.kernel.org/r/20200519104443.15673-2-rrichter@marvell.com
---
 drivers/edac/ghes_edac.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index 94c70c95a896..f4f9ae32c743 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -15,9 +15,7 @@
 #include "edac_module.h"
 #include <ras/ras_event.h>
 
-struct ghes_edac_pvt {
-	struct list_head list;
-	struct ghes *ghes;
+struct ghes_pvt {
 	struct mem_ctl_info *mci;
 
 	/* Buffers for the error handling routine */
@@ -32,7 +30,7 @@ static refcount_t ghes_refcount = REFCOUNT_INIT(0);
  * also provides the necessary (implicit) memory barrier for the SMP
  * case to make the pointer visible on another CPU.
  */
-static struct ghes_edac_pvt *ghes_pvt;
+static struct ghes_pvt *ghes_pvt;
 
 /* GHES registration mutex */
 static DEFINE_MUTEX(ghes_reg_mutex);
@@ -212,7 +210,7 @@ void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
 {
 	struct edac_raw_error_desc *e;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt;
+	struct ghes_pvt *pvt;
 	unsigned long flags;
 	char *p;
 
@@ -470,7 +468,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	bool fake = false;
 	int rc = 0, num_dimm = 0;
 	struct mem_ctl_info *mci;
-	struct ghes_edac_pvt *pvt;
+	struct ghes_pvt *pvt;
 	struct edac_mc_layer layers[1];
 	struct ghes_edac_dimm_fill dimm_fill;
 	unsigned long flags;
@@ -507,7 +505,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	layers[0].size = num_dimm;
 	layers[0].is_virt_csrow = true;
 
-	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_edac_pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_pvt));
 	if (!mci) {
 		pr_info("Can't allocate memory for EDAC data\n");
 		rc = -ENOMEM;
@@ -515,7 +513,6 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	}
 
 	pvt		= mci->pvt_info;
-	pvt->ghes	= ghes;
 	pvt->mci	= mci;
 
 	mci->pdev = dev;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 3/4] EDAC/ghes: Scan the system once on driver init
  2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
  2020-06-16 17:27 ` [PATCH 1/4] EDAC/ghes: Setup DIMM label from DMI and use it in error reports Borislav Petkov
  2020-06-16 17:27 ` [PATCH 2/4] EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt Borislav Petkov
@ 2020-06-16 17:27 ` Borislav Petkov
  2020-06-16 17:27 ` [PATCH 4/4] EDAC: Remove edac_get_dimm_by_index() Borislav Petkov
  2020-06-22 15:09 ` [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
  4 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-16 17:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

Change the hardware scanning and figuring out how many DIMMs a machine
has to a single, one-time thing which happens once on driver init. After
that scanning completes, struct ghes_hw_desc contains a representation
of the hardware which the driver can then use for later initialization.

Then, copy the DIMM information into the respective EDAC core
representation of those.

Get rid of ghes_edac_dimm_fill and use a struct dimm_info array
directly.

This way, hw detection and further driver initialization is nicely
and logically split. Further additions should all be added to
ghes_scan_system() and the hw representation extended as needed.

There should be no functionality change resulting from this patch.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/edac/ghes_edac.c | 279 +++++++++++++++++++++++----------------
 1 file changed, 166 insertions(+), 113 deletions(-)

diff --git a/drivers/edac/ghes_edac.c b/drivers/edac/ghes_edac.c
index f4f9ae32c743..da60c29468a7 100644
--- a/drivers/edac/ghes_edac.c
+++ b/drivers/edac/ghes_edac.c
@@ -32,6 +32,15 @@ static refcount_t ghes_refcount = REFCOUNT_INIT(0);
  */
 static struct ghes_pvt *ghes_pvt;
 
+/*
+ * This driver's representation of the system hardware, as collected
+ * from DMI.
+ */
+struct ghes_hw_desc {
+	int num_dimms;
+	struct dimm_info *dimms;
+} ghes_hw;
+
 /* GHES registration mutex */
 static DEFINE_MUTEX(ghes_reg_mutex);
 
@@ -72,19 +81,6 @@ struct memdev_dmi_entry {
 	u16 conf_mem_clk_speed;
 } __attribute__((__packed__));
 
-struct ghes_edac_dimm_fill {
-	struct mem_ctl_info *mci;
-	unsigned int count;
-};
-
-static void ghes_edac_count_dimms(const struct dmi_header *dh, void *arg)
-{
-	int *num_dimm = arg;
-
-	if (dh->type == DMI_ENTRY_MEM_DEVICE)
-		(*num_dimm)++;
-}
-
 static struct dimm_info *find_dimm_by_handle(struct mem_ctl_info *mci, u16 handle)
 {
 	struct dimm_info *dimm;
@@ -108,102 +104,135 @@ static void dimm_setup_label(struct dimm_info *dimm, u16 handle)
 		snprintf(dimm->label, sizeof(dimm->label), "%s %s", bank, device);
 }
 
-static void ghes_edac_dmidecode(const struct dmi_header *dh, void *arg)
+static void assign_dmi_dimm_info(struct dimm_info *dimm, struct memdev_dmi_entry *entry)
 {
-	struct ghes_edac_dimm_fill *dimm_fill = arg;
-	struct mem_ctl_info *mci = dimm_fill->mci;
-
-	if (dh->type == DMI_ENTRY_MEM_DEVICE) {
-		struct memdev_dmi_entry *entry = (struct memdev_dmi_entry *)dh;
-		struct dimm_info *dimm = edac_get_dimm(mci, dimm_fill->count, 0, 0);
-		u16 rdr_mask = BIT(7) | BIT(13);
-
-		if (entry->size == 0xffff) {
-			pr_info("Can't get DIMM%i size\n",
-				dimm_fill->count);
-			dimm->nr_pages = MiB_TO_PAGES(32);/* Unknown */
-		} else if (entry->size == 0x7fff) {
-			dimm->nr_pages = MiB_TO_PAGES(entry->extended_size);
-		} else {
-			if (entry->size & BIT(15))
-				dimm->nr_pages = MiB_TO_PAGES((entry->size & 0x7fff) << 10);
-			else
-				dimm->nr_pages = MiB_TO_PAGES(entry->size);
-		}
+	u16 rdr_mask = BIT(7) | BIT(13);
 
-		switch (entry->memory_type) {
-		case 0x12:
-			if (entry->type_detail & BIT(13))
-				dimm->mtype = MEM_RDDR;
-			else
-				dimm->mtype = MEM_DDR;
-			break;
-		case 0x13:
-			if (entry->type_detail & BIT(13))
-				dimm->mtype = MEM_RDDR2;
-			else
-				dimm->mtype = MEM_DDR2;
-			break;
-		case 0x14:
-			dimm->mtype = MEM_FB_DDR2;
-			break;
-		case 0x18:
-			if (entry->type_detail & BIT(12))
-				dimm->mtype = MEM_NVDIMM;
-			else if (entry->type_detail & BIT(13))
-				dimm->mtype = MEM_RDDR3;
-			else
-				dimm->mtype = MEM_DDR3;
-			break;
-		case 0x1a:
-			if (entry->type_detail & BIT(12))
-				dimm->mtype = MEM_NVDIMM;
-			else if (entry->type_detail & BIT(13))
-				dimm->mtype = MEM_RDDR4;
-			else
-				dimm->mtype = MEM_DDR4;
-			break;
-		default:
-			if (entry->type_detail & BIT(6))
-				dimm->mtype = MEM_RMBS;
-			else if ((entry->type_detail & rdr_mask) == rdr_mask)
-				dimm->mtype = MEM_RDR;
-			else if (entry->type_detail & BIT(7))
-				dimm->mtype = MEM_SDR;
-			else if (entry->type_detail & BIT(9))
-				dimm->mtype = MEM_EDO;
-			else
-				dimm->mtype = MEM_UNKNOWN;
-		}
+	if (entry->size == 0xffff) {
+		pr_info("Can't get DIMM%i size\n", dimm->idx);
+		dimm->nr_pages = MiB_TO_PAGES(32);/* Unknown */
+	} else if (entry->size == 0x7fff) {
+		dimm->nr_pages = MiB_TO_PAGES(entry->extended_size);
+	} else {
+		if (entry->size & BIT(15))
+			dimm->nr_pages = MiB_TO_PAGES((entry->size & 0x7fff) << 10);
+		else
+			dimm->nr_pages = MiB_TO_PAGES(entry->size);
+	}
 
-		/*
-		 * Actually, we can only detect if the memory has bits for
-		 * checksum or not
-		 */
-		if (entry->total_width == entry->data_width)
-			dimm->edac_mode = EDAC_NONE;
+	switch (entry->memory_type) {
+	case 0x12:
+		if (entry->type_detail & BIT(13))
+			dimm->mtype = MEM_RDDR;
+		else
+			dimm->mtype = MEM_DDR;
+		break;
+	case 0x13:
+		if (entry->type_detail & BIT(13))
+			dimm->mtype = MEM_RDDR2;
 		else
-			dimm->edac_mode = EDAC_SECDED;
+			dimm->mtype = MEM_DDR2;
+		break;
+	case 0x14:
+		dimm->mtype = MEM_FB_DDR2;
+		break;
+	case 0x18:
+		if (entry->type_detail & BIT(12))
+			dimm->mtype = MEM_NVDIMM;
+		else if (entry->type_detail & BIT(13))
+			dimm->mtype = MEM_RDDR3;
+		else
+			dimm->mtype = MEM_DDR3;
+		break;
+	case 0x1a:
+		if (entry->type_detail & BIT(12))
+			dimm->mtype = MEM_NVDIMM;
+		else if (entry->type_detail & BIT(13))
+			dimm->mtype = MEM_RDDR4;
+		else
+			dimm->mtype = MEM_DDR4;
+		break;
+	default:
+		if (entry->type_detail & BIT(6))
+			dimm->mtype = MEM_RMBS;
+		else if ((entry->type_detail & rdr_mask) == rdr_mask)
+			dimm->mtype = MEM_RDR;
+		else if (entry->type_detail & BIT(7))
+			dimm->mtype = MEM_SDR;
+		else if (entry->type_detail & BIT(9))
+			dimm->mtype = MEM_EDO;
+		else
+			dimm->mtype = MEM_UNKNOWN;
+	}
 
-		dimm->dtype = DEV_UNKNOWN;
-		dimm->grain = 128;		/* Likely, worse case */
-
-		dimm_setup_label(dimm, entry->handle);
-
-		if (dimm->nr_pages) {
-			edac_dbg(1, "DIMM%i: %s size = %d MB%s\n",
-				dimm_fill->count, edac_mem_types[dimm->mtype],
-				PAGES_TO_MiB(dimm->nr_pages),
-				(dimm->edac_mode != EDAC_NONE) ? "(ECC)" : "");
-			edac_dbg(2, "\ttype %d, detail 0x%02x, width %d(total %d)\n",
-				entry->memory_type, entry->type_detail,
-				entry->total_width, entry->data_width);
-		}
+	/*
+	 * Actually, we can only detect if the memory has bits for
+	 * checksum or not
+	 */
+	if (entry->total_width == entry->data_width)
+		dimm->edac_mode = EDAC_NONE;
+	else
+		dimm->edac_mode = EDAC_SECDED;
+
+	dimm->dtype = DEV_UNKNOWN;
+	dimm->grain = 128;		/* Likely, worse case */
 
-		dimm->smbios_handle = entry->handle;
+	dimm_setup_label(dimm, entry->handle);
 
-		dimm_fill->count++;
+	if (dimm->nr_pages) {
+		edac_dbg(1, "DIMM%i: %s size = %d MB%s\n",
+			dimm->idx, edac_mem_types[dimm->mtype],
+			PAGES_TO_MiB(dimm->nr_pages),
+			(dimm->edac_mode != EDAC_NONE) ? "(ECC)" : "");
+		edac_dbg(2, "\ttype %d, detail 0x%02x, width %d(total %d)\n",
+			entry->memory_type, entry->type_detail,
+			entry->total_width, entry->data_width);
 	}
+
+	dimm->smbios_handle = entry->handle;
+}
+
+static void enumerate_dimms(const struct dmi_header *dh, void *arg)
+{
+	struct memdev_dmi_entry *entry = (struct memdev_dmi_entry *)dh;
+	struct ghes_hw_desc *hw = (struct ghes_hw_desc *)arg;
+	struct dimm_info *d;
+
+	if (dh->type != DMI_ENTRY_MEM_DEVICE)
+		return;
+
+	/* Enlarge the array with additional 16 */
+	if (!hw->num_dimms || !(hw->num_dimms % 16)) {
+		struct dimm_info *new;
+
+		new = krealloc(hw->dimms, (hw->num_dimms + 16) * sizeof(struct dimm_info),
+			        GFP_KERNEL);
+		if (!new) {
+			WARN_ON_ONCE(1);
+			return;
+		}
+
+		hw->dimms = new;
+	}
+
+	d = &hw->dimms[hw->num_dimms];
+	d->idx = hw->num_dimms;
+
+	assign_dmi_dimm_info(d, entry);
+
+	hw->num_dimms++;
+}
+
+static void ghes_scan_system(void)
+{
+	static bool scanned;
+
+	if (scanned)
+		return;
+
+	dmi_walk(enumerate_dimms, &ghes_hw);
+
+	scanned = true;
 }
 
 void ghes_edac_report_mem_error(int sev, struct cper_sec_mem_err *mem_err)
@@ -466,13 +495,12 @@ static struct acpi_platform_list plat_list[] = {
 int ghes_edac_register(struct ghes *ghes, struct device *dev)
 {
 	bool fake = false;
-	int rc = 0, num_dimm = 0;
 	struct mem_ctl_info *mci;
 	struct ghes_pvt *pvt;
 	struct edac_mc_layer layers[1];
-	struct ghes_edac_dimm_fill dimm_fill;
 	unsigned long flags;
 	int idx = -1;
+	int rc = 0;
 
 	if (IS_ENABLED(CONFIG_X86)) {
 		/* Check if safe to enable on this system */
@@ -492,17 +520,16 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	if (refcount_inc_not_zero(&ghes_refcount))
 		goto unlock;
 
-	/* Get the number of DIMMs */
-	dmi_walk(ghes_edac_count_dimms, &num_dimm);
+	ghes_scan_system();
 
 	/* Check if we've got a bogus BIOS */
-	if (num_dimm == 0) {
+	if (!ghes_hw.num_dimms) {
 		fake = true;
-		num_dimm = 1;
+		ghes_hw.num_dimms = 1;
 	}
 
 	layers[0].type = EDAC_MC_LAYER_ALL_MEM;
-	layers[0].size = num_dimm;
+	layers[0].size = ghes_hw.num_dimms;
 	layers[0].is_virt_csrow = true;
 
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, sizeof(struct ghes_pvt));
@@ -533,13 +560,34 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 		pr_info("So, the end result of using this driver varies from vendor to vendor.\n");
 		pr_info("If you find incorrect reports, please contact your hardware vendor\n");
 		pr_info("to correct its BIOS.\n");
-		pr_info("This system has %d DIMM sockets.\n", num_dimm);
+		pr_info("This system has %d DIMM sockets.\n", ghes_hw.num_dimms);
 	}
 
 	if (!fake) {
-		dimm_fill.count = 0;
-		dimm_fill.mci = mci;
-		dmi_walk(ghes_edac_dmidecode, &dimm_fill);
+		struct dimm_info *src, *dst;
+		int i = 0;
+
+		mci_for_each_dimm(mci, dst) {
+			src = &ghes_hw.dimms[i];
+
+			dst->idx	   = src->idx;
+			dst->smbios_handle = src->smbios_handle;
+			dst->nr_pages	   = src->nr_pages;
+			dst->mtype	   = src->mtype;
+			dst->edac_mode	   = src->edac_mode;
+			dst->dtype	   = src->dtype;
+			dst->grain	   = src->grain;
+
+			/*
+			 * If no src->label, preserve default label assigned
+			 * from EDAC core.
+			 */
+			if (strlen(src->label))
+				memcpy(dst->label, src->label, sizeof(src->label));
+
+			i++;
+		}
+
 	} else {
 		struct dimm_info *dimm = edac_get_dimm(mci, 0, 0, 0);
 
@@ -552,7 +600,7 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 
 	rc = edac_mc_add_mc(mci);
 	if (rc < 0) {
-		pr_info("Can't register at EDAC core\n");
+		pr_info("Can't register with the EDAC core\n");
 		edac_mc_free(mci);
 		rc = -ENODEV;
 		goto unlock;
@@ -566,6 +614,11 @@ int ghes_edac_register(struct ghes *ghes, struct device *dev)
 	refcount_set(&ghes_refcount, 1);
 
 unlock:
+
+	/* Not needed anymore */
+	kfree(ghes_hw.dimms);
+	ghes_hw.dimms = NULL;
+
 	mutex_unlock(&ghes_reg_mutex);
 
 	return rc;
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH 4/4] EDAC: Remove edac_get_dimm_by_index()
  2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
                   ` (2 preceding siblings ...)
  2020-06-16 17:27 ` [PATCH 3/4] EDAC/ghes: Scan the system once on driver init Borislav Petkov
@ 2020-06-16 17:27 ` Borislav Petkov
  2020-06-22 15:09 ` [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
  4 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-16 17:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

From: Borislav Petkov <bp@suse.de>

It is unused now.

Signed-off-by: Borislav Petkov <bp@suse.de>
---
 include/linux/edac.h | 29 +++++++----------------------
 1 file changed, 7 insertions(+), 22 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 6eb7d55d7c3d..15e8f3d8a895 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -594,27 +594,6 @@ struct mem_ctl_info {
 		     ? (mci)->dimms[(dimm)->idx + 1]		\
 		     : NULL)
 
-/**
- * edac_get_dimm_by_index - Get DIMM info at @index from a memory
- * 			    controller
- *
- * @mci:	MC descriptor struct mem_ctl_info
- * @index:	index in the memory controller's DIMM array
- *
- * Returns a struct dimm_info * or NULL on failure.
- */
-static inline struct dimm_info *
-edac_get_dimm_by_index(struct mem_ctl_info *mci, int index)
-{
-	if (index < 0 || index >= mci->tot_dimms)
-		return NULL;
-
-	if (WARN_ON_ONCE(mci->dimms[index]->idx != index))
-		return NULL;
-
-	return mci->dimms[index];
-}
-
 /**
  * edac_get_dimm - Get DIMM info from a memory controller given by
  *                 [layer0,layer1,layer2] position
@@ -650,6 +629,12 @@ static inline struct dimm_info *edac_get_dimm(struct mem_ctl_info *mci,
 	if (mci->n_layers > 2)
 		index = index * mci->layers[2].size + layer2;
 
-	return edac_get_dimm_by_index(mci, index);
+	if (index < 0 || index >= mci->tot_dimms)
+		return NULL;
+
+	if (WARN_ON_ONCE(mci->dimms[index]->idx != index))
+		return NULL;
+
+	return mci->dimms[index];
 }
 #endif /* _LINUX_EDAC_H_ */
-- 
2.21.0


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH 0/4] EDAC/ghes: Some cleanups
  2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
                   ` (3 preceding siblings ...)
  2020-06-16 17:27 ` [PATCH 4/4] EDAC: Remove edac_get_dimm_by_index() Borislav Petkov
@ 2020-06-22 15:09 ` Borislav Petkov
  4 siblings, 0 replies; 6+ messages in thread
From: Borislav Petkov @ 2020-06-22 15:09 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: linux-edac, LKML

On Tue, Jun 16, 2020 at 07:27:33PM +0200, Borislav Petkov wrote:
> From: Borislav Petkov <bp@suse.de>
> 
> Hi,
> 
> here are some cleanups which came up recently.
> 
> Thx.
> 
> Borislav Petkov (2):
>   EDAC/ghes: Scan the system once on driver init
>   EDAC: Remove edac_get_dimm_by_index()
> 
> Robert Richter (2):
>   EDAC/ghes: Setup DIMM label from DMI and use it in error reports
>   EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to
>     ghes_pvt
> 
>  drivers/edac/ghes_edac.c | 323 +++++++++++++++++++++++----------------
>  include/linux/edac.h     |  29 +---
>  2 files changed, 200 insertions(+), 152 deletions(-)

Queued for 5.9.

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2020-06-22 15:09 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-06-16 17:27 [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov
2020-06-16 17:27 ` [PATCH 1/4] EDAC/ghes: Setup DIMM label from DMI and use it in error reports Borislav Petkov
2020-06-16 17:27 ` [PATCH 2/4] EDAC/ghes: Remove unused members of struct ghes_edac_pvt, rename it to ghes_pvt Borislav Petkov
2020-06-16 17:27 ` [PATCH 3/4] EDAC/ghes: Scan the system once on driver init Borislav Petkov
2020-06-16 17:27 ` [PATCH 4/4] EDAC: Remove edac_get_dimm_by_index() Borislav Petkov
2020-06-22 15:09 ` [PATCH 0/4] EDAC/ghes: Some cleanups Borislav Petkov

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.