* [PATCH 00/13] Convert EDAC internal structures to support all types of Memory Controllers
@ 2012-03-29 16:45 Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
                   ` (15 more replies)
  0 siblings, 16 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski Filho

This is the 12th and final rebase of this patch series.

It is the first patchset of the EDAC rewrite. It contains all the
internal changes to the EDAC core needed to properly represent
memories on modern memory controllers that aren't organized
per rank/channel.

It is needed in order to fix a long-standing bug in the EDAC drivers
for the Intel memory controllers deployed since 2005 (in fact,
there is one older Rambus controller that suffers from the same
syndrome), including the drivers for the recent Intel Nehalem and
Sandy Bridge architectures.

The new EDAC architecture supports both per rank/channel memory
controllers and per-DIMM ones.

This changeset makes no changes to the sysfs nodes. As before,
non-per-rank memory controllers will expose memories as
"virtual csrows/virtual channels"[1].

[1] It sounds better to say "virtual" than to admit that all
EDAC Intel drivers since 2005 need to lie about their age to
the EDAC core, in order for the Kernel to accept them ;)

Mauro Carvalho Chehab (13):
  edac: Create a dimm struct and move the labels into it
  edac: move dimm properties to struct memset_info
  edac: Don't initialize csrow's first_page & friends when not needed
  edac: move nr_pages to dimm struct
  edac: Fix core support for MC's that see DIMMS instead of ranks
  edac: Initialize the dimm label with the known information
  edac: Cleanup the logs for i7core and sb edac drivers
  i5400_edac: improve debug messages to better represent the filled
    memory
  events/hw_event: Create a Hardware Events Report Mechanism (HERM)
  i5000_edac: Fix the logic that retrieves memory information
  e752x_edac: provide more info about how DIMMS/ranks are mapped
  edac: Rename the parent dev to pdev
  edac: use Documentation-nano format for some data structs

 drivers/edac/amd64_edac.c       |  210 +++++++------
 drivers/edac/amd76x_edac.c      |   48 ++-
 drivers/edac/cell_edac.c        |   54 ++-
 drivers/edac/cpc925_edac.c      |   95 ++++---
 drivers/edac/e752x_edac.c       |  121 +++++---
 drivers/edac/e7xxx_edac.c       |   90 ++++--
 drivers/edac/edac_core.h        |   48 +--
 drivers/edac/edac_device.c      |   27 +-
 drivers/edac/edac_mc.c          |  688 ++++++++++++++++++++++++---------------
 drivers/edac/edac_mc_sysfs.c    |  160 ++++++----
 drivers/edac/edac_module.h      |    2 +-
 drivers/edac/edac_pci.c         |    7 +-
 drivers/edac/i3000_edac.c       |   55 ++-
 drivers/edac/i3200_edac.c       |   64 ++--
 drivers/edac/i5000_edac.c       |  227 +++++++------
 drivers/edac/i5100_edac.c       |  108 +++----
 drivers/edac/i5400_edac.c       |  266 ++++++++-------
 drivers/edac/i7300_edac.c       |  119 +++----
 drivers/edac/i7core_edac.c      |  245 ++++----------
 drivers/edac/i82443bxgx_edac.c  |   47 ++-
 drivers/edac/i82860_edac.c      |   61 +++--
 drivers/edac/i82875p_edac.c     |   57 +++-
 drivers/edac/i82975x_edac.c     |   63 +++--
 drivers/edac/mpc85xx_edac.c     |   48 ++-
 drivers/edac/mv64x60_edac.c     |   49 ++-
 drivers/edac/pasemi_edac.c      |   57 ++--
 drivers/edac/ppc4xx_edac.c      |   66 ++--
 drivers/edac/r82600_edac.c      |   46 ++-
 drivers/edac/sb_edac.c          |  207 +++++--------
 drivers/edac/tile_edac.c        |   37 ++-
 drivers/edac/x38_edac.c         |   60 ++--
 include/linux/edac.h            |  244 +++++++++++---
 include/trace/events/hw_event.h |  107 ++++++
 33 files changed, 2195 insertions(+), 1588 deletions(-)
 create mode 100644 include/trace/events/hw_event.h

-- 
1.7.8



* [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal structures to support all types of Memory Controllers Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-30 10:50   ` Borislav Petkov
  2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
                   ` (14 subsequent siblings)
  15 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The way a DIMM is currently represented implies that DIMMs are
linked into a per-csrow struct. However, some drivers don't see
csrows, as they're hidden behind some chip, like the AMBs
on FB-DIMMs.

This forced drivers to fake a csrow struct, making a mess of
the original csrow/channel concept.

Move the DIMM labels into a per-DIMM struct, and add the real
location of the socket there, in terms of csrow/channel.
Later patches will modify the location to properly represent the
memory architecture.

All other drivers will use a per-csrow type of location.
Some of those drivers will require a later conversion, as
they also fake the csrows internally.
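
As a rough illustration of the default per-csrow location, here is a
userspace sketch, not the kernel code. The struct names, LABEL_LEN and
the fixed 2x2 geometry are simplified stand-ins chosen for the example;
the mc_*() helpers are hypothetical. It only shows the idea that each
(csrow, channel) pair gets its own dimm entry and that the label is
reached through the rank's dimm pointer.

```c
#include <assert.h>
#include <stdio.h>

#define LABEL_LEN 31	/* stand-in for EDAC_MC_LABEL_LEN (assumed value) */
#define NR_CSROWS 2	/* hypothetical controller geometry */
#define NR_CHANS  2

struct dimm {		/* simplified stand-in for struct dimm_info */
	char label[LABEL_LEN + 1];
	unsigned csrow;
	unsigned csrow_channel;
};

struct rank {		/* simplified stand-in for struct rank_info */
	struct dimm *dimm;
};

struct mc {		/* simplified stand-in for struct mem_ctl_info */
	struct rank ranks[NR_CSROWS][NR_CHANS];
	struct dimm dimms[NR_CSROWS * NR_CHANS];
	unsigned nr_dimms;
};

/* Default layout: one dimm entry per (csrow, channel) pair. */
void mc_map_default(struct mc *mc)
{
	struct dimm *dimm = mc->dimms;
	unsigned row, chn;

	mc->nr_dimms = 0;
	for (row = 0; row < NR_CSROWS; row++) {
		for (chn = 0; chn < NR_CHANS; chn++) {
			mc->ranks[row][chn].dimm = dimm;
			dimm->csrow = row;
			dimm->csrow_channel = chn;
			dimm->label[0] = '\0';
			dimm++;
			mc->nr_dimms++;
		}
	}
}

/* The label is no longer stored in the rank; follow the dimm pointer. */
const char *mc_label(const struct mc *mc, unsigned row, unsigned chn)
{
	assert(row < NR_CSROWS && chn < NR_CHANS);
	return mc->ranks[row][chn].dimm->label;
}

/* Store a label, truncating it to LABEL_LEN characters if needed. */
void mc_set_label(struct mc *mc, unsigned row, unsigned chn, const char *s)
{
	assert(row < NR_CSROWS && chn < NR_CHANS);
	snprintf(mc->ranks[row][chn].dimm->label, LABEL_LEN + 1, "%s", s);
}
```

A driver that does not fake csrows would be expected to replace this
default mapping with its own, which is what the later patches in the
series are about.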

TODO: While this patch doesn't change the existing behavior, on
csrow-based memory controllers a csrow/channel pair points to a memory
rank. There's a known bug in the EDAC core that allows different
labels for the same DIMM if it has more than one rank. A later patch
is needed to merge the several ranks of a DIMM into the same dimm_info
struct, in order to avoid having different labels for the same DIMM.
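
To make the TODO concrete, here is a minimal userspace sketch of the
intended end state, under the assumption that merging ranks simply
means pointing them at one shared entry. The helpers ranks_share_dimm()
and rank_set_label() are hypothetical and do not exist in the EDAC
core; the structs are reduced to the fields needed for the example.

```c
#include <assert.h>
#include <stdio.h>

#define LABEL_LEN 31	/* stand-in for EDAC_MC_LABEL_LEN (assumed value) */

struct dimm {		/* simplified stand-in for struct dimm_info */
	char label[LABEL_LEN + 1];
};

struct rank {		/* simplified stand-in for struct rank_info */
	struct dimm *dimm;
};

/* Hypothetical helper: make every rank in ranks[] share one dimm entry. */
void ranks_share_dimm(struct rank *ranks, unsigned nr_ranks, struct dimm *dimm)
{
	unsigned i;

	for (i = 0; i < nr_ranks; i++)
		ranks[i].dimm = dimm;
}

/* Relabel through any rank; every rank sharing the entry sees the change. */
void rank_set_label(struct rank *rank, const char *label)
{
	assert(rank->dimm != NULL);
	snprintf(rank->dimm->label, sizeof(rank->dimm->label), "%s", label);
}
```

With one entry per rank, as in this patch, writing a label through one
rank of a dual-rank DIMM leaves the other rank's label untouched. Once
both ranks share an entry, there is only one label to write.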

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_mc.c       |   50 +++++++++++++++++++++++++++++++----------
 drivers/edac/edac_mc_sysfs.c |   11 ++++-----
 drivers/edac/i5100_edac.c    |    8 +++---
 drivers/edac/i7core_edac.c   |    4 +-
 drivers/edac/i82975x_edac.c  |    2 +-
 drivers/edac/sb_edac.c       |    4 +-
 include/linux/edac.h         |   28 +++++++++++++++++++----
 7 files changed, 75 insertions(+), 32 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 690cbf1..c03bfe7 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,7 +44,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->label);
+	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
 }
 
@@ -157,6 +157,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	struct mem_ctl_info *mci;
 	struct csrow_info *csi, *csrow;
 	struct rank_info *chi, *chp, *chan;
+	struct dimm_info *dimm;
 	void *pvt;
 	unsigned size;
 	int row, chn;
@@ -170,7 +171,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	mci = (struct mem_ctl_info *)0;
 	csi = edac_align_ptr(&mci[1], sizeof(*csi));
 	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	pvt = edac_align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
+	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
+	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	mci = kzalloc(size, GFP_KERNEL);
@@ -182,11 +184,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
+	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
+	mci->dimms  = dimm;
 	mci->pvt_info = pvt;
 	mci->nr_csrows = nr_csrows;
 
@@ -205,6 +209,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 		}
 	}
 
+	/*
+	 * By default, assumes that a per-csrow arrangement will be used,
+	 * as most drivers are based on such assumption.
+	 */
+	dimm = mci->dimms;
+	for (row = 0; row < mci->nr_csrows; row++) {
+		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
+			mci->csrows[row].channels[chn].dimm = dimm;
+			dimm->csrow = row;
+			dimm->csrow_channel = chn;
+			dimm++;
+			mci->nr_dimms++;
+		}
+	}
+
 	mci->op_state = OP_ALLOC;
 	INIT_LIST_HEAD(&mci->grp_kobj_list);
 
@@ -678,6 +697,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 		int row, int channel, const char *msg)
 {
 	unsigned long remapped_page;
+	char *label = NULL;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -701,6 +721,8 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 		return;
 	}
 
+	label = mci->csrows[row].channels[channel].dimm->label;
+
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
 		edac_mc_printk(mci, KERN_WARNING,
@@ -708,7 +730,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
 			page_frame_number, offset_in_page,
 			mci->csrows[row].grain, syndrome, row, channel,
-			mci->csrows[row].channels[channel].label, msg);
+			label, msg);
 
 	mci->ce_count++;
 	mci->csrows[row].ce_count++;
@@ -754,6 +776,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 	char *pos = labels;
 	int chan;
 	int chars;
+	char *label = NULL;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -767,15 +790,15 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		return;
 	}
 
-	chars = snprintf(pos, len + 1, "%s",
-			 mci->csrows[row].channels[0].label);
+	label = mci->csrows[row].channels[0].dimm->label;
+	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
 
 	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
 		chan++) {
-		chars = snprintf(pos, len + 1, ":%s",
-				 mci->csrows[row].channels[chan].label);
+		label = mci->csrows[row].channels[chan].dimm->label;
+		chars = snprintf(pos, len + 1, ":%s", label);
 		len -= chars;
 		pos += chars;
 	}
@@ -824,6 +847,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	char labels[len + 1];
 	char *pos = labels;
 	int chars;
+	char *label;
 
 	if (csrow >= mci->nr_csrows) {
 		/* something is wrong */
@@ -858,12 +882,12 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	mci->csrows[csrow].ue_count++;
 
 	/* Generate the DIMM labels from the specified channels */
-	chars = snprintf(pos, len + 1, "%s",
-			 mci->csrows[csrow].channels[channela].label);
+	label = mci->csrows[csrow].channels[channela].dimm->label;
+	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
 	chars = snprintf(pos, len + 1, "-%s",
-			 mci->csrows[csrow].channels[channelb].label);
+			mci->csrows[csrow].channels[channelb].dimm->label);
 
 	if (edac_mc_get_log_ue())
 		edac_mc_printk(mci, KERN_EMERG,
@@ -885,6 +909,7 @@ EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
 void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 			unsigned int csrow, unsigned int channel, char *msg)
 {
+	char *label = NULL;
 
 	/* Ensure boundary values */
 	if (csrow >= mci->nr_csrows) {
@@ -904,12 +929,13 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 		return;
 	}
 
+	label = mci->csrows[csrow].channels[channel].dimm->label;
+
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
 		edac_mc_printk(mci, KERN_WARNING,
 			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel,
-			mci->csrows[csrow].channels[channel].label, msg);
+			csrow, channel, label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d56e634..c83697c 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -170,11 +170,11 @@ static ssize_t channel_dimm_label_show(struct csrow_info *csrow,
 				char *data, int channel)
 {
 	/* if field has not been initialized, there is nothing to send */
-	if (!csrow->channels[channel].label[0])
+	if (!csrow->channels[channel].dimm->label[0])
 		return 0;
 
 	return snprintf(data, EDAC_MC_LABEL_LEN, "%s\n",
-			csrow->channels[channel].label);
+			csrow->channels[channel].dimm->label);
 }
 
 static ssize_t channel_dimm_label_store(struct csrow_info *csrow,
@@ -184,8 +184,8 @@ static ssize_t channel_dimm_label_store(struct csrow_info *csrow,
 	ssize_t max_size = 0;
 
 	max_size = min((ssize_t) count, (ssize_t) EDAC_MC_LABEL_LEN - 1);
-	strncpy(csrow->channels[channel].label, data, max_size);
-	csrow->channels[channel].label[max_size] = '\0';
+	strncpy(csrow->channels[channel].dimm->label, data, max_size);
+	csrow->channels[channel].dimm->label[max_size] = '\0';
 
 	return max_size;
 }
@@ -952,9 +952,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (csrow->nr_pages > 0) {
+		if (mci->csrows[i].nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
-		}
 	}
 
 	/* remove the mci instance's attributes, if any */
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 2a6e7ff..2ce7ef1 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -433,7 +433,7 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].label, msg);
+		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
@@ -455,7 +455,7 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].label, msg);
+		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
 
 	mci->ue_count++;
 	mci->csrows[csrow].ue_count++;
@@ -868,8 +868,8 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		mci->csrows[i].channels[0].chan_idx = 0;
 		mci->csrows[i].channels[0].ce_count = 0;
 		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		snprintf(mci->csrows[i].channels[0].label,
-			 sizeof(mci->csrows[i].channels[0].label),
+		snprintf(mci->csrows[i].channels[0].dimm->label,
+			 sizeof(mci->csrows[i].channels[0].dimm->label),
 			 "DIMM%u", i5100_rank_to_slot(mci, chan, rank));
 
 		total_pages += npages;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 8568d9b..5203f30 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -746,8 +746,8 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 
 			csr->edac_mode = mode;
 			csr->mtype = mtype;
-			snprintf(csr->channels[0].label,
-					sizeof(csr->channels[0].label),
+			snprintf(csr->channels[0].dimm->label,
+					sizeof(csr->channels[0].dimm->label),
 					"CPU#%uChannel#%u_DIMM#%u",
 					pvt->i7core_dev->socket, i, j);
 
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 4184e01..864061b 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -407,7 +407,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 *   [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
 		for (chan = 0; chan < csrow->nr_channels; chan++)
-			strncpy(csrow->channels[chan].label,
+			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 2917887..dea1ef3 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -651,8 +651,8 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 				csr->channels[0].chan_idx = i;
 				csr->channels[0].ce_count = 0;
 				pvt->csrow_map[i][j] = csrow;
-				snprintf(csr->channels[0].label,
-					 sizeof(csr->channels[0].label),
+				snprintf(csr->channels[0].dimm->label,
+					 sizeof(csr->channels[0].dimm->label),
 					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
 					 pvt->sbridge_dev->source_id, i, j);
 				last_page += npages;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index e3e3d26..f40b835 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -308,23 +308,34 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
+/* FIXME: add a per-dimm ce error count */
+struct dimm_info {
+	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
+	unsigned memory_controller;
+	unsigned csrow;
+	unsigned csrow_channel;
+};
+
 /**
  * struct rank_info - contains the information for one DIMM rank
  *
  * @chan_idx:	channel number where the rank is (typically, 0 or 1)
  * @ce_count:	number of correctable errors for this rank
- * @label:	DIMM label. Different ranks for the same DIMM should be
- *		filled, on userspace, with the same label.
- *		FIXME: The core currently won't enforce it.
  * @csrow:	A pointer to the chip select row structure (the parent
  *		structure). The location of the rank is given by
  *		the (csrow->csrow_idx, chan_idx) vector.
+ * @dimm:	A pointer to the DIMM structure, where the DIMM label
+ *		information is stored.
+ *
+ * FIXME: Currently, the EDAC core model will assume one DIMM per rank.
+ *	  This is a bad assumption, but it makes this patch easier. Later
+ *	  patches in this series will fix this issue.
  */
 struct rank_info {
 	int chan_idx;
 	u32 ce_count;
-	char label[EDAC_MC_LABEL_LEN + 1];
-	struct csrow_info *csrow;	/* the parent */
+	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 };
 
 struct csrow_info {
@@ -424,6 +435,13 @@ struct mem_ctl_info {
 	int mc_idx;
 	int nr_csrows;
 	struct csrow_info *csrows;
+
+	/*
+	 * DIMM info. Will eventually remove the entire csrows_info some day
+	 */
+	unsigned nr_dimms;
+	struct dimm_info *dimms;
+
 	/*
 	 * FIXME - what about controllers on other busses? - IDs must be
 	 * unique.  dev pointer should be sufficiently unique, but
-- 
1.7.8



* [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal structures to support all types of Memory Controllers Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-30 13:10   ` Borislav Petkov
  2012-03-30 17:03   ` Borislav Petkov
  2012-03-29 16:45 ` [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
                   ` (13 subsequent siblings)
  15 siblings, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

On systems based on chip select rows, all channels need to use memories
with the same properties; otherwise the memories on channels A and B
won't be recognized.

However, this assumption does not hold for all types of memory
controllers.

Controllers for FB-DIMMs don't have such a requirement.

Also, modern Intel controllers seem to be capable of handling such
differences.

So, we need to stop storing the DIMM information in per-csrow data
and store it in the right place instead.

The first step is to move grain, mtype, dtype and edac_mode to the
per-dimm struct.
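
The conversion pattern most drivers follow can be sketched in
userspace as below. This is an illustration only: the sk_* enums are
hypothetical stand-ins for the kernel's enum mem_type and
enum edac_type, the structs are reduced to a few fields, and
csrow_fill_dimms() is not a real EDAC function. A csrow-based driver
still knows one set of properties per row, so it copies them to every
DIMM on that row; a per-DIMM driver can then set different values on
each entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the kernel's enums have different members. */
enum sk_mem_type { SK_MEM_EMPTY, SK_MEM_RDDR, SK_MEM_FB_DDR2 };
enum sk_edac_mode { SK_EDAC_NONE, SK_EDAC_SECDED };

struct dimm {		/* simplified stand-in for struct dimm_info */
	unsigned grain;
	enum sk_mem_type mtype;
	enum sk_edac_mode edac_mode;
};

struct rank {		/* simplified stand-in for struct rank_info */
	struct dimm *dimm;
};

struct csrow {		/* simplified stand-in for struct csrow_info */
	unsigned nr_channels;
	struct rank *channels;
};

/* Copy the row-wide properties to the DIMM behind each channel. */
void csrow_fill_dimms(struct csrow *csrow, enum sk_mem_type mtype,
		      enum sk_edac_mode edac_mode, unsigned grain)
{
	unsigned j;

	assert(csrow != NULL && csrow->channels != NULL);
	for (j = 0; j < csrow->nr_channels; j++) {
		struct dimm *dimm = csrow->channels[j].dimm;

		dimm->grain = grain;
		dimm->mtype = mtype;
		dimm->edac_mode = edac_mode;
	}
}
```

Because the properties now live in the dimm entries, code that reports
an error can read the grain from the DIMM that was hit rather than
from the row.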

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |   30 +++++++++++--------
 drivers/edac/amd76x_edac.c     |   10 ++++--
 drivers/edac/cell_edac.c       |   10 +++++-
 drivers/edac/cpc925_edac.c     |   62 +++++++++++++++++++++------------------
 drivers/edac/e752x_edac.c      |   44 +++++++++++++++-------------
 drivers/edac/e7xxx_edac.c      |   44 ++++++++++++++++------------
 drivers/edac/edac_mc.c         |   19 ++++++++----
 drivers/edac/edac_mc_sysfs.c   |    6 ++--
 drivers/edac/i3000_edac.c      |   18 ++++++-----
 drivers/edac/i3200_edac.c      |   18 ++++++-----
 drivers/edac/i5000_edac.c      |   24 +++++++--------
 drivers/edac/i5100_edac.c      |   38 +++++++++++++-----------
 drivers/edac/i5400_edac.c      |   24 ++++++---------
 drivers/edac/i7300_edac.c      |   25 +++++++++------
 drivers/edac/i7core_edac.c     |   27 ++++++++---------
 drivers/edac/i82443bxgx_edac.c |   13 +++++---
 drivers/edac/i82860_edac.c     |   11 ++++--
 drivers/edac/i82875p_edac.c    |   17 ++++++++---
 drivers/edac/i82975x_edac.c    |   17 +++++++----
 drivers/edac/mpc85xx_edac.c    |   13 +++++---
 drivers/edac/mv64x60_edac.c    |   18 ++++++-----
 drivers/edac/pasemi_edac.c     |   10 ++++--
 drivers/edac/ppc4xx_edac.c     |   13 +++++---
 drivers/edac/r82600_edac.c     |   10 ++++--
 drivers/edac/sb_edac.c         |   31 +++++++++++---------
 drivers/edac/tile_edac.c       |   13 ++++----
 drivers/edac/x38_edac.c        |   17 ++++++-----
 include/linux/edac.h           |   21 ++++++++-----
 28 files changed, 340 insertions(+), 263 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index c9eee6d..3e7bddc 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2168,7 +2168,9 @@ static int init_csrows(struct mem_ctl_info *mci)
 	struct amd64_pvt *pvt = mci->pvt_info;
 	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
 	u32 val;
-	int i, empty = 1;
+	int i, j, empty = 1;
+	enum mem_type mtype;
+	enum edac_type edac_mode;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2202,7 +2204,21 @@ static int init_csrows(struct mem_ctl_info *mci)
 		csrow->page_mask = ~mask;
 		/* 8 bytes of resolution */
 
-		csrow->mtype = amd64_determine_memory_type(pvt, i);
+		mtype = amd64_determine_memory_type(pvt, i);
+
+		/*
+		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
+		 */
+		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
+			edac_mode = (pvt->nbcfg & NBCFG_CHIPKILL) ?
+				    EDAC_S4ECD4ED : EDAC_SECDED;
+		else
+			edac_mode = EDAC_NONE;
+
+		for (j = 0; j < pvt->channel_count; j++) {
+			csrow->channels[j].dimm->mtype = mtype;
+			csrow->channels[j].dimm->edac_mode = edac_mode;
+		}
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
 		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
@@ -2214,16 +2230,6 @@ static int init_csrows(struct mem_ctl_info *mci)
 			"last_page: 0x%lx\n",
 			(unsigned)csrow->nr_pages,
 			csrow->first_page, csrow->last_page);
-
-		/*
-		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
-		 */
-		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
-			csrow->edac_mode =
-			    (pvt->nbcfg & NBCFG_CHIPKILL) ?
-			    EDAC_S4ECD4ED : EDAC_SECDED;
-		else
-			csrow->edac_mode = EDAC_NONE;
 	}
 
 	return empty;
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index e47e73b..2a63ed0 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -186,11 +186,13 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			enum edac_type edac_mode)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 mba, mba_base, mba_mask, dms;
 	int index;
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_dword(pdev,
@@ -206,10 +208,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		csrow->grain = csrow->nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_RDDR;
-		csrow->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
-		csrow->edac_mode = edac_mode;
+		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->mtype = MEM_RDDR;
+		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
+		dimm->edac_mode = edac_mode;
 	}
 }
 
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 9a6a274..94fbb12 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -124,8 +124,10 @@ static void cell_edac_check(struct mem_ctl_info *mci)
 static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 {
 	struct csrow_info		*csrow = &mci->csrows[0];
+	struct dimm_info		*dimm;
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
+	int				j;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,8 +144,12 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		csrow->first_page = r.start >> PAGE_SHIFT;
 		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		csrow->mtype = MEM_XDR;
-		csrow->edac_mode = EDAC_SECDED;
+
+		for (j = 0; j < csrow->nr_channels; j++) {
+			dimm = csrow->channels[j].dimm;
+			dimm->mtype = MEM_XDR;
+			dimm->edac_mode = EDAC_SECDED;
+		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index a774c0d..ee90f3d 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -329,7 +329,8 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 {
 	struct cpc925_mc_pdata *pdata = mci->pvt_info;
 	struct csrow_info *csrow;
-	int index;
+	struct dimm_info *dimm;
+	int index, j;
 	u32 mbmr, mbbar, bba;
 	unsigned long row_size, last_nr_pages = 0;
 
@@ -354,32 +355,35 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
-		csrow->mtype = MEM_RDDR;
-		csrow->edac_mode = EDAC_SECDED;
-
-		switch (csrow->nr_channels) {
-		case 1: /* Single channel */
-			csrow->grain = 32; /* four-beat burst of 32 bytes */
-			break;
-		case 2: /* Dual channel */
-		default:
-			csrow->grain = 64; /* four-beat burst of 64 bytes */
-			break;
-		}
-
-		switch ((mbmr & MBMR_MODE_MASK) >> MBMR_MODE_SHIFT) {
-		case 6: /* 0110, no way to differentiate X8 VS X16 */
-		case 5:	/* 0101 */
-		case 8: /* 1000 */
-			csrow->dtype = DEV_X16;
-			break;
-		case 7: /* 0111 */
-		case 9: /* 1001 */
-			csrow->dtype = DEV_X8;
-			break;
-		default:
-			csrow->dtype = DEV_UNKNOWN;
-			break;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			dimm = csrow->channels[j].dimm;
+			dimm->mtype = MEM_RDDR;
+			dimm->edac_mode = EDAC_SECDED;
+
+			switch (csrow->nr_channels) {
+			case 1: /* Single channel */
+				dimm->grain = 32; /* four-beat burst of 32 bytes */
+				break;
+			case 2: /* Dual channel */
+			default:
+				dimm->grain = 64; /* four-beat burst of 64 bytes */
+				break;
+			}
+
+			switch ((mbmr & MBMR_MODE_MASK) >> MBMR_MODE_SHIFT) {
+			case 6: /* 0110, no way to differentiate X8 VS X16 */
+			case 5:	/* 0101 */
+			case 8: /* 1000 */
+				dimm->dtype = DEV_X16;
+				break;
+			case 7: /* 0111 */
+			case 9: /* 1001 */
+				dimm->dtype = DEV_X8;
+				break;
+			default:
+				dimm->dtype = DEV_UNKNOWN;
+				break;
+			}
 		}
 	}
 }
@@ -962,9 +966,9 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	nr_channels = cpc925_mc_get_channels(vbase);
+	nr_channels = cpc925_mc_get_channels(vbase) + 1;
 	mci = edac_mc_alloc(sizeof(struct cpc925_mc_pdata),
-			CPC925_NR_CSROWS, nr_channels + 1, edac_mc_idx);
+			CPC925_NR_CSROWS, nr_channels, edac_mc_idx);
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
 		res = -ENOMEM;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 1af531a..db291ea 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size;
+	u32 dra, drc, cumul_size, i;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1053,7 +1053,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		dra |= dra_reg << (index * 8);
 	}
 	pci_read_config_dword(pdev, E752X_DRC, &drc);
-	drc_chan = dual_channel_active(ddrcsr);
+	drc_chan = dual_channel_active(ddrcsr) ? 1 : 0;
 	drc_drbg = drc_chan + 1;	/* 128 in dual mode, 64 in single */
 	drc_ddim = (drc >> 20) & 0x3;
 
@@ -1080,24 +1080,28 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
-		csrow->mtype = MEM_RDDR;	/* only one type supported */
-		csrow->dtype = mem_dev ? DEV_X4 : DEV_X8;
-
-		/*
-		 * if single channel or x8 devices then SECDED
-		 * if dual channel and x4 then S4ECD4ED
-		 */
-		if (drc_ddim) {
-			if (drc_chan && mem_dev) {
-				csrow->edac_mode = EDAC_S4ECD4ED;
-				mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
-			} else {
-				csrow->edac_mode = EDAC_SECDED;
-				mci->edac_cap |= EDAC_FLAG_SECDED;
-			}
-		} else
-			csrow->edac_mode = EDAC_NONE;
+
+		for (i = 0; i < drc_chan + 1; i++) {
+			struct dimm_info *dimm = csrow->channels[i].dimm;
+			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
+			dimm->mtype = MEM_RDDR;	/* only one type supported */
+			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
+
+			/*
+			* if single channel or x8 devices then SECDED
+			* if dual channel and x4 then S4ECD4ED
+			*/
+			if (drc_ddim) {
+				if (drc_chan && mem_dev) {
+					dimm->edac_mode = EDAC_S4ECD4ED;
+					mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
+				} else {
+					dimm->edac_mode = EDAC_SECDED;
+					mci->edac_cap |= EDAC_FLAG_SECDED;
+				}
+			} else
+				dimm->edac_mode = EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 6ffb6d2..178d2af 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -347,11 +347,12 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			int dev_idx, u32 drc)
 {
 	unsigned long last_cumul_size;
-	int index;
+	int index, j;
 	u8 value;
 	u32 dra, cumul_size;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 
 	pci_read_config_dword(pdev, E7XXX_DRA, &dra);
 	drc_chan = dual_channel_active(drc, dev_idx);
@@ -381,24 +382,29 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
-		csrow->mtype = MEM_RDDR;	/* only one type supported */
-		csrow->dtype = mem_dev ? DEV_X4 : DEV_X8;
-
-		/*
-		 * if single channel or x8 devices then SECDED
-		 * if dual channel and x4 then S4ECD4ED
-		 */
-		if (drc_ddim) {
-			if (drc_chan && mem_dev) {
-				csrow->edac_mode = EDAC_S4ECD4ED;
-				mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
-			} else {
-				csrow->edac_mode = EDAC_SECDED;
-				mci->edac_cap |= EDAC_FLAG_SECDED;
-			}
-		} else
-			csrow->edac_mode = EDAC_NONE;
+
+		for (j = 0; j < drc_chan + 1; j++) {
+			dimm = csrow->channels[j].dimm;
+
+			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
+			dimm->mtype = MEM_RDDR;	/* only one type supported */
+			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
+
+			/*
+			 * if single channel or x8 devices then SECDED
+			 * if dual channel and x4 then S4ECD4ED
+			 */
+			if (drc_ddim) {
+				if (drc_chan && mem_dev) {
+					dimm->edac_mode = EDAC_S4ECD4ED;
+					mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
+				} else {
+					dimm->edac_mode = EDAC_SECDED;
+					mci->edac_cap |= EDAC_FLAG_SECDED;
+				}
+			} else
+				dimm->edac_mode = EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index c03bfe7..2430ddb 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,7 +43,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
+	debugf4("\tchannel->dimm->ce_count = %d\n", chan->dimm->ce_count);
 	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
 }
@@ -698,6 +698,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 {
 	unsigned long remapped_page;
 	char *label = NULL;
+	u32 grain;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -722,6 +723,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 	}
 
 	label = mci->csrows[row].channels[channel].dimm->label;
+	grain = mci->csrows[row].channels[channel].dimm->grain;
 
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
@@ -729,11 +731,12 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
 			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
 			page_frame_number, offset_in_page,
-			mci->csrows[row].grain, syndrome, row, channel,
+			grain, syndrome, row, channel,
 			label, msg);
 
 	mci->ce_count++;
 	mci->csrows[row].ce_count++;
+	mci->csrows[row].channels[channel].dimm->ce_count++;
 	mci->csrows[row].channels[channel].ce_count++;
 
 	if (mci->scrub_mode & SCRUB_SW_SRC) {
@@ -750,8 +753,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			mci->ctl_page_to_phys(mci, page_frame_number) :
 			page_frame_number;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page,
-				mci->csrows[row].grain);
+		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
 	}
 }
 EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
@@ -777,6 +779,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 	int chan;
 	int chars;
 	char *label = NULL;
+	u32 grain;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -790,6 +793,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		return;
 	}
 
+	grain = mci->csrows[row].channels[0].dimm->grain;
 	label = mci->csrows[row].channels[0].dimm->label;
 	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
@@ -807,14 +811,13 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		edac_mc_printk(mci, KERN_EMERG,
 			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
 			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, mci->csrows[row].grain, row,
-			labels, msg);
+			offset_in_page, grain, row, labels, msg);
 
 	if (edac_mc_get_panic_on_ue())
 		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
 			"row %d, labels \"%s\": %s\n", mci->mc_idx,
 			page_frame_number, offset_in_page,
-			mci->csrows[row].grain, row, labels, msg);
+			grain, row, labels, msg);
 
 	mci->ue_count++;
 	mci->csrows[row].ue_count++;
@@ -886,6 +889,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
+
 	chars = snprintf(pos, len + 1, "-%s",
 			mci->csrows[csrow].channels[channelb].dimm->label);
 
@@ -939,6 +943,7 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
+	mci->csrows[csrow].channels[channel].dimm->ce_count++;
 	mci->csrows[csrow].channels[channel].ce_count++;
 }
 EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index c83697c..d63904e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -150,19 +150,19 @@ static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", mem_types[csrow->mtype]);
+	return sprintf(data, "%s\n", mem_types[csrow->channels[0].dimm->mtype]);
 }
 
 static ssize_t csrow_dev_type_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", dev_types[csrow->dtype]);
+	return sprintf(data, "%s\n", dev_types[csrow->channels[0].dimm->dtype]);
 }
 
 static ssize_t csrow_edac_mode_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", edac_caps[csrow->edac_mode]);
+	return sprintf(data, "%s\n", edac_caps[csrow->channels[0].dimm->edac_mode]);
 }
 
 /* show/store functions for DIMM Label attributes */
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index c0510b3..1498c5f 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -304,7 +304,7 @@ static int i3000_is_interleaved(const unsigned char *c0dra,
 static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_cumul_size;
 	int interleaved, nr_channels;
@@ -386,19 +386,21 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 			cumul_size <<= 1;
 		debugf3("MC: %s(): (%d) cumul_size 0x%x\n",
 			__func__, i, cumul_size);
-		if (cumul_size == last_cumul_size) {
-			csrow->mtype = MEM_EMPTY;
+		if (cumul_size == last_cumul_size)
 			continue;
-		}
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = I3000_DEAP_GRAIN;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+
+		for (j = 0; j < nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			dimm->grain = I3000_DEAP_GRAIN;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	/*
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 73f55e200..73529fd 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -319,7 +319,7 @@ static unsigned long drb_to_nr_pages(
 static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_page;
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
@@ -375,20 +375,22 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 			i / I3200_RANKS_PER_CHANNEL,
 			i % I3200_RANKS_PER_CHANNEL);
 
-		if (nr_pages == 0) {
-			csrow->mtype = MEM_EMPTY;
+		if (nr_pages == 0)
 			continue;
-		}
 
 		csrow->first_page = last_page + 1;
 		last_page += nr_pages;
 		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
-		csrow->grain = nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+		for (j = 0; j < nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->grain = nr_pages << PAGE_SHIFT;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	i3200_clear_error_info(mci);
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 4dc3ac2..e612f1e 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1268,25 +1268,23 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 		p_csrow->last_page = 9 + csrow * 20;
 		p_csrow->page_mask = 0xFFF;
 
-		p_csrow->grain = 8;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-		}
+			p_csrow->channels[channel].dimm->grain = 8;
 
-		p_csrow->nr_pages = csrow_megs << 8;
+			/* Assume DDR2 for now */
+			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
 
-		/* Assume DDR2 for now */
-		p_csrow->mtype = MEM_FB_DDR2;
+			/* ask what device type on this row */
+			if (MTR_DRAM_WIDTH(mtr))
+				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+			else
+				p_csrow->channels[channel].dimm->dtype = DEV_X4;
 
-		/* ask what device type on this row */
-		if (MTR_DRAM_WIDTH(mtr))
-			p_csrow->dtype = DEV_X8;
-		else
-			p_csrow->dtype = DEV_X4;
-
-		p_csrow->edac_mode = EDAC_S8ECD8ED;
+			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+		}
+		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 2ce7ef1..9caff36 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -428,12 +428,16 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    const char *msg)
 {
 	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
+	char *label = NULL;
+
+	if (mci->csrows[csrow].channels[0].dimm)
+		label = mci->csrows[csrow].channels[0].dimm->label;
 
 	printk(KERN_ERR
 		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
+		csrow, label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
@@ -450,12 +454,16 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 			    const char *msg)
 {
 	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
+	char *label = NULL;
+
+	if (mci->csrows[csrow].channels[0].dimm)
+		label = mci->csrows[csrow].channels[0].dimm->label;
 
 	printk(KERN_ERR
 		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
+		csrow, label, msg);
 
 	mci->ue_count++;
 	mci->csrows[csrow].ue_count++;
@@ -837,6 +845,7 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 	int i;
 	unsigned long total_pages = 0UL;
 	struct i5100_priv *priv = mci->pvt_info;
+	struct dimm_info *dimm;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		const unsigned long npages = i5100_npages(mci, i);
@@ -852,27 +861,22 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 */
 		mci->csrows[i].first_page = total_pages;
 		mci->csrows[i].last_page = total_pages + npages - 1;
-		mci->csrows[i].page_mask = 0UL;
-
 		mci->csrows[i].nr_pages = npages;
-		mci->csrows[i].grain = 32;
 		mci->csrows[i].csrow_idx = i;
-		mci->csrows[i].dtype =
-			(priv->mtr[chan][rank].width == 4) ? DEV_X4 : DEV_X8;
-		mci->csrows[i].ue_count = 0;
-		mci->csrows[i].ce_count = 0;
-		mci->csrows[i].mtype = MEM_RDDR2;
-		mci->csrows[i].edac_mode = EDAC_SECDED;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
-		mci->csrows[i].channels[0].chan_idx = 0;
-		mci->csrows[i].channels[0].ce_count = 0;
 		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		snprintf(mci->csrows[i].channels[0].dimm->label,
-			 sizeof(mci->csrows[i].channels[0].dimm->label),
-			 "DIMM%u", i5100_rank_to_slot(mci, chan, rank));
-
 		total_pages += npages;
+
+		dimm = mci->csrows[i].channels[0].dimm;
+		dimm->grain = 32;
+		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+			      DEV_X4 : DEV_X8;
+		dimm->mtype = MEM_RDDR2;
+		dimm->edac_mode = EDAC_SECDED;
+		snprintf(dimm->label, sizeof(dimm->label),
+			 "DIMM%u",
+			 i5100_rank_to_slot(mci, chan, rank));
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index b44a5de..229aff5 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1159,6 +1159,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int csrow_megs;
 	int channel;
 	int csrow;
+	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
 
@@ -1184,24 +1185,17 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		p_csrow->last_page = 9 + csrow * 20;
 		p_csrow->page_mask = 0xFFF;
 
-		p_csrow->grain = 8;
-
 		csrow_megs = 0;
-		for (channel = 0; channel < pvt->maxch; channel++)
+		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
 
-		p_csrow->nr_pages = csrow_megs << 8;
-
-		/* Assume DDR2 for now */
-		p_csrow->mtype = MEM_FB_DDR2;
-
-		/* ask what device type on this row */
-		if (MTR_DRAM_WIDTH(mtr))
-			p_csrow->dtype = DEV_X8;
-		else
-			p_csrow->dtype = DEV_X4;
-
-		p_csrow->edac_mode = EDAC_S8ECD8ED;
+			p_csrow->nr_pages = csrow_megs << 8;
+			dimm = p_csrow->channels[channel].dimm;
+			dimm->grain = 8;
+			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+		}
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 6104dba..07a5927 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -618,6 +618,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
 		      struct csrow_info *p_csrow,
+		      struct dimm_info *dimm,
 		      u32 *nr_pages)
 {
 	int mtr, ans, addrBits, channel;
@@ -663,10 +664,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->grain = 8;
-	p_csrow->mtype = MEM_FB_DDR2;
 	p_csrow->csrow_idx = slot;
-	p_csrow->page_mask = 0;
 
 	/*
 	 * The type of error detection actually depends of the
@@ -677,15 +675,17 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->grain = 8;
+	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
-		p_csrow->edac_mode = EDAC_SECDED;
+		dimm->edac_mode = EDAC_SECDED;
 		debugf2("\t\tECC code is 8-byte-over-32-byte SECDED+ code\n");
 	} else {
 		debugf2("\t\tECC code is on Lockstep mode\n");
 		if (MTR_DRAM_WIDTH(mtr) == 8)
-			p_csrow->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
 		else
-			p_csrow->edac_mode = EDAC_S4ECD4ED;
+			dimm->edac_mode = EDAC_S4ECD4ED;
 	}
 
 	/* ask what device type on this row */
@@ -694,9 +694,9 @@ static int decode_mtr(struct i7300_pvt *pvt,
 			IS_SCRBALGO_ENHANCED(pvt->mc_settings) ?
 					    "enhanced" : "normal");
 
-		p_csrow->dtype = DEV_X8;
+		dimm->dtype = DEV_X8;
 	} else
-		p_csrow->dtype = DEV_X4;
+		dimm->dtype = DEV_X4;
 
 	return mtr;
 }
@@ -779,6 +779,7 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	int mtr;
 	int ch, branch, slot, channel;
 	u32 last_page = 0, nr_pages;
+	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
 
@@ -803,20 +804,24 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
+	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
 			pci_read_config_word(pvt->pci_dev_2x_0_fbd_branch[branch],
 					where,
 					&pvt->mtr[slot][branch]);
-			for (ch = 0; ch < MAX_BRANCHES; ch++) {
+			for (ch = 0; ch < MAX_CH_PER_BRANCH; ch++) {
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
 				p_csrow = &mci->csrows[slot];
 
+				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, &nr_pages);
+						 dinfo, p_csrow, dimm,
+						 &nr_pages);
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 5203f30..21f9791 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -592,7 +592,7 @@ static int i7core_get_active_channels(const u8 socket, unsigned *channels,
 	return 0;
 }
 
-static int get_dimm_config(const struct mem_ctl_info *mci)
+static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
 	struct csrow_info *csr;
@@ -602,6 +602,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	unsigned long last_page = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
+	struct dimm_info *dimm;
 
 	/* Get data from the MC register, function 0 */
 	pdev = pvt->pci_mcr[0];
@@ -721,7 +722,6 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 			csr->nr_pages = npages;
 
 			csr->page_mask = 0;
-			csr->grain = 8;
 			csr->csrow_idx = csrow;
 			csr->nr_channels = 1;
 
@@ -730,28 +730,27 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 
 			pvt->csrow_map[i][j] = csrow;
 
+			dimm = csr->channels[0].dimm;
 			switch (banks) {
 			case 4:
-				csr->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 				break;
 			case 8:
-				csr->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 				break;
 			case 16:
-				csr->dtype = DEV_X16;
+				dimm->dtype = DEV_X16;
 				break;
 			default:
-				csr->dtype = DEV_UNKNOWN;
+				dimm->dtype = DEV_UNKNOWN;
 			}
 
-			csr->edac_mode = mode;
-			csr->mtype = mtype;
-			snprintf(csr->channels[0].dimm->label,
-					sizeof(csr->channels[0].dimm->label),
-					"CPU#%uChannel#%u_DIMM#%u",
-					pvt->i7core_dev->socket, i, j);
-
-			csrow++;
+			snprintf(dimm->label, sizeof(dimm->label),
+				 "CPU#%uChannel#%u_DIMM#%u",
+				 pvt->i7core_dev->socket, i, j);
+			dimm->grain = 8;
+			dimm->edac_mode = mode;
+			dimm->mtype = mtype;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 4329d39..1e19492 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -12,7 +12,7 @@
  * 440GX fix by Jason Uhlenkott <juhlenko@akamai.com>.
  *
  * Written with reference to 82443BX Host Bridge Datasheet:
- * http://download.intel.com/design/chipsets/datashts/29063301.pdf 
+ * http://download.intel.com/design/chipsets/datashts/29063301.pdf
  * references to this document given in [].
  *
  * This module doesn't support the 440LX, but it may be possible to
@@ -189,6 +189,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 				enum mem_type mtype)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 	u8 drbar, dramc;
 	u32 row_base, row_high_limit, row_high_limit_last;
@@ -197,6 +198,8 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 	row_high_limit_last = 0;
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		pci_read_config_byte(pdev, I82443BXGX_DRB + index, &drbar);
 		debugf1("MC%d: %s: %s() Row=%d DRB = %#0x\n",
 			mci->mc_idx, __FILE__, __func__, index, drbar);
@@ -219,12 +222,12 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
 		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
-		csrow->grain = 1 << 12;
-		csrow->mtype = mtype;
+		dimm->grain = 1 << 12;
+		dimm->mtype = mtype;
 		/* I don't think 440BX can tell you device type? FIXME? */
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 		/* Mode is global to all rows on 440BX */
-		csrow->edac_mode = edac_mode;
+		dimm->edac_mode = edac_mode;
 		row_high_limit_last = row_high_limit;
 	}
 }
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 931a057..acbd924 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -140,6 +140,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 	u16 value;
 	u32 cumul_size;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 
 	pci_read_config_word(pdev, I82860_MCHCFG, &mchcfg_ddim);
@@ -153,6 +154,8 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 	 */
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		pci_read_config_word(pdev, I82860_GBA + index * 2, &value);
 		cumul_size = (value & I82860_GBA_MASK) <<
 			(I82860_GBA_SHIFT - PAGE_SHIFT);
@@ -166,10 +169,10 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
-		csrow->mtype = MEM_RMBS;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = mchcfg_ddim ? EDAC_SECDED : EDAC_NONE;
+		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB resolution */
+		dimm->mtype = MEM_RMBS;
+		dimm->dtype = DEV_UNKNOWN;
+		dimm->edac_mode = mchcfg_ddim ? EDAC_SECDED : EDAC_NONE;
 	}
 }
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 33864c6..81f79e2 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -342,11 +342,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 				void __iomem * ovrfl_window, u32 drc)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
+	unsigned nr_chans = dual_channel_active(drc) + 1;
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
 	u32 cumul_size;
-	int index;
+	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
 	last_cumul_size = 0;
@@ -371,10 +373,15 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
-		csrow->mtype = MEM_DDR;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = drc_ddim ? EDAC_SECDED : EDAC_NONE;
+
+		for (j = 0; j < nr_chans; j++) {
+			dimm = csrow->channels[j].dimm;
+
+			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB resolution */
+			dimm->mtype = MEM_DDR;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = drc_ddim ? EDAC_SECDED : EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 864061b..0b40e11 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -309,7 +309,7 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 	chan = (mci->csrows[row].nr_channels == 1) ? 0 : info->eap & 1;
 	offst = info->eap
 			& ((1 << PAGE_SHIFT) -
-				(1 << mci->csrows[row].grain));
+			   (1 << mci->csrows[row].channels[chan].dimm->grain));
 
 	if (info->errsts & 0x0002)
 		edac_mc_handle_ue(mci, page, offst , row, "i82975x UE");
@@ -372,6 +372,8 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	u8 value;
 	u32 cumul_size;
 	int index, chan;
+	struct dimm_info *dimm;
+	enum dev_type dtype;
 
 	last_cumul_size = 0;
 
@@ -406,10 +408,17 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 *   [0-7] for single-channel; i.e. csrow->nr_channels = 1
 		 *   [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
-		for (chan = 0; chan < csrow->nr_channels; chan++)
+		dtype = i82975x_dram_type(mch_window, index);
+		for (chan = 0; chan < csrow->nr_channels; chan++) {
+			dimm = mci->csrows[index].channels[chan].dimm;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
+			dimm->grain = 1 << 7;	/* 128Byte cache-line resolution */
+			dimm->dtype = dtype;
+			dimm->mtype = MEM_DDR2; /* I82975x supports only DDR2 */
+			dimm->edac_mode = EDAC_SECDED; /* only supported */
+		}
 
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -418,10 +427,6 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 7;	/* 128Byte cache-line resolution */
-		csrow->mtype = MEM_DDR2; /* I82975x supports only DDR2 */
-		csrow->dtype = i82975x_dram_type(mch_window, index);
-		csrow->edac_mode = EDAC_SECDED; /* only supported */
 	}
 }
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 73464a6..fb92916 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -883,6 +883,7 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 {
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 sdram_ctl;
 	u32 sdtype;
 	enum mem_type mtype;
@@ -929,6 +930,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 		u32 end;
 
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		cs_bnds = in_be32(pdata->mc_vbase + MPC85XX_MC_CS_BNDS_0 +
 				  (index * MPC85XX_MC_CS_BNDS_OFS));
 
@@ -945,12 +948,12 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 		csrow->first_page = start;
 		csrow->last_page = end;
 		csrow->nr_pages = end + 1 - start;
-		csrow->grain = 8;
-		csrow->mtype = mtype;
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->grain = 8;
+		dimm->mtype = mtype;
+		dimm->dtype = DEV_UNKNOWN;
 		if (sdram_ctl & DSC_X32_EN)
-			csrow->dtype = DEV_X32;
-		csrow->edac_mode = EDAC_SECDED;
+			dimm->dtype = DEV_X32;
+		dimm->edac_mode = EDAC_SECDED;
 	}
 }
 
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 7e5ff36..12d7fe0 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -656,6 +656,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 				struct mv64x60_mc_pdata *pdata)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
+
 	u32 devtype;
 	u32 ctl;
 
@@ -664,30 +666,30 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 	ctl = in_le32(pdata->mc_vbase + MV64X60_SDRAM_CONFIG);
 
 	csrow = &mci->csrows[0];
-	csrow->first_page = 0;
+	dimm = csrow->channels[0].dimm;
 	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-	csrow->grain = 8;
+	dimm->grain = 8;
 
-	csrow->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
+	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
 
 	devtype = (ctl >> 20) & 0x3;
 	switch (devtype) {
 	case 0x0:
-		csrow->dtype = DEV_X32;
+		dimm->dtype = DEV_X32;
 		break;
 	case 0x2:		/* could be X8 too, but no way to tell */
-		csrow->dtype = DEV_X16;
+		dimm->dtype = DEV_X16;
 		break;
 	case 0x3:
-		csrow->dtype = DEV_X4;
+		dimm->dtype = DEV_X4;
 		break;
 	default:
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 		break;
 	}
 
-	csrow->edac_mode = EDAC_SECDED;
+	dimm->edac_mode = EDAC_SECDED;
 }
 
 static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 7f71ee4..4e53270 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -135,11 +135,13 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 				   enum edac_type edac_mode)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 rankcfg;
 	int index;
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		pci_read_config_dword(pdev,
 				      MCDRAM_RANKCFG + (index * 12),
@@ -177,10 +179,10 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		last_page_in_mmc += csrow->nr_pages;
 		csrow->page_mask = 0;
-		csrow->grain = PASEMI_EDAC_ERROR_GRAIN;
-		csrow->mtype = MEM_DDR;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = edac_mode;
+		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
+		dimm->mtype = MEM_DDR;
+		dimm->dtype = DEV_UNKNOWN;
+		dimm->edac_mode = edac_mode;
 	}
 	return 0;
 }
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index d427c69..a75e567 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -895,7 +895,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum mem_type mtype;
 	enum dev_type dtype;
 	enum edac_type edac_mode;
-	int row;
+	int row, j;
 	u32 mbxcf, size;
 	static u32 ppc4xx_last_page;
 
@@ -975,15 +975,18 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		 * possible values would be the PLB width (16), the
 		 * page size (PAGE_SIZE) or the memory width (2 or 4).
 		 */
+		for (j = 0; j < csi->nr_channels; j++) {
+			struct dimm_info *dimm = csi->channels[j].dimm;
 
-		csi->grain	= 1;
+			dimm->grain	= 1;
 
-		csi->mtype	= mtype;
-		csi->dtype	= dtype;
+			dimm->mtype	= mtype;
+			dimm->dtype	= dtype;
 
-		csi->edac_mode	= edac_mode;
+			dimm->edac_mode	= edac_mode;
+		}
 
 		ppc4xx_last_page += csi->nr_pages;
 	}
 
  done:
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index e294e1b..414a532 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -216,6 +216,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			u8 dramcr)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 	u8 drbar;		/* SDRAM Row Boundary Address Register */
 	u32 row_high_limit, row_high_limit_last;
@@ -227,6 +228,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
@@ -250,13 +252,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
-		csrow->grain = 1 << 14;
-		csrow->mtype = reg_sdram ? MEM_RDDR : MEM_DDR;
+		dimm->grain = 1 << 14;
+		dimm->mtype = reg_sdram ? MEM_RDDR : MEM_DDR;
 		/* FIXME - check that this is unknowable with this chipset */
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 
 		/* Mode is global on 82600 */
-		csrow->edac_mode = ecc_on ? EDAC_SECDED : EDAC_NONE;
+		dimm->edac_mode = ecc_on ? EDAC_SECDED : EDAC_NONE;
 		row_high_limit_last = row_high_limit;
 	}
 }
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index dea1ef3..ec6e03d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -551,7 +551,7 @@ static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
 	return 0;
 }
 
-static int get_dimm_config(const struct mem_ctl_info *mci)
+static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct sbridge_pvt *pvt = mci->pvt_info;
 	struct csrow_info *csr;
@@ -561,6 +561,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
+	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -612,6 +613,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
+	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
@@ -634,29 +636,30 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 					pvt->sbridge_dev->mc, i, j,
 					size, npages,
 					banks, ranks, rows, cols);
-				csr = &mci->csrows[csrow];
 
+				/*
+				 * Fake stuff. This controller doesn't see
+				 * csrows.
+				 */
+				csr = &mci->csrows[csrow];
 				csr->first_page = last_page;
 				csr->last_page = last_page + npages - 1;
-				csr->page_mask = 0UL;	/* Unused */
 				csr->nr_pages = npages;
-				csr->grain = 32;
 				csr->csrow_idx = csrow;
-				csr->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
-				csr->ce_count = 0;
-				csr->ue_count = 0;
-				csr->mtype = mtype;
-				csr->edac_mode = mode;
 				csr->nr_channels = 1;
 				csr->channels[0].chan_idx = i;
-				csr->channels[0].ce_count = 0;
 				pvt->csrow_map[i][j] = csrow;
-				snprintf(csr->channels[0].dimm->label,
-					 sizeof(csr->channels[0].dimm->label),
-					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
-					 pvt->sbridge_dev->source_id, i, j);
 				last_page += npages;
 				csrow++;
+
+				csr->channels[0].dimm = dimm;
+				dimm->grain = 32;
+				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
+				dimm->mtype = mtype;
+				dimm->edac_mode = mode;
+				snprintf(dimm->label, sizeof(dimm->label),
+					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
+					 pvt->sbridge_dev->source_id, i, j);
 			}
 		}
 	}
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 1d5cf06..db7d2ae 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -84,6 +84,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	struct csrow_info	*csrow = &mci->csrows[0];
 	struct tile_edac_priv	*priv = mci->pvt_info;
 	struct mshim_mem_info	mem_info;
+	struct dimm_info *dimm = csrow->channels[0].dimm;
 
 	if (hv_dev_pread(priv->hv_devhdl, 0, (HV_VirtAddr)&mem_info,
 		sizeof(struct mshim_mem_info), MSHIM_MEM_INFO_OFF) !=
@@ -93,16 +94,16 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	if (mem_info.mem_ecc)
-		csrow->edac_mode = EDAC_SECDED;
+		dimm->edac_mode = EDAC_SECDED;
 	else
-		csrow->edac_mode = EDAC_NONE;
+		dimm->edac_mode = EDAC_NONE;
 	switch (mem_info.mem_type) {
 	case DDR2:
-		csrow->mtype = MEM_DDR2;
+		dimm->mtype = MEM_DDR2;
 		break;
 
 	case DDR3:
-		csrow->mtype = MEM_DDR3;
+		dimm->mtype = MEM_DDR3;
 		break;
 
 	default:
@@ -112,8 +113,8 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	csrow->first_page = 0;
 	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-	csrow->grain = TILE_EDAC_ERROR_GRAIN;
-	csrow->dtype = DEV_UNKNOWN;
+	dimm->grain = TILE_EDAC_ERROR_GRAIN;
+	dimm->dtype = DEV_UNKNOWN;
 
 	return 0;
 }
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index b6f47de..52c8d69 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -317,7 +317,7 @@ static unsigned long drb_to_nr_pages(
 static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_page;
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
@@ -372,20 +372,21 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 			i / X38_RANKS_PER_CHANNEL,
 			i % X38_RANKS_PER_CHANNEL);
 
-		if (nr_pages == 0) {
-			csrow->mtype = MEM_EMPTY;
+		if (nr_pages == 0)
 			continue;
-		}
 
 		csrow->first_page = last_page + 1;
 		last_page += nr_pages;
 		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
-		csrow->grain = nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+		for (j = 0; j < x38_channel_num; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			dimm->grain = nr_pages << PAGE_SHIFT;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	x38_clear_error_info(mci);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index f40b835..5244193 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -314,6 +314,13 @@ struct dimm_info {
 	unsigned memory_controller;
 	unsigned csrow;
 	unsigned csrow_channel;
+
+	u32 grain;		/* granularity of reported error in bytes */
+	enum dev_type dtype;	/* memory device type */
+	enum mem_type mtype;	/* memory dimm type */
+	enum edac_type edac_mode;	/* EDAC mode for this dimm */
+
+	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
 /**
@@ -339,19 +346,17 @@ struct rank_info {
 };
 
 struct csrow_info {
-	unsigned long first_page;	/* first page number in dimm */
-	unsigned long last_page;	/* last page number in dimm */
+	unsigned long first_page;	/* first page number in csrow */
+	unsigned long last_page;	/* last page number in csrow */
+	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
 					 * 0UL for non intlv
 					 */
-	u32 nr_pages;		/* number of pages in csrow */
-	u32 grain;		/* granularity of reported error in bytes */
-	int csrow_idx;		/* the chip-select row */
-	enum dev_type dtype;	/* memory device type */
+	int csrow_idx;			/* the chip-select row */
+
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
 	u32 ce_count;		/* Correctable Errors for this csrow */
-	enum mem_type mtype;	/* memory csrow type */
-	enum edac_type edac_mode;	/* EDAC mode for this csrow */
+
 	struct mem_ctl_info *mci;	/* the parent */
 
 	struct kobject kobj;	/* sysfs kobject for this csrow */
-- 
1.7.8



* [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-04-02 12:33   ` Borislav Petkov
  2012-03-29 16:45 ` [PATCH 04/13] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
                   ` (12 subsequent siblings)
  15 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Almost all edac drivers initialize first_page, last_page and
page_mask. Those vars are used inside the EDAC core, in order to
calculate the csrow affected by an error, by using the routine
edac_mc_find_csrow_by_page().
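
For readers unfamiliar with that routine, here is a minimal standalone
sketch of the range-and-mask test it applies, modelled on the comparison
visible in the edac_mc.c hunk of this series. The struct csrow_sketch type
and the function name are hypothetical stand-ins, not the kernel
definitions; only the four fields the test reads are kept.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical, cut-down stand-in for struct csrow_info: only the fields
 * the lookup reads. This is an illustration, not the kernel structure.
 */
struct csrow_sketch {
	unsigned long first_page;	/* first page number in csrow */
	unsigned long last_page;	/* last page number in csrow */
	unsigned long page_mask;	/* 0UL when not interleaved */
	unsigned int nr_pages;		/* 0 means the row is unpopulated */
};

/* Return the index of the first populated row containing @page, or -1. */
static int find_csrow_by_page(const struct csrow_sketch *rows, size_t nr_rows,
			      unsigned long page)
{
	size_t i;

	assert(rows != NULL || nr_rows == 0);

	for (i = 0; i < nr_rows; i++) {
		const struct csrow_sketch *row = &rows[i];

		/* Empty rows never match, whatever their page fields say */
		if (row->nr_pages == 0)
			continue;

		/* In range, and same interleave position as first_page */
		if (page >= row->first_page && page <= row->last_page &&
		    (page & row->page_mask) ==
		    (row->first_page & row->page_mask))
			return (int)i;
	}

	return -1;
}
```

A driver that never calls such a lookup gets nothing from filling in
first_page, last_page and page_mask, which is the point made below.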

However, very few drivers actually use that routine:
        e752x_edac.c
        e7xxx_edac.c
        i3000_edac.c
        i82443bxgx_edac.c
        i82860_edac.c
        i82875p_edac.c
        i82975x_edac.c
        r82600_edac.c

There are also a few other drivers that use those vars in their own
internal calculations.

All the others are just wasting time by initializing that
data.

While initializing data without using it won't cause any trouble, that
information is stored at the wrong place (in the csrows structure), so it
is better to remove what is unused, in order to simplify the next patch.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c   |   38 ++------------------------------------
 drivers/edac/i3200_edac.c   |    5 -----
 drivers/edac/i5000_edac.c   |    5 -----
 drivers/edac/i5100_edac.c   |    2 --
 drivers/edac/i5400_edac.c   |    5 -----
 drivers/edac/i7300_edac.c   |    5 +----
 drivers/edac/i7core_edac.c  |    5 -----
 drivers/edac/mv64x60_edac.c |    1 -
 drivers/edac/ppc4xx_edac.c  |    7 -------
 drivers/edac/sb_edac.c      |    2 --
 drivers/edac/tile_edac.c    |    2 --
 drivers/edac/x38_edac.c     |    5 -----
 12 files changed, 3 insertions(+), 79 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 3e7bddc..b1b1551 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -715,25 +715,6 @@ static inline u64 input_addr_to_sys_addr(struct mem_ctl_info *mci,
 				     input_addr_to_dram_addr(mci, input_addr));
 }
 
-/*
- * Find the minimum and maximum InputAddr values that map to the given @csrow.
- * Pass back these values in *input_addr_min and *input_addr_max.
- */
-static void find_csrow_limits(struct mem_ctl_info *mci, int csrow,
-			      u64 *input_addr_min, u64 *input_addr_max)
-{
-	struct amd64_pvt *pvt;
-	u64 base, mask;
-
-	pvt = mci->pvt_info;
-	BUG_ON((csrow < 0) || (csrow >= pvt->csels[0].b_cnt));
-
-	get_cs_base_and_mask(pvt, csrow, 0, &base, &mask);
-
-	*input_addr_min = base & ~mask;
-	*input_addr_max = base | mask;
-}
-
 /* Map the Error address to a PAGE and PAGE OFFSET. */
 static inline void error_address_to_page_and_offset(u64 error_address,
 						    u32 *page, u32 *offset)
@@ -2166,7 +2147,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 {
 	struct csrow_info *csrow;
 	struct amd64_pvt *pvt = mci->pvt_info;
-	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
+	u64 base, mask;
 	u32 val;
 	int i, j, empty = 1;
 	enum mem_type mtype;
@@ -2194,14 +2175,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 
 		empty = 0;
 		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
-		find_csrow_limits(mci, i, &input_addr_min, &input_addr_max);
-		sys_addr = input_addr_to_sys_addr(mci, input_addr_min);
-		csrow->first_page = (u32) (sys_addr >> PAGE_SHIFT);
-		sys_addr = input_addr_to_sys_addr(mci, input_addr_max);
-		csrow->last_page = (u32) (sys_addr >> PAGE_SHIFT);
-
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
-		csrow->page_mask = ~mask;
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
@@ -2221,15 +2195,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		}
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
-			(unsigned long)input_addr_min,
-			(unsigned long)input_addr_max);
-		debugf1("    sys_addr: 0x%lx  page_mask: 0x%lx\n",
-			(unsigned long)sys_addr, csrow->page_mask);
-		debugf1("    nr_pages: %u  first_page: 0x%lx "
-			"last_page: 0x%lx\n",
-			(unsigned)csrow->nr_pages,
-			csrow->first_page, csrow->last_page);
+		debugf1("    nr_pages: %u\n", csrow->nr_pages);
 	}
 
 	return empty;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 73529fd..d8fa7f3 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -321,7 +321,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_page;
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -366,7 +365,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	last_page = -1UL;
 	for (i = 0; i < mci->nr_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
@@ -378,9 +376,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->first_page = last_page + 1;
-		last_page += nr_pages;
-		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
 		for (j = 0; j < nr_channels; j++) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index e612f1e..f00f684 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1263,11 +1263,6 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr) && !MTR_DIMMS_PRESENT(mtr1))
 			continue;
 
-		/* FAKE OUT VALUES, FIXME */
-		p_csrow->first_page = 0 + csrow * 20;
-		p_csrow->last_page = 9 + csrow * 20;
-		p_csrow->page_mask = 0xFFF;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 9caff36..8da7ce1 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,8 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].first_page = total_pages;
-		mci->csrows[i].last_page = total_pages + npages - 1;
 		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 229aff5..4a23813 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1180,11 +1180,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		/* FAKE OUT VALUES, FIXME */
-		p_csrow->first_page = 0 + csrow * 20;
-		p_csrow->last_page = 9 + csrow * 20;
-		p_csrow->page_mask = 0xFFF;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 07a5927..df6cd59 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -778,7 +778,7 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 last_page = 0, nr_pages;
+	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -828,9 +828,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 
 				/* Update per_csrow memory count */
 				p_csrow->nr_pages += nr_pages;
-				p_csrow->first_page = last_page;
-				last_page += nr_pages;
-				p_csrow->last_page = last_page;
 
 				rc = 0;
 			}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 21f9791..89ccec6 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -599,7 +599,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	struct pci_dev *pdev;
 	int i, j;
 	int csrow = 0;
-	unsigned long last_page = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
 	struct dimm_info *dimm;
@@ -716,12 +715,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->first_page = last_page + 1;
-			last_page += npages;
-			csr->last_page = last_page;
 			csr->nr_pages = npages;
 
-			csr->page_mask = 0;
 			csr->csrow_idx = csrow;
 			csr->nr_channels = 1;
 
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 12d7fe0..d2e3c39 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -668,7 +668,6 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
 	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
-	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index a75e567..ec5e529 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -897,7 +897,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum edac_type edac_mode;
 	int row, j;
 	u32 mbxcf, size;
-	static u32 ppc4xx_last_page;
 
 	/* Establish the memory type and width */
 
@@ -959,10 +958,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 			goto done;
 		}
 
-		csi->first_page = ppc4xx_last_page;
-		csi->last_page	= csi->first_page + csi->nr_pages - 1;
-		csi->page_mask	= 0;
-
 		/*
 		 * It's unclear exactly what grain should be set to
 		 * here. The SDRAM_ECCES register allows resolution of
@@ -984,8 +979,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 			dimm->dtype	= dtype;
 
 			dimm->edac_mode	= edac_mode;
-
-		ppc4xx_last_page += csi->nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ec6e03d..cf53007 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -642,8 +642,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->first_page = last_page;
-				csr->last_page = last_page + npages - 1;
 				csr->nr_pages = npages;
 				csr->csrow_idx = csrow;
 				csr->nr_channels = 1;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index db7d2ae..ba0917b 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,9 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->first_page = 0;
 	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
-	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 52c8d69..7be10dd 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -319,7 +319,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_page;
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -363,7 +362,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	last_page = -1UL;
 	for (i = 0; i < mci->nr_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
@@ -375,9 +373,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->first_page = last_page + 1;
-		last_page += nr_pages;
-		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
 		for (j = 0; j < x38_channel_num; j++) {
-- 
1.7.8



* [PATCH 04/13] edac: move nr_pages to dimm struct
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (2 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-04-02 13:18   ` Borislav Petkov
  2012-03-29 16:45 ` [PATCH 05/13] edac: Fix core support for MC's that see DIMMS instead of ranks Mauro Carvalho Chehab
                   ` (11 subsequent siblings)
  15 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMMs that
will properly represent the DIMM stick properties, including its size.
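
With nr_pages stored per DIMM, the size of a csrow has to be recovered by
summing over its channels, as the csrow_size_show() hunk in this patch
does. Below is a minimal standalone sketch of that summation. The
*_sketch types, SKETCH_PAGE_SHIFT (4 KiB pages assumed) and pages_to_mib()
are hypothetical stand-ins for the kernel's structures and its
PAGES_TO_MiB() macro, not the real definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical cut-down versions of dimm_info, rank_info and csrow_info */
struct dimm_sketch {
	uint32_t nr_pages;		/* number of pages in this dimm */
};

struct rank_sketch {
	struct dimm_sketch *dimm;	/* dimm behind this channel */
};

struct csrow_size_sketch {
	size_t nr_channels;
	struct rank_sketch *channels;
};

/* A csrow's page count is the sum of the pages of its channels' dimms. */
static uint32_t csrow_nr_pages(const struct csrow_size_sketch *csrow)
{
	uint32_t n = 0;
	size_t j;

	assert(csrow != NULL);

	for (j = 0; j < csrow->nr_channels; j++)
		n += csrow->channels[j].dimm->nr_pages;

	return n;
}

/* Convert a page count to MiB; valid while SKETCH_PAGE_SHIFT is below 20. */
static uint32_t pages_to_mib(uint32_t pages)
{
	return pages >> (20 - SKETCH_PAGE_SHIFT);
}
```

A csrow whose channels all report zero pages sums to zero, which is the
test the sysfs code in this patch uses to decide whether a csrow is
populated.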

A remaining TODO here is to properly represent dual-rank/quad-rank DIMMs
when the memory controller represents the memory via chip select rows.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |   13 ++++------
 drivers/edac/amd76x_edac.c     |    6 ++--
 drivers/edac/cell_edac.c       |    8 ++++--
 drivers/edac/cpc925_edac.c     |    8 ++++--
 drivers/edac/e752x_edac.c      |    6 +++-
 drivers/edac/e7xxx_edac.c      |    5 ++-
 drivers/edac/edac_mc.c         |   28 ++++++++++++---------
 drivers/edac/edac_mc_sysfs.c   |   52 ++++++++++++++++++++++++++++++----------
 drivers/edac/i3000_edac.c      |    6 +++-
 drivers/edac/i3200_edac.c      |    3 +-
 drivers/edac/i5000_edac.c      |   14 ++++++----
 drivers/edac/i5100_edac.c      |   22 ++++++++++-------
 drivers/edac/i5400_edac.c      |    9 ++----
 drivers/edac/i7300_edac.c      |   22 ++++------------
 drivers/edac/i7core_edac.c     |   10 ++-----
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    5 ++-
 drivers/edac/i82975x_edac.c    |   11 ++++++--
 drivers/edac/mpc85xx_edac.c    |    3 +-
 drivers/edac/mv64x60_edac.c    |    3 +-
 drivers/edac/pasemi_edac.c     |   14 +++++-----
 drivers/edac/ppc4xx_edac.c     |    5 ++-
 drivers/edac/r82600_edac.c     |    3 +-
 drivers/edac/sb_edac.c         |    8 +----
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    4 +-
 include/linux/edac.h           |   10 ++++---
 28 files changed, 158 insertions(+), 126 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index b1b1551..ad0376e 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
-	/*
-	 * If dual channel then double the memory size of single channel.
-	 * Channel count is 1 or 2
-	 */
-	nr_pages <<= (pvt->channel_count - 1);
-
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
 	debugf0("    nr_pages= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
@@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 	int i, j, empty = 1;
 	enum mem_type mtype;
 	enum edac_type edac_mode;
+	int nr_pages;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2174,7 +2169,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 			i, pvt->mc_node_id);
 
 		empty = 0;
-		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
+		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
 		/* 8 bytes of resolution */
 
@@ -2192,10 +2187,12 @@ static int init_csrows(struct mem_ctl_info *mci)
 		for (j = 0; j < pvt->channel_count; j++) {
 			csrow->channels[j].dimm->mtype = mtype;
 			csrow->channels[j].dimm->edac_mode = edac_mode;
+			csrow->channels[j].dimm->nr_pages = nr_pages;
+
 		}
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", csrow->nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages);
 	}
 
 	return empty;
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 2a63ed0..1532750 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -205,10 +205,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		mba_mask = ((mba & 0xff80) << 16) | 0x7fffffUL;
 		pci_read_config_dword(pdev, AMD76X_DRAM_MODE_STATUS, &dms);
 		csrow->first_page = mba_base >> PAGE_SHIFT;
-		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		dimm->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->grain = dimm->nr_pages << PAGE_SHIFT;
 		dimm->mtype = MEM_RDDR;
 		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
 		dimm->edac_mode = edac_mode;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 94fbb12..09e1b5d 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -128,6 +128,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
 	int				j;
+	u32				nr_pages;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,19 +143,20 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		if (of_node_to_nid(np) != priv->node)
 			continue;
 		csrow->first_page = r.start >> PAGE_SHIFT;
-		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = resource_size(&r) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
 			dimm->mtype = MEM_XDR;
 			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, csrow->nr_pages);
+			csrow->first_page, dimm->nr_pages);
 		break;
 	}
 }
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index ee90f3d..7b764a8 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -332,7 +332,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	struct dimm_info *dimm;
 	int index, j;
 	u32 mbmr, mbbar, bba;
-	unsigned long row_size, last_nr_pages = 0;
+	unsigned long row_size, nr_pages, last_nr_pages = 0;
 
 	get_total_mem(pdata);
 
@@ -351,12 +351,14 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 		row_size = bba * (1UL << 28);	/* 256M */
 		csrow->first_page = last_nr_pages;
-		csrow->nr_pages = row_size >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = row_size >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->mtype = MEM_RDDR;
 			dimm->edac_mode = EDAC_SECDED;
 
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index db291ea..6d81d3c 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size, i;
+	u32 dra, drc, cumul_size, i, nr_pages;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1078,11 +1078,13 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (i = 0; i < drc_chan + 1; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
+
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 178d2af..aeb69f0 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -349,7 +349,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	unsigned long last_cumul_size;
 	int index, j;
 	u8 value;
-	u32 dra, cumul_size;
+	u32 dra, cumul_size, nr_pages;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
@@ -380,12 +380,13 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < drc_chan + 1; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 2430ddb..02263c3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,22 +43,22 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
+	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
 {
 	debugf4("\tcsrow = %p\n", csrow);
 	debugf4("\tcsrow->csrow_idx = %d\n", csrow->csrow_idx);
-	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
-	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
-	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
 	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
 	debugf4("\tcsrow->channels = %p\n", csrow->channels);
 	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
+	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
+	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
+	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
 }
 
 static void edac_mc_dump_mci(struct mem_ctl_info *mci)
@@ -655,15 +655,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 {
 	struct csrow_info *csrows = mci->csrows;
-	int row, i;
+	int row, i, j, n;
 
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
-
-		if (csrow->nr_pages == 0)
+		n = 0;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n == 0)
 			continue;
 
 		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
@@ -672,9 +676,9 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 			csrow->page_mask);
 
 		if ((page >= csrow->first_page) &&
-		    (page <= csrow->last_page) &&
-		    ((page & csrow->page_mask) ==
-		     (csrow->first_page & csrow->page_mask))) {
+		(page <= csrow->last_page) &&
+		((page & csrow->page_mask) ==
+		(csrow->first_page & csrow->page_mask))) {
 			row = i;
 			break;
 		}
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d63904e..52c56cf 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
+	int i;
+	u32 nr_pages = 0;
+
+	for (i = 0; i < csrow->nr_channels; i++)
+		nr_pages += csrow->channels[i].dimm->nr_pages;
+
+	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
 }
 
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
@@ -519,16 +525,17 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
 
 static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
-	int total_pages, csrow_idx;
+	int total_pages, csrow_idx, j;
 
 	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
-		csrow_idx++) {
+	     csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
-		if (!csrow->nr_pages)
-			continue;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
 
-		total_pages += csrow->nr_pages;
+			total_pages += dimm->nr_pages;
+		}
 	}
 
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
@@ -900,7 +907,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
  */
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	int i, j;
 	int err;
 	struct csrow_info *csrow;
 	struct kobject *kobj_mci = &mci->edac_mci_kobj;
@@ -934,10 +941,15 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
 	for (i = 0; i < mci->nr_csrows; i++) {
+		int n = 0;
+
 		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
 
-		/* Only expose populated CSROWs */
-		if (csrow->nr_pages > 0) {
+		if (n > 0) {
 			err = edac_create_csrow_object(mci, csrow, i);
 			if (err) {
 				debugf1("%s() failure: create csrow %d obj\n",
@@ -949,10 +961,16 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	return 0;
 
-	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (mci->csrows[i].nr_pages > 0)
+		int n = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n > 0)
 			kobject_put(&mci->csrows[i].kobj);
 	}
 
@@ -972,14 +990,22 @@ fail0:
  */
 void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	struct csrow_info *csrow;
+	int i, j;
 
 	debugf0("%s()\n", __func__);
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
 	for (i = 0; i < mci->nr_csrows; i++) {
-		if (mci->csrows[i].nr_pages > 0) {
+		int n = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n > 0) {
 			debugf0("%s()  unreg csrow-%d\n", __func__, i);
 			kobject_put(&mci->csrows[i].kobj);
 		}
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 1498c5f..bf8a230 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -306,7 +306,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_cumul_size;
+	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
 	unsigned char *c0dra = dra, *c1dra = &dra[I3000_RANKS_PER_CHANNEL / 2];
@@ -391,11 +391,13 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = I3000_DEAP_GRAIN;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d8fa7f3..b82667f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -376,11 +376,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index f00f684..e8d32e8 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1236,6 +1236,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
 	struct csrow_info *p_csrow;
+	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
 	int mtr, mtr1;
@@ -1265,21 +1266,22 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
+			dimm = p_csrow->channels[channel].dimm;
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-			p_csrow->channels[channel].dimm->grain = 8;
+			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
-			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
+			dimm->mtype = MEM_FB_DDR2;
 
 			/* ask what device type on this row */
 			if (MTR_DRAM_WIDTH(mtr))
-				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 			else
-				p_csrow->channels[channel].dimm->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 
-			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
 		}
-		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 8da7ce1..a0219a9 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,7 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
@@ -867,14 +866,19 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		total_pages += npages;
 
 		dimm = mci->csrows[i].channels[0].dimm;
-		dimm->grain = 32;
-		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-			      DEV_X4 : DEV_X8;
-		dimm->mtype = MEM_RDDR2;
-		dimm->edac_mode = EDAC_SECDED;
-		snprintf(dimm->label, sizeof(dimm->label),
-			 "DIMM%u",
-			 i5100_rank_to_slot(mci, chan, rank));
+		dimm->nr_pages = npages;
+		if (npages) {
+			total_pages += npages;
+
+			dimm->grain = 32;
+			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+				DEV_X4 : DEV_X8;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+			snprintf(dimm->label, sizeof(dimm->label),
+				"DIMM%u",
+				i5100_rank_to_slot(mci, chan, rank));
+		}
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 4a23813..784d6dc 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1156,7 +1156,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int empty, channel_count;
 	int max_csrows;
 	int mtr;
-	int csrow_megs;
+	int size_mb;
 	int channel;
 	int csrow;
 	struct dimm_info *dimm;
@@ -1171,8 +1171,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
-		p_csrow->csrow_idx = csrow;
-
 		/* use branch 0 for the basis */
 		mtr = determine_mtr(pvt, csrow, 0);
 
@@ -1180,12 +1178,11 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[csrow][channel].megabytes;
 
-			p_csrow->nr_pages = csrow_megs << 8;
 			dimm = p_csrow->channels[channel].dimm;
+			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
 			dimm->mtype = MEM_RDDR2;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index df6cd59..5e594ae 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -617,9 +617,7 @@ static void i7300_enable_error_reporting(struct mem_ctl_info *mci)
 static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
-		      struct csrow_info *p_csrow,
-		      struct dimm_info *dimm,
-		      u32 *nr_pages)
+		      struct dimm_info *dimm)
 {
 	int mtr, ans, addrBits, channel;
 
@@ -651,7 +649,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	addrBits -= 3;	/* 8 bits per bytes */
 
 	dinfo->megabytes = 1 << addrBits;
-	*nr_pages = dinfo->megabytes << 8;
 
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 
@@ -664,8 +661,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->csrow_idx = slot;
-
 	/*
 	 * The type of error detection actually depends of the
 	 * mode of operation. When it is just one single memory chip, at
@@ -675,6 +670,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->nr_pages = MiB_TO_PAGES(dinfo->megabytes);
 	dimm->grain = 8;
 	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
@@ -774,11 +770,9 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i7300_pvt *pvt;
 	struct i7300_dimm_info *dinfo;
-	struct csrow_info *p_csrow;
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -804,7 +798,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
-	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
@@ -815,21 +808,18 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
-				p_csrow = &mci->csrows[slot];
 
-				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, dimm,
-						 &nr_pages);
+						 dinfo, dimm);
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
 
-				/* Update per_csrow memory count */
-				p_csrow->nr_pages += nr_pages;
-
 				rc = 0;
+
 			}
 		}
 	}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 89ccec6..d566797 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -715,17 +715,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->nr_pages = npages;
-
-			csr->csrow_idx = csrow;
-			csr->nr_channels = 1;
-
-			csr->channels[0].chan_idx = i;
-			csr->channels[0].ce_count = 0;
 
 			pvt->csrow_map[i][j] = csrow;
 
 			dimm = csr->channels[0].dimm;
+			dimm->nr_pages = npages;
+
 			switch (banks) {
 			case 4:
 				dimm->dtype = DEV_X4;
@@ -746,6 +741,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
+			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 1e19492..74166ae 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -220,7 +220,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		row_base = row_high_limit_last;
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
 		dimm->grain = 1 << 12;
 		dimm->mtype = mtype;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index acbd924..48e0ecd 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		dimm->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
 		dimm->mtype = MEM_RMBS;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 81f79e2..dc207dc 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -347,7 +347,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
@@ -371,12 +371,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_chans; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_chans;
 			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
 			dimm->mtype = MEM_DDR;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0b40e11..304af1d 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -370,7 +370,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
 	enum dev_type dtype;
@@ -402,6 +402,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
 			cumul_size);
 
+		nr_pages = cumul_size - last_cumul_size;
 		/*
 		 * Initialise dram labels
 		 * index values:
@@ -411,6 +412,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index].channels[chan].dimm;
+
+			if (!nr_pages)
+				continue;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
@@ -420,12 +426,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 			dimm->edac_mode = EDAC_SECDED; /* only supported */
 		}
 
-		if (cumul_size == last_cumul_size)
+		if (!nr_pages)
 			continue;	/* not populated */
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 	}
 }
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index fb92916..c1d9e15 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -947,7 +947,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 
 		csrow->first_page = start;
 		csrow->last_page = end;
-		csrow->nr_pages = end + 1 - start;
+
+		dimm->nr_pages = end + 1 - start;
 		dimm->grain = 8;
 		dimm->mtype = mtype;
 		dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index d2e3c39..281e245 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -667,7 +667,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
-	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
+
+	dimm->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 4e53270..3fcefda 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -153,20 +153,20 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		switch ((rankcfg & MCDRAM_RANKCFG_TYPE_SIZE_M) >>
 			MCDRAM_RANKCFG_TYPE_SIZE_S) {
 		case 0:
-			csrow->nr_pages = 128 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 128 << (20 - PAGE_SHIFT);
 			break;
 		case 1:
-			csrow->nr_pages = 256 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 256 << (20 - PAGE_SHIFT);
 			break;
 		case 2:
 		case 3:
-			csrow->nr_pages = 512 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 512 << (20 - PAGE_SHIFT);
 			break;
 		case 4:
-			csrow->nr_pages = 1024 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 1024 << (20 - PAGE_SHIFT);
 			break;
 		case 5:
-			csrow->nr_pages = 2048 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 2048 << (20 - PAGE_SHIFT);
 			break;
 		default:
 			edac_mc_printk(mci, KERN_ERR,
@@ -176,8 +176,8 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		}
 
 		csrow->first_page = last_page_in_mmc;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		last_page_in_mmc += csrow->nr_pages;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
+		last_page_in_mmc += dimm->nr_pages;
 		csrow->page_mask = 0;
 		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
 		dimm->mtype = MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ec5e529..95cfc0f 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -896,7 +896,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum dev_type dtype;
 	enum edac_type edac_mode;
 	int row, j;
-	u32 mbxcf, size;
+	u32 mbxcf, size, nr_pages;
 
 	/* Establish the memory type and width */
 
@@ -947,7 +947,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		case SDRAM_MBCF_SZ_2GB:
 		case SDRAM_MBCF_SZ_4GB:
 		case SDRAM_MBCF_SZ_8GB:
-			csi->nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
+			nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
 			break;
 		default:
 			ppc4xx_edac_mc_printk(KERN_ERR, mci,
@@ -973,6 +973,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		for (j = 0; j < csi->nr_channels; j++) {
 			struct dimm_info *dimm = csi->channels[j].dimm;
 
+			dimm->nr_pages  = nr_pages / csi->nr_channels;
 			dimm->grain	= 1;
 
 			dimm->mtype	= mtype;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 414a532..19f3a10 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -249,7 +249,8 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
 		dimm->grain = 1 << 14;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index cf53007..ee1543d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -561,7 +561,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
-	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -613,11 +612,11 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
-	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
+			struct dimm_info *dimm = &mci->dimms[j];
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -642,15 +641,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->nr_pages = npages;
-				csr->csrow_idx = csrow;
-				csr->nr_channels = 1;
-				csr->channels[0].chan_idx = i;
 				pvt->csrow_map[i][j] = csrow;
 				last_page += npages;
 				csrow++;
 
 				csr->channels[0].dimm = dimm;
+				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
 				dimm->mtype = mtype;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index ba0917b..6314ff9 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,7 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
+	dimm->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 7be10dd..0de288f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -373,10 +373,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < x38_channel_num; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / x38_channel_num;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5244193..de22d4c 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -320,6 +320,8 @@ struct dimm_info {
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
+	u32 nr_pages;			/* number of pages in this dimm */
+
 	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
@@ -346,13 +348,13 @@ struct rank_info {
 };
 
 struct csrow_info {
+	int csrow_idx;			/* the chip-select row */
+
+	/* Used only by edac_mc_find_csrow_by_page() */
 	unsigned long first_page;	/* first page number in csrow */
 	unsigned long last_page;	/* last page number in csrow */
-	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
-					 * 0UL for non intlv
-					 */
-	int csrow_idx;			/* the chip-select row */
+					 * 0UL for non intlv */
 
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
 	u32 ce_count;		/* Correctable Errors for this csrow */
-- 
1.7.8



* [PATCH 05/13] edac: Fix core support for MC's that see DIMMS instead of ranks
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (3 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 04/13] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 06/13] edac: Initialize the dimm label with the known information Mauro Carvalho Chehab
                   ` (10 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The EDAC core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. They see memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This allowed removing several hacks in the drivers for FB-DIMM and
RAMBUS memory controllers.
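
As a rough illustration of the layered description used by the new
allocation and error report routines, here is a minimal userspace
sketch. The struct edac_mc_layer fields and the EDAC_MC_LAYER_* names
mirror the ones used by this patch, but the types below are simplified
stand-ins, and total_dimms() and dimm_index() are hypothetical helpers
that are not part of the EDAC core. It assumes one DIMM entry per
combination of layer positions, with a negative position meaning
"unknown", as in the -1 arguments passed to edac_mc_handle_error().

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the types named in this patch. */
enum edac_mc_layer_type {
	EDAC_MC_LAYER_CHIP_SELECT,
	EDAC_MC_LAYER_CHANNEL,
};

struct edac_mc_layer {
	enum edac_mc_layer_type type;	/* what this layer represents */
	unsigned size;			/* number of positions in this layer */
	bool is_csrow;			/* layer is exposed as a csrow */
};

/* Hypothetical helper: number of DIMM entries implied by the layers. */
unsigned total_dimms(const struct edac_mc_layer *layers, size_t n)
{
	unsigned tot = 1;
	size_t i;

	for (i = 0; i < n; i++)
		tot *= layers[i].size;

	return tot;
}

/*
 * Hypothetical helper: flatten a per-layer position into a single
 * index. Returns -1 if any position is unknown (negative) or out of
 * range, which is how a driver would report a partially-located error.
 */
int dimm_index(const struct edac_mc_layer *layers, size_t n, const int *pos)
{
	int idx = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (pos[i] < 0 || (unsigned)pos[i] >= layers[i].size)
			return -1;
		idx = idx * (int)layers[i].size + pos[i];
	}

	return idx;
}
```

With a chip-select layer of size 8 and a channel layer of size 2,
total_dimms() gives 16 entries, and the position {3, 1} flattens to
index 7. A position of {3, -1} (channel unknown) gives -1.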

Compile-tested on all platforms (x86_64, i386, tile and several ppc subarchs).

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks on the same DIMM.
This bug has been there since the beginning of EDAC, so it is not a big
deal. However, on several drivers it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be the same for all of them. So, don't try to fix it here yet.
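
To make the label issue concrete, here is a small userspace sketch.
It is not driver code: struct rank_entry, its slot field and both
helpers are hypothetical, and the rank-to-slot mapping is assumed to
be something each driver would have to provide.

```c
#include <stdio.h>

#define LABEL_LEN 32	/* stand-in for EDAC_MC_LABEL_LEN */

/* Simplified stand-in: one entry per rank, as described in the TODO. */
struct rank_entry {
	int slot;		/* hypothetical: physical DIMM slot of this rank */
	char label[LABEL_LEN];
};

/* Relabel one entry only; other ranks of the same DIMM keep the old label. */
void set_label_one(struct rank_entry *e, const char *label)
{
	snprintf(e->label, sizeof(e->label), "%s", label);
}

/*
 * Hypothetical per-driver fix: relabel every rank that sits in the
 * given slot. Returns the number of entries that were changed.
 */
int set_label_slot(struct rank_entry *e, int n, int slot, const char *label)
{
	int i, changed = 0;

	for (i = 0; i < n; i++) {
		if (e[i].slot != slot)
			continue;
		set_label_one(&e[i], label);
		changed++;
	}

	return changed;
}
```

Calling set_label_one() on the first rank of a dual-rank DIMM leaves
the second rank with its old label, which is the behaviour described
above. set_label_slot() only works if the driver knows which ranks
share a slot, which is why this would be a per-driver fix.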

PS.: I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |  137 ++++++---
 drivers/edac/amd76x_edac.c     |   30 ++-
 drivers/edac/cell_edac.c       |   26 ++-
 drivers/edac/cpc925_edac.c     |   25 ++-
 drivers/edac/e752x_edac.c      |   51 +++-
 drivers/edac/e7xxx_edac.c      |   39 ++-
 drivers/edac/edac_core.h       |   48 +--
 drivers/edac/edac_device.c     |   27 +-
 drivers/edac/edac_mc.c         |  657 +++++++++++++++++++++++-----------------
 drivers/edac/edac_mc_sysfs.c   |   91 +++---
 drivers/edac/edac_module.h     |    2 +-
 drivers/edac/edac_pci.c        |    7 +-
 drivers/edac/i3000_edac.c      |   27 ++-
 drivers/edac/i3200_edac.c      |   34 ++-
 drivers/edac/i5000_edac.c      |   58 +++--
 drivers/edac/i5100_edac.c      |   90 +++---
 drivers/edac/i5400_edac.c      |  217 ++++++++------
 drivers/edac/i7300_edac.c      |   81 ++---
 drivers/edac/i7core_edac.c     |  202 +++---------
 drivers/edac/i82443bxgx_edac.c |   28 +-
 drivers/edac/i82860_edac.c     |   44 ++-
 drivers/edac/i82875p_edac.c    |   31 ++-
 drivers/edac/i82975x_edac.c    |   29 ++-
 drivers/edac/mpc85xx_edac.c    |   28 ++-
 drivers/edac/mv64x60_edac.c    |   25 ++-
 drivers/edac/pasemi_edac.c     |   27 +-
 drivers/edac/ppc4xx_edac.c     |   33 ++-
 drivers/edac/r82600_edac.c     |   29 ++-
 drivers/edac/sb_edac.c         |  159 ++++-------
 drivers/edac/tile_edac.c       |   16 +-
 drivers/edac/x38_edac.c        |   30 ++-
 include/linux/edac.h           |  121 +++++++-
 32 files changed, 1392 insertions(+), 1057 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index ad0376e..8e2873f 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	int channel, csrow;
 	u32 page, offset;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
+	/*
+	 * Find out which node the error address belongs to. This may be
+	 * different from the node that detected the error.
+	 */
+	src_mci = find_mc_by_sys_addr(mci, sys_addr);
+	if (!src_mci) {
+		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
+			     (unsigned long)sys_addr);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a node",
+				     NULL);
+		return;
+	}
+
+	/* Now map the sys_addr to a CSROW */
+	csrow = sys_addr_to_csrow(src_mci, sys_addr);
+	if (csrow < 0) {
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
+		return;
+	}
+
 	/* CHIPKILL enabled */
 	if (pvt->nbcfg & NBCFG_CHIPKILL) {
 		channel = get_channel_from_ecc_syndrome(mci, syndrome);
@@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 			 * 2 DIMMs is in error. So we need to ID 'both' of them
 			 * as suspect.
 			 */
-			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
-					   "error reporting race\n", syndrome);
-			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
+				      "possible error reporting race\n",
+				      syndrome);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, offset, syndrome,
+					     csrow, -1, -1,
+					     EDAC_MOD_STR,
+					     "unknown syndrome - possible error reporting race",
+					     NULL);
 			return;
 		}
 	} else {
@@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 		channel = ((sys_addr & BIT(3)) != 0);
 	}
 
-	/*
-	 * Find out which node the error address belongs to. This may be
-	 * different from the node that detected the error.
-	 */
-	src_mci = find_mc_by_sys_addr(mci, sys_addr);
-	if (!src_mci) {
-		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
-			     (unsigned long)sys_addr);
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
-		return;
-	}
-
-	/* Now map the sys_addr to a CSROW */
-	csrow = sys_addr_to_csrow(src_mci, sys_addr);
-	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
-	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-
-		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
-				  channel, EDAC_MOD_STR);
-	}
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
+			     page, offset, syndrome,
+			     csrow, channel, -1,
+			     EDAC_MOD_STR, "", NULL);
 }
 
 static int ddr2_cs_size(unsigned i, bool dct_width)
@@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	u32 page, offset;
 	int nid, csrow, chan = 0;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
 	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
 
 	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
 		return;
 	}
 
-	error_address_to_page_and_offset(sys_addr, &page, &offset);
-
 	/*
 	 * We need the syndromes for channel detection only when we're
 	 * ganged. Otherwise @chan should already contain the channel at
@@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	if (dct_ganging_enabled(pvt))
 		chan = get_channel_from_ecc_syndrome(mci, syndrome);
 
-	if (chan >= 0)
-		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
-				  EDAC_MOD_STR);
-	else
-		/*
-		 * Channel unknown, report all channels on this CSROW as failed.
-		 */
-		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
-			edac_mc_handle_ce(mci, page, offset, syndrome,
-					  csrow, chan, EDAC_MOD_STR);
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				page, offset, syndrome,
+				csrow, chan, -1,
+				EDAC_MOD_STR, "", NULL);
 }
 
 /*
@@ -1875,7 +1893,12 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
 	/* Ensure that the Error Address is VALID */
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
@@ -1899,11 +1922,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
 	sys_addr = get_error_address(m);
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
 
 	/*
 	 * Find out which node the error address belongs to. This may be
@@ -1913,7 +1942,11 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (!src_mci) {
 		amd64_mc_err(mci, "ERROR ADDRESS (0x%lx) NOT mapped to a MC\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to a MC", NULL);
 		return;
 	}
 
@@ -1923,10 +1956,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (csrow < 0) {
 		amd64_mc_err(mci, "ERROR_ADDRESS (0x%lx) NOT mapped to CS\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to CS",
+				     NULL);
 	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-		edac_mc_handle_ue(log_mci, page, offset, csrow, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     csrow, -1, -1,
+				     EDAC_MOD_STR, "", NULL);
 	}
 }
 
@@ -2487,6 +2527,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	struct amd64_pvt *pvt = NULL;
 	struct amd64_family_type *fam_type = NULL;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	int err = 0, ret;
 	u8 nid = get_node_id(F2);
 
@@ -2521,7 +2562,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	ret = -ENOMEM;
-	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = pvt->csels[0].b_cnt;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = pvt->channel_count;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 1532750..4f3e54a 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -29,7 +29,6 @@
 	edac_mc_chipset_printk(mci, level, "amd76x", fmt, ##arg)
 
 #define AMD76X_NR_CSROWS 8
-#define AMD76X_NR_CHANS  1
 #define AMD76X_NR_DIMMS  4
 
 /* AMD 76x register addresses - device 0 function 0 - PCI bridge */
@@ -146,8 +145,10 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors) {
 			row = (info->ecc_mode_status >> 4) & 0xf;
-			edac_mc_handle_ue(mci, mci->csrows[row].first_page, 0,
-					row, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     mci->csrows[row].first_page, 0, 0,
+					     row, 0, -1,
+					     mci->ctl_name, "", NULL);
 		}
 	}
 
@@ -159,8 +160,10 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors) {
 			row = info->ecc_mode_status & 0xf;
-			edac_mc_handle_ce(mci, mci->csrows[row].first_page, 0,
-					0, row, 0, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     mci->csrows[row].first_page, 0, 0,
+					     row, 0, -1,
+					     mci->ctl_name, "", NULL);
 		}
 	}
 
@@ -190,7 +193,7 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	u32 mba, mba_base, mba_mask, dms;
 	int index;
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 		dimm = csrow->channels[0].dimm;
 
@@ -232,7 +235,8 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 		EDAC_SECDED,
 		EDAC_SECDED
 	};
-	struct mem_ctl_info *mci = NULL;
+	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u32 ems;
 	u32 ems_mode;
 	struct amd76x_error_info discard;
@@ -240,11 +244,17 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	debugf0("%s()\n", __func__);
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
-	mci = edac_mc_alloc(0, AMD76X_NR_CSROWS, AMD76X_NR_CHANS, 0);
 
-	if (mci == NULL) {
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = AMD76X_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+
+	if (mci == NULL)
 		return -ENOMEM;
-	}
 
 	debugf0("%s(): mci = %p\n", __func__, mci);
 	mci->dev = &pdev->dev;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 09e1b5d..39616a3 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -48,8 +48,9 @@ static void cell_edac_count_ce(struct mem_ctl_info *mci, int chan, u64 ar)
 	syndrome = (ar & 0x000000001fe00000ul) >> 21;
 
 	/* TODO: Decoding of the error address */
-	edac_mc_handle_ce(mci, csrow->first_page + pfn, offset,
-			  syndrome, 0, chan, "");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     csrow->first_page + pfn, offset, syndrome,
+			     0, chan, -1, "", "", NULL);
 }
 
 static void cell_edac_count_ue(struct mem_ctl_info *mci, int chan, u64 ar)
@@ -69,7 +70,9 @@ static void cell_edac_count_ue(struct mem_ctl_info *mci, int chan, u64 ar)
 	offset = address & ~PAGE_MASK;
 
 	/* TODO: Decoding of the error address */
-	edac_mc_handle_ue(mci, csrow->first_page + pfn, offset, 0, "");
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			     csrow->first_page + pfn, offset, 0,
+			     0, chan, -1, "", "", NULL);
 }
 
 static void cell_edac_check(struct mem_ctl_info *mci)
@@ -156,7 +159,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, dimm->nr_pages);
+			csrow->first_page, nr_pages);
 		break;
 	}
 }
@@ -165,9 +168,10 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 {
 	struct cbe_mic_tm_regs __iomem	*regs;
 	struct mem_ctl_info		*mci;
+	struct edac_mc_layer		layers[2];
 	struct cell_edac_priv		*priv;
 	u64				reg;
-	int				rc, chanmask;
+	int				rc, chanmask, num_chans;
 
 	regs = cbe_get_cpu_mic_tm_regs(cbe_node_to_cpu(pdev->id));
 	if (regs == NULL)
@@ -192,8 +196,16 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 		in_be64(&regs->mic_fir));
 
 	/* Allocate & init EDAC MC data structure */
-	mci = edac_mc_alloc(sizeof(struct cell_edac_priv), 1,
-			    chanmask == 3 ? 2 : 1, pdev->id);
+	num_chans = chanmask == 3 ? 2 : 1;
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 1;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = num_chans;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct cell_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
 	priv = mci->pvt_info;
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 7b764a8..eb6297d 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -336,7 +336,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 	get_total_mem(pdata);
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		mbmr = __raw_readl(pdata->vbase + REG_MBMR_OFFSET +
 				   0x20 * index);
 		mbbar = __raw_readl(pdata->vbase + REG_MBBAR_OFFSET +
@@ -555,13 +555,18 @@ static void cpc925_mc_check(struct mem_ctl_info *mci)
 	if (apiexcp & CECC_EXCP_DETECTED) {
 		cpc925_mc_printk(mci, KERN_INFO, "DRAM CECC Fault\n");
 		channel = cpc925_mc_find_channel(mci, syndrome);
-		edac_mc_handle_ce(mci, pfn, offset, syndrome,
-				  csrow, channel, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, offset, syndrome,
+				     csrow, channel, -1,
+				     mci->ctl_name, "", NULL);
 	}
 
 	if (apiexcp & UECC_EXCP_DETECTED) {
 		cpc925_mc_printk(mci, KERN_INFO, "DRAM UECC Fault\n");
-		edac_mc_handle_ue(mci, pfn, offset, csrow, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, offset, 0,
+				     csrow, -1, -1,
+				     mci->ctl_name, "", NULL);
 	}
 
 	cpc925_mc_printk(mci, KERN_INFO, "Dump registers:\n");
@@ -933,6 +938,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 {
 	static int edac_mc_idx;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	void __iomem *vbase;
 	struct cpc925_mc_pdata *pdata;
 	struct resource *r;
@@ -969,8 +975,15 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	}
 
 	nr_channels = cpc925_mc_get_channels(vbase) + 1;
-	mci = edac_mc_alloc(sizeof(struct cpc925_mc_pdata),
-			CPC925_NR_CSROWS, nr_channels, edac_mc_idx);
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = CPC925_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct cpc925_mc_pdata));
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
 		res = -ENOMEM;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 6d81d3c..9f8b763 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -6,6 +6,9 @@
  *
  * See "enum e752x_chips" below for supported chipsets
  *
+ * Datasheet:
+ *	http://www.intel.in/content/www/in/en/chipsets/e7525-memory-controller-hub-datasheet.html
+ *
  * Written by Tom Zimmerman
  *
  * Contributors:
@@ -350,8 +353,10 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	channel = !(error_one & 1);
 
 	/* e752x mc reads 34:6 of the DRAM linear address */
-	edac_mc_handle_ce(mci, page, offset_in_page(sec1_add << 4),
-			sec1_syndrome, row, channel, "e752x CE");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     page, offset_in_page(sec1_add << 4), sec1_syndrome,
+			     row, channel, -1,
+			     "e752x CE", "", NULL);
 }
 
 static inline void process_ce(struct mem_ctl_info *mci, u16 error_one,
@@ -385,9 +390,12 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 			edac_mc_find_csrow_by_page(mci, block_page);
 
 		/* e752x mc reads 34:6 of the DRAM linear address */
-		edac_mc_handle_ue(mci, block_page,
-				offset_in_page(error_2b << 4),
-				row, "e752x UE from Read");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					block_page,
+					offset_in_page(error_2b << 4), 0,
+					row, -1, -1,
+					"e752x UE from Read", "", NULL);
+
 	}
 	if (error_one & 0x0404) {
 		error_2b = scrb_add;
@@ -401,9 +409,11 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 			edac_mc_find_csrow_by_page(mci, block_page);
 
 		/* e752x mc reads 34:6 of the DRAM linear address */
-		edac_mc_handle_ue(mci, block_page,
-				offset_in_page(error_2b << 4),
-				row, "e752x UE from Scruber");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					block_page,
+					offset_in_page(error_2b << 4), 0,
+					row, -1, -1,
+					"e752x UE from Scrubber", "", NULL);
 	}
 }
 
@@ -426,7 +436,9 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 		return;
 
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ue_no_info(mci, "e752x UE log memory write");
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+			     -1, -1, -1,
+			     "e752x UE log memory write", "", NULL);
 }
 
 static void do_process_ded_retry(struct mem_ctl_info *mci, u16 error,
@@ -1062,7 +1074,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	 * channel operation).  DRB regs are cumulative; therefore DRB7 will
 	 * contain the total memory contained in all eight rows.
 	 */
-	for (last_cumul_size = index = 0; index < mci->nr_csrows; index++) {
+	for (last_cumul_size = index = 0; index < mci->num_csrows; index++) {
 		/* mem_dev 0=x8, 1=x4 */
 		mem_dev = (dra >> (index * 4 + 2)) & 0x3;
 		csrow = &mci->csrows[remap_csrow_index(mci, index)];
@@ -1081,10 +1093,11 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
-		for (i = 0; i < drc_chan + 1; i++) {
+		for (i = 0; i < csrow->nr_channels; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
 
-			dimm->nr_pages = nr_pages / (drc_chan + 1);
+			debugf3("Initializing rank at (%i,%i)\n", index, i);
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
@@ -1232,6 +1245,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	u16 pci_data;
 	u8 stat8;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct e752x_pvt *pvt;
 	u16 ddrcsr;
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
@@ -1258,11 +1272,16 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	/* Dual channel = 1, Single channel = 0 */
 	drc_chan = dual_channel_active(ddrcsr);
 
-	mci = edac_mc_alloc(sizeof(*pvt), E752X_NR_CSROWS, drc_chan + 1, 0);
-
-	if (mci == NULL) {
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = E752X_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = drc_chan + 1;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
+	if (mci == NULL)
 		return -ENOMEM;
-	}
 
 	debugf3("%s(): init mci\n", __func__);
 	mci->mtype_cap = MEM_FLAG_RDDR;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index aeb69f0..7b4f3aa 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -10,6 +10,9 @@
  * Based on work by Dan Hollis <goemon at anime dot net> and others.
  *	http://www.anime.net/~goemon/linux-ecc/
  *
+ * Datasheet:
+ *	http://www.intel.com/content/www/us/en/chipsets/e7501-chipset-memory-controller-hub-datasheet.html
+ *
  * Contributors:
  *	Eric Biederman (Linux Networx)
  *	Tom Zimmerman (Linux Networx)
@@ -71,7 +74,7 @@
 #endif				/* PCI_DEVICE_ID_INTEL_7505_1_ERR */
 
 #define E7XXX_NR_CSROWS		8	/* number of csrows */
-#define E7XXX_NR_DIMMS		8	/* FIXME - is this correct? */
+#define E7XXX_NR_DIMMS		8	/* 2 channels, 4 dimms/channel */
 
 /* E7XXX register addresses - device 0 function 0 */
 #define E7XXX_DRB		0x60	/* DRAM row boundary register (8b) */
@@ -216,13 +219,15 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	row = edac_mc_find_csrow_by_page(mci, page);
 	/* convert syndrome to channel */
 	channel = e7xxx_find_channel(syndrome);
-	edac_mc_handle_ce(mci, page, 0, syndrome, row, channel, "e7xxx CE");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, page, 0, syndrome,
+			     row, channel, -1, "e7xxx CE", "", NULL);
 }
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ce_no_info(mci, "e7xxx CE log register overflow");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0, -1, -1, -1,
+			     "e7xxx CE log register overflow", "", NULL);
 }
 
 static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
@@ -236,13 +241,17 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	/* FIXME - should use PAGE_SHIFT */
 	block_page = error_2b >> 6;	/* convert to 4k address */
 	row = edac_mc_find_csrow_by_page(mci, block_page);
-	edac_mc_handle_ue(mci, block_page, 0, row, "e7xxx UE");
+
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, block_page, 0, 0,
+			     row, -1, -1, "e7xxx UE", "", NULL);
 }
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ue_no_info(mci, "e7xxx UE log register overflow");
+
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
+			     "e7xxx UE log register overflow", "", NULL);
 }
 
 static void e7xxx_get_error_info(struct mem_ctl_info *mci,
@@ -365,7 +374,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	 * channel operation).  DRB regs are cumulative; therefore DRB7 will
 	 * contain the total memory contained in all eight rows.
 	 */
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		/* mem_dev 0=x8, 1=x4 */
 		mem_dev = (dra >> (index * 4 + 3)) & 0x1;
 		csrow = &mci->csrows[index];
@@ -413,6 +422,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	u16 pci_data;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	struct e7xxx_pvt *pvt = NULL;
 	u32 drc;
 	int drc_chan;
@@ -423,8 +433,21 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
 	drc_chan = dual_channel_active(drc, dev_idx);
-	mci = edac_mc_alloc(sizeof(*pvt), E7XXX_NR_CSROWS, drc_chan + 1, 0);
-
+	/*
+	 * According to the datasheet, this device has a maximum of
+	 * 4 DIMMs per channel, either single-rank or dual-rank. So, the
+	 * total number of DIMMs is 8 (E7XXX_NR_DIMMS).
+	 * That means that the DIMM is mapped as CSROWs, and the channel
+	 * will map the rank. So, an error on either channel should be
+	 * attributed to the same DIMM.
+	 */
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = E7XXX_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = drc_chan + 1;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (mci == NULL)
 		return -ENOMEM;
 
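The driver conversions above all describe the controller as a small array of layers (chip select, then channel) and pass it to edac_mc_alloc(). The userspace sketch below shows how such a description turns into totals and how a location counter can be stepped with the last layer moving fastest, which is what the edac_mc_alloc() changes in this patch do. It is only an illustration: struct layer, struct totals, count_totals(), next_pos() and MAX_LAYERS are hypothetical stand-ins, not kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* hypothetical; the patch uses EDAC_MAX_LAYERS */

struct layer {		/* simplified stand-in for struct edac_mc_layer */
	unsigned size;
	bool is_csrow;
};

struct totals {
	unsigned dimms, csrows, cschannels;
};

/* Multiply the layer sizes: all layers for dimms, split by is_csrow. */
static struct totals count_totals(const struct layer *layers,
				  unsigned n_layers)
{
	struct totals t = { 1, 1, 1 };
	unsigned i;

	assert(n_layers <= MAX_LAYERS);
	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_csrow)
			t.csrows *= layers[i].size;
		else
			t.cschannels *= layers[i].size;
	}
	return t;
}

/*
 * Advance 'pos' to the next location, last layer fastest, like an
 * odometer. Returns false once the counter wraps back to all zeros.
 */
static bool next_pos(const struct layer *layers, unsigned n_layers,
		     unsigned pos[MAX_LAYERS])
{
	int j;

	assert(n_layers <= MAX_LAYERS);
	for (j = (int)n_layers - 1; j >= 0; j--) {
		if (++pos[j] < layers[j].size)
			return true;
		pos[j] = 0;
	}
	return false;
}
```

With an 8 csrow by 2 channel description, count_totals() yields 16 dimms, 8 csrows and 2 channels, and next_pos() steps through (0,0), (0,1), (1,0) and so on until it wraps.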
diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..8aadd83 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,11 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -456,35 +459,17 @@ extern struct mem_ctl_info *find_mci_by_dev(struct device *dev);
 extern struct mem_ctl_info *edac_mc_del_mc(struct device *dev);
 extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
 				      unsigned long page);
-
-/*
- * The no info errors are used when error overflows are reported.
- * There are a limited number of error logging registers that can
- * be exausted.  When all registers are exhausted and an additional
- * error occurs then an error overflow register records that an
- * error occurred and the type of error, but doesn't have any
- * further information.  The ce/ue versions make for cleaner
- * reporting logic and function interface - reduces conditional
- * statement clutter and extra function arguments.
- */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
 
 /*
  * edac_device APIs
@@ -496,6 +481,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
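The new edac_mc_handle_error() prototype above takes three layer positions, and the converted drivers pass -1 for any layer they cannot determine. The body of that function is not part of this header, so the following is only a userspace sketch of how a caller-visible location string could be built under that convention. format_location() and MAX_LAYERS are hypothetical names; the layer names are supplied by the caller.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LAYERS 3	/* hypothetical; mirrors the three layerN arguments */

/*
 * Build a string such as "csrow:3 channel:1" from up to MAX_LAYERS
 * positions. A negative position means "unknown" and is skipped.
 * Returns the number of layers that were written.
 */
static int format_location(char *buf, size_t len,
			   const char *const names[], int n_layers,
			   const int pos[MAX_LAYERS])
{
	int i, known = 0;
	size_t used = 0;

	assert(n_layers >= 0 && n_layers <= MAX_LAYERS);
	assert(len > 0);
	buf[0] = '\0';
	for (i = 0; i < n_layers; i++) {
		int n;

		if (pos[i] < 0)
			continue;	/* layer not known for this error */
		n = snprintf(buf + used, len - used, "%s%s:%d",
			     known ? " " : "", names[i], pos[i]);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* output truncated, stop here */
		used += (size_t)n;
		known++;
	}
	return known;
}
```

A call with positions { 3, -1, -1 } and names { "csrow", "channel" } produces "csrow:3", which matches the uncorrected-error calls in this patch that know the row but not the channel.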
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 4b15459..cb397d9 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	unsigned total_size;
 	unsigned count;
 	unsigned instance, block, attr;
-	void *pvt;
+	void *pvt, *p;
 	int err;
 
 	debugf4("%s() instances=%d blocks=%d\n",
@@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	 * to be at least as stringent as what the compiler would
 	 * provide if we could simply hardcode everything into a single struct.
 	 */
-	dev_ctl = (struct edac_device_ctl_info *)NULL;
+	p = NULL;
+	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
 
 	/* Calc the 'end' offset past end of ONE ctl_info structure
 	 * which will become the start of the 'instance' array
 	 */
-	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
+	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
 
 	/* Calc the 'end' offset past the instance array within the ctl_info
 	 * which will become the start of the block array
 	 */
-	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
+	count = nr_instances * nr_blocks;
+	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
 
 	/* Calc the 'end' offset past the dev_blk array
 	 * which will become the start of the attrib array, if any.
 	 */
-	count = nr_instances * nr_blocks;
-	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
-
-	/* Check for case of when an attribute array is specified */
-	if (nr_attrib > 0) {
-		/* calc how many nr_attrib we need */
+	/* calc how many nr_attrib we need */
+	if (nr_attrib > 0)
 		count *= nr_attrib;
+	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
 
-		/* Calc the 'end' offset past the attributes array */
-		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
-	} else {
-		/* no attribute array specificed */
-		pvt = edac_align_ptr(dev_attrib, sz_private);
-	}
+	/* Calc the 'end' offset past the attributes array */
+	pvt = edac_align_ptr(&p, sz_private, 1);
 
 	/* 'pvt' now points to where the private data area is.
 	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
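The hunk above switches edac_device_alloc_ctl_info() to a running pointer that edac_align_ptr() advances for each sub-array, so that every offset is known before the single allocation is made. The userspace sketch below shows the same idea with plain size_t offsets. It is a deliberately different variant, not the kernel helper: reserve() takes an explicit alignment instead of deriving it from the element size, and reserve(), alloc_all(), struct hdr and struct item are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Reserve room for 'count' objects of 'size' bytes, aligned to 'align'
 * (a power of two), at the running offset '*off'. Returns the aligned
 * offset of the first object and advances '*off' past the last one.
 */
static size_t reserve(size_t *off, size_t size, size_t align, size_t count)
{
	size_t start;

	assert(align != 0 && (align & (align - 1)) == 0);
	start = (*off + align - 1) & ~(align - 1);
	*off = start + size * count;
	return start;
}

struct hdr  { int n_items; };		/* hypothetical header */
struct item { long long value; };	/* hypothetical array element */

/* Allocate a header, 'n' items and 'pvt_size' private bytes as one chunk. */
static struct hdr *alloc_all(size_t n, size_t pvt_size,
			     struct item **items, void **pvt)
{
	size_t off = 0, hdr_off, item_off, pvt_off;
	char *base;

	/* First pass: compute offsets only, nothing is allocated yet. */
	hdr_off  = reserve(&off, sizeof(struct hdr), _Alignof(struct hdr), 1);
	item_off = reserve(&off, sizeof(struct item), _Alignof(struct item), n);
	pvt_off  = reserve(&off, pvt_size, _Alignof(long long), 1);

	/* Second pass: one zeroed allocation, then turn offsets into pointers. */
	base = calloc(1, off ? off : 1);
	if (!base)
		return NULL;
	*items = (struct item *)(base + item_off);
	*pvt = pvt_size ? base + pvt_off : NULL;
	return (struct hdr *)(base + hdr_off);
}
```

Because the header sits at offset 0, freeing the returned pointer releases the items and the private area as well, which is the "same lifetime" property the kernel comments rely on.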
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 02263c3..2793fcb 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -68,8 +84,10 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf3("\tmci->edac_ctl_cap = %lx\n", mci->edac_ctl_cap);
 	debugf3("\tmci->edac_cap = %lx\n", mci->edac_cap);
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
-	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
-		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->num_csrows = %d, csrows = %p\n",
+		mci->num_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -108,9 +126,12 @@ EXPORT_SYMBOL_GPL(edac_mem_types);
  * If 'size' is a constant, the compiler will optimize this whole function
  * down to either a no-op or the addition of a constant to the value of 'ptr'.
  */
-void *edac_align_ptr(void *ptr, unsigned size)
+void *edac_align_ptr(void **p, unsigned size, int quant)
 {
 	unsigned align, r;
+	void *ptr = *p;
+
+	*p += size * quant;
 
 	/* Here we assume that the alignment of a "long long" is the most
 	 * stringent alignment that the compiler will ever provide by default.
@@ -132,14 +153,31 @@ void *edac_align_ptr(void *ptr, unsigned size)
 	if (r == 0)
 		return (char *)ptr;
 
+	*p += align - r;
+
 	return (void *)(((unsigned long)ptr) + align - r);
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers in the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fill csrows/cs channels in reverse order
+ * @sz_pvt:		Size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, in others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study of JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -151,30 +189,64 @@ void *edac_align_ptr(void *ptr, unsigned size)
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt)
 {
+	void *ptr;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *lay;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_cschannels;
+	int i, j;
 	int err;
+	int row, chn;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total number of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_cschannels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_cschannels *= layers[i].size;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
 	 * stringent as what the compiler would provide if we could simply
 	 * hardcode everything into a single struct.
 	 */
-	mci = (struct mem_ctl_info *)0;
-	csi = edac_align_ptr(&mci[1], sizeof(*csi));
-	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
-	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
+	ptr = NULL;
+	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
+	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+	}
+	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%u dimms, %u csrows/channels)\n",
+		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -182,45 +254,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
+	mci->n_layers = n_layers;
+	mci->layers = lay;
+	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
+	mci->num_csrows = tot_csrows;
+	mci->num_cschannel = tot_cschannels;
 
-		for (chn = 0; chn < nr_chans; chn++) {
+	/*
+	 * Fills the csrow struct
+	 */
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_cschannels;
+		chp = &chi[row * tot_cschannels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_cschannels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
 		}
 	}
 
 	/*
-	 * By default, assumes that a per-csrow arrangement will be used,
-	 * as most drivers are based on such assumption.
+	 * Fills the dimm struct
 	 */
-	dimm = mci->dimms;
-	for (row = 0; row < mci->nr_csrows; row++) {
-		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %u dimms\n", __func__, tot_dimms);
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = GET_POS(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_csrow)
+					break;
+			chn++;
+			if (chn == tot_cschannels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -509,7 +635,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -528,7 +653,7 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 	if (edac_debug_level >= 4) {
 		int i;
 
-		for (i = 0; i < mci->nr_csrows; i++) {
+		for (i = 0; i < mci->num_csrows; i++) {
 			int j;
 
 			edac_mc_dump_csrow(&mci->csrows[i]);
@@ -536,6 +661,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -660,7 +787,7 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->num_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
 		n = 0;
 		for (j = 0; j < csrow->nr_channels; j++) {
@@ -676,9 +803,9 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 			csrow->page_mask);
 
 		if ((page >= csrow->first_page) &&
-		(page <= csrow->last_page) &&
-		((page & csrow->page_mask) ==
-		(csrow->first_page & csrow->page_mask))) {
+		    (page <= csrow->last_page) &&
+		    ((page & csrow->page_mask) ==
+		    (csrow->first_page & csrow->page_mask))) {
 			row = i;
 			break;
 		}
@@ -693,261 +820,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
-{
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
+	mci->ce_mc++;
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range.
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
-
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for unpopulated DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: dimm csrows (%d,%d)\n",
+				__func__, dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
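
The index arithmetic shared by edac_increment_ce_error() and
edac_increment_ue_error() can be sketched in isolation. This is a minimal
userspace sketch, not kernel code: layer_indexes() and its parameters are
hypothetical names introduced only for illustration. It assumes that layer i
owns one counter per combination of positions in layers 0..i, and that a
negative position means "unknown from this layer on".

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): for each layer whose
 * position is known, store the flat counter index that the increment
 * routines would bump. Returns how many layers got an index.
 */
static int layer_indexes(const unsigned size[], int n_layers,
			 const int pos[], unsigned idx_out[])
{
	unsigned index = 0;
	int i;

	for (i = 0; i < n_layers; i++) {
		if (pos[i] < 0)
			break;		/* unknown position: stop here */
		index += (unsigned)pos[i];
		idx_out[i] = index;
		/* scale up so the next layer's position fits below it */
		if (i < n_layers - 1)
			index *= size[i + 1];
	}
	return i;
}
```

With layer sizes 2, 2 and 4, the position (1, 0, 3) maps to the indexes 1, 2
and 11, the last one being the usual row-major offset (1 * 2 + 0) * 4 + 3.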
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 52c56cf..256fd4e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -129,13 +129,13 @@ static const char *edac_caps[] = {
  */
 
 /* Set of more default csrow<id> attribute show/store functions */
-static ssize_t csrow_ue_count_show(struct csrow_info *csrow, char *data,
+static ssize_t csrow_ue_mc_show(struct csrow_info *csrow, char *data,
 				int private)
 {
 	return sprintf(data, "%u\n", csrow->ue_count);
 }
 
-static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
+static ssize_t csrow_ce_mc_show(struct csrow_info *csrow, char *data,
 				int private)
 {
 	return sprintf(data, "%u\n", csrow->ce_count);
@@ -196,8 +196,8 @@ static ssize_t channel_dimm_label_store(struct csrow_info *csrow,
 	return max_size;
 }
 
-/* show function for dynamic chX_ce_count attribute */
-static ssize_t channel_ce_count_show(struct csrow_info *csrow,
+/* show function for dynamic chX_ce_mc attribute */
+static ssize_t channel_ce_mc_show(struct csrow_info *csrow,
 				char *data, int channel)
 {
 	return sprintf(data, "%u\n", csrow->channels[channel].ce_count);
@@ -258,8 +258,8 @@ CSROWDEV_ATTR(size_mb, S_IRUGO, csrow_size_show, NULL, 0);
 CSROWDEV_ATTR(dev_type, S_IRUGO, csrow_dev_type_show, NULL, 0);
 CSROWDEV_ATTR(mem_type, S_IRUGO, csrow_mem_type_show, NULL, 0);
 CSROWDEV_ATTR(edac_mode, S_IRUGO, csrow_edac_mode_show, NULL, 0);
-CSROWDEV_ATTR(ue_count, S_IRUGO, csrow_ue_count_show, NULL, 0);
-CSROWDEV_ATTR(ce_count, S_IRUGO, csrow_ce_count_show, NULL, 0);
+CSROWDEV_ATTR(ue_mc, S_IRUGO, csrow_ue_mc_show, NULL, 0);
+CSROWDEV_ATTR(ce_mc, S_IRUGO, csrow_ce_mc_show, NULL, 0);
 
 /* default attributes of the CSROW<id> object */
 static struct csrowdev_attribute *default_csrow_attr[] = {
@@ -267,8 +267,8 @@ static struct csrowdev_attribute *default_csrow_attr[] = {
 	&attr_mem_type,
 	&attr_edac_mode,
 	&attr_size_mb,
-	&attr_ue_count,
-	&attr_ce_count,
+	&attr_ue_mc,
+	&attr_ce_mc,
 	NULL,
 };
 
@@ -296,22 +296,22 @@ static struct csrowdev_attribute *dynamic_csrow_dimm_attr[] = {
 	&attr_ch5_dimm_label
 };
 
-/* possible dynamic channel ce_count attribute files */
-CSROWDEV_ATTR(ch0_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 0);
-CSROWDEV_ATTR(ch1_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 1);
-CSROWDEV_ATTR(ch2_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 2);
-CSROWDEV_ATTR(ch3_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 3);
-CSROWDEV_ATTR(ch4_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 4);
-CSROWDEV_ATTR(ch5_ce_count, S_IRUGO | S_IWUSR, channel_ce_count_show, NULL, 5);
-
-/* Total possible dynamic ce_count attribute file table */
-static struct csrowdev_attribute *dynamic_csrow_ce_count_attr[] = {
-	&attr_ch0_ce_count,
-	&attr_ch1_ce_count,
-	&attr_ch2_ce_count,
-	&attr_ch3_ce_count,
-	&attr_ch4_ce_count,
-	&attr_ch5_ce_count
+/* possible dynamic channel ce_mc attribute files */
+CSROWDEV_ATTR(ch0_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 0);
+CSROWDEV_ATTR(ch1_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 1);
+CSROWDEV_ATTR(ch2_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 2);
+CSROWDEV_ATTR(ch3_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 3);
+CSROWDEV_ATTR(ch4_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 4);
+CSROWDEV_ATTR(ch5_ce_mc, S_IRUGO | S_IWUSR, channel_ce_mc_show, NULL, 5);
+
+/* Total possible dynamic ce_mc attribute file table */
+static struct csrowdev_attribute *dynamic_csrow_ce_mc_attr[] = {
+	&attr_ch0_ce_mc,
+	&attr_ch1_ce_mc,
+	&attr_ch2_ce_mc,
+	&attr_ch3_ce_mc,
+	&attr_ch4_ce_mc,
+	&attr_ch5_ce_mc
 };
 
 #define EDAC_NR_CHANNELS	6
@@ -333,9 +333,9 @@ static int edac_create_channel_files(struct kobject *kobj, int chan)
 		/* create the CE Count attribute file */
 		err = sysfs_create_file(kobj,
 					(struct attribute *)
-					dynamic_csrow_ce_count_attr[chan]);
+					dynamic_csrow_ce_mc_attr[chan]);
 	} else {
-		debugf1("%s()  dimm labels and ce_count files created",
+		debugf1("%s()  dimm labels and ce_mc files created",
 			__func__);
 	}
 
@@ -395,7 +395,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	 */
 
 	/* Create the dyanmic attribute files on this csrow,
-	 * namely, the DIMM labels and the channel ce_count
+	 * namely, the DIMM labels and the channel ce_mc
 	 */
 	for (chan = 0; chan < csrow->nr_channels; chan++) {
 		err = edac_create_channel_files(&csrow->kobj, chan);
@@ -421,14 +421,14 @@ err_out:
 static ssize_t mci_reset_counters_store(struct mem_ctl_info *mci,
 					const char *data, size_t count)
 {
-	int row, chan;
+	int cnt, row, chan, i;
 
+	mci->ue_mc = 0;
+	mci->ce_mc = 0;
 	mci->ue_noinfo_count = 0;
 	mci->ce_noinfo_count = 0;
-	mci->ue_count = 0;
-	mci->ce_count = 0;
 
-	for (row = 0; row < mci->nr_csrows; row++) {
+	for (row = 0; row < mci->num_csrows; row++) {
 		struct csrow_info *ri = &mci->csrows[row];
 
 		ri->ue_count = 0;
@@ -438,6 +438,13 @@ static ssize_t mci_reset_counters_store(struct mem_ctl_info *mci,
 			ri->channels[chan].ce_count = 0;
 	}
 
+	cnt = 1;
+	for (i = 0; i < mci->n_layers; i++) {
+		cnt *= mci->layers[i].size;
+		memset(mci->ce_per_layer[i], 0, cnt * sizeof(*mci->ce_per_layer[i]));
+		memset(mci->ue_per_layer[i], 0, cnt * sizeof(*mci->ue_per_layer[i]));
+	}
+
 	mci->start_time = jiffies;
 	return count;
 }
@@ -493,14 +500,14 @@ static ssize_t mci_sdram_scrub_rate_show(struct mem_ctl_info *mci, char *data)
 }
 
 /* default attribute files for the MCI object */
-static ssize_t mci_ue_count_show(struct mem_ctl_info *mci, char *data)
+static ssize_t mci_ue_mc_show(struct mem_ctl_info *mci, char *data)
 {
-	return sprintf(data, "%d\n", mci->ue_count);
+	return sprintf(data, "%d\n", mci->ue_mc);
 }
 
-static ssize_t mci_ce_count_show(struct mem_ctl_info *mci, char *data)
+static ssize_t mci_ce_mc_show(struct mem_ctl_info *mci, char *data)
 {
-	return sprintf(data, "%d\n", mci->ce_count);
+	return sprintf(data, "%d\n", mci->ce_mc);
 }
 
 static ssize_t mci_ce_noinfo_show(struct mem_ctl_info *mci, char *data)
@@ -527,7 +534,7 @@ static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
 	int total_pages, csrow_idx, j;
 
-	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
+	for (total_pages = csrow_idx = 0; csrow_idx < mci->num_csrows;
 	     csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
@@ -595,8 +602,8 @@ MCIDEV_ATTR(size_mb, S_IRUGO, mci_size_mb_show, NULL);
 MCIDEV_ATTR(seconds_since_reset, S_IRUGO, mci_seconds_show, NULL);
 MCIDEV_ATTR(ue_noinfo_count, S_IRUGO, mci_ue_noinfo_show, NULL);
 MCIDEV_ATTR(ce_noinfo_count, S_IRUGO, mci_ce_noinfo_show, NULL);
-MCIDEV_ATTR(ue_count, S_IRUGO, mci_ue_count_show, NULL);
-MCIDEV_ATTR(ce_count, S_IRUGO, mci_ce_count_show, NULL);
+MCIDEV_ATTR(ue_mc, S_IRUGO, mci_ue_mc_show, NULL);
+MCIDEV_ATTR(ce_mc, S_IRUGO, mci_ce_mc_show, NULL);
 
 /* memory scrubber attribute file */
 MCIDEV_ATTR(sdram_scrub_rate, S_IRUGO | S_IWUSR, mci_sdram_scrub_rate_show,
@@ -609,8 +616,8 @@ static struct mcidev_sysfs_attribute *mci_attr[] = {
 	&mci_attr_seconds_since_reset,
 	&mci_attr_ue_noinfo_count,
 	&mci_attr_ce_noinfo_count,
-	&mci_attr_ue_count,
-	&mci_attr_ce_count,
+	&mci_attr_ue_mc,
+	&mci_attr_ce_mc,
 	&mci_attr_sdram_scrub_rate,
 	NULL
 };
@@ -940,7 +947,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->num_csrows; i++) {
 		int n = 0;
 
 		csrow = &mci->csrows[i];
@@ -997,7 +1004,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->num_csrows; i++) {
 		int n = 0;
 
 		csrow = &mci->csrows[i];
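
The counter reset in mci_reset_counters_store() relies on the number of
per-layer counters growing as the running product of the layer sizes. Below
is a minimal userspace sketch of that sizing rule. It assumes the counters
are plain arrays of 32-bit values; layer_counters() and
reset_layer_counters() are hypothetical names, not functions from this
series.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Number of counters owned by 'layer': product of sizes of layers 0..layer. */
static size_t layer_counters(const unsigned size[], int layer)
{
	size_t cnt = 1;
	int i;

	for (i = 0; i <= layer; i++)
		cnt *= size[i];
	return cnt;
}

/* Zero every per-layer counter array; memset() takes a length in bytes. */
static void reset_layer_counters(uint32_t *counters[], const unsigned size[],
				 int n_layers)
{
	int i;

	for (i = 0; i < n_layers; i++)
		memset(counters[i], 0,
		       layer_counters(size, i) * sizeof(*counters[i]));
}
```

For layer sizes 2, 2 and 4 this gives 2, 4 and 16 counters per layer. The
byte length handed to memset() is the element count times the element size.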
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 00f81b4..0be4b01 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
 extern void edac_mc_reset_delay_period(int value);
 
-extern void *edac_align_ptr(void *ptr, unsigned size);
+extern void *edac_align_ptr(void **p, unsigned size, int quant);
 
 /*
  * EDAC PCI functions
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 63af1c5..9016560 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -42,13 +42,14 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 						const char *edac_pci_name)
 {
 	struct edac_pci_ctl_info *pci;
-	void *pvt;
+	void *p, *pvt;
 	unsigned int size;
 
 	debugf1("%s()\n", __func__);
 
-	pci = (struct edac_pci_ctl_info *)0;
-	pvt = edac_align_ptr(&pci[1], sz_pvt);
+	p = NULL;
+	pci = edac_align_ptr(&p, sizeof(*pci), 1);
+	pvt = edac_align_ptr(&p, 1, sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	/* Alloc the needed control struct memory */
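
The allocation pattern used here (lay out several objects from a zero base,
allocate once, then rebase the offsets) can be sketched with plain offsets
instead of pointers. This is only an illustration under stated assumptions:
reserve() is a hypothetical helper and is not the edac_align_ptr()
implementation, and the alignment is passed in explicitly rather than
derived from the element size.

```c
#include <stddef.h>

/*
 * Hypothetical helper: reserve room for 'count' elements of 'size' bytes
 * at the next offset that is a multiple of 'align'. Returns the start
 * offset and advances *cursor past the reserved area.
 */
static size_t reserve(size_t *cursor, size_t size, size_t count, size_t align)
{
	size_t start = (*cursor + align - 1) / align * align;

	*cursor = start + size * count;
	return start;
}
```

After all reservations, the final cursor value is the total size to request
from the allocator, and each object lives at the allocation base plus its
returned offset.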
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index bf8a230..c366002 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -245,7 +245,9 @@ static int i3000_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & I3000_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -256,10 +258,15 @@ static int i3000_process_error_info(struct mem_ctl_info *mci,
 	row = edac_mc_find_csrow_by_page(mci, pfn);
 
 	if (info->errsts & I3000_ERRSTS_UE)
-		edac_mc_handle_ue(mci, pfn, offset, row, "i3000 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, offset, 0,
+				     row, -1, -1,
+				     "i3000 UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, pfn, offset, info->derrsyn, row,
-				multi_chan ? channel : 0, "i3000 CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, offset, info->derrsyn,
+				     row, multi_chan ? channel : 0, -1,
+				     "i3000 CE", "", NULL);
 
 	return 1;
 }
@@ -306,6 +313,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
@@ -347,7 +355,14 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	interleaved = i3000_is_interleaved(c0dra, c1dra, c0drb, c1drb);
 	nr_channels = interleaved ? 2 : 1;
-	mci = edac_mc_alloc(0, I3000_RANKS / nr_channels, nr_channels, 0);
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I3000_RANKS / nr_channels;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
@@ -375,7 +390,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	 * If we're in interleaved mode then we're only walking through
 	 * the ranks of controller 0, so we double all the values we see.
 	 */
-	for (last_cumul_size = i = 0; i < mci->nr_csrows; i++) {
+	for (last_cumul_size = i = 0; i < mci->num_csrows; i++) {
 		u8 value;
 		u32 cumul_size;
 		struct csrow_info *csrow = &mci->csrows[i];
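
The layer description that the converted drivers hand to edac_mc_alloc()
determines how many DIMMs, virtual csrows and virtual channels exist. The
sketch below assumes the core derives these totals as products of the layer
sizes, split by the is_csrow flag. struct layer, struct totals and
layer_totals() are hypothetical stand-ins written for userspace, not the
kernel definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct edac_mc_layer: only the fields used here. */
struct layer {
	unsigned size;
	bool is_csrow;
};

struct totals {
	unsigned dimms;
	unsigned csrows;
	unsigned cschannels;
};

/* Multiply the layer sizes to get the DIMM, csrow and channel totals. */
static struct totals layer_totals(const struct layer *layers, size_t n_layers)
{
	struct totals t = { 1, 1, 1 };
	size_t i;

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_csrow)
			t.csrows *= layers[i].size;
		else
			t.cschannels *= layers[i].size;
	}
	return t;
}
```

A two-layer setup of 4 chip selects (is_csrow set) by 2 channels would then
yield 8 DIMMs, 4 csrows and 2 channels.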
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index b82667f..7421af9 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -23,6 +23,7 @@
 
 #define PCI_DEVICE_ID_INTEL_3200_HB    0x29f0
 
+#define I3200_DIMMS		4
 #define I3200_RANKS		8
 #define I3200_RANKS_PER_CHANNEL	4
 #define I3200_CHANNELS		2
@@ -217,21 +218,25 @@ static void i3200_process_error_info(struct mem_ctl_info *mci,
 		return;
 
 	if ((info->errsts ^ info->errsts2) & I3200_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	for (channel = 0; channel < nr_channels; channel++) {
 		log = info->eccerrlog[channel];
 		if (log & I3200_ECCERRLOG_UE) {
-			edac_mc_handle_ue(mci, 0, 0,
-				eccerrlog_row(channel, log),
-				"i3200 UE");
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     0, 0, 0,
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "i3200 UE", "", NULL);
 		} else if (log & I3200_ECCERRLOG_CE) {
-			edac_mc_handle_ce(mci, 0, 0,
-				eccerrlog_syndrome(log),
-				eccerrlog_row(channel, log), 0,
-				"i3200 CE");
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, eccerrlog_syndrome(log),
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "i3200 CE", "", NULL);
 		}
 	}
 }
@@ -321,6 +326,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -335,8 +341,14 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	i3200_get_drbs(window, drbs);
 	nr_channels = how_many_channels(pdev);
 
-	mci = edac_mc_alloc(sizeof(struct i3200_priv), I3200_RANKS,
-		nr_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I3200_DIMMS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(struct i3200_priv));
 	if (!mci)
 		return -ENOMEM;
 
@@ -365,7 +377,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->num_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
 
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index e8d32e8..564fe09 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -533,13 +533,14 @@ static void i5000_process_fatal_error_info(struct mem_ctl_info *mci,
 
 	/* Form out message */
 	snprintf(msg, sizeof(msg),
-		 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d CAS=%d "
-		 "FATAL Err=0x%x (%s))",
-		 branch >> 1, bank, rdwr ? "Write" : "Read", ras, cas,
-		 allErrors, specific);
+		 "Bank=%d RAS=%d CAS=%d FATAL Err=0x%x (%s)",
+		 bank, ras, cas, allErrors, specific);
 
 	/* Call the helper to output message */
-	edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+	edac_mc_handle_error(HW_EVENT_ERR_FATAL, mci, 0, 0, 0,
+			     branch >> 1, -1, rank,
+			     rdwr ? "Write error" : "Read error",
+			     msg, NULL);
 }
 
 /*
@@ -633,13 +634,14 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d "
-			 "CAS=%d, UE Err=0x%x (%s))",
-			 branch >> 1, bank, rdwr ? "Write" : "Read", ras, cas,
-			 ue_errors, specific);
+			 "Rank=%d Bank=%d RAS=%d CAS=%d, UE Err=0x%x (%s)",
+			 rank, bank, ras, cas, ue_errors, specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				channel >> 1, -1, rank,
+				rdwr ? "Write error" : "Read error",
+				msg, NULL);
 	}
 
 	/* Check correctable errors */
@@ -685,13 +687,16 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d "
-			 "CAS=%d, CE Err=0x%x (%s))", branch >> 1, bank,
+			 "Rank=%d Bank=%d RDWR=%s RAS=%d "
+			 "CAS=%d, CE Err=0x%x (%s)", rank, bank,
 			 rdwr ? "Write" : "Read", ras, cas, ce_errors,
 			 specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, rank, channel, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				channel >> 1, channel % 2, rank,
+				rdwr ? "Write error" : "Read error",
+				msg, NULL);
 	}
 
 	if (!misc_messages)
@@ -731,11 +736,12 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d Err=%#x (%s))", branch >> 1,
-			 misc_errors, specific);
+			 "Err=%#x (%s)", misc_errors, specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, 0, 0, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				branch >> 1, -1, -1,
+				"Misc error", msg, NULL);
 	}
 }
 
@@ -1251,6 +1257,10 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 	empty = 1;		/* Assume NO memory */
 
+	/*
+	 * TODO: it would be better not to use csrow here, filling the
+	 * dimm_info structs directly, based on branch, channel and dimm number
+	 */
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
@@ -1343,10 +1353,10 @@ static void i5000_get_dimm_and_channel_counts(struct pci_dev *pdev,
 static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[3];
 	struct i5000_pvt *pvt;
 	int num_channels;
 	int num_dimms_per_channel;
-	int num_csrows;
 
 	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
 		__FILE__, __func__,
@@ -1372,13 +1382,21 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
-	num_csrows = num_dimms_per_channel * 2;
 
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
+	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		__func__, num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = 2;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = num_channels;
+	layers[1].is_csrow = false;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = num_dimms_per_channel;
+	layers[2].is_csrow = true;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index a0219a9..fda60f8 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -14,6 +14,11 @@
  * rows for each respective channel are laid out one after another,
  * the first half belonging to channel 0, the second half belonging
  * to channel 1.
+ *
+ * This driver is for DDR2 DIMMs, and it uses chip select to select among the
+ * several ranks. However, instead of showing memories as ranks, it outputs
+ * them as DIMMs. An internal table creates the association between ranks
+ * and DIMMs.
  */
 #include <linux/module.h>
 #include <linux/init.h>
@@ -410,14 +415,6 @@ static int i5100_csrow_to_chan(const struct mem_ctl_info *mci, int csrow)
 	return csrow / priv->ranksperchan;
 }
 
-static unsigned i5100_rank_to_csrow(const struct mem_ctl_info *mci,
-				    int chan, int rank)
-{
-	const struct i5100_priv *priv = mci->pvt_info;
-
-	return chan * priv->ranksperchan + rank;
-}
-
 static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    int chan,
 			    unsigned bank,
@@ -427,21 +424,17 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    unsigned ras,
 			    const char *msg)
 {
-	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
-	char *label = NULL;
+	char detail[80];
 
-	if (mci->csrows[csrow].channels[0].dimm)
-		label = mci->csrows[csrow].channels[0].dimm->label;
+	/* Form out message */
+	snprintf(detail, sizeof(detail),
+		 "bank %u, cas %u, ras %u\n",
+		 bank, cas, ras);
 
-	printk(KERN_ERR
-		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
-		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
-		chan, bank, rank, syndrome, cas, ras,
-		csrow, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[0].ce_count++;
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     0, 0, syndrome,
+			     chan, rank, -1,
+			     msg, detail, NULL);
 }
 
 static void i5100_handle_ue(struct mem_ctl_info *mci,
@@ -453,20 +446,17 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 			    unsigned ras,
 			    const char *msg)
 {
-	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
-	char *label = NULL;
-
-	if (mci->csrows[csrow].channels[0].dimm)
-		label = mci->csrows[csrow].channels[0].dimm->label;
+	char detail[80];
 
-	printk(KERN_ERR
-		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
-		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
-		chan, bank, rank, syndrome, cas, ras,
-		csrow, label, msg);
+	/* Form out message */
+	snprintf(detail, sizeof(detail),
+		 "bank %u, cas %u, ras %u\n",
+		 bank, cas, ras);
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			     0, 0, syndrome,
+			     chan, rank, -1,
+			     msg, detail, NULL);
 }
 
 static void i5100_read_log(struct mem_ctl_info *mci, int chan,
@@ -843,11 +833,10 @@ static void __devinit i5100_init_interleaving(struct pci_dev *pdev,
 static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 {
 	int i;
-	unsigned long total_pages = 0UL;
 	struct i5100_priv *priv = mci->pvt_info;
-	struct dimm_info *dimm;
 
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm;
 		const unsigned long npages = i5100_npages(mci, i);
 		const unsigned chan = i5100_csrow_to_chan(mci, i);
 		const unsigned rank = i5100_csrow_to_rank(mci, i);
@@ -855,30 +844,23 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		if (!npages)
 			continue;
 
-		/*
-		 * FIXME: these two are totally bogus -- I don't see how to
-		 * map them correctly to this structure...
-		 */
-		mci->csrows[i].csrow_idx = i;
-		mci->csrows[i].mci = mci;
-		mci->csrows[i].nr_channels = 1;
-		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		total_pages += npages;
+		dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			       chan, rank, 0);
 
-		dimm = mci->csrows[i].channels[0].dimm;
 		dimm->nr_pages = npages;
 		if (npages) {
-			total_pages += npages;
-
 			dimm->grain = 32;
 			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-				DEV_X4 : DEV_X8;
+					DEV_X4 : DEV_X8;
 			dimm->mtype = MEM_RDDR2;
 			dimm->edac_mode = EDAC_SECDED;
 			snprintf(dimm->label, sizeof(dimm->label),
 				"DIMM%u",
 				i5100_rank_to_slot(mci, chan, rank));
 		}
+
+		debugf2("dimm channel %u, rank %u, size %lu\n",
+			chan, rank, PAGES_TO_MiB(npages));
 	}
 }
 
@@ -887,6 +869,7 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 {
 	int rc;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i5100_priv *priv;
 	struct pci_dev *ch0mm, *ch1mm;
 	int ret = 0;
@@ -947,7 +930,14 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 		goto bail_ch1;
 	}
 
-	mci = edac_mc_alloc(sizeof(*priv), ranksperch * 2, 1, 0);
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = 2;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = ranksperch;
+	layers[1].is_csrow = true;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*priv));
 	if (!mci) {
 		ret = -ENOMEM;
 		goto bail_disable_ch1;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 784d6dc..3aa2a1e 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -18,6 +18,10 @@
  * Intel 5400 Chipset Memory Controller Hub (MCH) - Datasheet
  * 	http://developer.intel.com/design/chipsets/datashts/313070.htm
  *
+ * This Memory Controller manages DDR2 FB-DIMMs. It has 2 branches, each with
+ * 2 channels operating in lockstep no-mirror mode. Each channel can have up to
+ * 4 DIMMs, each with up to 8GB.
+ *
  */
 
 #include <linux/module.h>
@@ -44,12 +48,10 @@
 	edac_mc_chipset_printk(mci, level, "i5400", fmt, ##arg)
 
 /* Limits for i5400 */
-#define NUM_MTRS_PER_BRANCH	4
+#define MAX_BRANCHES		2
 #define CHANNELS_PER_BRANCH	2
-#define MAX_DIMMS_PER_CHANNEL	NUM_MTRS_PER_BRANCH
-#define	MAX_CHANNELS		4
-/* max possible csrows per channel */
-#define MAX_CSROWS		(MAX_DIMMS_PER_CHANNEL)
+#define DIMMS_PER_CHANNEL	4
+#define	MAX_CHANNELS		(MAX_BRANCHES * CHANNELS_PER_BRANCH)
 
 /* Device 16,
  * Function 0: System Address
@@ -347,16 +349,16 @@ struct i5400_pvt {
 
 	u16 mir0, mir1;
 
-	u16 b0_mtr[NUM_MTRS_PER_BRANCH];	/* Memory Technlogy Reg */
+	u16 b0_mtr[DIMMS_PER_CHANNEL];	/* Memory Technology Reg */
 	u16 b0_ambpresent0;			/* Branch 0, Channel 0 */
 	u16 b0_ambpresent1;			/* Brnach 0, Channel 1 */
 
-	u16 b1_mtr[NUM_MTRS_PER_BRANCH];	/* Memory Technlogy Reg */
+	u16 b1_mtr[DIMMS_PER_CHANNEL];	/* Memory Technology Reg */
 	u16 b1_ambpresent0;			/* Branch 1, Channel 8 */
 	u16 b1_ambpresent1;			/* Branch 1, Channel 1 */
 
 	/* DIMM information matrix, allocating architecture maximums */
-	struct i5400_dimm_info dimm_info[MAX_CSROWS][MAX_CHANNELS];
+	struct i5400_dimm_info dimm_info[DIMMS_PER_CHANNEL][MAX_CHANNELS];
 
 	/* Actual values for this controller */
 	int maxch;				/* Max channels */
@@ -532,13 +534,15 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 	int ras, cas;
 	int errnum;
 	char *type = NULL;
+	enum hw_event_mc_err_type tp_event = HW_EVENT_ERR_UNCORRECTED;
 
 	if (!allErrors)
 		return;		/* if no error, return now */
 
-	if (allErrors &  ERROR_FAT_MASK)
+	if (allErrors &  ERROR_FAT_MASK) {
 		type = "FATAL";
-	else if (allErrors & FERR_NF_UNCORRECTABLE)
+		tp_event = HW_EVENT_ERR_FATAL;
+	} else if (allErrors & FERR_NF_UNCORRECTABLE)
 		type = "NON-FATAL uncorrected";
 	else
 		type = "NON-FATAL recoverable";
@@ -556,7 +560,7 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 	ras = nrec_ras(info);
 	cas = nrec_cas(info);
 
-	debugf0("\t\tCSROW= %d  Channels= %d,%d  (Branch= %d "
+	debugf0("\t\tDIMM= %d  Channels= %d,%d  (Branch= %d "
 		"DRAM Bank= %d Buffer ID = %d rdwr= %s ras= %d cas= %d)\n",
 		rank, channel, channel + 1, branch >> 1, bank,
 		buf_id, rdwr_str(rdwr), ras, cas);
@@ -566,13 +570,13 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 
 	/* Form out message */
 	snprintf(msg, sizeof(msg),
-		 "%s (Branch=%d DRAM-Bank=%d Buffer ID = %d RDWR=%s "
-		 "RAS=%d CAS=%d %s Err=0x%lx (%s))",
-		 type, branch >> 1, bank, buf_id, rdwr_str(rdwr), ras, cas,
-		 type, allErrors, error_name[errnum]);
+		 "Bank=%d Buffer ID = %d RAS=%d CAS=%d Err=0x%lx (%s)",
+		 bank, buf_id, ras, cas, allErrors, error_name[errnum]);
 
-	/* Call the helper to output message */
-	edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+	edac_mc_handle_error(tp_event, mci, 0, 0, 0,
+			     branch >> 1, -1, rank,
+			     rdwr ? "Write error" : "Read error",
+			     msg, NULL);
 }
 
 /*
@@ -630,7 +634,7 @@ static void i5400_process_nonfatal_error_info(struct mem_ctl_info *mci,
 		/* Only 1 bit will be on */
 		errnum = find_first_bit(&allErrors, ARRAY_SIZE(error_name));
 
-		debugf0("\t\tCSROW= %d Channel= %d  (Branch %d "
+		debugf0("\t\tDIMM= %d Channel= %d  (Branch %d "
 			"DRAM Bank= %d rdwr= %s ras= %d cas= %d)\n",
 			rank, channel, branch >> 1, bank,
 			rdwr_str(rdwr), ras, cas);
@@ -642,8 +646,10 @@ static void i5400_process_nonfatal_error_info(struct mem_ctl_info *mci,
 			 branch >> 1, bank, rdwr_str(rdwr), ras, cas,
 			 allErrors, error_name[errnum]);
 
-		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, rank, channel, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				     branch >> 1, channel % 2, rank,
+				     rdwr ? "Write error" : "Read error",
+				     msg, NULL);
 
 		return;
 	}
@@ -831,8 +837,8 @@ static int i5400_get_devices(struct mem_ctl_info *mci, int dev_idx)
 /*
  *	determine_amb_present
  *
- *		the information is contained in NUM_MTRS_PER_BRANCH different
- *		registers determining which of the NUM_MTRS_PER_BRANCH requires
+ *		the information is contained in DIMMS_PER_CHANNEL different
+ *		registers determining which of the DIMMS_PER_CHANNEL requires
  *              knowing which channel is in question
  *
  *	2 branches, each with 2 channels
@@ -861,11 +867,11 @@ static int determine_amb_present_reg(struct i5400_pvt *pvt, int channel)
 }
 
 /*
- * determine_mtr(pvt, csrow, channel)
+ * determine_mtr(pvt, dimm, channel)
  *
- * return the proper MTR register as determine by the csrow and desired channel
+ * return the proper MTR register as determined by the dimm and desired channel
  */
-static int determine_mtr(struct i5400_pvt *pvt, int csrow, int channel)
+static int determine_mtr(struct i5400_pvt *pvt, int dimm, int channel)
 {
 	int mtr;
 	int n;
@@ -873,11 +879,11 @@ static int determine_mtr(struct i5400_pvt *pvt, int csrow, int channel)
 	/* There is one MTR for each slot pair of FB-DIMMs,
 	   Each slot pair may be at branch 0 or branch 1.
 	 */
-	n = csrow;
+	n = dimm;
 
-	if (n >= NUM_MTRS_PER_BRANCH) {
-		debugf0("ERROR: trying to access an invalid csrow: %d\n",
-			csrow);
+	if (n >= DIMMS_PER_CHANNEL) {
+		debugf0("ERROR: trying to access an invalid dimm: %d\n",
+			dimm);
 		return 0;
 	}
 
@@ -913,19 +919,19 @@ static void decode_mtr(int slot_row, u16 mtr)
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 }
 
-static void handle_channel(struct i5400_pvt *pvt, int csrow, int channel,
+static void handle_channel(struct i5400_pvt *pvt, int dimm, int channel,
 			struct i5400_dimm_info *dinfo)
 {
 	int mtr;
 	int amb_present_reg;
 	int addrBits;
 
-	mtr = determine_mtr(pvt, csrow, channel);
+	mtr = determine_mtr(pvt, dimm, channel);
 	if (MTR_DIMMS_PRESENT(mtr)) {
 		amb_present_reg = determine_amb_present_reg(pvt, channel);
 
 		/* Determine if there is a DIMM present in this DIMM slot */
-		if (amb_present_reg & (1 << csrow)) {
+		if (amb_present_reg & (1 << dimm)) {
 			/* Start with the number of bits for a Bank
 			 * on the DRAM */
 			addrBits = MTR_DRAM_BANKS_ADDR_BITS(mtr);
@@ -954,7 +960,7 @@ static void handle_channel(struct i5400_pvt *pvt, int csrow, int channel,
 static void calculate_dimm_size(struct i5400_pvt *pvt)
 {
 	struct i5400_dimm_info *dinfo;
-	int csrow, max_csrows;
+	int dimm, max_dimms;
 	char *p, *mem_buffer;
 	int space, n;
 	int channel;
@@ -968,32 +974,32 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 		return;
 	}
 
-	/* Scan all the actual CSROWS
+	/* Scan all the actual DIMMS
 	 * and calculate the information for each DIMM
-	 * Start with the highest csrow first, to display it first
-	 * and work toward the 0th csrow
+	 * Start with the highest dimm first, to display it first
+	 * and work toward the 0th dimm
 	 */
-	max_csrows = pvt->maxdimmperch;
-	for (csrow = max_csrows - 1; csrow >= 0; csrow--) {
+	max_dimms = pvt->maxdimmperch;
+	for (dimm = max_dimms - 1; dimm >= 0; dimm--) {
 
-		/* on an odd csrow, first output a 'boundary' marker,
+		/* on an odd dimm, first output a 'boundary' marker,
 		 * then reset the message buffer  */
-		if (csrow & 0x1) {
+		if (dimm & 0x1) {
 			n = snprintf(p, space, "---------------------------"
-					"--------------------------------");
+					"-------------------------------");
 			p += n;
 			space -= n;
 			debugf2("%s\n", mem_buffer);
 			p = mem_buffer;
 			space = PAGE_SIZE;
 		}
-		n = snprintf(p, space, "csrow %2d    ", csrow);
+		n = snprintf(p, space, "dimm %2d    ", dimm);
 		p += n;
 		space -= n;
 
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			dinfo = &pvt->dimm_info[csrow][channel];
-			handle_channel(pvt, csrow, channel, dinfo);
+			dinfo = &pvt->dimm_info[dimm][channel];
+			handle_channel(pvt, dimm, channel, dinfo);
 			n = snprintf(p, space, "%4d MB   | ", dinfo->megabytes);
 			p += n;
 			space -= n;
@@ -1005,7 +1011,7 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 
 	/* Output the last bottom 'boundary' marker */
 	n = snprintf(p, space, "---------------------------"
-			"--------------------------------");
+			"-------------------------------");
 	p += n;
 	space -= n;
 	debugf2("%s\n", mem_buffer);
@@ -1013,7 +1019,7 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 	space = PAGE_SIZE;
 
 	/* now output the 'channel' labels */
-	n = snprintf(p, space, "            ");
+	n = snprintf(p, space, "           ");
 	p += n;
 	space -= n;
 	for (channel = 0; channel < pvt->maxch; channel++) {
@@ -1080,7 +1086,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	debugf2("MIR1: limit= 0x%x  WAY1= %u  WAY0= %x\n", limit, way1, way0);
 
 	/* Get the set of MTR[0-3] regs by each branch */
-	for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++) {
+	for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++) {
 		int where = MTR0 + (slot_row * sizeof(u16));
 
 		/* Branch 0 set of MTR registers */
@@ -1105,7 +1111,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	/* Read and dump branch 0's MTRs */
 	debugf2("\nMemory Technology Registers:\n");
 	debugf2("   Branch 0:\n");
-	for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++)
+	for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++)
 		decode_mtr(slot_row, pvt->b0_mtr[slot_row]);
 
 	pci_read_config_word(pvt->branch_0, AMBPRESENT_0,
@@ -1122,7 +1128,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	} else {
 		/* Read and dump  branch 1's MTRs */
 		debugf2("   Branch 1:\n");
-		for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++)
+		for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++)
 			decode_mtr(slot_row, pvt->b1_mtr[slot_row]);
 
 		pci_read_config_word(pvt->branch_1, AMBPRESENT_0,
@@ -1141,7 +1147,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 }
 
 /*
- *	i5400_init_csrows	Initialize the 'csrows' table within
+ *	i5400_init_dimms	Initialize the 'dimms' table within
  *				the mci control	structure with the
  *				addressing of memory.
  *
@@ -1149,50 +1155,68 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
  *		0	success
  *		1	no actual memory found on this MC
  */
-static int i5400_init_csrows(struct mem_ctl_info *mci)
+static int i5400_init_dimms(struct mem_ctl_info *mci)
 {
 	struct i5400_pvt *pvt;
-	struct csrow_info *p_csrow;
-	int empty, channel_count;
-	int max_csrows;
+	struct dimm_info *dimm;
+	int ndimms, channel_count;
+	int max_dimms;
 	int mtr;
 	int size_mb;
-	int channel;
-	int csrow;
-	struct dimm_info *dimm;
+	int channel, slot;
 
 	pvt = mci->pvt_info;
 
 	channel_count = pvt->maxch;
-	max_csrows = pvt->maxdimmperch;
+	max_dimms = pvt->maxdimmperch;
 
-	empty = 1;		/* Assume NO memory */
+	ndimms = 0;
 
-	for (csrow = 0; csrow < max_csrows; csrow++) {
-		p_csrow = &mci->csrows[csrow];
+	/*
+	 * FIXME: remove pvt->dimm_info[slot][channel] and use the 3
+	 * layers here.
+	 */
+	for (channel = 0; channel < mci->layers[0].size * mci->layers[1].size;
+	     channel++) {
+		for (slot = 0; slot < mci->layers[2].size; slot++) {
+			mtr = determine_mtr(pvt, slot, channel);
 
-		/* use branch 0 for the basis */
-		mtr = determine_mtr(pvt, csrow, 0);
+			/* if no DIMMS on this slot, continue */
+			if (!MTR_DIMMS_PRESENT(mtr))
+				continue;
 
-		/* if no DIMMS on this row, continue */
-		if (!MTR_DIMMS_PRESENT(mtr))
-			continue;
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       channel / 2, channel % 2, slot);
 
-		for (channel = 0; channel < pvt->maxch; channel++) {
-			size_mb = pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[slot][channel].megabytes;
+
+			debugf2("%s: dimm%zd (branch %d channel %d slot %d): %d.%03d GB\n",
+				__func__, dimm - mci->dimms,
+				channel / 2, channel % 2, slot,
+				size_mb / 1000, size_mb % 1000);
 
-			dimm = p_csrow->channels[channel].dimm;
 			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
-			dimm->mtype = MEM_RDDR2;
-			dimm->edac_mode = EDAC_SECDED;
+			dimm->mtype = MEM_FB_DDR2;
+			/*
+			 * The ECC mechanism is SDDC (aka SECC), which
+			 * is similar to Chipkill.
+			 */
+			dimm->edac_mode = MTR_DRAM_WIDTH(mtr) ?
+					  EDAC_S8ECD8ED : EDAC_S4ECD4ED;
+			ndimms++;
 		}
-
-		empty = 0;
 	}
 
-	return empty;
+	/*
+	 * When just one memory is provided, it should be at location (0,0,0).
+	 * In such single-DIMM mode, the SDDC algorithm degrades to SECDED+.
+	 */
+	if (ndimms == 1)
+		mci->dimms[0].edac_mode = EDAC_SECDED;
+
+	return (ndimms == 0);
 }
 
 /*
@@ -1228,9 +1252,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
 	struct i5400_pvt *pvt;
-	int num_channels;
-	int num_dimms_per_channel;
-	int num_csrows;
+	struct edac_mc_layer layers[3];
 
 	if (dev_idx >= ARRAY_SIZE(i5400_devs))
 		return -EINVAL;
@@ -1244,22 +1266,21 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (PCI_FUNC(pdev->devfn) != 0)
 		return -ENODEV;
 
-	/* As we don't have a motherboard identification routine to determine
-	 * actual number of slots/dimms per channel, we thus utilize the
-	 * resource as specified by the chipset. Thus, we might have
-	 * have more DIMMs per channel than actually on the mobo, but this
-	 * allows the driver to support up to the chipset max, without
-	 * some fancy mobo determination.
+	/*
+	 * allocate a new MC control structure
+	 *
+	 * This driver uses the DIMM slot as "csrow" and the rest as "channel".
 	 */
-	num_dimms_per_channel = MAX_DIMMS_PER_CHANNEL;
-	num_channels = MAX_CHANNELS;
-	num_csrows = num_dimms_per_channel;
-
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
-
-	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = MAX_BRANCHES;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = CHANNELS_PER_BRANCH;
+	layers[1].is_csrow = false;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = DIMMS_PER_CHANNEL;
+	layers[2].is_csrow = true;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
@@ -1270,8 +1291,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	pvt = mci->pvt_info;
 	pvt->system_address = pdev;	/* Record this device in our private */
-	pvt->maxch = num_channels;
-	pvt->maxdimmperch = num_dimms_per_channel;
+	pvt->maxch = MAX_CHANNELS;
+	pvt->maxdimmperch = DIMMS_PER_CHANNEL;
 
 	/* 'get' the pci devices we want to reserve for our use */
 	if (i5400_get_devices(mci, dev_idx))
@@ -1293,13 +1314,13 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	/* Set the function pointer to an actual operation function */
 	mci->edac_check = i5400_check_error;
 
-	/* initialize the MC control structure 'csrows' table
+	/* initialize the MC control structure 'dimms' table
 	 * with the mapping and control information */
-	if (i5400_init_csrows(mci)) {
+	if (i5400_init_dimms(mci)) {
 		debugf0("MC: Setting mci->edac_cap to EDAC_FLAG_NONE\n"
-			"    because i5400_init_csrows() returned nonzero "
+			"    because i5400_init_dimms() returned nonzero "
 			"value\n");
-		mci->edac_cap = EDAC_FLAG_NONE;	/* no csrows found */
+		mci->edac_cap = EDAC_FLAG_NONE;	/* no dimms found */
 	} else {
 		debugf1("MC: Enable error reporting now\n");
 		i5400_enable_error_reporting(mci);
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 5e594ae..0ff0b26 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -464,17 +464,14 @@ static void i7300_process_fbd_error(struct mem_ctl_info *mci)
 				FERR_FAT_FBD, error_reg);
 
 		snprintf(pvt->tmp_prt_buffer, PAGE_SIZE,
-			"FATAL (Branch=%d DRAM-Bank=%d %s "
-			"RAS=%d CAS=%d Err=0x%lx (%s))",
-			branch, bank,
-			is_wr ? "RDWR" : "RD",
-			ras, cas,
-			errors, specific);
-
-		/* Call the helper to output message */
-		edac_mc_handle_fbd_ue(mci, rank, branch << 1,
-				      (branch << 1) + 1,
-				      pvt->tmp_prt_buffer);
+			 "Bank=%d RAS=%d CAS=%d Err=0x%lx (%s)",
+			 bank, ras, cas, errors, specific);
+
+		edac_mc_handle_error(HW_EVENT_ERR_FATAL, mci, 0, 0, 0,
+				     branch, -1, rank,
+				     is_wr ? "Write error" : "Read error",
+				     pvt->tmp_prt_buffer, NULL);
+
 	}
 
 	/* read in the 1st NON-FATAL error register */
@@ -513,23 +510,14 @@ static void i7300_process_fbd_error(struct mem_ctl_info *mci)
 
 		/* Form out message */
 		snprintf(pvt->tmp_prt_buffer, PAGE_SIZE,
-			"Corrected error (Branch=%d, Channel %d), "
-			" DRAM-Bank=%d %s "
-			"RAS=%d CAS=%d, CE Err=0x%lx, Syndrome=0x%08x(%s))",
-			branch, channel,
-			bank,
-			is_wr ? "RDWR" : "RD",
-			ras, cas,
-			errors, syndrome, specific);
-
-		/*
-		 * Call the helper to output message
-		 * NOTE: Errors are reported per-branch, and not per-channel
-		 *	 Currently, we don't know how to identify the right
-		 *	 channel.
-		 */
-		edac_mc_handle_fbd_ce(mci, rank, channel,
-				      pvt->tmp_prt_buffer);
+			 "DRAM-Bank=%d RAS=%d CAS=%d, Err=0x%lx (%s)",
+			 bank, ras, cas, errors, specific);
+
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0,
+				     syndrome,
+				     branch >> 1, channel % 2, rank,
+				     is_wr ? "Write error" : "Read error",
+				     pvt->tmp_prt_buffer, NULL);
 	}
 	return;
 }
@@ -807,13 +795,17 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 			for (ch = 0; ch < MAX_CH_PER_BRANCH; ch++) {
 				int channel = to_channel(ch, branch);
 
-				dinfo = &pvt->dimm_info[slot][channel];
+				dimm = GET_POS(mci->layers, mci->dimms,
+					       mci->n_layers, branch, ch, slot);
 
-				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dinfo = &pvt->dimm_info[slot][channel];
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
 						 dinfo, dimm);
 
+				mci->tot_dimms++;
+				dimm++;
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
@@ -1034,10 +1026,8 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 				    const struct pci_device_id *id)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[3];
 	struct i7300_pvt *pvt;
-	int num_channels;
-	int num_dimms_per_channel;
-	int num_csrows;
 	int rc;
 
 	/* wake up device */
@@ -1054,22 +1044,17 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (PCI_FUNC(pdev->devfn) != 0)
 		return -ENODEV;
 
-	/* As we don't have a motherboard identification routine to determine
-	 * actual number of slots/dimms per channel, we thus utilize the
-	 * resource as specified by the chipset. Thus, we might have
-	 * have more DIMMs per channel than actually on the mobo, but this
-	 * allows the driver to support up to the chipset max, without
-	 * some fancy mobo determination.
-	 */
-	num_dimms_per_channel = MAX_SLOTS;
-	num_channels = MAX_CHANNELS;
-	num_csrows = MAX_SLOTS * MAX_CHANNELS;
-
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
-
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = MAX_BRANCHES;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = MAX_CH_PER_BRANCH;
+	layers[1].is_csrow = true;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = MAX_SLOTS;
+	layers[2].is_csrow = true;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index d566797..72553dd 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -257,7 +257,6 @@ struct i7core_pvt {
 	struct i7core_channel	channel[NUM_CHANS];
 
 	int		ce_count_available;
-	int 		csrow_map[NUM_CHANS][MAX_DIMMS];
 
 			/* ECC corrected errors counts per udimm */
 	unsigned long	udimm_ce_count[MAX_DIMMS];
@@ -492,113 +491,12 @@ static void free_i7core_dev(struct i7core_dev *i7core_dev)
 /****************************************************************************
 			Memory check routines
  ****************************************************************************/
-static struct pci_dev *get_pdev_slot_func(u8 socket, unsigned slot,
-					  unsigned func)
-{
-	struct i7core_dev *i7core_dev = get_i7core_dev(socket);
-	int i;
-
-	if (!i7core_dev)
-		return NULL;
-
-	for (i = 0; i < i7core_dev->n_devs; i++) {
-		if (!i7core_dev->pdev[i])
-			continue;
-
-		if (PCI_SLOT(i7core_dev->pdev[i]->devfn) == slot &&
-		    PCI_FUNC(i7core_dev->pdev[i]->devfn) == func) {
-			return i7core_dev->pdev[i];
-		}
-	}
-
-	return NULL;
-}
-
-/**
- * i7core_get_active_channels() - gets the number of channels and csrows
- * @socket:	Quick Path Interconnect socket
- * @channels:	Number of channels that will be returned
- * @csrows:	Number of csrows found
- *
- * Since EDAC core needs to know in advance the number of available channels
- * and csrows, in order to allocate memory for csrows/channels, it is needed
- * to run two similar steps. At the first step, implemented on this function,
- * it checks the number of csrows/channels present at one socket.
- * this is used in order to properly allocate the size of mci components.
- *
- * It should be noticed that none of the current available datasheets explain
- * or even mention how csrows are seen by the memory controller. So, we need
- * to add a fake description for csrows.
- * So, this driver is attributing one DIMM memory for one csrow.
- */
-static int i7core_get_active_channels(const u8 socket, unsigned *channels,
-				      unsigned *csrows)
-{
-	struct pci_dev *pdev = NULL;
-	int i, j;
-	u32 status, control;
-
-	*channels = 0;
-	*csrows = 0;
-
-	pdev = get_pdev_slot_func(socket, 3, 0);
-	if (!pdev) {
-		i7core_printk(KERN_ERR, "Couldn't find socket %d fn 3.0!!!\n",
-			      socket);
-		return -ENODEV;
-	}
-
-	/* Device 3 function 0 reads */
-	pci_read_config_dword(pdev, MC_STATUS, &status);
-	pci_read_config_dword(pdev, MC_CONTROL, &control);
-
-	for (i = 0; i < NUM_CHANS; i++) {
-		u32 dimm_dod[3];
-		/* Check if the channel is active */
-		if (!(control & (1 << (8 + i))))
-			continue;
-
-		/* Check if the channel is disabled */
-		if (status & (1 << i))
-			continue;
-
-		pdev = get_pdev_slot_func(socket, i + 4, 1);
-		if (!pdev) {
-			i7core_printk(KERN_ERR, "Couldn't find socket %d "
-						"fn %d.%d!!!\n",
-						socket, i + 4, 1);
-			return -ENODEV;
-		}
-		/* Devices 4-6 function 1 */
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM0, &dimm_dod[0]);
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM1, &dimm_dod[1]);
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM2, &dimm_dod[2]);
-
-		(*channels)++;
-
-		for (j = 0; j < 3; j++) {
-			if (!DIMM_PRESENT(dimm_dod[j]))
-				continue;
-			(*csrows)++;
-		}
-	}
-
-	debugf0("Number of active channels on socket %d: %d\n",
-		socket, *channels);
-
-	return 0;
-}
 
 static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
-	struct csrow_info *csr;
 	struct pci_dev *pdev;
 	int i, j;
-	int csrow = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
 	struct dimm_info *dimm;
@@ -696,6 +594,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			if (!DIMM_PRESENT(dimm_dod[j]))
 				continue;
 
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       i, j, 0);
 			banks = numbank(MC_DOD_NUMBANK(dimm_dod[j]));
 			ranks = numrank(MC_DOD_NUMRANK(dimm_dod[j]));
 			rows = numrow(MC_DOD_NUMROW(dimm_dod[j]));
@@ -704,8 +604,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			/* DDR3 has 8 I/O banks */
 			size = (rows * cols * banks * ranks) >> (20 - 3);
 
-			pvt->channel[i].dimms++;
-
 			debugf0("\tdimm %d %d Mb offset: %x, "
 				"bank: %d, rank: %d, row: %#x, col: %#x\n",
 				j, size,
@@ -714,11 +612,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 
 			npages = MiB_TO_PAGES(size);
 
-			csr = &mci->csrows[csrow];
-
-			pvt->csrow_map[i][j] = csrow;
-
-			dimm = csr->channels[0].dimm;
 			dimm->nr_pages = npages;
 
 			switch (banks) {
@@ -741,7 +634,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
-			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
@@ -1557,22 +1449,16 @@ error:
 /****************************************************************************
 			Error check routines
  ****************************************************************************/
-static void i7core_rdimm_update_csrow(struct mem_ctl_info *mci,
+static void i7core_rdimm_update_errcount(struct mem_ctl_info *mci,
 				      const int chan,
 				      const int dimm,
 				      const int add)
 {
-	char *msg;
-	struct i7core_pvt *pvt = mci->pvt_info;
-	int row = pvt->csrow_map[chan][dimm], i;
+	int i;
 
 	for (i = 0; i < add; i++) {
-		msg = kasprintf(GFP_KERNEL, "Corrected error "
-				"(Socket=%d channel=%d dimm=%d)",
-				pvt->i7core_dev->socket, chan, dimm);
-
-		edac_mc_handle_fbd_ce(mci, row, 0, msg);
-		kfree (msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				     chan, dimm, -1, "error", "", NULL);
 	}
 }
 
@@ -1613,11 +1499,11 @@ static void i7core_rdimm_update_ce_count(struct mem_ctl_info *mci,
 
 	/*updated the edac core */
 	if (add0 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 0, add0);
+		i7core_rdimm_update_errcount(mci, chan, 0, add0);
 	if (add1 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 1, add1);
+		i7core_rdimm_update_errcount(mci, chan, 1, add1);
 	if (add2 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 2, add2);
+		i7core_rdimm_update_errcount(mci, chan, 2, add2);
 
 }
 
@@ -1738,19 +1624,29 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
 	char *type, *optype, *err, *msg;
+	enum hw_event_mc_err_type tp_event;
 	unsigned long error = m->status & 0x1ff0000l;
+	bool uncorrected_error = m->status & 1ll << 61;
+	bool ripv = m->mcgstatus & 1;
 	u32 optypenum = (m->status >> 4) & 0x07;
 	u32 core_err_cnt = (m->status >> 38) & 0x7fff;
 	u32 dimm = (m->misc >> 16) & 0x3;
 	u32 channel = (m->misc >> 18) & 0x3;
 	u32 syndrome = m->misc >> 32;
 	u32 errnum = find_first_bit(&error, 32);
-	int csrow;
 
-	if (m->mcgstatus & 1)
-		type = "FATAL";
-	else
-		type = "NON_FATAL";
+	if (uncorrected_error) {
+		if (ripv) {
+			type = "FATAL";
+			tp_event = HW_EVENT_ERR_FATAL;
+		} else {
+			type = "NON_FATAL";
+			tp_event = HW_EVENT_ERR_UNCORRECTED;
+		}
+	} else {
+		type = "CORRECTED";
+		tp_event = HW_EVENT_ERR_CORRECTED;
+	}
 
 	switch (optypenum) {
 	case 0:
@@ -1805,25 +1701,23 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 		err = "unknown";
 	}
 
-	/* FIXME: should convert addr into bank and rank information */
 	msg = kasprintf(GFP_ATOMIC,
-		"%s (addr = 0x%08llx, cpu=%d, Dimm=%d, Channel=%d, "
-		"syndrome=0x%08x, count=%d, Err=%08llx:%08llx (%s: %s))\n",
-		type, (long long) m->addr, m->cpu, dimm, channel,
-		syndrome, core_err_cnt, (long long)m->status,
-		(long long)m->misc, optype, err);
+		"addr=0x%08llx cpu=%d count=%d Err=%08llx:%08llx (%s: %s)\n",
+		(long long) m->addr, m->cpu, core_err_cnt,
+		(long long)m->status, (long long)m->misc, optype, err);
 
-	debugf0("%s", msg);
-
-	csrow = pvt->csrow_map[channel][dimm];
-
-	/* Call the helper to output message */
-	if (m->mcgstatus & 1)
-		edac_mc_handle_fbd_ue(mci, csrow, 0,
-				0 /* FIXME: should be channel here */, msg);
-	else if (!pvt->is_registered)
-		edac_mc_handle_fbd_ce(mci, csrow,
-				0 /* FIXME: should be channel here */, msg);
+	/*
+	 * Call the helper to output message
+	 * FIXME: what to do if core_err_cnt > 1? Currently, it generates
+	 * only one event
+	 */
+	if (uncorrected_error || !pvt->is_registered)
+		edac_mc_handle_error(tp_event, mci,
+				     m->addr >> PAGE_SHIFT,
+				     m->addr & ~PAGE_MASK,
+				     syndrome,
+				     channel, dimm, -1,
+				     err, msg, m);
 
 	kfree(msg);
 }
@@ -2242,15 +2136,19 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 {
 	struct mem_ctl_info *mci;
 	struct i7core_pvt *pvt;
-	int rc, channels, csrows;
-
-	/* Check the number of active and not disabled channels */
-	rc = i7core_get_active_channels(i7core_dev->socket, &channels, &csrows);
-	if (unlikely(rc < 0))
-		return rc;
+	int rc;
+	struct edac_mc_layer layers[2];
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), csrows, channels, i7core_dev->socket);
+
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = NUM_CHANS;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = MAX_DIMMS;
+	layers[1].is_csrow = true;
+	mci = edac_mc_alloc(i7core_dev->socket, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
 	if (unlikely(!mci))
 		return -ENOMEM;
 
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 74166ae..09d39c0 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -156,19 +156,19 @@ static int i82443bxgx_edacmc_process_error_info(struct mem_ctl_info *mci,
 	if (info->eap & I82443BXGX_EAP_OFFSET_SBE) {
 		error_found = 1;
 		if (handle_errors)
-			edac_mc_handle_ce(mci, page, pageoffset,
-				/* 440BX/GX don't make syndrome information
-				 * available */
-				0, edac_mc_find_csrow_by_page(mci, page), 0,
-				mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, pageoffset, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1, mci->ctl_name, "", NULL);
 	}
 
 	if (info->eap & I82443BXGX_EAP_OFFSET_MBE) {
 		error_found = 1;
 		if (handle_errors)
-			edac_mc_handle_ue(mci, page, pageoffset,
-					edac_mc_find_csrow_by_page(mci, page),
-					mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, pageoffset, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1, mci->ctl_name, "", NULL);
 	}
 
 	return error_found;
@@ -196,7 +196,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 
 	pci_read_config_byte(pdev, I82443BXGX_DRAMC, &dramc);
 	row_high_limit_last = 0;
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 		dimm = csrow->channels[0].dimm;
 
@@ -235,6 +235,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u8 dramc;
 	u32 nbxcfg, ecc_mode;
 	enum mem_type mtype;
@@ -248,8 +249,13 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	if (pci_read_config_dword(pdev, I82443BXGX_NBXCFG, &nbxcfg))
 		return -EIO;
 
-	mci = edac_mc_alloc(0, I82443BXGX_NR_CSROWS, I82443BXGX_NR_CHANS, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82443BXGX_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = I82443BXGX_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 48e0ecd..85ed3a6 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -99,6 +99,7 @@ static int i82860_process_error_info(struct mem_ctl_info *mci,
 				struct i82860_error_info *info,
 				int handle_errors)
 {
+	struct dimm_info *dimm;
 	int row;
 
 	if (!(info->errsts2 & 0x0003))
@@ -108,18 +109,25 @@ static int i82860_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0003) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	info->eap >>= PAGE_SHIFT;
 	row = edac_mc_find_csrow_by_page(mci, info->eap);
+	dimm = mci->csrows[row].channels[0].dimm;
 
 	if (info->errsts & 0x0002)
-		edac_mc_handle_ue(mci, info->eap, 0, row, "i82860 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     info->eap, 0, 0,
+				     dimm->location[0], dimm->location[1], -1,
+				     "i82860 UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, info->eap, 0, info->derrsyn, row, 0,
-				"i82860 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     info->eap, 0, info->derrsyn,
+				     dimm->location[0], dimm->location[1], -1,
+				     "i82860 CE", "", NULL);
 
 	return 1;
 }
@@ -152,7 +160,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 	 * cumulative; therefore GRA15 will contain the total memory contained
 	 * in all eight rows.
 	 */
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 		dimm = csrow->channels[0].dimm;
 
@@ -179,18 +187,26 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82860_error_info discard;
 
-	/* RDRAM has channels but these don't map onto the abstractions that
-	   edac uses.
-	   The device groups from the GRA registers seem to map reasonably
-	   well onto the notion of a chip select row.
-	   There are 16 GRA registers and since the name is associated with
-	   the channel and the GRA registers map to physical devices so we are
-	   going to make 1 channel for group.
+	/*
+	 * RDRAM has channels but these don't map onto the csrow abstraction.
+	 * According to the datasheet, there are 2 Rambus channels, supporting
+	 * up to 16 direct RDRAM devices.
+	 * The device groups from the GRA registers seem to map reasonably
+	 * well onto the notion of a chip select row.
+	 * There are 16 GRA registers, and since the name is associated with
+	 * the channel and the GRA registers map to physical devices, we are
+	 * going to make 1 channel per group.
 	 */
-	mci = edac_mc_alloc(0, 16, 1, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = 2;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = 8;
+	layers[1].is_csrow = true;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index dc207dc..471b26a 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -38,7 +38,8 @@
 #endif				/* PCI_DEVICE_ID_INTEL_82875_6 */
 
 /* four csrows in dual channel, eight in single channel */
-#define I82875P_NR_CSROWS(nr_chans) (8/(nr_chans))
+#define I82875P_NR_DIMMS		8
+#define I82875P_NR_CSROWS(nr_chans)	(I82875P_NR_DIMMS / (nr_chans))
 
 /* Intel 82875p register addresses - device 0 function 0 - DRAM Controller */
 #define I82875P_EAP		0x58	/* Error Address Pointer (32b)
@@ -235,7 +236,9 @@ static int i82875p_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0081) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -243,11 +246,15 @@ static int i82875p_process_error_info(struct mem_ctl_info *mci,
 	row = edac_mc_find_csrow_by_page(mci, info->eap);
 
 	if (info->errsts & 0x0080)
-		edac_mc_handle_ue(mci, info->eap, 0, row, "i82875p UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     info->eap, 0, 0,
+				     row, -1, -1,
+				     "i82875p UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, info->eap, 0, info->derrsyn, row,
-				multi_chan ? (info->des & 0x1) : 0,
-				"i82875p CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     info->eap, 0, info->derrsyn,
+				     row, multi_chan ? (info->des & 0x1) : 0,
+				     -1, "i82875p CE", "", NULL);
 
 	return 1;
 }
@@ -359,7 +366,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	 * contain the total memory contained in all eight rows.
 	 */
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 
 		value = readb(ovrfl_window + I82875P_DRB + index);
@@ -390,6 +397,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc = -ENODEV;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82875p_pvt *pvt;
 	struct pci_dev *ovrfl_pdev;
 	void __iomem *ovrfl_window;
@@ -405,9 +413,14 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	drc = readl(ovrfl_window + I82875P_DRC);
 	nr_chans = dual_channel_active(drc) + 1;
-	mci = edac_mc_alloc(sizeof(*pvt), I82875P_NR_CSROWS(nr_chans),
-			nr_chans, 0);
 
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82875P_NR_CSROWS(nr_chans);
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail0;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 304af1d..0a95e81 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -29,7 +29,8 @@
 #define PCI_DEVICE_ID_INTEL_82975_0	0x277c
 #endif				/* PCI_DEVICE_ID_INTEL_82975_0 */
 
-#define I82975X_NR_CSROWS(nr_chans)		(8/(nr_chans))
+#define I82975X_NR_DIMMS		8
+#define I82975X_NR_CSROWS(nr_chans)	(I82975X_NR_DIMMS / (nr_chans))
 
 /* Intel 82975X register addresses - device 0 function 0 - DRAM Controller */
 #define I82975X_EAP		0x58	/* Dram Error Address Pointer (32b)
@@ -287,7 +288,8 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0003) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -312,10 +314,15 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 			   (1 << mci->csrows[row].channels[chan].dimm->grain));
 
 	if (info->errsts & 0x0002)
-		edac_mc_handle_ue(mci, page, offst , row, "i82975x UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offst, 0,
+				     row, -1, -1,
+				     "i82975x UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, page, offst, info->derrsyn, row,
-				chan, "i82975x CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offst, info->derrsyn,
+				     row, chan, -1,
+				     "i82975x CE", "", NULL);
 
 	return 1;
 }
@@ -386,7 +393,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	 *
 	 */
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 
 		value = readb(mch_window + I82975X_DRB + index +
@@ -473,6 +480,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc = -ENODEV;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82975x_pvt *pvt;
 	void __iomem *mch_window;
 	u32 mchbar;
@@ -541,8 +549,13 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	chans = dual_channel_active(mch_window) + 1;
 
 	/* assuming only one controller, index thus is 0 */
-	mci = edac_mc_alloc(sizeof(*pvt), I82975X_NR_CSROWS(chans),
-					chans, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82975X_NR_CSROWS(chans);
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = chans;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail1;
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index c1d9e15..d83d750 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -812,7 +812,7 @@ static void mpc85xx_mc_check(struct mem_ctl_info *mci)
 	err_addr = in_be32(pdata->mc_vbase + MPC85XX_MC_CAPTURE_ADDRESS);
 	pfn = err_addr >> PAGE_SHIFT;
 
-	for (row_index = 0; row_index < mci->nr_csrows; row_index++) {
+	for (row_index = 0; row_index < mci->num_csrows; row_index++) {
 		csrow = &mci->csrows[row_index];
 		if ((pfn >= csrow->first_page) && (pfn <= csrow->last_page))
 			break;
@@ -850,16 +850,20 @@ static void mpc85xx_mc_check(struct mem_ctl_info *mci)
 	mpc85xx_mc_printk(mci, KERN_ERR, "PFN: %#8.8x\n", pfn);
 
 	/* we are out of range */
-	if (row_index == mci->nr_csrows)
+	if (row_index == mci->num_csrows)
 		mpc85xx_mc_printk(mci, KERN_ERR, "PFN out of range!\n");
 
 	if (err_detect & DDR_EDE_SBE)
-		edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
-				  syndrome, row_index, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, err_addr & ~PAGE_MASK, syndrome,
+				     row_index, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	if (err_detect & DDR_EDE_MBE)
-		edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
-				  row_index, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, err_addr & ~PAGE_MASK, syndrome,
+				     row_index, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, err_detect);
 }
@@ -925,7 +929,7 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 		}
 	}
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		u32 start;
 		u32 end;
 
@@ -961,6 +965,7 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct mpc85xx_mc_pdata *pdata;
 	struct resource r;
 	u32 sdram_ctl;
@@ -969,7 +974,14 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	if (!devres_open_group(&op->dev, mpc85xx_mc_err_probe, GFP_KERNEL))
 		return -ENOMEM;
 
-	mci = edac_mc_alloc(sizeof(*pdata), 4, 1, edac_mc_idx);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 4;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(*pdata));
 	if (!mci) {
 		devres_release_group(&op->dev, mpc85xx_mc_err_probe);
 		return -ENOMEM;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 281e245..a32e9b6 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -611,12 +611,17 @@ static void mv64x60_mc_check(struct mem_ctl_info *mci)
 
 	/* first bit clear in ECC Err Reg, 1 bit error, correctable by HW */
 	if (!(reg & 0x1))
-		edac_mc_handle_ce(mci, err_addr >> PAGE_SHIFT,
-				  err_addr & PAGE_MASK, syndrome, 0, 0,
-				  mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & ~PAGE_MASK, syndrome,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 	else	/* 2 bit error, UE */
-		edac_mc_handle_ue(mci, err_addr >> PAGE_SHIFT,
-				  err_addr & PAGE_MASK, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & ~PAGE_MASK, 0,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	/* clear the error */
 	out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ADDR, 0);
@@ -695,6 +700,7 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct mv64x60_mc_pdata *pdata;
 	struct resource *r;
 	u32 ctl;
@@ -703,7 +709,14 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	if (!devres_open_group(&pdev->dev, mv64x60_mc_err_probe, GFP_KERNEL))
 		return -ENOMEM;
 
-	mci = edac_mc_alloc(sizeof(struct mv64x60_mc_pdata), 1, 1, edac_mc_idx);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 1;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct mv64x60_mc_pdata));
 	if (!mci) {
 		printk(KERN_ERR "%s: No memory for CPU err\n", __func__);
 		devres_release_group(&pdev->dev, mv64x60_mc_err_probe);
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 3fcefda..2959db6 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -110,15 +110,16 @@ static void pasemi_edac_process_error_info(struct mem_ctl_info *mci, u32 errsta)
 	/* uncorrectable/multi-bit errors */
 	if (errsta & (MCDEBUG_ERRSTA_MBE_STATUS |
 		      MCDEBUG_ERRSTA_RFL_STATUS)) {
-		edac_mc_handle_ue(mci, mci->csrows[cs].first_page, 0,
-				  cs, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     mci->csrows[cs].first_page, 0, 0,
+				     cs, 0, -1, mci->ctl_name, "", NULL);
 	}
 
 	/* correctable/single-bit errors */
-	if (errsta & MCDEBUG_ERRSTA_SBE_STATUS) {
-		edac_mc_handle_ce(mci, mci->csrows[cs].first_page, 0,
-				  0, cs, 0, mci->ctl_name);
-	}
+	if (errsta & MCDEBUG_ERRSTA_SBE_STATUS)
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     mci->csrows[cs].first_page, 0, 0,
+				     cs, 0, -1, mci->ctl_name, "", NULL);
 }
 
 static void pasemi_edac_check(struct mem_ctl_info *mci)
@@ -139,7 +140,7 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 	u32 rankcfg;
 	int index;
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 		dimm = csrow->channels[0].dimm;
 
@@ -191,6 +192,7 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 		const struct pci_device_id *ent)
 {
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u32 errctl1, errcor, scrub, mcen;
 
 	pci_read_config_dword(pdev, MCCFG_MCEN, &mcen);
@@ -207,9 +209,14 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 		MCDEBUG_ERRCTL1_RFL_LOG_EN;
 	pci_write_config_dword(pdev, MCDEBUG_ERRCTL1, errctl1);
 
-	mci = edac_mc_alloc(0, PASEMI_EDAC_NR_CSROWS, PASEMI_EDAC_NR_CHANS,
-				system_mmc_id++);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = PASEMI_EDAC_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = PASEMI_EDAC_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers, false,
+			    0);
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 95cfc0f..5dc0e6b 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -330,7 +330,7 @@ ppc4xx_edac_generate_bank_message(const struct mem_ctl_info *mci,
 	size -= n;
 	total += n;
 
-	for (rows = 0, row = 0; row < mci->nr_csrows; row++) {
+	for (rows = 0, row = 0; row < mci->num_csrows; row++) {
 		if (ppc4xx_edac_check_bank_error(status, row)) {
 			n = snprintf(buffer, size, "%s%u",
 					(rows++ ? ", " : ""), row);
@@ -725,9 +725,12 @@ ppc4xx_edac_handle_ce(struct mem_ctl_info *mci,
 
 	ppc4xx_edac_generate_message(mci, status, message, sizeof(message));
 
-	for (row = 0; row < mci->nr_csrows; row++)
+	for (row = 0; row < mci->num_csrows; row++)
 		if (ppc4xx_edac_check_bank_error(status, row))
-			edac_mc_handle_ce_no_info(mci, message);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, 0,
+					     row, 0, -1,
+					     message, "", NULL);
 }
 
 /**
@@ -753,9 +756,12 @@ ppc4xx_edac_handle_ue(struct mem_ctl_info *mci,
 
 	ppc4xx_edac_generate_message(mci, status, message, sizeof(message));
 
-	for (row = 0; row < mci->nr_csrows; row++)
+	for (row = 0; row < mci->num_csrows; row++)
 		if (ppc4xx_edac_check_bank_error(status, row))
-			edac_mc_handle_ue(mci, page, offset, row, message);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, offset, 0,
+					     row, 0, -1,
+					     message, "", NULL);
 }
 
 /**
@@ -917,7 +923,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	 * 1:1 with a controller bank/rank.
 	 */
 
-	for (row = 0; row < mci->nr_csrows; row++) {
+	for (row = 0; row < mci->num_csrows; row++) {
 		struct csrow_info *csi = &mci->csrows[row];
 
 		/*
@@ -1233,6 +1239,7 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	dcr_host_t dcr_host;
 	const struct device_node *np = op->dev.of_node;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	static int ppc4xx_edac_instance;
 
 	/*
@@ -1278,12 +1285,14 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	 * controller instance and perform the appropriate
 	 * initialization.
 	 */
-
-	mci = edac_mc_alloc(sizeof(struct ppc4xx_edac_pdata),
-			    ppc4xx_edac_nr_csrows,
-			    ppc4xx_edac_nr_chans,
-			    ppc4xx_edac_instance);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = ppc4xx_edac_nr_csrows;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = ppc4xx_edac_nr_chans;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
+			    false, sizeof(struct ppc4xx_edac_pdata));
 	if (mci == NULL) {
 		ppc4xx_edac_printk(KERN_ERR, "%s: "
 				   "Failed to allocate EDAC MC instance!\n",
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 19f3a10..2b001aa 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -179,10 +179,11 @@ static int r82600_process_error_info(struct mem_ctl_info *mci,
 		error_found = 1;
 
 		if (handle_errors)
-			edac_mc_handle_ce(mci, page, 0,	/* not avail */
-					syndrome,
-					edac_mc_find_csrow_by_page(mci, page),
-					0, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, 0, syndrome,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1,
+					     mci->ctl_name, "", NULL);
 	}
 
 	if (info->eapr & BIT(1)) {	/* UE? */
@@ -190,9 +191,11 @@ static int r82600_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors)
 			/* 82600 doesn't give enough info */
-			edac_mc_handle_ue(mci, page, 0,
-					edac_mc_find_csrow_by_page(mci, page),
-					mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, 0, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1,
+					     mci->ctl_name, "", NULL);
 	}
 
 	return error_found;
@@ -226,7 +229,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	reg_sdram = dramcr & BIT(4);
 	row_high_limit_last = 0;
 
-	for (index = 0; index < mci->nr_csrows; index++) {
+	for (index = 0; index < mci->num_csrows; index++) {
 		csrow = &mci->csrows[index];
 		dimm = csrow->channels[0].dimm;
 
@@ -267,6 +270,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u8 dramcr;
 	u32 eapr;
 	u32 scrub_disabled;
@@ -281,8 +285,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
 		sdram_refresh_rate);
 	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
-	mci = edac_mc_alloc(0, R82600_NR_CSROWS, R82600_NR_CHANS, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = R82600_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = R82600_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ee1543d..b253675 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -314,8 +314,6 @@ struct sbridge_pvt {
 	struct sbridge_info	info;
 	struct sbridge_channel	channel[NUM_CHANNELS];
 
-	int 			csrow_map[NUM_CHANNELS][MAX_DIMMS];
-
 	/* Memory type detection */
 	bool			is_mirrored, is_lockstep, is_close_pg;
 
@@ -487,29 +485,14 @@ static struct pci_dev *get_pdev_slot_func(u8 bus, unsigned slot,
 }
 
 /**
- * sbridge_get_active_channels() - gets the number of channels and csrows
+ * check_if_ecc_is_active() - Checks if ECC is active
  * bus:		Device bus
- * @channels:	Number of channels that will be returned
- * @csrows:	Number of csrows found
- *
- * Since EDAC core needs to know in advance the number of available channels
- * and csrows, in order to allocate memory for csrows/channels, it is needed
- * to run two similar steps. At the first step, implemented on this function,
- * it checks the number of csrows/channels present at one socket, identified
- * by the associated PCI bus.
- * this is used in order to properly allocate the size of mci components.
- * Note: one csrow is one dimm.
  */
-static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
-				      unsigned *csrows)
+static int check_if_ecc_is_active(const u8 bus)
 {
 	struct pci_dev *pdev = NULL;
-	int i, j;
 	u32 mcmtr;
 
-	*channels = 0;
-	*csrows = 0;
-
 	pdev = get_pdev_slot_func(bus, 15, 0);
 	if (!pdev) {
 		sbridge_printk(KERN_ERR, "Couldn't find PCI device "
@@ -523,41 +506,14 @@ static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
 		sbridge_printk(KERN_ERR, "ECC is disabled. Aborting\n");
 		return -ENODEV;
 	}
-
-	for (i = 0; i < NUM_CHANNELS; i++) {
-		u32 mtr;
-
-		/* Device 15 functions 2 - 5  */
-		pdev = get_pdev_slot_func(bus, 15, 2 + i);
-		if (!pdev) {
-			sbridge_printk(KERN_ERR, "Couldn't find PCI device "
-						 "%2x.%02d.%d!!!\n",
-						 bus, 15, 2 + i);
-			return -ENODEV;
-		}
-		(*channels)++;
-
-		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
-			pci_read_config_dword(pdev, mtr_regs[j], &mtr);
-			debugf1("Bus#%02x channel #%d  MTR%d = %x\n", bus, i, j, mtr);
-			if (IS_DIMM_PRESENT(mtr))
-				(*csrows)++;
-		}
-	}
-
-	debugf0("Number of active channels: %d, number of active dimms: %d\n",
-		*channels, *csrows);
-
 	return 0;
 }
 
 static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	struct csrow_info *csr;
+	struct dimm_info *dimm;
 	int i, j, banks, ranks, rows, cols, size, npages;
-	int csrow = 0;
-	unsigned long last_page = 0;
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
@@ -616,7 +572,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
-			struct dimm_info *dimm = &mci->dimms[j];
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       i, j, 0);
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -636,16 +593,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 					size, npages,
 					banks, ranks, rows, cols);
 
-				/*
-				 * Fake stuff. This controller doesn't see
-				 * csrows.
-				 */
-				csr = &mci->csrows[csrow];
-				pvt->csrow_map[i][j] = csrow;
-				last_page += npages;
-				csrow++;
-
-				csr->channels[0].dimm = dimm;
 				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
@@ -841,11 +788,10 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 				 u8 *socket,
 				 long *channel_mask,
 				 u8 *rank,
-				 char *area_type)
+				 char *area_type, char *msg)
 {
 	struct mem_ctl_info	*new_mci;
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	char			msg[256];
 	int 			n_rir, n_sads, n_tads, sad_way, sck_xch;
 	int			sad_interl, idx, base_ch;
 	int			interleave_mode;
@@ -867,12 +813,10 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	 */
 	if ((addr > (u64) pvt->tolm) && (addr < (1LL << 32))) {
 		sprintf(msg, "Error at TOLM area, on addr 0x%08Lx", addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	if (addr >= (u64)pvt->tohm) {
 		sprintf(msg, "Error at MMIOH area, on addr 0x%016Lx", addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 
@@ -889,7 +833,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		limit = SAD_LIMIT(reg);
 		if (limit <= prv) {
 			sprintf(msg, "Can't discover the memory socket");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 		if  (addr <= limit)
@@ -898,7 +841,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	}
 	if (n_sads == MAX_SAD) {
 		sprintf(msg, "Can't discover the memory socket");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	area_type = get_dram_attr(reg);
@@ -939,7 +881,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		break;
 	default:
 		sprintf(msg, "Can't discover socket interleave");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	*socket = sad_interleave[idx];
@@ -954,7 +895,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (!new_mci) {
 		sprintf(msg, "Struct for socket #%u wasn't initialized",
 			*socket);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	mci = new_mci;
@@ -970,7 +910,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		limit = TAD_LIMIT(reg);
 		if (limit <= prv) {
 			sprintf(msg, "Can't discover the memory channel");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 		if  (addr <= limit)
@@ -1010,7 +949,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		break;
 	default:
 		sprintf(msg, "Can't discover the TAD target");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	*channel_mask = 1 << base_ch;
@@ -1024,7 +962,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 			break;
 		default:
 			sprintf(msg, "Invalid mirror set. Can't decode addr");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 	} else
@@ -1052,7 +989,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (offset > addr) {
 		sprintf(msg, "Can't calculate ch addr: TAD offset 0x%08Lx is too high for addr 0x%08Lx!",
 			offset, addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	addr -= offset;
@@ -1092,7 +1028,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (n_rir == MAX_RIR_RANGES) {
 		sprintf(msg, "Can't discover the memory rank for ch addr 0x%08Lx",
 			ch_addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	rir_way = RIR_WAY(reg);
@@ -1406,7 +1341,8 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 {
 	struct mem_ctl_info *new_mci;
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	char *type, *optype, *msg, *recoverable_msg;
+	enum hw_event_mc_err_type tp_event;
+	char *type, *optype, msg[256], *recoverable_msg;
 	bool ripv = GET_BITFIELD(m->mcgstatus, 0, 0);
 	bool overflow = GET_BITFIELD(m->status, 62, 62);
 	bool uncorrected_error = GET_BITFIELD(m->status, 61, 61);
@@ -1418,13 +1354,21 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	u32 optypenum = GET_BITFIELD(m->status, 4, 6);
 	long channel_mask, first_channel;
 	u8  rank, socket;
-	int csrow, rc, dimm;
+	int rc, dimm;
 	char *area_type = "Unknown";
 
-	if (ripv)
-		type = "NON_FATAL";
-	else
-		type = "FATAL";
+	if (uncorrected_error) {
+		if (ripv) {
+			type = "FATAL";
+			tp_event = HW_EVENT_ERR_FATAL;
+		} else {
+			type = "NON_FATAL";
+			tp_event = HW_EVENT_ERR_UNCORRECTED;
+		}
+	} else {
+		type = "CORRECTED";
+		tp_event = HW_EVENT_ERR_CORRECTED;
+	}
 
 	/*
 	 * According with Table 15-9 of the Intel Archictecture spec vol 3A,
@@ -1442,19 +1386,19 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	} else {
 		switch (optypenum) {
 		case 0:
-			optype = "generic undef request";
+			optype = "generic undef request error";
 			break;
 		case 1:
-			optype = "memory read";
+			optype = "memory read error";
 			break;
 		case 2:
-			optype = "memory write";
+			optype = "memory write error";
 			break;
 		case 3:
-			optype = "addr/cmd";
+			optype = "addr/cmd error";
 			break;
 		case 4:
-			optype = "memory scrubbing";
+			optype = "memory scrubbing error";
 			break;
 		default:
 			optype = "reserved";
@@ -1463,13 +1407,13 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	}
 
 	rc = get_memory_error_data(mci, m->addr, &socket,
-				   &channel_mask, &rank, area_type);
+				   &channel_mask, &rank, area_type, msg);
 	if (rc < 0)
-		return;
+		goto err_parsing;
 	new_mci = get_mci_for_node_id(socket);
 	if (!new_mci) {
-		edac_mc_handle_ce_no_info(mci, "Error: socket got corrupted!");
-		return;
+		strcpy(msg, "Error: socket got corrupted!");
+		goto err_parsing;
 	}
 	mci = new_mci;
 	pvt = mci->pvt_info;
@@ -1483,8 +1427,6 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	else
 		dimm = 2;
 
-	csrow = pvt->csrow_map[first_channel][dimm];
-
 	if (uncorrected_error && recoverable)
 		recoverable_msg = " recoverable";
 	else
@@ -1495,18 +1437,14 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	 * Probably, we can just discard it, as the channel information
 	 * comes from the get_memory_error_data() address decoding
 	 */
-	msg = kasprintf(GFP_ATOMIC,
-			"%d %s error(s): %s on %s area %s%s: cpu=%d Err=%04x:%04x (ch=%d), "
-			"addr = 0x%08llx => socket=%d, Channel=%ld(mask=%ld), rank=%d\n",
+	snprintf(msg, sizeof(msg),
+			"%d error(s)%s: %s%s: cpu=%d Err=%04x:%04x addr = 0x%08llx socket=%d Channel=%ld(mask=%ld), rank=%d\n",
 			core_err_cnt,
+			overflow ? " OVERFLOW" : "",
 			area_type,
-			optype,
-			type,
 			recoverable_msg,
-			overflow ? "OVERFLOW" : "",
 			m->cpu,
 			mscod, errcode,
-			channel,		/* 1111b means not specified */
 			(long long) m->addr,
 			socket,
 			first_channel,		/* This is the real channel on SB */
@@ -1515,13 +1453,19 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 
 	debugf0("%s", msg);
 
+	/* FIXME: need support for channel mask */
+
 	/* Call the helper to output message */
-	if (uncorrected_error)
-		edac_mc_handle_fbd_ue(mci, csrow, 0, 0, msg);
-	else
-		edac_mc_handle_fbd_ce(mci, csrow, 0, msg);
+	edac_mc_handle_error(tp_event, mci,
+			     m->addr >> PAGE_SHIFT, m->addr & ~PAGE_MASK, 0,
+			     channel, dimm, -1,
+			     optype, msg, m);
+	return;
+err_parsing:
+	edac_mc_handle_error(tp_event, mci, 0, 0, 0,
+			     -1, -1, -1,
+			     msg, "", m);
 
-	kfree(msg);
 }
 
 /*
@@ -1680,16 +1624,25 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct sbridge_pvt *pvt;
-	int rc, channels, csrows;
+	int rc;
 
 	/* Check the number of active and not disabled channels */
-	rc = sbridge_get_active_channels(sbridge_dev->bus, &channels, &csrows);
+	rc = check_if_ecc_is_active(sbridge_dev->bus);
 	if (unlikely(rc < 0))
 		return rc;
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), csrows, channels, sbridge_dev->mc);
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = NUM_CHANNELS;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = MAX_DIMMS;
+	layers[1].is_csrow = true;
+	mci = edac_mc_alloc(sbridge_dev->mc, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
+
 	if (unlikely(!mci))
 		return -ENOMEM;
 
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 6314ff9..3e878bf 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -71,7 +71,10 @@ static void tile_edac_check(struct mem_ctl_info *mci)
 	if (mem_error.sbe_count != priv->ce_count) {
 		dev_dbg(mci->dev, "ECC CE err on node %d\n", priv->node);
 		priv->ce_count = mem_error.sbe_count;
-		edac_mc_handle_ce(mci, 0, 0, 0, 0, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 	}
 }
 
@@ -122,6 +125,7 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 	char			hv_file[32];
 	int			hv_devhdl;
 	struct mem_ctl_info	*mci;
+	struct edac_mc_layer	layers[2];
 	struct tile_edac_priv	*priv;
 	int			rc;
 
@@ -131,8 +135,14 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 		return -EINVAL;
 
 	/* A TILE MC has a single channel and one chip-select row. */
-	mci = edac_mc_alloc(sizeof(struct tile_edac_priv),
-		TILE_EDAC_NR_CSROWS, TILE_EDAC_NR_CHANS, pdev->id);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = TILE_EDAC_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = TILE_EDAC_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct tile_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
 	priv = mci->pvt_info;
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 0de288f..5f3c57f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -215,19 +215,26 @@ static void x38_process_error_info(struct mem_ctl_info *mci,
 		return;
 
 	if ((info->errsts ^ info->errsts2) & X38_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	for (channel = 0; channel < x38_channel_num; channel++) {
 		log = info->eccerrlog[channel];
 		if (log & X38_ECCERRLOG_UE) {
-			edac_mc_handle_ue(mci, 0, 0,
-				eccerrlog_row(channel, log), "x38 UE");
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     0, 0, 0,
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "x38 UE", "", NULL);
 		} else if (log & X38_ECCERRLOG_CE) {
-			edac_mc_handle_ce(mci, 0, 0,
-				eccerrlog_syndrome(log),
-				eccerrlog_row(channel, log), 0, "x38 CE");
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, eccerrlog_syndrome(log),
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "x38 CE", "", NULL);
 		}
 	}
 }
@@ -319,6 +326,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -334,7 +342,13 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	how_many_channel(pdev);
 
 	/* FIXME: unconventional pvt_info usage */
-	mci = edac_mc_alloc(0, X38_RANKS, x38_channel_num, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = X38_RANKS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = x38_channel_num;
+	layers[1].is_csrow = false;
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
@@ -362,7 +376,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->num_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
 
diff --git a/include/linux/edac.h b/include/linux/edac.h
index de22d4c..7762da4 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -67,6 +67,25 @@ enum dev_type {
 #define DEV_FLAG_X64		BIT(DEV_X64)
 
 /**
+ * enum hw_event_mc_err_type - type of the detected error
+ *
+ * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
+ *				corrected error was detected
+ * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
+ *				can't be corrected by ECC, but it is not
+ *				fatal (maybe it is on an unused memory area,
+ *				or the memory controller could recover from
+ *				it for example, by re-trying the operation).
+ * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
+ *				be recovered.
+ */
+enum hw_event_mc_err_type {
+	HW_EVENT_ERR_CORRECTED,
+	HW_EVENT_ERR_UNCORRECTED,
+	HW_EVENT_ERR_FATAL,
+};
+
+/**
  * enum mem_type - memory types. For a more detailed reference, please see
  *			http://en.wikipedia.org/wiki/DRAM
  *
@@ -308,21 +327,85 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
-/* FIXME: add a per-dimm ce error count */
+/**
+ * enum edac_mc_layer_type - memory controller hierarchy layer
+ *
+ * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
+ * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
+ * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
+ * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
+ *
+ * This enum is used by the drivers to tell edac_mc_sysfs what name should
+ * be used when describing a memory stick location.
+ */
+enum edac_mc_layer_type {
+	EDAC_MC_LAYER_BRANCH,
+	EDAC_MC_LAYER_CHANNEL,
+	EDAC_MC_LAYER_SLOT,
+	EDAC_MC_LAYER_CHIP_SELECT,
+};
+
+/**
+ * struct edac_mc_layer - describes the memory controller hierarchy
+ * @type:		layer type
+ * @size:		maximum size of the layer
+ * @is_csrow:		This layer is part of the "csrow" when old API
+ *			compatibility mode is enabled. Otherwise, it is
+ *			a channel
+ */
+struct edac_mc_layer {
+	enum edac_mc_layer_type	type;
+	unsigned		size;
+	bool			is_csrow;
+};
+
+/*
+ * Maximum number of layers used by the memory controller to uniquely
+ * identify a single memory stick.
+ * NOTE: incrementing it would require changes at edac_mc_handle_error()
+ * and at the routines at edac_mc_sysfs that create layers
+ */
+#define EDAC_MAX_LAYERS		3
+
+/*
+ * A loop could be used here to make it more generic, but, as we only have
+ * 3 layers, this is a little faster. By design, layers can never be 0 or
+ * more than 3. If that ever happens, a NULL is returned, causing an OOPS
+ * during the memory allocation routine, which would show the developer
+ * that something is wrong.
+ */
+#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
+	typeof(var) __p;						\
+	if ((nlayers) == 1)						\
+		__p = &var[lay0];					\
+	else if ((nlayers) == 2)					\
+		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
+	else if ((nlayers) == 3)					\
+		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
+			    ((layers[1]).size * (lay0))))];		\
+	else								\
+		__p = NULL;						\
+	__p;								\
+})
+
+
+/* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -342,9 +425,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -396,6 +480,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -440,13 +529,16 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned num_csrows, num_cschannel;
 
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -461,12 +553,13 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -479,7 +572,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes
-- 
1.7.8
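
For readers following the GET_POS() macro added to include/linux/edac.h
above: it flattens up to three per-layer positions into a single array
index, with layer 0 varying slowest. The userspace sketch below mirrors
that arithmetic only. The struct and function names are hypothetical
stand-ins, not kernel API, and it returns -1 where the macro would
yield NULL.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct edac_mc_layer; only the size matters. */
struct layer {
	unsigned size;
};

/*
 * Flatten up to three per-layer positions into one array index, in
 * row-major order (layer 0 varies slowest). Returns -1 for an
 * unsupported layer count, where GET_POS() would yield NULL.
 */
static long layer_index(const struct layer *layers, unsigned n_layers,
			unsigned lay0, unsigned lay1, unsigned lay2)
{
	assert(layers != NULL);

	switch (n_layers) {
	case 1:
		return lay0;
	case 2:
		return lay1 + (long)layers[1].size * lay0;
	case 3:
		return lay2 + (long)layers[2].size *
		       (lay1 + (long)layers[1].size * lay0);
	default:
		return -1;
	}
}
```

With a two-layer channel/slot layout of 4 channels by 3 slots (sizes
picked for illustration), channel 2, slot 1 maps to index 1 + 3 * 2 = 7.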



* [PATCH 06/13] edac: Initialize the dimm label with the known information
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (4 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 05/13] edac: Fix core support for MC's that see DIMMS instead of ranks Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 07/13] edac: Cleanup the logs for i7core and sb edac drivers Mauro Carvalho Chehab
                   ` (9 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

While userspace doesn't fill in the DIMM labels, store the DIMM location
there, as described by the memory model in use. This could eventually
match what dmidecode reports, making it easier for people to identify
the memory.

For example, on an Intel motherboard, the memory is described as:

Memory Device
	Array Handle: 0x0029
	Error Information Handle: Not Provided
	Total Width: 64 bits
	Data Width: 64 bits
	Size: 2048 MB
	Form Factor: DIMM
	Set: 1
	Locator: A1_DIMM0
	Bank Locator: A1_Node0_Channel0_Dimm0
	Type: <OUT OF SPEC>
	Type Detail: Synchronous
	Speed: 800 MHz
	Manufacturer: A1_Manufacturer0
	Serial Number: A1_SerNum0
	Asset Tag: A1_AssetTagNum0
	Part Number: A1_PartNum0

After this patch, the memory label will be filled with:
	/sys/devices/system/edac/mc/mc0/dimm0/dimm_label:mc#0channel#0slot#0

This somewhat matches the Bank Locator DMI information, so it is easier
to associate the DIMM labels, assuming of course that the DMI has the
Bank Locator filled in and that the BIOS doesn't have any bugs.

Even without it, several motherboards provide enough information to map
from channel/slot (or branch/channel/slot) to the DIMM label. So,
letting the EDAC core fill it in by default is a good thing.

It should be noted that, as the label is filled in at edac_mc_alloc(),
drivers can override it to better describe the memories (and some
actually do).
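
As a rough userspace illustration of how such a label is assembled (not
the kernel code itself): the sketch appends one "<layer>#<pos>" piece
per layer after the "mc#<idx>" prefix. The layer_name[] table and
LABEL_LEN are assumptions standing in for edac_layer_name[] and
EDAC_MC_LABEL_LEN + 1. Unlike the hunk below, it stops appending once
the buffer is full, because snprintf() returns the length it wanted to
write rather than the length it actually wrote.

```c
#include <assert.h>
#include <stdio.h>

#define LABEL_LEN 32	/* assumed size, for illustration only */

/* Hypothetical layer-name table, standing in for edac_layer_name[] */
static const char *const layer_name[] = {
	"branch", "channel", "slot", "csrow"
};

/*
 * Build "mc#<idx><layer>#<pos>..." into label. The result is always
 * NUL-terminated; pieces that do not fit are truncated or dropped.
 */
static void build_label(char *label, size_t len, unsigned mc_idx,
			const unsigned *layer_type, const unsigned *pos,
			unsigned n_layers)
{
	char *p = label;
	unsigned j;
	int n;

	assert(len > 0);

	n = snprintf(p, len, "mc#%u", mc_idx);
	/* Continue only while the previous piece fit completely */
	for (j = 0; j < n_layers && n > 0 && (size_t)n < len; j++) {
		p += n;
		len -= n;
		n = snprintf(p, len, "%s#%u",
			     layer_name[layer_type[j]], pos[j]);
	}
}
```

For memory controller 0 with a channel layer followed by a slot layer,
both at position 0, this produces the "mc#0channel#0slot#0" string
shown in the example above.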

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_mc.c |   21 +++++++++++++++++----
 1 files changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 2793fcb..138437c 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -202,10 +202,10 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
 	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
-	void *pvt;
+	void *pvt, *p;
 	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
 	unsigned tot_csrows, tot_cschannels;
-	int i, j;
+	int i, j, n, len;
 	int err;
 	int row, chn;
 
@@ -311,9 +311,22 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 			i, (dimm - mci->dimms),
 			pos[0], pos[1], pos[2], row, chn);
 
-		/* Copy DIMM location */
-		for (j = 0; j < n_layers; j++)
+		/*
+		 * Copy DIMM location and initialize the memory location
+		 */
+		len = sizeof(dimm->label);
+		p = dimm->label;
+		n = snprintf(p, len, "mc#%u", edac_index);
+		p += n;
+		len -= n;
+		for (j = 0; j < n_layers; j++) {
+			n = snprintf(p, len, "%s#%u",
+				     edac_layer_name[layers[j].type],
+				     pos[j]);
+			p += n;
+			len -= n;
 			dimm->location[j] = pos[j];
+		}
 
 		/* Link it to the csrows old API data */
 		chan->dimm = dimm;
-- 
1.7.8



* [PATCH 07/13] edac: Cleanup the logs for i7core and sb edac drivers
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (5 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 06/13] edac: Initialize the dimm label with the known information Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 08/13] i5400_edac: improve debug messages to better represent the filled memory Mauro Carvalho Chehab
                   ` (8 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Remove some information that is duplicated in the MCE log and is of
little use for the error report. This data will be added back when
creating a trace function that outputs both memory errors and MCE
fields.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i7core_edac.c |    9 ++-------
 drivers/edac/sb_edac.c     |   29 ++++++++++++++---------------
 2 files changed, 16 insertions(+), 22 deletions(-)

diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 72553dd..42775c4 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -1623,7 +1623,7 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 				    const struct mce *m)
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
-	char *type, *optype, *err, *msg;
+	char *type, *optype, *err, msg[80];
 	enum hw_event_mc_err_type tp_event;
 	unsigned long error = m->status & 0x1ff0000l;
 	bool uncorrected_error = m->mcgstatus & 1ll << 61;
@@ -1701,10 +1701,7 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 		err = "unknown";
 	}
 
-	msg = kasprintf(GFP_ATOMIC,
-		"addr=0x%08llx cpu=%d count=%d Err=%08llx:%08llx (%s: %s))\n",
-		(long long) m->addr, m->cpu, core_err_cnt,
-		(long long)m->status, (long long)m->misc, optype, err);
+	snprintf(msg, sizeof(msg), "count=%d %s", core_err_cnt, optype);
 
 	/*
 	 * Call the helper to output message
@@ -1718,8 +1715,6 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 				     syndrome,
 				     channel, dimm, -1,
 				     err, msg, m);
-
-	kfree(msg);
 }
 
 /*
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index b253675..ad27e27 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1433,23 +1433,22 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 		recoverable_msg = "";
 
 	/*
-	 * FIXME: What should we do with "channel" information on mcelog?
-	 * Probably, we can just discard it, as the channel information
-	 * comes from the get_memory_error_data() address decoding
+	 * FIXME: On some memory configurations (mirror, lockstep), the
+	 * Memory Controller can't point the error to a single DIMM. The
+	 * EDAC core should be handling the channel mask, in order to point
+	 * to the group of dimm's where the error may be happening.
 	 */
 	snprintf(msg, sizeof(msg),
-			"%d error(s)%s: %s%s: cpu=%d Err=%04x:%04x addr = 0x%08llx socket=%d Channel=%ld(mask=%ld), rank=%d\n",
-			core_err_cnt,
-			overflow ? " OVERFLOW" : "",
-			area_type,
-			recoverable_msg,
-			m->cpu,
-			mscod, errcode,
-			(long long) m->addr,
-			socket,
-			first_channel,		/* This is the real channel on SB */
-			channel_mask,
-			rank);
+		 "%d error(s)%s: %s%s: Err=%04x:%04x socket=%d channel=%ld/mask=%ld rank=%d",
+		 core_err_cnt,
+		 overflow ? " OVERFLOW" : "",
+		 area_type,
+		 recoverable_msg,
+		 mscod, errcode,
+		 socket,
+		 first_channel,
+		 channel_mask,
+		 rank);
 
 	debugf0("%s", msg);
 
-- 
1.7.8



* [PATCH 08/13] i5400_edac: improve debug messages to better represent the filled memory
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (6 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 07/13] edac: Cleanup the logs for i7core and sb edac drivers Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 09/13] events/hw_event: Create a Hardware Events Report Mecanism (HERM) Mauro Carvalho Chehab
                   ` (7 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Improve the debug output messages, in order to better represent the
memory controller hierarchy.

No functional changes when debug is disabled.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i5400_edac.c |   15 ++++++++++++++-
 1 files changed, 14 insertions(+), 1 deletions(-)

diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 3aa2a1e..5debda9 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -963,7 +963,7 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 	int dimm, max_dimms;
 	char *p, *mem_buffer;
 	int space, n;
-	int channel;
+	int channel, branch;
 
 	/* ================= Generate some debug output ================= */
 	space = PAGE_SIZE;
@@ -1028,6 +1028,19 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 		space -= n;
 	}
 
+	space -= n;
+	debugf2("%s\n", mem_buffer);
+	p = mem_buffer;
+	space = PAGE_SIZE;
+
+	n = snprintf(p, space, "           ");
+	p += n;
+	for (branch = 0; branch < MAX_BRANCHES; branch++) {
+		n = snprintf(p, space, "       branch %d       | ", branch);
+		p += n;
+		space -= n;
+	}
+
 	/* output the last message and free buffer */
 	debugf2("%s\n", mem_buffer);
 	kfree(mem_buffer);
-- 
1.7.8



* [PATCH 09/13] events/hw_event: Create a Hardware Events Report Mecanism (HERM)
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (7 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 08/13] i5400_edac: improve debug messages to better represent the filled memory Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 10/13] i5000_edac: Fix the logic that retrieves memory information Mauro Carvalho Chehab
                   ` (6 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Add a trace class for handling hardware events.

Part of the description below is shamelessly copied from Tony
Luck's notes about the Hardware Error BoF during LPC 2010 [1].
Tony, thanks for your notes and for the discussions that generated the
h/w error reporting requirements.

[1] http://lwn.net/Articles/416669/

    We have several subsystems & methods for reporting hardware errors:

    1) EDAC ("Error Detection and Correction").  In its original form
    this consisted of a platform specific driver that read topology
    information and error counts from chipset registers and reported
    the results via a sysfs interface.

    2) mcelog - x86 specific decoding of machine check bank registers
    reporting in binary form via /dev/mcelog. Recent additions make use
    of the APEI extensions that were documented in version 4.0a of the
    ACPI specification to acquire more information about errors without
    having to rely on reading chipset registers directly. A user-level
    program decodes it into a somewhat human-readable format.

    3) drivers/edac/mce_amd.c - this driver hooks into the mcelog path and
    decodes errors reported via machine check bank registers in AMD
    processors to the console log using printk();

    Each of these mechanisms has a band of followers ... and none
    of them appear to meet all the needs of all users.

In order to provide a proper hardware event subsystem, let's
encapsulate the memory error hardware events into a trace facility.

As no agreement has been reached so far on the MCA-based trace events,
for now let's add events only for memory errors. A later patch can
change the tracepoint for events originating from MCA.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_core.h        |    2 +-
 drivers/edac/edac_mc.c          |   11 ++++-
 include/trace/events/hw_event.h |  107 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 118 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/hw_event.h

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index 8aadd83..f7c8c8a 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -469,7 +469,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			  const int layer2,
 			  const char *msg,
 			  const char *other_detail,
-			  const void *mcelog);
+			  const void *arch_log);
 
 /*
  * edac_device APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 138437c..721d5cc 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -33,6 +33,9 @@
 #include "edac_core.h"
 #include "edac_module.h"
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/hw_event.h>
+
 /* lock to memory controller's control array */
 static DEFINE_MUTEX(mem_ctls_mutex);
 static LIST_HEAD(mc_devices);
@@ -381,6 +384,9 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	 * which will perform kobj unregistration and the actual free
 	 * will occur during the kobject callback operation
 	 */
+
+	trace_hw_event_init("edac", (unsigned)edac_index);
+
 	return mci;
 }
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
@@ -900,7 +906,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			  const int layer2,
 			  const char *msg,
 			  const char *other_detail,
-			  const void *mcelog)
+			  const void *arch_log)
 {
 	unsigned long remapped_page;
 	/* FIXME: too much for stack: move it to some pre-alocated area */
@@ -1038,6 +1044,9 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			"page 0x%lx offset 0x%lx grain %d",
 			page_frame_number, offset_in_page, grain);
 
+	trace_mc_error(type, mci->mc_idx, msg, label, location,
+		       detail, other_detail);
+
 	if (type == HW_EVENT_ERR_CORRECTED) {
 		if (edac_mc_get_log_ce())
 			edac_mc_printk(mci, KERN_WARNING,
diff --git a/include/trace/events/hw_event.h b/include/trace/events/hw_event.h
new file mode 100644
index 0000000..1fabfe2
--- /dev/null
+++ b/include/trace/events/hw_event.h
@@ -0,0 +1,107 @@
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM hw_event
+
+#if !defined(_TRACE_HW_EVENT_MC_H) || defined(TRACE_HEADER_MULTI_READ)
+#define _TRACE_HW_EVENT_MC_H
+
+#include <linux/tracepoint.h>
+#include <linux/edac.h>
+#include <linux/ktime.h>
+
+/*
+ * Hardware Anomaly Report Mechanism (HARM) events
+ *
+ * These events are generated when hardware detects a corrected or
+ * uncorrected event, and are meant to replace the current API to report
+ * errors defined on both EDAC and MCE subsystems.
+ *
+ * FIXME: Add events for handling memory errors originated from the
+ *        MCE subsystem.
+ */
+
+DECLARE_EVENT_CLASS(hw_event_class,
+	TP_PROTO(const char *type, unsigned int instance),
+	TP_ARGS(type, instance),
+
+	TP_STRUCT__entry(
+		__string(	type,		type			)
+		__field(	unsigned int,	instance		)
+	),
+
+	TP_fast_assign(
+		__assign_str(type, type);
+		__entry->instance = instance;
+	),
+
+	TP_printk("Initialized %s#%d\n",
+		__get_str(type),
+		__entry->instance)
+);
+
+/*
+ * This event indicates that a hardware collection mechanism is started
+ */
+DEFINE_EVENT(hw_event_class, hw_event_init,
+
+	TP_PROTO(const char *type, unsigned int instance),
+
+	TP_ARGS(type, instance)
+);
+
+
+/*
+ * Hardware-independent Memory Controller specific events
+ */
+
+/*
+ * Default error mechanisms for Memory Controller errors (CE and UE)
+ */
+TRACE_EVENT(mc_error,
+
+	TP_PROTO(const unsigned int err_type,
+		 const unsigned int mc_index,
+		 const char *msg,
+		 const char *label,
+		 const char *location,
+		 const char *detail,
+		 const char *driver_detail),
+
+	TP_ARGS(err_type, mc_index, msg, label, location,
+		detail, driver_detail),
+
+	TP_STRUCT__entry(
+		__field(	unsigned int,	err_type		)
+		__field(	unsigned int,	mc_index		)
+		__string(	msg,		msg			)
+		__string(	label,		label			)
+		__string(	detail,		detail			)
+		__string(	location,	location		)
+		__string(	driver_detail,	driver_detail		)
+	),
+
+	TP_fast_assign(
+		__entry->err_type		= err_type;
+		__entry->mc_index		= mc_index;
+		__assign_str(msg, msg);
+		__assign_str(label, label);
+		__assign_str(location, location);
+		__assign_str(detail, detail);
+		__assign_str(driver_detail, driver_detail);
+	),
+
+	TP_printk(HW_ERR "mce#%d: %s error %s on label \"%s\" (%s %s %s)",
+		  __entry->mc_index,
+		  (__entry->err_type == HW_EVENT_ERR_CORRECTED) ? "Corrected" :
+			((__entry->err_type == HW_EVENT_ERR_FATAL) ?
+			"Fatal" : "Uncorrected"),
+		  __get_str(msg),
+		  __get_str(label),
+		  __get_str(location),
+		  __get_str(detail),
+		  __get_str(driver_detail))
+);
+
+#endif /* _TRACE_HW_EVENT_MC_H */
+
+/* This part must be outside protection */
+#include <trace/define_trace.h>
-- 
1.7.8
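
Reader's note, not part of the patch: a minimal userspace sketch of how the
TP_printk() format of the mc_error event above renders one record. The enum,
the helper names and the sample strings are assumptions made for
illustration only; the HW_ERR prefix is left out, and %u is used for the
unsigned index.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the kernel's enum hw_event_mc_err_type (values assumed). */
enum sketch_err_type {
	SKETCH_ERR_CORRECTED,
	SKETCH_ERR_UNCORRECTED,
	SKETCH_ERR_FATAL
};

/* Same three-way choice as the nested ternary in TP_printk(). */
static const char *sketch_err_name(unsigned int err_type)
{
	if (err_type == SKETCH_ERR_CORRECTED)
		return "Corrected";
	return (err_type == SKETCH_ERR_FATAL) ? "Fatal" : "Uncorrected";
}

/* Render one record into buf using the format string from the patch. */
static int sketch_format_mc_error(char *buf, size_t len,
				  unsigned int err_type,
				  unsigned int mc_index,
				  const char *msg, const char *label,
				  const char *location, const char *detail,
				  const char *driver_detail)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len,
			"mce#%u: %s error %s on label \"%s\" (%s %s %s)",
			mc_index, sketch_err_name(err_type), msg, label,
			location, detail, driver_detail);
}
```

Any err_type other than corrected or fatal falls through to "Uncorrected",
which matches the order of the tests in the tracepoint.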



* [PATCH 10/13] i5000_edac: Fix the logic that retrieves memory information
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (8 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 09/13] events/hw_event: Create a Hardware Events Report Mecanism (HERM) Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 11/13] e752x_edac: provide more info about how DIMMS/ranks are mapped Mauro Carvalho Chehab
                   ` (5 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The logic there is broken: it basically creates two csrows for
each DIMM and assumes that all DIMMs are dual rank. Only one of
the csrows will contain the entire DIMM size. If single-rank
memories are found, they'll be marked with 0 bytes.

The check for whether the AMB is present was also wrong.

Yet, as the error reports don't use the memory size in order to
credit an error to the right DIMM, that part of the driver seems
to work. That's probably why nobody has detected the issue yet.

After this patch, the memory layout is properly reported when
debug mode is enabled, and the number of ranks per DIMM is
shown:

calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  3       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  2       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size: slot  1       0 MB   |    0 MB   |    0 MB   |    0 MB   |
calculate_dimm_size: slot  0     512 MB 1R|  512 MB 1R|  512 MB 1R|  512 MB 1R|
calculate_dimm_size: ----------------------------------------------------------
calculate_dimm_size:            channel 0 | channel 1 | channel 2 | channel 3 |
calculate_dimm_size:                   branch 0       |        branch 1       |

(1R above means that all memories on my test machine are single-ranked)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
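
Reader's note, not part of the commit: a minimal userspace sketch of the
size arithmetic that handle_channel() performs once per slot. The helper
names are hypothetical, the bit counts are passed in directly instead of
being decoded from an MTR register, and the page count assumes 4 KiB pages
(which is what the "<< 8" in i5000_init_csrows() implies).

```c
#include <assert.h>

/*
 * Hypothetical helper: DIMM size in megabytes from the number of bank,
 * row and column address bits.
 */
static int sketch_dimm_megabytes(int bank_bits, int row_bits, int col_bits)
{
	int addr_bits = bank_bits + row_bits + col_bits;

	addr_bits += 6;		/* each location is 64 (2^6) bits wide */
	addr_bits -= 3;		/* bits to bytes: divide by 8 (2^3) */
	addr_bits -= 20;	/* bytes to megabytes: divide by 2^20 */

	assert(addr_bits >= 0 && addr_bits < 31);
	return 1 << addr_bits;
}

/* 1 MB holds 256 pages of 4 KiB, hence the shift by 8 (assumed page size). */
static unsigned long sketch_dimm_nr_pages(int megabytes)
{
	return (unsigned long)megabytes << 8;
}
```

With 2 bank bits, 14 row bits and 10 column bits the result is 512 MB,
the size shown for slot 0 in the debug output above, and 131072 pages.
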
 drivers/edac/i5000_edac.c |  150 ++++++++++++++++++++++++---------------------
 1 files changed, 79 insertions(+), 71 deletions(-)

diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 564fe09..14efb74 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -270,7 +270,8 @@
 #define MTR3		0x8C
 
 #define NUM_MTRS		4
-#define CHANNELS_PER_BRANCH	(2)
+#define CHANNELS_PER_BRANCH	2
+#define MAX_BRANCHES		2
 
 /* Defines to extract the vaious fields from the
  *	MTRx - Memory Technology Registers
@@ -962,14 +963,14 @@ static int determine_amb_present_reg(struct i5000_pvt *pvt, int channel)
  *
  *	return the proper MTR register as determine by the csrow and channel desired
  */
-static int determine_mtr(struct i5000_pvt *pvt, int csrow, int channel)
+static int determine_mtr(struct i5000_pvt *pvt, int slot, int channel)
 {
 	int mtr;
 
 	if (channel < CHANNELS_PER_BRANCH)
-		mtr = pvt->b0_mtr[csrow >> 1];
+		mtr = pvt->b0_mtr[slot];
 	else
-		mtr = pvt->b1_mtr[csrow >> 1];
+		mtr = pvt->b1_mtr[slot];
 
 	return mtr;
 }
@@ -994,37 +995,34 @@ static void decode_mtr(int slot_row, u16 mtr)
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 }
 
-static void handle_channel(struct i5000_pvt *pvt, int csrow, int channel,
+static void handle_channel(struct i5000_pvt *pvt, int slot, int channel,
 			struct i5000_dimm_info *dinfo)
 {
 	int mtr;
 	int amb_present_reg;
 	int addrBits;
 
-	mtr = determine_mtr(pvt, csrow, channel);
+	mtr = determine_mtr(pvt, slot, channel);
 	if (MTR_DIMMS_PRESENT(mtr)) {
 		amb_present_reg = determine_amb_present_reg(pvt, channel);
 
-		/* Determine if there is  a  DIMM present in this DIMM slot */
-		if (amb_present_reg & (1 << (csrow >> 1))) {
+		/* Determine if there is a DIMM present in this DIMM slot */
+		if (amb_present_reg) {
 			dinfo->dual_rank = MTR_DIMM_RANK(mtr);
 
-			if (!((dinfo->dual_rank == 0) &&
-				((csrow & 0x1) == 0x1))) {
-				/* Start with the number of bits for a Bank
-				 * on the DRAM */
-				addrBits = MTR_DRAM_BANKS_ADDR_BITS(mtr);
-				/* Add thenumber of ROW bits */
-				addrBits += MTR_DIMM_ROWS_ADDR_BITS(mtr);
-				/* add the number of COLUMN bits */
-				addrBits += MTR_DIMM_COLS_ADDR_BITS(mtr);
-
-				addrBits += 6;	/* add 64 bits per DIMM */
-				addrBits -= 20;	/* divide by 2^^20 */
-				addrBits -= 3;	/* 8 bits per bytes */
-
-				dinfo->megabytes = 1 << addrBits;
-			}
+			/* Start with the number of bits for a Bank
+				* on the DRAM */
+			addrBits = MTR_DRAM_BANKS_ADDR_BITS(mtr);
+			/* Add the number of ROW bits */
+			addrBits += MTR_DIMM_ROWS_ADDR_BITS(mtr);
+			/* add the number of COLUMN bits */
+			addrBits += MTR_DIMM_COLS_ADDR_BITS(mtr);
+
+			addrBits += 6;	/* add 64 bits per DIMM */
+			addrBits -= 20;	/* divide by 2^^20 */
+			addrBits -= 3;	/* 8 bits per bytes */
+
+			dinfo->megabytes = 1 << addrBits;
 		}
 	}
 }
@@ -1038,10 +1036,9 @@ static void handle_channel(struct i5000_pvt *pvt, int csrow, int channel,
 static void calculate_dimm_size(struct i5000_pvt *pvt)
 {
 	struct i5000_dimm_info *dinfo;
-	int csrow, max_csrows;
+	int slot, channel, branch;
 	char *p, *mem_buffer;
 	int space, n;
-	int channel;
 
 	/* ================= Generate some debug output ================= */
 	space = PAGE_SIZE;
@@ -1052,22 +1049,17 @@ static void calculate_dimm_size(struct i5000_pvt *pvt)
 		return;
 	}
 
-	n = snprintf(p, space, "\n");
-	p += n;
-	space -= n;
-
-	/* Scan all the actual CSROWS (which is # of DIMMS * 2)
+	/* Scan all the actual slots
 	 * and calculate the information for each DIMM
-	 * Start with the highest csrow first, to display it first
-	 * and work toward the 0th csrow
+	 * Start with the highest slot first, to display it first
+	 * and work toward the 0th slot
 	 */
-	max_csrows = pvt->maxdimmperch * 2;
-	for (csrow = max_csrows - 1; csrow >= 0; csrow--) {
+	for (slot = pvt->maxdimmperch - 1; slot >= 0; slot--) {
 
-		/* on an odd csrow, first output a 'boundary' marker,
+		/* on an odd slot, first output a 'boundary' marker,
 		 * then reset the message buffer  */
-		if (csrow & 0x1) {
-			n = snprintf(p, space, "---------------------------"
+		if (slot & 0x1) {
+			n = snprintf(p, space, "--------------------------"
 				"--------------------------------");
 			p += n;
 			space -= n;
@@ -1075,30 +1067,39 @@ static void calculate_dimm_size(struct i5000_pvt *pvt)
 			p = mem_buffer;
 			space = PAGE_SIZE;
 		}
-		n = snprintf(p, space, "csrow %2d    ", csrow);
+		n = snprintf(p, space, "slot %2d    ", slot);
 		p += n;
 		space -= n;
 
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			dinfo = &pvt->dimm_info[csrow][channel];
-			handle_channel(pvt, csrow, channel, dinfo);
-			n = snprintf(p, space, "%4d MB   | ", dinfo->megabytes);
+			dinfo = &pvt->dimm_info[slot][channel];
+			handle_channel(pvt, slot, channel, dinfo);
+			if (dinfo->megabytes)
+				n = snprintf(p, space, "%4d MB %dR| ",
+					     dinfo->megabytes, dinfo->dual_rank + 1);
+			else
+				n = snprintf(p, space, "%4d MB   | ", 0);
 			p += n;
 			space -= n;
 		}
-		n = snprintf(p, space, "\n");
 		p += n;
 		space -= n;
+		debugf2("%s\n", mem_buffer);
+		p = mem_buffer;
+		space = PAGE_SIZE;
 	}
 
 	/* Output the last bottom 'boundary' marker */
-	n = snprintf(p, space, "---------------------------"
-		"--------------------------------\n");
+	n = snprintf(p, space, "--------------------------"
+		"--------------------------------");
 	p += n;
 	space -= n;
+	debugf2("%s\n", mem_buffer);
+	p = mem_buffer;
+	space = PAGE_SIZE;
 
 	/* now output the 'channel' labels */
-	n = snprintf(p, space, "            ");
+	n = snprintf(p, space, "           ");
 	p += n;
 	space -= n;
 	for (channel = 0; channel < pvt->maxch; channel++) {
@@ -1106,9 +1107,17 @@ static void calculate_dimm_size(struct i5000_pvt *pvt)
 		p += n;
 		space -= n;
 	}
-	n = snprintf(p, space, "\n");
+	debugf2("%s\n", mem_buffer);
+	p = mem_buffer;
+	space = PAGE_SIZE;
+
+	n = snprintf(p, space, "           ");
 	p += n;
-	space -= n;
+	for (branch = 0; branch < MAX_BRANCHES; branch++) {
+		n = snprintf(p, space, "       branch %d       | ", branch);
+		p += n;
+		space -= n;
+	}
 
 	/* output the last message and free buffer */
 	debugf2("%s\n", mem_buffer);
@@ -1241,14 +1250,13 @@ static void i5000_get_mc_regs(struct mem_ctl_info *mci)
 static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
-	struct csrow_info *p_csrow;
 	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
-	int mtr, mtr1;
+	int mtr;
 	int csrow_megs;
 	int channel;
-	int csrow;
+	int slot;
 
 	pvt = mci->pvt_info;
 
@@ -1258,26 +1266,25 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 	empty = 1;		/* Assume NO memory */
 
 	/*
-	 * TODO: it would be better to not use csrow here, filling
-	 * directly the dimm_info structs, based on branch, channel, dim number
+	 * FIXME: The memory layout used to map slot/channel into the
+	 * real memory architecture is weird: branch+slot are "csrows"
+	 * and channel is channel. That required an extra array (dimm_info)
+	 * to map the dimms. A good cleanup would be to remove this array,
+	 * and do a loop here with branch, channel, slot
 	 */
-	for (csrow = 0; csrow < max_csrows; csrow++) {
-		p_csrow = &mci->csrows[csrow];
+	for (slot = 0; slot < max_csrows; slot++) {
+		for (channel = 0; channel < pvt->maxch; channel++) {
 
-		p_csrow->csrow_idx = csrow;
+			mtr = determine_mtr(pvt, slot, channel);
 
-		/* use branch 0 for the basis */
-		mtr = pvt->b0_mtr[csrow >> 1];
-		mtr1 = pvt->b1_mtr[csrow >> 1];
+			if (!MTR_DIMMS_PRESENT(mtr))
+				continue;
 
-		/* if no DIMMS on this row, continue */
-		if (!MTR_DIMMS_PRESENT(mtr) && !MTR_DIMMS_PRESENT(mtr1))
-			continue;
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       channel / MAX_BRANCHES,
+				       channel % MAX_BRANCHES, slot);
 
-		csrow_megs = 0;
-		for (channel = 0; channel < pvt->maxch; channel++) {
-			dimm = p_csrow->channels[channel].dimm;
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			csrow_megs = pvt->dimm_info[slot][channel].megabytes;
 			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
@@ -1290,7 +1297,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 				dimm->dtype = DEV_X4;
 
 			dimm->edac_mode = EDAC_S8ECD8ED;
-			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
+			dimm->nr_pages = csrow_megs << 8;
 		}
 
 		empty = 0;
@@ -1337,7 +1344,7 @@ static void i5000_get_dimm_and_channel_counts(struct pci_dev *pdev,
 	 * supported on this memory controller
 	 */
 	pci_read_config_byte(pdev, MAXDIMMPERCH, &value);
-	*num_dimms_per_channel = (int)value *2;
+	*num_dimms_per_channel = (int)value;
 
 	pci_read_config_byte(pdev, MAXCH, &value);
 	*num_channels = (int)value;
@@ -1387,11 +1394,12 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 		__func__, num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
+
 	layers[0].type = EDAC_MC_LAYER_BRANCH;
-	layers[0].size = 2;
-	layers[0].is_csrow = true;
+	layers[0].size = MAX_BRANCHES;
+	layers[0].is_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
-	layers[1].size = num_channels;
+	layers[1].size = num_channels / MAX_BRANCHES;
 	layers[1].is_csrow = false;
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = num_dimms_per_channel;
-- 
1.7.8



* [PATCH 11/13] e752x_edac: provide more info about how DIMMS/ranks are mapped
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (9 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 10/13] i5000_edac: Fix the logic that retrieves memory information Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 12/13] edac: Rename the parent dev to pdev Mauro Carvalho Chehab
                   ` (4 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

No functional changes here. Only the comments got updated.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
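
Reader's note, not part of the commit: a small sketch that transcribes the
rank/DIMM mapping described in the new comment below. The function names
are hypothetical and nothing here is taken from the driver code itself.
The e752x helpers are zero-based; the i3100 helper uses the 1-based
numbering of the table in the comment and covers only the single-ranked
column.

```c
#include <assert.h>

/* e752x: two ranks (csrows) per DIMM, so the DIMM number is rank / 2. */
static int sketch_e752x_rank_to_dimm(int rank)
{
	assert(rank >= 0);
	return rank / 2;
}

/* e752x: odd ranks exist only when the DIMM is dual-rank. */
static int sketch_e752x_rank_needs_dual(int rank)
{
	return rank & 1;
}

/*
 * i3100, single-ranked DIMMs, numbered from 1 as in the comment:
 * dimm #1 -> rank #4, #2 -> #3, #3 -> #2, #4 -> #1.
 * Returns 0 for a DIMM number outside 1..4.
 */
static int sketch_i3100_single_rank_of_dimm(int dimm)
{
	if (dimm < 1 || dimm > 4)
		return 0;
	return 5 - dimm;
}
```
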
 drivers/edac/e752x_edac.c |   26 ++++++++++++++++++++++----
 1 files changed, 22 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 9f8b763..d862fb4 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -4,10 +4,11 @@
  * This file may be distributed under the terms of the
  * GNU General Public License.
  *
- * See "enum e752x_chips" below for supported chipsets
+ * Implement support for the e7520, E7525, e7320 and i3100 memory controllers.
  *
- * Datasheet:
+ * Datasheets:
  *	http://www.intel.in/content/www/in/en/chipsets/e7525-memory-controller-hub-datasheet.html
+ *	ftp://download.intel.com/design/intarch/datashts/31345803.pdf
  *
  * Written by Tom Zimmerman
  *
@@ -16,8 +17,6 @@
  * 	Wang Zhenyu at intel.com
  * 	Dave Jiang at mvista.com
  *
- * $Id: edac_e752x.c,v 1.5.2.11 2005/10/05 00:43:44 dsp_llnl Exp $
- *
  */
 
 #include <linux/module.h>
@@ -190,6 +189,25 @@ enum e752x_chips {
 	I3100 = 3
 };
 
+/*
+ * These chips support single-rank and dual-rank memories only.
+ *
+ * On e752x chips, the odd rows are present only on dual-rank memories.
+ * Dividing the rank by two will provide the dimm#
+ *
+ * i3100 MC has a different mapping: it supports only 4 ranks.
+ *
+ * The mapping is (from 1 to n):
+ *	slot	   single-ranked	double-ranked
+ *	dimm #1 -> rank #4		NA
+ *	dimm #2 -> rank #3		NA
+ *	dimm #3 -> rank #2		Ranks 2 and 3
+ *	dimm #4 -> rank #1		Ranks 1 and 4
+ *
+ * FIXME: The current mapping for i3100 considers that it supports up to 8
+ *	  ranks/channel, but datasheet says that the MC supports only 4 ranks.
+ */
+
 struct e752x_pvt {
 	struct pci_dev *bridge_ck;
 	struct pci_dev *dev_d0f0;
-- 
1.7.8



* [PATCH 12/13] edac: Rename the parent dev to pdev
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (10 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 11/13] e752x_edac: provide more info about how DIMMS/ranks are mapped Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 16:45 ` [PATCH 13/13] edac: use Documentation-nano format for some data structs Mauro Carvalho Chehab
                   ` (3 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

As EDAC doesn't use struct device itself, it created a parent dev
pointer called "dev". Now that we'll be converting it to use
struct device, instead of struct devsys, this pointer needs to be
renamed to "pdev".

No functional changes.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
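
Reader's note, not part of the commit: a userspace sketch of the lookup
that find_mci_by_dev() does on the renamed field. The struct definitions
are simplified stand-ins, and a plain singly linked list replaces the
kernel's list_head; only the comparison against "pdev" mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the kernel's struct device. */
struct sketch_device {
	const char *name;
};

/* Stand-in for struct mem_ctl_info, reduced to what the lookup needs. */
struct sketch_mci {
	struct sketch_device *pdev;	/* parent device; was "dev" before */
	int mc_idx;
	struct sketch_mci *next;
};

/* Return the controller whose parent device is dev, or NULL if none. */
static struct sketch_mci *sketch_find_mci_by_dev(struct sketch_mci *head,
						 struct sketch_device *dev)
{
	struct sketch_mci *mci;

	assert(dev != NULL);
	for (mci = head; mci != NULL; mci = mci->next)
		if (mci->pdev == dev)
			return mci;

	return NULL;
}
```

The comparison is by pointer identity, as in add_mc_to_global_list(),
where a non-NULL result means the device already has a controller.
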
 drivers/edac/amd64_edac.c      |    2 +-
 drivers/edac/amd76x_edac.c     |    4 ++--
 drivers/edac/cell_edac.c       |   12 ++++++------
 drivers/edac/cpc925_edac.c     |    2 +-
 drivers/edac/e752x_edac.c      |    2 +-
 drivers/edac/e7xxx_edac.c      |    2 +-
 drivers/edac/edac_mc.c         |    8 ++++----
 drivers/edac/edac_mc_sysfs.c   |    2 +-
 drivers/edac/i3000_edac.c      |    4 ++--
 drivers/edac/i3200_edac.c      |    6 +++---
 drivers/edac/i5000_edac.c      |    2 +-
 drivers/edac/i5100_edac.c      |    2 +-
 drivers/edac/i5400_edac.c      |    2 +-
 drivers/edac/i7300_edac.c      |    2 +-
 drivers/edac/i7core_edac.c     |    4 ++--
 drivers/edac/i82443bxgx_edac.c |    4 ++--
 drivers/edac/i82860_edac.c     |    4 ++--
 drivers/edac/i82875p_edac.c    |    4 ++--
 drivers/edac/i82975x_edac.c    |    4 ++--
 drivers/edac/mpc85xx_edac.c    |    4 ++--
 drivers/edac/mv64x60_edac.c    |    2 +-
 drivers/edac/pasemi_edac.c     |    6 +++---
 drivers/edac/ppc4xx_edac.c     |    8 ++++----
 drivers/edac/r82600_edac.c     |    4 ++--
 drivers/edac/sb_edac.c         |    4 ++--
 drivers/edac/tile_edac.c       |    4 ++--
 drivers/edac/x38_edac.c        |    6 +++---
 include/linux/edac.h           |    2 +-
 28 files changed, 56 insertions(+), 56 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 8e2873f..3118924 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2573,7 +2573,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	mci->pvt_info = pvt;
-	mci->dev = &pvt->F2->dev;
+	mci->pdev = &pvt->F2->dev;
 
 	setup_mci_misc_attrs(mci, fam_type);
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 4f3e54a..addd859 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -105,7 +105,7 @@ static void amd76x_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS,
 			&info->ecc_mode_status);
 
@@ -257,7 +257,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENOMEM;
 
 	debugf0("%s(): mci = %p\n", __func__, mci);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
 	mci->edac_cap = ems_mode ?
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 39616a3..9e53917 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -36,7 +36,7 @@ static void cell_edac_count_ce(struct mem_ctl_info *mci, int chan, u64 ar)
 	struct csrow_info		*csrow = &mci->csrows[0];
 	unsigned long			address, pfn, offset, syndrome;
 
-	dev_dbg(mci->dev, "ECC CE err on node %d, channel %d, ar = 0x%016llx\n",
+	dev_dbg(mci->pdev, "ECC CE err on node %d, channel %d, ar = 0x%016llx\n",
 		priv->node, chan, ar);
 
 	/* Address decoding is likely a bit bogus, to dbl check */
@@ -59,7 +59,7 @@ static void cell_edac_count_ue(struct mem_ctl_info *mci, int chan, u64 ar)
 	struct csrow_info		*csrow = &mci->csrows[0];
 	unsigned long			address, pfn, offset;
 
-	dev_dbg(mci->dev, "ECC UE err on node %d, channel %d, ar = 0x%016llx\n",
+	dev_dbg(mci->pdev, "ECC UE err on node %d, channel %d, ar = 0x%016llx\n",
 		priv->node, chan, ar);
 
 	/* Address decoding is likely a bit bogus, to dbl check */
@@ -83,7 +83,7 @@ static void cell_edac_check(struct mem_ctl_info *mci)
 	fir = in_be64(&priv->regs->mic_fir);
 #ifdef DEBUG
 	if (fir != priv->prev_fir) {
-		dev_dbg(mci->dev, "fir change : 0x%016lx\n", fir);
+		dev_dbg(mci->pdev, "fir change : 0x%016lx\n", fir);
 		priv->prev_fir = fir;
 	}
 #endif
@@ -119,7 +119,7 @@ static void cell_edac_check(struct mem_ctl_info *mci)
 		mb();	/* sync up */
 #ifdef DEBUG
 		fir = in_be64(&priv->regs->mic_fir);
-		dev_dbg(mci->dev, "fir clear  : 0x%016lx\n", fir);
+		dev_dbg(mci->pdev, "fir clear  : 0x%016lx\n", fir);
 #endif
 	}
 }
@@ -155,7 +155,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 			dimm->edac_mode = EDAC_SECDED;
 			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
-		dev_dbg(mci->dev,
+		dev_dbg(mci->pdev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
@@ -212,7 +212,7 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 	priv->regs = regs;
 	priv->node = pdev->id;
 	priv->chanmask = chanmask;
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_XDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_EC | EDAC_FLAG_SECDED;
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index eb6297d..794cec6 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -995,7 +995,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_mc_idx++;
 	pdata->name = pdev->name;
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	platform_set_drvdata(pdev, mci);
 	mci->dev_name = dev_name(&pdev->dev);
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index d862fb4..86428be 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1309,7 +1309,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	/* FIXME - what if different memory types are in different csrows? */
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E752X_REVISION;
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 
 	debugf3("%s(): init pvt\n", __func__);
 	pvt = (struct e752x_pvt *)mci->pvt_info;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 7b4f3aa..49e0aec 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -458,7 +458,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	/* FIXME - what if different memory types are in different csrows? */
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	debugf3("%s(): init pvt\n", __func__);
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 721d5cc..2fa10d8 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -91,7 +91,7 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 		mci->num_csrows, mci->csrows);
 	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
 		mci->tot_dimms, mci->dimms);
-	debugf3("\tdev = %p\n", mci->dev);
+	debugf3("\tdev = %p\n", mci->pdev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
 }
@@ -425,7 +425,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
 
-		if (mci->dev == dev)
+		if (mci->pdev == dev)
 			return mci;
 	}
 
@@ -577,7 +577,7 @@ static int add_mc_to_global_list(struct mem_ctl_info *mci)
 
 	insert_before = &mc_devices;
 
-	p = find_mci_by_dev(mci->dev);
+	p = find_mci_by_dev(mci->pdev);
 	if (unlikely(p != NULL))
 		goto fail0;
 
@@ -599,7 +599,7 @@ static int add_mc_to_global_list(struct mem_ctl_info *mci)
 
 fail0:
 	edac_printk(KERN_WARNING, EDAC_MC,
-		"%s (%s) %s %s already assigned %d\n", dev_name(p->dev),
+		"%s (%s) %s %s already assigned %d\n", dev_name(p->pdev),
 		edac_dev_name(mci), p->mod_name, p->ctl_name, p->mc_idx);
 	return 1;
 
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 256fd4e..dadc03c 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -924,7 +924,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	INIT_LIST_HEAD(&mci->grp_kobj_list);
 
 	/* create a symlink for the device */
-	err = sysfs_create_link(kobj_mci, &mci->dev->kobj,
+	err = sysfs_create_link(kobj_mci, &mci->pdev->kobj,
 				EDAC_DEVICE_SYMLINK);
 	if (err) {
 		debugf1("%s() failure to create symlink\n", __func__);
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index c366002..7560a9a 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -194,7 +194,7 @@ static void i3000_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -368,7 +368,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	debugf3("MC: %s(): init mci\n", __func__);
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 
 	mci->edac_ctl_cap = EDAC_FLAG_SECDED;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 7421af9..b04ee3f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -159,7 +159,7 @@ static void i3200_clear_error_info(struct mem_ctl_info *mci)
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * Clear any error bits.
@@ -176,7 +176,7 @@ static void i3200_get_and_clear_error_info(struct mem_ctl_info *mci,
 	struct i3200_priv *priv = mci->pvt_info;
 	void __iomem *window = priv->window;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -354,7 +354,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 
 	debugf3("MC: %s(): init mci\n", __func__);
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 
 	mci->edac_ctl_cap = EDAC_FLAG_SECDED;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 14efb74..0b6ac99 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1412,7 +1412,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	kobject_get(&mci->edac_mci_kobj);
 	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
 
-	mci->dev = &pdev->dev;	/* record ptr  to the generic device */
+	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
 	pvt = mci->pvt_info;
 	pvt->system_address = pdev;	/* Record this device in our private */
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index fda60f8..11aba5e 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -943,7 +943,7 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 		goto bail_disable_ch1;
 	}
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 
 	priv = mci->pvt_info;
 	priv->ranksperchan = ranksperch;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 5debda9..a16a2b5 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1300,7 +1300,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
 
-	mci->dev = &pdev->dev;	/* record ptr  to the generic device */
+	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
 	pvt = mci->pvt_info;
 	pvt->system_address = pdev;	/* Record this device in our private */
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 0ff0b26..57f264d 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1061,7 +1061,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 
 	debugf0("MC: " __FILE__ ": %s(): mci = %p\n", __func__, mci);
 
-	mci->dev = &pdev->dev;	/* record ptr  to the generic device */
+	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
 	pvt = mci->pvt_info;
 	pvt->pci_dev_16_0_fsb_ctlr = pdev;	/* Record this device in our private */
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 42775c4..7f81a04 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -2119,7 +2119,7 @@ static void i7core_unregister_mci(struct i7core_dev *i7core_dev)
 	i7core_pci_ctl_release(pvt);
 
 	/* Remove MC sysfs nodes */
-	edac_mc_del_mc(mci->dev);
+	edac_mc_del_mc(mci->pdev);
 
 	debugf1("%s: free mci struct\n", mci->ctl_name);
 	kfree(mci->ctl_name);
@@ -2185,7 +2185,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 	/* Get dimm basic config */
 	get_dimm_config(mci);
 	/* record ptr to the generic device */
-	mci->dev = &i7core_dev->pdev[0]->dev;
+	mci->pdev = &i7core_dev->pdev[0]->dev;
 	/* Set the function pointer to an actual operation function */
 	mci->edac_check = i7core_check_error;
 
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 09d39c0..ced9ba7 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -124,7 +124,7 @@ static void i82443bxgx_edacmc_get_error_info(struct mem_ctl_info *mci,
 				*info)
 {
 	struct pci_dev *pdev;
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 	pci_read_config_dword(pdev, I82443BXGX_EAP, &info->eap);
 	if (info->eap & I82443BXGX_EAP_OFFSET_SBE)
 		/* Clear error to allow next error to be reported [p.61] */
@@ -260,7 +260,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENOMEM;
 
 	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_EDO | MEM_FLAG_SDR | MEM_FLAG_RDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
 	pci_read_config_byte(pdev, I82443BXGX_DRAMC, &dramc);
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 85ed3a6..e8e5ddb 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -67,7 +67,7 @@ static void i82860_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -211,7 +211,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENOMEM;
 
 	debugf3("%s(): init mci\n", __func__);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	/* I"m not sure about this but I think that all RDRAM is SECDED */
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 471b26a..e89757e 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -189,7 +189,7 @@ static void i82875p_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -430,7 +430,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	kobject_get(&mci->edac_mci_kobj);
 
 	debugf3("%s(): init mci\n", __func__);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0a95e81..84e5fcc 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -241,7 +241,7 @@ static void i82975x_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -562,7 +562,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	debugf3("%s(): init mci\n", __func__);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index d83d750..f6fb1e5 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -990,9 +990,9 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	pdata = mci->pvt_info;
 	pdata->name = "mpc85xx_mc_err";
 	pdata->irq = NO_IRQ;
-	mci->dev = &op->dev;
+	mci->pdev = &op->dev;
 	pdata->edac_idx = edac_mc_idx++;
-	dev_set_drvdata(mci->dev, mci);
+	dev_set_drvdata(mci->pdev, mci);
 	mci->ctl_name = pdata->name;
 	mci->dev_name = pdata->name;
 
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index a32e9b6..ef0e710 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -724,7 +724,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	pdata = mci->pvt_info;
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	platform_set_drvdata(pdev, mci);
 	pdata->name = "mv64x60_mc_err";
 	pdata->irq = NO_IRQ;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 2959db6..8671270 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -74,7 +74,7 @@ static int system_mmc_id;
 
 static u32 pasemi_edac_get_error_info(struct mem_ctl_info *mci)
 {
-	struct pci_dev *pdev = to_pci_dev(mci->dev);
+	struct pci_dev *pdev = to_pci_dev(mci->pdev);
 	u32 tmp;
 
 	pci_read_config_dword(pdev, MCDEBUG_ERRSTA,
@@ -95,7 +95,7 @@ static u32 pasemi_edac_get_error_info(struct mem_ctl_info *mci)
 
 static void pasemi_edac_process_error_info(struct mem_ctl_info *mci, u32 errsta)
 {
-	struct pci_dev *pdev = to_pci_dev(mci->dev);
+	struct pci_dev *pdev = to_pci_dev(mci->pdev);
 	u32 errlog1a;
 	u32 cs;
 
@@ -225,7 +225,7 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 		MCCFG_ERRCOR_ECC_GEN_EN |
 		MCCFG_ERRCOR_ECC_CRR_EN;
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR | MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
 	mci->edac_cap = (errcor & MCCFG_ERRCOR_ECC_GEN_EN) ?
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 5dc0e6b..d3432da 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -1027,9 +1027,9 @@ ppc4xx_edac_mc_init(struct mem_ctl_info *mci,
 
 	/* Initial driver pointers and private data */
 
-	mci->dev		= &op->dev;
+	mci->pdev		= &op->dev;
 
-	dev_set_drvdata(mci->dev, mci);
+	dev_set_drvdata(mci->pdev, mci);
 
 	pdata			= mci->pvt_info;
 
@@ -1334,7 +1334,7 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	return 0;
 
  fail1:
-	edac_mc_del_mc(mci->dev);
+	edac_mc_del_mc(mci->pdev);
 
  fail:
 	edac_mc_free(mci);
@@ -1368,7 +1368,7 @@ ppc4xx_edac_remove(struct platform_device *op)
 
 	dcr_unmap(pdata->dcr_host, SDRAM_DCR_RESOURCE_LEN);
 
-	edac_mc_del_mc(mci->dev);
+	edac_mc_del_mc(mci->pdev);
 	edac_mc_free(mci);
 
 	return 0;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 2b001aa..850076e2 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -140,7 +140,7 @@ static void r82600_get_error_info(struct mem_ctl_info *mci,
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 	pci_read_config_dword(pdev, R82600_EAP, &info->eapr);
 
 	if (info->eapr & BIT(0))
@@ -296,7 +296,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENOMEM;
 
 	debugf0("%s(): mci = %p\n", __func__, mci);
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
 	/* FIXME try to work out if the chip leads have been used for COM2
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ad27e27..ff07f34 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1612,7 +1612,7 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 	mce_unregister_decode_chain(&sbridge_mce_dec);
 
 	/* Remove MC sysfs nodes */
-	edac_mc_del_mc(mci->dev);
+	edac_mc_del_mc(mci->pdev);
 
 	debugf1("%s: free mci struct\n", mci->ctl_name);
 	kfree(mci->ctl_name);
@@ -1677,7 +1677,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 	get_memory_layout(mci);
 
 	/* record ptr to the generic device */
-	mci->dev = &sbridge_dev->pdev[0]->dev;
+	mci->pdev = &sbridge_dev->pdev[0]->dev;
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (unlikely(edac_mc_add_mc(mci))) {
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 3e878bf..32cb2c7 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -69,7 +69,7 @@ static void tile_edac_check(struct mem_ctl_info *mci)
 
 	/* Check if the current error count is different from the saved one. */
 	if (mem_error.sbe_count != priv->ce_count) {
-		dev_dbg(mci->dev, "ECC CE err on node %d\n", priv->node);
+		dev_dbg(mci->pdev, "ECC CE err on node %d\n", priv->node);
 		priv->ce_count = mem_error.sbe_count;
 		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
 				     0, 0, 0,
@@ -149,7 +149,7 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 	priv->node = pdev->id;
 	priv->hv_devhdl = hv_devhdl;
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_SECDED;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 5f3c57f..6fc9f1d 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -151,7 +151,7 @@ static void x38_clear_error_info(struct mem_ctl_info *mci)
 {
 	struct pci_dev *pdev;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * Clear any error bits.
@@ -172,7 +172,7 @@ static void x38_get_and_clear_error_info(struct mem_ctl_info *mci,
 	struct pci_dev *pdev;
 	void __iomem *window = mci->pvt_info;
 
-	pdev = to_pci_dev(mci->dev);
+	pdev = to_pci_dev(mci->pdev);
 
 	/*
 	 * This is a mess because there is no atomic way to read all the
@@ -354,7 +354,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 
 	debugf3("MC: %s(): init mci\n", __func__);
 
-	mci->dev = &pdev->dev;
+	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 
 	mci->edac_ctl_cap = EDAC_FLAG_SECDED;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 7762da4..29899ad 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -546,7 +546,7 @@ struct mem_ctl_info {
 	 * unique.  dev pointer should be sufficiently unique, but
 	 * BUS:SLOT.FUNC numbers may not be unique.
 	 */
-	struct device *dev;
+	struct device *pdev;
 	const char *mod_name;
 	const char *mod_ver;
 	const char *ctl_name;
-- 
1.7.8



* [PATCH 13/13] edac: use Documentation-nano format for some data structs
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (11 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 12/13] edac: Rename the parent dev to pdev Mauro Carvalho Chehab
@ 2012-03-29 16:45 ` Mauro Carvalho Chehab
  2012-03-29 20:46 ` [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Aristeu Rozanski Filho
                   ` (2 subsequent siblings)
  15 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-29 16:45 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

No functional changes. Just comment improvements.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 include/linux/edac.h |   80 +++++++++++++++++++++++++++++++++++--------------
 1 files changed, 57 insertions(+), 23 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 29899ad..21f37f7 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -45,7 +45,19 @@ static inline void opstate_init(void)
 #define EDAC_MC_LABEL_LEN	31
 #define MC_PROC_NAME_MAX_LEN	7
 
-/* memory devices */
+/**
+ * enum dev_type - describe the type of memory DRAM chips used at the stick
+ * @DEV_UNKNOWN:	Can't be determined, or MC doesn't support detecting it
+ * @DEV_X1:		1 bit for data
+ * @DEV_X2:		2 bits for data
+ * @DEV_X4:		4 bits for data
+ * @DEV_X8:		8 bits for data
+ * @DEV_X16:		16 bits for data
+ * @DEV_X32:		32 bits for data
+ * @DEV_X64:		64 bits for data
+ *
+ * Typical values are x4 and x8.
+ */
 enum dev_type {
 	DEV_UNKNOWN = 0,
 	DEV_X1,
@@ -163,18 +175,29 @@ enum mem_type {
 #define MEM_FLAG_DDR3		 BIT(MEM_DDR3)
 #define MEM_FLAG_RDDR3		 BIT(MEM_RDDR3)
 
-/* chipset Error Detection and Correction capabilities and mode */
+/** enum edac_type - Error Detection and Correction capabilities and mode
+ * @EDAC_UNKNOWN:	Unknown if ECC is available
+ * @EDAC_NONE:		Doesn't support ECC
+ * @EDAC_RESERVED:	Reserved ECC type
+ * @EDAC_PARITY:	Detects parity errors
+ * @EDAC_EC:		Error Checking - no correction
+ * @EDAC_SECDED:	Single bit error correction, Double detection
+ * @EDAC_S2ECD2ED:	Chipkill x2 devices - do these exist?
+ * @EDAC_S4ECD4ED:	Chipkill x4 devices
+ * @EDAC_S8ECD8ED:	Chipkill x8 devices
+ * @EDAC_S16ECD16ED:	Chipkill x16 devices
+ */
 enum edac_type {
-	EDAC_UNKNOWN = 0,	/* Unknown if ECC is available */
-	EDAC_NONE,		/* Doesn't support ECC */
-	EDAC_RESERVED,		/* Reserved ECC type */
-	EDAC_PARITY,		/* Detects parity errors */
-	EDAC_EC,		/* Error Checking - no correction */
-	EDAC_SECDED,		/* Single bit error correction, Double detection */
-	EDAC_S2ECD2ED,		/* Chipkill x2 devices - do these exist? */
-	EDAC_S4ECD4ED,		/* Chipkill x4 devices */
-	EDAC_S8ECD8ED,		/* Chipkill x8 devices */
-	EDAC_S16ECD16ED,	/* Chipkill x16 devices */
+	EDAC_UNKNOWN =	0,
+	EDAC_NONE,
+	EDAC_RESERVED,
+	EDAC_PARITY,
+	EDAC_EC,
+	EDAC_SECDED,
+	EDAC_S2ECD2ED,
+	EDAC_S4ECD4ED,
+	EDAC_S8ECD8ED,
+	EDAC_S16ECD16ED,
 };
 
 #define EDAC_FLAG_UNKNOWN	BIT(EDAC_UNKNOWN)
@@ -187,18 +210,29 @@ enum edac_type {
 #define EDAC_FLAG_S8ECD8ED	BIT(EDAC_S8ECD8ED)
 #define EDAC_FLAG_S16ECD16ED	BIT(EDAC_S16ECD16ED)
 
-/* scrubbing capabilities */
+/** enum scrub_type - scrubbing capabilities
+ * @SCRUB_UNKNOWN:		Unknown if scrubber is available
+ * @SCRUB_NONE:			No scrubber
+ * @SCRUB_SW_PROG:		SW progressive (sequential) scrubbing
+ * @SCRUB_SW_SRC:		Software scrub only errors
+ * @SCRUB_SW_PROG_SRC:		Progressive software scrub from an error
+ * @SCRUB_SW_TUNABLE:		Software scrub frequency is tunable
+ * @SCRUB_HW_PROG:		HW progressive (sequential) scrubbing
+ * @SCRUB_HW_SRC:		Hardware scrub only errors
+ * @SCRUB_HW_PROG_SRC:		Progressive hardware scrub from an error
+ * @SCRUB_HW_TUNABLE:		Hardware scrub frequency is tunable
+ */
 enum scrub_type {
-	SCRUB_UNKNOWN = 0,	/* Unknown if scrubber is available */
-	SCRUB_NONE,		/* No scrubber */
-	SCRUB_SW_PROG,		/* SW progressive (sequential) scrubbing */
-	SCRUB_SW_SRC,		/* Software scrub only errors */
-	SCRUB_SW_PROG_SRC,	/* Progressive software scrub from an error */
-	SCRUB_SW_TUNABLE,	/* Software scrub frequency is tunable */
-	SCRUB_HW_PROG,		/* HW progressive (sequential) scrubbing */
-	SCRUB_HW_SRC,		/* Hardware scrub only errors */
-	SCRUB_HW_PROG_SRC,	/* Progressive hardware scrub from an error */
-	SCRUB_HW_TUNABLE	/* Hardware scrub frequency is tunable */
+	SCRUB_UNKNOWN =	0,
+	SCRUB_NONE,
+	SCRUB_SW_PROG,
+	SCRUB_SW_SRC,
+	SCRUB_SW_PROG_SRC,
+	SCRUB_SW_TUNABLE,
+	SCRUB_HW_PROG,
+	SCRUB_HW_SRC,
+	SCRUB_HW_PROG_SRC,
+	SCRUB_HW_TUNABLE
 };
 
 #define SCRUB_FLAG_SW_PROG	BIT(SCRUB_SW_PROG)
-- 
1.7.8
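
For readers following the enum documentation in this patch: the enum values are positions, and the EDAC_FLAG_* macros shown in the hunk context turn them into capability bitmasks via BIT(). A minimal user-space sketch of that relationship follows. It assumes a local stand-in for the kernel's BIT() macro and only a subset of enum edac_type; edac_cap_has() is a hypothetical helper for illustration and is not part of the EDAC core.

```c
#include <assert.h>

#define BIT(nr) (1UL << (nr))	/* local stand-in for the kernel's BIT() */

/* Subset of enum edac_type, in the same order as in the patch. */
enum edac_type {
	EDAC_UNKNOWN = 0,
	EDAC_NONE,
	EDAC_RESERVED,
	EDAC_PARITY,
	EDAC_EC,
	EDAC_SECDED,
};

#define EDAC_FLAG_NONE		BIT(EDAC_NONE)
#define EDAC_FLAG_EC		BIT(EDAC_EC)
#define EDAC_FLAG_SECDED	BIT(EDAC_SECDED)

/* Hypothetical helper: does a capability mask include a given mode? */
static int edac_cap_has(unsigned long cap, enum edac_type type)
{
	return (cap & BIT(type)) != 0;
}
```

A driver that sets edac_ctl_cap to EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED would, under these assumptions, report EDAC_SECDED and EDAC_EC as supported and EDAC_PARITY as unsupported.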



* Re: [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (12 preceding siblings ...)
  2012-03-29 16:45 ` [PATCH 13/13] edac: use Documentation-nano format for some data structs Mauro Carvalho Chehab
@ 2012-03-29 20:46 ` Aristeu Rozanski Filho
  2012-04-02 13:59 ` Borislav Petkov
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
  15 siblings, 0 replies; 206+ messages in thread
From: Aristeu Rozanski Filho @ 2012-03-29 20:46 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:45:33PM -0300, Mauro Carvalho Chehab wrote:
> This is the 12th and final rebase of this patch series.
> 
> It is the first patchset for the EDAC rewrite. On this patchset,
> there are all the internal changes at the EDAC core, needed
> to properly represent memories at modern memory controllers that
> aren't oriented per rank/channel.
> 
> It is needed in order to fix a long-term bug at the EDAC drivers
> for the Intel memory controllers deployed since 2005 (well, in fact,
> there is a Rambus one that is older, but also suffers from the same
> syndrome), including the drivers for the recent Intel Nehalem and
> Sandy Bridge architectures.
> 
> The new EDAC architecture supports both per rank/channel memory
> controllers and per-DIMM ones.
> 
> On this changeset, there are no changes at the sysfs nodes. Just 
> like before this changeset, non-per-rank memory controllers 
> will expose memories as "virtual csrows/virtual channels"[1].
> 
> [1] It sounds better to say "virtual" than to admit that all
> EDAC Intel drivers since 2005 need to lie about their age to
> the EDAC core, in order for the Kernel to accept them ;)
> 
> Mauro Carvalho Chehab (13):
>   edac: Create a dimm struct and move the labels into it
>   edac: move dimm properties to struct memset_info
>   edac: Don't initialize csrow's first_page & friends when not needed
>   edac: move nr_pages to dimm struct
>   edac: Fix core support for MC's that see DIMMS instead of ranks
>   edac: Initialize the dimm label with the known information
>   edac: Cleanup the logs for i7core and sb edac drivers
>   i5400_edac: improve debug messages to better represent the filled
>     memory
>   events/hw_event: Create a Hardware Events Report Mecanism (HERM)
>   i5000_edac: Fix the logic that retrieves memory information
>   e752x_edac: provide more info about how DIMMS/ranks are mapped
>   edac: Rename the parent dev to pdev
>   edac: use Documentation-nano format for some data structs
Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>

-- 
Aristeu



* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
@ 2012-03-30 10:50   ` Borislav Petkov
  2012-03-30 13:26     ` Mauro Carvalho Chehab
  2012-04-16  8:41     ` Mauro Carvalho Chehab
  0 siblings, 2 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-03-30 10:50 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:45:34PM -0300, Mauro Carvalho Chehab wrote:
> The way a DIMM is currently represented implies that they're
> linked into a per-csrow struct. However, some drivers don't see
> csrows, as they're hidden behind some chip like the AMB's
> on FBDIMM's, for example.
> 
> This forced drivers to fake a csrow struct, and to make
> a mess of the original csrow/channel concept.
> 
> Move the DIMM labels into a per-DIMM struct, and add there
> the real location of the socket, in terms of csrow/channel.
> Later patches will modify the location to properly represent the
> memory architecture.
> 
> All other drivers will use a per-csrow type of location.
> Some of those drivers will require a later conversion, as
> they also fake the csrows internally.
> 
> TODO: While this patch doesn't change the existing behavior, on
> csrows-based memory controllers, a csrow/channel pair points to a memory
> rank. There's a known bug at the EDAC core that allows having different
> labels for the same DIMM, if it has more than one rank. A later patch
> is needed to merge the several ranks for a DIMM into the same dimm_info
> struct, in order to avoid having different labels for the same DIMM.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/edac_mc.c       |   50 +++++++++++++++++++++++++++++++----------
>  drivers/edac/edac_mc_sysfs.c |   11 ++++-----
>  drivers/edac/i5100_edac.c    |    8 +++---
>  drivers/edac/i7core_edac.c   |    4 +-
>  drivers/edac/i82975x_edac.c  |    2 +-
>  drivers/edac/sb_edac.c       |    4 +-
>  include/linux/edac.h         |   28 +++++++++++++++++++----
>  7 files changed, 75 insertions(+), 32 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 690cbf1..c03bfe7 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -44,7 +44,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  	debugf4("\tchannel = %p\n", chan);
>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>  	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
> -	debugf4("\tchannel->label = '%s'\n", chan->label);
> +	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>  }
>  
> @@ -157,6 +157,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	struct mem_ctl_info *mci;
>  	struct csrow_info *csi, *csrow;
>  	struct rank_info *chi, *chp, *chan;
> +	struct dimm_info *dimm;
>  	void *pvt;
>  	unsigned size;
>  	int row, chn;
> @@ -170,7 +171,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	mci = (struct mem_ctl_info *)0;
>  	csi = edac_align_ptr(&mci[1], sizeof(*csi));
>  	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
> -	pvt = edac_align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
> +	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
> +	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
>  	size = ((unsigned long)pvt) + sz_pvt;
>  
>  	mci = kzalloc(size, GFP_KERNEL);
> @@ -182,11 +184,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 */
>  	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
>  	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
> +	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
>  	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
>  
>  	/* setup index and various internal pointers */
>  	mci->mc_idx = edac_index;
>  	mci->csrows = csi;
> +	mci->dimms  = dimm;
>  	mci->pvt_info = pvt;
>  	mci->nr_csrows = nr_csrows;
>  
> @@ -205,6 +209,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  		}
>  	}
>  
> +	/*
> +	 * By default, assumes that a per-csrow arrangement will be used,
> +	 * as most drivers are based on such assumption.
> +	 */
> +	dimm = mci->dimms;
> +	for (row = 0; row < mci->nr_csrows; row++) {
> +		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
> +			mci->csrows[row].channels[chn].dimm = dimm;
> +			dimm->csrow = row;
> +			dimm->csrow_channel = chn;
> +			dimm++;
> +			mci->nr_dimms++;
> +		}
> +	}

There's a double loop above this one which iterates over the same
things: rows and then channels in each row. So merge that loop with the
one above instead of repeating it here.
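
A minimal user-space sketch of such a merge, assuming the earlier loop in edac_mc_alloc() only sets up the csrow/channel indices and back-pointers. The structs below are trimmed stand-ins with just the fields used in this hunk, and init_layout() is a made-up helper name, not a function in the EDAC core:

```c
#include <assert.h>
#include <stddef.h>

/* Trimmed stand-ins for the EDAC structs; only the fields touched here. */
struct dimm_info {
	int csrow;		/* csrow where the DIMM sits */
	int csrow_channel;	/* channel inside that csrow */
};

struct csrow_info;

struct rank_info {
	int chan_idx;
	struct csrow_info *csrow;
	struct dimm_info *dimm;
};

struct csrow_info {
	int csrow_idx;
	unsigned nr_channels;
	struct rank_info *channels;
};

struct mem_ctl_info {
	unsigned nr_csrows;
	unsigned nr_dimms;
	struct csrow_info *csrows;
	struct dimm_info *dimms;
};

/*
 * Hypothetical helper: one pass over rows and channels that fills in
 * both the csrow/channel bookkeeping and the default per-csrow DIMM
 * mapping, instead of walking the same indices twice.
 */
static void init_layout(struct mem_ctl_info *mci, struct rank_info *chi,
			unsigned nr_chans)
{
	struct dimm_info *dimm;
	unsigned row, chn;

	assert(mci != NULL && chi != NULL);

	dimm = mci->dimms;
	mci->nr_dimms = 0;
	for (row = 0; row < mci->nr_csrows; row++) {
		struct csrow_info *csrow = &mci->csrows[row];

		csrow->csrow_idx = row;
		csrow->nr_channels = nr_chans;
		csrow->channels = &chi[row * nr_chans];

		for (chn = 0; chn < nr_chans; chn++) {
			struct rank_info *chan = &csrow->channels[chn];

			chan->chan_idx = chn;
			chan->csrow = csrow;

			/* Default: one dimm_info per csrow/channel pair. */
			chan->dimm = dimm;
			dimm->csrow = row;
			dimm->csrow_channel = chn;
			dimm++;
			mci->nr_dimms++;
		}
	}
}
```

With two csrows and two channels per csrow, this single pass would leave nr_dimms at 4 and point each channel at its own dimm_info, which is the same default mapping the separate loop in the patch produces.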

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
@ 2012-03-30 13:10   ` Borislav Petkov
  2012-03-30 13:22     ` Mauro Carvalho Chehab
  2012-03-30 17:03   ` Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-03-30 13:10 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

Please don't call it 'memset_info' - this is misleading beyond belief.

Hmm, what struct memset_info - where the hell is this? It is nowhere to
be seen in the patches following that one too, WTF?

On Thu, Mar 29, 2012 at 01:45:35PM -0300, Mauro Carvalho Chehab wrote:
> On systems based on chip select rows, all channels need to use memories
> with the same properties, otherwise the memories on channels A and B
> won't be recognized.
> 
> However, such an assumption is not true for all types of memory
> controllers.
> 
> Controllers for FB-DIMM's don't have such requirements.
> 
> Also, modern Intel controllers seem to be capable of handling such
> differences.
> 
> So, we need to stop storing the DIMM information in per-csrow
> data and store it at the right place instead.
> 
> The first step is to move grain, mtype, dtype and edac_mode to the
> per-dimm struct.
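
To make the last step of the quoted description concrete, here is a small sketch in which the four properties live in a per-DIMM struct, so two channels of the same csrow can describe different memories. The field names grain, mtype, dtype and edac_mode come from the commit message; the enums are reduced subsets whose numeric values do not match include/linux/edac.h, and same_properties() is a hypothetical helper used only for illustration:

```c
#include <assert.h>

/* Reduced subsets; numeric values differ from the kernel's enums. */
enum mem_type  { MEM_DDR3, MEM_RDDR3 };
enum dev_type  { DEV_UNKNOWN = 0, DEV_X4, DEV_X8 };
enum edac_type { EDAC_UNKNOWN = 0, EDAC_SECDED, EDAC_S4ECD4ED };

/* Properties that the patch moves from the csrow to the DIMM. */
struct dimm_info {
	unsigned grain;			/* granularity of reported errors, in bytes */
	enum mem_type mtype;		/* memory type */
	enum dev_type dtype;		/* DRAM chip width */
	enum edac_type edac_mode;	/* error detection/correction mode */
};

/* Each channel of a csrow points at the DIMM that backs it. */
struct rank_info {
	struct dimm_info *dimm;
};

/* Hypothetical helper: do two channels hold memories with equal properties? */
static int same_properties(const struct rank_info *a, const struct rank_info *b)
{
	return a->dimm->grain == b->dimm->grain &&
	       a->dimm->mtype == b->dimm->mtype &&
	       a->dimm->dtype == b->dimm->dtype &&
	       a->dimm->edac_mode == b->dimm->edac_mode;
}
```

With per-csrow storage there is only one copy of these fields per row, so the check above could never report a difference between channels. With per-DIMM storage, the difference is representable.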

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-03-30 13:10   ` Borislav Petkov
@ 2012-03-30 13:22     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-30 13:22 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On 30-03-2012 10:10, Borislav Petkov wrote:
> Please don't call it 'memset_info' - this is misleading beyond belief.
> 
> Hmm, what struct memset_info - where the hell is this? It is nowhere to
> be seen in the patches following that one too, WTF?

s/memset_info/dimm_info/

Sorry, I forgot to update the patch subject in the rebase where I renamed
this struct (rebase 8?).

> On Thu, Mar 29, 2012 at 01:45:35PM -0300, Mauro Carvalho Chehab wrote:
>> On systems based on chip select rows, all channels need to use memories
>> with the same properties, otherwise the memories on channels A and B
>> won't be recognized.
>>
>> However, such an assumption is not true for all types of memory
>> controllers.
>>
>> Controllers for FB-DIMM's don't have such requirements.
>>
>> Also, modern Intel controllers seem to be capable of handling such
>> differences.
>>
>> So, we need to stop storing the DIMM information in per-csrow
>> data and store it at the right place instead.
>>
>> The first step is to move grain, mtype, dtype and edac_mode to the
>> per-dimm struct.
> 



* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-03-30 10:50   ` Borislav Petkov
@ 2012-03-30 13:26     ` Mauro Carvalho Chehab
  2012-03-30 15:38       ` Borislav Petkov
  2012-04-16  8:41     ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-03-30 13:26 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On 30-03-2012 07:50, Borislav Petkov wrote:
> On Thu, Mar 29, 2012 at 01:45:34PM -0300, Mauro Carvalho Chehab wrote:
>> The way a DIMM is currently represented implies that they're
>> linked into a per-csrow struct. However, some drivers don't see
>> csrows, as they're hidden behind some chip like the AMB's
>> on FBDIMM's, for example.
>>
>> This forced drivers to fake a csrow struct, and to make
>> a mess of the original csrow/channel concept.
>>
>> Move the DIMM labels into a per-DIMM struct, and add there
>> the real location of the socket, in terms of csrow/channel.
>> Later patches will modify the location to properly represent the
>> memory architecture.
>>
>> All other drivers will use a per-csrow type of location.
>> Some of those drivers will require a later conversion, as
>> they also fake the csrows internally.
>>
>> TODO: While this patch doesn't change the existing behavior, on
>> csrows-based memory controllers, a csrow/channel pair points to a memory
>> rank. There's a known bug at the EDAC core that allows having different
>> labels for the same DIMM, if it has more than one rank. A later patch
>> is needed to merge the several ranks for a DIMM into the same dimm_info
>> struct, in order to avoid having different labels for the same DIMM.
>>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  drivers/edac/edac_mc.c       |   50 +++++++++++++++++++++++++++++++----------
>>  drivers/edac/edac_mc_sysfs.c |   11 ++++-----
>>  drivers/edac/i5100_edac.c    |    8 +++---
>>  drivers/edac/i7core_edac.c   |    4 +-
>>  drivers/edac/i82975x_edac.c  |    2 +-
>>  drivers/edac/sb_edac.c       |    4 +-
>>  include/linux/edac.h         |   28 +++++++++++++++++++----
>>  7 files changed, 75 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index 690cbf1..c03bfe7 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -44,7 +44,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>>  	debugf4("\tchannel = %p\n", chan);
>>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>>  	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
>> -	debugf4("\tchannel->label = '%s'\n", chan->label);
>> +	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>>  }
>>  
>> @@ -157,6 +157,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	struct mem_ctl_info *mci;
>>  	struct csrow_info *csi, *csrow;
>>  	struct rank_info *chi, *chp, *chan;
>> +	struct dimm_info *dimm;
>>  	void *pvt;
>>  	unsigned size;
>>  	int row, chn;
>> @@ -170,7 +171,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	mci = (struct mem_ctl_info *)0;
>>  	csi = edac_align_ptr(&mci[1], sizeof(*csi));
>>  	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
>> -	pvt = edac_align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
>> +	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
>> +	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
>>  	size = ((unsigned long)pvt) + sz_pvt;
>>  
>>  	mci = kzalloc(size, GFP_KERNEL);
>> @@ -182,11 +184,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	 */
>>  	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
>>  	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
>> +	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
>>  	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
>>  
>>  	/* setup index and various internal pointers */
>>  	mci->mc_idx = edac_index;
>>  	mci->csrows = csi;
>> +	mci->dimms  = dimm;
>>  	mci->pvt_info = pvt;
>>  	mci->nr_csrows = nr_csrows;
>>  
>> @@ -205,6 +209,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  		}
>>  	}
>>  
>> +	/*
>> +	 * By default, assumes that a per-csrow arrangement will be used,
>> +	 * as most drivers are based on such assumption.
>> +	 */
>> +	dimm = mci->dimms;
>> +	for (row = 0; row < mci->nr_csrows; row++) {
>> +		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
>> +			mci->csrows[row].channels[chn].dimm = dimm;
>> +			dimm->csrow = row;
>> +			dimm->csrow_channel = chn;
>> +			dimm++;
>> +			mci->nr_dimms++;
>> +		}
>> +	}
> 
> There's a double loop above this one which iterates over the same
> things: rows and then channels in each row. So merge that loop with the
> one above instead of repeating it here.
> 

The first loop will disappear in the patch I'm writing right now, which
changes edac_mc_alloc() to use per-csrow/per-dimm/per-channel kzalloc().

Regards,
Mauro




* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-03-30 13:26     ` Mauro Carvalho Chehab
@ 2012-03-30 15:38       ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-03-30 15:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Fri, Mar 30, 2012 at 10:26:13AM -0300, Mauro Carvalho Chehab wrote:
> > There's a double loop above this one which iterates over the same
> > things: rows and then channels in each row. So merge that loop with the
> > one above instead of repeating it here.
> 
> The first loop will disappear in the patch I'm writing right now, which
> changes edac_mc_alloc() to use per-csrow/per-dimm/per-channel kzalloc().

Nevertheless, please fix it as suggested.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
  2012-03-30 13:10   ` Borislav Petkov
@ 2012-03-30 17:03   ` Borislav Petkov
  2012-04-16  8:56     ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-03-30 17:03 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:45:35PM -0300, Mauro Carvalho Chehab wrote:
> On systems based on chip select rows, all channels need to use memories
> with the same properties, otherwise the memories on channels A and B
> won't be recognized.
> 
> However, such an assumption is not true for all types of memory
> controllers.
> 
> Controllers for FB-DIMM's don't have such requirements.
> 
> Also, modern Intel controllers seem to be capable of handling such
> differences.
> 
> So, we need to stop storing the DIMM information in per-csrow
> data and store it at the right place instead.
> 
> The first step is to move grain, mtype, dtype and edac_mode to the
> per-dimm struct.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/amd64_edac.c      |   30 +++++++++++--------
>  drivers/edac/amd76x_edac.c     |   10 ++++--
>  drivers/edac/cell_edac.c       |   10 +++++-
>  drivers/edac/cpc925_edac.c     |   62 +++++++++++++++++++++------------------
>  drivers/edac/e752x_edac.c      |   44 +++++++++++++++-------------
>  drivers/edac/e7xxx_edac.c      |   44 ++++++++++++++++------------
>  drivers/edac/edac_mc.c         |   19 ++++++++----
>  drivers/edac/edac_mc_sysfs.c   |    6 ++--
>  drivers/edac/i3000_edac.c      |   18 ++++++-----
>  drivers/edac/i3200_edac.c      |   18 ++++++-----
>  drivers/edac/i5000_edac.c      |   24 +++++++--------
>  drivers/edac/i5100_edac.c      |   38 +++++++++++++-----------
>  drivers/edac/i5400_edac.c      |   24 ++++++---------
>  drivers/edac/i7300_edac.c      |   25 +++++++++------
>  drivers/edac/i7core_edac.c     |   27 ++++++++---------
>  drivers/edac/i82443bxgx_edac.c |   13 +++++---
>  drivers/edac/i82860_edac.c     |   11 ++++--
>  drivers/edac/i82875p_edac.c    |   17 ++++++++---
>  drivers/edac/i82975x_edac.c    |   17 +++++++----
>  drivers/edac/mpc85xx_edac.c    |   13 +++++---
>  drivers/edac/mv64x60_edac.c    |   18 ++++++-----
>  drivers/edac/pasemi_edac.c     |   10 ++++--
>  drivers/edac/ppc4xx_edac.c     |   13 +++++---
>  drivers/edac/r82600_edac.c     |   10 ++++--
>  drivers/edac/sb_edac.c         |   31 +++++++++++---------
>  drivers/edac/tile_edac.c       |   13 ++++----
>  drivers/edac/x38_edac.c        |   17 ++++++-----
>  include/linux/edac.h           |   21 ++++++++-----
>  28 files changed, 340 insertions(+), 263 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index c9eee6d..3e7bddc 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2168,7 +2168,9 @@ static int init_csrows(struct mem_ctl_info *mci)
>  	struct amd64_pvt *pvt = mci->pvt_info;
>  	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
>  	u32 val;
> -	int i, empty = 1;
> +	int i, j, empty = 1;
> +	enum mem_type mtype;
> +	enum edac_type edac_mode;
>  
>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>  
> @@ -2202,7 +2204,21 @@ static int init_csrows(struct mem_ctl_info *mci)
>  		csrow->page_mask = ~mask;
>  		/* 8 bytes of resolution */
>  
> -		csrow->mtype = amd64_determine_memory_type(pvt, i);
> +		mtype = amd64_determine_memory_type(pvt, i);
> +
> +		/*
> +		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> +		 */
> +		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
> +			edac_mode = (pvt->nbcfg & NBCFG_CHIPKILL) ?
> +				    EDAC_S4ECD4ED : EDAC_SECDED;
> +		else
> +			edac_mode = EDAC_NONE;
> +
> +		for (j = 0; j < pvt->channel_count; j++) {
> +			csrow->channels[j].dimm->mtype = mtype;
> +			csrow->channels[j].dimm->edac_mode = edac_mode;
> +		}
>  
>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>  		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
> @@ -2214,16 +2230,6 @@ static int init_csrows(struct mem_ctl_info *mci)
>  			"last_page: 0x%lx\n",
>  			(unsigned)csrow->nr_pages,
>  			csrow->first_page, csrow->last_page);
> -
> -		/*
> -		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> -		 */
> -		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
> -			csrow->edac_mode =
> -			    (pvt->nbcfg & NBCFG_CHIPKILL) ?
> -			    EDAC_S4ECD4ED : EDAC_SECDED;
> -		else
> -			csrow->edac_mode = EDAC_NONE;

This looks like needless code movement. Please leave it where it is
now and add the for-loop after it, instead of pulling it up and causing
churn.
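
One way to read this request is sketched below: decide the EDAC mode once
per csrow, then let the new loop only copy the result into each dimm. This
is a minimal, self-contained illustration, not the driver code. The structs
are reduced stand-ins for the kernel types, the two helper functions are
hypothetical, and the NBCFG bit positions are assumptions chosen for the
sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum edac_type { EDAC_NONE, EDAC_SECDED, EDAC_S4ECD4ED };

/* Assumed bit positions, used only for this sketch. */
#define NBCFG_ECC_ENABLE (1u << 22)
#define NBCFG_CHIPKILL   (1u << 23)

struct dimm_info { enum edac_type edac_mode; };
struct rank_info { struct dimm_info *dimm; };

/* Hypothetical helper: CHIPKILL, plain ECC, or no ECC at all. */
static enum edac_type nbcfg_to_edac_mode(uint32_t nbcfg)
{
	if (!(nbcfg & NBCFG_ECC_ENABLE))
		return EDAC_NONE;
	return (nbcfg & NBCFG_CHIPKILL) ? EDAC_S4ECD4ED : EDAC_SECDED;
}

/* Hypothetical helper: copy the per-csrow mode into each channel's dimm. */
static void set_csrow_edac_mode(struct rank_info *channels, int nr_channels,
				uint32_t nbcfg)
{
	enum edac_type edac_mode = nbcfg_to_edac_mode(nbcfg);
	int j;

	assert(channels && nr_channels >= 0);
	for (j = 0; j < nr_channels; j++)
		channels[j].dimm->edac_mode = edac_mode;
}
```

With this ordering the mode decision stays in one place and the per-dimm
loop contains nothing but assignments.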

< snip drivers I'm not maintaining >

> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index c03bfe7..2430ddb 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -43,7 +43,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  {
>  	debugf4("\tchannel = %p\n", chan);
>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
> -	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
> +	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
>  	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>  }
> @@ -698,6 +698,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  {
>  	unsigned long remapped_page;
>  	char *label = NULL;
> +	u32 grain;
>  
>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>  
> @@ -722,6 +723,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  	}
>  
>  	label = mci->csrows[row].channels[channel].dimm->label;
> +	grain = mci->csrows[row].channels[channel].dimm->grain;
>  
>  	if (edac_mc_get_log_ce())
>  		/* FIXME - put in DIMM location */
> @@ -729,11 +731,12 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
>  			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
>  			page_frame_number, offset_in_page,
> -			mci->csrows[row].grain, syndrome, row, channel,
> +			grain, syndrome, row, channel,
>  			label, msg);
>  
>  	mci->ce_count++;
>  	mci->csrows[row].ce_count++;
> +	mci->csrows[row].channels[channel].dimm->ce_count++;
>  	mci->csrows[row].channels[channel].ce_count++;
>  
>  	if (mci->scrub_mode & SCRUB_SW_SRC) {
> @@ -750,8 +753,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  			mci->ctl_page_to_phys(mci, page_frame_number) :
>  			page_frame_number;
>  
> -		edac_mc_scrub_block(remapped_page, offset_in_page,
> -				mci->csrows[row].grain);
> +		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
>  	}
>  }
>  EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
> @@ -777,6 +779,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>  	int chan;
>  	int chars;
>  	char *label = NULL;
> +	u32 grain;
>  
>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>  
> @@ -790,6 +793,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>  		return;
>  	}
>  
> +	grain = mci->csrows[row].channels[0].dimm->grain;
>  	label = mci->csrows[row].channels[0].dimm->label;
>  	chars = snprintf(pos, len + 1, "%s", label);
>  	len -= chars;
> @@ -807,14 +811,13 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>  		edac_mc_printk(mci, KERN_EMERG,
>  			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
>  			"labels \"%s\": %s\n", page_frame_number,
> -			offset_in_page, mci->csrows[row].grain, row,
> -			labels, msg);
> +			offset_in_page, grain, row, labels, msg);
>  
>  	if (edac_mc_get_panic_on_ue())
>  		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
>  			"row %d, labels \"%s\": %s\n", mci->mc_idx,
>  			page_frame_number, offset_in_page,
> -			mci->csrows[row].grain, row, labels, msg);
> +			grain, row, labels, msg);
>  
>  	mci->ue_count++;
>  	mci->csrows[row].ue_count++;
> @@ -886,6 +889,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
>  	chars = snprintf(pos, len + 1, "%s", label);
>  	len -= chars;
>  	pos += chars;
> +

Useless \n.

>  	chars = snprintf(pos, len + 1, "-%s",
>  			mci->csrows[csrow].channels[channelb].dimm->label);
>  
> @@ -939,6 +943,7 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
>  
>  	mci->ce_count++;
>  	mci->csrows[csrow].ce_count++;
> +	mci->csrows[csrow].channels[channel].dimm->ce_count++;
>  	mci->csrows[csrow].channels[channel].ce_count++;
>  }
>  EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index c83697c..d63904e 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -150,19 +150,19 @@ static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
>  static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
>  				int private)
>  {
> -	return sprintf(data, "%s\n", mem_types[csrow->mtype]);
> +	return sprintf(data, "%s\n", mem_types[csrow->channels[0].dimm->mtype]);

							       ^^^

This looks strange: why always channel 0? Because it is always defined?

>  }
>  
>  static ssize_t csrow_dev_type_show(struct csrow_info *csrow, char *data,
>  				int private)
>  {
> -	return sprintf(data, "%s\n", dev_types[csrow->dtype]);
> +	return sprintf(data, "%s\n", dev_types[csrow->channels[0].dimm->dtype]);

ditto.

>  }
>  
>  static ssize_t csrow_edac_mode_show(struct csrow_info *csrow, char *data,
>  				int private)
>  {
> -	return sprintf(data, "%s\n", edac_caps[csrow->edac_mode]);
> +	return sprintf(data, "%s\n", edac_caps[csrow->channels[0].dimm->edac_mode]);

ditto.

<snip more drivers I don't maintain >

Reminder: you need to get Acks at least from those driver
maintainers who are still active.

> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index f40b835..5244193 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -314,6 +314,13 @@ struct dimm_info {
>  	unsigned memory_controller;
>  	unsigned csrow;
>  	unsigned csrow_channel;
> +
> +	u32 grain;		/* granularity of reported error in bytes */
> +	enum dev_type dtype;	/* memory device type */
> +	enum mem_type mtype;	/* memory dimm type */
> +	enum edac_type edac_mode;	/* EDAC mode for this dimm */
> +
> +	u32 ce_count;		/* Correctable Errors for this dimm */
>  };
>  
>  /**
> @@ -339,19 +346,17 @@ struct rank_info {
>  };
>  
>  struct csrow_info {
> -	unsigned long first_page;	/* first page number in dimm */
> -	unsigned long last_page;	/* last page number in dimm */
> +	unsigned long first_page;	/* first page number in csrow */
> +	unsigned long last_page;	/* last page number in csrow */
> +	u32 nr_pages;			/* number of pages in csrow */
>  	unsigned long page_mask;	/* used for interleaving -
>  					 * 0UL for non intlv
>  					 */
> -	u32 nr_pages;		/* number of pages in csrow */
> -	u32 grain;		/* granularity of reported error in bytes */
> -	int csrow_idx;		/* the chip-select row */
> -	enum dev_type dtype;	/* memory device type */
> +	int csrow_idx;			/* the chip-select row */
> +
>  	u32 ue_count;		/* Uncorrectable Errors for this csrow */
>  	u32 ce_count;		/* Correctable Errors for this csrow */
> -	enum mem_type mtype;	/* memory csrow type */
> -	enum edac_type edac_mode;	/* EDAC mode for this csrow */
> +
>  	struct mem_ctl_info *mci;	/* the parent */
>  
>  	struct kobject kobj;	/* sysfs kobject for this csrow */
> -- 
> 1.7.8


-- 
Regards/Gruss,
Boris.



* Re: [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed
  2012-03-29 16:45 ` [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
@ 2012-04-02 12:33   ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-02 12:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:45:36PM -0300, Mauro Carvalho Chehab wrote:
> Almost all edac drivers initialize first_page, last_page and
> page_mask.

Please call them csrow_info->first_page ... or csrow_info.first_page...
so that we know which struct they are members of.

> Those vars are used inside the EDAC core, in order to
> calculate the csrow affected by an error, by using the routine
> edac_mc_find_csrow_by_page().
> 
> However, very few drivers actually use it:
>         e752x_edac.c
>         e7xxx_edac.c
>         i3000_edac.c
>         i82443bxgx_edac.c
>         i82860_edac.c
>         i82875p_edac.c
>         i82975x_edac.c
>         r82600_edac.c
> 
> There also a few other drivers that have their own calculus
> formula internally using those vars.
> 
> All the others are just wasting time by initializing those
> data.
> 
> While initializing data without using them won't cause any troubles, as
> those information is stored at the wrong place (at csrows structure), it
> is better to remove what is unused, in order to simplify the next patch.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/amd64_edac.c   |   38 ++------------------------------------

For the amd64_edac.c bits:

Acked-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.



* Re: [PATCH 04/13] edac: move nr_pages to dimm struct
  2012-03-29 16:45 ` [PATCH 04/13] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
@ 2012-04-02 13:18   ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-02 13:18 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Thu, Mar 29, 2012 at 01:45:37PM -0300, Mauro Carvalho Chehab wrote:
> The number of pages is a dimm property. Move it to the dimm struct.
> 
> After this change, it is possible to add sysfs nodes for the DIMM's that

							       DIMMs

> will properly represent the DIMM stick properties, including its size.
> 
> A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
> the memory controller represents the memory via chip select rows.
> 
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/amd64_edac.c      |   13 ++++------
>  drivers/edac/amd76x_edac.c     |    6 ++--
>  drivers/edac/cell_edac.c       |    8 ++++--
>  drivers/edac/cpc925_edac.c     |    8 ++++--
>  drivers/edac/e752x_edac.c      |    6 +++-
>  drivers/edac/e7xxx_edac.c      |    5 ++-
>  drivers/edac/edac_mc.c         |   28 ++++++++++++---------
>  drivers/edac/edac_mc_sysfs.c   |   52 ++++++++++++++++++++++++++++++----------
>  drivers/edac/i3000_edac.c      |    6 +++-
>  drivers/edac/i3200_edac.c      |    3 +-
>  drivers/edac/i5000_edac.c      |   14 ++++++----
>  drivers/edac/i5100_edac.c      |   22 ++++++++++-------
>  drivers/edac/i5400_edac.c      |    9 ++----
>  drivers/edac/i7300_edac.c      |   22 ++++------------
>  drivers/edac/i7core_edac.c     |   10 ++-----
>  drivers/edac/i82443bxgx_edac.c |    2 +-
>  drivers/edac/i82860_edac.c     |    2 +-
>  drivers/edac/i82875p_edac.c    |    5 ++-
>  drivers/edac/i82975x_edac.c    |   11 ++++++--
>  drivers/edac/mpc85xx_edac.c    |    3 +-
>  drivers/edac/mv64x60_edac.c    |    3 +-
>  drivers/edac/pasemi_edac.c     |   14 +++++-----
>  drivers/edac/ppc4xx_edac.c     |    5 ++-
>  drivers/edac/r82600_edac.c     |    3 +-
>  drivers/edac/sb_edac.c         |    8 +----
>  drivers/edac/tile_edac.c       |    2 +-
>  drivers/edac/x38_edac.c        |    4 +-
>  include/linux/edac.h           |   10 ++++---
>  28 files changed, 158 insertions(+), 126 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index b1b1551..ad0376e 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>  
>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>  
> -	/*
> -	 * If dual channel then double the memory size of single channel.
> -	 * Channel count is 1 or 2
> -	 */
> -	nr_pages <<= (pvt->channel_count - 1);
> -
>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>  	debugf0("    nr_pages= %u  channel-count = %d\n",
>  		nr_pages, pvt->channel_count);
> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  	int i, j, empty = 1;
>  	enum mem_type mtype;
>  	enum edac_type edac_mode;
> +	int nr_pages;
>  
>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>  
> @@ -2174,7 +2169,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  			i, pvt->mc_node_id);
>  
>  		empty = 0;
> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>  		/* 8 bytes of resolution */
>  
> @@ -2192,10 +2187,12 @@ static int init_csrows(struct mem_ctl_info *mci)
>  		for (j = 0; j < pvt->channel_count; j++) {
>  			csrow->channels[j].dimm->mtype = mtype;
>  			csrow->channels[j].dimm->edac_mode = edac_mode;
> +			csrow->channels[j].dimm->nr_pages = nr_pages;
> +

One newline too many.

>  		}
>  
>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
> +		debugf1("    nr_pages: %u\n", nr_pages);
>  	}
>  
>  	return empty;


<snip>

> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 2430ddb..02263c3 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -43,22 +43,22 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  {
>  	debugf4("\tchannel = %p\n", chan);
>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
> -	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
> -	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
> +	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
> +	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
> +	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
>  }
>  
>  static void edac_mc_dump_csrow(struct csrow_info *csrow)
>  {
>  	debugf4("\tcsrow = %p\n", csrow);
>  	debugf4("\tcsrow->csrow_idx = %d\n", csrow->csrow_idx);
> -	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
> -	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
> -	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
> -	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
>  	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
>  	debugf4("\tcsrow->channels = %p\n", csrow->channels);
>  	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
> +	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
> +	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
> +	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);

Useless code churn: simply remove the ->nr_pages line.

>  }
>  
>  static void edac_mc_dump_mci(struct mem_ctl_info *mci)
> @@ -655,15 +655,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
>  int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
>  {
>  	struct csrow_info *csrows = mci->csrows;
> -	int row, i;
> +	int row, i, j, n;
>  
>  	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
>  	row = -1;
>  
>  	for (i = 0; i < mci->nr_csrows; i++) {
>  		struct csrow_info *csrow = &csrows[i];
> -
> -		if (csrow->nr_pages == 0)
> +		n = 0;
> +		for (j = 0; j < csrow->nr_channels; j++) {
> +			struct dimm_info *dimm = csrow->channels[j].dimm;
> +			n += dimm->nr_pages;
> +		}
> +		if (n == 0)
>  			continue;
>  
>  		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
> @@ -672,9 +676,9 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
>  			csrow->page_mask);
>  
>  		if ((page >= csrow->first_page) &&
> -		    (page <= csrow->last_page) &&
> -		    ((page & csrow->page_mask) ==
> -		     (csrow->first_page & csrow->page_mask))) {
> +		(page <= csrow->last_page) &&
> +		((page & csrow->page_mask) ==
> +		(csrow->first_page & csrow->page_mask))) {

Useless and wrong formatting change, please retain the original formatting.
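
For reference, the condition with its original alignment can be read as a
small predicate. The sketch below is only an illustration: struct
csrow_range is a reduced stand-in for the three csrow_info fields involved,
and csrow_matches_page() is a hypothetical helper, not a function in the
EDAC core.

```c
#include <assert.h>
#include <stdbool.h>

/* Reduced stand-in for struct csrow_info; only the fields used here. */
struct csrow_range {
	unsigned long first_page;
	unsigned long last_page;
	unsigned long page_mask;	/* 0UL when not interleaved */
};

/* True when 'page' lies in the csrow's range and on its interleave slot. */
static bool csrow_matches_page(const struct csrow_range *csrow,
			       unsigned long page)
{
	assert(csrow);
	return (page >= csrow->first_page) &&
	       (page <= csrow->last_page) &&
	       ((page & csrow->page_mask) ==
		(csrow->first_page & csrow->page_mask));
}
```

Aligning the continuation lines under the first operand keeps the nesting
of the masked comparison visible, which the reformatted hunk loses.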

>  			row = i;
>  			break;
>  		}
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index d63904e..52c56cf 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
>  static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
>  				int private)
>  {
> -	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
> +	int i;
> +	u32 nr_pages = 0;
> +
> +	for (i = 0; i < csrow->nr_channels; i++)
> +		nr_pages += csrow->channels[i].dimm->nr_pages;
> +
> +	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
>  }
>  
>  static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
> @@ -519,16 +525,17 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
>  
>  static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
>  {
> -	int total_pages, csrow_idx;
> +	int total_pages, csrow_idx, j;
>  
>  	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
> -		csrow_idx++) {
> +	     csrow_idx++) {

You can do the total_pages initialization in the declaration above and
thus don't have to break the loop across two lines:

	int total_pages = 0;
	...

	for (csrow_idx = 0; csrow_idx < mci->nr_csrows; csrow_idx++) {
	...

>  		struct csrow_info *csrow = &mci->csrows[csrow_idx];
>  
> -		if (!csrow->nr_pages)
> -			continue;
> +		for (j = 0; j < csrow->nr_channels; j++) {
> +			struct dimm_info *dimm = csrow->channels[j].dimm;
>  
> -		total_pages += csrow->nr_pages;
> +			total_pages += dimm->nr_pages;
> +		}
>  	}
>  
>  	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
> @@ -900,7 +907,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
>   */
>  int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
>  {
> -	int i;
> +	int i, j;
>  	int err;
>  	struct csrow_info *csrow;
>  	struct kobject *kobj_mci = &mci->edac_mci_kobj;
> @@ -934,10 +941,15 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
>  	/* Make directories for each CSROW object under the mc<id> kobject
>  	 */
>  	for (i = 0; i < mci->nr_csrows; i++) {
> +		int n = 0;

Let's give this variable a more descriptive name, like
'nr_pages' or 'pages_total' or whatever, but not simply n.

> +
>  		csrow = &mci->csrows[i];
> +		for (j = 0; j < csrow->nr_channels; j++) {
> +			struct dimm_info *dimm = csrow->channels[j].dimm;
> +			n += dimm->nr_pages;
> +		}

Btw, you can make the loop a single-statement one and drop the curly
braces like so:

		for (j = 0; j < csrow->nr_channels; j++)
			nr_pages += csrow->channels[j].dimm->nr_pages;

The same applies to all the other places where you iterate over
the channels, accumulating the pages per dimm.
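
Since the same accumulation shows up in several functions, it could also be
factored out. The following is a minimal sketch under stated assumptions:
the structs are simplified stand-ins holding only the fields used, and
csrow_total_pages() is a hypothetical helper name that does not exist in
the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structs; only the fields used here. */
struct dimm_info { uint32_t nr_pages; };
struct rank_info { struct dimm_info *dimm; };
struct csrow_info {
	int nr_channels;
	struct rank_info *channels;
};

/* Hypothetical helper: total pages behind one csrow, summed per dimm. */
static uint32_t csrow_total_pages(const struct csrow_info *csrow)
{
	uint32_t nr_pages = 0;
	int j;

	assert(csrow);
	for (j = 0; j < csrow->nr_channels; j++)
		nr_pages += csrow->channels[j].dimm->nr_pages;

	return nr_pages;
}
```

A caller would then test `csrow_total_pages(csrow) > 0` where the old code
tested `csrow->nr_pages > 0`.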

>  
> -		/* Only expose populated CSROWs */
> -		if (csrow->nr_pages > 0) {
> +		if (n > 0) {
>  			err = edac_create_csrow_object(mci, csrow, i);
>  			if (err) {
>  				debugf1("%s() failure: create csrow %d obj\n",
> @@ -949,10 +961,16 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
>  
>  	return 0;
>  
> -	/* CSROW error: backout what has already been registered,  */
>  fail1:
>  	for (i--; i >= 0; i--) {
> -		if (mci->csrows[i].nr_pages > 0)
> +		int n = 0;

ditto.

> +
> +		csrow = &mci->csrows[i];
> +		for (j = 0; j < csrow->nr_channels; j++) {
> +			struct dimm_info *dimm = csrow->channels[j].dimm;
> +			n += dimm->nr_pages;
> +		}
> +		if (n > 0)
>  			kobject_put(&mci->csrows[i].kobj);
>  	}
>  
> @@ -972,14 +990,22 @@ fail0:
>   */
>  void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
>  {
> -	int i;
> +	struct csrow_info *csrow;
> +	int i, j;
>  
>  	debugf0("%s()\n", __func__);
>  
>  	/* remove all csrow kobjects */
>  	debugf4("%s()  unregister this mci kobj\n", __func__);
>  	for (i = 0; i < mci->nr_csrows; i++) {
> -		if (mci->csrows[i].nr_pages > 0) {
> +		int n = 0;

ditto.

> +
> +		csrow = &mci->csrows[i];
> +		for (j = 0; j < csrow->nr_channels; j++) {
> +			struct dimm_info *dimm = csrow->channels[j].dimm;
> +			n += dimm->nr_pages;
> +		}
> +		if (n > 0) {
>  			debugf0("%s()  unreg csrow-%d\n", __func__, i);
>  			kobject_put(&mci->csrows[i].kobj);
>  		}

<snip>

> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 5244193..de22d4c 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -320,6 +320,8 @@ struct dimm_info {
>  	enum mem_type mtype;	/* memory dimm type */
>  	enum edac_type edac_mode;	/* EDAC mode for this dimm */
>  
> +	u32 nr_pages;			/* number of pages in csrow */
> +
>  	u32 ce_count;		/* Correctable Errors for this dimm */
>  };
>  
> @@ -346,13 +348,13 @@ struct rank_info {
>  };
>  
>  struct csrow_info {
> +	int csrow_idx;			/* the chip-select row */
> +
> +	/* Used only by edac_mc_find_csrow_by_page() */
>  	unsigned long first_page;	/* first page number in csrow */
>  	unsigned long last_page;	/* last page number in csrow */
> -	u32 nr_pages;			/* number of pages in csrow */
>  	unsigned long page_mask;	/* used for interleaving -
> -					 * 0UL for non intlv
> -					 */
> -	int csrow_idx;			/* the chip-select row */
> +					 * 0UL for non intlv */

Useless moving of code around.

>  
>  	u32 ue_count;		/* Uncorrectable Errors for this csrow */
>  	u32 ce_count;		/* Correctable Errors for this csrow */
> -- 
> 1.7.8

-- 
Regards/Gruss,
Boris.



* Re: [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (13 preceding siblings ...)
  2012-03-29 20:46 ` [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Aristeu Rozanski Filho
@ 2012-04-02 13:59 ` Borislav Petkov
  2012-04-16 12:58   ` Mauro Carvalho Chehab
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
  15 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-02 13:59 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski Filho

On Thu, Mar 29, 2012 at 01:45:33PM -0300, Mauro Carvalho Chehab wrote:
> This is the 12th and final rebase of this patch series.
> 
> It is the first patchset for the EDAC rewrite. On this patchset,
> there are all the internal changes at the EDAC core, needed
> to properly represent memories at modern memory controllers that
> aren't oriented per rank/channel.
> 
> It is needed in order to fix a long-term bug at the EDAC drivers
> for the Intel memory controllers deployed since 2005 (well, in fact,
> there is one Rambus that it is older, but also suffers from the same
> syndrome), including the drivers for the recent Intel Nehalem and
> Sandy Bridge architectures.
> 
> The new EDAC architecture supports both per rank/channel memory
> controllers and per-DIMM ones.
> 
> On this changeset, there are no changes at the sysfs nodes. Just 
> like before this changeset, non-per-rank memory controllers 
> will expose memories as "virtual csrows/virtual channels[1].
> 
> [1] It sounds better to say "virtual" than to admit that all
> EDAC Intel drivers since 2005 need to lie about their age to
> the EDAC core, in order for the Kernel to accept them ;)
> 
> Mauro Carvalho Chehab (13):
>   edac: Create a dimm struct and move the labels into it
>   edac: move dimm properties to struct memset_info
>   edac: Don't initialize csrow's first_page & friends when not needed
>   edac: move nr_pages to dimm struct
>   edac: Fix core support for MC's that see DIMMS instead of ranks

I was wondering why 6/13 doesn't apply cleanly, but the patch above,
5/13, is missing from the submission. It looks like vger has eaten it,
at least for the linux-edac mailing list; the patch is still on lkml
though.

And what a patch it is: almost 5000 lines.

Please split it!

And don't tell me it cannot be done: each patch needs to do one thing
and one thing only. From looking at this monster, here's one possible
way to split it:

* add all changes to include/linux/edac.h
* a bunch of changes to edac_mc.c like edac_align_ptr etc
* changes to edac_mc_alloc
* add edac_mc_handle_error
* switch old edac_mc_handle* stuff to edac_mc_handle_error
...<a bunch more>

This way the code will be much easier to review.

-- 
Regards/Gruss,
Boris.



* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-03-30 10:50   ` Borislav Petkov
  2012-03-30 13:26     ` Mauro Carvalho Chehab
@ 2012-04-16  8:41     ` Mauro Carvalho Chehab
  2012-04-16 11:02       ` Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16  8:41 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On 30-03-2012 07:50, Borislav Petkov wrote:
> On Thu, Mar 29, 2012 at 01:45:34PM -0300, Mauro Carvalho Chehab wrote:
>> The way a DIMM is currently represented implies that they're
>> linked into a per-csrow struct. However, some drivers don't see
>> csrows, as they're ridden behind some chip like the AMB's
>> on FBDIMM's, for example.
>>
>> This forced drivers to fake a csrow struct, and to create
>> a mess under csrow/channel original's concept.
>>
>> Move the DIMM labels into a per-DIMM struct, and add there
>> the real location of the socket, in terms of csrow/channel.
>> Latter patches will modify the location to properly represent the
>> memory architecture.
>>
>> All other drivers will use a per-csrow type of location.
>> Some of those drivers will require a latter conversion, as
>> they also fake the csrows internally.
>>
>> TODO: While this patch doesn't change the existing behavior, on
>> csrows-based memory controllers, a csrow/channel pair points to a memory
>> rank. There's a known bug at the EDAC core that allows having different
>> labels for the same DIMM, if it has more than one rank. A latter patch
>> is need to merge the several ranks for a DIMM into the same dimm_info
>> struct, in order to avoid having different labels for the same DIMM.
>>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  drivers/edac/edac_mc.c       |   50 +++++++++++++++++++++++++++++++----------
>>  drivers/edac/edac_mc_sysfs.c |   11 ++++-----
>>  drivers/edac/i5100_edac.c    |    8 +++---
>>  drivers/edac/i7core_edac.c   |    4 +-
>>  drivers/edac/i82975x_edac.c  |    2 +-
>>  drivers/edac/sb_edac.c       |    4 +-
>>  include/linux/edac.h         |   28 +++++++++++++++++++----
>>  7 files changed, 75 insertions(+), 32 deletions(-)
>>
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index 690cbf1..c03bfe7 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -44,7 +44,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>>  	debugf4("\tchannel = %p\n", chan);
>>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>>  	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
>> -	debugf4("\tchannel->label = '%s'\n", chan->label);
>> +	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>>  }
>>  
>> @@ -157,6 +157,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	struct mem_ctl_info *mci;
>>  	struct csrow_info *csi, *csrow;
>>  	struct rank_info *chi, *chp, *chan;
>> +	struct dimm_info *dimm;
>>  	void *pvt;
>>  	unsigned size;
>>  	int row, chn;
>> @@ -170,7 +171,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	mci = (struct mem_ctl_info *)0;
>>  	csi = edac_align_ptr(&mci[1], sizeof(*csi));
>>  	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
>> -	pvt = edac_align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
>> +	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
>> +	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
>>  	size = ((unsigned long)pvt) + sz_pvt;
>>  
>>  	mci = kzalloc(size, GFP_KERNEL);
>> @@ -182,11 +184,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	 */
>>  	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
>>  	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
>> +	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
>>  	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
>>  
>>  	/* setup index and various internal pointers */
>>  	mci->mc_idx = edac_index;
>>  	mci->csrows = csi;
>> +	mci->dimms  = dimm;
>>  	mci->pvt_info = pvt;
>>  	mci->nr_csrows = nr_csrows;
>>  
>> @@ -205,6 +209,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  		}
>>  	}
>>  
>> +	/*
>> +	 * By default, assumes that a per-csrow arrangement will be used,
>> +	 * as most drivers are based on such assumption.
>> +	 */
>> +	dimm = mci->dimms;
>> +	for (row = 0; row < mci->nr_csrows; row++) {
>> +		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
>> +			mci->csrows[row].channels[chn].dimm = dimm;
>> +			dimm->csrow = row;
>> +			dimm->csrow_channel = chn;
>> +			dimm++;
>> +			mci->nr_dimms++;
>> +		}
>> +	}
> 
> There's a double loop above this one which iterates over the same
> things: rows and then channels in each row. So merge that loop with the
> one above instead of repeating it here.

This is not a double loop. Instead, this is a preparation step made to simplify
the following patches.

This loop initializes the dimm structs, while the first one initializes the
(virtual) csrows/channels.

At this point in the series, dimms and ranks are still the same thing, because the
patches that add support for different memory hierarchies haven't been applied yet,
as there are more changes to be done.

In the following patches, this loop will be transformed to:

	for (i = 0; i < tot_dimms; i++) {
...
	}

while the first loop will keep the per csrows/nr_channels range.

So, there's nothing here to be changed.
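
To make the split between the two passes concrete, here is a minimal userspace
sketch. It is not the kernel code: the sk_* names and the fixed 2x2 geometry are
assumptions made for illustration only. The first pass fills the (virtual)
csrow/channel layout, the second one binds one dimm to each csrow/channel pair,
as this patch does.

```c
#include <assert.h>

/* Hypothetical geometry, chosen only for this sketch. */
#define SK_NR_CSROWS 2
#define SK_NR_CHANS  2

/* Simplified stand-ins for dimm_info, rank_info, csrow_info, mem_ctl_info. */
struct sk_dimm {
	unsigned csrow;
	unsigned csrow_channel;
};

struct sk_chan {
	int chan_idx;
	struct sk_dimm *dimm;
};

struct sk_csrow {
	int csrow_idx;
	unsigned nr_channels;
	struct sk_chan channels[SK_NR_CHANS];
};

struct sk_mci {
	unsigned nr_csrows;
	unsigned nr_dimms;
	struct sk_csrow csrows[SK_NR_CSROWS];
	struct sk_dimm dimms[SK_NR_CSROWS * SK_NR_CHANS];
};

/* Pass 1: legacy (virtual) csrow/channel layout. */
static void sk_init_csrows(struct sk_mci *mci)
{
	unsigned row, chn;

	mci->nr_csrows = SK_NR_CSROWS;
	for (row = 0; row < mci->nr_csrows; row++) {
		struct sk_csrow *csrow = &mci->csrows[row];

		csrow->csrow_idx = (int)row;
		csrow->nr_channels = SK_NR_CHANS;
		for (chn = 0; chn < csrow->nr_channels; chn++)
			csrow->channels[chn].chan_idx = (int)chn;
	}
}

/* Pass 2: one dimm per csrow/channel pair, the default arrangement. */
static void sk_init_dimms(struct sk_mci *mci)
{
	struct sk_dimm *dimm = mci->dimms;
	unsigned row, chn;

	mci->nr_dimms = 0;
	for (row = 0; row < mci->nr_csrows; row++) {
		for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {
			mci->csrows[row].channels[chn].dimm = dimm;
			dimm->csrow = row;
			dimm->csrow_channel = chn;
			dimm++;
			mci->nr_dimms++;
		}
	}
	/* The dimm array must never be overrun. */
	assert(mci->nr_dimms <= SK_NR_CSROWS * SK_NR_CHANS);
}
```

Keeping the second pass in its own function means only that function has to
change if the iteration later stops following csrows/channels.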

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-03-30 17:03   ` Borislav Petkov
@ 2012-04-16  8:56     ` Mauro Carvalho Chehab
  2012-04-16 13:31       ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16  8:56 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

Em 30-03-2012 14:03, Borislav Petkov escreveu:
> On Thu, Mar 29, 2012 at 01:45:35PM -0300, Mauro Carvalho Chehab wrote:
>> On systems based on chip select rows, all channels need to use memories
>> with the same properties, otherwise the memories on channels A and B
>> won't be recognized.
>>
>> However, such assumption is not true for all types of memory
>> controllers.
>>
>> Controllers for FB-DIMM's don't have such requirements.
>>
>> Also, modern Intel controllers seem to be capable of handling such
>> differences.
>>
>> So, we need to get rid of storing the DIMM information into a per-csrow
>> data, storing it, instead at the right place.
>>
>> The first step is to move grain, mtype, dtype and edac_mode to the
>> per-dimm struct.
>>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  drivers/edac/amd64_edac.c      |   30 +++++++++++--------
>>  drivers/edac/amd76x_edac.c     |   10 ++++--
>>  drivers/edac/cell_edac.c       |   10 +++++-
>>  drivers/edac/cpc925_edac.c     |   62 +++++++++++++++++++++------------------
>>  drivers/edac/e752x_edac.c      |   44 +++++++++++++++-------------
>>  drivers/edac/e7xxx_edac.c      |   44 ++++++++++++++++------------
>>  drivers/edac/edac_mc.c         |   19 ++++++++----
>>  drivers/edac/edac_mc_sysfs.c   |    6 ++--
>>  drivers/edac/i3000_edac.c      |   18 ++++++-----
>>  drivers/edac/i3200_edac.c      |   18 ++++++-----
>>  drivers/edac/i5000_edac.c      |   24 +++++++--------
>>  drivers/edac/i5100_edac.c      |   38 +++++++++++++-----------
>>  drivers/edac/i5400_edac.c      |   24 ++++++---------
>>  drivers/edac/i7300_edac.c      |   25 +++++++++------
>>  drivers/edac/i7core_edac.c     |   27 ++++++++---------
>>  drivers/edac/i82443bxgx_edac.c |   13 +++++---
>>  drivers/edac/i82860_edac.c     |   11 ++++--
>>  drivers/edac/i82875p_edac.c    |   17 ++++++++---
>>  drivers/edac/i82975x_edac.c    |   17 +++++++----
>>  drivers/edac/mpc85xx_edac.c    |   13 +++++---
>>  drivers/edac/mv64x60_edac.c    |   18 ++++++-----
>>  drivers/edac/pasemi_edac.c     |   10 ++++--
>>  drivers/edac/ppc4xx_edac.c     |   13 +++++---
>>  drivers/edac/r82600_edac.c     |   10 ++++--
>>  drivers/edac/sb_edac.c         |   31 +++++++++++---------
>>  drivers/edac/tile_edac.c       |   13 ++++----
>>  drivers/edac/x38_edac.c        |   17 ++++++-----
>>  include/linux/edac.h           |   21 ++++++++-----
>>  28 files changed, 340 insertions(+), 263 deletions(-)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index c9eee6d..3e7bddc 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -2168,7 +2168,9 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  	struct amd64_pvt *pvt = mci->pvt_info;
>>  	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
>>  	u32 val;
>> -	int i, empty = 1;
>> +	int i, j, empty = 1;
>> +	enum mem_type mtype;
>> +	enum edac_type edac_mode;
>>  
>>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>>  
>> @@ -2202,7 +2204,21 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  		csrow->page_mask = ~mask;
>>  		/* 8 bytes of resolution */
>>  
>> -		csrow->mtype = amd64_determine_memory_type(pvt, i);
>> +		mtype = amd64_determine_memory_type(pvt, i);
>> +
>> +		/*
>> +		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
>> +		 */
>> +		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
>> +			edac_mode = (pvt->nbcfg & NBCFG_CHIPKILL) ?
>> +				    EDAC_S4ECD4ED : EDAC_SECDED;
>> +		else
>> +			edac_mode = EDAC_NONE;
>> +
>> +		for (j = 0; j < pvt->channel_count; j++) {
>> +			csrow->channels[j].dimm->mtype = mtype;
>> +			csrow->channels[j].dimm->edac_mode = edac_mode;
>> +		}
>>  
>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>>  		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
>> @@ -2214,16 +2230,6 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  			"last_page: 0x%lx\n",
>>  			(unsigned)csrow->nr_pages,
>>  			csrow->first_page, csrow->last_page);
>> -
>> -		/*
>> -		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
>> -		 */
>> -		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
>> -			csrow->edac_mode =
>> -			    (pvt->nbcfg & NBCFG_CHIPKILL) ?
>> -			    EDAC_S4ECD4ED : EDAC_SECDED;
>> -		else
>> -			csrow->edac_mode = EDAC_NONE;
> 
> This looks like a useless code movement, please leave it where it is
> now and add the for-loop after it instead of pulling it up and causing
> needless churn.

This is needed, as mtype/edac_mode are now per DIMM, and no longer per csrow.
In the specific case of amd64 (and all per-csrow/channel memory controllers),
all channels use the same mtype/edac_mode, but this is not true for other
memory controllers.

So the logic there first retrieves the mtype/edac_mode, and then fills them
in for each dimm struct (the for loop).
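
A minimal sketch of that "retrieve once, then fill each dimm" order, outside the
kernel. The sk_* types and the two register bits are hypothetical stand-ins (the
real NBCFG layout is not reproduced here); only the shape of the logic follows
the hunk above.

```c
#include <assert.h>

enum sk_mem_type  { SK_MEM_UNKNOWN, SK_MEM_DDR2, SK_MEM_DDR3 };
enum sk_edac_type { SK_EDAC_NONE, SK_EDAC_SECDED, SK_EDAC_S4ECD4ED };

/* Hypothetical register bits, for illustration only. */
#define SK_ECC_ENABLE	(1u << 0)
#define SK_CHIPKILL	(1u << 1)

#define SK_MAX_CHANS 2

struct sk_dimm {
	enum sk_mem_type mtype;
	enum sk_edac_type edac_mode;
};

struct sk_chan {
	struct sk_dimm *dimm;
};

struct sk_csrow {
	struct sk_chan channels[SK_MAX_CHANS];
};

/* Step 1: decode the mode once per csrow. */
static enum sk_edac_type sk_decode_edac_mode(unsigned nbcfg)
{
	if (!(nbcfg & SK_ECC_ENABLE))
		return SK_EDAC_NONE;
	return (nbcfg & SK_CHIPKILL) ? SK_EDAC_S4ECD4ED : SK_EDAC_SECDED;
}

/* Step 2: copy the decoded values into every dimm behind the csrow. */
static void sk_fill_csrow(struct sk_csrow *csrow, int channel_count,
			  enum sk_mem_type mtype, unsigned nbcfg)
{
	enum sk_edac_type edac_mode = sk_decode_edac_mode(nbcfg);
	int j;

	assert(channel_count <= SK_MAX_CHANS);
	for (j = 0; j < channel_count; j++) {
		csrow->channels[j].dimm->mtype = mtype;
		csrow->channels[j].dimm->edac_mode = edac_mode;
	}
}
```

The values have to be known before the per-channel loop runs, which is why the
decode step comes first.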

> < snip drivers I'm not maintaining >
> 
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index c03bfe7..2430ddb 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -43,7 +43,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>>  {
>>  	debugf4("\tchannel = %p\n", chan);
>>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>> -	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
>> +	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
>>  	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
>>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>>  }
>> @@ -698,6 +698,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  {
>>  	unsigned long remapped_page;
>>  	char *label = NULL;
>> +	u32 grain;
>>  
>>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>>  
>> @@ -722,6 +723,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  	}
>>  
>>  	label = mci->csrows[row].channels[channel].dimm->label;
>> +	grain = mci->csrows[row].channels[channel].dimm->grain;
>>  
>>  	if (edac_mc_get_log_ce())
>>  		/* FIXME - put in DIMM location */
>> @@ -729,11 +731,12 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
>>  			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
>>  			page_frame_number, offset_in_page,
>> -			mci->csrows[row].grain, syndrome, row, channel,
>> +			grain, syndrome, row, channel,
>>  			label, msg);
>>  
>>  	mci->ce_count++;
>>  	mci->csrows[row].ce_count++;
>> +	mci->csrows[row].channels[channel].dimm->ce_count++;
>>  	mci->csrows[row].channels[channel].ce_count++;
>>  
>>  	if (mci->scrub_mode & SCRUB_SW_SRC) {
>> @@ -750,8 +753,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  			mci->ctl_page_to_phys(mci, page_frame_number) :
>>  			page_frame_number;
>>  
>> -		edac_mc_scrub_block(remapped_page, offset_in_page,
>> -				mci->csrows[row].grain);
>> +		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
>>  	}
>>  }
>>  EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
>> @@ -777,6 +779,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>>  	int chan;
>>  	int chars;
>>  	char *label = NULL;
>> +	u32 grain;
>>  
>>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>>  
>> @@ -790,6 +793,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>>  		return;
>>  	}
>>  
>> +	grain = mci->csrows[row].channels[0].dimm->grain;
>>  	label = mci->csrows[row].channels[0].dimm->label;
>>  	chars = snprintf(pos, len + 1, "%s", label);
>>  	len -= chars;
>> @@ -807,14 +811,13 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
>>  		edac_mc_printk(mci, KERN_EMERG,
>>  			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
>>  			"labels \"%s\": %s\n", page_frame_number,
>> -			offset_in_page, mci->csrows[row].grain, row,
>> -			labels, msg);
>> +			offset_in_page, grain, row, labels, msg);
>>  
>>  	if (edac_mc_get_panic_on_ue())
>>  		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
>>  			"row %d, labels \"%s\": %s\n", mci->mc_idx,
>>  			page_frame_number, offset_in_page,
>> -			mci->csrows[row].grain, row, labels, msg);
>> +			grain, row, labels, msg);
>>  
>>  	mci->ue_count++;
>>  	mci->csrows[row].ue_count++;
>> @@ -886,6 +889,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
>>  	chars = snprintf(pos, len + 1, "%s", label);
>>  	len -= chars;
>>  	pos += chars;
>> +
> 
> Useless \n.
> 
>>  	chars = snprintf(pos, len + 1, "-%s",
>>  			mci->csrows[csrow].channels[channelb].dimm->label);
>>  
>> @@ -939,6 +943,7 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
>>  
>>  	mci->ce_count++;
>>  	mci->csrows[csrow].ce_count++;
>> +	mci->csrows[csrow].channels[channel].dimm->ce_count++;
>>  	mci->csrows[csrow].channels[channel].ce_count++;
>>  }
>>  EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
>> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
>> index c83697c..d63904e 100644
>> --- a/drivers/edac/edac_mc_sysfs.c
>> +++ b/drivers/edac/edac_mc_sysfs.c
>> @@ -150,19 +150,19 @@ static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
>>  static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
>>  				int private)
>>  {
>> -	return sprintf(data, "%s\n", mem_types[csrow->mtype]);
>> +	return sprintf(data, "%s\n", mem_types[csrow->channels[0].dimm->mtype]);
> 
> 							       ^^^
> 
> This looks strange, why always channel 0, because it is always defined?

This is due to the broken API: with the legacy API, all channels need to be
filled with the same memory type. So, the information on channel 0 is identical
to that of the other channels, and for csrows-based memory controllers the
above logic works just fine.

This is obviously wrong with FB-DIMMs/Nehalem/Sandy Bridge, as the memories on
each channel can be different. Due to this API limitation, almost all of
those drivers tell the EDAC core that there's just one channel, and map the
different channels as different csrows.

The very few FB-DIMM drivers that map their 2 or 4 channels to csrow->channels
currently report fake information there, filling it with the properties of either
the first channel or the last one, due to the API limitation.

Anyway, code like that preserves API backward compatibility.
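
As an illustration of why reading channel 0 is enough for the legacy per-csrow
attribute, here is a small userspace sketch. All sk_* names are hypothetical and
the string table is made up; sk_csrow_is_uniform() is not part of the patch, it
only makes the "all channels carry the same type" assumption checkable.

```c
#include <assert.h>
#include <stdio.h>

enum sk_mem_type { SK_MEM_UNKNOWN, SK_MEM_DDR2, SK_MEM_DDR3 };

/* Made-up name table, standing in for the real mem_types[] array. */
static const char *const sk_mem_types[] = {
	[SK_MEM_UNKNOWN] = "Unknown",
	[SK_MEM_DDR2]    = "DDR2",
	[SK_MEM_DDR3]    = "DDR3",
};

struct sk_dimm  { enum sk_mem_type mtype; };
struct sk_chan  { struct sk_dimm *dimm; };
struct sk_csrow { unsigned nr_channels; struct sk_chan *channels; };

/* Legacy per-csrow view: report whatever channel 0's dimm says. */
static int sk_csrow_mem_type_show(const struct sk_csrow *csrow,
				  char *data, size_t len)
{
	assert(csrow->nr_channels > 0);
	return snprintf(data, len, "%s\n",
			sk_mem_types[csrow->channels[0].dimm->mtype]);
}

/* Returns 1 if every channel agrees with channel 0, 0 otherwise. */
static int sk_csrow_is_uniform(const struct sk_csrow *csrow)
{
	unsigned chn;

	for (chn = 1; chn < csrow->nr_channels; chn++)
		if (csrow->channels[chn].dimm->mtype !=
		    csrow->channels[0].dimm->mtype)
			return 0;
	return 1;
}
```

On a csrows-based controller the uniformity check always holds, so the
per-csrow file keeps showing the same value as before.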

> 
>>  }
>>  
>>  static ssize_t csrow_dev_type_show(struct csrow_info *csrow, char *data,
>>  				int private)
>>  {
>> -	return sprintf(data, "%s\n", dev_types[csrow->dtype]);
>> +	return sprintf(data, "%s\n", dev_types[csrow->channels[0].dimm->dtype]);
> 
> ditto.

ditto.

> 
>>  }
>>  
>>  static ssize_t csrow_edac_mode_show(struct csrow_info *csrow, char *data,
>>  				int private)
>>  {
>> -	return sprintf(data, "%s\n", edac_caps[csrow->edac_mode]);
>> +	return sprintf(data, "%s\n", edac_caps[csrow->channels[0].dimm->edac_mode]);
> 
> ditto.

ditto.

> 
> <snip more drivers I don't maintain >
> 
> Reminder: you need to get yourself Acks at least from the drivers'
> maintainers which are still active, at least.
> 
>> diff --git a/include/linux/edac.h b/include/linux/edac.h
>> index f40b835..5244193 100644
>> --- a/include/linux/edac.h
>> +++ b/include/linux/edac.h
>> @@ -314,6 +314,13 @@ struct dimm_info {
>>  	unsigned memory_controller;
>>  	unsigned csrow;
>>  	unsigned csrow_channel;
>> +
>> +	u32 grain;		/* granularity of reported error in bytes */
>> +	enum dev_type dtype;	/* memory device type */
>> +	enum mem_type mtype;	/* memory dimm type */
>> +	enum edac_type edac_mode;	/* EDAC mode for this dimm */
>> +
>> +	u32 ce_count;		/* Correctable Errors for this dimm */
>>  };
>>  
>>  /**
>> @@ -339,19 +346,17 @@ struct rank_info {
>>  };
>>  
>>  struct csrow_info {
>> -	unsigned long first_page;	/* first page number in dimm */
>> -	unsigned long last_page;	/* last page number in dimm */
>> +	unsigned long first_page;	/* first page number in csrow */
>> +	unsigned long last_page;	/* last page number in csrow */
>> +	u32 nr_pages;			/* number of pages in csrow */
>>  	unsigned long page_mask;	/* used for interleaving -
>>  					 * 0UL for non intlv
>>  					 */
>> -	u32 nr_pages;		/* number of pages in csrow */
>> -	u32 grain;		/* granularity of reported error in bytes */
>> -	int csrow_idx;		/* the chip-select row */
>> -	enum dev_type dtype;	/* memory device type */
>> +	int csrow_idx;			/* the chip-select row */
>> +
>>  	u32 ue_count;		/* Uncorrectable Errors for this csrow */
>>  	u32 ce_count;		/* Correctable Errors for this csrow */
>> -	enum mem_type mtype;	/* memory csrow type */
>> -	enum edac_type edac_mode;	/* EDAC mode for this csrow */
>> +
>>  	struct mem_ctl_info *mci;	/* the parent */
>>  
>>  	struct kobject kobj;	/* sysfs kobject for this csrow */
>> -- 
>> 1.7.8
> 
> 


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-04-16  8:41     ` Mauro Carvalho Chehab
@ 2012-04-16 11:02       ` Borislav Petkov
  2012-04-16 11:44         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-16 11:02 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List, Linux Kernel Mailing List

On Mon, Apr 16, 2012 at 05:41:33AM -0300, Mauro Carvalho Chehab wrote:
> This is not a double loop.

But it is, actually.

Let's look at the code:

        /* setup index and various internal pointers */
        mci->mc_idx = edac_index;
        mci->csrows = csi;
        mci->dimms  = dimm;
        mci->pvt_info = pvt;
        mci->nr_csrows = nr_csrows;

        for (row = 0; row < nr_csrows; row++) {					<-- A1
                csrow = &csi[row];
                csrow->csrow_idx = row;
                csrow->mci = mci;
                csrow->nr_channels = nr_chans;
                chp = &chi[row * nr_chans];
                csrow->channels = chp;

                for (chn = 0; chn < nr_chans; chn++) {				<-- B1
                        chan = &chp[chn];
                        chan->chan_idx = chn;
                        chan->csrow = csrow;
                }
        }

        /*
         * By default, assumes that a per-csrow arrangement will be used,
         * as most drivers are based on such assumption.
         */
        dimm = mci->dimms;
        for (row = 0; row < mci->nr_csrows; row++) {				<-- A2
                for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {	<-- B2
                        mci->csrows[row].channels[chn].dimm = dimm;
                        dimm->csrow = row;
                        dimm->csrow_channel = chn;
                        dimm++;
                        mci->nr_dimms++;
                }
        }

So the lines tagged with A1 and A2 iterate over the nr_csrows, while
lines tagged with B1 and B2 iterate over nr_chans. In B2, loop termination is

	mci->csrows[row].nr_channels

which is assigned in the first loop above to

	csrow->nr_channels = nr_chans

In B1, it is simply nr_chans.

So how about we merge those two:

	...
	dimm = mci->dimms;

        for (row = 0; row < nr_csrows; row++) {					<-- A1
                csrow = &csi[row];
                csrow->csrow_idx = row;
                csrow->mci = mci;
                csrow->nr_channels = nr_chans;
                chp = &chi[row * nr_chans];
                csrow->channels = chp;

                for (chn = 0; chn < nr_chans; chn++) {				<-- B1
                        chan = &chp[chn];
                        chan->chan_idx = chn;
                        chan->csrow = csrow;

			 /* second loop */
                        csrow->channels[chn].dimm = dimm;
                        dimm->csrow = row;
                        dimm->csrow_channel = chn;
                        dimm++;
                        mci->nr_dimms++;
                }
        }

So it is only 5 lines of code instead of another loop.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-04-16 11:02       ` Borislav Petkov
@ 2012-04-16 11:44         ` Mauro Carvalho Chehab
  2012-04-16 13:21           ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 11:44 UTC (permalink / raw)
  To: Borislav Petkov; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

Em 16-04-2012 08:02, Borislav Petkov escreveu:
> On Mon, Apr 16, 2012 at 05:41:33AM -0300, Mauro Carvalho Chehab wrote:
>> This is not a double loop.
> 
> But it is, actually.
> 
> Let's look at the code:
> 
>         /* setup index and various internal pointers */
>         mci->mc_idx = edac_index;
>         mci->csrows = csi;
>         mci->dimms  = dimm;
>         mci->pvt_info = pvt;
>         mci->nr_csrows = nr_csrows;
> 
>         for (row = 0; row < nr_csrows; row++) {					<-- A1
>                 csrow = &csi[row];
>                 csrow->csrow_idx = row;
>                 csrow->mci = mci;
>                 csrow->nr_channels = nr_chans;
>                 chp = &chi[row * nr_chans];
>                 csrow->channels = chp;
> 
>                 for (chn = 0; chn < nr_chans; chn++) {				<-- B1
>                         chan = &chp[chn];
>                         chan->chan_idx = chn;
>                         chan->csrow = csrow;
>                 }
>         }
> 
>         /*
>          * By default, assumes that a per-csrow arrangement will be used,
>          * as most drivers are based on such assumption.
>          */
>         dimm = mci->dimms;
>         for (row = 0; row < mci->nr_csrows; row++) {				<-- A2
>                 for (chn = 0; chn < mci->csrows[row].nr_channels; chn++) {	<-- B2
>                         mci->csrows[row].channels[chn].dimm = dimm;
>                         dimm->csrow = row;
>                         dimm->csrow_channel = chn;
>                         dimm++;
>                         mci->nr_dimms++;
>                 }
>         }
> 
> So the lines tagged with A1 and A2 iterate over the nr_csrows, while
> lines tagged with B1 and B2 iterate over nr_chans. In B2, loop termination is
> 
> 	mci->csrows[row].nr_channels
> 
> which is assigned in the first loop above to
> 
> 	csrow->nr_channels = nr_chans
> 
> In B1, it is simply nr_chans.

Yes, this is true in this patch, because it is a preparation for a bigger
change, but it won't be true after patch 5/13, where dimm_info takes on a
life of its own.

> 
> So how about we merge those two:
> 
> 	...
> 	dimm = mci->dimms;
> 
>         for (row = 0; row < nr_csrows; row++) {					<-- A1
>                 csrow = &csi[row];
>                 csrow->csrow_idx = row;
>                 csrow->mci = mci;
>                 csrow->nr_channels = nr_chans;
>                 chp = &chi[row * nr_chans];
>                 csrow->channels = chp;
> 
>                 for (chn = 0; chn < nr_chans; chn++) {				<-- B1
>                         chan = &chp[chn];
>                         chan->chan_idx = chn;
>                         chan->csrow = csrow;
> 
> 			 /* second loop */
>                         csrow->channels[chn].dimm = dimm;
>                         dimm->csrow = row;
>                         dimm->csrow_channel = chn;
>                         dimm++;
>                         mci->nr_dimms++;
>                 }
>         }
> 
> So it is only 5 lines of code instead of another loop.

Because this would make patch 5/13 even bigger and messier. Each of those
loops has a different function: the first one initializes the legacy API
data structures for virtual csrows/channels, while the second one
initializes the struct that contains the real DIMM or rank information.

Patches 1 to 4 are just a preparation for patch 5/13, cleaning up what's
possible before the big change.

While it is possible to do the above merge in this patch alone, such a
cleanup doesn't make sense within the patch series (and would have to be
reverted in patch 5/13 anyway), as what we want is to separate the DIMM
information into a data structure that doesn't mix it with memory
layout-dependent information, as not all drivers use csrows/channels.

Regards,
Mauro.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers
  2012-04-02 13:59 ` Borislav Petkov
@ 2012-04-16 12:58   ` Mauro Carvalho Chehab
  2012-04-16 14:06     ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 12:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski Filho

Em 02-04-2012 10:59, Borislav Petkov escreveu:
> On Thu, Mar 29, 2012 at 01:45:33PM -0300, Mauro Carvalho Chehab wrote:
>> This is the 12th and final rebase of this patch series.
>>
>> It is the first patchset for the EDAC rewrite. On this patchset,
>> there are all the internal changes at the EDAC core, needed
>> to properly represent memories at modern memory controllers that
>> aren't oriented per rank/channel.
>>
>> It is needed in order to fix a long-term bug at the EDAC drivers
>> for the Intel memory controllers deployed since 2005 (well, in fact,
>> there is one Rambus that it is older, but also suffers from the same
>> syndrome), including the drivers for the recent Intel Nehalem and
>> Sandy Bridge architectures.
>>
>> The new EDAC architecture supports both per rank/channel memory
>> controllers and per-DIMM ones.
>>
>> On this changeset, there are no changes at the sysfs nodes. Just 
>> like before this changeset, non-per-rank memory controllers 
>> will expose memories as "virtual csrows/virtual channels[1].
>>
>> [1] It sounds better to say "virtual" than to admit that all
>> EDAC Intel drivers since 2005 need to lie about their age to
>> the EDAC core, in order for the Kernel to accept them ;)
>>
>> Mauro Carvalho Chehab (13):
>>   edac: Create a dimm struct and move the labels into it
>>   edac: move dimm properties to struct memset_info
>>   edac: Don't initialize csrow's first_page & friends when not needed
>>   edac: move nr_pages to dimm struct
>>   edac: Fix core support for MC's that see DIMMS instead of ranks
> 
> I was wondering why 6/13 doesn't apply cleanly but there's the patch
> above, 5/13 missing in the submission. It looks like vger has eaten it
> at least for the linux-edac mailing list - the patch is still on lkml
> though.

That's weird. Maybe it was just a temporary error at vger. I'll contact the vger
maintainers to double-check what's happening there.

> 
> And what a patch it is: almost 5000 lines.

No. It is half of it (2449 lines):
---
 drivers/edac/amd64_edac.c      |  137 ++++++---
 drivers/edac/amd76x_edac.c     |   30 ++-
 drivers/edac/cell_edac.c       |   26 ++-
 drivers/edac/cpc925_edac.c     |   25 ++-
 drivers/edac/e752x_edac.c      |   51 +++-
 drivers/edac/e7xxx_edac.c      |   39 ++-
 drivers/edac/edac_core.h       |   48 +--
 drivers/edac/edac_device.c     |   27 +-
 drivers/edac/edac_mc.c         |  657 +++++++++++++++++++++++-----------------
 drivers/edac/edac_mc_sysfs.c   |   91 +++---
 drivers/edac/edac_module.h     |    2 +-
 drivers/edac/edac_pci.c        |    7 +-
 drivers/edac/i3000_edac.c      |   27 ++-
 drivers/edac/i3200_edac.c      |   34 ++-
 drivers/edac/i5000_edac.c      |   58 +++--
 drivers/edac/i5100_edac.c      |   90 +++---
 drivers/edac/i5400_edac.c      |  217 ++++++++------
 drivers/edac/i7300_edac.c      |   81 ++---
 drivers/edac/i7core_edac.c     |  202 +++---------
 drivers/edac/i82443bxgx_edac.c |   28 +-
 drivers/edac/i82860_edac.c     |   44 ++-
 drivers/edac/i82875p_edac.c    |   31 ++-
 drivers/edac/i82975x_edac.c    |   29 ++-
 drivers/edac/mpc85xx_edac.c    |   28 ++-
 drivers/edac/mv64x60_edac.c    |   25 ++-
 drivers/edac/pasemi_edac.c     |   27 +-
 drivers/edac/ppc4xx_edac.c     |   33 ++-
 drivers/edac/r82600_edac.c     |   29 ++-
 drivers/edac/sb_edac.c         |  159 ++++-------
 drivers/edac/tile_edac.c       |   16 +-
 drivers/edac/x38_edac.c        |   30 ++-
 include/linux/edac.h           |  121 +++++++-
 32 files changed, 1392 insertions(+), 1057 deletions(-)

This patch series is all about the edac.h changes: the old per-csrow/channel
way of allocating/describing/reporting memory errors got replaced. As a side
effect of this single change, everything else needed to be fixed, to avoid
compilation breakage.

The API change at edac.h has 121 lines, and it directly caused the changes at
edac_mc/edac_mc_sysfs. An EDAC core reviewer should start reading this patch
with those changes.

On non-FB-DIMM/Nehalem/SB drivers, the driver changes are trivial: function calls
got replaced and some code was reordered in a few places, in order to provide more
info to the error report function when part of the parser fails. It shouldn't be
hard for driver maintainers to review those changes.

The changes on the other drivers aren't a direct function call conversion.
They got real fixes, in order to properly address the FB-DIMM way of working
with memories. I am the author/maintainer of most of those drivers, so I should
know exactly what I'm doing there. Yet, I had to dig through datasheets for
several hours, in order to double-check some of the changes there and be sure
that the new code will work properly.

Also, I tested the changes there on real hardware.

> 
> Please split it!
> 
> And don't tell me it cannot be done: each patch needs to do one thing
> and one thing only. From looking at this monster, here's one possible
> way to split it:
> 
> * add all changes to include/linux/edac.h

No way. Applying just the include/linux/edac.h changes:

drivers/edac/edac_mc.c: In function ‘edac_mc_dump_channel’:
drivers/edac/edac_mc.c:47:2: error: ‘struct dimm_info’ has no member named ‘ce_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_dump_mci’:
drivers/edac/edac_mc.c:71:2: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c: In function ‘edac_mc_alloc’:
drivers/edac/edac_mc.c:195:5: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:217:25: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:221:8: error: ‘struct dimm_info’ has no member named ‘csrow_channel’
drivers/edac/edac_mc.c:223:7: error: ‘struct mem_ctl_info’ has no member named ‘nr_dimms’
drivers/edac/edac_mc.c: In function ‘edac_mc_add_mc’:
drivers/edac/edac_mc.c:531:22: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c: In function ‘edac_mc_find_csrow_by_page’:
drivers/edac/edac_mc.c:663:21: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_ce’:
drivers/edac/edac_mc.c:710:16: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:712:3: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:741:5: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/edac_mc.c:743:41: error: ‘struct dimm_info’ has no member named ‘ce_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_ce_no_info’:
drivers/edac/edac_mc.c:772:5: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_ue’:
drivers/edac/edac_mc.c:791:16: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:793:3: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:826:5: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_ue_no_info’:
drivers/edac/edac_mc.c:840:5: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_fbd_ue’:
drivers/edac/edac_mc.c:859:18: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:861:3: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:888:5: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/edac_mc.c: In function ‘edac_mc_handle_fbd_ce’:
drivers/edac/edac_mc.c:923:18: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:925:3: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc.c:948:5: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/edac_mc.c:950:43: error: ‘struct dimm_info’ has no member named ‘ce_count’
drivers/edac/edac_mc_sysfs.c: In function ‘mci_reset_counters_store’:
drivers/edac/edac_mc_sysfs.c:428:5: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/edac_mc_sysfs.c:429:5: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/edac_mc_sysfs.c:431:25: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc_sysfs.c: In function ‘mci_ue_count_show’:
drivers/edac/edac_mc_sysfs.c:498:34: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/edac_mc_sysfs.c: In function ‘mci_ce_count_show’:
drivers/edac/edac_mc_sysfs.c:503:34: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/edac_mc_sysfs.c: In function ‘mci_size_mb_show’:
drivers/edac/edac_mc_sysfs.c:530:37: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc_sysfs.c: In function ‘edac_create_sysfs_mci_device’:
drivers/edac/edac_mc_sysfs.c:942:21: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc_sysfs.c: In function ‘edac_remove_sysfs_mci_device’:
drivers/edac/edac_mc_sysfs.c:995:21: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’
drivers/edac/edac_mc_sysfs.c: In function ‘mci_ce_count_show’:
drivers/edac/edac_mc_sysfs.c:504:1: warning: control reaches end of non-void function [-Wreturn-type]
drivers/edac/edac_mc_sysfs.c: In function ‘mci_ue_count_show’:
drivers/edac/edac_mc_sysfs.c:499:1: warning: control reaches end of non-void function [-Wreturn-type]
drivers/edac/i5100_edac.c: In function ‘i5100_handle_ce’:
drivers/edac/i5100_edac.c:442:5: error: ‘struct mem_ctl_info’ has no member named ‘ce_count’
drivers/edac/i5100_edac.c: In function ‘i5100_handle_ue’:
drivers/edac/i5100_edac.c:468:5: error: ‘struct mem_ctl_info’ has no member named ‘ue_count’
drivers/edac/i5100_edac.c: In function ‘i5100_init_csrows’:
drivers/edac/i5100_edac.c:850:21: error: ‘struct mem_ctl_info’ has no member named ‘nr_csrows’

The changes to edac.h replace the broken csrow-dependent internal ABI
with a csrow-independent one. Due to that single change, all existing code needs to
be touched.

> * a bunch of changes to edac_mc.c like edac_align_ptr etc

edac_align_ptr changes can indeed be put on a separate patch. I'll work on it.

> * changes to edac_mc_alloc

Those are also related to the edac.h changes: the data got moved from one place to
another, some fields disappeared, others appeared.

The alloc routine needs to follow the representation changes that happened in edac.h.

> * add edac_mc_handle_error
> * switch old edac_mc_handle* stuff to edac_mc_handle_error

Same here: all edac_mc_handle* are dependent on the internal representation
of the memory architecture. For example, all edac_mc_handle*_fbd_* are related
to the way FB-DIMMs got faked inside the EDAC core. Fixing the internal representation
means that all those arch-dependent methods should cease to exist at the patch that
fixes it, as the old way doesn't work anymore.

Basically, except for edac_align_ptr() changes that can indeed be split,
all the rest are just a side effect of changing include/linux/edac.h.

Regards,
Mauro


* Re: [PATCH 01/13] edac: Create a dimm struct and move the labels into it
  2012-04-16 11:44         ` Mauro Carvalho Chehab
@ 2012-04-16 13:21           ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-16 13:21 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Mon, Apr 16, 2012 at 08:44:40AM -0300, Mauro Carvalho Chehab wrote:
> Because this will make patch 5/13 even bigger and messy. Each of those
> loops have different functions: the first one initializes the legacy API
> data structures for virtual csrows/channels, while the second one 
> initializes the struct that contains the real DIMM or rank information.
> 
> Patches 1 to 4 are just a preparation for patch 5/13, cleaning what's
> possible before the big change.
> 
> While it is possible to do the above merge on this patch alone, such
> cleanup doesn't make sense at the patch series (and should be reverted
> on patch 5/13 anyway), as what we want is to separate the DIMM information 
> on a data structure that won't mix it with a memory layout-dependent 
> information, as not all drivers use csrows/channels.

So what?!

Does that mean we do patches in between other patches where code quality
is not that good simply because we'll remove that in the later patch?
No!

Also, 5/13 is a monster and needs proper splitting anyway. So, if you
have a strong technical reason to do two loops, please come forward with
it. Otherwise, please change your patches to review requirements as
_everyone_ else does on lkml instead of giving some unrelated and bogus
reasoning for this and that.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH 02/13] edac: move dimm properties to struct memset_info
  2012-04-16  8:56     ` Mauro Carvalho Chehab
@ 2012-04-16 13:31       ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-16 13:31 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On Mon, Apr 16, 2012 at 05:56:46AM -0300, Mauro Carvalho Chehab wrote:
> >> @@ -2202,7 +2204,21 @@ static int init_csrows(struct mem_ctl_info *mci)
> >>  		csrow->page_mask = ~mask;
> >>  		/* 8 bytes of resolution */
> >>  
> >> -		csrow->mtype = amd64_determine_memory_type(pvt, i);
> >> +		mtype = amd64_determine_memory_type(pvt, i);
> >> +
> >> +		/*
> >> +		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> >> +		 */
> >> +		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
> >> +			edac_mode = (pvt->nbcfg & NBCFG_CHIPKILL) ?
> >> +				    EDAC_S4ECD4ED : EDAC_SECDED;
> >> +		else
> >> +			edac_mode = EDAC_NONE;
> >> +
> >> +		for (j = 0; j < pvt->channel_count; j++) {
> >> +			csrow->channels[j].dimm->mtype = mtype;
> >> +			csrow->channels[j].dimm->edac_mode = edac_mode;
> >> +		}
> >>  
> >>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> >>  		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
> >> @@ -2214,16 +2230,6 @@ static int init_csrows(struct mem_ctl_info *mci)
> >>  			"last_page: 0x%lx\n",
> >>  			(unsigned)csrow->nr_pages,
> >>  			csrow->first_page, csrow->last_page);
> >> -
> >> -		/*
> >> -		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> >> -		 */
> >> -		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
> >> -			csrow->edac_mode =
> >> -			    (pvt->nbcfg & NBCFG_CHIPKILL) ?
> >> -			    EDAC_S4ECD4ED : EDAC_SECDED;
> >> -		else
> >> -			csrow->edac_mode = EDAC_NONE;
> > 
> > This looks like a useless code movement, please leave it where it is
> > now and add the for-loop after it instead of pulling it up and causing
> > needless churn.
> 
> This is needed, as now mtype/edac_mode is per DIMM, and not per channel.
> In the specific case of amd64 (and all per-csrow/channel memory controllers),
> all channels use the same mtype/edac_mode, but this is not true for other
> memory controllers.
> 
> So, what the logic there does is to first retrieve the mtype/edac_mode, and
> then fill it for each dimm struct (the for loop).

I can see that. But you don't have to move it anywhere, simply add the

	for (j = 0; j < pvt->channel_count; j++) {
		...

loop after the debugf1() calls. This way the patch has a smaller net
change count and is easier to review.
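
A minimal user-space sketch of that suggestion, assuming simplified
stand-in types rather than the real kernel definitions: the ECC mode is
computed where it always was, and the per-channel loop then copies the
values into each dimm struct. The bit positions, MAX_CHANNELS and the
helper names pick_edac_mode() and fill_dimm_props() are hypothetical and
exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; not the real definitions. */
enum mem_type  { MEM_EMPTY, MEM_DDR2, MEM_DDR3 };
enum edac_type { EDAC_NONE, EDAC_SECDED, EDAC_S4ECD4ED };

/* Illustrative bit positions, chosen for this sketch only. */
#define NBCFG_ECC_ENABLE (1u << 22)
#define NBCFG_CHIPKILL   (1u << 23)

#define MAX_CHANNELS 2

struct dimm_info {
	enum mem_type mtype;
	enum edac_type edac_mode;
};

struct rank_info {
	struct dimm_info *dimm;
};

struct csrow_info {
	struct rank_info channels[MAX_CHANNELS];
};

/* Decide the ECC mode from a (hypothetical) NBCFG register value. */
enum edac_type pick_edac_mode(unsigned int nbcfg)
{
	if (!(nbcfg & NBCFG_ECC_ENABLE))
		return EDAC_NONE;
	return (nbcfg & NBCFG_CHIPKILL) ? EDAC_S4ECD4ED : EDAC_SECDED;
}

/* Copy the per-csrow values into every DIMM reachable from this csrow. */
void fill_dimm_props(struct csrow_info *csrow, int channel_count,
		     enum mem_type mtype, enum edac_type edac_mode)
{
	int j;

	assert(channel_count <= MAX_CHANNELS);
	for (j = 0; j < channel_count; j++) {
		assert(csrow->channels[j].dimm != NULL);
		csrow->channels[j].dimm->mtype = mtype;
		csrow->channels[j].dimm->edac_mode = edac_mode;
	}
}
```

In the driver itself, the call to something like fill_dimm_props() would
sit after the existing debug output, so the diff only adds lines.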

-- 
Regards/Gruss,
Boris.



* Re: [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers
  2012-04-16 12:58   ` Mauro Carvalho Chehab
@ 2012-04-16 14:06     ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-16 14:06 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski Filho

On Mon, Apr 16, 2012 at 09:58:23AM -0300, Mauro Carvalho Chehab wrote:
> That's weird. Maybe it was just a temporary error at vger. I'll contact vger
> maintainers in order to double check what's happening there.

No, not weird. You're probably hitting some limit on patch mail size (it
was 40K AFAICR) and your patch is 156K.

> > And what a patch it is: almost 5000 lines.
> 
> No. It is half of it (2449 lines):

$ wc -l 05-edac-fix-fore-support-for-mcs-that-see-dimms.patch
4642 05-edac-fix-fore-support-for-mcs-that-see-dimms.patch

[..]

> No way. Applying just the include/linux/edac.h changes:
> 
> drivers/edac/edac_mc.c: In function ‘edac_mc_dump_channel’:
> drivers/edac/edac_mc.c:47:2: error: ‘struct dimm_info’ has no member named ‘ce_count’

[..]

> The changes at edac.h are replacing the csrow-dependent broken
> internal ABI to a csrow-independent one. Due to that single change,
> all existing code needs to be touched.

Ok let me spell it:

* Add a patch which only _adds_ the changes to <include/linux/edac.h>, i.e.:

enum hw_event_mc_err_type, edac_mc_layer_type, GET_POS macro and that's it -
this is your first patch out of this monster and the changes there are easily
verifiable when looking at it.

* Then, add a patch which introduces dimm_info->location[] array along with code
that touches it (if possible).

* Then, add a patch which adds dimm_info->mci which is the parent (it
should be called parent_mci btw) and all code that touches it.

...

* Add a patch which adds ->csrow and ->cschannel _without_ removing
->ce_count so that drivers still build.

* Then, after you've switched code to use ->csrow and ->cschannel,
remove ->ce_count.

* Then, after you've converted the data structures properly, you can
always adjust the functions in later patches.

* Then, documentation pointers to memory controllers can go into a
different patch.

* edac_mc_dump_dimm() is yet another patch.

...

Do you catch my drift?

All I'm trying to explain to you is that a reviewer needs small
patches, each patch touching a _single_ thing so that it is easily
understandable what you're changing.

Then, it is much more easily debuggable than a single 5000-line
monster.
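
A minimal sketch of the "add first, remove later" ordering, assuming a
hypothetical trimmed-down struct (the field placement and the helper
names legacy_count_ce() and set_location() are made up for the
illustration): while the old member is still present, unconverted code
keeps building next to converted code, so each step can be its own
patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, trimmed-down struct; only the migration pattern matters. */
struct dimm_info {
	unsigned int ce_count;	/* legacy member, kept until all users are converted */
	unsigned int csrow;	/* new member, added in the first step */
	unsigned int cschannel;	/* new member, added in the first step */
};

/* Unconverted code: still compiles because ce_count has not been removed. */
void legacy_count_ce(struct dimm_info *dimm)
{
	assert(dimm != NULL);
	dimm->ce_count++;
}

/* Converted code: touches only the new location members. */
void set_location(struct dimm_info *dimm, unsigned int csrow,
		  unsigned int cschannel)
{
	assert(dimm != NULL);
	dimm->csrow = csrow;
	dimm->cschannel = cschannel;
}
```

Only once nothing calls the legacy path any more does a final patch drop
the old member.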

Also, look at Documentation/SubmittingPatches which has some more good
advice. Hint: "If you cannot condense your patch set into a smaller set
of patches, then only post say 15 or so at a time and wait for review
and integration."

> > * changes to edac_mc_alloc
>
> Those are also related with the edac.h changes: the data got moved
> from one place to another one, some fields disappeared, others
> appeared.

As said above, split changing of those members in single patches.

And so on...

-- 
Regards/Gruss,
Boris.



* [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers
  2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
                   ` (14 preceding siblings ...)
  2012-04-02 13:59 ` Borislav Petkov
@ 2012-04-16 20:12 ` Mauro Carvalho Chehab
  2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
                     ` (7 more replies)
  15 siblings, 8 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

This is the first patchset for the EDAC rewrite. It contains all the
internal changes to the EDAC core needed to properly represent
memories on modern memory controllers that aren't organized per
rank/channel.

Drivers will be changed by the next changeset.

This series is needed to fix a long-standing bug in the EDAC drivers
for the Intel memory controllers deployed since 2005 (well, in fact,
there is an older Rambus one that also suffers from the same
syndrome), including the drivers for the recent Intel Nehalem and
Sandy Bridge architectures.

The new EDAC architecture supports both per rank/channel memory
controllers and per-DIMM ones.

This changeset makes no changes to the Kernel-Userspace
API. All changes are to the EDAC kernel API used by the drivers.

Mauro Carvalho Chehab (7):
  edac: Create a dimm struct and move the labels into it
  edac: move dimm properties to struct dimm_info
  edac: Don't initialize csrow's first_page & friends when not needed
  edac: move nr_pages to dimm struct
  edac: rewrite edac_align_ptr()
  edac.h: Prepare to handle with generic layers
  edac: Change internal representation to work with layers

 drivers/edac/amd64_edac.c      |   66 +---
 drivers/edac/amd76x_edac.c     |   14 +-
 drivers/edac/cell_edac.c       |   18 +-
 drivers/edac/cpc925_edac.c     |   70 +++--
 drivers/edac/e752x_edac.c      |   48 ++--
 drivers/edac/e7xxx_edac.c      |   49 ++--
 drivers/edac/edac_core.h       |   92 +++++-
 drivers/edac/edac_device.c     |   27 +-
 drivers/edac/edac_mc.c         |  691 ++++++++++++++++++++++++++--------------
 drivers/edac/edac_mc_sysfs.c   |   62 +++--
 drivers/edac/edac_module.h     |    2 +-
 drivers/edac/edac_pci.c        |    7 +-
 drivers/edac/i3000_edac.c      |   24 +-
 drivers/edac/i3200_edac.c      |   24 +-
 drivers/edac/i5000_edac.c      |   31 +-
 drivers/edac/i5100_edac.c      |   46 ++--
 drivers/edac/i5400_edac.c      |   38 +--
 drivers/edac/i7300_edac.c      |   40 +--
 drivers/edac/i7core_edac.c     |   40 +--
 drivers/edac/i82443bxgx_edac.c |   15 +-
 drivers/edac/i82860_edac.c     |   13 +-
 drivers/edac/i82875p_edac.c    |   22 +-
 drivers/edac/i82975x_edac.c    |   30 ++-
 drivers/edac/mpc85xx_edac.c    |   16 +-
 drivers/edac/mv64x60_edac.c    |   22 +-
 drivers/edac/pasemi_edac.c     |   24 +-
 drivers/edac/ppc4xx_edac.c     |   25 +-
 drivers/edac/r82600_edac.c     |   13 +-
 drivers/edac/sb_edac.c         |   37 +--
 drivers/edac/tile_edac.c       |   17 +-
 drivers/edac/x38_edac.c        |   24 +-
 include/linux/edac.h           |  164 ++++++++--
 32 files changed, 1099 insertions(+), 712 deletions(-)

-- 
1.7.8



* [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-26 14:26     ` Borislav Petkov
  2012-04-16 20:12   ` [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
                     ` (6 subsequent siblings)
  7 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Ranganathan Desikan,
	Arvind R.,
	Niklas Söderlund

The way a DIMM is currently represented implies that it is
linked into a per-csrow struct. However, some drivers don't see
csrows, as they're hidden behind some chip like the AMBs
on FB-DIMMs, for example.

This forced drivers to fake^Wvirtualize a csrow struct, and to make
a mess of the original csrow/channel concept.

Move the DIMM labels into a per-DIMM struct, and add there
the real location of the socket, in terms of csrow/channel.
Later patches will modify the location to properly represent the
memory architecture.

All other drivers will use a per-csrow type of location.
Some of those drivers will require a later conversion, as
they also fake the csrows internally.

TODO: While this patch doesn't change the existing behavior, on
csrows-based memory controllers, a csrow/channel pair points to a memory
rank. There's a known bug in the EDAC core that allows having different
labels for the same DIMM, if it has more than one rank. A later patch
is needed to merge the several ranks of a DIMM into the same dimm_info
struct, in order to avoid having different labels for the same DIMM.

edac_mc_alloc() will now contain a per-dimm initialization loop that
will be changed by later patches in order to match other types of
memory architectures.
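
The per-dimm initialization loop can be sketched in user space as
follows. This is only an illustration under simplifying assumptions:
struct mc_sketch and the mc_sketch_alloc()/mc_sketch_free() helpers are
hypothetical, separate calloc() calls replace the single aligned
allocation done by the real code, and each (csrow, channel) pair gets
its own dimm_info, matching the "one DIMM per rank" assumption noted in
this patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, trimmed-down versions of the structs touched by this patch. */
struct dimm_info {
	unsigned int csrow;		/* location: chip-select row */
	unsigned int csrow_channel;	/* location: channel inside the row */
};

struct rank_info {
	unsigned int chan_idx;
	struct dimm_info *dimm;
};

struct mc_sketch {
	unsigned int nr_csrows;
	unsigned int nr_chans;
	unsigned int nr_dimms;
	struct rank_info *ranks;	/* nr_csrows * nr_chans entries, row-major */
	struct dimm_info *dimms;	/* one entry per (csrow, channel) pair */
};

void mc_sketch_free(struct mc_sketch *mc)
{
	if (!mc)
		return;
	free(mc->dimms);
	free(mc->ranks);
	free(mc);
}

struct mc_sketch *mc_sketch_alloc(unsigned int nr_csrows, unsigned int nr_chans)
{
	size_t total = (size_t)nr_csrows * nr_chans;
	struct mc_sketch *mc;
	struct dimm_info *dimm;
	unsigned int row, chn;

	if (total == 0)
		return NULL;

	mc = calloc(1, sizeof(*mc));
	if (!mc)
		return NULL;

	mc->ranks = calloc(total, sizeof(*mc->ranks));
	mc->dimms = calloc(total, sizeof(*mc->dimms));
	if (!mc->ranks || !mc->dimms) {
		mc_sketch_free(mc);
		return NULL;
	}
	mc->nr_csrows = nr_csrows;
	mc->nr_chans = nr_chans;

	/* For now, one dimm_info per (csrow, channel) pair, handed out in order. */
	dimm = mc->dimms;
	for (row = 0; row < nr_csrows; row++) {
		for (chn = 0; chn < nr_chans; chn++) {
			struct rank_info *chan =
				&mc->ranks[(size_t)row * nr_chans + chn];

			chan->chan_idx = chn;
			chan->dimm = dimm;
			dimm->csrow = row;
			dimm->csrow_channel = chn;
			dimm++;
			mc->nr_dimms++;
		}
	}
	assert(mc->nr_dimms == total);
	return mc;
}
```

With this layout, a label lookup goes through the rank's dimm pointer,
which is the indirection the converted drivers use.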

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_mc.c       |   47 +++++++++++++++++++++++++++++++----------
 drivers/edac/edac_mc_sysfs.c |   11 ++++-----
 drivers/edac/i5100_edac.c    |    8 +++---
 drivers/edac/i7core_edac.c   |    4 +-
 drivers/edac/i82975x_edac.c  |    2 +-
 drivers/edac/sb_edac.c       |    4 +-
 include/linux/edac.h         |   28 ++++++++++++++++++++----
 7 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 690cbf1..ba2599e 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,7 +44,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->label);
+	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
 }
 
@@ -157,6 +157,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	struct mem_ctl_info *mci;
 	struct csrow_info *csi, *csrow;
 	struct rank_info *chi, *chp, *chan;
+	struct dimm_info *dimm;
 	void *pvt;
 	unsigned size;
 	int row, chn;
@@ -170,7 +171,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	mci = (struct mem_ctl_info *)0;
 	csi = edac_align_ptr(&mci[1], sizeof(*csi));
 	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	pvt = edac_align_ptr(&chi[nr_chans * nr_csrows], sz_pvt);
+	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
+	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	mci = kzalloc(size, GFP_KERNEL);
@@ -182,14 +184,22 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
+	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
+	mci->dimms  = dimm;
 	mci->pvt_info = pvt;
 	mci->nr_csrows = nr_csrows;
 
+	/*
+	 * For now, assumes that a per-csrow arrangement for dimms.
+	 * This will be latter changed.
+	 */
+	dimm = mci->dimms;
+
 	for (row = 0; row < nr_csrows; row++) {
 		csrow = &csi[row];
 		csrow->csrow_idx = row;
@@ -202,6 +212,12 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 			chan = &chp[chn];
 			chan->chan_idx = chn;
 			chan->csrow = csrow;
+
+			mci->csrows[row].channels[chn].dimm = dimm;
+			dimm->csrow = row;
+			dimm->csrow_channel = chn;
+			dimm++;
+			mci->nr_dimms++;
 		}
 	}
 
@@ -678,6 +694,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 		int row, int channel, const char *msg)
 {
 	unsigned long remapped_page;
+	char *label = NULL;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -701,6 +718,8 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 		return;
 	}
 
+	label = mci->csrows[row].channels[channel].dimm->label;
+
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
 		edac_mc_printk(mci, KERN_WARNING,
@@ -708,7 +727,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
 			page_frame_number, offset_in_page,
 			mci->csrows[row].grain, syndrome, row, channel,
-			mci->csrows[row].channels[channel].label, msg);
+			label, msg);
 
 	mci->ce_count++;
 	mci->csrows[row].ce_count++;
@@ -754,6 +773,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 	char *pos = labels;
 	int chan;
 	int chars;
+	char *label = NULL;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -767,15 +787,15 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		return;
 	}
 
-	chars = snprintf(pos, len + 1, "%s",
-			 mci->csrows[row].channels[0].label);
+	label = mci->csrows[row].channels[0].dimm->label;
+	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
 
 	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
 		chan++) {
-		chars = snprintf(pos, len + 1, ":%s",
-				 mci->csrows[row].channels[chan].label);
+		label = mci->csrows[row].channels[chan].dimm->label;
+		chars = snprintf(pos, len + 1, ":%s", label);
 		len -= chars;
 		pos += chars;
 	}
@@ -824,6 +844,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	char labels[len + 1];
 	char *pos = labels;
 	int chars;
+	char *label;
 
 	if (csrow >= mci->nr_csrows) {
 		/* something is wrong */
@@ -858,12 +879,12 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	mci->csrows[csrow].ue_count++;
 
 	/* Generate the DIMM labels from the specified channels */
-	chars = snprintf(pos, len + 1, "%s",
-			 mci->csrows[csrow].channels[channela].label);
+	label = mci->csrows[csrow].channels[channela].dimm->label;
+	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
 	chars = snprintf(pos, len + 1, "-%s",
-			 mci->csrows[csrow].channels[channelb].label);
+			mci->csrows[csrow].channels[channelb].dimm->label);
 
 	if (edac_mc_get_log_ue())
 		edac_mc_printk(mci, KERN_EMERG,
@@ -885,6 +906,7 @@ EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
 void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 			unsigned int csrow, unsigned int channel, char *msg)
 {
+	char *label = NULL;
 
 	/* Ensure boundary values */
 	if (csrow >= mci->nr_csrows) {
@@ -904,12 +926,13 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 		return;
 	}
 
+	label = mci->csrows[csrow].channels[channel].dimm->label;
+
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
 		edac_mc_printk(mci, KERN_WARNING,
 			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel,
-			mci->csrows[csrow].channels[channel].label, msg);
+			csrow, channel, label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d56e634..c83697c 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -170,11 +170,11 @@ static ssize_t channel_dimm_label_show(struct csrow_info *csrow,
 				char *data, int channel)
 {
 	/* if field has not been initialized, there is nothing to send */
-	if (!csrow->channels[channel].label[0])
+	if (!csrow->channels[channel].dimm->label[0])
 		return 0;
 
 	return snprintf(data, EDAC_MC_LABEL_LEN, "%s\n",
-			csrow->channels[channel].label);
+			csrow->channels[channel].dimm->label);
 }
 
 static ssize_t channel_dimm_label_store(struct csrow_info *csrow,
@@ -184,8 +184,8 @@ static ssize_t channel_dimm_label_store(struct csrow_info *csrow,
 	ssize_t max_size = 0;
 
 	max_size = min((ssize_t) count, (ssize_t) EDAC_MC_LABEL_LEN - 1);
-	strncpy(csrow->channels[channel].label, data, max_size);
-	csrow->channels[channel].label[max_size] = '\0';
+	strncpy(csrow->channels[channel].dimm->label, data, max_size);
+	csrow->channels[channel].dimm->label[max_size] = '\0';
 
 	return max_size;
 }
@@ -952,9 +952,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (csrow->nr_pages > 0) {
+		if (mci->csrows[i].nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
-		}
 	}
 
 	/* remove the mci instance's attributes, if any */
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 2a6e7ff..2ce7ef1 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -433,7 +433,7 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].label, msg);
+		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
@@ -455,7 +455,7 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].label, msg);
+		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
 
 	mci->ue_count++;
 	mci->csrows[csrow].ue_count++;
@@ -868,8 +868,8 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		mci->csrows[i].channels[0].chan_idx = 0;
 		mci->csrows[i].channels[0].ce_count = 0;
 		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		snprintf(mci->csrows[i].channels[0].label,
-			 sizeof(mci->csrows[i].channels[0].label),
+		snprintf(mci->csrows[i].channels[0].dimm->label,
+			 sizeof(mci->csrows[i].channels[0].dimm->label),
 			 "DIMM%u", i5100_rank_to_slot(mci, chan, rank));
 
 		total_pages += npages;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 8568d9b..5203f30 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -746,8 +746,8 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 
 			csr->edac_mode = mode;
 			csr->mtype = mtype;
-			snprintf(csr->channels[0].label,
-					sizeof(csr->channels[0].label),
+			snprintf(csr->channels[0].dimm->label,
+					sizeof(csr->channels[0].dimm->label),
 					"CPU#%uChannel#%u_DIMM#%u",
 					pvt->i7core_dev->socket, i, j);
 
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 4184e01..864061b 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -407,7 +407,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 *   [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
 		for (chan = 0; chan < csrow->nr_channels; chan++)
-			strncpy(csrow->channels[chan].label,
+			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 2917887..dea1ef3 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -651,8 +651,8 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 				csr->channels[0].chan_idx = i;
 				csr->channels[0].ce_count = 0;
 				pvt->csrow_map[i][j] = csrow;
-				snprintf(csr->channels[0].label,
-					 sizeof(csr->channels[0].label),
+				snprintf(csr->channels[0].dimm->label,
+					 sizeof(csr->channels[0].dimm->label),
 					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
 					 pvt->sbridge_dev->source_id, i, j);
 				last_page += npages;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index e3e3d26..f40b835 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -308,23 +308,34 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
+/* FIXME: add a per-dimm ce error count */
+struct dimm_info {
+	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
+	unsigned memory_controller;
+	unsigned csrow;
+	unsigned csrow_channel;
+};
+
 /**
  * struct rank_info - contains the information for one DIMM rank
  *
  * @chan_idx:	channel number where the rank is (typically, 0 or 1)
  * @ce_count:	number of correctable errors for this rank
- * @label:	DIMM label. Different ranks for the same DIMM should be
- *		filled, on userspace, with the same label.
- *		FIXME: The core currently won't enforce it.
  * @csrow:	A pointer to the chip select row structure (the parent
  *		structure). The location of the rank is given by
  *		the (csrow->csrow_idx, chan_idx) vector.
+ * @dimm:	A pointer to the DIMM structure, where the DIMM label
+ *		information is stored.
+ *
+ * FIXME: Currently, the EDAC core model will assume one DIMM per rank.
+ *	  This is a bad assumption, but it makes this patch easier. Later
+ *	  patches in this series will fix this issue.
  */
 struct rank_info {
 	int chan_idx;
 	u32 ce_count;
-	char label[EDAC_MC_LABEL_LEN + 1];
-	struct csrow_info *csrow;	/* the parent */
+	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 };
 
 struct csrow_info {
@@ -424,6 +435,13 @@ struct mem_ctl_info {
 	int mc_idx;
 	int nr_csrows;
 	struct csrow_info *csrows;
+
+	/*
+	 * DIMM info. Will eventually remove the entire csrows_info some day
+	 */
+	unsigned nr_dimms;
+	struct dimm_info *dimms;
+
 	/*
 	 * FIXME - what about controllers on other busses? - IDs must be
 	 * unique.  dev pointer should be sufficiently unique, but
-- 
1.7.8



* [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
  2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-26 14:26     ` Borislav Petkov
  2012-04-16 20:12   ` [EDAC PATCH v13 3/7] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
                     ` (5 subsequent siblings)
  7 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mike Williams, Mauro Carvalho Chehab, Jason Uhlenkott,
	Hitoshi Mitake, Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	James Bottomley, Linux Kernel Mailing List, Joe Perches,
	Andrew Morton, linuxppc-dev

On systems based on chip select rows, all channels need to use memories
with the same properties, otherwise the memories on channels A and B
won't be recognized.

However, that assumption does not hold for all types of memory
controllers.

Controllers for FB-DIMM's don't have such requirements.

Also, modern Intel controllers seem to be capable of handling such
differences.

So, we need to stop storing the DIMM information in per-csrow
data and store it, instead, in the right place.

The first step is to move grain, mtype, dtype and edac_mode to the
per-dimm struct.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: James Bottomley <James.Bottomley@parallels.com>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Mike Williams <mike@mikebwilliams.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |   18 ++++++++----
 drivers/edac/amd76x_edac.c     |   10 ++++--
 drivers/edac/cell_edac.c       |   10 +++++-
 drivers/edac/cpc925_edac.c     |   62 +++++++++++++++++++++------------------
 drivers/edac/e752x_edac.c      |   44 +++++++++++++++-------------
 drivers/edac/e7xxx_edac.c      |   44 ++++++++++++++++------------
 drivers/edac/edac_mc.c         |   19 ++++++++----
 drivers/edac/edac_mc_sysfs.c   |    6 ++--
 drivers/edac/i3000_edac.c      |   18 ++++++-----
 drivers/edac/i3200_edac.c      |   18 ++++++-----
 drivers/edac/i5000_edac.c      |   24 +++++++--------
 drivers/edac/i5100_edac.c      |   38 +++++++++++++-----------
 drivers/edac/i5400_edac.c      |   24 ++++++---------
 drivers/edac/i7300_edac.c      |   25 +++++++++------
 drivers/edac/i7core_edac.c     |   27 ++++++++---------
 drivers/edac/i82443bxgx_edac.c |   13 +++++---
 drivers/edac/i82860_edac.c     |   11 ++++--
 drivers/edac/i82875p_edac.c    |   17 ++++++++---
 drivers/edac/i82975x_edac.c    |   17 +++++++----
 drivers/edac/mpc85xx_edac.c    |   13 +++++---
 drivers/edac/mv64x60_edac.c    |   18 ++++++-----
 drivers/edac/pasemi_edac.c     |   10 ++++--
 drivers/edac/ppc4xx_edac.c     |   13 +++++---
 drivers/edac/r82600_edac.c     |   10 ++++--
 drivers/edac/sb_edac.c         |   31 +++++++++++---------
 drivers/edac/tile_edac.c       |   13 ++++----
 drivers/edac/x38_edac.c        |   17 ++++++-----
 include/linux/edac.h           |   21 ++++++++-----
 28 files changed, 334 insertions(+), 257 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index c9eee6d..c4c61fb 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2168,7 +2168,9 @@ static int init_csrows(struct mem_ctl_info *mci)
 	struct amd64_pvt *pvt = mci->pvt_info;
 	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
 	u32 val;
-	int i, empty = 1;
+	int i, j, empty = 1;
+	enum mem_type mtype;
+	enum edac_type edac_mode;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2202,7 +2204,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		csrow->page_mask = ~mask;
 		/* 8 bytes of resolution */
 
-		csrow->mtype = amd64_determine_memory_type(pvt, i);
+		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
 		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
@@ -2219,11 +2221,15 @@ static int init_csrows(struct mem_ctl_info *mci)
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
 		 */
 		if (pvt->nbcfg & NBCFG_ECC_ENABLE)
-			csrow->edac_mode =
-			    (pvt->nbcfg & NBCFG_CHIPKILL) ?
-			    EDAC_S4ECD4ED : EDAC_SECDED;
+			edac_mode = (pvt->nbcfg & NBCFG_CHIPKILL) ?
+				    EDAC_S4ECD4ED : EDAC_SECDED;
 		else
-			csrow->edac_mode = EDAC_NONE;
+			edac_mode = EDAC_NONE;
+
+		for (j = 0; j < pvt->channel_count; j++) {
+			csrow->channels[j].dimm->mtype = mtype;
+			csrow->channels[j].dimm->edac_mode = edac_mode;
+		}
 	}
 
 	return empty;
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index e47e73b..2a63ed0 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -186,11 +186,13 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			enum edac_type edac_mode)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 mba, mba_base, mba_mask, dms;
 	int index;
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_dword(pdev,
@@ -206,10 +208,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		csrow->grain = csrow->nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_RDDR;
-		csrow->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
-		csrow->edac_mode = edac_mode;
+		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->mtype = MEM_RDDR;
+		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
+		dimm->edac_mode = edac_mode;
 	}
 }
 
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 9a6a274..94fbb12 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -124,8 +124,10 @@ static void cell_edac_check(struct mem_ctl_info *mci)
 static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 {
 	struct csrow_info		*csrow = &mci->csrows[0];
+	struct dimm_info		*dimm;
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
+	int				j;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,8 +144,12 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		csrow->first_page = r.start >> PAGE_SHIFT;
 		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		csrow->mtype = MEM_XDR;
-		csrow->edac_mode = EDAC_SECDED;
+
+		for (j = 0; j < csrow->nr_channels; j++) {
+			dimm = csrow->channels[j].dimm;
+			dimm->mtype = MEM_XDR;
+			dimm->edac_mode = EDAC_SECDED;
+		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index a774c0d..ee90f3d 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -329,7 +329,8 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 {
 	struct cpc925_mc_pdata *pdata = mci->pvt_info;
 	struct csrow_info *csrow;
-	int index;
+	struct dimm_info *dimm;
+	int index, j;
 	u32 mbmr, mbbar, bba;
 	unsigned long row_size, last_nr_pages = 0;
 
@@ -354,32 +355,35 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
-		csrow->mtype = MEM_RDDR;
-		csrow->edac_mode = EDAC_SECDED;
-
-		switch (csrow->nr_channels) {
-		case 1: /* Single channel */
-			csrow->grain = 32; /* four-beat burst of 32 bytes */
-			break;
-		case 2: /* Dual channel */
-		default:
-			csrow->grain = 64; /* four-beat burst of 64 bytes */
-			break;
-		}
-
-		switch ((mbmr & MBMR_MODE_MASK) >> MBMR_MODE_SHIFT) {
-		case 6: /* 0110, no way to differentiate X8 VS X16 */
-		case 5:	/* 0101 */
-		case 8: /* 1000 */
-			csrow->dtype = DEV_X16;
-			break;
-		case 7: /* 0111 */
-		case 9: /* 1001 */
-			csrow->dtype = DEV_X8;
-			break;
-		default:
-			csrow->dtype = DEV_UNKNOWN;
-			break;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			dimm = csrow->channels[j].dimm;
+			dimm->mtype = MEM_RDDR;
+			dimm->edac_mode = EDAC_SECDED;
+
+			switch (csrow->nr_channels) {
+			case 1: /* Single channel */
+				dimm->grain = 32; /* four-beat burst of 32 bytes */
+				break;
+			case 2: /* Dual channel */
+			default:
+				dimm->grain = 64; /* four-beat burst of 64 bytes */
+				break;
+			}
+
+			switch ((mbmr & MBMR_MODE_MASK) >> MBMR_MODE_SHIFT) {
+			case 6: /* 0110, no way to differentiate X8 VS X16 */
+			case 5:	/* 0101 */
+			case 8: /* 1000 */
+				dimm->dtype = DEV_X16;
+				break;
+			case 7: /* 0111 */
+			case 9: /* 1001 */
+				dimm->dtype = DEV_X8;
+				break;
+			default:
+				dimm->dtype = DEV_UNKNOWN;
+				break;
+			}
 		}
 	}
 }
@@ -962,9 +966,9 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	nr_channels = cpc925_mc_get_channels(vbase);
+	nr_channels = cpc925_mc_get_channels(vbase) + 1;
 	mci = edac_mc_alloc(sizeof(struct cpc925_mc_pdata),
-			CPC925_NR_CSROWS, nr_channels + 1, edac_mc_idx);
+			CPC925_NR_CSROWS, nr_channels, edac_mc_idx);
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
 		res = -ENOMEM;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 1af531a..db291ea 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size;
+	u32 dra, drc, cumul_size, i;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1053,7 +1053,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		dra |= dra_reg << (index * 8);
 	}
 	pci_read_config_dword(pdev, E752X_DRC, &drc);
-	drc_chan = dual_channel_active(ddrcsr);
+	drc_chan = dual_channel_active(ddrcsr) ? 1 : 0;
 	drc_drbg = drc_chan + 1;	/* 128 in dual mode, 64 in single */
 	drc_ddim = (drc >> 20) & 0x3;
 
@@ -1080,24 +1080,28 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
-		csrow->mtype = MEM_RDDR;	/* only one type supported */
-		csrow->dtype = mem_dev ? DEV_X4 : DEV_X8;
-
-		/*
-		 * if single channel or x8 devices then SECDED
-		 * if dual channel and x4 then S4ECD4ED
-		 */
-		if (drc_ddim) {
-			if (drc_chan && mem_dev) {
-				csrow->edac_mode = EDAC_S4ECD4ED;
-				mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
-			} else {
-				csrow->edac_mode = EDAC_SECDED;
-				mci->edac_cap |= EDAC_FLAG_SECDED;
-			}
-		} else
-			csrow->edac_mode = EDAC_NONE;
+
+		for (i = 0; i < drc_chan + 1; i++) {
+			struct dimm_info *dimm = csrow->channels[i].dimm;
+			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
+			dimm->mtype = MEM_RDDR;	/* only one type supported */
+			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
+
+			/*
+			* if single channel or x8 devices then SECDED
+			* if dual channel and x4 then S4ECD4ED
+			*/
+			if (drc_ddim) {
+				if (drc_chan && mem_dev) {
+					dimm->edac_mode = EDAC_S4ECD4ED;
+					mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
+				} else {
+					dimm->edac_mode = EDAC_SECDED;
+					mci->edac_cap |= EDAC_FLAG_SECDED;
+				}
+			} else
+				dimm->edac_mode = EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 6ffb6d2..178d2af 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -347,11 +347,12 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			int dev_idx, u32 drc)
 {
 	unsigned long last_cumul_size;
-	int index;
+	int index, j;
 	u8 value;
 	u32 dra, cumul_size;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 
 	pci_read_config_dword(pdev, E7XXX_DRA, &dra);
 	drc_chan = dual_channel_active(drc, dev_idx);
@@ -381,24 +382,29 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
-		csrow->mtype = MEM_RDDR;	/* only one type supported */
-		csrow->dtype = mem_dev ? DEV_X4 : DEV_X8;
-
-		/*
-		 * if single channel or x8 devices then SECDED
-		 * if dual channel and x4 then S4ECD4ED
-		 */
-		if (drc_ddim) {
-			if (drc_chan && mem_dev) {
-				csrow->edac_mode = EDAC_S4ECD4ED;
-				mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
-			} else {
-				csrow->edac_mode = EDAC_SECDED;
-				mci->edac_cap |= EDAC_FLAG_SECDED;
-			}
-		} else
-			csrow->edac_mode = EDAC_NONE;
+
+		for (j = 0; j < drc_chan + 1; j++) {
+			dimm = csrow->channels[j].dimm;
+
+			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
+			dimm->mtype = MEM_RDDR;	/* only one type supported */
+			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
+
+			/*
+			* if single channel or x8 devices then SECDED
+			* if dual channel and x4 then S4ECD4ED
+			*/
+			if (drc_ddim) {
+				if (drc_chan && mem_dev) {
+					dimm->edac_mode = EDAC_S4ECD4ED;
+					mci->edac_cap |= EDAC_FLAG_S4ECD4ED;
+				} else {
+					dimm->edac_mode = EDAC_SECDED;
+					mci->edac_cap |= EDAC_FLAG_SECDED;
+				}
+			} else
+				dimm->edac_mode = EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index ba2599e..f83e63d 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,7 +43,7 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->ce_count);
+	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
 	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
 }
@@ -695,6 +695,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 {
 	unsigned long remapped_page;
 	char *label = NULL;
+	u32 grain;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -719,6 +720,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 	}
 
 	label = mci->csrows[row].channels[channel].dimm->label;
+	grain = mci->csrows[row].channels[channel].dimm->grain;
 
 	if (edac_mc_get_log_ce())
 		/* FIXME - put in DIMM location */
@@ -726,11 +728,12 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
 			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
 			page_frame_number, offset_in_page,
-			mci->csrows[row].grain, syndrome, row, channel,
+			grain, syndrome, row, channel,
 			label, msg);
 
 	mci->ce_count++;
 	mci->csrows[row].ce_count++;
+	mci->csrows[row].channels[channel].dimm->ce_count++;
 	mci->csrows[row].channels[channel].ce_count++;
 
 	if (mci->scrub_mode & SCRUB_SW_SRC) {
@@ -747,8 +750,7 @@ void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			mci->ctl_page_to_phys(mci, page_frame_number) :
 			page_frame_number;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page,
-				mci->csrows[row].grain);
+		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
 	}
 }
 EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
@@ -774,6 +776,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 	int chan;
 	int chars;
 	char *label = NULL;
+	u32 grain;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
@@ -787,6 +790,7 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		return;
 	}
 
+	grain = mci->csrows[row].channels[0].dimm->grain;
 	label = mci->csrows[row].channels[0].dimm->label;
 	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
@@ -804,14 +808,13 @@ void edac_mc_handle_ue(struct mem_ctl_info *mci,
 		edac_mc_printk(mci, KERN_EMERG,
 			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
 			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, mci->csrows[row].grain, row,
-			labels, msg);
+			offset_in_page, grain, row, labels, msg);
 
 	if (edac_mc_get_panic_on_ue())
 		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
 			"row %d, labels \"%s\": %s\n", mci->mc_idx,
 			page_frame_number, offset_in_page,
-			mci->csrows[row].grain, row, labels, msg);
+			grain, row, labels, msg);
 
 	mci->ue_count++;
 	mci->csrows[row].ue_count++;
@@ -883,6 +886,7 @@ void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
 	chars = snprintf(pos, len + 1, "%s", label);
 	len -= chars;
 	pos += chars;
+
 	chars = snprintf(pos, len + 1, "-%s",
 			mci->csrows[csrow].channels[channelb].dimm->label);
 
@@ -936,6 +940,7 @@ void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
+	mci->csrows[csrow].channels[channel].dimm->ce_count++;
 	mci->csrows[csrow].channels[channel].ce_count++;
 }
 EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index c83697c..d63904e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -150,19 +150,19 @@ static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", mem_types[csrow->mtype]);
+	return sprintf(data, "%s\n", mem_types[csrow->channels[0].dimm->mtype]);
 }
 
 static ssize_t csrow_dev_type_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", dev_types[csrow->dtype]);
+	return sprintf(data, "%s\n", dev_types[csrow->channels[0].dimm->dtype]);
 }
 
 static ssize_t csrow_edac_mode_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%s\n", edac_caps[csrow->edac_mode]);
+	return sprintf(data, "%s\n", edac_caps[csrow->channels[0].dimm->edac_mode]);
 }
 
 /* show/store functions for DIMM Label attributes */
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index c0510b3..1498c5f 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -304,7 +304,7 @@ static int i3000_is_interleaved(const unsigned char *c0dra,
 static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_cumul_size;
 	int interleaved, nr_channels;
@@ -386,19 +386,21 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 			cumul_size <<= 1;
 		debugf3("MC: %s(): (%d) cumul_size 0x%x\n",
 			__func__, i, cumul_size);
-		if (cumul_size == last_cumul_size) {
-			csrow->mtype = MEM_EMPTY;
+		if (cumul_size == last_cumul_size)
 			continue;
-		}
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = I3000_DEAP_GRAIN;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+
+		for (j = 0; j < nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			dimm->grain = I3000_DEAP_GRAIN;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	/*
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 73f55e200..73529fd 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -319,7 +319,7 @@ static unsigned long drb_to_nr_pages(
 static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_page;
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
@@ -375,20 +375,22 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 			i / I3200_RANKS_PER_CHANNEL,
 			i % I3200_RANKS_PER_CHANNEL);
 
-		if (nr_pages == 0) {
-			csrow->mtype = MEM_EMPTY;
+		if (nr_pages == 0)
 			continue;
-		}
 
 		csrow->first_page = last_page + 1;
 		last_page += nr_pages;
 		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
-		csrow->grain = nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+		for (j = 0; j < nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->grain = nr_pages << PAGE_SHIFT;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	i3200_clear_error_info(mci);
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 4dc3ac2..e612f1e 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1268,25 +1268,23 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 		p_csrow->last_page = 9 + csrow * 20;
 		p_csrow->page_mask = 0xFFF;
 
-		p_csrow->grain = 8;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-		}
+			p_csrow->channels[channel].dimm->grain = 8;
 
-		p_csrow->nr_pages = csrow_megs << 8;
+			/* Assume DDR2 for now */
+			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
 
-		/* Assume DDR2 for now */
-		p_csrow->mtype = MEM_FB_DDR2;
+			/* ask what device type on this row */
+			if (MTR_DRAM_WIDTH(mtr))
+				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+			else
+				p_csrow->channels[channel].dimm->dtype = DEV_X4;
 
-		/* ask what device type on this row */
-		if (MTR_DRAM_WIDTH(mtr))
-			p_csrow->dtype = DEV_X8;
-		else
-			p_csrow->dtype = DEV_X4;
-
-		p_csrow->edac_mode = EDAC_S8ECD8ED;
+			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+		}
+		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 2ce7ef1..9caff36 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -428,12 +428,16 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    const char *msg)
 {
 	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
+	char *label = NULL;
+
+	if (mci->csrows[csrow].channels[0].dimm)
+		label = mci->csrows[csrow].channels[0].dimm->label;
 
 	printk(KERN_ERR
 		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
+		csrow, label, msg);
 
 	mci->ce_count++;
 	mci->csrows[csrow].ce_count++;
@@ -450,12 +454,16 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 			    const char *msg)
 {
 	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
+	char *label = NULL;
+
+	if (mci->csrows[csrow].channels[0].dimm)
+		label = mci->csrows[csrow].channels[0].dimm->label;
 
 	printk(KERN_ERR
 		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
 		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
 		chan, bank, rank, syndrome, cas, ras,
-		csrow, mci->csrows[csrow].channels[0].dimm->label, msg);
+		csrow, label, msg);
 
 	mci->ue_count++;
 	mci->csrows[csrow].ue_count++;
@@ -837,6 +845,7 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 	int i;
 	unsigned long total_pages = 0UL;
 	struct i5100_priv *priv = mci->pvt_info;
+	struct dimm_info *dimm;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		const unsigned long npages = i5100_npages(mci, i);
@@ -852,27 +861,22 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 */
 		mci->csrows[i].first_page = total_pages;
 		mci->csrows[i].last_page = total_pages + npages - 1;
-		mci->csrows[i].page_mask = 0UL;
-
 		mci->csrows[i].nr_pages = npages;
-		mci->csrows[i].grain = 32;
 		mci->csrows[i].csrow_idx = i;
-		mci->csrows[i].dtype =
-			(priv->mtr[chan][rank].width == 4) ? DEV_X4 : DEV_X8;
-		mci->csrows[i].ue_count = 0;
-		mci->csrows[i].ce_count = 0;
-		mci->csrows[i].mtype = MEM_RDDR2;
-		mci->csrows[i].edac_mode = EDAC_SECDED;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
-		mci->csrows[i].channels[0].chan_idx = 0;
-		mci->csrows[i].channels[0].ce_count = 0;
 		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		snprintf(mci->csrows[i].channels[0].dimm->label,
-			 sizeof(mci->csrows[i].channels[0].dimm->label),
-			 "DIMM%u", i5100_rank_to_slot(mci, chan, rank));
-
 		total_pages += npages;
+
+		dimm = mci->csrows[i].channels[0].dimm;
+		dimm->grain = 32;
+		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+			      DEV_X4 : DEV_X8;
+		dimm->mtype = MEM_RDDR2;
+		dimm->edac_mode = EDAC_SECDED;
+		snprintf(dimm->label, sizeof(dimm->label),
+			 "DIMM%u",
+			 i5100_rank_to_slot(mci, chan, rank));
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index b44a5de..229aff5 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1159,6 +1159,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int csrow_megs;
 	int channel;
 	int csrow;
+	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
 
@@ -1184,24 +1185,17 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		p_csrow->last_page = 9 + csrow * 20;
 		p_csrow->page_mask = 0xFFF;
 
-		p_csrow->grain = 8;
-
 		csrow_megs = 0;
-		for (channel = 0; channel < pvt->maxch; channel++)
+		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
 
-		p_csrow->nr_pages = csrow_megs << 8;
-
-		/* Assume DDR2 for now */
-		p_csrow->mtype = MEM_FB_DDR2;
-
-		/* ask what device type on this row */
-		if (MTR_DRAM_WIDTH(mtr))
-			p_csrow->dtype = DEV_X8;
-		else
-			p_csrow->dtype = DEV_X4;
-
-		p_csrow->edac_mode = EDAC_S8ECD8ED;
+			p_csrow->nr_pages = csrow_megs << 8;
+			dimm = p_csrow->channels[channel].dimm;
+			dimm->grain = 8;
+			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+		}
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 6104dba..07a5927 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -618,6 +618,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
 		      struct csrow_info *p_csrow,
+		      struct dimm_info *dimm,
 		      u32 *nr_pages)
 {
 	int mtr, ans, addrBits, channel;
@@ -663,10 +664,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->grain = 8;
-	p_csrow->mtype = MEM_FB_DDR2;
 	p_csrow->csrow_idx = slot;
-	p_csrow->page_mask = 0;
 
 	/*
 	 * The type of error detection actually depends of the
@@ -677,15 +675,17 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->grain = 8;
+	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
-		p_csrow->edac_mode = EDAC_SECDED;
+		dimm->edac_mode = EDAC_SECDED;
 		debugf2("\t\tECC code is 8-byte-over-32-byte SECDED+ code\n");
 	} else {
 		debugf2("\t\tECC code is on Lockstep mode\n");
 		if (MTR_DRAM_WIDTH(mtr) == 8)
-			p_csrow->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
 		else
-			p_csrow->edac_mode = EDAC_S4ECD4ED;
+			dimm->edac_mode = EDAC_S4ECD4ED;
 	}
 
 	/* ask what device type on this row */
@@ -694,9 +694,9 @@ static int decode_mtr(struct i7300_pvt *pvt,
 			IS_SCRBALGO_ENHANCED(pvt->mc_settings) ?
 					    "enhanced" : "normal");
 
-		p_csrow->dtype = DEV_X8;
+		dimm->dtype = DEV_X8;
 	} else
-		p_csrow->dtype = DEV_X4;
+		dimm->dtype = DEV_X4;
 
 	return mtr;
 }
@@ -779,6 +779,7 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	int mtr;
 	int ch, branch, slot, channel;
 	u32 last_page = 0, nr_pages;
+	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
 
@@ -803,20 +804,24 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
+	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
 			pci_read_config_word(pvt->pci_dev_2x_0_fbd_branch[branch],
 					where,
 					&pvt->mtr[slot][branch]);
-			for (ch = 0; ch < MAX_BRANCHES; ch++) {
+			for (ch = 0; ch < MAX_CH_PER_BRANCH; ch++) {
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
 				p_csrow = &mci->csrows[slot];
 
+				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, &nr_pages);
+						 dinfo, p_csrow, dimm,
+						 &nr_pages);
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 5203f30..21f9791 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -592,7 +592,7 @@ static int i7core_get_active_channels(const u8 socket, unsigned *channels,
 	return 0;
 }
 
-static int get_dimm_config(const struct mem_ctl_info *mci)
+static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
 	struct csrow_info *csr;
@@ -602,6 +602,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	unsigned long last_page = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
+	struct dimm_info *dimm;
 
 	/* Get data from the MC register, function 0 */
 	pdev = pvt->pci_mcr[0];
@@ -721,7 +722,6 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 			csr->nr_pages = npages;
 
 			csr->page_mask = 0;
-			csr->grain = 8;
 			csr->csrow_idx = csrow;
 			csr->nr_channels = 1;
 
@@ -730,28 +730,27 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 
 			pvt->csrow_map[i][j] = csrow;
 
+			dimm = csr->channels[0].dimm;
 			switch (banks) {
 			case 4:
-				csr->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 				break;
 			case 8:
-				csr->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 				break;
 			case 16:
-				csr->dtype = DEV_X16;
+				dimm->dtype = DEV_X16;
 				break;
 			default:
-				csr->dtype = DEV_UNKNOWN;
+				dimm->dtype = DEV_UNKNOWN;
 			}
 
-			csr->edac_mode = mode;
-			csr->mtype = mtype;
-			snprintf(csr->channels[0].dimm->label,
-					sizeof(csr->channels[0].dimm->label),
-					"CPU#%uChannel#%u_DIMM#%u",
-					pvt->i7core_dev->socket, i, j);
-
-			csrow++;
+			snprintf(dimm->label, sizeof(dimm->label),
+				 "CPU#%uChannel#%u_DIMM#%u",
+				 pvt->i7core_dev->socket, i, j);
+			dimm->grain = 8;
+			dimm->edac_mode = mode;
+			dimm->mtype = mtype;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 4329d39..1e19492 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -12,7 +12,7 @@
  * 440GX fix by Jason Uhlenkott <juhlenko@akamai.com>.
  *
  * Written with reference to 82443BX Host Bridge Datasheet:
- * http://download.intel.com/design/chipsets/datashts/29063301.pdf 
+ * http://download.intel.com/design/chipsets/datashts/29063301.pdf
  * references to this document given in [].
  *
  * This module doesn't support the 440LX, but it may be possible to
@@ -189,6 +189,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 				enum mem_type mtype)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 	u8 drbar, dramc;
 	u32 row_base, row_high_limit, row_high_limit_last;
@@ -197,6 +198,8 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 	row_high_limit_last = 0;
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		pci_read_config_byte(pdev, I82443BXGX_DRB + index, &drbar);
 		debugf1("MC%d: %s: %s() Row=%d DRB = %#0x\n",
 			mci->mc_idx, __FILE__, __func__, index, drbar);
@@ -219,12 +222,12 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
 		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
-		csrow->grain = 1 << 12;
-		csrow->mtype = mtype;
+		dimm->grain = 1 << 12;
+		dimm->mtype = mtype;
 		/* I don't think 440BX can tell you device type? FIXME? */
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 		/* Mode is global to all rows on 440BX */
-		csrow->edac_mode = edac_mode;
+		dimm->edac_mode = edac_mode;
 		row_high_limit_last = row_high_limit;
 	}
 }
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 931a057..acbd924 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -140,6 +140,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 	u16 value;
 	u32 cumul_size;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 
 	pci_read_config_word(pdev, I82860_MCHCFG, &mchcfg_ddim);
@@ -153,6 +154,8 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 	 */
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		pci_read_config_word(pdev, I82860_GBA + index * 2, &value);
 		cumul_size = (value & I82860_GBA_MASK) <<
 			(I82860_GBA_SHIFT - PAGE_SHIFT);
@@ -166,10 +169,10 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
-		csrow->mtype = MEM_RMBS;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = mchcfg_ddim ? EDAC_SECDED : EDAC_NONE;
+		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
+		dimm->mtype = MEM_RMBS;
+		dimm->dtype = DEV_UNKNOWN;
+		dimm->edac_mode = mchcfg_ddim ? EDAC_SECDED : EDAC_NONE;
 	}
 }
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 33864c6..81f79e2 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -342,11 +342,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 				void __iomem * ovrfl_window, u32 drc)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
+	unsigned nr_chans = dual_channel_active(drc) + 1;
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
 	u32 cumul_size;
-	int index;
+	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
 	last_cumul_size = 0;
@@ -371,10 +373,15 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
-		csrow->mtype = MEM_DDR;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = drc_ddim ? EDAC_SECDED : EDAC_NONE;
+
+		for (j = 0; j < nr_chans; j++) {
+			dimm = csrow->channels[j].dimm;
+
+			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
+			dimm->mtype = MEM_DDR;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = drc_ddim ? EDAC_SECDED : EDAC_NONE;
+		}
 	}
 }
 
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 864061b..0b40e11 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -309,7 +309,7 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 	chan = (mci->csrows[row].nr_channels == 1) ? 0 : info->eap & 1;
 	offst = info->eap
 			& ((1 << PAGE_SHIFT) -
-				(1 << mci->csrows[row].grain));
+			   (1 << mci->csrows[row].channels[chan].dimm->grain));
 
 	if (info->errsts & 0x0002)
 		edac_mc_handle_ue(mci, page, offst , row, "i82975x UE");
@@ -372,6 +372,8 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	u8 value;
 	u32 cumul_size;
 	int index, chan;
+	struct dimm_info *dimm;
+	enum dev_type dtype;
 
 	last_cumul_size = 0;
 
@@ -406,10 +408,17 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 *   [0-7] for single-channel; i.e. csrow->nr_channels = 1
 		 *   [0-3] for dual-channel; i.e. csrow->nr_channels = 2
 		 */
-		for (chan = 0; chan < csrow->nr_channels; chan++)
+		dtype = i82975x_dram_type(mch_window, index);
+		for (chan = 0; chan < csrow->nr_channels; chan++) {
+			dimm = mci->csrows[index].channels[chan].dimm;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
+			dimm->grain = 1 << 7;	/* 128Byte cache-line resolution */
+			dimm->dtype = i82975x_dram_type(mch_window, index);
+			dimm->mtype = MEM_DDR2; /* I82975x supports only DDR2 */
+			dimm->edac_mode = EDAC_SECDED; /* only supported */
+		}
 
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -418,10 +427,6 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = cumul_size - 1;
 		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
-		csrow->grain = 1 << 7;	/* 128Byte cache-line resolution */
-		csrow->mtype = MEM_DDR2; /* I82975x supports only DDR2 */
-		csrow->dtype = i82975x_dram_type(mch_window, index);
-		csrow->edac_mode = EDAC_SECDED; /* only supported */
 	}
 }
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 73464a6..fb92916 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -883,6 +883,7 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 {
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 sdram_ctl;
 	u32 sdtype;
 	enum mem_type mtype;
@@ -929,6 +930,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 		u32 end;
 
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
+
 		cs_bnds = in_be32(pdata->mc_vbase + MPC85XX_MC_CS_BNDS_0 +
 				  (index * MPC85XX_MC_CS_BNDS_OFS));
 
@@ -945,12 +948,12 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 		csrow->first_page = start;
 		csrow->last_page = end;
 		csrow->nr_pages = end + 1 - start;
-		csrow->grain = 8;
-		csrow->mtype = mtype;
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->grain = 8;
+		dimm->mtype = mtype;
+		dimm->dtype = DEV_UNKNOWN;
 		if (sdram_ctl & DSC_X32_EN)
-			csrow->dtype = DEV_X32;
-		csrow->edac_mode = EDAC_SECDED;
+			dimm->dtype = DEV_X32;
+		dimm->edac_mode = EDAC_SECDED;
 	}
 }
 
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 7e5ff36..12d7fe0 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -656,6 +656,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 				struct mv64x60_mc_pdata *pdata)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
+
 	u32 devtype;
 	u32 ctl;
 
@@ -664,30 +666,30 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 	ctl = in_le32(pdata->mc_vbase + MV64X60_SDRAM_CONFIG);
 
 	csrow = &mci->csrows[0];
-	csrow->first_page = 0;
+	dimm = csrow->channels[0].dimm;
 	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-	csrow->grain = 8;
+	dimm->grain = 8;
 
-	csrow->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
+	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
 
 	devtype = (ctl >> 20) & 0x3;
 	switch (devtype) {
 	case 0x0:
-		csrow->dtype = DEV_X32;
+		dimm->dtype = DEV_X32;
 		break;
 	case 0x2:		/* could be X8 too, but no way to tell */
-		csrow->dtype = DEV_X16;
+		dimm->dtype = DEV_X16;
 		break;
 	case 0x3:
-		csrow->dtype = DEV_X4;
+		dimm->dtype = DEV_X4;
 		break;
 	default:
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 		break;
 	}
 
-	csrow->edac_mode = EDAC_SECDED;
+	dimm->edac_mode = EDAC_SECDED;
 }
 
 static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 7f71ee4..4e53270 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -135,11 +135,13 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 				   enum edac_type edac_mode)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	u32 rankcfg;
 	int index;
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		pci_read_config_dword(pdev,
 				      MCDRAM_RANKCFG + (index * 12),
@@ -177,10 +179,10 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 		last_page_in_mmc += csrow->nr_pages;
 		csrow->page_mask = 0;
-		csrow->grain = PASEMI_EDAC_ERROR_GRAIN;
-		csrow->mtype = MEM_DDR;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = edac_mode;
+		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
+		dimm->mtype = MEM_DDR;
+		dimm->dtype = DEV_UNKNOWN;
+		dimm->edac_mode = edac_mode;
 	}
 	return 0;
 }
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index d427c69..a75e567 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -895,7 +895,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum mem_type mtype;
 	enum dev_type dtype;
 	enum edac_type edac_mode;
-	int row;
+	int row, j;
 	u32 mbxcf, size;
 	static u32 ppc4xx_last_page;
 
@@ -975,15 +975,18 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		 * possible values would be the PLB width (16), the
 		 * page size (PAGE_SIZE) or the memory width (2 or 4).
 		 */
+		for (j = 0; j < csi->nr_channels; j++) {
+			struct dimm_info *dimm = csi->channels[j].dimm;
 
-		csi->grain	= 1;
+			dimm->grain	= 1;
 
-		csi->mtype	= mtype;
-		csi->dtype	= dtype;
+			dimm->mtype	= mtype;
+			dimm->dtype	= dtype;
 
-		csi->edac_mode	= edac_mode;
+			dimm->edac_mode	= edac_mode;
 
 		ppc4xx_last_page += csi->nr_pages;
+		}
 	}
 
  done:
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index e294e1b..414a532 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -216,6 +216,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 			u8 dramcr)
 {
 	struct csrow_info *csrow;
+	struct dimm_info *dimm;
 	int index;
 	u8 drbar;		/* SDRAM Row Boundary Address Register */
 	u32 row_high_limit, row_high_limit_last;
@@ -227,6 +228,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 	for (index = 0; index < mci->nr_csrows; index++) {
 		csrow = &mci->csrows[index];
+		dimm = csrow->channels[0].dimm;
 
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
@@ -250,13 +252,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
-		csrow->grain = 1 << 14;
-		csrow->mtype = reg_sdram ? MEM_RDDR : MEM_DDR;
+		dimm->grain = 1 << 14;
+		dimm->mtype = reg_sdram ? MEM_RDDR : MEM_DDR;
 		/* FIXME - check that this is unknowable with this chipset */
-		csrow->dtype = DEV_UNKNOWN;
+		dimm->dtype = DEV_UNKNOWN;
 
 		/* Mode is global on 82600 */
-		csrow->edac_mode = ecc_on ? EDAC_SECDED : EDAC_NONE;
+		dimm->edac_mode = ecc_on ? EDAC_SECDED : EDAC_NONE;
 		row_high_limit_last = row_high_limit;
 	}
 }
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index dea1ef3..ec6e03d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -551,7 +551,7 @@ static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
 	return 0;
 }
 
-static int get_dimm_config(const struct mem_ctl_info *mci)
+static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct sbridge_pvt *pvt = mci->pvt_info;
 	struct csrow_info *csr;
@@ -561,6 +561,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
+	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -612,6 +613,7 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
+	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
@@ -634,29 +636,30 @@ static int get_dimm_config(const struct mem_ctl_info *mci)
 					pvt->sbridge_dev->mc, i, j,
 					size, npages,
 					banks, ranks, rows, cols);
-				csr = &mci->csrows[csrow];
 
+				/*
+				 * Fake stuff. This controller doesn't see
+				 * csrows.
+				 */
+				csr = &mci->csrows[csrow];
 				csr->first_page = last_page;
 				csr->last_page = last_page + npages - 1;
-				csr->page_mask = 0UL;	/* Unused */
 				csr->nr_pages = npages;
-				csr->grain = 32;
 				csr->csrow_idx = csrow;
-				csr->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
-				csr->ce_count = 0;
-				csr->ue_count = 0;
-				csr->mtype = mtype;
-				csr->edac_mode = mode;
 				csr->nr_channels = 1;
 				csr->channels[0].chan_idx = i;
-				csr->channels[0].ce_count = 0;
 				pvt->csrow_map[i][j] = csrow;
-				snprintf(csr->channels[0].dimm->label,
-					 sizeof(csr->channels[0].dimm->label),
-					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
-					 pvt->sbridge_dev->source_id, i, j);
 				last_page += npages;
 				csrow++;
+
+				csr->channels[0].dimm = dimm;
+				dimm->grain = 32;
+				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
+				dimm->mtype = mtype;
+				dimm->edac_mode = mode;
+				snprintf(dimm->label, sizeof(dimm->label),
+					 "CPU_SrcID#%u_Channel#%u_DIMM#%u",
+					 pvt->sbridge_dev->source_id, i, j);
 			}
 		}
 	}
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 1d5cf06..db7d2ae 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -84,6 +84,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	struct csrow_info	*csrow = &mci->csrows[0];
 	struct tile_edac_priv	*priv = mci->pvt_info;
 	struct mshim_mem_info	mem_info;
+	struct dimm_info *dimm = csrow->channels[0].dimm;
 
 	if (hv_dev_pread(priv->hv_devhdl, 0, (HV_VirtAddr)&mem_info,
 		sizeof(struct mshim_mem_info), MSHIM_MEM_INFO_OFF) !=
@@ -93,16 +94,16 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	if (mem_info.mem_ecc)
-		csrow->edac_mode = EDAC_SECDED;
+		dimm->edac_mode = EDAC_SECDED;
 	else
-		csrow->edac_mode = EDAC_NONE;
+		dimm->edac_mode = EDAC_NONE;
 	switch (mem_info.mem_type) {
 	case DDR2:
-		csrow->mtype = MEM_DDR2;
+		dimm->mtype = MEM_DDR2;
 		break;
 
 	case DDR3:
-		csrow->mtype = MEM_DDR3;
+		dimm->mtype = MEM_DDR3;
 		break;
 
 	default:
@@ -112,8 +113,8 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 	csrow->first_page = 0;
 	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-	csrow->grain = TILE_EDAC_ERROR_GRAIN;
-	csrow->dtype = DEV_UNKNOWN;
+	dimm->grain = TILE_EDAC_ERROR_GRAIN;
+	dimm->dtype = DEV_UNKNOWN;
 
 	return 0;
 }
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index b6f47de..52c8d69 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -317,7 +317,7 @@ static unsigned long drb_to_nr_pages(
 static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc;
-	int i;
+	int i, j;
 	struct mem_ctl_info *mci = NULL;
 	unsigned long last_page;
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
@@ -372,20 +372,21 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 			i / X38_RANKS_PER_CHANNEL,
 			i % X38_RANKS_PER_CHANNEL);
 
-		if (nr_pages == 0) {
-			csrow->mtype = MEM_EMPTY;
+		if (nr_pages == 0)
 			continue;
-		}
 
 		csrow->first_page = last_page + 1;
 		last_page += nr_pages;
 		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
-		csrow->grain = nr_pages << PAGE_SHIFT;
-		csrow->mtype = MEM_DDR2;
-		csrow->dtype = DEV_UNKNOWN;
-		csrow->edac_mode = EDAC_UNKNOWN;
+		for (j = 0; j < x38_channel_num; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			dimm->grain = nr_pages << PAGE_SHIFT;
+			dimm->mtype = MEM_DDR2;
+			dimm->dtype = DEV_UNKNOWN;
+			dimm->edac_mode = EDAC_UNKNOWN;
+		}
 	}
 
 	x38_clear_error_info(mci);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index f40b835..5244193 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -314,6 +314,13 @@ struct dimm_info {
 	unsigned memory_controller;
 	unsigned csrow;
 	unsigned csrow_channel;
+
+	u32 grain;		/* granularity of reported error in bytes */
+	enum dev_type dtype;	/* memory device type */
+	enum mem_type mtype;	/* memory dimm type */
+	enum edac_type edac_mode;	/* EDAC mode for this dimm */
+
+	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
 /**
@@ -339,19 +346,17 @@ struct rank_info {
 };
 
 struct csrow_info {
-	unsigned long first_page;	/* first page number in dimm */
-	unsigned long last_page;	/* last page number in dimm */
+	unsigned long first_page;	/* first page number in csrow */
+	unsigned long last_page;	/* last page number in csrow */
+	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
 					 * 0UL for non intlv
 					 */
-	u32 nr_pages;		/* number of pages in csrow */
-	u32 grain;		/* granularity of reported error in bytes */
-	int csrow_idx;		/* the chip-select row */
-	enum dev_type dtype;	/* memory device type */
+	int csrow_idx;			/* the chip-select row */
+
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
 	u32 ce_count;		/* Correctable Errors for this csrow */
-	enum mem_type mtype;	/* memory csrow type */
-	enum edac_type edac_mode;	/* EDAC mode for this csrow */
+
 	struct mem_ctl_info *mci;	/* the parent */
 
 	struct kobject kobj;	/* sysfs kobject for this csrow */
-- 
1.7.8


* [EDAC PATCH v13 3/7] edac: Don't initialize csrow's first_page & friends when not needed
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
  2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
  2012-04-16 20:12   ` [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-16 20:12     ` Mauro Carvalho Chehab
                     ` (4 subsequent siblings)
  7 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Chris Metcalf,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund, Josh Boyer,
	Jiri Kosina

Almost all EDAC drivers initialize csrow_info->first_page,
csrow_info->last_page and csrow_info->page_mask. The EDAC core uses
those fields in edac_mc_find_csrow_by_page() to work out which csrow
is affected by an error.
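
For readers unfamiliar with that lookup, here is a minimal userspace
sketch of the idea. It is not the kernel code: struct csrow_sketch and
find_csrow_by_page() are hypothetical stand-ins that only mirror the
three fields named above plus nr_pages, and the matching rule (range
check plus a page_mask comparison) is an assumption about how such a
lookup would typically work.

```c
/*
 * Hypothetical, simplified stand-in for the kernel's struct csrow_info.
 * Only the fields discussed in this patch are mirrored here.
 */
struct csrow_sketch {
	unsigned long first_page;	/* first page number in csrow */
	unsigned long last_page;	/* last page number in csrow */
	unsigned long page_mask;	/* 0UL when not interleaved */
	unsigned int nr_pages;		/* 0 means the csrow is empty */
};

/*
 * Return the index of the first populated csrow whose page range
 * contains @page, or -1 if no csrow matches.
 */
static int find_csrow_by_page(const struct csrow_sketch *rows, int nr_rows,
			      unsigned long page)
{
	int i;

	for (i = 0; i < nr_rows; i++) {
		const struct csrow_sketch *row = &rows[i];

		/* Skip unpopulated rows: their range fields are meaningless. */
		if (row->nr_pages == 0)
			continue;

		if (page >= row->first_page && page <= row->last_page &&
		    (page & row->page_mask) ==
		    (row->first_page & row->page_mask))
			return i;
	}

	return -1;
}
```

A driver that never calls such a lookup gains nothing from filling in
first_page, last_page and page_mask, which is the point of this patch.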

However, only a few drivers actually call that routine:
        e752x_edac.c
        e7xxx_edac.c
        i3000_edac.c
        i82443bxgx_edac.c
        i82860_edac.c
        i82875p_edac.c
        i82975x_edac.c
        r82600_edac.c

There are also a few other drivers that use those fields in their own
internal calculations.

All the others just waste time initializing data that is never used.

Initializing unused data does no harm by itself, but this information
is stored in the wrong place (the csrow structure), so it is better to
remove what is unused in order to simplify the next patch.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c   |   38 ++------------------------------------
 drivers/edac/i3200_edac.c   |    5 -----
 drivers/edac/i5000_edac.c   |    5 -----
 drivers/edac/i5100_edac.c   |    2 --
 drivers/edac/i5400_edac.c   |    5 -----
 drivers/edac/i7300_edac.c   |    5 +----
 drivers/edac/i7core_edac.c  |    5 -----
 drivers/edac/mv64x60_edac.c |    1 -
 drivers/edac/ppc4xx_edac.c  |    7 -------
 drivers/edac/sb_edac.c      |    2 --
 drivers/edac/tile_edac.c    |    2 --
 drivers/edac/x38_edac.c     |    5 -----
 12 files changed, 3 insertions(+), 79 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index c4c61fb..0be3f29 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -715,25 +715,6 @@ static inline u64 input_addr_to_sys_addr(struct mem_ctl_info *mci,
 				     input_addr_to_dram_addr(mci, input_addr));
 }
 
-/*
- * Find the minimum and maximum InputAddr values that map to the given @csrow.
- * Pass back these values in *input_addr_min and *input_addr_max.
- */
-static void find_csrow_limits(struct mem_ctl_info *mci, int csrow,
-			      u64 *input_addr_min, u64 *input_addr_max)
-{
-	struct amd64_pvt *pvt;
-	u64 base, mask;
-
-	pvt = mci->pvt_info;
-	BUG_ON((csrow < 0) || (csrow >= pvt->csels[0].b_cnt));
-
-	get_cs_base_and_mask(pvt, csrow, 0, &base, &mask);
-
-	*input_addr_min = base & ~mask;
-	*input_addr_max = base | mask;
-}
-
 /* Map the Error address to a PAGE and PAGE OFFSET. */
 static inline void error_address_to_page_and_offset(u64 error_address,
 						    u32 *page, u32 *offset)
@@ -2166,7 +2147,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 {
 	struct csrow_info *csrow;
 	struct amd64_pvt *pvt = mci->pvt_info;
-	u64 input_addr_min, input_addr_max, sys_addr, base, mask;
+	u64 base, mask;
 	u32 val;
 	int i, j, empty = 1;
 	enum mem_type mtype;
@@ -2194,28 +2175,13 @@ static int init_csrows(struct mem_ctl_info *mci)
 
 		empty = 0;
 		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
-		find_csrow_limits(mci, i, &input_addr_min, &input_addr_max);
-		sys_addr = input_addr_to_sys_addr(mci, input_addr_min);
-		csrow->first_page = (u32) (sys_addr >> PAGE_SHIFT);
-		sys_addr = input_addr_to_sys_addr(mci, input_addr_max);
-		csrow->last_page = (u32) (sys_addr >> PAGE_SHIFT);
-
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
-		csrow->page_mask = ~mask;
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    input_addr_min: 0x%lx input_addr_max: 0x%lx\n",
-			(unsigned long)input_addr_min,
-			(unsigned long)input_addr_max);
-		debugf1("    sys_addr: 0x%lx  page_mask: 0x%lx\n",
-			(unsigned long)sys_addr, csrow->page_mask);
-		debugf1("    nr_pages: %u  first_page: 0x%lx "
-			"last_page: 0x%lx\n",
-			(unsigned)csrow->nr_pages,
-			csrow->first_page, csrow->last_page);
+		debugf1("    nr_pages: %u\n", csrow->nr_pages);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 73529fd..d8fa7f3 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -321,7 +321,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_page;
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -366,7 +365,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	last_page = -1UL;
 	for (i = 0; i < mci->nr_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
@@ -378,9 +376,6 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->first_page = last_page + 1;
-		last_page += nr_pages;
-		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
 		for (j = 0; j < nr_channels; j++) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index e612f1e..f00f684 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1263,11 +1263,6 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr) && !MTR_DIMMS_PRESENT(mtr1))
 			continue;
 
-		/* FAKE OUT VALUES, FIXME */
-		p_csrow->first_page = 0 + csrow * 20;
-		p_csrow->last_page = 9 + csrow * 20;
-		p_csrow->page_mask = 0xFFF;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 9caff36..8da7ce1 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,8 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].first_page = total_pages;
-		mci->csrows[i].last_page = total_pages + npages - 1;
 		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 229aff5..4a23813 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1180,11 +1180,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		/* FAKE OUT VALUES, FIXME */
-		p_csrow->first_page = 0 + csrow * 20;
-		p_csrow->last_page = 9 + csrow * 20;
-		p_csrow->page_mask = 0xFFF;
-
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 07a5927..df6cd59 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -778,7 +778,7 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 last_page = 0, nr_pages;
+	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -828,9 +828,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 
 				/* Update per_csrow memory count */
 				p_csrow->nr_pages += nr_pages;
-				p_csrow->first_page = last_page;
-				last_page += nr_pages;
-				p_csrow->last_page = last_page;
 
 				rc = 0;
 			}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 21f9791..89ccec6 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -599,7 +599,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	struct pci_dev *pdev;
 	int i, j;
 	int csrow = 0;
-	unsigned long last_page = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
 	struct dimm_info *dimm;
@@ -716,12 +715,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->first_page = last_page + 1;
-			last_page += npages;
-			csr->last_page = last_page;
 			csr->nr_pages = npages;
 
-			csr->page_mask = 0;
 			csr->csrow_idx = csrow;
 			csr->nr_channels = 1;
 
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 12d7fe0..d2e3c39 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -668,7 +668,6 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
 	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
-	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index a75e567..ec5e529 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -897,7 +897,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum edac_type edac_mode;
 	int row, j;
 	u32 mbxcf, size;
-	static u32 ppc4xx_last_page;
 
 	/* Establish the memory type and width */
 
@@ -959,10 +958,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 			goto done;
 		}
 
-		csi->first_page = ppc4xx_last_page;
-		csi->last_page	= csi->first_page + csi->nr_pages - 1;
-		csi->page_mask	= 0;
-
 		/*
 		 * It's unclear exactly what grain should be set to
 		 * here. The SDRAM_ECCES register allows resolution of
@@ -984,8 +979,6 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 			dimm->dtype	= dtype;
 
 			dimm->edac_mode	= edac_mode;
-
-		ppc4xx_last_page += csi->nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ec6e03d..cf53007 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -642,8 +642,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->first_page = last_page;
-				csr->last_page = last_page + npages - 1;
 				csr->nr_pages = npages;
 				csr->csrow_idx = csrow;
 				csr->nr_channels = 1;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index db7d2ae..ba0917b 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,9 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->first_page = 0;
 	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
-	csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 52c8d69..7be10dd 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -319,7 +319,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_page;
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -363,7 +362,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	 * cumulative; the last one will contain the total memory
 	 * contained in all ranks.
 	 */
-	last_page = -1UL;
 	for (i = 0; i < mci->nr_csrows; i++) {
 		unsigned long nr_pages;
 		struct csrow_info *csrow = &mci->csrows[i];
@@ -375,9 +373,6 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->first_page = last_page + 1;
-		last_page += nr_pages;
-		csrow->last_page = last_page;
 		csrow->nr_pages = nr_pages;
 
 		for (j = 0; j < x38_channel_num; j++) {
-- 
1.7.8



* [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
@ 2012-04-16 20:12     ` Mauro Carvalho Chehab
  2012-04-16 20:12   ` [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
                       ` (6 subsequent siblings)
  7 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Borislav Petkov,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMMs that
properly represent the properties of each DIMM stick, including its size.
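
As a rough illustration of what keeping the page count per DIMM implies,
the sketch below derives a csrow size by summing the pages of the DIMMs
behind each of its channels. It is a userspace approximation, not the
patch itself: struct dimm_pages, struct chan_pages, struct sized_csrow,
pages_to_mib() and csrow_size_mib() are hypothetical names, and a 4 KiB
page size is assumed.

```c
#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical stand-ins for dimm_info, rank_info and csrow_info. */
struct dimm_pages {
	unsigned int nr_pages;		/* number of pages in this DIMM */
};

struct chan_pages {
	struct dimm_pages *dimm;	/* DIMM seen through this channel */
};

struct sized_csrow {
	unsigned int nr_channels;
	struct chan_pages *channels;
};

/* Convert a page count to MiB (1 MiB = 2^20 bytes). */
static unsigned int pages_to_mib(unsigned int pages)
{
	return pages >> (20 - SKETCH_PAGE_SHIFT);
}

/*
 * The csrow no longer stores its own size: it is the sum of the sizes
 * of the DIMMs attached to its channels.
 */
static unsigned int csrow_size_mib(const struct sized_csrow *csrow)
{
	unsigned int i, nr_pages = 0;

	for (i = 0; i < csrow->nr_channels; i++)
		nr_pages += csrow->channels[i].dimm->nr_pages;

	return pages_to_mib(nr_pages);
}
```

With two channels of 262144 pages each, this yields 2048 MiB for the
csrow, while each DIMM on its own reports 1024 MiB.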

A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
the memory controller represents the memory via chip select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |   12 +++------
 drivers/edac/amd76x_edac.c     |    6 ++--
 drivers/edac/cell_edac.c       |    8 ++++--
 drivers/edac/cpc925_edac.c     |    8 ++++--
 drivers/edac/e752x_edac.c      |    6 +++-
 drivers/edac/e7xxx_edac.c      |    5 ++-
 drivers/edac/edac_mc.c         |   16 ++++++++-----
 drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
 drivers/edac/i3000_edac.c      |    6 +++-
 drivers/edac/i3200_edac.c      |    3 +-
 drivers/edac/i5000_edac.c      |   14 ++++++-----
 drivers/edac/i5100_edac.c      |   22 +++++++++++-------
 drivers/edac/i5400_edac.c      |    9 ++-----
 drivers/edac/i7300_edac.c      |   22 +++++-------------
 drivers/edac/i7core_edac.c     |   10 ++------
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    5 ++-
 drivers/edac/i82975x_edac.c    |   11 ++++++--
 drivers/edac/mpc85xx_edac.c    |    3 +-
 drivers/edac/mv64x60_edac.c    |    3 +-
 drivers/edac/pasemi_edac.c     |   14 ++++++------
 drivers/edac/ppc4xx_edac.c     |    5 ++-
 drivers/edac/r82600_edac.c     |    3 +-
 drivers/edac/sb_edac.c         |    8 +-----
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    4 +-
 include/linux/edac.h           |    8 ++++--
 28 files changed, 144 insertions(+), 120 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 0be3f29..8804ac8 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
-	/*
-	 * If dual channel then double the memory size of single channel.
-	 * Channel count is 1 or 2
-	 */
-	nr_pages <<= (pvt->channel_count - 1);
-
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
 	debugf0("    nr_pages= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
@@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 	int i, j, empty = 1;
 	enum mem_type mtype;
 	enum edac_type edac_mode;
+	int nr_pages;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
 			i, pvt->mc_node_id);
 
 		empty = 0;
-		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
+		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", csrow->nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
@@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		for (j = 0; j < pvt->channel_count; j++) {
 			csrow->channels[j].dimm->mtype = mtype;
 			csrow->channels[j].dimm->edac_mode = edac_mode;
+			csrow->channels[j].dimm->nr_pages = nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 2a63ed0..1532750 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -205,10 +205,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		mba_mask = ((mba & 0xff80) << 16) | 0x7fffffUL;
 		pci_read_config_dword(pdev, AMD76X_DRAM_MODE_STATUS, &dms);
 		csrow->first_page = mba_base >> PAGE_SHIFT;
-		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		dimm->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->grain = dimm->nr_pages << PAGE_SHIFT;
 		dimm->mtype = MEM_RDDR;
 		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
 		dimm->edac_mode = edac_mode;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 94fbb12..09e1b5d 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -128,6 +128,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
 	int				j;
+	u32				nr_pages;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,19 +143,20 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		if (of_node_to_nid(np) != priv->node)
 			continue;
 		csrow->first_page = r.start >> PAGE_SHIFT;
-		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = resource_size(&r) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
 			dimm->mtype = MEM_XDR;
 			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, csrow->nr_pages);
+			csrow->first_page, dimm->nr_pages);
 		break;
 	}
 }
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index ee90f3d..7b764a8 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -332,7 +332,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	struct dimm_info *dimm;
 	int index, j;
 	u32 mbmr, mbbar, bba;
-	unsigned long row_size, last_nr_pages = 0;
+	unsigned long row_size, nr_pages, last_nr_pages = 0;
 
 	get_total_mem(pdata);
 
@@ -351,12 +351,14 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 		row_size = bba * (1UL << 28);	/* 256M */
 		csrow->first_page = last_nr_pages;
-		csrow->nr_pages = row_size >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = row_size >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->mtype = MEM_RDDR;
 			dimm->edac_mode = EDAC_SECDED;
 
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index db291ea..6d81d3c 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size, i;
+	u32 dra, drc, cumul_size, i, nr_pages;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1078,11 +1078,13 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (i = 0; i < drc_chan + 1; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
+
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 178d2af..aeb69f0 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -349,7 +349,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	unsigned long last_cumul_size;
 	int index, j;
 	u8 value;
-	u32 dra, cumul_size;
+	u32 dra, cumul_size, nr_pages;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
@@ -380,12 +380,13 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < drc_chan + 1; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f83e63d..ffedae9 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,9 +43,10 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
+	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -55,7 +56,6 @@ static void edac_mc_dump_csrow(struct csrow_info *csrow)
 	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
 	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
 	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
 	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
 	debugf4("\tcsrow->channels = %p\n", csrow->channels);
 	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
@@ -652,15 +652,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 {
 	struct csrow_info *csrows = mci->csrows;
-	int row, i;
+	int row, i, j, n;
 
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
-
-		if (csrow->nr_pages == 0)
+		n = 0;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n == 0)
 			continue;
 
 		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d63904e..c0275e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
+	int i;
+	u32 nr_pages = 0;
+
+	for (i = 0; i < csrow->nr_channels; i++)
+		nr_pages += csrow->channels[i].dimm->nr_pages;
+
+	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
 }
 
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
@@ -519,16 +525,16 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
 
 static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
-	int total_pages, csrow_idx;
+	int total_pages = 0, csrow_idx, j;
 
-	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
-		csrow_idx++) {
+	for (csrow_idx = 0; csrow_idx < mci->nr_csrows; csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
-		if (!csrow->nr_pages)
-			continue;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
 
-		total_pages += csrow->nr_pages;
+			total_pages += dimm->nr_pages;
+		}
 	}
 
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
@@ -900,7 +906,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
  */
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	int i, j;
 	int err;
 	struct csrow_info *csrow;
 	struct kobject *kobj_mci = &mci->edac_mci_kobj;
@@ -934,10 +940,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
 	for (i = 0; i < mci->nr_csrows; i++) {
+		int nr_pages = 0;
+
 		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
 
-		/* Only expose populated CSROWs */
-		if (csrow->nr_pages > 0) {
+		if (nr_pages > 0) {
 			err = edac_create_csrow_object(mci, csrow, i);
 			if (err) {
 				debugf1("%s() failure: create csrow %d obj\n",
@@ -949,10 +958,14 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	return 0;
 
-	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (mci->csrows[i].nr_pages > 0)
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
 	}
 
@@ -972,14 +985,20 @@ fail0:
  */
 void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	struct csrow_info *csrow;
+	int i, j;
 
 	debugf0("%s()\n", __func__);
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
 	for (i = 0; i < mci->nr_csrows; i++) {
-		if (mci->csrows[i].nr_pages > 0) {
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0) {
 			debugf0("%s()  unreg csrow-%d\n", __func__, i);
 			kobject_put(&mci->csrows[i].kobj);
 		}
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 1498c5f..bf8a230 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -306,7 +306,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_cumul_size;
+	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
 	unsigned char *c0dra = dra, *c1dra = &dra[I3000_RANKS_PER_CHANNEL / 2];
@@ -391,11 +391,13 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = I3000_DEAP_GRAIN;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d8fa7f3..b82667f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -376,11 +376,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index f00f684..e8d32e8 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1236,6 +1236,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
 	struct csrow_info *p_csrow;
+	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
 	int mtr, mtr1;
@@ -1265,21 +1266,22 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
+			dimm = p_csrow->channels[channel].dimm;
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-			p_csrow->channels[channel].dimm->grain = 8;
+			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
-			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
+			dimm->mtype = MEM_FB_DDR2;
 
 			/* ask what device type on this row */
 			if (MTR_DRAM_WIDTH(mtr))
-				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 			else
-				p_csrow->channels[channel].dimm->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 
-			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
 		}
-		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 8da7ce1..a0219a9 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,7 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
@@ -867,14 +866,19 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		total_pages += npages;
 
 		dimm = mci->csrows[i].channels[0].dimm;
-		dimm->grain = 32;
-		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-			      DEV_X4 : DEV_X8;
-		dimm->mtype = MEM_RDDR2;
-		dimm->edac_mode = EDAC_SECDED;
-		snprintf(dimm->label, sizeof(dimm->label),
-			 "DIMM%u",
-			 i5100_rank_to_slot(mci, chan, rank));
+		dimm->nr_pages = npages;
+		if (npages) {
+			total_pages += npages;
+
+			dimm->grain = 32;
+			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+				DEV_X4 : DEV_X8;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+			snprintf(dimm->label, sizeof(dimm->label),
+				"DIMM%u",
+				i5100_rank_to_slot(mci, chan, rank));
+		}
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 4a23813..784d6dc 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1156,7 +1156,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int empty, channel_count;
 	int max_csrows;
 	int mtr;
-	int csrow_megs;
+	int size_mb;
 	int channel;
 	int csrow;
 	struct dimm_info *dimm;
@@ -1171,8 +1171,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
-		p_csrow->csrow_idx = csrow;
-
 		/* use branch 0 for the basis */
 		mtr = determine_mtr(pvt, csrow, 0);
 
@@ -1180,12 +1178,11 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[csrow][channel].megabytes;
 
-			p_csrow->nr_pages = csrow_megs << 8;
 			dimm = p_csrow->channels[channel].dimm;
+			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
 			dimm->mtype = MEM_RDDR2;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index df6cd59..5e594ae 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -617,9 +617,7 @@ static void i7300_enable_error_reporting(struct mem_ctl_info *mci)
 static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
-		      struct csrow_info *p_csrow,
-		      struct dimm_info *dimm,
-		      u32 *nr_pages)
+		      struct dimm_info *dimm)
 {
 	int mtr, ans, addrBits, channel;
 
@@ -651,7 +649,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	addrBits -= 3;	/* 8 bits per bytes */
 
 	dinfo->megabytes = 1 << addrBits;
-	*nr_pages = dinfo->megabytes << 8;
 
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 
@@ -664,8 +661,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->csrow_idx = slot;
-
 	/*
 	 * The type of error detection actually depends of the
 	 * mode of operation. When it is just one single memory chip, at
@@ -675,6 +670,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->nr_pages = MiB_TO_PAGES(dinfo->megabytes);
 	dimm->grain = 8;
 	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
@@ -774,11 +770,9 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i7300_pvt *pvt;
 	struct i7300_dimm_info *dinfo;
-	struct csrow_info *p_csrow;
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -804,7 +798,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
-	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
@@ -815,21 +808,18 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
-				p_csrow = &mci->csrows[slot];
 
-				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, dimm,
-						 &nr_pages);
+						 dinfo, dimm);
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
 
-				/* Update per_csrow memory count */
-				p_csrow->nr_pages += nr_pages;
-
 				rc = 0;
+
 			}
 		}
 	}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 89ccec6..d566797 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -715,17 +715,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->nr_pages = npages;
-
-			csr->csrow_idx = csrow;
-			csr->nr_channels = 1;
-
-			csr->channels[0].chan_idx = i;
-			csr->channels[0].ce_count = 0;
 
 			pvt->csrow_map[i][j] = csrow;
 
 			dimm = csr->channels[0].dimm;
+			dimm->nr_pages = npages;
+
 			switch (banks) {
 			case 4:
 				dimm->dtype = DEV_X4;
@@ -746,6 +741,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
+			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 1e19492..74166ae 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -220,7 +220,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		row_base = row_high_limit_last;
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
 		dimm->grain = 1 << 12;
 		dimm->mtype = mtype;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index acbd924..48e0ecd 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		dimm->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
 		dimm->mtype = MEM_RMBS;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 81f79e2..dc207dc 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -347,7 +347,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
@@ -371,12 +371,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_chans; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_chans;
 			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
 			dimm->mtype = MEM_DDR;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0b40e11..304af1d 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -370,7 +370,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
 	enum dev_type dtype;
@@ -402,6 +402,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
 			cumul_size);
 
+		nr_pages = cumul_size - last_cumul_size;
 		/*
 		 * Initialise dram labels
 		 * index values:
@@ -411,6 +412,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index].channels[chan].dimm;
+
+			if (!nr_pages)
+				continue;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
@@ -420,12 +426,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 			dimm->edac_mode = EDAC_SECDED; /* only supported */
 		}
 
-		if (cumul_size == last_cumul_size)
+		if (!nr_pages)
 			continue;	/* not populated */
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 	}
 }
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index fb92916..c1d9e15 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -947,7 +947,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 
 		csrow->first_page = start;
 		csrow->last_page = end;
-		csrow->nr_pages = end + 1 - start;
+
+		dimm->nr_pages = end + 1 - start;
 		dimm->grain = 8;
 		dimm->mtype = mtype;
 		dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index d2e3c39..281e245 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -667,7 +667,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
-	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
+
+	dimm->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 4e53270..3fcefda 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -153,20 +153,20 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		switch ((rankcfg & MCDRAM_RANKCFG_TYPE_SIZE_M) >>
 			MCDRAM_RANKCFG_TYPE_SIZE_S) {
 		case 0:
-			csrow->nr_pages = 128 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 128 << (20 - PAGE_SHIFT);
 			break;
 		case 1:
-			csrow->nr_pages = 256 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 256 << (20 - PAGE_SHIFT);
 			break;
 		case 2:
 		case 3:
-			csrow->nr_pages = 512 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 512 << (20 - PAGE_SHIFT);
 			break;
 		case 4:
-			csrow->nr_pages = 1024 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 1024 << (20 - PAGE_SHIFT);
 			break;
 		case 5:
-			csrow->nr_pages = 2048 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 2048 << (20 - PAGE_SHIFT);
 			break;
 		default:
 			edac_mc_printk(mci, KERN_ERR,
@@ -176,8 +176,8 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		}
 
 		csrow->first_page = last_page_in_mmc;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		last_page_in_mmc += csrow->nr_pages;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
+		last_page_in_mmc += dimm->nr_pages;
 		csrow->page_mask = 0;
 		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
 		dimm->mtype = MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ec5e529..95cfc0f 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -896,7 +896,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum dev_type dtype;
 	enum edac_type edac_mode;
 	int row, j;
-	u32 mbxcf, size;
+	u32 mbxcf, size, nr_pages;
 
 	/* Establish the memory type and width */
 
@@ -947,7 +947,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		case SDRAM_MBCF_SZ_2GB:
 		case SDRAM_MBCF_SZ_4GB:
 		case SDRAM_MBCF_SZ_8GB:
-			csi->nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
+			nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
 			break;
 		default:
 			ppc4xx_edac_mc_printk(KERN_ERR, mci,
@@ -973,6 +973,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		for (j = 0; j < csi->nr_channels; j++) {
 			struct dimm_info *dimm = csi->channels[j].dimm;
 
+			dimm->nr_pages  = nr_pages / csi->nr_channels;
 			dimm->grain	= 1;
 
 			dimm->mtype	= mtype;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 414a532..19f3a10 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -249,7 +249,8 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
 		dimm->grain = 1 << 14;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index cf53007..ee1543d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -561,7 +561,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
-	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -613,11 +612,11 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
-	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
+			struct dimm_info *dimm = &mci->dimms[j];
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -642,15 +641,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->nr_pages = npages;
-				csr->csrow_idx = csrow;
-				csr->nr_channels = 1;
-				csr->channels[0].chan_idx = i;
 				pvt->csrow_map[i][j] = csrow;
 				last_page += npages;
 				csrow++;
 
 				csr->channels[0].dimm = dimm;
+				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
 				dimm->mtype = mtype;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index ba0917b..6314ff9 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,7 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
+	dimm->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 7be10dd..0de288f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -373,10 +373,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < x38_channel_num; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / x38_channel_num;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5244193..8b78bd0 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -320,6 +320,8 @@ struct dimm_info {
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
+	u32 nr_pages;			/* number of pages in dimm */
+
 	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
@@ -346,12 +348,12 @@ struct rank_info {
 };
 
 struct csrow_info {
+	/* Used only by edac_mc_find_csrow_by_page() */
 	unsigned long first_page;	/* first page number in csrow */
 	unsigned long last_page;	/* last page number in csrow */
-	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
-					 * 0UL for non intlv
-					 */
+					 * 0UL for non intlv */
+
 	int csrow_idx;			/* the chip-select row */
 
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
-- 
1.7.8



* [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
@ 2012-04-16 20:12     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Hitoshi Mitake,
	Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMMs that
properly represent the properties of each DIMM stick, including its size.

One item remains as a TODO: properly representing dual-rank/quad-rank
DIMMs when the memory controller exposes the memory via chip-select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
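
Note for readers of the diff: the pattern that repeats across the
drivers and the core can be sketched in plain C as below. This is only
an illustration, not code from the patch. The structs are cut-down
stand-ins for the kernel ones, and the helper names csrow_set_pages()
and csrow_total_pages() are hypothetical. Drivers that only know the
size of a whole chip-select row split it evenly across the channels,
and the core recovers the row size by summing the per-DIMM values.

```c
#include <assert.h>
#include <stdint.h>

/* Cut-down stand-ins for the kernel structures (illustration only). */
struct dimm_info {
	uint32_t nr_pages;		/* pages provided by this DIMM */
};

struct rank_info {
	struct dimm_info *dimm;		/* DIMM behind this csrow/channel */
};

struct csrow_info {
	int nr_channels;
	struct rank_info *channels;
};

/*
 * Driver side (hypothetical helper): the controller reports one size
 * per chip-select row, so give each channel's DIMM an equal share.
 * Integer division truncates, so a row size that is not a multiple
 * of nr_channels loses the remainder.
 */
static void csrow_set_pages(struct csrow_info *csrow, uint32_t row_pages)
{
	int j;

	assert(csrow->nr_channels > 0);
	for (j = 0; j < csrow->nr_channels; j++)
		csrow->channels[j].dimm->nr_pages =
			row_pages / csrow->nr_channels;
}

/*
 * Core side (hypothetical helper): the csrow no longer stores its
 * size, so compute it by summing the DIMMs on every channel.
 */
static uint32_t csrow_total_pages(const struct csrow_info *csrow)
{
	uint32_t nr_pages = 0;
	int j;

	for (j = 0; j < csrow->nr_channels; j++)
		nr_pages += csrow->channels[j].dimm->nr_pages;

	return nr_pages;
}
```

A csrow is treated as unpopulated when this sum is zero, which is the
test the sysfs code in this patch uses in place of the old
csrow->nr_pages check.
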
 drivers/edac/amd64_edac.c      |   12 +++------
 drivers/edac/amd76x_edac.c     |    6 ++--
 drivers/edac/cell_edac.c       |    8 ++++--
 drivers/edac/cpc925_edac.c     |    8 ++++--
 drivers/edac/e752x_edac.c      |    6 +++-
 drivers/edac/e7xxx_edac.c      |    5 ++-
 drivers/edac/edac_mc.c         |   16 ++++++++-----
 drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
 drivers/edac/i3000_edac.c      |    6 +++-
 drivers/edac/i3200_edac.c      |    3 +-
 drivers/edac/i5000_edac.c      |   14 ++++++-----
 drivers/edac/i5100_edac.c      |   22 +++++++++++-------
 drivers/edac/i5400_edac.c      |    9 ++-----
 drivers/edac/i7300_edac.c      |   22 +++++-------------
 drivers/edac/i7core_edac.c     |   10 ++------
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    5 ++-
 drivers/edac/i82975x_edac.c    |   11 ++++++--
 drivers/edac/mpc85xx_edac.c    |    3 +-
 drivers/edac/mv64x60_edac.c    |    3 +-
 drivers/edac/pasemi_edac.c     |   14 ++++++------
 drivers/edac/ppc4xx_edac.c     |    5 ++-
 drivers/edac/r82600_edac.c     |    3 +-
 drivers/edac/sb_edac.c         |    8 +-----
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    4 +-
 include/linux/edac.h           |    8 ++++--
 28 files changed, 144 insertions(+), 120 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 0be3f29..8804ac8 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
-	/*
-	 * If dual channel then double the memory size of single channel.
-	 * Channel count is 1 or 2
-	 */
-	nr_pages <<= (pvt->channel_count - 1);
-
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
 	debugf0("    nr_pages= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
@@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 	int i, j, empty = 1;
 	enum mem_type mtype;
 	enum edac_type edac_mode;
+	int nr_pages;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
 			i, pvt->mc_node_id);
 
 		empty = 0;
-		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
+		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", csrow->nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
@@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		for (j = 0; j < pvt->channel_count; j++) {
 			csrow->channels[j].dimm->mtype = mtype;
 			csrow->channels[j].dimm->edac_mode = edac_mode;
+			csrow->channels[j].dimm->nr_pages = nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 2a63ed0..1532750 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -205,10 +205,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		mba_mask = ((mba & 0xff80) << 16) | 0x7fffffUL;
 		pci_read_config_dword(pdev, AMD76X_DRAM_MODE_STATUS, &dms);
 		csrow->first_page = mba_base >> PAGE_SHIFT;
-		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		dimm->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->grain = dimm->nr_pages << PAGE_SHIFT;
 		dimm->mtype = MEM_RDDR;
 		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
 		dimm->edac_mode = edac_mode;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 94fbb12..09e1b5d 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -128,6 +128,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
 	int				j;
+	u32				nr_pages;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,19 +143,20 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		if (of_node_to_nid(np) != priv->node)
 			continue;
 		csrow->first_page = r.start >> PAGE_SHIFT;
-		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = resource_size(&r) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
 			dimm->mtype = MEM_XDR;
 			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, csrow->nr_pages);
+			csrow->first_page, dimm->nr_pages);
 		break;
 	}
 }
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index ee90f3d..7b764a8 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -332,7 +332,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	struct dimm_info *dimm;
 	int index, j;
 	u32 mbmr, mbbar, bba;
-	unsigned long row_size, last_nr_pages = 0;
+	unsigned long row_size, nr_pages, last_nr_pages = 0;
 
 	get_total_mem(pdata);
 
@@ -351,12 +351,14 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 		row_size = bba * (1UL << 28);	/* 256M */
 		csrow->first_page = last_nr_pages;
-		csrow->nr_pages = row_size >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = row_size >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->mtype = MEM_RDDR;
 			dimm->edac_mode = EDAC_SECDED;
 
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index db291ea..6d81d3c 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size, i;
+	u32 dra, drc, cumul_size, i, nr_pages;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1078,11 +1078,13 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (i = 0; i < drc_chan + 1; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
+
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 178d2af..aeb69f0 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -349,7 +349,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	unsigned long last_cumul_size;
 	int index, j;
 	u8 value;
-	u32 dra, cumul_size;
+	u32 dra, cumul_size, nr_pages;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
@@ -380,12 +380,13 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < drc_chan + 1; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f83e63d..ffedae9 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,9 +43,10 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
+	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -55,7 +56,6 @@ static void edac_mc_dump_csrow(struct csrow_info *csrow)
 	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
 	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
 	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
 	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
 	debugf4("\tcsrow->channels = %p\n", csrow->channels);
 	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
@@ -652,15 +652,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 {
 	struct csrow_info *csrows = mci->csrows;
-	int row, i;
+	int row, i, j, n;
 
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
-
-		if (csrow->nr_pages == 0)
+		n = 0;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n == 0)
 			continue;
 
 		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d63904e..c0275e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
+	int i;
+	u32 nr_pages = 0;
+
+	for (i = 0; i < csrow->nr_channels; i++)
+		nr_pages += csrow->channels[i].dimm->nr_pages;
+
+	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
 }
 
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
@@ -519,16 +525,16 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
 
 static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
-	int total_pages, csrow_idx;
+	int total_pages = 0, csrow_idx, j;
 
-	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
-		csrow_idx++) {
+	for (csrow_idx = 0; csrow_idx < mci->nr_csrows; csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
-		if (!csrow->nr_pages)
-			continue;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
 
-		total_pages += csrow->nr_pages;
+			total_pages += dimm->nr_pages;
+		}
 	}
 
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
@@ -900,7 +906,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
  */
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	int i, j;
 	int err;
 	struct csrow_info *csrow;
 	struct kobject *kobj_mci = &mci->edac_mci_kobj;
@@ -934,10 +940,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
 	for (i = 0; i < mci->nr_csrows; i++) {
+		int nr_pages = 0;
+
 		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
 
-		/* Only expose populated CSROWs */
-		if (csrow->nr_pages > 0) {
+		if (nr_pages > 0) {
 			err = edac_create_csrow_object(mci, csrow, i);
 			if (err) {
 				debugf1("%s() failure: create csrow %d obj\n",
@@ -949,10 +958,14 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	return 0;
 
-	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (mci->csrows[i].nr_pages > 0)
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
 	}
 
@@ -972,14 +985,20 @@ fail0:
  */
 void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	struct csrow_info *csrow;
+	int i, j;
 
 	debugf0("%s()\n", __func__);
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
 	for (i = 0; i < mci->nr_csrows; i++) {
-		if (mci->csrows[i].nr_pages > 0) {
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0) {
 			debugf0("%s()  unreg csrow-%d\n", __func__, i);
 			kobject_put(&mci->csrows[i].kobj);
 		}
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 1498c5f..bf8a230 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -306,7 +306,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_cumul_size;
+	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
 	unsigned char *c0dra = dra, *c1dra = &dra[I3000_RANKS_PER_CHANNEL / 2];
@@ -391,11 +391,13 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = I3000_DEAP_GRAIN;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d8fa7f3..b82667f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -376,11 +376,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index f00f684..e8d32e8 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1236,6 +1236,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
 	struct csrow_info *p_csrow;
+	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
 	int mtr, mtr1;
@@ -1265,21 +1266,22 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
+			dimm = p_csrow->channels[channel].dimm;
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-			p_csrow->channels[channel].dimm->grain = 8;
+			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
-			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
+			dimm->mtype = MEM_FB_DDR2;
 
 			/* ask what device type on this row */
 			if (MTR_DRAM_WIDTH(mtr))
-				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 			else
-				p_csrow->channels[channel].dimm->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 
-			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
 		}
-		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 8da7ce1..a0219a9 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,7 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
@@ -867,14 +866,19 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		total_pages += npages;
 
 		dimm = mci->csrows[i].channels[0].dimm;
-		dimm->grain = 32;
-		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-			      DEV_X4 : DEV_X8;
-		dimm->mtype = MEM_RDDR2;
-		dimm->edac_mode = EDAC_SECDED;
-		snprintf(dimm->label, sizeof(dimm->label),
-			 "DIMM%u",
-			 i5100_rank_to_slot(mci, chan, rank));
+		dimm->nr_pages = npages;
+		if (npages) {
+			total_pages += npages;
+
+			dimm->grain = 32;
+			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+				DEV_X4 : DEV_X8;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+			snprintf(dimm->label, sizeof(dimm->label),
+				"DIMM%u",
+				i5100_rank_to_slot(mci, chan, rank));
+		}
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 4a23813..784d6dc 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1156,7 +1156,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int empty, channel_count;
 	int max_csrows;
 	int mtr;
-	int csrow_megs;
+	int size_mb;
 	int channel;
 	int csrow;
 	struct dimm_info *dimm;
@@ -1171,8 +1171,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
-		p_csrow->csrow_idx = csrow;
-
 		/* use branch 0 for the basis */
 		mtr = determine_mtr(pvt, csrow, 0);
 
@@ -1180,12 +1178,11 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[csrow][channel].megabytes;
 
-			p_csrow->nr_pages = csrow_megs << 8;
 			dimm = p_csrow->channels[channel].dimm;
+			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
 			dimm->mtype = MEM_RDDR2;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index df6cd59..5e594ae 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -617,9 +617,7 @@ static void i7300_enable_error_reporting(struct mem_ctl_info *mci)
 static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
-		      struct csrow_info *p_csrow,
-		      struct dimm_info *dimm,
-		      u32 *nr_pages)
+		      struct dimm_info *dimm)
 {
 	int mtr, ans, addrBits, channel;
 
@@ -651,7 +649,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	addrBits -= 3;	/* 8 bits per bytes */
 
 	dinfo->megabytes = 1 << addrBits;
-	*nr_pages = dinfo->megabytes << 8;
 
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 
@@ -664,8 +661,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->csrow_idx = slot;
-
 	/*
 	 * The type of error detection actually depends of the
 	 * mode of operation. When it is just one single memory chip, at
@@ -675,6 +670,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->nr_pages = MiB_TO_PAGES(dinfo->megabytes);
 	dimm->grain = 8;
 	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
@@ -774,11 +770,9 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i7300_pvt *pvt;
 	struct i7300_dimm_info *dinfo;
-	struct csrow_info *p_csrow;
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -804,7 +798,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
-	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
@@ -815,21 +808,18 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
-				p_csrow = &mci->csrows[slot];
 
-				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, dimm,
-						 &nr_pages);
+						 dinfo, dimm);
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
 
-				/* Update per_csrow memory count */
-				p_csrow->nr_pages += nr_pages;
-
 				rc = 0;
+
 			}
 		}
 	}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 89ccec6..d566797 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -715,17 +715,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->nr_pages = npages;
-
-			csr->csrow_idx = csrow;
-			csr->nr_channels = 1;
-
-			csr->channels[0].chan_idx = i;
-			csr->channels[0].ce_count = 0;
 
 			pvt->csrow_map[i][j] = csrow;
 
 			dimm = csr->channels[0].dimm;
+			dimm->nr_pages = npages;
+
 			switch (banks) {
 			case 4:
 				dimm->dtype = DEV_X4;
@@ -746,6 +741,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
+			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 1e19492..74166ae 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -220,7 +220,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		row_base = row_high_limit_last;
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
 		dimm->grain = 1 << 12;
 		dimm->mtype = mtype;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index acbd924..48e0ecd 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		dimm->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
 		dimm->mtype = MEM_RMBS;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 81f79e2..dc207dc 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -347,7 +347,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
@@ -371,12 +371,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_chans; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_chans;
 			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
 			dimm->mtype = MEM_DDR;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0b40e11..304af1d 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -370,7 +370,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
 	enum dev_type dtype;
@@ -402,6 +402,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
 			cumul_size);
 
+		nr_pages = cumul_size - last_cumul_size;
 		/*
 		 * Initialise dram labels
 		 * index values:
@@ -411,6 +412,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index].channels[chan].dimm;
+
+			if (!nr_pages)
+				continue;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
@@ -420,12 +426,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 			dimm->edac_mode = EDAC_SECDED; /* only supported */
 		}
 
-		if (cumul_size == last_cumul_size)
+		if (!nr_pages)
 			continue;	/* not populated */
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 	}
 }
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index fb92916..c1d9e15 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -947,7 +947,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 
 		csrow->first_page = start;
 		csrow->last_page = end;
-		csrow->nr_pages = end + 1 - start;
+
+		dimm->nr_pages = end + 1 - start;
 		dimm->grain = 8;
 		dimm->mtype = mtype;
 		dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index d2e3c39..281e245 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -667,7 +667,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
-	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
+
+	dimm->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 4e53270..3fcefda 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -153,20 +153,20 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		switch ((rankcfg & MCDRAM_RANKCFG_TYPE_SIZE_M) >>
 			MCDRAM_RANKCFG_TYPE_SIZE_S) {
 		case 0:
-			csrow->nr_pages = 128 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 128 << (20 - PAGE_SHIFT);
 			break;
 		case 1:
-			csrow->nr_pages = 256 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 256 << (20 - PAGE_SHIFT);
 			break;
 		case 2:
 		case 3:
-			csrow->nr_pages = 512 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 512 << (20 - PAGE_SHIFT);
 			break;
 		case 4:
-			csrow->nr_pages = 1024 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 1024 << (20 - PAGE_SHIFT);
 			break;
 		case 5:
-			csrow->nr_pages = 2048 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 2048 << (20 - PAGE_SHIFT);
 			break;
 		default:
 			edac_mc_printk(mci, KERN_ERR,
@@ -176,8 +176,8 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		}
 
 		csrow->first_page = last_page_in_mmc;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		last_page_in_mmc += csrow->nr_pages;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
+		last_page_in_mmc += dimm->nr_pages;
 		csrow->page_mask = 0;
 		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
 		dimm->mtype = MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ec5e529..95cfc0f 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -896,7 +896,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum dev_type dtype;
 	enum edac_type edac_mode;
 	int row, j;
-	u32 mbxcf, size;
+	u32 mbxcf, size, nr_pages;
 
 	/* Establish the memory type and width */
 
@@ -947,7 +947,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		case SDRAM_MBCF_SZ_2GB:
 		case SDRAM_MBCF_SZ_4GB:
 		case SDRAM_MBCF_SZ_8GB:
-			csi->nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
+			nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
 			break;
 		default:
 			ppc4xx_edac_mc_printk(KERN_ERR, mci,
@@ -973,6 +973,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		for (j = 0; j < csi->nr_channels; j++) {
 			struct dimm_info *dimm = csi->channels[j].dimm;
 
+			dimm->nr_pages  = nr_pages / csi->nr_channels;
 			dimm->grain	= 1;
 
 			dimm->mtype	= mtype;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 414a532..19f3a10 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -249,7 +249,8 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
 		dimm->grain = 1 << 14;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index cf53007..ee1543d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -561,7 +561,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
-	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -613,11 +612,11 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
-	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
+			struct dimm_info *dimm = &mci->dimms[j];
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -642,15 +641,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->nr_pages = npages;
-				csr->csrow_idx = csrow;
-				csr->nr_channels = 1;
-				csr->channels[0].chan_idx = i;
 				pvt->csrow_map[i][j] = csrow;
 				last_page += npages;
 				csrow++;
 
 				csr->channels[0].dimm = dimm;
+				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
 				dimm->mtype = mtype;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index ba0917b..6314ff9 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,7 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
+	dimm->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 7be10dd..0de288f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -373,10 +373,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < x38_channel_num; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / x38_channel_num;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5244193..8b78bd0 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -320,6 +320,8 @@ struct dimm_info {
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
+	u32 nr_pages;			/* number of pages in dimm */
+
 	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
@@ -346,12 +348,12 @@ struct rank_info {
 };
 
 struct csrow_info {
+	/* Used only by edac_mc_find_csrow_by_page() */
 	unsigned long first_page;	/* first page number in csrow */
 	unsigned long last_page;	/* last page number in csrow */
-	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
-					 * 0UL for non intlv
-					 */
+					 * 0UL for non intlv */
+
 	int csrow_idx;			/* the chip-select row */
 
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
-- 
1.7.8


* [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr()
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
                     ` (3 preceding siblings ...)
  2012-04-16 20:12     ` Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-18 14:06     ` Borislav Petkov
  2012-04-16 20:12   ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Mauro Carvalho Chehab
                     ` (2 subsequent siblings)
  7 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The edac_align_ptr() function is used to prepare data for a single
kzalloc() call: it computes the aligned offset of each data structure
inside the block that will be allocated.

Using it as-is is error-prone, because the number of elements reserved
for one structure is not passed to the call that places that structure;
it only shows up in the argument of the next call.

In order to avoid mistakes, pass the number of elements to
edac_align_ptr() itself and let it advance the pointer, making it
easier to use.
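
As an illustration only, the pattern described above can be sketched in
userspace C. This is not the kernel code: reserve(), struct ctl,
struct item and ctl_alloc() are hypothetical names, and aligning every
item to sizeof(long long) is a simplifying assumption. The sketch keeps a
running offset, lets the helper advance it by size * quant, and then does
one allocation for everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for edac_align_ptr(). '*off' is a running
 * byte offset from the start of the future allocation. The element count is
 * passed in, so the caller never has to index a fake pointer.
 * Assumption: every item is aligned to sizeof(long long).
 */
static size_t reserve(size_t *off, size_t size, size_t quant)
{
	const size_t align = sizeof(long long);
	size_t start = *off;

	assert(size > 0);

	/* Round the start of this item up to the next aligned offset. */
	if (start % align)
		start += align - start % align;

	/* Advance past all 'quant' elements for the next caller. */
	*off = start + size * quant;
	return start;
}

/* Illustrative structures, not the EDAC ones. */
struct item { long long counter; };
struct ctl {
	size_t n_items;
	struct item *items;
	void *pvt;
};

/* Lay out ctl + items[] + private area, then allocate them in one chunk. */
static struct ctl *ctl_alloc(size_t n_items, size_t sz_pvt)
{
	size_t off = 0;
	size_t o_ctl = reserve(&off, sizeof(struct ctl), 1);
	size_t o_items = reserve(&off, sizeof(struct item), n_items);
	size_t o_pvt = reserve(&off, 1, sz_pvt);
	char *base = calloc(1, off);
	struct ctl *c;

	if (!base)
		return NULL;

	/* Turn the offsets into real pointers inside the chunk. */
	c = (struct ctl *)(base + o_ctl);
	c->n_items = n_items;
	c->items = (struct item *)(base + o_items);
	c->pvt = sz_pvt ? base + o_pvt : NULL;
	return c;
}
```

Since the control structure sits at offset 0, a single free() on the
returned pointer releases the whole chunk.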

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_device.c |   27 +++++++++++----------------
 drivers/edac/edac_mc.c     |   19 +++++++++++++------
 drivers/edac/edac_module.h |    2 +-
 drivers/edac/edac_pci.c    |    7 ++++---
 4 files changed, 29 insertions(+), 26 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 4b15459..cb397d9 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	unsigned total_size;
 	unsigned count;
 	unsigned instance, block, attr;
-	void *pvt;
+	void *pvt, *p;
 	int err;
 
 	debugf4("%s() instances=%d blocks=%d\n",
@@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	 * to be at least as stringent as what the compiler would
 	 * provide if we could simply hardcode everything into a single struct.
 	 */
-	dev_ctl = (struct edac_device_ctl_info *)NULL;
+	p = NULL;
+	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
 
 	/* Calc the 'end' offset past end of ONE ctl_info structure
 	 * which will become the start of the 'instance' array
 	 */
-	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
+	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
 
 	/* Calc the 'end' offset past the instance array within the ctl_info
 	 * which will become the start of the block array
 	 */
-	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
+	count = nr_instances * nr_blocks;
+	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
 
 	/* Calc the 'end' offset past the dev_blk array
 	 * which will become the start of the attrib array, if any.
 	 */
-	count = nr_instances * nr_blocks;
-	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
-
-	/* Check for case of when an attribute array is specified */
-	if (nr_attrib > 0) {
-		/* calc how many nr_attrib we need */
+	/* calc how many nr_attrib we need */
+	if (nr_attrib > 0)
 		count *= nr_attrib;
+	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
 
-		/* Calc the 'end' offset past the attributes array */
-		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
-	} else {
-		/* no attribute array specificed */
-		pvt = edac_align_ptr(dev_attrib, sz_private);
-	}
+	/* Calc the 'end' offset past the attributes array */
+	pvt = edac_align_ptr(&p, sz_private, 1);
 
 	/* 'pvt' now points to where the private data area is.
 	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index ffedae9..98de5d1 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -108,9 +108,12 @@ EXPORT_SYMBOL_GPL(edac_mem_types);
  * If 'size' is a constant, the compiler will optimize this whole function
  * down to either a no-op or the addition of a constant to the value of 'ptr'.
  */
-void *edac_align_ptr(void *ptr, unsigned size)
+void *edac_align_ptr(void **p, unsigned size, int quant)
 {
 	unsigned align, r;
+	void *ptr = *p;
+
+	*p += size * quant;
 
 	/* Here we assume that the alignment of a "long long" is the most
 	 * stringent alignment that the compiler will ever provide by default.
@@ -132,6 +135,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
 	if (r == 0)
 		return (char *)ptr;
 
+	*p += align - r;
+
 	return (void *)(((unsigned long)ptr) + align - r);
 }
 
@@ -154,6 +159,7 @@ void *edac_align_ptr(void *ptr, unsigned size)
 struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 				unsigned nr_chans, int edac_index)
 {
+	void *ptr;
 	struct mem_ctl_info *mci;
 	struct csrow_info *csi, *csrow;
 	struct rank_info *chi, *chp, *chan;
@@ -168,11 +174,12 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * stringent as what the compiler would provide if we could simply
 	 * hardcode everything into a single struct.
 	 */
-	mci = (struct mem_ctl_info *)0;
-	csi = edac_align_ptr(&mci[1], sizeof(*csi));
-	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
-	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
+	ptr = 0;
+	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
+	dimm = edac_align_ptr(ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	mci = kzalloc(size, GFP_KERNEL);
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 00f81b4..0be4b01 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
 extern void edac_mc_reset_delay_period(int value);
 
-extern void *edac_align_ptr(void *ptr, unsigned size);
+extern void *edac_align_ptr(void **p, unsigned size, int quant);
 
 /*
  * EDAC PCI functions
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 63af1c5..9016560 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -42,13 +42,14 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 						const char *edac_pci_name)
 {
 	struct edac_pci_ctl_info *pci;
-	void *pvt;
+	void *p, *pvt;
 	unsigned int size;
 
 	debugf1("%s()\n", __func__);
 
-	pci = (struct edac_pci_ctl_info *)0;
-	pvt = edac_align_ptr(&pci[1], sz_pvt);
+	p = 0;
+	pci = edac_align_ptr(&p, sizeof(*pci), 1);
+	pvt = edac_align_ptr(&p, 1, sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	/* Alloc the needed control struct memory */
-- 
1.7.8



* [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
                     ` (4 preceding siblings ...)
  2012-04-16 20:12   ` [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr() Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-23 17:49     ` Borislav Petkov
  2012-04-16 20:12   ` [EDAC PATCH v13 7/7] edac: Change internal representation to work with layers Mauro Carvalho Chehab
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
  7 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The edac core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

As a preparation for handling non-csrow-based memory controllers,
add two enums, a struct and a macro:

enum hw_event_mc_err_type: describes the type of error
			   (corrected, uncorrected, fatal).
			   To be used by the new edac_mc_handle_error()
			   function;

enum edac_mc_layer_type: describes the type of a given memory
			 architecture layer (branch, channel, slot, csrow);

struct edac_mc_layer: describes the properties of a memory
		      layer (type, size, and whether the layer
		      will be used on a virtual csrow);

GET_POS(): as the number of layers can vary from 1 to 3,
this macro converts a position with up to 3 layer coordinates into
a linear index into an array.
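
For illustration, the linearization that GET_POS() performs can be sketched
as a plain C function. layer_index() is a hypothetical name, and the bounds
checks are an addition that the macro itself does not make. Layer 0 is the
most significant coordinate, which matches the nested multiplication used
by the macro for 2 and 3 layers.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: convert per-layer coordinates into one array index.
 * size[i] is the number of entries in layer i, pos[i] the coordinate in it.
 * Returns -1 if the layer count or a coordinate is out of range.
 */
static long layer_index(const unsigned *size, unsigned n_layers,
			const unsigned *pos)
{
	long idx = 0;
	unsigned i;

	assert(size != NULL && pos != NULL);

	if (n_layers < 1 || n_layers > 3)
		return -1;

	for (i = 0; i < n_layers; i++) {
		if (pos[i] >= size[i])
			return -1;
		/* Shift what we have so far by this layer's size. */
		idx = idx * size[i] + pos[i];
	}
	return idx;
}
```

With layer sizes 2, 3 and 4, position (1, 2, 3) maps to
3 + 4 * (2 + 3 * 1) = 23, the last element of a 24-entry array.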

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 82 insertions(+), 1 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 8b78bd0..0fdf6ba 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -67,6 +67,25 @@ enum dev_type {
 #define DEV_FLAG_X64		BIT(DEV_X64)
 
 /**
+ * enum hw_event_mc_err_type - type of the detected error
+ *
+ * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
+ *				corrected error was detected
+ * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
+ *				can't be corrected by ECC, but it is not
+ *				fatal (maybe it is on an unused memory area,
+ *				or the memory controller could recover from
+ *				it for example, by re-trying the operation).
+ * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
+ *				be recovered.
+ */
+enum hw_event_mc_err_type {
+	HW_EVENT_ERR_CORRECTED,
+	HW_EVENT_ERR_UNCORRECTED,
+	HW_EVENT_ERR_FATAL,
+};
+
+/**
  * enum mem_type - memory types. For a more detailed reference, please see
  *			http://en.wikipedia.org/wiki/DRAM
  *
@@ -308,7 +327,69 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
-/* FIXME: add a per-dimm ce error count */
+/**
+ * enum edac_mc_layer_type - memory controller hierarchy layer
+ *
+ * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
+ * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
+ * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
+ * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
+ *
+ * This enum is used by the drivers to tell edac_mc_sysfs what name should
+ * be used when describing a memory stick location.
+ */
+enum edac_mc_layer_type {
+	EDAC_MC_LAYER_BRANCH,
+	EDAC_MC_LAYER_CHANNEL,
+	EDAC_MC_LAYER_SLOT,
+	EDAC_MC_LAYER_CHIP_SELECT,
+};
+
+/**
+ * struct edac_mc_layer - describes the memory controller hierarchy
+ * @type:		layer type
+ * @size:		maximum size of the layer
+ * @is_csrow:		This layer is part of the "csrow" when old API
+ *			compatibility mode is enabled. Otherwise, it is
+ *			a channel
+ */
+struct edac_mc_layer {
+	enum edac_mc_layer_type	type;
+	unsigned		size;
+	bool			is_csrow;
+};
+
+/*
+ * Maximum number of layers used by the memory controller to uniquely
+ * identify a single memory stick.
+ * NOTE: incrementing it would require changes at edac_mc_handle_error()
+ * and at the routines at edac_mc_sysfs that create layers
+ */
+#define EDAC_MAX_LAYERS		3
+
+/*
+ * A loop could be used here to make it more generic, but, as we only have
+ * 3 layers, this is a little faster. By design, layers can never be 0 or
+ * more than 3. If that ever happens, a NULL is returned, causing an OOPS
+ * during the memory allocation routine, which would point to the developer
+ * that he's doing something wrong.
+ */
+#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
+	typeof(var) __p;						\
+	if ((nlayers) == 1)						\
+		__p = &var[lay0];					\
+	else if ((nlayers) == 2)					\
+		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
+	else if ((nlayers) == 3)					\
+		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
+			    ((layers[1]).size * (lay0))))];		\
+	else								\
+		__p = NULL;						\
+	__p;								\
+})
+
+
+/* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
 	unsigned memory_controller;
-- 
1.7.8



* [EDAC PATCH v13 7/7] edac: Change internal representation to work with layers
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
                     ` (5 preceding siblings ...)
  2012-04-16 20:12   ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Mauro Carvalho Chehab
@ 2012-04-16 20:12   ` Mauro Carvalho Chehab
  2012-04-18 18:22     ` [PATCH] " Mauro Carvalho Chehab
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
  7 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:12 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Aristeu Rozanski,
	Hitoshi Mitake, Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, change the allocation and error report routines to allow
them to work with all types of architectures.
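
As a rough sketch of what the new allocation routine has to derive from a
layer description, the counting can be written as below. This mirrors the
loop at the start of new_edac_mc_alloc() in this patch, but struct layer,
struct totals and count_layers() are hypothetical userspace names, and the
layer type field is left out because it does not affect the counts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mirror of struct edac_mc_layer, reduced to what is counted. */
struct layer {
	unsigned size;		/* number of entries in this layer */
	bool is_csrow;		/* layer maps to the virtual csrow */
};

struct totals {
	unsigned dimms;		/* product of all layer sizes */
	unsigned csrows;	/* product of the is_csrow layers */
	unsigned cschannels;	/* product of the remaining layers */
};

static struct totals count_layers(const struct layer *layers,
				  unsigned n_layers)
{
	struct totals t = { 1, 1, 1 };
	unsigned i;

	assert(layers != NULL && n_layers >= 1 && n_layers <= 3);

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_csrow)
			t.csrows *= layers[i].size;
		else
			t.cschannels *= layers[i].size;
	}
	return t;
}
```

For a controller described as 4 channels (not csrow) by 3 slots (csrow),
this gives 12 dimms, exposed to the old API as 3 virtual csrows with
4 virtual channels each.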

This will allow the removal of several hacks on FB-DIMM and RAMBUS
memory controllers on the next patches.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the label for the other ranks of the same DIMM.
This bug has been there since the beginning of EDAC, so it is not a big
deal. On several drivers it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be the same for all of them. So, don't try to fix it here yet.

PS.: I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger on the drivers for FB-DIMM's.

FIXME: while the FB-DIMMs are not converted to use the new
design, uncorrected errors will show just one channel. In
the past, all changes were on a big patch with about 150K.
As it needed to be split, in order to be accepted by the
EDAC ML at vger, we've opted to have this small drawback.
As an advantage, it is now easier to review the patch series.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/edac_core.h |   92 ++++++-
 drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
 include/linux/edac.h     |   40 ++-
 3 files changed, 526 insertions(+), 288 deletions(-)

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..7201bb1 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page,
 			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
+
+
 
 /*
  * edac_device APIs
@@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 98de5d1..f231c54 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -141,10 +159,25 @@ void *edac_align_ptr(void **p, unsigned size, int quant)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * new_edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels in reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, in others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -156,18 +189,41 @@ void *edac_align_ptr(void **p, unsigned size, int quant)
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt)
 {
 	void *ptr;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *lay;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_cschannels;
+	int i, j;
 	int err;
+	int row, chn;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_cschannels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_cschannels *= layers[i].size;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -176,12 +232,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	ptr = 0;
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+	}
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
+		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -189,42 +254,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = lay;
+	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_cschannels;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_cschannels;
+		chp = &chi[row * tot_cschannels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_cschannels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = GET_POS(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_csrow)
+					break;
+			chn++;
+			if (chn == tot_cschannels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -248,6 +370,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fill csrows/cs channels in reverse order
+ * @size_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk, which is more efficient.
+ * This can only be used if all structures have the same lifetime; otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -513,7 +686,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -540,6 +712,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -697,261 +871,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: dimm csrows (%d,%d)\n",
+				__func__, dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 0fdf6ba..1439670 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -392,18 +392,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -423,9 +425,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -477,6 +480,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -521,13 +529,16 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
 
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -542,12 +553,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -560,7 +574,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes
-- 
1.7.8


* [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers
  2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
                     ` (6 preceding siblings ...)
  2012-04-16 20:12   ` [EDAC PATCH v13 7/7] edac: Change internal representation to work with layers Mauro Carvalho Chehab
@ 2012-04-16 20:21   ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
                       ` (25 more replies)
  7 siblings, 26 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Convert the EDAC MC drivers to use the new ABI. For the csrow-based
memory controllers, no functional changes should be noticed.

The RAMBUS, FB-DIMM, Nehalem and SandyBridge-EP drivers benefit from
this series, as the EDAC core will now know the real memory config
instead of relying on fake/virtual information.
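
As an illustration only (this is not code from the series): a minimal
userspace sketch of how a per-layer location such as branch/channel/slot
could be flattened into a single DIMM index, with the last layer varying
fastest. The names struct layer and dimm_flat_index() are hypothetical;
the real core uses struct edac_mc_layer and its own helpers.

```c
#include <stddef.h>

/* Hypothetical stand-in for the layer description used by the series. */
struct layer {
	unsigned size;	/* number of elements at this layer */
};

/*
 * Row-major flattening: the last layer varies fastest.
 * Returns -1 if any coordinate is negative (unknown) or out of range.
 */
static int dimm_flat_index(const struct layer *layers, unsigned n_layers,
			   const int *pos)
{
	int index = 0;
	unsigned i;

	for (i = 0; i < n_layers; i++) {
		if (pos[i] < 0 || (unsigned)pos[i] >= layers[i].size)
			return -1;
		index = index * (int)layers[i].size + pos[i];
	}
	return index;
}
```

With layers of size 2 (branch), 2 (channel) and 4 (slot), the location
(1, 0, 3) would map to index (1 * 2 + 0) * 4 + 3 = 11 under this scheme.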

The userspace ABI, however, does not change. A later series will
fix the userspace ABI.

All of these patches, except for the last one, were previously part of
a single patch:

	https://lkml.org/lkml/2012/3/29/305

That patch also handled the EDAC core changes. However, in order to
simplify the driver review, the big patch was split into a few
EDAC core changes and this series of patches.

Mauro Carvalho Chehab (26):
  amd64_edac: convert driver to use the new edac ABI
  amd76x_edac: convert driver to use the new edac ABI
  cell_edac: convert driver to use the new edac ABI
  cpc925_edac: convert driver to use the new edac ABI
  e752x_edac: convert driver to use the new edac ABI
  e7xxx_edac: convert driver to use the new edac ABI
  i3000_edac: convert driver to use the new edac ABI
  i3200_edac: convert driver to use the new edac ABI
  i5000_edac: convert driver to use the new edac ABI
  i5100_edac: convert driver to use the new edac ABI
  i5400_edac: convert driver to use the new edac ABI
  i7300_edac: convert driver to use the new edac ABI
  i7core_edac: convert driver to use the new edac ABI
  i82443bxgx_edac: convert driver to use the new edac ABI
  i82860_edac: convert driver to use the new edac ABI
  i82875p_edac: convert driver to use the new edac ABI
  i82975x_edac: convert driver to use the new edac ABI
  mpc85xx_edac: convert driver to use the new edac ABI
  mv64x60_edac: convert driver to use the new edac ABI
  pasemi_edac: convert driver to use the new edac ABI
  ppc4xx_edac: convert driver to use the new edac ABI
  r82600_edac: convert driver to use the new edac ABI
  sb_edac: convert driver to use the new edac ABI
  tile_edac: convert driver to use the new edac ABI
  x38_edac: convert driver to use the new edac ABI
  edac: Remove the legacy EDAC ABI

 drivers/edac/amd64_edac.c      |  137 +++++++++++++++++--------
 drivers/edac/amd76x_edac.c     |   28 ++++--
 drivers/edac/cell_edac.c       |   26 ++++--
 drivers/edac/cpc925_edac.c     |   23 ++++-
 drivers/edac/e752x_edac.c      |   49 ++++++---
 drivers/edac/e7xxx_edac.c      |   37 ++++++--
 drivers/edac/edac_core.h       |   78 +--------------
 drivers/edac/edac_mc.c         |   53 +----------
 drivers/edac/i3000_edac.c      |   25 ++++-
 drivers/edac/i3200_edac.c      |   32 ++++--
 drivers/edac/i5000_edac.c      |   60 +++++++----
 drivers/edac/i5100_edac.c      |   90 ++++++++---------
 drivers/edac/i5400_edac.c      |  217 ++++++++++++++++++++++------------------
 drivers/edac/i7300_edac.c      |   81 ++++++---------
 drivers/edac/i7core_edac.c     |  202 +++++++++----------------------------
 drivers/edac/i82443bxgx_edac.c |   26 +++--
 drivers/edac/i82860_edac.c     |   42 ++++++---
 drivers/edac/i82875p_edac.c    |   29 ++++--
 drivers/edac/i82975x_edac.c    |   27 ++++--
 drivers/edac/mpc85xx_edac.c    |   22 +++-
 drivers/edac/mv64x60_edac.c    |   25 ++++-
 drivers/edac/pasemi_edac.c     |   25 +++--
 drivers/edac/ppc4xx_edac.c     |   25 +++--
 drivers/edac/r82600_edac.c     |   27 ++++--
 drivers/edac/sb_edac.c         |  159 ++++++++++-------------------
 drivers/edac/tile_edac.c       |   16 +++-
 drivers/edac/x38_edac.c        |   28 ++++--
 27 files changed, 800 insertions(+), 789 deletions(-)

-- 
1.7.8



* [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-05-07 14:31       ` Borislav Petkov
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 02/26] amd76x_edac: " Mauro Carvalho Chehab
                       ` (24 subsequent siblings)
  25 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Borislav Petkov

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.
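
For readers following the conversion, here is a rough userspace sketch
of the matching idea behind passing -1 for an unknown layer: a negative
coordinate acts as a wildcard, and the labels of all populated DIMMs
that match are joined with " or ". This is an assumption-laden
illustration, not the core code; struct dimm and match_dimms() are
hypothetical names. Unlike this sketch, the core reports "any memory"
when every layer is unknown.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LAYERS 3
#define LABEL_LEN 31

/* Hypothetical, simplified stand-in for struct dimm_info. */
struct dimm {
	char label[LABEL_LEN + 1];
	int location[MAX_LAYERS];
	unsigned nr_pages;	/* 0 means the slot is not populated */
};

/*
 * Collect the labels of every populated DIMM matching pos[], where a
 * negative coordinate means "unknown" and matches anything.
 * Returns the number of labels written; out is always NUL-terminated.
 */
static int match_dimms(const struct dimm *dimms, int n_dimms,
		       const int pos[MAX_LAYERS], char *out, size_t out_len)
{
	int i, j, matches = 0;
	size_t used = 0;

	if (out_len == 0)
		return 0;
	out[0] = '\0';

	for (i = 0; i < n_dimms; i++) {
		int ok = 1;

		for (j = 0; j < MAX_LAYERS; j++)
			if (pos[j] >= 0 && pos[j] != dimms[i].location[j])
				ok = 0;
		if (!ok || !dimms[i].nr_pages)
			continue;

		int n = snprintf(out + used, out_len - used, "%s%s",
				 matches ? " or " : "", dimms[i].label);
		if (n < 0 || (size_t)n >= out_len - used)
			break;	/* buffer full: stop appending */
		used += (size_t)n;
		matches++;
	}
	return matches;
}
```

A driver that knows only the csrow would, in this model, pass
{ csrow, -1, -1 } and get back every populated DIMM on that csrow.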

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c |  137 ++++++++++++++++++++++++++++++---------------
 1 files changed, 92 insertions(+), 45 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 8804ac8..c4182e4 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	int channel, csrow;
 	u32 page, offset;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
+	/*
+	 * Find out which node the error address belongs to. This may be
+	 * different from the node that detected the error.
+	 */
+	src_mci = find_mc_by_sys_addr(mci, sys_addr);
+	if (!src_mci) {
+		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
+			     (unsigned long)sys_addr);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a node",
+				     NULL);
+		return;
+	}
+
+	/* Now map the sys_addr to a CSROW */
+	csrow = sys_addr_to_csrow(src_mci, sys_addr);
+	if (csrow < 0) {
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
+		return;
+	}
+
 	/* CHIPKILL enabled */
 	if (pvt->nbcfg & NBCFG_CHIPKILL) {
 		channel = get_channel_from_ecc_syndrome(mci, syndrome);
@@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 			 * 2 DIMMs is in error. So we need to ID 'both' of them
 			 * as suspect.
 			 */
-			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
-					   "error reporting race\n", syndrome);
-			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
+				      "possible error reporting race\n",
+				      syndrome);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, offset, syndrome,
+					     csrow, -1, -1,
+					     EDAC_MOD_STR,
+					     "unknown syndrome - possible error reporting race",
+					     NULL);
 			return;
 		}
 	} else {
@@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 		channel = ((sys_addr & BIT(3)) != 0);
 	}
 
-	/*
-	 * Find out which node the error address belongs to. This may be
-	 * different from the node that detected the error.
-	 */
-	src_mci = find_mc_by_sys_addr(mci, sys_addr);
-	if (!src_mci) {
-		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
-			     (unsigned long)sys_addr);
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
-		return;
-	}
-
-	/* Now map the sys_addr to a CSROW */
-	csrow = sys_addr_to_csrow(src_mci, sys_addr);
-	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
-	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-
-		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
-				  channel, EDAC_MOD_STR);
-	}
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
+			     page, offset, syndrome,
+			     csrow, channel, -1,
+			     EDAC_MOD_STR, "", NULL);
 }
 
 static int ddr2_cs_size(unsigned i, bool dct_width)
@@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	u32 page, offset;
 	int nid, csrow, chan = 0;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
 	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
 
 	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
 		return;
 	}
 
-	error_address_to_page_and_offset(sys_addr, &page, &offset);
-
 	/*
 	 * We need the syndromes for channel detection only when we're
 	 * ganged. Otherwise @chan should already contain the channel at
@@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	if (dct_ganging_enabled(pvt))
 		chan = get_channel_from_ecc_syndrome(mci, syndrome);
 
-	if (chan >= 0)
-		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
-				  EDAC_MOD_STR);
-	else
-		/*
-		 * Channel unknown, report all channels on this CSROW as failed.
-		 */
-		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
-			edac_mc_handle_ce(mci, page, offset, syndrome,
-					  csrow, chan, EDAC_MOD_STR);
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				page, offset, syndrome,
+				csrow, chan, -1,
+				EDAC_MOD_STR, "", NULL);
 }
 
 /*
@@ -1875,7 +1893,12 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
 	/* Ensure that the Error Address is VALID */
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
@@ -1899,11 +1922,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
 	sys_addr = get_error_address(m);
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
 
 	/*
 	 * Find out which node the error address belongs to. This may be
@@ -1913,7 +1942,11 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (!src_mci) {
 		amd64_mc_err(mci, "ERROR ADDRESS (0x%lx) NOT mapped to a MC\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to a MC", NULL);
 		return;
 	}
 
@@ -1923,10 +1956,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (csrow < 0) {
 		amd64_mc_err(mci, "ERROR_ADDRESS (0x%lx) NOT mapped to CS\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to CS",
+				     NULL);
 	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-		edac_mc_handle_ue(log_mci, page, offset, csrow, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     csrow, -1, -1,
+				     EDAC_MOD_STR, "", NULL);
 	}
 }
 
@@ -2486,6 +2526,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	struct amd64_pvt *pvt = NULL;
 	struct amd64_family_type *fam_type = NULL;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	int err = 0, ret;
 	u8 nid = get_node_id(F2);
 
@@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	ret = -ENOMEM;
-	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = pvt->csels[0].b_cnt;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = pvt->channel_count;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 02/26] amd76x_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 03/26] cell_edac: " Mauro Carvalho Chehab
                       ` (23 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Michal Marek

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.
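
For readers following the conversion, the allocation pattern used by the
converted probe functions can be sketched in plain userspace C. This is
only an illustration under stated assumptions: the sketch_* names are
hypothetical stand-ins and not the kernel's struct edac_mc_layer or
new_edac_mc_alloc(), and the "total slots is the product of the layer
sizes" rule is an assumption about how the core sizes its arrays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel's definitions. */
enum sketch_layer_type {
	SKETCH_LAYER_CHIP_SELECT,
	SKETCH_LAYER_CHANNEL,
};

struct sketch_layer {
	enum sketch_layer_type type;
	unsigned int size;	/* number of entries at this layer */
	bool is_csrow;		/* true if this layer plays the csrow role */
};

/* Describe a controller as csrows x channels, as the converted probes do. */
static void sketch_fill_layers(struct sketch_layer layers[2],
			       unsigned int nr_csrows,
			       unsigned int nr_channels)
{
	assert(nr_csrows > 0 && nr_channels > 0);

	layers[0].type = SKETCH_LAYER_CHIP_SELECT;
	layers[0].size = nr_csrows;
	layers[0].is_csrow = true;
	layers[1].type = SKETCH_LAYER_CHANNEL;
	layers[1].size = nr_channels;
	layers[1].is_csrow = false;
}

/* Assumption: the number of addressable slots is the product of layer sizes. */
static unsigned int sketch_total_slots(const struct sketch_layer *layers,
				       size_t n_layers)
{
	unsigned int total = 1;
	size_t i;

	for (i = 0; i < n_layers; i++)
		total *= layers[i].size;
	return total;
}
```

With 8 csrows and a single channel, as in this driver, the sketch
describes 8 slots.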

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd76x_edac.c |   28 +++++++++++++++++++---------
 1 files changed, 19 insertions(+), 9 deletions(-)

diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 1532750..fd9ab44 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -29,7 +29,6 @@
 	edac_mc_chipset_printk(mci, level, "amd76x", fmt, ##arg)
 
 #define AMD76X_NR_CSROWS 8
-#define AMD76X_NR_CHANS  1
 #define AMD76X_NR_DIMMS  4
 
 /* AMD 76x register addresses - device 0 function 0 - PCI bridge */
@@ -146,8 +145,10 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors) {
 			row = (info->ecc_mode_status >> 4) & 0xf;
-			edac_mc_handle_ue(mci, mci->csrows[row].first_page, 0,
-					row, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     mci->csrows[row].first_page, 0, 0,
+					     row, 0, -1,
+					     mci->ctl_name, "", NULL);
 		}
 	}
 
@@ -159,8 +160,10 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors) {
 			row = info->ecc_mode_status & 0xf;
-			edac_mc_handle_ce(mci, mci->csrows[row].first_page, 0,
-					0, row, 0, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     mci->csrows[row].first_page, 0, 0,
+					     row, 0, -1,
+					     mci->ctl_name, "", NULL);
 		}
 	}
 
@@ -232,7 +235,8 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 		EDAC_SECDED,
 		EDAC_SECDED
 	};
-	struct mem_ctl_info *mci = NULL;
+	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u32 ems;
 	u32 ems_mode;
 	struct amd76x_error_info discard;
@@ -240,11 +244,17 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	debugf0("%s()\n", __func__);
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
-	mci = edac_mc_alloc(0, AMD76X_NR_CSROWS, AMD76X_NR_CHANS, 0);
 
-	if (mci == NULL) {
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = AMD76X_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+
+	if (mci == NULL)
 		return -ENOMEM;
-	}
 
 	debugf0("%s(): mci = %p\n", __func__, mci);
 	mci->dev = &pdev->dev;
-- 
1.7.8



* [EDAC_ABI PATCH v13 03/26] cell_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 02/26] amd76x_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 04/26] cpc925_edac: " Mauro Carvalho Chehab
                       ` (22 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Jiri Kosina, Joe Perches

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.
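
The converted calls pass three layer positions to the error handler and
use -1 for any position the driver cannot determine. As a rough
illustration of how a consumer might treat those values, here is a
userspace sketch. The function name, the "layerN:M" output format and
the "unknown location" fallback are all hypothetical; they are not what
the EDAC core prints.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_LAYERS 3

/*
 * Hypothetical helper: build a location string from up to three layer
 * positions, skipping any negative (unknown) position.
 * Returns the number of known positions written to buf.
 */
static int sketch_format_location(char *buf, size_t len,
				  const int pos[SKETCH_MAX_LAYERS])
{
	size_t used = 0;
	int known = 0;
	int i;

	assert(buf != NULL);
	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < SKETCH_MAX_LAYERS; i++) {
		int n;

		if (pos[i] < 0)
			continue;	/* position not known for this layer */

		n = snprintf(buf + used, len - used, "%slayer%d:%d",
			     known ? " " : "", i, pos[i]);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial entry */
			break;
		}
		used = strlen(buf);
		known++;
	}

	if (known == 0)
		snprintf(buf, len, "unknown location");
	return known;
}
```

A call that knows only the first two positions, such as (0, chan, -1),
yields two entries in this sketch, while (-1, -1, -1) falls back to the
generic string.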

Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/cell_edac.c |   26 +++++++++++++++++++-------
 1 files changed, 19 insertions(+), 7 deletions(-)

diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 09e1b5d..4cfd22a 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -48,8 +48,9 @@ static void cell_edac_count_ce(struct mem_ctl_info *mci, int chan, u64 ar)
 	syndrome = (ar & 0x000000001fe00000ul) >> 21;
 
 	/* TODO: Decoding of the error address */
-	edac_mc_handle_ce(mci, csrow->first_page + pfn, offset,
-			  syndrome, 0, chan, "");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     csrow->first_page + pfn, offset, syndrome,
+			     0, chan, -1, "", "", NULL);
 }
 
 static void cell_edac_count_ue(struct mem_ctl_info *mci, int chan, u64 ar)
@@ -69,7 +70,9 @@ static void cell_edac_count_ue(struct mem_ctl_info *mci, int chan, u64 ar)
 	offset = address & ~PAGE_MASK;
 
 	/* TODO: Decoding of the error address */
-	edac_mc_handle_ue(mci, csrow->first_page + pfn, offset, 0, "");
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			     csrow->first_page + pfn, offset, 0,
+			     0, chan, -1, "", "", NULL);
 }
 
 static void cell_edac_check(struct mem_ctl_info *mci)
@@ -156,7 +159,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, dimm->nr_pages);
+			csrow->first_page, nr_pages);
 		break;
 	}
 }
@@ -165,9 +168,10 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 {
 	struct cbe_mic_tm_regs __iomem	*regs;
 	struct mem_ctl_info		*mci;
+	struct edac_mc_layer		layers[2];
 	struct cell_edac_priv		*priv;
 	u64				reg;
-	int				rc, chanmask;
+	int				rc, chanmask, num_chans;
 
 	regs = cbe_get_cpu_mic_tm_regs(cbe_node_to_cpu(pdev->id));
 	if (regs == NULL)
@@ -192,8 +196,16 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 		in_be64(&regs->mic_fir));
 
 	/* Allocate & init EDAC MC data structure */
-	mci = edac_mc_alloc(sizeof(struct cell_edac_priv), 1,
-			    chanmask == 3 ? 2 : 1, pdev->id);
+	num_chans = chanmask == 3 ? 2 : 1;
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 1;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = num_chans;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct cell_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
 	priv = mci->pvt_info;
-- 
1.7.8



* [EDAC_ABI PATCH v13 04/26] cpc925_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (2 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 03/26] cell_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 05/26] e752x_edac: " Mauro Carvalho Chehab
                       ` (21 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Michal Marek

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.

Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/cpc925_edac.c |   23 ++++++++++++++++++-----
 1 files changed, 18 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 7b764a8..adbeb13 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -555,13 +555,18 @@ static void cpc925_mc_check(struct mem_ctl_info *mci)
 	if (apiexcp & CECC_EXCP_DETECTED) {
 		cpc925_mc_printk(mci, KERN_INFO, "DRAM CECC Fault\n");
 		channel = cpc925_mc_find_channel(mci, syndrome);
-		edac_mc_handle_ce(mci, pfn, offset, syndrome,
-				  csrow, channel, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, offset, syndrome,
+				     csrow, channel, -1,
+				     mci->ctl_name, "", NULL);
 	}
 
 	if (apiexcp & UECC_EXCP_DETECTED) {
 		cpc925_mc_printk(mci, KERN_INFO, "DRAM UECC Fault\n");
-		edac_mc_handle_ue(mci, pfn, offset, csrow, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, offset, 0,
+				     csrow, -1, -1,
+				     mci->ctl_name, "", NULL);
 	}
 
 	cpc925_mc_printk(mci, KERN_INFO, "Dump registers:\n");
@@ -933,6 +938,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 {
 	static int edac_mc_idx;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	void __iomem *vbase;
 	struct cpc925_mc_pdata *pdata;
 	struct resource *r;
@@ -969,8 +975,15 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	}
 
 	nr_channels = cpc925_mc_get_channels(vbase) + 1;
-	mci = edac_mc_alloc(sizeof(struct cpc925_mc_pdata),
-			CPC925_NR_CSROWS, nr_channels, edac_mc_idx);
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = CPC925_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct cpc925_mc_pdata));
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
 		res = -ENOMEM;
-- 
1.7.8



* [EDAC_ABI PATCH v13 05/26] e752x_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (3 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 04/26] cpc925_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 06/26] e7xxx_edac: " Mauro Carvalho Chehab
                       ` (20 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Mark Gross, Doug Thompson

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.

Cc: Mark Gross <mark.gross@intel.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/e752x_edac.c |   49 +++++++++++++++++++++++++++++++-------------
 1 files changed, 34 insertions(+), 15 deletions(-)

diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 6d81d3c..7b6ce11 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -6,6 +6,9 @@
  *
  * See "enum e752x_chips" below for supported chipsets
  *
+ * Datasheet:
+ *	http://www.intel.in/content/www/in/en/chipsets/e7525-memory-controller-hub-datasheet.html
+ *
  * Written by Tom Zimmerman
  *
  * Contributors:
@@ -350,8 +353,10 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	channel = !(error_one & 1);
 
 	/* e752x mc reads 34:6 of the DRAM linear address */
-	edac_mc_handle_ce(mci, page, offset_in_page(sec1_add << 4),
-			sec1_syndrome, row, channel, "e752x CE");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     page, offset_in_page(sec1_add << 4), sec1_syndrome,
+			     row, channel, -1,
+			     "e752x CE", "", NULL);
 }
 
 static inline void process_ce(struct mem_ctl_info *mci, u16 error_one,
@@ -385,9 +390,12 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 			edac_mc_find_csrow_by_page(mci, block_page);
 
 		/* e752x mc reads 34:6 of the DRAM linear address */
-		edac_mc_handle_ue(mci, block_page,
-				offset_in_page(error_2b << 4),
-				row, "e752x UE from Read");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					block_page,
+					offset_in_page(error_2b << 4), 0,
+					 row, -1, -1,
+					"e752x UE from Read", "", NULL);
+
 	}
 	if (error_one & 0x0404) {
 		error_2b = scrb_add;
@@ -401,9 +409,11 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 			edac_mc_find_csrow_by_page(mci, block_page);
 
 		/* e752x mc reads 34:6 of the DRAM linear address */
-		edac_mc_handle_ue(mci, block_page,
-				offset_in_page(error_2b << 4),
-				row, "e752x UE from Scruber");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					block_page,
+					offset_in_page(error_2b << 4), 0,
+					row, -1, -1,
+					"e752x UE from Scrubber", "", NULL);
 	}
 }
 
@@ -426,7 +436,9 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 		return;
 
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ue_no_info(mci, "e752x UE log memory write");
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+			     -1, -1, -1,
+			     "e752x UE log memory write", "", NULL);
 }
 
 static void do_process_ded_retry(struct mem_ctl_info *mci, u16 error,
@@ -1081,10 +1093,11 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
-		for (i = 0; i < drc_chan + 1; i++) {
+		for (i = 0; i < csrow->nr_channels; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
 
-			dimm->nr_pages = nr_pages / (drc_chan + 1);
+			debugf3("Initializing rank at (%i,%i)\n", index, i);
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
@@ -1232,6 +1245,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	u16 pci_data;
 	u8 stat8;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct e752x_pvt *pvt;
 	u16 ddrcsr;
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
@@ -1258,11 +1272,16 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	/* Dual channel = 1, Single channel = 0 */
 	drc_chan = dual_channel_active(ddrcsr);
 
-	mci = edac_mc_alloc(sizeof(*pvt), E752X_NR_CSROWS, drc_chan + 1, 0);
-
-	if (mci == NULL) {
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = E752X_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = drc_chan + 1;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
+	if (mci == NULL)
 		return -ENOMEM;
-	}
 
 	debugf3("%s(): init mci\n", __func__);
 	mci->mtype_cap = MEM_FLAG_RDDR;
-- 
1.7.8



* [EDAC_ABI PATCH v13 06/26] e7xxx_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (4 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 05/26] e752x_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 07/26] i3000_edac: " Mauro Carvalho Chehab
                       ` (19 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/e7xxx_edac.c |   37 ++++++++++++++++++++++++++++++-------
 1 files changed, 30 insertions(+), 7 deletions(-)

diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index aeb69f0..9380f8a 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -10,6 +10,9 @@
  * Based on work by Dan Hollis <goemon at anime dot net> and others.
  *	http://www.anime.net/~goemon/linux-ecc/
  *
+ * Datasheet:
+ *	http://www.intel.com/content/www/us/en/chipsets/e7501-chipset-memory-controller-hub-datasheet.html
+ *
  * Contributors:
  *	Eric Biederman (Linux Networx)
  *	Tom Zimmerman (Linux Networx)
@@ -71,7 +74,7 @@
 #endif				/* PCI_DEVICE_ID_INTEL_7505_1_ERR */
 
 #define E7XXX_NR_CSROWS		8	/* number of csrows */
-#define E7XXX_NR_DIMMS		8	/* FIXME - is this correct? */
+#define E7XXX_NR_DIMMS		8	/* 2 channels, 4 dimms/channel */
 
 /* E7XXX register addresses - device 0 function 0 */
 #define E7XXX_DRB		0x60	/* DRAM row boundary register (8b) */
@@ -216,13 +219,15 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	row = edac_mc_find_csrow_by_page(mci, page);
 	/* convert syndrome to channel */
 	channel = e7xxx_find_channel(syndrome);
-	edac_mc_handle_ce(mci, page, 0, syndrome, row, channel, "e7xxx CE");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, page, 0, syndrome,
+			     row, channel, -1, "e7xxx CE", "", NULL);
 }
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ce_no_info(mci, "e7xxx CE log register overflow");
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0, -1, -1, -1,
+			     "e7xxx CE log register overflow", "", NULL);
 }
 
 static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
@@ -236,13 +241,17 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	/* FIXME - should use PAGE_SHIFT */
 	block_page = error_2b >> 6;	/* convert to 4k address */
 	row = edac_mc_find_csrow_by_page(mci, block_page);
-	edac_mc_handle_ue(mci, block_page, 0, row, "e7xxx UE");
+
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, block_page, 0, 0,
+			     row, -1, -1, "e7xxx UE", "", NULL);
 }
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
 	debugf3("%s()\n", __func__);
-	edac_mc_handle_ue_no_info(mci, "e7xxx UE log register overflow");
+
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
+			     "e7xxx UE log register overflow", "", NULL);
 }
 
 static void e7xxx_get_error_info(struct mem_ctl_info *mci,
@@ -413,6 +422,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	u16 pci_data;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	struct e7xxx_pvt *pvt = NULL;
 	u32 drc;
 	int drc_chan;
@@ -423,8 +433,21 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
 	drc_chan = dual_channel_active(drc, dev_idx);
-	mci = edac_mc_alloc(sizeof(*pvt), E7XXX_NR_CSROWS, drc_chan + 1, 0);
-
+	/*
+	 * According to the datasheet, this device has a maximum of
+	 * 4 DIMMs per channel, either single-rank or dual-rank. So, the
+	 * total number of DIMMs is 8 (E7XXX_NR_DIMMS).
+	 * That means that each DIMM is mapped as a csrow, and the channel
+	 * maps the rank. So, an error on either channel should be
+	 * attributed to the same DIMM.
+	 */
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = E7XXX_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = drc_chan + 1;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (mci == NULL)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 07/26] i3000_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (5 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 06/26] e7xxx_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 08/26] i3200_edac: " Mauro Carvalho Chehab
                       ` (18 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Jason Uhlenkott

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.

Cc: Jason Uhlenkott <juhlenko@akamai.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i3000_edac.c |   25 ++++++++++++++++++++-----
 1 files changed, 20 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index bf8a230..9b2c561 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -245,7 +245,9 @@ static int i3000_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & I3000_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -256,10 +258,15 @@ static int i3000_process_error_info(struct mem_ctl_info *mci,
 	row = edac_mc_find_csrow_by_page(mci, pfn);
 
 	if (info->errsts & I3000_ERRSTS_UE)
-		edac_mc_handle_ue(mci, pfn, offset, row, "i3000 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, offset, 0,
+				     row, -1, -1,
+				     "i3000 UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, pfn, offset, info->derrsyn, row,
-				multi_chan ? channel : 0, "i3000 CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, offset, info->derrsyn,
+				     row, multi_chan ? channel : 0, -1,
+				     "i3000 CE", "", NULL);
 
 	return 1;
 }
@@ -306,6 +313,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
@@ -347,7 +355,14 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	interleaved = i3000_is_interleaved(c0dra, c1dra, c0drb, c1drb);
 	nr_channels = interleaved ? 2 : 1;
-	mci = edac_mc_alloc(0, I3000_RANKS / nr_channels, nr_channels, 0);
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I3000_RANKS / nr_channels;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 08/26] i3200_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (6 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 07/26] i3000_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 09/26] i5000_edac: " Mauro Carvalho Chehab
                       ` (17 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Hitoshi Mitake, Borislav Petkov,
	Andrew Morton

The legacy EDAC ABI is going to be removed. Port the driver to the new
API so that it can benefit from the new functionality.

Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i3200_edac.c |   32 ++++++++++++++++++++++----------
 1 files changed, 22 insertions(+), 10 deletions(-)

diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index b82667f..c926ff0 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -23,6 +23,7 @@
 
 #define PCI_DEVICE_ID_INTEL_3200_HB    0x29f0
 
+#define I3200_DIMMS		4
 #define I3200_RANKS		8
 #define I3200_RANKS_PER_CHANNEL	4
 #define I3200_CHANNELS		2
@@ -217,21 +218,25 @@ static void i3200_process_error_info(struct mem_ctl_info *mci,
 		return;
 
 	if ((info->errsts ^ info->errsts2) & I3200_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	for (channel = 0; channel < nr_channels; channel++) {
 		log = info->eccerrlog[channel];
 		if (log & I3200_ECCERRLOG_UE) {
-			edac_mc_handle_ue(mci, 0, 0,
-				eccerrlog_row(channel, log),
-				"i3200 UE");
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     0, 0, 0,
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "i3200 UE", "", NULL);
 		} else if (log & I3200_ECCERRLOG_CE) {
-			edac_mc_handle_ce(mci, 0, 0,
-				eccerrlog_syndrome(log),
-				eccerrlog_row(channel, log), 0,
-				"i3200 CE");
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, eccerrlog_syndrome(log),
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "i3200 CE", "", NULL);
 		}
 	}
 }
@@ -321,6 +326,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u16 drbs[I3200_CHANNELS][I3200_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -335,8 +341,14 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	i3200_get_drbs(window, drbs);
 	nr_channels = how_many_channels(pdev);
 
-	mci = edac_mc_alloc(sizeof(struct i3200_priv), I3200_RANKS,
-		nr_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I3200_DIMMS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_channels;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(struct i3200_priv));
 	if (!mci)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 09/26] i5000_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (7 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 08/26] i3200_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 10/26] i5100_edac: " Mauro Carvalho Chehab
                       ` (16 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i5000_edac.c |   60 +++++++++++++++++++++++++++++---------------
 1 files changed, 39 insertions(+), 21 deletions(-)

diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index e8d32e8..18b2532 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -533,13 +533,14 @@ static void i5000_process_fatal_error_info(struct mem_ctl_info *mci,
 
 	/* Form out message */
 	snprintf(msg, sizeof(msg),
-		 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d CAS=%d "
-		 "FATAL Err=0x%x (%s))",
-		 branch >> 1, bank, rdwr ? "Write" : "Read", ras, cas,
-		 allErrors, specific);
+		 "Bank=%d RAS=%d CAS=%d FATAL Err=0x%x (%s)",
+		 bank, ras, cas, allErrors, specific);
 
 	/* Call the helper to output message */
-	edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+	edac_mc_handle_error(HW_EVENT_ERR_FATAL, mci, 0, 0, 0,
+			     branch >> 1, -1, rank,
+			     rdwr ? "Write error" : "Read error",
+			     msg, NULL);
 }
 
 /*
@@ -633,13 +634,14 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d "
-			 "CAS=%d, UE Err=0x%x (%s))",
-			 branch >> 1, bank, rdwr ? "Write" : "Read", ras, cas,
-			 ue_errors, specific);
+			 "Rank=%d Bank=%d RAS=%d CAS=%d, UE Err=0x%x (%s)",
+			 rank, bank, ras, cas, ue_errors, specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				channel >> 1, -1, rank,
+				rdwr ? "Write error" : "Read error",
+				msg, NULL);
 	}
 
 	/* Check correctable errors */
@@ -685,13 +687,16 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d DRAM-Bank=%d RDWR=%s RAS=%d "
-			 "CAS=%d, CE Err=0x%x (%s))", branch >> 1, bank,
+			 "Rank=%d Bank=%d RDWR=%s RAS=%d "
+			 "CAS=%d, CE Err=0x%x (%s)", rank, bank,
 			 rdwr ? "Write" : "Read", ras, cas, ce_errors,
 			 specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, rank, channel, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				channel >> 1, channel % 2, rank,
+				rdwr ? "Write error" : "Read error",
+				msg, NULL);
 	}
 
 	if (!misc_messages)
@@ -731,11 +736,12 @@ static void i5000_process_nonfatal_error_info(struct mem_ctl_info *mci,
 
 		/* Form out message */
 		snprintf(msg, sizeof(msg),
-			 "(Branch=%d Err=%#x (%s))", branch >> 1,
-			 misc_errors, specific);
+			 "Err=%#x (%s)", misc_errors, specific);
 
 		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, 0, 0, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				branch >> 1, -1, -1,
+				"Misc error", msg, NULL);
 	}
 }
 
@@ -1251,6 +1257,10 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 	empty = 1;		/* Assume NO memory */
 
+	/*
+	 * TODO: it would be better not to use csrow here, filling the
+	 * dimm_info structs directly, based on branch, channel, dimm number
+	 */
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
@@ -1312,7 +1322,7 @@ static void i5000_enable_error_reporting(struct mem_ctl_info *mci)
 }
 
 /*
- * i5000_get_dimm_and_channel_counts(pdev, &num_csrows, &num_channels)
+ * i5000_get_dimm_and_channel_counts(pdev, &nr_csrows, &num_channels)
  *
  *	ask the device how many channels are present and how many CSROWS
  *	 as well
@@ -1343,10 +1353,10 @@ static void i5000_get_dimm_and_channel_counts(struct pci_dev *pdev,
 static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[3];
 	struct i5000_pvt *pvt;
 	int num_channels;
 	int num_dimms_per_channel;
-	int num_csrows;
 
 	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
 		__FILE__, __func__,
@@ -1372,13 +1382,21 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
-	num_csrows = num_dimms_per_channel * 2;
 
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
+	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		__func__, num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = 2;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = num_channels;
+	layers[1].is_csrow = false;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = num_dimms_per_channel;
+	layers[2].is_csrow = true;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
-- 
1.7.8
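
Note on the position arguments used in this patch: the correctable
path passes (channel >> 1, channel % 2, rank), while the uncorrectable
path passes -1 for the middle layer. A small sketch of that split
follows. It assumes two channels per branch and reads -1 as "this
layer is not known"; the struct and helper names are hypothetical.

```c
#include <assert.h>

/* Hypothetical three-layer position; -1 marks a layer that is not known. */
struct fbd_pos {
	int branch;
	int channel;	/* channel within the branch, or -1 */
	int slot;
};

/* Correctable error: the failing channel is known, so all layers are set. */
static struct fbd_pos fbd_ce_pos(int channel, int rank)
{
	struct fbd_pos pos;

	assert(channel >= 0);
	pos.branch = channel >> 1;	/* two channels per branch (assumed) */
	pos.channel = channel % 2;
	pos.slot = rank;
	return pos;
}

/* Uncorrectable error: only the branch is known, the channel is left at -1. */
static struct fbd_pos fbd_ue_pos(int channel, int rank)
{
	struct fbd_pos pos = fbd_ce_pos(channel, rank);

	pos.channel = -1;
	return pos;
}
```

For example, flat channel 3 with rank 5 would give branch 1, channel 1,
slot 5 on the correctable path.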



* [EDAC_ABI PATCH v13 10/26] i5100_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (8 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 09/26] i5000_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 11/26] i5400_edac: " Mauro Carvalho Chehab
                       ` (15 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Niklas Söderlund,
	Borislav Petkov

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i5100_edac.c |   90 ++++++++++++++++++++-------------------------
 1 files changed, 40 insertions(+), 50 deletions(-)

diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index a0219a9..ec5de18 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -14,6 +14,11 @@
  * rows for each respective channel are laid out one after another,
  * the first half belonging to channel 0, the second half belonging
  * to channel 1.
+ *
+ * This driver is for DDR2 DIMMs, and it uses chip select to select among the
+ * several ranks. However, instead of showing memories as ranks, it outputs
+ * them as DIMMs. An internal table creates the association between ranks
+ * and DIMMs.
  */
 #include <linux/module.h>
 #include <linux/init.h>
@@ -410,14 +415,6 @@ static int i5100_csrow_to_chan(const struct mem_ctl_info *mci, int csrow)
 	return csrow / priv->ranksperchan;
 }
 
-static unsigned i5100_rank_to_csrow(const struct mem_ctl_info *mci,
-				    int chan, int rank)
-{
-	const struct i5100_priv *priv = mci->pvt_info;
-
-	return chan * priv->ranksperchan + rank;
-}
-
 static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    int chan,
 			    unsigned bank,
@@ -427,21 +424,17 @@ static void i5100_handle_ce(struct mem_ctl_info *mci,
 			    unsigned ras,
 			    const char *msg)
 {
-	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
-	char *label = NULL;
+	char detail[80];
 
-	if (mci->csrows[csrow].channels[0].dimm)
-		label = mci->csrows[csrow].channels[0].dimm->label;
+	/* Form out message */
+	snprintf(detail, sizeof(detail),
+		 "bank %u, cas %u, ras %u\n",
+		 bank, cas, ras);
 
-	printk(KERN_ERR
-		"CE chan %d, bank %u, rank %u, syndrome 0x%lx, "
-		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
-		chan, bank, rank, syndrome, cas, ras,
-		csrow, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[0].ce_count++;
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     0, 0, syndrome,
+			     chan, rank, -1,
+			     msg, detail, NULL);
 }
 
 static void i5100_handle_ue(struct mem_ctl_info *mci,
@@ -453,20 +446,17 @@ static void i5100_handle_ue(struct mem_ctl_info *mci,
 			    unsigned ras,
 			    const char *msg)
 {
-	const int csrow = i5100_rank_to_csrow(mci, chan, rank);
-	char *label = NULL;
-
-	if (mci->csrows[csrow].channels[0].dimm)
-		label = mci->csrows[csrow].channels[0].dimm->label;
+	char detail[80];
 
-	printk(KERN_ERR
-		"UE chan %d, bank %u, rank %u, syndrome 0x%lx, "
-		"cas %u, ras %u, csrow %u, label \"%s\": %s\n",
-		chan, bank, rank, syndrome, cas, ras,
-		csrow, label, msg);
+	/* Form out message */
+	snprintf(detail, sizeof(detail),
+		 "bank %u, cas %u, ras %u\n",
+		 bank, cas, ras);
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
+	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			     0, 0, syndrome,
+			     chan, rank, -1,
+			     msg, detail, NULL);
 }
 
 static void i5100_read_log(struct mem_ctl_info *mci, int chan,
@@ -843,11 +833,10 @@ static void __devinit i5100_init_interleaving(struct pci_dev *pdev,
 static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 {
 	int i;
-	unsigned long total_pages = 0UL;
 	struct i5100_priv *priv = mci->pvt_info;
-	struct dimm_info *dimm;
 
-	for (i = 0; i < mci->nr_csrows; i++) {
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm;
 		const unsigned long npages = i5100_npages(mci, i);
 		const unsigned chan = i5100_csrow_to_chan(mci, i);
 		const unsigned rank = i5100_csrow_to_rank(mci, i);
@@ -855,30 +844,23 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		if (!npages)
 			continue;
 
-		/*
-		 * FIXME: these two are totally bogus -- I don't see how to
-		 * map them correctly to this structure...
-		 */
-		mci->csrows[i].csrow_idx = i;
-		mci->csrows[i].mci = mci;
-		mci->csrows[i].nr_channels = 1;
-		mci->csrows[i].channels[0].csrow = mci->csrows + i;
-		total_pages += npages;
+		dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			       chan, rank, 0);
 
-		dimm = mci->csrows[i].channels[0].dimm;
 		dimm->nr_pages = npages;
 		if (npages) {
-			total_pages += npages;
-
 			dimm->grain = 32;
 			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-				DEV_X4 : DEV_X8;
+					DEV_X4 : DEV_X8;
 			dimm->mtype = MEM_RDDR2;
 			dimm->edac_mode = EDAC_SECDED;
 			snprintf(dimm->label, sizeof(dimm->label),
 				"DIMM%u",
 				i5100_rank_to_slot(mci, chan, rank));
 		}
+
+		debugf2("dimm channel %u, rank %u, size %lu\n",
+			chan, rank, PAGES_TO_MiB(npages));
 	}
 }
 
@@ -887,6 +869,7 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 {
 	int rc;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i5100_priv *priv;
 	struct pci_dev *ch0mm, *ch1mm;
 	int ret = 0;
@@ -947,7 +930,14 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 		goto bail_ch1;
 	}
 
-	mci = edac_mc_alloc(sizeof(*priv), ranksperch * 2, 1, 0);
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = 2;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = ranksperch;
+	layers[1].is_csrow = true;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*priv));
 	if (!mci) {
 		ret = -ENOMEM;
 		goto bail_disable_ch1;
-- 
1.7.8
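
Note on the rank/csrow arithmetic in this driver: the removed helper
computed csrow = chan * ranksperchan + rank, and i5100_csrow_to_chan()
divides by ranksperchan to get the channel back. A sketch of the round
trip follows. The modulo used to recover the rank is an assumption,
since i5100_csrow_to_rank() is not shown in this patch, and the
function names are hypothetical.

```c
#include <assert.h>

/* Forward mapping, as in the removed i5100_rank_to_csrow(). */
static unsigned rank_to_csrow(unsigned ranksperchan, unsigned chan,
			      unsigned rank)
{
	assert(rank < ranksperchan);
	return chan * ranksperchan + rank;
}

/* Inverse for the channel, as in i5100_csrow_to_chan(). */
static unsigned csrow_to_chan(unsigned ranksperchan, unsigned csrow)
{
	assert(ranksperchan > 0);
	return csrow / ranksperchan;
}

/* Inverse for the rank (assumed to be the remainder). */
static unsigned csrow_to_rank(unsigned ranksperchan, unsigned csrow)
{
	assert(ranksperchan > 0);
	return csrow % ranksperchan;
}
```

With six ranks per channel, channel 1 rank 2 maps to csrow 8, and
csrow 8 maps back to channel 1, rank 2.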



* [EDAC_ABI PATCH v13 11/26] i5400_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (9 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 10/26] i5100_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 12/26] i7300_edac: " Mauro Carvalho Chehab
                       ` (14 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i5400_edac.c |  217 +++++++++++++++++++++++++--------------------
 1 files changed, 119 insertions(+), 98 deletions(-)

diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 784d6dc..029c557 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -18,6 +18,10 @@
  * Intel 5400 Chipset Memory Controller Hub (MCH) - Datasheet
  * 	http://developer.intel.com/design/chipsets/datashts/313070.htm
  *
+ * This Memory Controller manages DDR2 FB-DIMMs. It has 2 branches, each with
+ * 2 channels operating in lockstep no-mirror mode. Each channel can have up to
+ * 4 DIMMs, each with up to 8GB.
+ *
  */
 
 #include <linux/module.h>
@@ -44,12 +48,10 @@
 	edac_mc_chipset_printk(mci, level, "i5400", fmt, ##arg)
 
 /* Limits for i5400 */
-#define NUM_MTRS_PER_BRANCH	4
+#define MAX_BRANCHES		2
 #define CHANNELS_PER_BRANCH	2
-#define MAX_DIMMS_PER_CHANNEL	NUM_MTRS_PER_BRANCH
-#define	MAX_CHANNELS		4
-/* max possible csrows per channel */
-#define MAX_CSROWS		(MAX_DIMMS_PER_CHANNEL)
+#define DIMMS_PER_CHANNEL	4
+#define	MAX_CHANNELS		(MAX_BRANCHES * CHANNELS_PER_BRANCH)
 
 /* Device 16,
  * Function 0: System Address
@@ -347,16 +349,16 @@ struct i5400_pvt {
 
 	u16 mir0, mir1;
 
-	u16 b0_mtr[NUM_MTRS_PER_BRANCH];	/* Memory Technlogy Reg */
+	u16 b0_mtr[DIMMS_PER_CHANNEL];	/* Memory Technology Reg */
 	u16 b0_ambpresent0;			/* Branch 0, Channel 0 */
 	u16 b0_ambpresent1;			/* Brnach 0, Channel 1 */
 
-	u16 b1_mtr[NUM_MTRS_PER_BRANCH];	/* Memory Technlogy Reg */
+	u16 b1_mtr[DIMMS_PER_CHANNEL];	/* Memory Technology Reg */
 	u16 b1_ambpresent0;			/* Branch 1, Channel 8 */
 	u16 b1_ambpresent1;			/* Branch 1, Channel 1 */
 
 	/* DIMM information matrix, allocating architecture maximums */
-	struct i5400_dimm_info dimm_info[MAX_CSROWS][MAX_CHANNELS];
+	struct i5400_dimm_info dimm_info[DIMMS_PER_CHANNEL][MAX_CHANNELS];
 
 	/* Actual values for this controller */
 	int maxch;				/* Max channels */
@@ -532,13 +534,15 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 	int ras, cas;
 	int errnum;
 	char *type = NULL;
+	enum hw_event_mc_err_type tp_event = HW_EVENT_ERR_UNCORRECTED;
 
 	if (!allErrors)
 		return;		/* if no error, return now */
 
-	if (allErrors &  ERROR_FAT_MASK)
+	if (allErrors &  ERROR_FAT_MASK) {
 		type = "FATAL";
-	else if (allErrors & FERR_NF_UNCORRECTABLE)
+		tp_event = HW_EVENT_ERR_FATAL;
+	} else if (allErrors & FERR_NF_UNCORRECTABLE)
 		type = "NON-FATAL uncorrected";
 	else
 		type = "NON-FATAL recoverable";
@@ -556,7 +560,7 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 	ras = nrec_ras(info);
 	cas = nrec_cas(info);
 
-	debugf0("\t\tCSROW= %d  Channels= %d,%d  (Branch= %d "
+	debugf0("\t\tDIMM= %d  Channels= %d,%d  (Branch= %d "
 		"DRAM Bank= %d Buffer ID = %d rdwr= %s ras= %d cas= %d)\n",
 		rank, channel, channel + 1, branch >> 1, bank,
 		buf_id, rdwr_str(rdwr), ras, cas);
@@ -566,13 +570,13 @@ static void i5400_proccess_non_recoverable_info(struct mem_ctl_info *mci,
 
 	/* Form out message */
 	snprintf(msg, sizeof(msg),
-		 "%s (Branch=%d DRAM-Bank=%d Buffer ID = %d RDWR=%s "
-		 "RAS=%d CAS=%d %s Err=0x%lx (%s))",
-		 type, branch >> 1, bank, buf_id, rdwr_str(rdwr), ras, cas,
-		 type, allErrors, error_name[errnum]);
+		 "Bank=%d Buffer ID = %d RAS=%d CAS=%d Err=0x%lx (%s)",
+		 bank, buf_id, ras, cas, allErrors, error_name[errnum]);
 
-	/* Call the helper to output message */
-	edac_mc_handle_fbd_ue(mci, rank, channel, channel + 1, msg);
+	edac_mc_handle_error(tp_event, mci, 0, 0, 0,
+			     branch >> 1, -1, rank,
+			     rdwr ? "Write error" : "Read error",
+			     msg, NULL);
 }
 
 /*
@@ -630,7 +634,7 @@ static void i5400_process_nonfatal_error_info(struct mem_ctl_info *mci,
 		/* Only 1 bit will be on */
 		errnum = find_first_bit(&allErrors, ARRAY_SIZE(error_name));
 
-		debugf0("\t\tCSROW= %d Channel= %d  (Branch %d "
+		debugf0("\t\tDIMM= %d Channel= %d  (Branch %d "
 			"DRAM Bank= %d rdwr= %s ras= %d cas= %d)\n",
 			rank, channel, branch >> 1, bank,
 			rdwr_str(rdwr), ras, cas);
@@ -642,8 +646,10 @@ static void i5400_process_nonfatal_error_info(struct mem_ctl_info *mci,
 			 branch >> 1, bank, rdwr_str(rdwr), ras, cas,
 			 allErrors, error_name[errnum]);
 
-		/* Call the helper to output message */
-		edac_mc_handle_fbd_ce(mci, rank, channel, msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				     branch >> 1, channel % 2, rank,
+				     rdwr ? "Write error" : "Read error",
+				     msg, NULL);
 
 		return;
 	}
@@ -831,8 +837,8 @@ static int i5400_get_devices(struct mem_ctl_info *mci, int dev_idx)
 /*
  *	determine_amb_present
  *
- *		the information is contained in NUM_MTRS_PER_BRANCH different
- *		registers determining which of the NUM_MTRS_PER_BRANCH requires
+ *		the information is contained in DIMMS_PER_CHANNEL different
+ *		registers determining which of the DIMMS_PER_CHANNEL requires
  *              knowing which channel is in question
  *
  *	2 branches, each with 2 channels
@@ -861,11 +867,11 @@ static int determine_amb_present_reg(struct i5400_pvt *pvt, int channel)
 }
 
 /*
- * determine_mtr(pvt, csrow, channel)
+ * determine_mtr(pvt, dimm, channel)
  *
- * return the proper MTR register as determine by the csrow and desired channel
+ * return the proper MTR register as determine by the dimm and desired channel
  */
-static int determine_mtr(struct i5400_pvt *pvt, int csrow, int channel)
+static int determine_mtr(struct i5400_pvt *pvt, int dimm, int channel)
 {
 	int mtr;
 	int n;
@@ -873,11 +879,11 @@ static int determine_mtr(struct i5400_pvt *pvt, int csrow, int channel)
 	/* There is one MTR for each slot pair of FB-DIMMs,
 	   Each slot pair may be at branch 0 or branch 1.
 	 */
-	n = csrow;
+	n = dimm;
 
-	if (n >= NUM_MTRS_PER_BRANCH) {
-		debugf0("ERROR: trying to access an invalid csrow: %d\n",
-			csrow);
+	if (n >= DIMMS_PER_CHANNEL) {
+		debugf0("ERROR: trying to access an invalid dimm: %d\n",
+			dimm);
 		return 0;
 	}
 
@@ -913,19 +919,19 @@ static void decode_mtr(int slot_row, u16 mtr)
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 }
 
-static void handle_channel(struct i5400_pvt *pvt, int csrow, int channel,
+static void handle_channel(struct i5400_pvt *pvt, int dimm, int channel,
 			struct i5400_dimm_info *dinfo)
 {
 	int mtr;
 	int amb_present_reg;
 	int addrBits;
 
-	mtr = determine_mtr(pvt, csrow, channel);
+	mtr = determine_mtr(pvt, dimm, channel);
 	if (MTR_DIMMS_PRESENT(mtr)) {
 		amb_present_reg = determine_amb_present_reg(pvt, channel);
 
 		/* Determine if there is a DIMM present in this DIMM slot */
-		if (amb_present_reg & (1 << csrow)) {
+		if (amb_present_reg & (1 << dimm)) {
 			/* Start with the number of bits for a Bank
 			 * on the DRAM */
 			addrBits = MTR_DRAM_BANKS_ADDR_BITS(mtr);
@@ -954,7 +960,7 @@ static void handle_channel(struct i5400_pvt *pvt, int csrow, int channel,
 static void calculate_dimm_size(struct i5400_pvt *pvt)
 {
 	struct i5400_dimm_info *dinfo;
-	int csrow, max_csrows;
+	int dimm, max_dimms;
 	char *p, *mem_buffer;
 	int space, n;
 	int channel;
@@ -968,32 +974,32 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 		return;
 	}
 
-	/* Scan all the actual CSROWS
+	/* Scan all the actual DIMMS
 	 * and calculate the information for each DIMM
-	 * Start with the highest csrow first, to display it first
-	 * and work toward the 0th csrow
+	 * Start with the highest dimm first, to display it first
+	 * and work toward the 0th dimm
 	 */
-	max_csrows = pvt->maxdimmperch;
-	for (csrow = max_csrows - 1; csrow >= 0; csrow--) {
+	max_dimms = pvt->maxdimmperch;
+	for (dimm = max_dimms - 1; dimm >= 0; dimm--) {
 
-		/* on an odd csrow, first output a 'boundary' marker,
+		/* on an odd dimm, first output a 'boundary' marker,
 		 * then reset the message buffer  */
-		if (csrow & 0x1) {
+		if (dimm & 0x1) {
 			n = snprintf(p, space, "---------------------------"
-					"--------------------------------");
+					"-------------------------------");
 			p += n;
 			space -= n;
 			debugf2("%s\n", mem_buffer);
 			p = mem_buffer;
 			space = PAGE_SIZE;
 		}
-		n = snprintf(p, space, "csrow %2d    ", csrow);
+		n = snprintf(p, space, "dimm %2d    ", dimm);
 		p += n;
 		space -= n;
 
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			dinfo = &pvt->dimm_info[csrow][channel];
-			handle_channel(pvt, csrow, channel, dinfo);
+			dinfo = &pvt->dimm_info[dimm][channel];
+			handle_channel(pvt, dimm, channel, dinfo);
 			n = snprintf(p, space, "%4d MB   | ", dinfo->megabytes);
 			p += n;
 			space -= n;
@@ -1005,7 +1011,7 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 
 	/* Output the last bottom 'boundary' marker */
 	n = snprintf(p, space, "---------------------------"
-			"--------------------------------");
+			"-------------------------------");
 	p += n;
 	space -= n;
 	debugf2("%s\n", mem_buffer);
@@ -1013,7 +1019,7 @@ static void calculate_dimm_size(struct i5400_pvt *pvt)
 	space = PAGE_SIZE;
 
 	/* now output the 'channel' labels */
-	n = snprintf(p, space, "            ");
+	n = snprintf(p, space, "           ");
 	p += n;
 	space -= n;
 	for (channel = 0; channel < pvt->maxch; channel++) {
@@ -1080,7 +1086,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	debugf2("MIR1: limit= 0x%x  WAY1= %u  WAY0= %x\n", limit, way1, way0);
 
 	/* Get the set of MTR[0-3] regs by each branch */
-	for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++) {
+	for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++) {
 		int where = MTR0 + (slot_row * sizeof(u16));
 
 		/* Branch 0 set of MTR registers */
@@ -1105,7 +1111,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	/* Read and dump branch 0's MTRs */
 	debugf2("\nMemory Technology Registers:\n");
 	debugf2("   Branch 0:\n");
-	for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++)
+	for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++)
 		decode_mtr(slot_row, pvt->b0_mtr[slot_row]);
 
 	pci_read_config_word(pvt->branch_0, AMBPRESENT_0,
@@ -1122,7 +1128,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 	} else {
 		/* Read and dump  branch 1's MTRs */
 		debugf2("   Branch 1:\n");
-		for (slot_row = 0; slot_row < NUM_MTRS_PER_BRANCH; slot_row++)
+		for (slot_row = 0; slot_row < DIMMS_PER_CHANNEL; slot_row++)
 			decode_mtr(slot_row, pvt->b1_mtr[slot_row]);
 
 		pci_read_config_word(pvt->branch_1, AMBPRESENT_0,
@@ -1141,7 +1147,7 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
 }
 
 /*
- *	i5400_init_csrows	Initialize the 'csrows' table within
+ *	i5400_init_dimms	Initialize the 'dimms' table within
  *				the mci control	structure with the
  *				addressing of memory.
  *
@@ -1149,50 +1155,68 @@ static void i5400_get_mc_regs(struct mem_ctl_info *mci)
  *		0	success
  *		1	no actual memory found on this MC
  */
-static int i5400_init_csrows(struct mem_ctl_info *mci)
+static int i5400_init_dimms(struct mem_ctl_info *mci)
 {
 	struct i5400_pvt *pvt;
-	struct csrow_info *p_csrow;
-	int empty, channel_count;
-	int max_csrows;
+	struct dimm_info *dimm;
+	int ndimms, channel_count;
+	int max_dimms;
 	int mtr;
 	int size_mb;
-	int channel;
-	int csrow;
-	struct dimm_info *dimm;
+	int  channel, slot;
 
 	pvt = mci->pvt_info;
 
 	channel_count = pvt->maxch;
-	max_csrows = pvt->maxdimmperch;
+	max_dimms = pvt->maxdimmperch;
 
-	empty = 1;		/* Assume NO memory */
+	ndimms = 0;
 
-	for (csrow = 0; csrow < max_csrows; csrow++) {
-		p_csrow = &mci->csrows[csrow];
+	/*
+	 * FIXME: remove  pvt->dimm_info[slot][channel] and use the 3
+	 * layers here.
+	 */
+	for (channel = 0; channel < mci->layers[0].size * mci->layers[1].size;
+	     channel++) {
+		for (slot = 0; slot < mci->layers[2].size; slot++) {
+			mtr = determine_mtr(pvt, slot, channel);
 
-		/* use branch 0 for the basis */
-		mtr = determine_mtr(pvt, csrow, 0);
+			/* if no DIMMS on this slot, continue */
+			if (!MTR_DIMMS_PRESENT(mtr))
+				continue;
 
-		/* if no DIMMS on this row, continue */
-		if (!MTR_DIMMS_PRESENT(mtr))
-			continue;
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       channel / 2, channel % 2, slot);
 
-		for (channel = 0; channel < pvt->maxch; channel++) {
-			size_mb = pvt->dimm_info[csrow][channel].megabytes;
+			size_mb =  pvt->dimm_info[slot][channel].megabytes;
+
+			debugf2("%s: dimm%zd (branch %d channel %d slot %d): %d.%03d GB\n",
+				__func__, dimm - mci->dimms,
+				channel / 2, channel % 2, slot,
+				size_mb / 1000, size_mb % 1000);
 
-			dimm = p_csrow->channels[channel].dimm;
 			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
-			dimm->mtype = MEM_RDDR2;
-			dimm->edac_mode = EDAC_SECDED;
+			dimm->mtype = MEM_FB_DDR2;
+			/*
+			 * The ECC mechanism is SDDC (aka SECC), which
+			 * is similar to Chipkill.
+			 */
+			dimm->edac_mode = MTR_DRAM_WIDTH(mtr) ?
+					  EDAC_S8ECD8ED : EDAC_S4ECD4ED;
+			ndimms++;
 		}
-
-		empty = 0;
 	}
 
-	return empty;
+	/*
+	 * When just one DIMM is provided, it should be at location (0,0,0).
+	 * With such single-DIMM mode, the SDDC algorithm degrades to SECDED+.
+	 */
+	if (ndimms == 1)
+		mci->dimms[0].edac_mode = EDAC_SECDED;
+
+	return (ndimms == 0);
 }
 
 /*
@@ -1228,9 +1252,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
 	struct i5400_pvt *pvt;
-	int num_channels;
-	int num_dimms_per_channel;
-	int num_csrows;
+	struct edac_mc_layer layers[3];
 
 	if (dev_idx >= ARRAY_SIZE(i5400_devs))
 		return -EINVAL;
@@ -1244,22 +1266,21 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (PCI_FUNC(pdev->devfn) != 0)
 		return -ENODEV;
 
-	/* As we don't have a motherboard identification routine to determine
-	 * actual number of slots/dimms per channel, we thus utilize the
-	 * resource as specified by the chipset. Thus, we might have
-	 * have more DIMMs per channel than actually on the mobo, but this
-	 * allows the driver to support up to the chipset max, without
-	 * some fancy mobo determination.
+	/*
+	 * allocate a new MC control structure
+	 *
+	 * This driver uses the DIMM slot as "csrow" and the rest as "channel".
 	 */
-	num_dimms_per_channel = MAX_DIMMS_PER_CHANNEL;
-	num_channels = MAX_CHANNELS;
-	num_csrows = num_dimms_per_channel;
-
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
-
-	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = MAX_BRANCHES;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = CHANNELS_PER_BRANCH;
+	layers[1].is_csrow = false;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = DIMMS_PER_CHANNEL;
+	layers[2].is_csrow = true;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
@@ -1270,8 +1291,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	pvt = mci->pvt_info;
 	pvt->system_address = pdev;	/* Record this device in our private */
-	pvt->maxch = num_channels;
-	pvt->maxdimmperch = num_dimms_per_channel;
+	pvt->maxch = MAX_CHANNELS;
+	pvt->maxdimmperch = DIMMS_PER_CHANNEL;
 
 	/* 'get' the pci devices we want to reserve for our use */
 	if (i5400_get_devices(mci, dev_idx))
@@ -1293,13 +1314,13 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	/* Set the function pointer to an actual operation function */
 	mci->edac_check = i5400_check_error;
 
-	/* initialize the MC control structure 'csrows' table
+	/* initialize the MC control structure 'dimms' table
 	 * with the mapping and control information */
-	if (i5400_init_csrows(mci)) {
+	if (i5400_init_dimms(mci)) {
 		debugf0("MC: Setting mci->edac_cap to EDAC_FLAG_NONE\n"
-			"    because i5400_init_csrows() returned nonzero "
+			"    because i5400_init_dimms() returned nonzero "
 			"value\n");
-		mci->edac_cap = EDAC_FLAG_NONE;	/* no csrows found */
+		mci->edac_cap = EDAC_FLAG_NONE;	/* no dimms found */
 	} else {
 		debugf1("MC: Enable error reporting now\n");
 		i5400_enable_error_reporting(mci);
-- 
1.7.8
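
Note on two computations in i5400_init_dimms(): nr_pages is derived as
size_mb << 8, and the EDAC mode depends on the DRAM width, falling back
to SECDED when only one DIMM is populated. A sketch of both follows. It
assumes 4 KiB pages (1 MiB = 256 pages); the enum and function names
are hypothetical, and the sketch picks one mode per call rather than
patching the first dimm entry afterwards as the driver does.

```c
#include <assert.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* Hypothetical mirror of the three EDAC modes used by the patch. */
enum sketch_edac_mode {
	SKETCH_SECDED,
	SKETCH_S4ECD4ED,
	SKETCH_S8ECD8ED,
};

/* Convert a size in MiB to a page count; with 4 KiB pages this is << 8. */
static unsigned long mib_to_pages(unsigned long size_mb)
{
	return size_mb << (20 - SKETCH_PAGE_SHIFT);
}

/*
 * Choose the mode from the DRAM width, degrading to SECDED when a
 * single DIMM is populated.
 */
static enum sketch_edac_mode pick_mode(int width_is_x8, int ndimms)
{
	assert(ndimms > 0);

	if (ndimms == 1)
		return SKETCH_SECDED;
	return width_is_x8 ? SKETCH_S8ECD8ED : SKETCH_S4ECD4ED;
}
```

Under the 4 KiB assumption, a 2048 MiB DIMM corresponds to 524288
pages.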



* [EDAC_ABI PATCH v13 12/26] i7300_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (10 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 11/26] i5400_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 13/26] i7core_edac: " Mauro Carvalho Chehab
                       ` (13 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i7300_edac.c |   81 ++++++++++++++++++--------------------------
 1 files changed, 33 insertions(+), 48 deletions(-)

diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 5e594ae..6e762f5 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -464,17 +464,14 @@ static void i7300_process_fbd_error(struct mem_ctl_info *mci)
 				FERR_FAT_FBD, error_reg);
 
 		snprintf(pvt->tmp_prt_buffer, PAGE_SIZE,
-			"FATAL (Branch=%d DRAM-Bank=%d %s "
-			"RAS=%d CAS=%d Err=0x%lx (%s))",
-			branch, bank,
-			is_wr ? "RDWR" : "RD",
-			ras, cas,
-			errors, specific);
-
-		/* Call the helper to output message */
-		edac_mc_handle_fbd_ue(mci, rank, branch << 1,
-				      (branch << 1) + 1,
-				      pvt->tmp_prt_buffer);
+			 "Bank=%d RAS=%d CAS=%d Err=0x%lx (%s)",
+			 bank, ras, cas, errors, specific);
+
+		edac_mc_handle_error(HW_EVENT_ERR_FATAL, mci, 0, 0, 0,
+				     branch, -1, rank,
+				     is_wr ? "Write error" : "Read error",
+				     pvt->tmp_prt_buffer, NULL);
+
 	}
 
 	/* read in the 1st NON-FATAL error register */
@@ -513,23 +510,14 @@ static void i7300_process_fbd_error(struct mem_ctl_info *mci)
 
 		/* Form out message */
 		snprintf(pvt->tmp_prt_buffer, PAGE_SIZE,
-			"Corrected error (Branch=%d, Channel %d), "
-			" DRAM-Bank=%d %s "
-			"RAS=%d CAS=%d, CE Err=0x%lx, Syndrome=0x%08x(%s))",
-			branch, channel,
-			bank,
-			is_wr ? "RDWR" : "RD",
-			ras, cas,
-			errors, syndrome, specific);
-
-		/*
-		 * Call the helper to output message
-		 * NOTE: Errors are reported per-branch, and not per-channel
-		 *	 Currently, we don't know how to identify the right
-		 *	 channel.
-		 */
-		edac_mc_handle_fbd_ce(mci, rank, channel,
-				      pvt->tmp_prt_buffer);
+			 "DRAM-Bank=%d RAS=%d CAS=%d, Err=0x%lx (%s)",
+			 bank, ras, cas, errors, specific);
+
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0,
+				     syndrome,
+				     branch >> 1, channel % 2, rank,
+				     is_wr ? "Write error" : "Read error",
+				     pvt->tmp_prt_buffer, NULL);
 	}
 	return;
 }
@@ -807,13 +795,17 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 			for (ch = 0; ch < MAX_CH_PER_BRANCH; ch++) {
 				int channel = to_channel(ch, branch);
 
-				dinfo = &pvt->dimm_info[slot][channel];
+				dimm = GET_POS(mci->layers, mci->dimms,
+					       mci->n_layers, branch, ch, slot);
 
-				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dinfo = &pvt->dimm_info[slot][channel];
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
 						 dinfo, dimm);
 
+				mci->tot_dimms++;
+				dimm++;
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
@@ -1034,10 +1026,8 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 				    const struct pci_device_id *id)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[3];
 	struct i7300_pvt *pvt;
-	int num_channels;
-	int num_dimms_per_channel;
-	int num_csrows;
 	int rc;
 
 	/* wake up device */
@@ -1054,22 +1044,17 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (PCI_FUNC(pdev->devfn) != 0)
 		return -ENODEV;
 
-	/* As we don't have a motherboard identification routine to determine
-	 * actual number of slots/dimms per channel, we thus utilize the
-	 * resource as specified by the chipset. Thus, we might have
-	 * have more DIMMs per channel than actually on the mobo, but this
-	 * allows the driver to support up to the chipset max, without
-	 * some fancy mobo determination.
-	 */
-	num_dimms_per_channel = MAX_SLOTS;
-	num_channels = MAX_CHANNELS;
-	num_csrows = MAX_SLOTS * MAX_CHANNELS;
-
-	debugf0("MC: %s(): Number of - Channels= %d  DIMMS= %d  CSROWS= %d\n",
-		__func__, num_channels, num_dimms_per_channel, num_csrows);
-
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), num_csrows, num_channels, 0);
+	layers[0].type = EDAC_MC_LAYER_BRANCH;
+	layers[0].size = MAX_BRANCHES;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = MAX_CH_PER_BRANCH;
+	layers[1].is_csrow = true;
+	layers[2].type = EDAC_MC_LAYER_SLOT;
+	layers[2].size = MAX_SLOTS;
+	layers[2].is_csrow = true;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
-- 
1.7.8



* [EDAC_ABI PATCH v13 13/26] i7core_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (11 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 12/26] i7300_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 14/26] i82443bxgx_edac: " Mauro Carvalho Chehab
                       ` (12 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i7core_edac.c |  202 +++++++++++---------------------------------
 1 files changed, 50 insertions(+), 152 deletions(-)
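
Note (not part of the commit message): the conversion below replaces the
driver-private csrow_map[][] table with a position lookup driven by the
layer sizes. As a rough illustration only, assuming the DIMM array is
stored row-major with layer 0 outermost, the lookup can be sketched as
follows. The names struct layer_desc and layered_index() are hypothetical
and are not the EDAC core's types or its GET_POS() macro, which may
compute the position differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one entry of the layers[] array. */
struct layer_desc {
	size_t size;	/* number of elements in this layer */
};

/*
 * Flatten a per-layer position into a single array index.
 * Assumption: row-major order, with pos[0] indexing the outermost layer.
 */
static size_t layered_index(const struct layer_desc *layers, size_t n_layers,
			    const size_t *pos)
{
	size_t idx = 0;
	size_t i;

	for (i = 0; i < n_layers; i++) {
		/* Each coordinate must stay within its layer. */
		assert(pos[i] < layers[i].size);
		idx = idx * layers[i].size + pos[i];
	}
	return idx;
}
```

With two layers of size 3 each (illustrative values), position
(channel 1, slot 2) maps to index 1 * 3 + 2 = 5.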

diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index d566797..8bcee03 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -257,7 +257,6 @@ struct i7core_pvt {
 	struct i7core_channel	channel[NUM_CHANS];
 
 	int		ce_count_available;
-	int 		csrow_map[NUM_CHANS][MAX_DIMMS];
 
 			/* ECC corrected errors counts per udimm */
 	unsigned long	udimm_ce_count[MAX_DIMMS];
@@ -492,113 +491,12 @@ static void free_i7core_dev(struct i7core_dev *i7core_dev)
 /****************************************************************************
 			Memory check routines
  ****************************************************************************/
-static struct pci_dev *get_pdev_slot_func(u8 socket, unsigned slot,
-					  unsigned func)
-{
-	struct i7core_dev *i7core_dev = get_i7core_dev(socket);
-	int i;
-
-	if (!i7core_dev)
-		return NULL;
-
-	for (i = 0; i < i7core_dev->n_devs; i++) {
-		if (!i7core_dev->pdev[i])
-			continue;
-
-		if (PCI_SLOT(i7core_dev->pdev[i]->devfn) == slot &&
-		    PCI_FUNC(i7core_dev->pdev[i]->devfn) == func) {
-			return i7core_dev->pdev[i];
-		}
-	}
-
-	return NULL;
-}
-
-/**
- * i7core_get_active_channels() - gets the number of channels and csrows
- * @socket:	Quick Path Interconnect socket
- * @channels:	Number of channels that will be returned
- * @csrows:	Number of csrows found
- *
- * Since EDAC core needs to know in advance the number of available channels
- * and csrows, in order to allocate memory for csrows/channels, it is needed
- * to run two similar steps. At the first step, implemented on this function,
- * it checks the number of csrows/channels present at one socket.
- * this is used in order to properly allocate the size of mci components.
- *
- * It should be noticed that none of the current available datasheets explain
- * or even mention how csrows are seen by the memory controller. So, we need
- * to add a fake description for csrows.
- * So, this driver is attributing one DIMM memory for one csrow.
- */
-static int i7core_get_active_channels(const u8 socket, unsigned *channels,
-				      unsigned *csrows)
-{
-	struct pci_dev *pdev = NULL;
-	int i, j;
-	u32 status, control;
-
-	*channels = 0;
-	*csrows = 0;
-
-	pdev = get_pdev_slot_func(socket, 3, 0);
-	if (!pdev) {
-		i7core_printk(KERN_ERR, "Couldn't find socket %d fn 3.0!!!\n",
-			      socket);
-		return -ENODEV;
-	}
-
-	/* Device 3 function 0 reads */
-	pci_read_config_dword(pdev, MC_STATUS, &status);
-	pci_read_config_dword(pdev, MC_CONTROL, &control);
-
-	for (i = 0; i < NUM_CHANS; i++) {
-		u32 dimm_dod[3];
-		/* Check if the channel is active */
-		if (!(control & (1 << (8 + i))))
-			continue;
-
-		/* Check if the channel is disabled */
-		if (status & (1 << i))
-			continue;
-
-		pdev = get_pdev_slot_func(socket, i + 4, 1);
-		if (!pdev) {
-			i7core_printk(KERN_ERR, "Couldn't find socket %d "
-						"fn %d.%d!!!\n",
-						socket, i + 4, 1);
-			return -ENODEV;
-		}
-		/* Devices 4-6 function 1 */
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM0, &dimm_dod[0]);
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM1, &dimm_dod[1]);
-		pci_read_config_dword(pdev,
-				MC_DOD_CH_DIMM2, &dimm_dod[2]);
-
-		(*channels)++;
-
-		for (j = 0; j < 3; j++) {
-			if (!DIMM_PRESENT(dimm_dod[j]))
-				continue;
-			(*csrows)++;
-		}
-	}
-
-	debugf0("Number of active channels on socket %d: %d\n",
-		socket, *channels);
-
-	return 0;
-}
 
 static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
-	struct csrow_info *csr;
 	struct pci_dev *pdev;
 	int i, j;
-	int csrow = 0;
 	enum edac_type mode;
 	enum mem_type mtype;
 	struct dimm_info *dimm;
@@ -696,6 +594,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			if (!DIMM_PRESENT(dimm_dod[j]))
 				continue;
 
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       i, j, 0);
 			banks = numbank(MC_DOD_NUMBANK(dimm_dod[j]));
 			ranks = numrank(MC_DOD_NUMRANK(dimm_dod[j]));
 			rows = numrow(MC_DOD_NUMROW(dimm_dod[j]));
@@ -704,8 +604,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			/* DDR3 has 8 I/O banks */
 			size = (rows * cols * banks * ranks) >> (20 - 3);
 
-			pvt->channel[i].dimms++;
-
 			debugf0("\tdimm %d %d Mb offset: %x, "
 				"bank: %d, rank: %d, row: %#x, col: %#x\n",
 				j, size,
@@ -714,11 +612,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 
 			npages = MiB_TO_PAGES(size);
 
-			csr = &mci->csrows[csrow];
-
-			pvt->csrow_map[i][j] = csrow;
-
-			dimm = csr->channels[0].dimm;
 			dimm->nr_pages = npages;
 
 			switch (banks) {
@@ -741,7 +634,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
-			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
@@ -1557,22 +1449,16 @@ error:
 /****************************************************************************
 			Error check routines
  ****************************************************************************/
-static void i7core_rdimm_update_csrow(struct mem_ctl_info *mci,
+static void i7core_rdimm_update_errcount(struct mem_ctl_info *mci,
 				      const int chan,
 				      const int dimm,
 				      const int add)
 {
-	char *msg;
-	struct i7core_pvt *pvt = mci->pvt_info;
-	int row = pvt->csrow_map[chan][dimm], i;
+	int i;
 
 	for (i = 0; i < add; i++) {
-		msg = kasprintf(GFP_KERNEL, "Corrected error "
-				"(Socket=%d channel=%d dimm=%d)",
-				pvt->i7core_dev->socket, chan, dimm);
-
-		edac_mc_handle_fbd_ce(mci, row, 0, msg);
-		kfree (msg);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, 0, 0, 0,
+				     chan, dimm, -1, "error", "", NULL);
 	}
 }
 
@@ -1613,11 +1499,11 @@ static void i7core_rdimm_update_ce_count(struct mem_ctl_info *mci,
 
 	/*updated the edac core */
 	if (add0 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 0, add0);
+		i7core_rdimm_update_errcount(mci, chan, 0, add0);
 	if (add1 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 1, add1);
+		i7core_rdimm_update_errcount(mci, chan, 1, add1);
 	if (add2 != 0)
-		i7core_rdimm_update_csrow(mci, chan, 2, add2);
+		i7core_rdimm_update_errcount(mci, chan, 2, add2);
 
 }
 
@@ -1738,19 +1624,29 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 {
 	struct i7core_pvt *pvt = mci->pvt_info;
 	char *type, *optype, *err, *msg;
+	enum hw_event_mc_err_type tp_event;
 	unsigned long error = m->status & 0x1ff0000l;
+	bool uncorrected_error = m->mcgstatus & 1ll << 61;
+	bool ripv = m->mcgstatus & 1;
 	u32 optypenum = (m->status >> 4) & 0x07;
 	u32 core_err_cnt = (m->status >> 38) & 0x7fff;
 	u32 dimm = (m->misc >> 16) & 0x3;
 	u32 channel = (m->misc >> 18) & 0x3;
 	u32 syndrome = m->misc >> 32;
 	u32 errnum = find_first_bit(&error, 32);
-	int csrow;
 
-	if (m->mcgstatus & 1)
-		type = "FATAL";
-	else
-		type = "NON_FATAL";
+	if (uncorrected_error) {
+		if (ripv) {
+			type = "FATAL";
+			tp_event = HW_EVENT_ERR_FATAL;
+		} else {
+			type = "NON_FATAL";
+			tp_event = HW_EVENT_ERR_UNCORRECTED;
+		}
+	} else {
+		type = "CORRECTED";
+		tp_event = HW_EVENT_ERR_CORRECTED;
+	}
 
 	switch (optypenum) {
 	case 0:
@@ -1805,25 +1701,23 @@ static void i7core_mce_output_error(struct mem_ctl_info *mci,
 		err = "unknown";
 	}
 
-	/* FIXME: should convert addr into bank and rank information */
 	msg = kasprintf(GFP_ATOMIC,
-		"%s (addr = 0x%08llx, cpu=%d, Dimm=%d, Channel=%d, "
-		"syndrome=0x%08x, count=%d, Err=%08llx:%08llx (%s: %s))\n",
-		type, (long long) m->addr, m->cpu, dimm, channel,
-		syndrome, core_err_cnt, (long long)m->status,
-		(long long)m->misc, optype, err);
+		"addr=0x%08llx cpu=%d count=%d Err=%08llx:%08llx (%s: %s)\n",
+		(long long) m->addr, m->cpu, core_err_cnt,
+		(long long)m->status, (long long)m->misc, optype, err);
 
-	debugf0("%s", msg);
-
-	csrow = pvt->csrow_map[channel][dimm];
-
-	/* Call the helper to output message */
-	if (m->mcgstatus & 1)
-		edac_mc_handle_fbd_ue(mci, csrow, 0,
-				0 /* FIXME: should be channel here */, msg);
-	else if (!pvt->is_registered)
-		edac_mc_handle_fbd_ce(mci, csrow,
-				0 /* FIXME: should be channel here */, msg);
+	/*
+	 * Call the helper to output message
+	 * FIXME: what to do if core_err_cnt > 1? Currently, it generates
+	 * only one event
+	 */
+	if (uncorrected_error || !pvt->is_registered)
+		edac_mc_handle_error(tp_event, mci,
+				     m->addr >> PAGE_SHIFT,
+				     m->addr & ~PAGE_MASK,
+				     syndrome,
+				     channel, dimm, -1,
+				     err, msg, m);
 
 	kfree(msg);
 }
@@ -2242,15 +2136,19 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 {
 	struct mem_ctl_info *mci;
 	struct i7core_pvt *pvt;
-	int rc, channels, csrows;
-
-	/* Check the number of active and not disabled channels */
-	rc = i7core_get_active_channels(i7core_dev->socket, &channels, &csrows);
-	if (unlikely(rc < 0))
-		return rc;
+	int rc;
+	struct edac_mc_layer layers[2];
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), csrows, channels, i7core_dev->socket);
+
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = NUM_CHANS;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = MAX_DIMMS;
+	layers[1].is_csrow = true;
+	mci = new_edac_mc_alloc(i7core_dev->socket, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 14/26] i82443bxgx_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (12 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 13/26] i7core_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 15/26] i82860_edac: " Mauro Carvalho Chehab
                       ` (11 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Tim Small

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i82443bxgx_edac.c |   26 ++++++++++++++++----------
 1 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 74166ae..00aa186 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -156,19 +156,19 @@ static int i82443bxgx_edacmc_process_error_info(struct mem_ctl_info *mci,
 	if (info->eap & I82443BXGX_EAP_OFFSET_SBE) {
 		error_found = 1;
 		if (handle_errors)
-			edac_mc_handle_ce(mci, page, pageoffset,
-				/* 440BX/GX don't make syndrome information
-				 * available */
-				0, edac_mc_find_csrow_by_page(mci, page), 0,
-				mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, pageoffset, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1, mci->ctl_name, "", NULL);
 	}
 
 	if (info->eap & I82443BXGX_EAP_OFFSET_MBE) {
 		error_found = 1;
 		if (handle_errors)
-			edac_mc_handle_ue(mci, page, pageoffset,
-					edac_mc_find_csrow_by_page(mci, page),
-					mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, pageoffset, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1, mci->ctl_name, "", NULL);
 	}
 
 	return error_found;
@@ -235,6 +235,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u8 dramc;
 	u32 nbxcfg, ecc_mode;
 	enum mem_type mtype;
@@ -248,8 +249,13 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	if (pci_read_config_dword(pdev, I82443BXGX_NBXCFG, &nbxcfg))
 		return -EIO;
 
-	mci = edac_mc_alloc(0, I82443BXGX_NR_CSROWS, I82443BXGX_NR_CHANS, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82443BXGX_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = I82443BXGX_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 15/26] i82860_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (13 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 14/26] i82443bxgx_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 16/26] i82875p_edac: " Mauro Carvalho Chehab
                       ` (10 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Michal Marek

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Michal Marek <mmarek@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i82860_edac.c |   42 +++++++++++++++++++++++++++++-------------
 1 files changed, 29 insertions(+), 13 deletions(-)

diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 48e0ecd..26d91ba 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -99,6 +99,7 @@ static int i82860_process_error_info(struct mem_ctl_info *mci,
 				struct i82860_error_info *info,
 				int handle_errors)
 {
+	struct dimm_info *dimm;
 	int row;
 
 	if (!(info->errsts2 & 0x0003))
@@ -108,18 +109,25 @@ static int i82860_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0003) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	info->eap >>= PAGE_SHIFT;
 	row = edac_mc_find_csrow_by_page(mci, info->eap);
+	dimm = mci->csrows[row].channels[0].dimm;
 
 	if (info->errsts & 0x0002)
-		edac_mc_handle_ue(mci, info->eap, 0, row, "i82860 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     info->eap, 0, 0,
+				     dimm->location[0], dimm->location[1], -1,
+				     "i82860 UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, info->eap, 0, info->derrsyn, row, 0,
-				"i82860 UE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     info->eap, 0, info->derrsyn,
+				     dimm->location[0], dimm->location[1], -1,
+				     "i82860 CE", "", NULL);
 
 	return 1;
 }
@@ -179,18 +187,26 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82860_error_info discard;
 
-	/* RDRAM has channels but these don't map onto the abstractions that
-	   edac uses.
-	   The device groups from the GRA registers seem to map reasonably
-	   well onto the notion of a chip select row.
-	   There are 16 GRA registers and since the name is associated with
-	   the channel and the GRA registers map to physical devices so we are
-	   going to make 1 channel for group.
+	/*
+	 * RDRAM has channels but these don't map onto the csrow abstraction.
+	 * According to the datasheet, there are 2 Rambus channels, supporting
+	 * up to 16 direct RDRAM devices.
+	 * The device groups from the GRA registers seem to map reasonably
+	 * well onto the notion of a chip select row.
+	 * There are 16 GRA registers. Since the name is associated with the
+	 * channel and the GRA registers map to physical devices, we are
+	 * going to make 1 channel per group.
 	 */
-	mci = edac_mc_alloc(0, 16, 1, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = 2;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = 8;
+	layers[1].is_csrow = true;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 16/26] i82875p_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (14 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 15/26] i82860_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 17/26] i82975x_edac: " Mauro Carvalho Chehab
                       ` (9 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i82875p_edac.c |   29 +++++++++++++++++++++--------
 1 files changed, 21 insertions(+), 8 deletions(-)
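
Note (not part of the commit message): the driver sizes its chip-select
layer from the channel count, following the existing comment "four csrows
in dual channel, eight in single channel". A minimal sketch of that
arithmetic, using the hypothetical names EX_NR_DIMMS and ex_nr_csrows()
rather than the driver's own macros:

```c
#include <assert.h>

#define EX_NR_DIMMS	8	/* same total as the driver's DIMM count */

/*
 * Chip-select rows per channel layout.
 * Assumption: only single (1) or dual (2) channel mode exists.
 */
static int ex_nr_csrows(int nr_chans)
{
	assert(nr_chans == 1 || nr_chans == 2);
	return EX_NR_DIMMS / nr_chans;
}
```

The product of the two layer sizes stays at 8 in either mode, so the same
number of DIMM entries is allocated regardless of the channel setup.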

diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index dc207dc..4259c4a 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -38,7 +38,8 @@
 #endif				/* PCI_DEVICE_ID_INTEL_82875_6 */
 
 /* four csrows in dual channel, eight in single channel */
-#define I82875P_NR_CSROWS(nr_chans) (8/(nr_chans))
+#define I82875P_NR_DIMMS		8
+#define I82875P_NR_CSROWS(nr_chans)	(I82875P_NR_DIMMS / (nr_chans))
 
 /* Intel 82875p register addresses - device 0 function 0 - DRAM Controller */
 #define I82875P_EAP		0x58	/* Error Address Pointer (32b)
@@ -235,7 +236,9 @@ static int i82875p_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0081) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -243,11 +246,15 @@ static int i82875p_process_error_info(struct mem_ctl_info *mci,
 	row = edac_mc_find_csrow_by_page(mci, info->eap);
 
 	if (info->errsts & 0x0080)
-		edac_mc_handle_ue(mci, info->eap, 0, row, "i82875p UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     info->eap, 0, 0,
+				     row, -1, -1,
+				     "i82875p UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, info->eap, 0, info->derrsyn, row,
-				multi_chan ? (info->des & 0x1) : 0,
-				"i82875p CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     info->eap, 0, info->derrsyn,
+				     row, multi_chan ? (info->des & 0x1) : 0,
+				     -1, "i82875p CE", "", NULL);
 
 	return 1;
 }
@@ -390,6 +397,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc = -ENODEV;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82875p_pvt *pvt;
 	struct pci_dev *ovrfl_pdev;
 	void __iomem *ovrfl_window;
@@ -405,9 +413,14 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	drc = readl(ovrfl_window + I82875P_DRC);
 	nr_chans = dual_channel_active(drc) + 1;
-	mci = edac_mc_alloc(sizeof(*pvt), I82875P_NR_CSROWS(nr_chans),
-			nr_chans, 0);
 
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82875P_NR_CSROWS(nr_chans);
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail0;
-- 
1.7.8



* [EDAC_ABI PATCH v13 17/26] i82975x_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (15 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 16/26] i82875p_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 18/26] mpc85xx_edac: " Mauro Carvalho Chehab
                       ` (8 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Ranganathan Desikan, Arvind R.

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/i82975x_edac.c |   27 ++++++++++++++++++++-------
 1 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 304af1d..bfd632c 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -29,7 +29,8 @@
 #define PCI_DEVICE_ID_INTEL_82975_0	0x277c
 #endif				/* PCI_DEVICE_ID_INTEL_82975_0 */
 
-#define I82975X_NR_CSROWS(nr_chans)		(8/(nr_chans))
+#define I82975X_NR_DIMMS		8
+#define I82975X_NR_CSROWS(nr_chans)	(I82975X_NR_DIMMS / (nr_chans))
 
 /* Intel 82975X register addresses - device 0 function 0 - DRAM Controller */
 #define I82975X_EAP		0x58	/* Dram Error Address Pointer (32b)
@@ -287,7 +288,8 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 		return 1;
 
 	if ((info->errsts ^ info->errsts2) & 0x0003) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1, "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
@@ -312,10 +314,15 @@ static int i82975x_process_error_info(struct mem_ctl_info *mci,
 			   (1 << mci->csrows[row].channels[chan].dimm->grain));
 
 	if (info->errsts & 0x0002)
-		edac_mc_handle_ue(mci, page, offst , row, "i82975x UE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offst, 0,
+				     row, -1, -1,
+				     "i82975x UE", "", NULL);
 	else
-		edac_mc_handle_ce(mci, page, offst, info->derrsyn, row,
-				chan, "i82975x CE");
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offst, info->derrsyn,
+				     row, chan ? chan : 0, -1,
+				     "i82975x CE", "", NULL);
 
 	return 1;
 }
@@ -473,6 +480,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	int rc = -ENODEV;
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct i82975x_pvt *pvt;
 	void __iomem *mch_window;
 	u32 mchbar;
@@ -541,8 +549,13 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	chans = dual_channel_active(mch_window) + 1;
 
 	/* assuming only one controller, index thus is 0 */
-	mci = edac_mc_alloc(sizeof(*pvt), I82975X_NR_CSROWS(chans),
-					chans, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = I82975X_NR_DIMMS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = I82975X_NR_CSROWS(chans);
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail1;
-- 
1.7.8



* [EDAC_ABI PATCH v13 18/26] mpc85xx_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (16 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 17/26] i82975x_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 19/26] mv64x60_edac: " Mauro Carvalho Chehab
                       ` (7 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Andrew Morton, Shaohui Xie,
	Jiri Kosina

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/mpc85xx_edac.c |   22 +++++++++++++++++-----
 1 files changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index c1d9e15..87a9baa 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -854,12 +854,16 @@ static void mpc85xx_mc_check(struct mem_ctl_info *mci)
 		mpc85xx_mc_printk(mci, KERN_ERR, "PFN out of range!\n");
 
 	if (err_detect & DDR_EDE_SBE)
-		edac_mc_handle_ce(mci, pfn, err_addr & ~PAGE_MASK,
-				  syndrome, row_index, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     pfn, err_addr & ~PAGE_MASK, syndrome,
+				     row_index, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	if (err_detect & DDR_EDE_MBE)
-		edac_mc_handle_ue(mci, pfn, err_addr & ~PAGE_MASK,
-				  row_index, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     pfn, err_addr & ~PAGE_MASK, syndrome,
+				     row_index, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, err_detect);
 }
@@ -961,6 +965,7 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct mpc85xx_mc_pdata *pdata;
 	struct resource r;
 	u32 sdram_ctl;
@@ -969,7 +974,14 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	if (!devres_open_group(&op->dev, mpc85xx_mc_err_probe, GFP_KERNEL))
 		return -ENOMEM;
 
-	mci = edac_mc_alloc(sizeof(*pdata), 4, 1, edac_mc_idx);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 4;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(*pdata));
 	if (!mci) {
 		devres_release_group(&op->dev, mpc85xx_mc_err_probe);
 		return -ENOMEM;
-- 
1.7.8



* [EDAC_ABI PATCH v13 19/26] mv64x60_edac: convert driver to use the new edac ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (17 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 18/26] mpc85xx_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  2012-04-16 20:21       ` Mauro Carvalho Chehab
                       ` (6 subsequent siblings)
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Borislav Petkov

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/mv64x60_edac.c |   25 +++++++++++++++++++------
 1 files changed, 19 insertions(+), 6 deletions(-)
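
Note (not part of the commit message): the error handler calls in this
series take a page frame number followed by an offset inside that page.
The sketch below shows the split for a physical address, assuming 4 KiB
pages. The EX_* macros and ex_* helpers are hypothetical stand-ins that
mirror the kernel convention where the page mask selects the page-aligned
part of an address, so the offset is obtained with the inverted mask.

```c
#include <stdint.h>

#define EX_PAGE_SHIFT	12				/* assumed: 4 KiB pages */
#define EX_PAGE_SIZE	((uint64_t)1 << EX_PAGE_SHIFT)
#define EX_PAGE_MASK	(~(EX_PAGE_SIZE - 1))		/* keeps the page-aligned bits */

/* Page frame number: the address with the in-page bits shifted out. */
static uint64_t ex_page_frame(uint64_t addr)
{
	return addr >> EX_PAGE_SHIFT;
}

/* Offset within the page: only the low EX_PAGE_SHIFT bits. */
static uint64_t ex_page_offset(uint64_t addr)
{
	return addr & ~EX_PAGE_MASK;
}
```

For address 0x12345678 this gives frame 0x12345 and offset 0x678, whereas
masking with EX_PAGE_MASK itself would yield 0x12345000.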

diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 281e245..f294da7 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -611,12 +611,17 @@ static void mv64x60_mc_check(struct mem_ctl_info *mci)
 
 	/* first bit clear in ECC Err Reg, 1 bit error, correctable by HW */
 	if (!(reg & 0x1))
-		edac_mc_handle_ce(mci, err_addr >> PAGE_SHIFT,
-				  err_addr & PAGE_MASK, syndrome, 0, 0,
-				  mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & PAGE_MASK, syndrome,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 	else	/* 2 bit error, UE */
-		edac_mc_handle_ue(mci, err_addr >> PAGE_SHIFT,
-				  err_addr & PAGE_MASK, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     err_addr >> PAGE_SHIFT,
+				     err_addr & PAGE_MASK, 0,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 
 	/* clear the error */
 	out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ADDR, 0);
@@ -695,6 +700,7 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct mv64x60_mc_pdata *pdata;
 	struct resource *r;
 	u32 ctl;
@@ -703,7 +709,14 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	if (!devres_open_group(&pdev->dev, mv64x60_mc_err_probe, GFP_KERNEL))
 		return -ENOMEM;
 
-	mci = edac_mc_alloc(sizeof(struct mv64x60_mc_pdata), 1, 1, edac_mc_idx);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = 1;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = 1;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct mv64x60_mc_pdata));
 	if (!mci) {
 		printk(KERN_ERR "%s: No memory for CPU err\n", __func__);
 		devres_release_group(&pdev->dev, mv64x60_mc_err_probe);
-- 
1.7.8



* [EDAC_ABI PATCH v13 20/26] pasemi_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Olof Johansson, Egor Martovetsky,
	linuxppc-dev

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/pasemi_edac.c |   25 ++++++++++++++++---------
 1 files changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 3fcefda..addf893 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -110,15 +110,16 @@ static void pasemi_edac_process_error_info(struct mem_ctl_info *mci, u32 errsta)
 	/* uncorrectable/multi-bit errors */
 	if (errsta & (MCDEBUG_ERRSTA_MBE_STATUS |
 		      MCDEBUG_ERRSTA_RFL_STATUS)) {
-		edac_mc_handle_ue(mci, mci->csrows[cs].first_page, 0,
-				  cs, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     mci->csrows[cs].first_page, 0, 0,
+				     cs, 0, -1, mci->ctl_name, "", NULL);
 	}
 
 	/* correctable/single-bit errors */
-	if (errsta & MCDEBUG_ERRSTA_SBE_STATUS) {
-		edac_mc_handle_ce(mci, mci->csrows[cs].first_page, 0,
-				  0, cs, 0, mci->ctl_name);
-	}
+	if (errsta & MCDEBUG_ERRSTA_SBE_STATUS)
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     mci->csrows[cs].first_page, 0, 0,
+				     cs, 0, -1, mci->ctl_name, "", NULL);
 }
 
 static void pasemi_edac_check(struct mem_ctl_info *mci)
@@ -191,6 +192,7 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 		const struct pci_device_id *ent)
 {
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u32 errctl1, errcor, scrub, mcen;
 
 	pci_read_config_dword(pdev, MCCFG_MCEN, &mcen);
@@ -207,9 +209,14 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 		MCDEBUG_ERRCTL1_RFL_LOG_EN;
 	pci_write_config_dword(pdev, MCDEBUG_ERRCTL1, errctl1);
 
-	mci = edac_mc_alloc(0, PASEMI_EDAC_NR_CSROWS, PASEMI_EDAC_NR_CHANS,
-				system_mmc_id++);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = PASEMI_EDAC_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = PASEMI_EDAC_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers, false,
+			    0);
 	if (mci == NULL)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 21/26] ppc4xx_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Josh Boyer, Jiri Kosina,
	Borislav Petkov

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Cc: Josh Boyer <jwboyer@gmail.com>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
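Most conversions in this series fill the same two-entry layer array: a
chip-select layer marked as the csrow, followed by a channel layer. The
sketch below mirrors that pattern in userspace C. The sketch_* types and
helpers are hypothetical and only copy the three fields the patches set
(type, size, is_csrow); they are not the kernel definitions. That the
number of addressable positions equals the product of the layer sizes is
an assumption made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

/* Userspace mirror of the fields the patches set; not the kernel types. */
enum sketch_layer_type {
	SKETCH_LAYER_CHANNEL,
	SKETCH_LAYER_SLOT,
	SKETCH_LAYER_CHIP_SELECT,
};

struct sketch_layer {
	enum sketch_layer_type type;
	unsigned size;
	bool is_csrow;
};

/* Hypothetical helper: describe a classic csrow/channel controller. */
static void sketch_fill_csrow_layers(struct sketch_layer layers[2],
				     unsigned nr_csrows, unsigned nr_chans)
{
	layers[0].type = SKETCH_LAYER_CHIP_SELECT;
	layers[0].size = nr_csrows;
	layers[0].is_csrow = true;
	layers[1].type = SKETCH_LAYER_CHANNEL;
	layers[1].size = nr_chans;
	layers[1].is_csrow = false;
}

/* Assumed here: one position per combination of layer indices. */
static unsigned sketch_total_positions(const struct sketch_layer *layers,
				       size_t n_layers)
{
	unsigned total = 1;
	size_t i;

	for (i = 0; i < n_layers; i++)
		total *= layers[i].size;
	return total;
}
```

A driver-side helper of this shape would remove the six repeated
assignment lines from each probe function; whether that is worth a new
core function is a design question this series does not address.
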
 drivers/edac/ppc4xx_edac.c |   25 +++++++++++++++++--------
 1 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 95cfc0f..482161e 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -727,7 +727,10 @@ ppc4xx_edac_handle_ce(struct mem_ctl_info *mci,
 
 	for (row = 0; row < mci->nr_csrows; row++)
 		if (ppc4xx_edac_check_bank_error(status, row))
-			edac_mc_handle_ce_no_info(mci, message);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, 0,
+					     row, 0, -1,
+					     message, "", NULL);
 }
 
 /**
@@ -755,7 +758,10 @@ ppc4xx_edac_handle_ue(struct mem_ctl_info *mci,
 
 	for (row = 0; row < mci->nr_csrows; row++)
 		if (ppc4xx_edac_check_bank_error(status, row))
-			edac_mc_handle_ue(mci, page, offset, row, message);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, offset, 0,
+					     row, 0, -1,
+					     message, "", NULL);
 }
 
 /**
@@ -1233,6 +1239,7 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	dcr_host_t dcr_host;
 	const struct device_node *np = op->dev.of_node;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	static int ppc4xx_edac_instance;
 
 	/*
@@ -1278,12 +1285,14 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	 * controller instance and perform the appropriate
 	 * initialization.
 	 */
-
-	mci = edac_mc_alloc(sizeof(struct ppc4xx_edac_pdata),
-			    ppc4xx_edac_nr_csrows,
-			    ppc4xx_edac_nr_chans,
-			    ppc4xx_edac_instance);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = ppc4xx_edac_nr_csrows;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = ppc4xx_edac_nr_chans;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
+			    false, sizeof(struct ppc4xx_edac_pdata));
 	if (mci == NULL) {
 		ppc4xx_edac_printk(KERN_ERR, "%s: "
 				   "Failed to allocate EDAC MC instance!\n",
-- 
1.7.8



* [EDAC_ABI PATCH v13 22/26] r82600_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Tim Small

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Cc: Tim Small <tim@buttersideup.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/r82600_edac.c |   27 ++++++++++++++++++---------
 1 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 19f3a10..ca64939 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -179,10 +179,11 @@ static int r82600_process_error_info(struct mem_ctl_info *mci,
 		error_found = 1;
 
 		if (handle_errors)
-			edac_mc_handle_ce(mci, page, 0,	/* not avail */
-					syndrome,
-					edac_mc_find_csrow_by_page(mci, page),
-					0, mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, 0, syndrome,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1,
+					     mci->ctl_name, "", NULL);
 	}
 
 	if (info->eapr & BIT(1)) {	/* UE? */
@@ -190,9 +191,11 @@ static int r82600_process_error_info(struct mem_ctl_info *mci,
 
 		if (handle_errors)
 			/* 82600 doesn't give enough info */
-			edac_mc_handle_ue(mci, page, 0,
-					edac_mc_find_csrow_by_page(mci, page),
-					mci->ctl_name);
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     page, 0, 0,
+					     edac_mc_find_csrow_by_page(mci, page),
+					     0, -1,
+					     mci->ctl_name, "", NULL);
 	}
 
 	return error_found;
@@ -267,6 +270,7 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	u8 dramcr;
 	u32 eapr;
 	u32 scrub_disabled;
@@ -281,8 +285,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
 		sdram_refresh_rate);
 	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
-	mci = edac_mc_alloc(0, R82600_NR_CSROWS, R82600_NR_CHANS, 0);
-
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = R82600_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = R82600_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 23/26] sb_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
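This patch describes the Sandy Bridge controller as channel x slot and
drops the driver-private csrow_map[][] table, looking each DIMM up by its
(channel, slot) position instead. The sketch below shows one plausible
way to flatten such a position into an array index. The row-major
ordering, the sketch_geometry struct and sketch_dimm_index() are
assumptions for illustration; the exact arithmetic behind GET_POS() is
not shown in this patch.

```c
#include <stddef.h>

/* Hypothetical two-layer geometry; sizes are supplied by the caller. */
struct sketch_geometry {
	unsigned n_channels;	/* outer layer */
	unsigned n_slots;	/* inner layer */
};

/*
 * Row-major flattening of (channel, slot).
 * Returns -1 when either index is out of range.
 */
static int sketch_dimm_index(const struct sketch_geometry *g,
			     unsigned channel, unsigned slot)
{
	if (channel >= g->n_channels || slot >= g->n_slots)
		return -1;
	return (int)(channel * g->n_slots + slot);
}
```

With a fixed mapping like this, every (channel, slot) pair has a slot in
the array whether or not a DIMM is populated, which is why the driver no
longer needs to count active DIMMs before allocating.
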
 drivers/edac/sb_edac.c |  159 +++++++++++++++++-------------------------------
 1 files changed, 56 insertions(+), 103 deletions(-)

diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ee1543d..dc6a306 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -314,8 +314,6 @@ struct sbridge_pvt {
 	struct sbridge_info	info;
 	struct sbridge_channel	channel[NUM_CHANNELS];
 
-	int 			csrow_map[NUM_CHANNELS][MAX_DIMMS];
-
 	/* Memory type detection */
 	bool			is_mirrored, is_lockstep, is_close_pg;
 
@@ -487,29 +485,14 @@ static struct pci_dev *get_pdev_slot_func(u8 bus, unsigned slot,
 }
 
 /**
- * sbridge_get_active_channels() - gets the number of channels and csrows
+ * check_if_ecc_is_active() - Checks if ECC is active
  * bus:		Device bus
- * @channels:	Number of channels that will be returned
- * @csrows:	Number of csrows found
- *
- * Since EDAC core needs to know in advance the number of available channels
- * and csrows, in order to allocate memory for csrows/channels, it is needed
- * to run two similar steps. At the first step, implemented on this function,
- * it checks the number of csrows/channels present at one socket, identified
- * by the associated PCI bus.
- * this is used in order to properly allocate the size of mci components.
- * Note: one csrow is one dimm.
  */
-static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
-				      unsigned *csrows)
+static int check_if_ecc_is_active(const u8 bus)
 {
 	struct pci_dev *pdev = NULL;
-	int i, j;
 	u32 mcmtr;
 
-	*channels = 0;
-	*csrows = 0;
-
 	pdev = get_pdev_slot_func(bus, 15, 0);
 	if (!pdev) {
 		sbridge_printk(KERN_ERR, "Couldn't find PCI device "
@@ -523,41 +506,14 @@ static int sbridge_get_active_channels(const u8 bus, unsigned *channels,
 		sbridge_printk(KERN_ERR, "ECC is disabled. Aborting\n");
 		return -ENODEV;
 	}
-
-	for (i = 0; i < NUM_CHANNELS; i++) {
-		u32 mtr;
-
-		/* Device 15 functions 2 - 5  */
-		pdev = get_pdev_slot_func(bus, 15, 2 + i);
-		if (!pdev) {
-			sbridge_printk(KERN_ERR, "Couldn't find PCI device "
-						 "%2x.%02d.%d!!!\n",
-						 bus, 15, 2 + i);
-			return -ENODEV;
-		}
-		(*channels)++;
-
-		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
-			pci_read_config_dword(pdev, mtr_regs[j], &mtr);
-			debugf1("Bus#%02x channel #%d  MTR%d = %x\n", bus, i, j, mtr);
-			if (IS_DIMM_PRESENT(mtr))
-				(*csrows)++;
-		}
-	}
-
-	debugf0("Number of active channels: %d, number of active dimms: %d\n",
-		*channels, *csrows);
-
 	return 0;
 }
 
 static int get_dimm_config(struct mem_ctl_info *mci)
 {
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	struct csrow_info *csr;
+	struct dimm_info *dimm;
 	int i, j, banks, ranks, rows, cols, size, npages;
-	int csrow = 0;
-	unsigned long last_page = 0;
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
@@ -616,7 +572,8 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
-			struct dimm_info *dimm = &mci->dimms[j];
+			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+				       i, j, 0);
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -636,16 +593,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 					size, npages,
 					banks, ranks, rows, cols);
 
-				/*
-				 * Fake stuff. This controller doesn't see
-				 * csrows.
-				 */
-				csr = &mci->csrows[csrow];
-				pvt->csrow_map[i][j] = csrow;
-				last_page += npages;
-				csrow++;
-
-				csr->channels[0].dimm = dimm;
 				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
@@ -841,11 +788,10 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 				 u8 *socket,
 				 long *channel_mask,
 				 u8 *rank,
-				 char *area_type)
+				 char *area_type, char *msg)
 {
 	struct mem_ctl_info	*new_mci;
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	char			msg[256];
 	int 			n_rir, n_sads, n_tads, sad_way, sck_xch;
 	int			sad_interl, idx, base_ch;
 	int			interleave_mode;
@@ -867,12 +813,10 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	 */
 	if ((addr > (u64) pvt->tolm) && (addr < (1LL << 32))) {
 		sprintf(msg, "Error at TOLM area, on addr 0x%08Lx", addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	if (addr >= (u64)pvt->tohm) {
 		sprintf(msg, "Error at MMIOH area, on addr 0x%016Lx", addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 
@@ -889,7 +833,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		limit = SAD_LIMIT(reg);
 		if (limit <= prv) {
 			sprintf(msg, "Can't discover the memory socket");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 		if  (addr <= limit)
@@ -898,7 +841,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	}
 	if (n_sads == MAX_SAD) {
 		sprintf(msg, "Can't discover the memory socket");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	area_type = get_dram_attr(reg);
@@ -939,7 +881,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		break;
 	default:
 		sprintf(msg, "Can't discover socket interleave");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	*socket = sad_interleave[idx];
@@ -954,7 +895,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (!new_mci) {
 		sprintf(msg, "Struct for socket #%u wasn't initialized",
 			*socket);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	mci = new_mci;
@@ -970,7 +910,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		limit = TAD_LIMIT(reg);
 		if (limit <= prv) {
 			sprintf(msg, "Can't discover the memory channel");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 		if  (addr <= limit)
@@ -1010,7 +949,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 		break;
 	default:
 		sprintf(msg, "Can't discover the TAD target");
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	*channel_mask = 1 << base_ch;
@@ -1024,7 +962,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 			break;
 		default:
 			sprintf(msg, "Invalid mirror set. Can't decode addr");
-			edac_mc_handle_ce_no_info(mci, msg);
 			return -EINVAL;
 		}
 	} else
@@ -1052,7 +989,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (offset > addr) {
 		sprintf(msg, "Can't calculate ch addr: TAD offset 0x%08Lx is too high for addr 0x%08Lx!",
 			offset, addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	addr -= offset;
@@ -1092,7 +1028,6 @@ static int get_memory_error_data(struct mem_ctl_info *mci,
 	if (n_rir == MAX_RIR_RANGES) {
 		sprintf(msg, "Can't discover the memory rank for ch addr 0x%08Lx",
 			ch_addr);
-		edac_mc_handle_ce_no_info(mci, msg);
 		return -EINVAL;
 	}
 	rir_way = RIR_WAY(reg);
@@ -1406,7 +1341,8 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 {
 	struct mem_ctl_info *new_mci;
 	struct sbridge_pvt *pvt = mci->pvt_info;
-	char *type, *optype, *msg, *recoverable_msg;
+	enum hw_event_mc_err_type tp_event;
+	char *type, *optype, msg[256], *recoverable_msg;
 	bool ripv = GET_BITFIELD(m->mcgstatus, 0, 0);
 	bool overflow = GET_BITFIELD(m->status, 62, 62);
 	bool uncorrected_error = GET_BITFIELD(m->status, 61, 61);
@@ -1418,13 +1354,21 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	u32 optypenum = GET_BITFIELD(m->status, 4, 6);
 	long channel_mask, first_channel;
 	u8  rank, socket;
-	int csrow, rc, dimm;
+	int rc, dimm;
 	char *area_type = "Unknown";
 
-	if (ripv)
-		type = "NON_FATAL";
-	else
-		type = "FATAL";
+	if (uncorrected_error) {
+		if (ripv) {
+			type = "FATAL";
+			tp_event = HW_EVENT_ERR_FATAL;
+		} else {
+			type = "NON_FATAL";
+			tp_event = HW_EVENT_ERR_UNCORRECTED;
+		}
+	} else {
+		type = "CORRECTED";
+		tp_event = HW_EVENT_ERR_CORRECTED;
+	}
 
 	/*
 	 * According with Table 15-9 of the Intel Archictecture spec vol 3A,
@@ -1442,19 +1386,19 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	} else {
 		switch (optypenum) {
 		case 0:
-			optype = "generic undef request";
+			optype = "generic undef request error";
 			break;
 		case 1:
-			optype = "memory read";
+			optype = "memory read error";
 			break;
 		case 2:
-			optype = "memory write";
+			optype = "memory write error";
 			break;
 		case 3:
-			optype = "addr/cmd";
+			optype = "addr/cmd error";
 			break;
 		case 4:
-			optype = "memory scrubbing";
+			optype = "memory scrubbing error";
 			break;
 		default:
 			optype = "reserved";
@@ -1463,13 +1407,13 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	}
 
 	rc = get_memory_error_data(mci, m->addr, &socket,
-				   &channel_mask, &rank, area_type);
+				   &channel_mask, &rank, area_type, msg);
 	if (rc < 0)
-		return;
+		goto err_parsing;
 	new_mci = get_mci_for_node_id(socket);
 	if (!new_mci) {
-		edac_mc_handle_ce_no_info(mci, "Error: socket got corrupted!");
-		return;
+		strcpy(msg, "Error: socket got corrupted!");
+		goto err_parsing;
 	}
 	mci = new_mci;
 	pvt = mci->pvt_info;
@@ -1483,8 +1427,6 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	else
 		dimm = 2;
 
-	csrow = pvt->csrow_map[first_channel][dimm];
-
 	if (uncorrected_error && recoverable)
 		recoverable_msg = " recoverable";
 	else
@@ -1495,18 +1437,14 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 	 * Probably, we can just discard it, as the channel information
 	 * comes from the get_memory_error_data() address decoding
 	 */
-	msg = kasprintf(GFP_ATOMIC,
-			"%d %s error(s): %s on %s area %s%s: cpu=%d Err=%04x:%04x (ch=%d), "
-			"addr = 0x%08llx => socket=%d, Channel=%ld(mask=%ld), rank=%d\n",
+	snprintf(msg, sizeof(msg),
+			"%d error(s)%s: %s%s: cpu=%d Err=%04x:%04x addr = 0x%08llx socket=%d Channel=%ld(mask=%ld), rank=%d\n",
 			core_err_cnt,
+			overflow ? " OVERFLOW" : "",
 			area_type,
-			optype,
-			type,
 			recoverable_msg,
-			overflow ? "OVERFLOW" : "",
 			m->cpu,
 			mscod, errcode,
-			channel,		/* 1111b means not specified */
 			(long long) m->addr,
 			socket,
 			first_channel,		/* This is the real channel on SB */
@@ -1515,13 +1453,19 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 
 	debugf0("%s", msg);
 
+	/* FIXME: need support for channel mask */
+
 	/* Call the helper to output message */
-	if (uncorrected_error)
-		edac_mc_handle_fbd_ue(mci, csrow, 0, 0, msg);
-	else
-		edac_mc_handle_fbd_ce(mci, csrow, 0, msg);
+	edac_mc_handle_error(tp_event, mci,
+			     m->addr >> PAGE_SHIFT, m->addr & ~PAGE_MASK, 0,
+			     channel, dimm, -1,
+			     optype, msg, m);
+	return;
+err_parsing:
+	edac_mc_handle_error(tp_event, mci, 0, 0, 0,
+			     -1, -1, -1,
+			     msg, "", m);
 
-	kfree(msg);
 }
 
 /*
@@ -1680,16 +1624,25 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 {
 	struct mem_ctl_info *mci;
+	struct edac_mc_layer layers[2];
 	struct sbridge_pvt *pvt;
-	int rc, channels, csrows;
+	int rc;
 
 	/* Check the number of active and not disabled channels */
-	rc = sbridge_get_active_channels(sbridge_dev->bus, &channels, &csrows);
+	rc = check_if_ecc_is_active(sbridge_dev->bus);
 	if (unlikely(rc < 0))
 		return rc;
 
 	/* allocate a new MC control structure */
-	mci = edac_mc_alloc(sizeof(*pvt), csrows, channels, sbridge_dev->mc);
+	layers[0].type = EDAC_MC_LAYER_CHANNEL;
+	layers[0].size = NUM_CHANNELS;
+	layers[0].is_csrow = false;
+	layers[1].type = EDAC_MC_LAYER_SLOT;
+	layers[1].size = MAX_DIMMS;
+	layers[1].is_csrow = true;
+	mci = new_edac_mc_alloc(sbridge_dev->mc, ARRAY_SIZE(layers), layers,
+			    false, sizeof(*pvt));
+
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-- 
1.7.8



* [EDAC_ABI PATCH v13 24/26] tile_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Chris Metcalf

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Cc: Chris Metcalf <cmetcalf@tilera.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/tile_edac.c |   16 +++++++++++++---
 1 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 6314ff9..0e4d628 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -71,7 +71,10 @@ static void tile_edac_check(struct mem_ctl_info *mci)
 	if (mem_error.sbe_count != priv->ce_count) {
 		dev_dbg(mci->dev, "ECC CE err on node %d\n", priv->node);
 		priv->ce_count = mem_error.sbe_count;
-		edac_mc_handle_ce(mci, 0, 0, 0, 0, 0, mci->ctl_name);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     0, 0, -1,
+				     mci->ctl_name, "", NULL);
 	}
 }
 
@@ -122,6 +125,7 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 	char			hv_file[32];
 	int			hv_devhdl;
 	struct mem_ctl_info	*mci;
+	struct edac_mc_layer	layers[2];
 	struct tile_edac_priv	*priv;
 	int			rc;
 
@@ -131,8 +135,14 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 		return -EINVAL;
 
 	/* A TILE MC has a single channel and one chip-select row. */
-	mci = edac_mc_alloc(sizeof(struct tile_edac_priv),
-		TILE_EDAC_NR_CSROWS, TILE_EDAC_NR_CHANS, pdev->id);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = TILE_EDAC_NR_CSROWS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = TILE_EDAC_NR_CHANS;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+			    sizeof(struct tile_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
 	priv = mci->pvt_info;
-- 
1.7.8



* [EDAC_ABI PATCH v13 25/26] x38_edac: convert driver to use the new edac ABI
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Borislav Petkov

The legacy EDAC ABI is going to be removed. Port the driver to the
new API so that it benefits from the new functionality.

Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
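The converted calls in this patch pass -1 for every layer the hardware
does not report (all three for "UE overwrote CE", the last two for the
per-channel errors). The sketch below shows how a consumer could build a
location string that skips such layers. Treating a negative index as
"unknown" is inferred from the calls in this patch; the layer names and
sketch_format_location() are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

#define SKETCH_MAX_LAYERS 3

/*
 * Write "layerN:idx" for every non-negative entry of pos[] into buf,
 * separated by spaces. Returns how many layers were fully written.
 */
static int sketch_format_location(char *buf, size_t len,
				  const int pos[SKETCH_MAX_LAYERS])
{
	static const char *const names[SKETCH_MAX_LAYERS] = {
		"layer0", "layer1", "layer2"
	};
	size_t used = 0;
	int known = 0;
	int i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < SKETCH_MAX_LAYERS; i++) {
		int n;

		if (pos[i] < 0)
			continue;	/* unknown layer: skip it */
		n = snprintf(buf + used, len - used, "%s%s:%d",
			     known ? " " : "", names[i], pos[i]);
		if (n < 0 || (size_t)n >= len - used)
			break;		/* output truncated: stop here */
		used += (size_t)n;
		known++;
	}
	return known;
}
```

An all-negative position produces an empty string, which matches the
intent of the "UE overwrote CE" call: the error is counted, but no
location is claimed for it.
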
 drivers/edac/x38_edac.c |   28 +++++++++++++++++++++-------
 1 files changed, 21 insertions(+), 7 deletions(-)

diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 0de288f..f715269 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -215,19 +215,26 @@ static void x38_process_error_info(struct mem_ctl_info *mci,
 		return;
 
 	if ((info->errsts ^ info->errsts2) & X38_ERRSTS_BITS) {
-		edac_mc_handle_ce_no_info(mci, "UE overwrote CE");
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
+				     -1, -1, -1,
+				     "UE overwrote CE", "", NULL);
 		info->errsts = info->errsts2;
 	}
 
 	for (channel = 0; channel < x38_channel_num; channel++) {
 		log = info->eccerrlog[channel];
 		if (log & X38_ECCERRLOG_UE) {
-			edac_mc_handle_ue(mci, 0, 0,
-				eccerrlog_row(channel, log), "x38 UE");
+			edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+					     0, 0, 0,
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "x38 UE", "", NULL);
 		} else if (log & X38_ECCERRLOG_CE) {
-			edac_mc_handle_ce(mci, 0, 0,
-				eccerrlog_syndrome(log),
-				eccerrlog_row(channel, log), 0, "x38 CE");
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     0, 0, eccerrlog_syndrome(log),
+					     eccerrlog_row(channel, log),
+					     -1, -1,
+					     "x38 CE", "", NULL);
 		}
 	}
 }
@@ -319,6 +326,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	u16 drbs[X38_CHANNELS][X38_RANKS_PER_CHANNEL];
 	bool stacked;
 	void __iomem *window;
@@ -334,7 +342,13 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	how_many_channel(pdev);
 
 	/* FIXME: unconventional pvt_info usage */
-	mci = edac_mc_alloc(0, X38_RANKS, x38_channel_num, 0);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = X38_RANKS;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = x38_channel_num;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
-- 
1.7.8


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [EDAC_ABI PATCH v13 26/26] edac: Remove the legacy EDAC ABI
  2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
                       ` (24 preceding siblings ...)
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 25/26] x38_edac: " Mauro Carvalho Chehab
@ 2012-04-16 20:21     ` Mauro Carvalho Chehab
  25 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-16 20:21 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

Now that all drivers got converted to use the new ABI, we can
drop the old one.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 drivers/edac/amd64_edac.c      |    2 +-
 drivers/edac/amd76x_edac.c     |    2 +-
 drivers/edac/cell_edac.c       |    2 +-
 drivers/edac/cpc925_edac.c     |    2 +-
 drivers/edac/e752x_edac.c      |    2 +-
 drivers/edac/e7xxx_edac.c      |    2 +-
 drivers/edac/edac_core.h       |   78 +---------------------------------------
 drivers/edac/edac_mc.c         |   53 +--------------------------
 drivers/edac/i3000_edac.c      |    2 +-
 drivers/edac/i3200_edac.c      |    2 +-
 drivers/edac/i5000_edac.c      |    2 +-
 drivers/edac/i5100_edac.c      |    2 +-
 drivers/edac/i5400_edac.c      |    2 +-
 drivers/edac/i7300_edac.c      |    2 +-
 drivers/edac/i7core_edac.c     |    2 +-
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    2 +-
 drivers/edac/i82975x_edac.c    |    2 +-
 drivers/edac/mpc85xx_edac.c    |    2 +-
 drivers/edac/mv64x60_edac.c    |    2 +-
 drivers/edac/pasemi_edac.c     |    2 +-
 drivers/edac/ppc4xx_edac.c     |    2 +-
 drivers/edac/r82600_edac.c     |    2 +-
 drivers/edac/sb_edac.c         |    2 +-
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    2 +-
 27 files changed, 27 insertions(+), 154 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index c4182e4..445ff03 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2567,7 +2567,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = pvt->channel_count;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index fd9ab44..0184e90 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -251,7 +251,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 4cfd22a..39616a3 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -204,7 +204,7 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = num_chans;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct cell_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index adbeb13..caaae0d 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -982,7 +982,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct cpc925_mc_pdata));
 	if (!mci) {
 		cpc925_printk(KERN_ERR, "No memory for mem_ctl_info\n");
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7b6ce11..5f20a8e 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1278,7 +1278,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = drc_chan + 1;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 9380f8a..54a6666 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -447,7 +447,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = drc_chan + 1;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index 7201bb1..8aadd83 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,9 +447,7 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				   unsigned nr_chans, int edac_index);
-struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 				   unsigned n_layers,
 				   struct edac_mc_layer *layers,
 				   bool rev_order,
@@ -461,18 +459,6 @@ extern struct mem_ctl_info *find_mci_by_dev(struct device *dev);
 extern struct mem_ctl_info *edac_mc_del_mc(struct device *dev);
 extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
 				      unsigned long page);
-
-/*
- * The no info errors are used when error overflows are reported.
- * There are a limited number of error logging registers that can
- * be exausted.  When all registers are exhausted and an additional
- * error occurs then an error overflow register records that an
- * error occurred and the type of error, but doesn't have any
- * further information.  The ce/ue versions make for cleaner
- * reporting logic and function interface - reduces conditional
- * statement clutter and extra function arguments.
- */
-
 void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			  struct mem_ctl_info *mci,
 			  const unsigned long page_frame_number,
@@ -485,68 +471,6 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			  const char *other_detail,
 			  const void *mcelog);
 
-static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg)
-{
-	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
-			      page_frame_number, offset_in_page, syndrome,
-		              row, channel, -1, msg, NULL, NULL);
-}
-
-static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg)
-{
-	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
-			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
-}
-
-static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg)
-{
-	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
-			      page_frame_number, offset_in_page, 0,
-		              row, -1, -1, msg, NULL, NULL);
-}
-
-static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg)
-{
-	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
-			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
-}
-
-static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-					 unsigned int csrow,
-					 unsigned int channel0,
-					 unsigned int channel1,
-					 char *msg)
-{
-	/*
-	 *FIXME: The error can also be at channel1 (e. g. at the second
-	 *	  channel of the same branch). The fix is to push
-	 *	  edac_mc_handle_error() call into each driver
-	 */
-	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
-			      0, 0, 0,
-		              csrow, channel0, -1, msg, NULL, NULL);
-}
-
-static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-					 unsigned int csrow,
-					 unsigned int channel, char *msg)
-{
-	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
-			      0, 0, 0,
-		              csrow, channel, -1, msg, NULL, NULL);
-}
-
-
-
 /*
  * edac_device APIs
  */
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f231c54..f5be026 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -189,7 +189,7 @@ void *edac_align_ptr(void **p, unsigned size, int quant)
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 				   unsigned n_layers,
 				   struct edac_mc_layer *layers,
 				   bool rev_order,
@@ -370,57 +370,6 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	 */
 	return mci;
 }
-EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
-
-/**
- * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
- * @edac_index:		Memory controller number
- * @n_layers:		Nu
-mber of layers at the MC hierarchy
- * layers:		Describes each layer as seen by the Memory Controller
- * @rev_order:		Fills csrows/cs channels at the reverse order
- * @size_pvt:		size of private storage needed
- *
- *
- * FIXME: drivers handle multi-rank memories on different ways: on some
- * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
- * a single multi-rank DIMM would be mapped into several "dimms".
- *
- * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
- * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
- * thing, as two chip select values are used for dual-rank memories (and 4, for
- * quad-rank ones). I suspect that this issue could be solved inside the EDAC
- * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
- *
- * In summary, solving this issue is not easy, as it requires a lot of testing.
- *
- * Everything is kmalloc'ed as one big chunk - more efficient.
- * Only can be used if all structures have the same lifetime - otherwise
- * you have to allocate and initialize your own structures.
- *
- * Use edac_mc_free() to free mc structures allocated by this function.
- *
- * Returns:
- *	NULL allocation failed
- *	struct mem_ctl_info pointer
- */
-
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				   unsigned nr_chans, int edac_index)
-{
-	unsigned n_layers = 2;
-	struct edac_mc_layer layers[n_layers];
-
-	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
-	layers[0].size = nr_csrows;
-	layers[0].is_csrow = true;
-	layers[1].type = EDAC_MC_LAYER_CHANNEL;
-	layers[1].size = nr_chans;
-	layers[1].is_csrow = false;
-
-	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
-			  false, sz_pvt);
-}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 9b2c561..2032d198 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -362,7 +362,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index c926ff0..9a35487 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -347,7 +347,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(struct i3200_priv));
 	if (!mci)
 		return -ENOMEM;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 18b2532..1691cdd 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1396,7 +1396,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = num_dimms_per_channel;
 	layers[2].is_csrow = true;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index ec5de18..fda60f8 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -936,7 +936,7 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = ranksperch;
 	layers[1].is_csrow = true;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*priv));
 	if (!mci) {
 		ret = -ENOMEM;
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 029c557..3aa2a1e 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1280,7 +1280,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = DIMMS_PER_CHANNEL;
 	layers[2].is_csrow = true;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 6e762f5..0ff0b26 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1054,7 +1054,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = MAX_SLOTS;
 	layers[2].is_csrow = true;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 8bcee03..72553dd 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -2147,7 +2147,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = MAX_DIMMS;
 	layers[1].is_csrow = true;
-	mci = new_edac_mc_alloc(i7core_dev->socket, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(i7core_dev->socket, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 	if (unlikely(!mci))
 		return -ENOMEM;
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 00aa186..23a0b5d 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -255,7 +255,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = I82443BXGX_NR_CHANS;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 26d91ba..b49b3b5 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -206,7 +206,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = 8;
 	layers[1].is_csrow = true;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 4259c4a..c2c82c3 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -420,7 +420,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_chans;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail0;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index bfd632c..c7db489 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -555,7 +555,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = I82975X_NR_CSROWS(chans);
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
 		goto fail1;
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 87a9baa..a7f1ff1 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -980,7 +980,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(*pdata));
 	if (!mci) {
 		devres_release_group(&op->dev, mpc85xx_mc_err_probe);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index f294da7..a32e9b6 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -715,7 +715,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct mv64x60_mc_pdata));
 	if (!mci) {
 		printk(KERN_ERR "%s: No memory for CPU err\n", __func__);
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index addf893..caf17c8 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -215,7 +215,7 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = PASEMI_EDAC_NR_CHANS;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers, false,
 			    0);
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 482161e..89e3147 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -1291,7 +1291,7 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = ppc4xx_edac_nr_chans;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
 			    false, sizeof(struct ppc4xx_edac_pdata));
 	if (mci == NULL) {
 		ppc4xx_edac_printk(KERN_ERR, "%s: "
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index ca64939..fe060db 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -291,7 +291,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = R82600_NR_CHANS;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
 
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index dc6a306..b253675 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1640,7 +1640,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = MAX_DIMMS;
 	layers[1].is_csrow = true;
-	mci = new_edac_mc_alloc(sbridge_dev->mc, ARRAY_SIZE(layers), layers,
+	mci = edac_mc_alloc(sbridge_dev->mc, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 
 	if (unlikely(!mci))
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 0e4d628..3e878bf 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -141,7 +141,7 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = TILE_EDAC_NR_CHANS;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
+	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct tile_edac_priv));
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index f715269..996d548 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -348,7 +348,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = x38_channel_num;
 	layers[1].is_csrow = false;
-	mci = new_edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
+	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
 
-- 
1.7.8


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
  2012-04-16 20:12     ` Mauro Carvalho Chehab
@ 2012-04-17 18:48       ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-17 18:48 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Mark Gross, Jason Uhlenkott, Tim Small,
	Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Mon, Apr 16, 2012 at 05:12:10PM -0300, Mauro Carvalho Chehab wrote:
> The number of pages is a dimm property. Move it to the dimm struct.
> 
> After this change, it is possible to add sysfs nodes for the DIMM's that

Minor nitpick:

                                                               DIMMs

Please go over the rest of the commit messages because they have similar
typos.

> will properly represent the DIMM stick properties, including its size.
> 
> A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
> the memory controller represents the memory via chip select rows.
> 
> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Borislav Petkov <borislav.petkov@amd.com>
> Cc: Mark Gross <mark.gross@intel.com>
> Cc: Jason Uhlenkott <juhlenko@akamai.com>
> Cc: Tim Small <tim@buttersideup.com>
> Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
> Cc: "Arvind R." <arvino55@gmail.com>
> Cc: Olof Johansson <olof@lixom.net>
> Cc: Egor Martovetsky <egor@pasemi.com>
> Cc: Chris Metcalf <cmetcalf@tilera.com>
> Cc: Michal Marek <mmarek@suse.cz>
> Cc: Jiri Kosina <jkosina@suse.cz>
> Cc: Joe Perches <joe@perches.com>
> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
> Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
> Cc: Josh Boyer <jwboyer@gmail.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/amd64_edac.c      |   12 +++------
>  drivers/edac/amd76x_edac.c     |    6 ++--
>  drivers/edac/cell_edac.c       |    8 ++++--
>  drivers/edac/cpc925_edac.c     |    8 ++++--
>  drivers/edac/e752x_edac.c      |    6 +++-
>  drivers/edac/e7xxx_edac.c      |    5 ++-
>  drivers/edac/edac_mc.c         |   16 ++++++++-----
>  drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
>  drivers/edac/i3000_edac.c      |    6 +++-
>  drivers/edac/i3200_edac.c      |    3 +-
>  drivers/edac/i5000_edac.c      |   14 ++++++-----
>  drivers/edac/i5100_edac.c      |   22 +++++++++++-------
>  drivers/edac/i5400_edac.c      |    9 ++-----
>  drivers/edac/i7300_edac.c      |   22 +++++-------------
>  drivers/edac/i7core_edac.c     |   10 ++------
>  drivers/edac/i82443bxgx_edac.c |    2 +-
>  drivers/edac/i82860_edac.c     |    2 +-
>  drivers/edac/i82875p_edac.c    |    5 ++-
>  drivers/edac/i82975x_edac.c    |   11 ++++++--
>  drivers/edac/mpc85xx_edac.c    |    3 +-
>  drivers/edac/mv64x60_edac.c    |    3 +-
>  drivers/edac/pasemi_edac.c     |   14 ++++++------
>  drivers/edac/ppc4xx_edac.c     |    5 ++-
>  drivers/edac/r82600_edac.c     |    3 +-
>  drivers/edac/sb_edac.c         |    8 +-----
>  drivers/edac/tile_edac.c       |    2 +-
>  drivers/edac/x38_edac.c        |    4 +-
>  include/linux/edac.h           |    8 ++++--
>  28 files changed, 144 insertions(+), 120 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 0be3f29..8804ac8 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>  
>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>  
> -	/*
> -	 * If dual channel then double the memory size of single channel.
> -	 * Channel count is 1 or 2
> -	 */
> -	nr_pages <<= (pvt->channel_count - 1);

This changes DEBUG output from:

EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 1048576  channel-count = 2
EDAC amd64: CS0: Registered DDR3 RAM
EDAC DEBUG: init_csrows:   for MC node 0 csrow 0:
EDAC DEBUG: init_csrows:     nr_pages: 1048576

to

EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 524288  channel-count = 2

which is only half the pages.
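
To make the arithmetic concrete, here is a small standalone sketch. It
assumes 4 KiB pages (PAGE_SHIFT == 12), and the helper names are made up
for illustration; they are not in the driver.

```c
#include <stdint.h>

/*
 * Hypothetical helper: scale a per-channel page count the way the
 * removed line did. channel_count is 1 or 2.
 */
static uint32_t scale_by_channels(uint32_t nr_pages, unsigned channel_count)
{
	return nr_pages << (channel_count - 1);
}

/* Convert a page count to MiB, assuming PAGE_SHIFT == 12 (4 KiB pages). */
static uint32_t pages_to_mib(uint32_t nr_pages)
{
	return nr_pages >> (20 - 12);
}
```

With channel_count = 2, 524288 pages scale to 1048576. Under the 4 KiB
assumption that is 2048 MiB reported instead of 4096 MiB.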

> -
>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>  	debugf0("    nr_pages= %u  channel-count = %d\n",
>  		nr_pages, pvt->channel_count);
> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  	int i, j, empty = 1;
>  	enum mem_type mtype;
>  	enum edac_type edac_mode;
> +	int nr_pages;
>  
>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>  
> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
>  			i, pvt->mc_node_id);
>  
>  		empty = 0;
> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>  		/* 8 bytes of resolution */
>  
>  		mtype = amd64_determine_memory_type(pvt, i);
>  
>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
> +		debugf1("    nr_pages: %u\n", nr_pages);
>  
>  		/*
>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  		for (j = 0; j < pvt->channel_count; j++) {
>  			csrow->channels[j].dimm->mtype = mtype;
>  			csrow->channels[j].dimm->edac_mode = edac_mode;
> +			csrow->channels[j].dimm->nr_pages = nr_pages;

I'm guessing you want to accumulate the nr_pages for all channels here
and dump it properly?

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
@ 2012-04-17 18:48       ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-17 18:48 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Shaohui Xie, Jason Uhlenkott, Hitoshi Mitake, Mark Gross,
	Dmitry Eremin-Solenikov, Ranganathan Desikan, Egor Martovetsky,
	Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On Mon, Apr 16, 2012 at 05:12:10PM -0300, Mauro Carvalho Chehab wrote:
> The number of pages is a dimm property. Move it to the dimm struct.
> 
> After this change, it is possible to add sysfs nodes for the DIMM's that

Minor nitpick:

                                                               DIMMs

Please go over the rest of the commit messages because they have similar
typos.

> will properly represent the DIMM stick properties, including its size.
> 
> A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
> the memory controller represents the memory via chip select rows.
> 
> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Borislav Petkov <borislav.petkov@amd.com>
> Cc: Mark Gross <mark.gross@intel.com>
> Cc: Jason Uhlenkott <juhlenko@akamai.com>
> Cc: Tim Small <tim@buttersideup.com>
> Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
> Cc: "Arvind R." <arvino55@gmail.com>
> Cc: Olof Johansson <olof@lixom.net>
> Cc: Egor Martovetsky <egor@pasemi.com>
> Cc: Chris Metcalf <cmetcalf@tilera.com>
> Cc: Michal Marek <mmarek@suse.cz>
> Cc: Jiri Kosina <jkosina@suse.cz>
> Cc: Joe Perches <joe@perches.com>
> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
> Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
> Cc: Josh Boyer <jwboyer@gmail.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/amd64_edac.c      |   12 +++------
>  drivers/edac/amd76x_edac.c     |    6 ++--
>  drivers/edac/cell_edac.c       |    8 ++++--
>  drivers/edac/cpc925_edac.c     |    8 ++++--
>  drivers/edac/e752x_edac.c      |    6 +++-
>  drivers/edac/e7xxx_edac.c      |    5 ++-
>  drivers/edac/edac_mc.c         |   16 ++++++++-----
>  drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
>  drivers/edac/i3000_edac.c      |    6 +++-
>  drivers/edac/i3200_edac.c      |    3 +-
>  drivers/edac/i5000_edac.c      |   14 ++++++-----
>  drivers/edac/i5100_edac.c      |   22 +++++++++++-------
>  drivers/edac/i5400_edac.c      |    9 ++-----
>  drivers/edac/i7300_edac.c      |   22 +++++-------------
>  drivers/edac/i7core_edac.c     |   10 ++------
>  drivers/edac/i82443bxgx_edac.c |    2 +-
>  drivers/edac/i82860_edac.c     |    2 +-
>  drivers/edac/i82875p_edac.c    |    5 ++-
>  drivers/edac/i82975x_edac.c    |   11 ++++++--
>  drivers/edac/mpc85xx_edac.c    |    3 +-
>  drivers/edac/mv64x60_edac.c    |    3 +-
>  drivers/edac/pasemi_edac.c     |   14 ++++++------
>  drivers/edac/ppc4xx_edac.c     |    5 ++-
>  drivers/edac/r82600_edac.c     |    3 +-
>  drivers/edac/sb_edac.c         |    8 +-----
>  drivers/edac/tile_edac.c       |    2 +-
>  drivers/edac/x38_edac.c        |    4 +-
>  include/linux/edac.h           |    8 ++++--
>  28 files changed, 144 insertions(+), 120 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 0be3f29..8804ac8 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>  
>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>  
> -	/*
> -	 * If dual channel then double the memory size of single channel.
> -	 * Channel count is 1 or 2
> -	 */
> -	nr_pages <<= (pvt->channel_count - 1);

This changes DEBUG output from:

EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 1048576  channel-count = 2
EDAC amd64: CS0: Registered DDR3 RAM
EDAC DEBUG: init_csrows:   for MC node 0 csrow 0:
EDAC DEBUG: init_csrows:     nr_pages: 1048576

to

EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 524288  channel-count = 2

which is only half the pages.

> -
>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>  	debugf0("    nr_pages= %u  channel-count = %d\n",
>  		nr_pages, pvt->channel_count);
> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  	int i, j, empty = 1;
>  	enum mem_type mtype;
>  	enum edac_type edac_mode;
> +	int nr_pages;
>  
>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>  
> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
>  			i, pvt->mc_node_id);
>  
>  		empty = 0;
> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>  		/* 8 bytes of resolution */
>  
>  		mtype = amd64_determine_memory_type(pvt, i);
>  
>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
> +		debugf1("    nr_pages: %u\n", nr_pages);
>  
>  		/*
>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  		for (j = 0; j < pvt->channel_count; j++) {
>  			csrow->channels[j].dimm->mtype = mtype;
>  			csrow->channels[j].dimm->edac_mode = edac_mode;
> +			csrow->channels[j].dimm->nr_pages = nr_pages;

I'm guessing you want to accumulate the nr_pages for all channels here
and dump it properly?
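
The halving in the debug output above can be reproduced with a small
standalone sketch. It is only an illustration: the 4 KiB page size and the
2048 MiB chip-select size are assumptions chosen to match the numbers in
the log, and the helper names are hypothetical, not driver functions.

```c
#include <stdint.h>

/* Assumption: 4 KiB pages; the kernel's PAGE_SHIFT is arch-specific. */
#define SKETCH_PAGE_SHIFT 12

/* Convert a chip-select size in MiB into a per-channel page count. */
static uint32_t pages_per_channel(uint32_t cs_size_mb)
{
	return cs_size_mb << (20 - SKETCH_PAGE_SHIFT);
}

/* Removed behaviour: double the count when there are two channels. */
static uint32_t pages_all_channels_old(uint32_t cs_size_mb, int channel_count)
{
	return pages_per_channel(cs_size_mb) << (channel_count - 1);
}
```

Under these assumptions, pages_per_channel(2048) gives the 524288 seen
after the patch, and pages_all_channels_old(2048, 2) gives the 1048576
printed before it.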

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
  2012-04-17 18:48       ` Borislav Petkov
@ 2012-04-17 19:28         ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-17 19:28 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Mark Gross, Jason Uhlenkott, Tim Small,
	Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 17-04-2012 15:48, Borislav Petkov wrote:
> On Mon, Apr 16, 2012 at 05:12:10PM -0300, Mauro Carvalho Chehab wrote:
>> The number of pages is a dimm property. Move it to the dimm struct.
>>
>> After this change, it is possible to add sysfs nodes for the DIMM's that
> 
> Minor nitpick:
> 
>                                                                DIMMs
> 
> Please go over the rest of the commit messages because they have similar
> typos.
> 
>> will properly represent the DIMM stick properties, including its size.
>>
>> A TODO fix here is to properly represent dual-rank/quad-rank DIMMs when
>> the memory controller represents the memory via chip select rows.
>>
>> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
>> Cc: Doug Thompson <norsk5@yahoo.com>
>> Cc: Borislav Petkov <borislav.petkov@amd.com>
>> Cc: Mark Gross <mark.gross@intel.com>
>> Cc: Jason Uhlenkott <juhlenko@akamai.com>
>> Cc: Tim Small <tim@buttersideup.com>
>> Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
>> Cc: "Arvind R." <arvino55@gmail.com>
>> Cc: Olof Johansson <olof@lixom.net>
>> Cc: Egor Martovetsky <egor@pasemi.com>
>> Cc: Chris Metcalf <cmetcalf@tilera.com>
>> Cc: Michal Marek <mmarek@suse.cz>
>> Cc: Jiri Kosina <jkosina@suse.cz>
>> Cc: Joe Perches <joe@perches.com>
>> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
>> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>> Cc: Hitoshi Mitake <h.mitake@gmail.com>
>> Cc: Andrew Morton <akpm@linux-foundation.org>
>> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
>> Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
>> Cc: Josh Boyer <jwboyer@gmail.com>
>> Cc: linuxppc-dev@lists.ozlabs.org
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  drivers/edac/amd64_edac.c      |   12 +++------
>>  drivers/edac/amd76x_edac.c     |    6 ++--
>>  drivers/edac/cell_edac.c       |    8 ++++--
>>  drivers/edac/cpc925_edac.c     |    8 ++++--
>>  drivers/edac/e752x_edac.c      |    6 +++-
>>  drivers/edac/e7xxx_edac.c      |    5 ++-
>>  drivers/edac/edac_mc.c         |   16 ++++++++-----
>>  drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
>>  drivers/edac/i3000_edac.c      |    6 +++-
>>  drivers/edac/i3200_edac.c      |    3 +-
>>  drivers/edac/i5000_edac.c      |   14 ++++++-----
>>  drivers/edac/i5100_edac.c      |   22 +++++++++++-------
>>  drivers/edac/i5400_edac.c      |    9 ++-----
>>  drivers/edac/i7300_edac.c      |   22 +++++-------------
>>  drivers/edac/i7core_edac.c     |   10 ++------
>>  drivers/edac/i82443bxgx_edac.c |    2 +-
>>  drivers/edac/i82860_edac.c     |    2 +-
>>  drivers/edac/i82875p_edac.c    |    5 ++-
>>  drivers/edac/i82975x_edac.c    |   11 ++++++--
>>  drivers/edac/mpc85xx_edac.c    |    3 +-
>>  drivers/edac/mv64x60_edac.c    |    3 +-
>>  drivers/edac/pasemi_edac.c     |   14 ++++++------
>>  drivers/edac/ppc4xx_edac.c     |    5 ++-
>>  drivers/edac/r82600_edac.c     |    3 +-
>>  drivers/edac/sb_edac.c         |    8 +-----
>>  drivers/edac/tile_edac.c       |    2 +-
>>  drivers/edac/x38_edac.c        |    4 +-
>>  include/linux/edac.h           |    8 ++++--
>>  28 files changed, 144 insertions(+), 120 deletions(-)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 0be3f29..8804ac8 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -2126,12 +2126,6 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>>  
>>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>>  
>> -	/*
>> -	 * If dual channel then double the memory size of single channel.
>> -	 * Channel count is 1 or 2
>> -	 */
>> -	nr_pages <<= (pvt->channel_count - 1);
> 
> This changes DEBUG output from:
> 
> EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
> EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
> EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 1048576  channel-count = 2
> EDAC amd64: CS0: Registered DDR3 RAM
> EDAC DEBUG: init_csrows:   for MC node 0 csrow 0:
> EDAC DEBUG: init_csrows:     nr_pages: 1048576
> 
> to
> 
> EDAC DEBUG: init_csrows: ----CSROW 0 VALID for MC node 0
> EDAC DEBUG: amd64_csrow_nr_pages:   (csrow=0) DBAM map index= 8
> EDAC DEBUG: amd64_csrow_nr_pages:     nr_pages= 524288  channel-count = 2
> 
> which is only half the pages.
> 
>> -
>>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>>  	debugf0("    nr_pages= %u  channel-count = %d\n",
>>  		nr_pages, pvt->channel_count);


OK. Well, we can either multiply nr_pages by channel_count or make it
clear that this is per channel. I prefer the latter option (see the enclosed
patch).

>> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  	int i, j, empty = 1;
>>  	enum mem_type mtype;
>>  	enum edac_type edac_mode;
>> +	int nr_pages;
>>  
>>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>>  
>> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  			i, pvt->mc_node_id);
>>  
>>  		empty = 0;
>> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>>  		/* 8 bytes of resolution */
>>  
>>  		mtype = amd64_determine_memory_type(pvt, i);
>>  
>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
>> +		debugf1("    nr_pages: %u\n", nr_pages);
>>  
>>  		/*
>>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
>> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  		for (j = 0; j < pvt->channel_count; j++) {
>>  			csrow->channels[j].dimm->mtype = mtype;
>>  			csrow->channels[j].dimm->edac_mode = edac_mode;
>> +			csrow->channels[j].dimm->nr_pages = nr_pages;
> 
> I'm guessing you want to accumulate the nr_pages for all channels here
> and dump it properly?
> 

Since you asked not to move the debugf0() after the loop, it is easier to
just multiply it in the printk. As an advantage, when the kernel is
compiled without debug, no code will be generated for it.

IMO, the best way to solve it is with this small patch. If you're OK with
it, I'll fold it into this one and add your ack.

Regards,
Mauro

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 8804ac8..6d6ec68 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2127,7 +2127,7 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
-	debugf0("    nr_pages= %u  channel-count = %d\n",
+	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
 
 	return nr_pages;
@@ -2176,7 +2176,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
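
The point about no code being produced without debug can be illustrated
with a minimal sketch. sketch_debugf() and SKETCH_DEBUG are hypothetical
stand-ins for the EDAC debugfN() helpers and the kernel debug option; the
sketch assumes the helper expands to nothing when debug is off, so the
multiplication in its argument list is never evaluated.

```c
#include <stdio.h>

/*
 * Hypothetical stand-in for the debugfN() helpers. SKETCH_DEBUG plays the
 * role of the kernel debug config option (an assumption for this sketch).
 */
#ifdef SKETCH_DEBUG
#define sketch_debugf(fmt, ...) fprintf(stderr, fmt, __VA_ARGS__)
#else
#define sketch_debugf(fmt, ...) do { } while (0)
#endif

/* Counts how often the multiplication is actually executed. */
static int multiply_calls;

static inline unsigned int total_pages(unsigned int nr_pages, int channel_count)
{
	multiply_calls++;
	return nr_pages * channel_count;
}

/* Returns how many multiplications ran while reporting one csrow. */
static int report_csrow(unsigned int nr_pages, int channel_count)
{
	int before = multiply_calls;

	/* Silence unused-parameter warnings when debug is compiled out. */
	(void)nr_pages;
	(void)channel_count;

	sketch_debugf("    nr_pages: %u\n", total_pages(nr_pages, channel_count));
	return multiply_calls - before;
}
```

Built without -DSKETCH_DEBUG, report_csrow() performs no multiplication,
because the macro arguments disappear during preprocessing.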

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
  2012-04-17 19:28         ` Mauro Carvalho Chehab
@ 2012-04-17 21:40           ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-17 21:40 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Mark Gross,
	Jason Uhlenkott, Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Tue, Apr 17, 2012 at 04:28:49PM -0300, Mauro Carvalho Chehab wrote:
> OK. Well, we can either multiply nr_pages by channel_count or make it
> clear that this is per channel. I prefer the latter option (see the enclosed
> patch).
> 
> >> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
> >>  	int i, j, empty = 1;
> >>  	enum mem_type mtype;
> >>  	enum edac_type edac_mode;
> >> +	int nr_pages;
> >>  
> >>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
> >>  
> >> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
> >>  			i, pvt->mc_node_id);
> >>  
> >>  		empty = 0;
> >> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
> >> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
> >>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
> >>  		/* 8 bytes of resolution */
> >>  
> >>  		mtype = amd64_determine_memory_type(pvt, i);
> >>  
> >>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> >> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
> >> +		debugf1("    nr_pages: %u\n", nr_pages);
> >>  
> >>  		/*
> >>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> >> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
> >>  		for (j = 0; j < pvt->channel_count; j++) {
> >>  			csrow->channels[j].dimm->mtype = mtype;
> >>  			csrow->channels[j].dimm->edac_mode = edac_mode;
> >> +			csrow->channels[j].dimm->nr_pages = nr_pages;
> > 
> > I'm guessing you want to accumulate the nr_pages for all channels here
> > and dump it properly?
> > 
> 
> Since you asked not to move the debugf0() after the loop, it is easier to
> just multiply it in the printk. As an advantage, when the kernel is
> compiled without debug, no code will be generated for it.
> 
> IMO, the best way to solve it is with this small patch. If you're OK with
> it, I'll fold it into this one and add your ack.
> 
> Regards,
> Mauro
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 8804ac8..6d6ec68 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -2127,7 +2127,7 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>  
>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
> -	debugf0("    nr_pages= %u  channel-count = %d\n",
> +	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
>  		nr_pages, pvt->channel_count);
>  
>  	return nr_pages;
> @@ -2176,7 +2176,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>  		mtype = amd64_determine_memory_type(pvt, i);
>  
>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
> -		debugf1("    nr_pages: %u\n", nr_pages);
> +		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
>  
>  		/*
>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating

Yeah, this is basically what the code did anyway, so yes, please fold it
in and you can add my ACK. I'll continue reviewing the rest tomorrow.
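
A quick way to check that the multiplication matches the removed shift is
the sketch below. The function names are hypothetical; the only fact taken
from the patch is the removed comment stating that the channel count is 1
or 2.

```c
#include <stdint.h>

/* Removed form: shift left by (channel_count - 1). */
static uint32_t total_by_shift(uint32_t per_channel, int channel_count)
{
	return per_channel << (channel_count - 1);
}

/* Folded-in form: multiply by the channel count. */
static uint32_t total_by_multiply(uint32_t per_channel, int channel_count)
{
	return per_channel * (uint32_t)channel_count;
}
```

The two forms agree for one and two channels. They would diverge for three
or more, where the shift yields a power-of-two multiple instead.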

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
  2012-04-17 21:40           ` Borislav Petkov
@ 2012-04-18 12:58             ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 12:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Mark Gross, Jason Uhlenkott, Tim Small,
	Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 17-04-2012 18:40, Borislav Petkov wrote:
> On Tue, Apr 17, 2012 at 04:28:49PM -0300, Mauro Carvalho Chehab wrote:
>> OK. Well, we can either multiply nr_pages by channel_count or make it
>> clear that this is per channel. I prefer the latter option (see the enclosed
>> patch).
>>
>>>> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  	int i, j, empty = 1;
>>>>  	enum mem_type mtype;
>>>>  	enum edac_type edac_mode;
>>>> +	int nr_pages;
>>>>  
>>>>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>>>>  
>>>> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  			i, pvt->mc_node_id);
>>>>  
>>>>  		empty = 0;
>>>> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>>>> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>>>>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>>>>  		/* 8 bytes of resolution */
>>>>  
>>>>  		mtype = amd64_determine_memory_type(pvt, i);
>>>>  
>>>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>>>> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
>>>> +		debugf1("    nr_pages: %u\n", nr_pages);
>>>>  
>>>>  		/*
>>>>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
>>>> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  		for (j = 0; j < pvt->channel_count; j++) {
>>>>  			csrow->channels[j].dimm->mtype = mtype;
>>>>  			csrow->channels[j].dimm->edac_mode = edac_mode;
>>>> +			csrow->channels[j].dimm->nr_pages = nr_pages;
>>>
>>> I'm guessing you want to accumulate the nr_pages for all channels here
>>> and dump it properly?
>>>
>>
>> Since you asked not to move the debugf0() after the loop, it is easier to
>> just multiply it in the printk. As an advantage, when the kernel is
>> compiled without debug, no code will be generated for it.
>>
>> IMO, the best way to solve it is with this small patch. If you're OK with
>> it, I'll fold it into this one and add your ack.
>>
>> Regards,
>> Mauro
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 8804ac8..6d6ec68 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -2127,7 +2127,7 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>>  
>>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>> -	debugf0("    nr_pages= %u  channel-count = %d\n",
>> +	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
>>  		nr_pages, pvt->channel_count);
>>  
>>  	return nr_pages;
>> @@ -2176,7 +2176,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  		mtype = amd64_determine_memory_type(pvt, i);
>>  
>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>> -		debugf1("    nr_pages: %u\n", nr_pages);
>> +		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
>>  
>>  		/*
>>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> 
> Yeah, this is basically what the code did anyway, so yes, please fold it
> in and you can add my ACK. I'll continue reviewing the rest tomorrow.

Thanks!
Mauro
> 
> Thanks.
> 


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct
@ 2012-04-18 12:58             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 12:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Hitoshi Mitake, Mark Gross,
	Dmitry Eremin-Solenikov, Ranganathan Desikan, Egor Martovetsky,
	Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On 17-04-2012 18:40, Borislav Petkov wrote:
> On Tue, Apr 17, 2012 at 04:28:49PM -0300, Mauro Carvalho Chehab wrote:
>> OK. Well, we can either multiply nr_pages by channel_count or make it
>> clear that this is per channel. I prefer the latter option (see the enclosed
>> patch).
>>
>>>> @@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  	int i, j, empty = 1;
>>>>  	enum mem_type mtype;
>>>>  	enum edac_type edac_mode;
>>>> +	int nr_pages;
>>>>  
>>>>  	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
>>>>  
>>>> @@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  			i, pvt->mc_node_id);
>>>>  
>>>>  		empty = 0;
>>>> -		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>>>> +		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
>>>>  		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
>>>>  		/* 8 bytes of resolution */
>>>>  
>>>>  		mtype = amd64_determine_memory_type(pvt, i);
>>>>  
>>>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>>>> -		debugf1("    nr_pages: %u\n", csrow->nr_pages);
>>>> +		debugf1("    nr_pages: %u\n", nr_pages);
>>>>  
>>>>  		/*
>>>>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
>>>> @@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>>>  		for (j = 0; j < pvt->channel_count; j++) {
>>>>  			csrow->channels[j].dimm->mtype = mtype;
>>>>  			csrow->channels[j].dimm->edac_mode = edac_mode;
>>>> +			csrow->channels[j].dimm->nr_pages = nr_pages;
>>>
>>> I'm guessing you want to accumulate the nr_pages for all channels here
>>> and dump it properly?
>>>
>>
>> As you've asked me not to move the debugf0() after the loop, it is
>> easier to just multiply it in the printk. As a bonus, when the kernel
>> is compiled without debug, no code is generated for it.
>>
>> IMO, the best way to solve it is with this small patch. If you're OK with
>> it, I'll fold it into this one and add your ack.
>>
>> Regards,
>> Mauro
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 8804ac8..6d6ec68 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -2127,7 +2127,7 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
>>  	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
>>  
>>  	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
>> -	debugf0("    nr_pages= %u  channel-count = %d\n",
>> +	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
>>  		nr_pages, pvt->channel_count);
>>  
>>  	return nr_pages;
>> @@ -2176,7 +2176,7 @@ static int init_csrows(struct mem_ctl_info *mci)
>>  		mtype = amd64_determine_memory_type(pvt, i);
>>  
>>  		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
>> -		debugf1("    nr_pages: %u\n", nr_pages);
>> +		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
>>  
>>  		/*
>>  		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
> 
> Yeah, this is basically what the code did anyway, so yes, please fold it
> in and you can add my ACK. I'll continue reviewing the rest tomorrow.

Thanks!
Mauro
> 
> Thanks.
> 

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr()
  2012-04-16 20:12   ` [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr() Mauro Carvalho Chehab
@ 2012-04-18 14:06     ` Borislav Petkov
  2012-04-18 15:25       ` Borislav Petkov
                         ` (2 more replies)
  0 siblings, 3 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-18 14:06 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Mon, Apr 16, 2012 at 05:12:11PM -0300, Mauro Carvalho Chehab wrote:
> The edac_align_ptr() function is used to prepare data for a single
> memory allocation kzalloc() call. It counts how many bytes are needed
> by some data structure.
> 
> Using it as-is is not trivial, as the number of elements being reserved
> is not passed to the call itself but shows up only in the following call.
> 
> To avoid mistakes, pass the number of allocated elements to the function,
> making it easier to use.
> 
> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>

AFAICT, this is a new patch, so Aristeu cannot have reviewed it yet. In
that case, you can't simply keep the Reviewed-by tag. Unless he
really did review it and I missed his mail with the tag somehow...?

> Cc: Doug Thompson <norsk5@yahoo.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  drivers/edac/edac_device.c |   27 +++++++++++----------------
>  drivers/edac/edac_mc.c     |   19 +++++++++++++------
>  drivers/edac/edac_module.h |    2 +-
>  drivers/edac/edac_pci.c    |    7 ++++---
>  4 files changed, 29 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
> index 4b15459..cb397d9 100644
> --- a/drivers/edac/edac_device.c
> +++ b/drivers/edac/edac_device.c
> @@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>  	unsigned total_size;
>  	unsigned count;
>  	unsigned instance, block, attr;
> -	void *pvt;
> +	void *pvt, *p;
>  	int err;
>  
>  	debugf4("%s() instances=%d blocks=%d\n",
> @@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>  	 * to be at least as stringent as what the compiler would
>  	 * provide if we could simply hardcode everything into a single struct.
>  	 */
> -	dev_ctl = (struct edac_device_ctl_info *)NULL;
> +	p = NULL;
> +	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
>  
>  	/* Calc the 'end' offset past end of ONE ctl_info structure
>  	 * which will become the start of the 'instance' array
>  	 */
> -	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
> +	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
>  
>  	/* Calc the 'end' offset past the instance array within the ctl_info
>  	 * which will become the start of the block array
>  	 */
> -	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
> +	count = nr_instances * nr_blocks;
> +	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
>  
>  	/* Calc the 'end' offset past the dev_blk array
>  	 * which will become the start of the attrib array, if any.
>  	 */
> -	count = nr_instances * nr_blocks;
> -	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
> -
> -	/* Check for case of when an attribute array is specified */
> -	if (nr_attrib > 0) {
> -		/* calc how many nr_attrib we need */
> +	/* calc how many nr_attrib we need */
> +	if (nr_attrib > 0)
>  		count *= nr_attrib;
> +	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
>  
> -		/* Calc the 'end' offset past the attributes array */
> -		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
> -	} else {
> -		/* no attribute array specificed */
> -		pvt = edac_align_ptr(dev_attrib, sz_private);
> -	}
> +	/* Calc the 'end' offset past the attributes array */
> +	pvt = edac_align_ptr(&p, sz_private, 1);
>  
>  	/* 'pvt' now points to where the private data area is.
>  	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index ffedae9..98de5d1 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -108,9 +108,12 @@ EXPORT_SYMBOL_GPL(edac_mem_types);
>   * If 'size' is a constant, the compiler will optimize this whole function
>   * down to either a no-op or the addition of a constant to the value of 'ptr'.
>   */
> -void *edac_align_ptr(void *ptr, unsigned size)
> +void *edac_align_ptr(void **p, unsigned size, int quant)

Oh, no, pls write it out as 'quantity'. 'quant' on its own means nothing...
OK, it does, but it does not fit in this context:

From The Collaborative International Dictionary of English v.0.48 [gcide]:

  Quant \Quant\, n.
     A punting pole with a broad flange near the end to prevent it
     from sinking into the mud; a setting pole.
     [1913 Webster]

:-)

>  {
>  	unsigned align, r;
> +	void *ptr = *p;
> +
> +	*p += size * quant;
>  
>  	/* Here we assume that the alignment of a "long long" is the most
>  	 * stringent alignment that the compiler will ever provide by default.
> @@ -132,6 +135,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
>  	if (r == 0)
>  		return (char *)ptr;
>  
> +	*p += align - r;
> +

Why increment *p here too - we're returning ptr below? Or are we keeping
the alignment in the original pointer too? Why can't we pass the aligned
pointer from the previous pass? I.e., do

	p = NULL;
	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);

and then do

	dev_inst = edac_align_ptr(&dev_ctl, sizeof(*dev_inst), nr_instances);

In any case, this is not trivial so the function needs a bunch of comments.
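
To make the question concrete, here is a minimal userspace sketch of the
cursor idea. It is not the patch's code: it assumes a hypothetical helper,
layout_reserve(), that takes an explicit alignment instead of deriving it
from the element size, and it works on plain offsets rather than pointers.
It shows why the cursor has to absorb the padding as well: the next region
must start after the aligned end of the previous one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not the kernel's edac_align_ptr(): reserve room for
 * 'count' elements of 'size' bytes, starting at the next offset that is a
 * multiple of 'align' (which must be a power of two).
 *
 * Returns the offset of the first element and advances *cursor past the
 * last one, so any padding is charged to the cursor as well.
 */
static size_t layout_reserve(size_t *cursor, size_t size, size_t align,
			     size_t count)
{
	size_t start;

	assert(align != 0 && (align & (align - 1)) == 0);

	start = (*cursor + align - 1) & ~(align - 1);	/* round up */
	*cursor = start + size * count;

	return start;
}

/* Stand-in structures; the real EDAC structs are much larger. */
struct demo_ctl  { long id; char tag; };
struct demo_inst { int nr; };
struct demo_blk  { long long value; char flag; };

/*
 * Total bytes a single allocation would need for 1 ctl, 3 inst and 2 blk
 * elements. The offsets of the inst and blk arrays are passed back.
 */
static size_t demo_total_size(size_t *inst_off, size_t *blk_off)
{
	size_t cursor = 0;

	layout_reserve(&cursor, sizeof(struct demo_ctl),
		       _Alignof(struct demo_ctl), 1);
	*inst_off = layout_reserve(&cursor, sizeof(struct demo_inst),
				   _Alignof(struct demo_inst), 3);
	*blk_off = layout_reserve(&cursor, sizeof(struct demo_blk),
				  _Alignof(struct demo_blk), 2);

	return cursor;
}
```

With the cursor at 5, reserving three 4-byte elements aligned to 4 returns
offset 8 and leaves the cursor at 20; the 3 bytes of padding are part of
what the final allocation size has to cover.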

>  	return (void *)(((unsigned long)ptr) + align - r);
>  }
>  
> @@ -154,6 +159,7 @@ void *edac_align_ptr(void *ptr, unsigned size)
>  struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  				unsigned nr_chans, int edac_index)
>  {
> +	void *ptr;
>  	struct mem_ctl_info *mci;
>  	struct csrow_info *csi, *csrow;
>  	struct rank_info *chi, *chp, *chan;
> @@ -168,11 +174,12 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 * stringent as what the compiler would provide if we could simply
>  	 * hardcode everything into a single struct.
>  	 */
> -	mci = (struct mem_ctl_info *)0;
> -	csi = edac_align_ptr(&mci[1], sizeof(*csi));
> -	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
> -	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
> -	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
> +	ptr = 0;

Declare it above like this:

	void *ptr = NULL;

> +	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
> +	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
> +	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
> +	dimm = edac_align_ptr(ptr, sizeof(*dimm), nr_csrows * nr_chans);
> +	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
>  	size = ((unsigned long)pvt) + sz_pvt;
>  
>  	mci = kzalloc(size, GFP_KERNEL);
> diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
> index 00f81b4..0be4b01 100644
> --- a/drivers/edac/edac_module.h
> +++ b/drivers/edac/edac_module.h
> @@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
>  					   *edac_dev, unsigned long value);
>  extern void edac_mc_reset_delay_period(int value);
>  
> -extern void *edac_align_ptr(void *ptr, unsigned size);
> +extern void *edac_align_ptr(void **p, unsigned size, int quant);
>  
>  /*
>   * EDAC PCI functions
> diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
> index 63af1c5..9016560 100644
> --- a/drivers/edac/edac_pci.c
> +++ b/drivers/edac/edac_pci.c
> @@ -42,13 +42,14 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
>  						const char *edac_pci_name)
>  {
>  	struct edac_pci_ctl_info *pci;
> -	void *pvt;
> +	void *p, *pvt;
>  	unsigned int size;
>  
>  	debugf1("%s()\n", __func__);
>  
> -	pci = (struct edac_pci_ctl_info *)0;
> -	pvt = edac_align_ptr(&pci[1], sz_pvt);
> +	p = 0;

ditto.

> +	pci = edac_align_ptr(&p, sizeof(*pci), 1);
> +	pvt = edac_align_ptr(&p, 1, sz_pvt);
>  	size = ((unsigned long)pvt) + sz_pvt;
>  
>  	/* Alloc the needed control struct memory */
> -- 
> 1.7.8
> 

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr()
  2012-04-18 14:06     ` Borislav Petkov
@ 2012-04-18 15:25       ` Borislav Petkov
  2012-04-18 18:15       ` Mauro Carvalho Chehab
  2012-04-18 18:19       ` [PATCH] " Mauro Carvalho Chehab
  2 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-18 15:25 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Wed, Apr 18, 2012 at 04:06:40PM +0200, Borislav Petkov wrote:
> Why increment *p here too - we're returning ptr below? Or are we keeping
> the alignment in the original pointer too? Why can't we pass the aligned
> pointer from the previous pass? I.e., do
> 
> 	p = NULL;
> 	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
> 
> and then do
> 
> 	dev_inst = edac_align_ptr(&dev_ctl, sizeof(*dev_inst), nr_instances);
> 
> In any case, this is not trivial so the function needs a bunch of comments.

And broken it is too:

[    8.813572] BUG: unable to handle kernel NULL pointer dereference at 0000000000000740
[    8.817562] IP: [<ffffffffa0001084>] edac_align_ptr+0x9/0x5d [edac_core]
[    8.817562] PGD 4238d4067 PUD 4238d6067 PMD 0 
[    8.817562] Oops: 0000 [#1] SMP 
[    8.817562] CPU 0 
[    8.817562] Modules linked in: radeon(+) ttm amd64_edac_mod(+) drm_kms_helper e1000e hwmon backlight ohci_hcd cfbcopyarea cfbimgblt cfbfillrect ehci_hcd edac_core
[    8.817562] 
[    8.817562] Pid: 1557, comm: work_for_cpu Not tainted 3.3.0+ #6 AMD
[    8.817562] RIP: 0010:[<ffffffffa0001084>]  [<ffffffffa0001084>] edac_align_ptr+0x9/0x5d [edac_core]
[    8.817562] RSP: 0018:ffff880425213d60  EFLAGS: 00010246
[    8.817562] RAX: 00000000000005c0 RBX: 0000000000000002 RCX: 00000000000005c0
[    8.817562] RDX: 0000000000000010 RSI: 0000000000000044 RDI: 0000000000000740
[    8.817562] RBP: ffff880425213d60 R08: 0000000000000008 R09: 0000000000000740
[    8.817562] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000008
[    8.817562] R13: 0000000000000000 R14: ffff880425213d98 R15: 0000000000000010
[    8.817562] FS:  00007f0f12ed2740(0000) GS:ffff880427c00000(0000) knlGS:0000000000000000
[    8.817562] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[    8.817562] CR2: 0000000000000740 CR3: 00000004238d1000 CR4: 00000000000006f0
[    8.817562] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    8.817562] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[    8.817562] Process work_for_cpu (pid: 1557, threadinfo ffff880425212000, task ffff880425bed9c0)
[    8.817562] Stack:
[    8.817562]  ffff880425213dd0 ffffffffa0001157 ffff880425213d80 ffffffff00000000
[    8.817562]  00000000000005c0 00000000000001c0 0000000000000088 0000000000000740
[    8.817562]  0000000000000000 ffff880425d91000 0000000000000170 ffff880425ef2c90
[    8.817562] Call Trace:
[    8.817562]  [<ffffffffa0001157>] edac_mc_alloc+0x7f/0x1b9 [edac_core]
[    8.817562]  [<ffffffffa005bdef>] amd64_probe_one_instance+0xf8b/0x149a [amd64_edac_mod]
[    8.817562]  [<ffffffff811e7aac>] local_pci_probe+0x4d/0x96
[    8.817562]  [<ffffffff81042ecf>] ? cwq_dec_nr_in_flight+0x7b/0x7b
[    8.817562]  [<ffffffff81042ee7>] do_work_for_cpu+0x18/0x2a
[    8.817562]  [<ffffffff81048d0d>] kthread+0x89/0x91
[    8.817562]  [<ffffffff8144ca64>] kernel_thread_helper+0x4/0x10
[    8.817562]  [<ffffffff81444f4a>] ? retint_restore_args+0xe/0xe
[    8.817562]  [<ffffffff81048c84>] ? kthread_freezable_should_stop+0x57/0x57
[    8.817562]  [<ffffffff8144ca60>] ? gs_change+0xb/0xb
[    8.817562] Code: 5e 00 a0 31 c0 48 89 d6 e8 36 12 44 e1 48 89 df e8 c5 1a 00 00 48 89 df e8 fb cd 0e e1 41 59 5b c9 c3 55 48 89 e5 66 66 66 66 90 <48> 8b 0f 41 b8 08 00 00 00 41 89 d1 44 0f af ce 83 fe 08 4e 8d 
[    8.817562] RIP  [<ffffffffa0001084>] edac_align_ptr+0x9/0x5d [edac_core]
[    8.817562]  RSP <ffff880425213d60>
[    8.817562] CR2: 0000000000000740
[    9.099191] ---[ end trace 8214e97f27078aee ]---
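
One possible reading of the trace, offered as an assumption and not a
confirmed diagnosis: CR2 and RDI both hold 0x740 and the faulting bytes
(48 8b 0f) are a load through RDI, so the first argument of
edac_align_ptr() looks like a small layout offset rather than the address
of the cursor. The dimm line in the edac_mc_alloc() hunk of the patch
passes ptr where the other calls pass &ptr, which would fit, and C accepts
that silently because void * converts to void ** without a cast. The
sketch below uses a hypothetical reserve() helper and made-up sizes to
show that the cursor is only ever a byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the reworked helper: returns the current
 * offset and advances the cursor by size * quantity bytes.
 */
static uintptr_t reserve(uintptr_t *cursor, unsigned size, unsigned quantity)
{
	uintptr_t start;

	assert(cursor != NULL);
	start = *cursor;	/* first access: faults if 'cursor' is bogus */
	*cursor += (uintptr_t)size * quantity;

	return start;
}

/* Illustrative sizes only; they are not the real EDAC struct sizes. */
static uintptr_t cursor_after_three_regions(unsigned nr_rows,
					    unsigned nr_chans)
{
	uintptr_t cursor = 0;	/* layout is computed relative to address 0 */

	reserve(&cursor, 64, 1);
	reserve(&cursor, 32, nr_rows);
	reserve(&cursor, 16, nr_rows * nr_chans);

	return cursor;	/* a byte count, not a dereferenceable address */
}
```

For 8 rows and 2 channels the cursor ends at 576 (0x240). Handing that
value to the helper in place of &cursor would make its first load read
from address 0x240, which has the same shape as the fault at 0x740 above.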

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* [PATCH] edac: move nr_pages to dimm struct
  2012-04-17 21:40           ` Borislav Petkov
@ 2012-04-18 17:53             ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 17:53 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Mark Gross,
	Jason Uhlenkott, Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMMs that
properly represent the DIMM stick properties, including their size.

A remaining TODO is to properly represent dual-rank/quad-rank DIMMs when
the memory controller represents the memory via chip select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v14: Fix two debug messages to properly report the number of pages
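
As a rough illustration of the accounting after this change (a sketch with
hypothetical demo_* types, not the EDAC structures, and assuming 4 KiB
pages): the csrow size becomes the sum of nr_pages over the DIMMs on its
channels. For amd64, where the channel count is 1 or 2, multiplying the
per-channel value by the channel count gives the same total as the shift
that this patch removes.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define DEMO_MAX_CHANNELS	2
#define DEMO_PAGES_TO_MIB(pages) ((pages) >> (20 - DEMO_PAGE_SHIFT))

/* Hypothetical, cut-down stand-ins for dimm_info and csrow_info. */
struct demo_dimm {
	unsigned nr_pages;
};

struct demo_csrow {
	unsigned nr_channels;
	struct demo_dimm dimm[DEMO_MAX_CHANNELS];
};

/* New scheme: a csrow's size is the sum over the DIMMs on its channels. */
static unsigned demo_csrow_pages(const struct demo_csrow *csrow)
{
	unsigned i, total = 0;

	assert(csrow->nr_channels <= DEMO_MAX_CHANNELS);

	for (i = 0; i < csrow->nr_channels; i++)
		total += csrow->dimm[i].nr_pages;

	return total;
}

/* Old amd64 scheme: one per-csrow total, doubled for dual channel. */
static unsigned demo_old_total(unsigned pages_per_channel,
			       unsigned channel_count)
{
	return pages_per_channel << (channel_count - 1);
}
```

With two channels of 0x40000 pages each, the sum is 0x80000 pages, which
is 2048 MiB at 4 KiB per page and matches what the old shift produced.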

 drivers/edac/amd64_edac.c      |   14 ++++-------
 drivers/edac/amd76x_edac.c     |    6 ++--
 drivers/edac/cell_edac.c       |    8 ++++--
 drivers/edac/cpc925_edac.c     |    8 ++++--
 drivers/edac/e752x_edac.c      |    6 +++-
 drivers/edac/e7xxx_edac.c      |    5 ++-
 drivers/edac/edac_mc.c         |   16 ++++++++-----
 drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
 drivers/edac/i3000_edac.c      |    6 +++-
 drivers/edac/i3200_edac.c      |    3 +-
 drivers/edac/i5000_edac.c      |   14 ++++++-----
 drivers/edac/i5100_edac.c      |   22 +++++++++++-------
 drivers/edac/i5400_edac.c      |    9 ++-----
 drivers/edac/i7300_edac.c      |   22 +++++-------------
 drivers/edac/i7core_edac.c     |   10 ++------
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    5 ++-
 drivers/edac/i82975x_edac.c    |   11 ++++++--
 drivers/edac/mpc85xx_edac.c    |    3 +-
 drivers/edac/mv64x60_edac.c    |    3 +-
 drivers/edac/pasemi_edac.c     |   14 ++++++------
 drivers/edac/ppc4xx_edac.c     |    5 ++-
 drivers/edac/r82600_edac.c     |    3 +-
 drivers/edac/sb_edac.c         |    8 +-----
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    4 +-
 include/linux/edac.h           |    8 ++++--
 28 files changed, 145 insertions(+), 121 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 0be3f29..6d6ec68 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2126,14 +2126,8 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
-	/*
-	 * If dual channel then double the memory size of single channel.
-	 * Channel count is 1 or 2
-	 */
-	nr_pages <<= (pvt->channel_count - 1);
-
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
-	debugf0("    nr_pages= %u  channel-count = %d\n",
+	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
 
 	return nr_pages;
@@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 	int i, j, empty = 1;
 	enum mem_type mtype;
 	enum edac_type edac_mode;
+	int nr_pages;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
 			i, pvt->mc_node_id);
 
 		empty = 0;
-		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
+		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", csrow->nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
@@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		for (j = 0; j < pvt->channel_count; j++) {
 			csrow->channels[j].dimm->mtype = mtype;
 			csrow->channels[j].dimm->edac_mode = edac_mode;
+			csrow->channels[j].dimm->nr_pages = nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 2a63ed0..1532750 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -205,10 +205,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		mba_mask = ((mba & 0xff80) << 16) | 0x7fffffUL;
 		pci_read_config_dword(pdev, AMD76X_DRAM_MODE_STATUS, &dms);
 		csrow->first_page = mba_base >> PAGE_SHIFT;
-		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		dimm->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->grain = dimm->nr_pages << PAGE_SHIFT;
 		dimm->mtype = MEM_RDDR;
 		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
 		dimm->edac_mode = edac_mode;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 94fbb12..09e1b5d 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -128,6 +128,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
 	int				j;
+	u32				nr_pages;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,19 +143,20 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		if (of_node_to_nid(np) != priv->node)
 			continue;
 		csrow->first_page = r.start >> PAGE_SHIFT;
-		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = resource_size(&r) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
 			dimm->mtype = MEM_XDR;
 			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, csrow->nr_pages);
+			csrow->first_page, dimm->nr_pages);
 		break;
 	}
 }
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index ee90f3d..7b764a8 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -332,7 +332,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	struct dimm_info *dimm;
 	int index, j;
 	u32 mbmr, mbbar, bba;
-	unsigned long row_size, last_nr_pages = 0;
+	unsigned long row_size, nr_pages, last_nr_pages = 0;
 
 	get_total_mem(pdata);
 
@@ -351,12 +351,14 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 		row_size = bba * (1UL << 28);	/* 256M */
 		csrow->first_page = last_nr_pages;
-		csrow->nr_pages = row_size >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = row_size >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->mtype = MEM_RDDR;
 			dimm->edac_mode = EDAC_SECDED;
 
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index db291ea..6d81d3c 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size, i;
+	u32 dra, drc, cumul_size, i, nr_pages;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1078,11 +1078,13 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (i = 0; i < drc_chan + 1; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
+
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 178d2af..aeb69f0 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -349,7 +349,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	unsigned long last_cumul_size;
 	int index, j;
 	u8 value;
-	u32 dra, cumul_size;
+	u32 dra, cumul_size, nr_pages;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
@@ -380,12 +380,13 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < drc_chan + 1; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f83e63d..ffedae9 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,9 +43,10 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
+	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -55,7 +56,6 @@ static void edac_mc_dump_csrow(struct csrow_info *csrow)
 	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
 	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
 	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
 	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
 	debugf4("\tcsrow->channels = %p\n", csrow->channels);
 	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
@@ -652,15 +652,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 {
 	struct csrow_info *csrows = mci->csrows;
-	int row, i;
+	int row, i, j, n;
 
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
-
-		if (csrow->nr_pages == 0)
+		n = 0;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n == 0)
 			continue;
 
 		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d63904e..c0275e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
+	int i;
+	u32 nr_pages = 0;
+
+	for (i = 0; i < csrow->nr_channels; i++)
+		nr_pages += csrow->channels[i].dimm->nr_pages;
+
+	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
 }
 
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
@@ -519,16 +525,16 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
 
 static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
-	int total_pages, csrow_idx;
+	int total_pages = 0, csrow_idx, j;
 
-	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
-		csrow_idx++) {
+	for (csrow_idx = 0; csrow_idx < mci->nr_csrows; csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
-		if (!csrow->nr_pages)
-			continue;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
 
-		total_pages += csrow->nr_pages;
+			total_pages += dimm->nr_pages;
+		}
 	}
 
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
@@ -900,7 +906,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
  */
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	int i, j;
 	int err;
 	struct csrow_info *csrow;
 	struct kobject *kobj_mci = &mci->edac_mci_kobj;
@@ -934,10 +940,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
 	for (i = 0; i < mci->nr_csrows; i++) {
+		int nr_pages = 0;
+
 		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
 
-		/* Only expose populated CSROWs */
-		if (csrow->nr_pages > 0) {
+		if (nr_pages > 0) {
 			err = edac_create_csrow_object(mci, csrow, i);
 			if (err) {
 				debugf1("%s() failure: create csrow %d obj\n",
@@ -949,10 +958,14 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	return 0;
 
-	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (mci->csrows[i].nr_pages > 0)
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
 	}
 
@@ -972,14 +985,20 @@ fail0:
  */
 void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	struct csrow_info *csrow;
+	int i, j;
 
 	debugf0("%s()\n", __func__);
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
 	for (i = 0; i < mci->nr_csrows; i++) {
-		if (mci->csrows[i].nr_pages > 0) {
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0) {
 			debugf0("%s()  unreg csrow-%d\n", __func__, i);
 			kobject_put(&mci->csrows[i].kobj);
 		}
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 1498c5f..bf8a230 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -306,7 +306,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_cumul_size;
+	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
 	unsigned char *c0dra = dra, *c1dra = &dra[I3000_RANKS_PER_CHANNEL / 2];
@@ -391,11 +391,13 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = I3000_DEAP_GRAIN;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d8fa7f3..b82667f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -376,11 +376,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index f00f684..e8d32e8 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1236,6 +1236,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
 	struct csrow_info *p_csrow;
+	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
 	int mtr, mtr1;
@@ -1265,21 +1266,22 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
+			dimm = p_csrow->channels[channel].dimm;
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-			p_csrow->channels[channel].dimm->grain = 8;
+			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
-			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
+			dimm->mtype = MEM_FB_DDR2;
 
 			/* ask what device type on this row */
 			if (MTR_DRAM_WIDTH(mtr))
-				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 			else
-				p_csrow->channels[channel].dimm->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 
-			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
 		}
-		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 8da7ce1..a0219a9 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,7 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
@@ -867,14 +866,19 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		total_pages += npages;
 
 		dimm = mci->csrows[i].channels[0].dimm;
-		dimm->grain = 32;
-		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-			      DEV_X4 : DEV_X8;
-		dimm->mtype = MEM_RDDR2;
-		dimm->edac_mode = EDAC_SECDED;
-		snprintf(dimm->label, sizeof(dimm->label),
-			 "DIMM%u",
-			 i5100_rank_to_slot(mci, chan, rank));
+		dimm->nr_pages = npages;
+		if (npages) {
+			total_pages += npages;
+
+			dimm->grain = 32;
+			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+				DEV_X4 : DEV_X8;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+			snprintf(dimm->label, sizeof(dimm->label),
+				"DIMM%u",
+				i5100_rank_to_slot(mci, chan, rank));
+		}
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 4a23813..784d6dc 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1156,7 +1156,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int empty, channel_count;
 	int max_csrows;
 	int mtr;
-	int csrow_megs;
+	int size_mb;
 	int channel;
 	int csrow;
 	struct dimm_info *dimm;
@@ -1171,8 +1171,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
-		p_csrow->csrow_idx = csrow;
-
 		/* use branch 0 for the basis */
 		mtr = determine_mtr(pvt, csrow, 0);
 
@@ -1180,12 +1178,11 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[csrow][channel].megabytes;
 
-			p_csrow->nr_pages = csrow_megs << 8;
 			dimm = p_csrow->channels[channel].dimm;
+			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
 			dimm->mtype = MEM_RDDR2;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index df6cd59..5e594ae 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -617,9 +617,7 @@ static void i7300_enable_error_reporting(struct mem_ctl_info *mci)
 static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
-		      struct csrow_info *p_csrow,
-		      struct dimm_info *dimm,
-		      u32 *nr_pages)
+		      struct dimm_info *dimm)
 {
 	int mtr, ans, addrBits, channel;
 
@@ -651,7 +649,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	addrBits -= 3;	/* 8 bits per bytes */
 
 	dinfo->megabytes = 1 << addrBits;
-	*nr_pages = dinfo->megabytes << 8;
 
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 
@@ -664,8 +661,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->csrow_idx = slot;
-
 	/*
 	 * The type of error detection actually depends of the
 	 * mode of operation. When it is just one single memory chip, at
@@ -675,6 +670,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->nr_pages = MiB_TO_PAGES(dinfo->megabytes);
 	dimm->grain = 8;
 	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
@@ -774,11 +770,9 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i7300_pvt *pvt;
 	struct i7300_dimm_info *dinfo;
-	struct csrow_info *p_csrow;
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -804,7 +798,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
-	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
@@ -815,21 +808,18 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
-				p_csrow = &mci->csrows[slot];
 
-				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, dimm,
-						 &nr_pages);
+						 dinfo, dimm);
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
 
-				/* Update per_csrow memory count */
-				p_csrow->nr_pages += nr_pages;
-
 				rc = 0;
+
 			}
 		}
 	}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 89ccec6..d566797 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -715,17 +715,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->nr_pages = npages;
-
-			csr->csrow_idx = csrow;
-			csr->nr_channels = 1;
-
-			csr->channels[0].chan_idx = i;
-			csr->channels[0].ce_count = 0;
 
 			pvt->csrow_map[i][j] = csrow;
 
 			dimm = csr->channels[0].dimm;
+			dimm->nr_pages = npages;
+
 			switch (banks) {
 			case 4:
 				dimm->dtype = DEV_X4;
@@ -746,6 +741,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
+			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 1e19492..74166ae 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -220,7 +220,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		row_base = row_high_limit_last;
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
 		dimm->grain = 1 << 12;
 		dimm->mtype = mtype;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index acbd924..48e0ecd 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		dimm->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
 		dimm->mtype = MEM_RMBS;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 81f79e2..dc207dc 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -347,7 +347,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
@@ -371,12 +371,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_chans; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_chans;
 			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
 			dimm->mtype = MEM_DDR;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0b40e11..304af1d 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -370,7 +370,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
 	enum dev_type dtype;
@@ -402,6 +402,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
 			cumul_size);
 
+		nr_pages = cumul_size - last_cumul_size;
 		/*
 		 * Initialise dram labels
 		 * index values:
@@ -411,6 +412,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index].channels[chan].dimm;
+
+			if (!nr_pages)
+				continue;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
@@ -420,12 +426,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 			dimm->edac_mode = EDAC_SECDED; /* only supported */
 		}
 
-		if (cumul_size == last_cumul_size)
+		if (!nr_pages)
 			continue;	/* not populated */
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 	}
 }
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index fb92916..c1d9e15 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -947,7 +947,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 
 		csrow->first_page = start;
 		csrow->last_page = end;
-		csrow->nr_pages = end + 1 - start;
+
+		dimm->nr_pages = end + 1 - start;
 		dimm->grain = 8;
 		dimm->mtype = mtype;
 		dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index d2e3c39..281e245 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -667,7 +667,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
-	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
+
+	dimm->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 4e53270..3fcefda 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -153,20 +153,20 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		switch ((rankcfg & MCDRAM_RANKCFG_TYPE_SIZE_M) >>
 			MCDRAM_RANKCFG_TYPE_SIZE_S) {
 		case 0:
-			csrow->nr_pages = 128 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 128 << (20 - PAGE_SHIFT);
 			break;
 		case 1:
-			csrow->nr_pages = 256 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 256 << (20 - PAGE_SHIFT);
 			break;
 		case 2:
 		case 3:
-			csrow->nr_pages = 512 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 512 << (20 - PAGE_SHIFT);
 			break;
 		case 4:
-			csrow->nr_pages = 1024 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 1024 << (20 - PAGE_SHIFT);
 			break;
 		case 5:
-			csrow->nr_pages = 2048 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 2048 << (20 - PAGE_SHIFT);
 			break;
 		default:
 			edac_mc_printk(mci, KERN_ERR,
@@ -176,8 +176,8 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		}
 
 		csrow->first_page = last_page_in_mmc;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		last_page_in_mmc += csrow->nr_pages;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
+		last_page_in_mmc += dimm->nr_pages;
 		csrow->page_mask = 0;
 		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
 		dimm->mtype = MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ec5e529..95cfc0f 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -896,7 +896,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum dev_type dtype;
 	enum edac_type edac_mode;
 	int row, j;
-	u32 mbxcf, size;
+	u32 mbxcf, size, nr_pages;
 
 	/* Establish the memory type and width */
 
@@ -947,7 +947,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		case SDRAM_MBCF_SZ_2GB:
 		case SDRAM_MBCF_SZ_4GB:
 		case SDRAM_MBCF_SZ_8GB:
-			csi->nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
+			nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
 			break;
 		default:
 			ppc4xx_edac_mc_printk(KERN_ERR, mci,
@@ -973,6 +973,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		for (j = 0; j < csi->nr_channels; j++) {
 			struct dimm_info *dimm = csi->channels[j].dimm;
 
+			dimm->nr_pages  = nr_pages / csi->nr_channels;
 			dimm->grain	= 1;
 
 			dimm->mtype	= mtype;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 414a532..19f3a10 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -249,7 +249,8 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
 		dimm->grain = 1 << 14;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index cf53007..ee1543d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -561,7 +561,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
-	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -613,11 +612,11 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
-	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
+			struct dimm_info *dimm = &mci->dimms[j];
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -642,15 +641,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->nr_pages = npages;
-				csr->csrow_idx = csrow;
-				csr->nr_channels = 1;
-				csr->channels[0].chan_idx = i;
 				pvt->csrow_map[i][j] = csrow;
 				last_page += npages;
 				csrow++;
 
 				csr->channels[0].dimm = dimm;
+				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
 				dimm->mtype = mtype;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index ba0917b..6314ff9 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,7 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
+	dimm->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 7be10dd..0de288f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -373,10 +373,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < x38_channel_num; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / x38_channel_num;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5244193..8b78bd0 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -320,6 +320,8 @@ struct dimm_info {
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
+	u32 nr_pages;			/* number of pages in dimm */
+
 	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
@@ -346,12 +348,12 @@ struct rank_info {
 };
 
 struct csrow_info {
+	/* Used only by edac_mc_find_csrow_by_page() */
 	unsigned long first_page;	/* first page number in csrow */
 	unsigned long last_page;	/* last page number in csrow */
-	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
-					 * 0UL for non intlv
-					 */
+					 * 0UL for non intlv */
+
 	int csrow_idx;			/* the chip-select row */
 
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
-- 
1.7.8
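
Reader's note on the include/linux/edac.h hunk above: once nr_pages moves
from struct csrow_info to struct dimm_info, any code that needs a per-csrow
size has to sum the pages of the DIMMs behind each channel, as the
edac_mc_sysfs.c hunks do. A minimal sketch of that pattern follows. The
structs are trimmed stand-ins (only the fields the loop touches), and
sketch_csrow_nr_pages() is a hypothetical helper name, not a function
introduced by this patch.

```c
#include <stdint.h>

/* Trimmed stand-ins for the kernel structures; only the fields used here. */
struct dimm_info {
	uint32_t nr_pages;		/* pages on this DIMM */
};

struct rank_info {
	struct dimm_info *dimm;		/* DIMM behind this channel */
};

struct csrow_info {
	int nr_channels;
	struct rank_info *channels;
};

/* Hypothetical helper: total pages in a csrow, summed over its channels. */
static uint32_t sketch_csrow_nr_pages(const struct csrow_info *csrow)
{
	uint32_t nr_pages = 0;
	int j;

	for (j = 0; j < csrow->nr_channels; j++)
		nr_pages += csrow->channels[j].dimm->nr_pages;

	return nr_pages;
}
```

A csrow whose sum is zero is treated as unpopulated, which is the same test
the patch uses before creating or releasing a csrow kobject.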



* [PATCH] edac: move nr_pages to dimm struct
@ 2012-04-18 17:53             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 17:53 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Hitoshi Mitake,
	Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson, Andrew Morton,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, linuxppc-dev

The number of pages is a dimm property. Move it to the dimm struct.

After this change, it is possible to add sysfs nodes for the DIMMs that
properly represent the properties of each DIMM stick, including its size.

A remaining TODO is to properly represent dual-rank/quad-rank DIMMs when
the memory controller exposes the memory via chip-select rows.

Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
Acked-by: Borislav Petkov <borislav.petkov@amd.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v14: Fix two debug messages to properly report the number of pages
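
Reader's note on the page accounting: several drivers in this patch compute
a per-csrow page count and then store an even share in each DIMM
(nr_pages / nr_channels), while amd64_edac.c now keeps a per-channel count
and multiplies by channel_count only when printing the csrow total. The
sketch below illustrates that arithmetic under stated assumptions: 4 KiB
pages (SKETCH_PAGE_SHIFT is a stand-in for the kernel's PAGE_SHIFT) and
hypothetical helper names that do not exist in the EDAC core.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* MiB to pages, the same shift the drivers write as (20 - PAGE_SHIFT). */
static uint32_t sketch_mib_to_pages(uint32_t mib)
{
	return mib << (20 - SKETCH_PAGE_SHIFT);
}

/* Even per-DIMM share of a csrow; integer division truncates. */
static uint32_t sketch_pages_per_dimm(uint32_t csrow_pages, int nr_channels)
{
	return nr_channels > 0 ? csrow_pages / (uint32_t)nr_channels : 0;
}

/* csrow total when only the per-channel count is known. */
static uint32_t sketch_csrow_total(uint32_t pages_per_channel,
				   int channel_count)
{
	return pages_per_channel * (uint32_t)channel_count;
}
```

The even split is an assumption those drivers make; when the csrow page
count is not a multiple of the channel count, the per-DIMM values no longer
sum exactly to the original total.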

 drivers/edac/amd64_edac.c      |   14 ++++-------
 drivers/edac/amd76x_edac.c     |    6 ++--
 drivers/edac/cell_edac.c       |    8 ++++--
 drivers/edac/cpc925_edac.c     |    8 ++++--
 drivers/edac/e752x_edac.c      |    6 +++-
 drivers/edac/e7xxx_edac.c      |    5 ++-
 drivers/edac/edac_mc.c         |   16 ++++++++-----
 drivers/edac/edac_mc_sysfs.c   |   47 ++++++++++++++++++++++++++++------------
 drivers/edac/i3000_edac.c      |    6 +++-
 drivers/edac/i3200_edac.c      |    3 +-
 drivers/edac/i5000_edac.c      |   14 ++++++-----
 drivers/edac/i5100_edac.c      |   22 +++++++++++-------
 drivers/edac/i5400_edac.c      |    9 ++-----
 drivers/edac/i7300_edac.c      |   22 +++++-------------
 drivers/edac/i7core_edac.c     |   10 ++------
 drivers/edac/i82443bxgx_edac.c |    2 +-
 drivers/edac/i82860_edac.c     |    2 +-
 drivers/edac/i82875p_edac.c    |    5 ++-
 drivers/edac/i82975x_edac.c    |   11 ++++++--
 drivers/edac/mpc85xx_edac.c    |    3 +-
 drivers/edac/mv64x60_edac.c    |    3 +-
 drivers/edac/pasemi_edac.c     |   14 ++++++------
 drivers/edac/ppc4xx_edac.c     |    5 ++-
 drivers/edac/r82600_edac.c     |    3 +-
 drivers/edac/sb_edac.c         |    8 +-----
 drivers/edac/tile_edac.c       |    2 +-
 drivers/edac/x38_edac.c        |    4 +-
 include/linux/edac.h           |    8 ++++--
 28 files changed, 145 insertions(+), 121 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 0be3f29..6d6ec68 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2126,14 +2126,8 @@ static u32 amd64_csrow_nr_pages(struct amd64_pvt *pvt, u8 dct, int csrow_nr)
 
 	nr_pages = pvt->ops->dbam_to_cs(pvt, dct, cs_mode) << (20 - PAGE_SHIFT);
 
-	/*
-	 * If dual channel then double the memory size of single channel.
-	 * Channel count is 1 or 2
-	 */
-	nr_pages <<= (pvt->channel_count - 1);
-
 	debugf0("  (csrow=%d) DBAM map index= %d\n", csrow_nr, cs_mode);
-	debugf0("    nr_pages= %u  channel-count = %d\n",
+	debugf0("    nr_pages/channel= %u  channel-count = %d\n",
 		nr_pages, pvt->channel_count);
 
 	return nr_pages;
@@ -2152,6 +2146,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 	int i, j, empty = 1;
 	enum mem_type mtype;
 	enum edac_type edac_mode;
+	int nr_pages;
 
 	amd64_read_pci_cfg(pvt->F3, NBCFG, &val);
 
@@ -2174,14 +2169,14 @@ static int init_csrows(struct mem_ctl_info *mci)
 			i, pvt->mc_node_id);
 
 		empty = 0;
-		csrow->nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
+		nr_pages = amd64_csrow_nr_pages(pvt, 0, i);
 		get_cs_base_and_mask(pvt, i, 0, &base, &mask);
 		/* 8 bytes of resolution */
 
 		mtype = amd64_determine_memory_type(pvt, i);
 
 		debugf1("  for MC node %d csrow %d:\n", pvt->mc_node_id, i);
-		debugf1("    nr_pages: %u\n", csrow->nr_pages);
+		debugf1("    nr_pages: %u\n", nr_pages * pvt->channel_count);
 
 		/*
 		 * determine whether CHIPKILL or JUST ECC or NO ECC is operating
@@ -2195,6 +2190,7 @@ static int init_csrows(struct mem_ctl_info *mci)
 		for (j = 0; j < pvt->channel_count; j++) {
 			csrow->channels[j].dimm->mtype = mtype;
 			csrow->channels[j].dimm->edac_mode = edac_mode;
+			csrow->channels[j].dimm->nr_pages = nr_pages;
 		}
 	}
 
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 2a63ed0..1532750 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -205,10 +205,10 @@ static void amd76x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		mba_mask = ((mba & 0xff80) << 16) | 0x7fffffUL;
 		pci_read_config_dword(pdev, AMD76X_DRAM_MODE_STATUS, &dms);
 		csrow->first_page = mba_base >> PAGE_SHIFT;
-		csrow->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		dimm->nr_pages = (mba_mask + 1) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
 		csrow->page_mask = mba_mask >> PAGE_SHIFT;
-		dimm->grain = csrow->nr_pages << PAGE_SHIFT;
+		dimm->grain = dimm->nr_pages << PAGE_SHIFT;
 		dimm->mtype = MEM_RDDR;
 		dimm->dtype = ((dms >> index) & 0x1) ? DEV_X4 : DEV_UNKNOWN;
 		dimm->edac_mode = edac_mode;
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index 94fbb12..09e1b5d 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -128,6 +128,7 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 	struct cell_edac_priv		*priv = mci->pvt_info;
 	struct device_node		*np;
 	int				j;
+	u32				nr_pages;
 
 	for (np = NULL;
 	     (np = of_find_node_by_name(np, "memory")) != NULL;) {
@@ -142,19 +143,20 @@ static void __devinit cell_edac_init_csrows(struct mem_ctl_info *mci)
 		if (of_node_to_nid(np) != priv->node)
 			continue;
 		csrow->first_page = r.start >> PAGE_SHIFT;
-		csrow->nr_pages = resource_size(&r) >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = resource_size(&r) >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
 			dimm->mtype = MEM_XDR;
 			dimm->edac_mode = EDAC_SECDED;
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 		}
 		dev_dbg(mci->dev,
 			"Initialized on node %d, chanmask=0x%x,"
 			" first_page=0x%lx, nr_pages=0x%x\n",
 			priv->node, priv->chanmask,
-			csrow->first_page, csrow->nr_pages);
+			csrow->first_page, nr_pages);
 		break;
 	}
 }
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index ee90f3d..7b764a8 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -332,7 +332,7 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 	struct dimm_info *dimm;
 	int index, j;
 	u32 mbmr, mbbar, bba;
-	unsigned long row_size, last_nr_pages = 0;
+	unsigned long row_size, nr_pages, last_nr_pages = 0;
 
 	get_total_mem(pdata);
 
@@ -351,12 +351,14 @@ static void cpc925_init_csrows(struct mem_ctl_info *mci)
 
 		row_size = bba * (1UL << 28);	/* 256M */
 		csrow->first_page = last_nr_pages;
-		csrow->nr_pages = row_size >> PAGE_SHIFT;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
+		nr_pages = row_size >> PAGE_SHIFT;
+		csrow->last_page = csrow->first_page + nr_pages - 1;
 		last_nr_pages = csrow->last_page + 1;
 
 		for (j = 0; j < csrow->nr_channels; j++) {
 			dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			dimm->mtype = MEM_RDDR;
 			dimm->edac_mode = EDAC_SECDED;
 
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index db291ea..6d81d3c 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1044,7 +1044,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	int drc_drbg;		/* DRB granularity 0=64mb, 1=128mb */
 	int drc_ddim;		/* DRAM Data Integrity Mode 0=none, 2=edac */
 	u8 value;
-	u32 dra, drc, cumul_size, i;
+	u32 dra, drc, cumul_size, i, nr_pages;
 
 	dra = 0;
 	for (index = 0; index < 4; index++) {
@@ -1078,11 +1078,13 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (i = 0; i < drc_chan + 1; i++) {
 			struct dimm_info *dimm = csrow->channels[i].dimm;
+
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 178d2af..aeb69f0 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -349,7 +349,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 	unsigned long last_cumul_size;
 	int index, j;
 	u8 value;
-	u32 dra, cumul_size;
+	u32 dra, cumul_size, nr_pages;
 	int drc_chan, drc_drbg, drc_ddim, mem_dev;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
@@ -380,12 +380,13 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < drc_chan + 1; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / (drc_chan + 1);
 			dimm->grain = 1 << 12;	/* 4KiB - resolution of CELOG */
 			dimm->mtype = MEM_RDDR;	/* only one type supported */
 			dimm->dtype = mem_dev ? DEV_X4 : DEV_X8;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f83e63d..ffedae9 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -43,9 +43,10 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 {
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
-	debugf4("\tchannel->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tchannel->label = '%s'\n", chan->dimm->label);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
+	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
+	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -55,7 +56,6 @@ static void edac_mc_dump_csrow(struct csrow_info *csrow)
 	debugf4("\tcsrow->first_page = 0x%lx\n", csrow->first_page);
 	debugf4("\tcsrow->last_page = 0x%lx\n", csrow->last_page);
 	debugf4("\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	debugf4("\tcsrow->nr_pages = 0x%x\n", csrow->nr_pages);
 	debugf4("\tcsrow->nr_channels = %d\n", csrow->nr_channels);
 	debugf4("\tcsrow->channels = %p\n", csrow->channels);
 	debugf4("\tcsrow->mci = %p\n\n", csrow->mci);
@@ -652,15 +652,19 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 {
 	struct csrow_info *csrows = mci->csrows;
-	int row, i;
+	int row, i, j, n;
 
 	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
 		struct csrow_info *csrow = &csrows[i];
-
-		if (csrow->nr_pages == 0)
+		n = 0;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
+			n += dimm->nr_pages;
+		}
+		if (n == 0)
 			continue;
 
 		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index d63904e..c0275e6 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -144,7 +144,13 @@ static ssize_t csrow_ce_count_show(struct csrow_info *csrow, char *data,
 static ssize_t csrow_size_show(struct csrow_info *csrow, char *data,
 				int private)
 {
-	return sprintf(data, "%u\n", PAGES_TO_MiB(csrow->nr_pages));
+	int i;
+	u32 nr_pages = 0;
+
+	for (i = 0; i < csrow->nr_channels; i++)
+		nr_pages += csrow->channels[i].dimm->nr_pages;
+
+	return sprintf(data, "%u\n", PAGES_TO_MiB(nr_pages));
 }
 
 static ssize_t csrow_mem_type_show(struct csrow_info *csrow, char *data,
@@ -519,16 +525,16 @@ static ssize_t mci_ctl_name_show(struct mem_ctl_info *mci, char *data)
 
 static ssize_t mci_size_mb_show(struct mem_ctl_info *mci, char *data)
 {
-	int total_pages, csrow_idx;
+	int total_pages = 0, csrow_idx, j;
 
-	for (total_pages = csrow_idx = 0; csrow_idx < mci->nr_csrows;
-		csrow_idx++) {
+	for (csrow_idx = 0; csrow_idx < mci->nr_csrows; csrow_idx++) {
 		struct csrow_info *csrow = &mci->csrows[csrow_idx];
 
-		if (!csrow->nr_pages)
-			continue;
+		for (j = 0; j < csrow->nr_channels; j++) {
+			struct dimm_info *dimm = csrow->channels[j].dimm;
 
-		total_pages += csrow->nr_pages;
+			total_pages += dimm->nr_pages;
+		}
 	}
 
 	return sprintf(data, "%u\n", PAGES_TO_MiB(total_pages));
@@ -900,7 +906,7 @@ static void edac_remove_mci_instance_attributes(struct mem_ctl_info *mci,
  */
 int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	int i, j;
 	int err;
 	struct csrow_info *csrow;
 	struct kobject *kobj_mci = &mci->edac_mci_kobj;
@@ -934,10 +940,13 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	/* Make directories for each CSROW object under the mc<id> kobject
 	 */
 	for (i = 0; i < mci->nr_csrows; i++) {
+		int nr_pages = 0;
+
 		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
 
-		/* Only expose populated CSROWs */
-		if (csrow->nr_pages > 0) {
+		if (nr_pages > 0) {
 			err = edac_create_csrow_object(mci, csrow, i);
 			if (err) {
 				debugf1("%s() failure: create csrow %d obj\n",
@@ -949,10 +958,14 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 
 	return 0;
 
-	/* CSROW error: backout what has already been registered,  */
 fail1:
 	for (i--; i >= 0; i--) {
-		if (mci->csrows[i].nr_pages > 0)
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0)
 			kobject_put(&mci->csrows[i].kobj);
 	}
 
@@ -972,14 +985,20 @@ fail0:
  */
 void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
-	int i;
+	struct csrow_info *csrow;
+	int i, j;
 
 	debugf0("%s()\n", __func__);
 
 	/* remove all csrow kobjects */
 	debugf4("%s()  unregister this mci kobj\n", __func__);
 	for (i = 0; i < mci->nr_csrows; i++) {
-		if (mci->csrows[i].nr_pages > 0) {
+		int nr_pages = 0;
+
+		csrow = &mci->csrows[i];
+		for (j = 0; j < csrow->nr_channels; j++)
+			nr_pages += csrow->channels[j].dimm->nr_pages;
+		if (nr_pages > 0) {
 			debugf0("%s()  unreg csrow-%d\n", __func__, i);
 			kobject_put(&mci->csrows[i].kobj);
 		}
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 1498c5f..bf8a230 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -306,7 +306,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	int rc;
 	int i, j;
 	struct mem_ctl_info *mci = NULL;
-	unsigned long last_cumul_size;
+	unsigned long last_cumul_size, nr_pages;
 	int interleaved, nr_channels;
 	unsigned char dra[I3000_RANKS / 2], drb[I3000_RANKS];
 	unsigned char *c0dra = dra, *c1dra = &dra[I3000_RANKS_PER_CHANNEL / 2];
@@ -391,11 +391,13 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = I3000_DEAP_GRAIN;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index d8fa7f3..b82667f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -376,11 +376,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < nr_channels; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_channels;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index f00f684..e8d32e8 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1236,6 +1236,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i5000_pvt *pvt;
 	struct csrow_info *p_csrow;
+	struct dimm_info *dimm;
 	int empty, channel_count;
 	int max_csrows;
 	int mtr, mtr1;
@@ -1265,21 +1266,22 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 
 		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
+			dimm = p_csrow->channels[channel].dimm;
 			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
-			p_csrow->channels[channel].dimm->grain = 8;
+			dimm->grain = 8;
 
 			/* Assume DDR2 for now */
-			p_csrow->channels[channel].dimm->mtype = MEM_FB_DDR2;
+			dimm->mtype = MEM_FB_DDR2;
 
 			/* ask what device type on this row */
 			if (MTR_DRAM_WIDTH(mtr))
-				p_csrow->channels[channel].dimm->dtype = DEV_X8;
+				dimm->dtype = DEV_X8;
 			else
-				p_csrow->channels[channel].dimm->dtype = DEV_X4;
+				dimm->dtype = DEV_X4;
 
-			p_csrow->channels[channel].dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->edac_mode = EDAC_S8ECD8ED;
+			dimm->nr_pages = (csrow_megs << 8) / pvt->maxch;
 		}
-		p_csrow->nr_pages = csrow_megs << 8;
 
 		empty = 0;
 	}
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index 8da7ce1..a0219a9 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -859,7 +859,6 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		 * FIXME: these two are totally bogus -- I don't see how to
 		 * map them correctly to this structure...
 		 */
-		mci->csrows[i].nr_pages = npages;
 		mci->csrows[i].csrow_idx = i;
 		mci->csrows[i].mci = mci;
 		mci->csrows[i].nr_channels = 1;
@@ -867,14 +866,19 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		total_pages += npages;
 
 		dimm = mci->csrows[i].channels[0].dimm;
-		dimm->grain = 32;
-		dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
-			      DEV_X4 : DEV_X8;
-		dimm->mtype = MEM_RDDR2;
-		dimm->edac_mode = EDAC_SECDED;
-		snprintf(dimm->label, sizeof(dimm->label),
-			 "DIMM%u",
-			 i5100_rank_to_slot(mci, chan, rank));
+		dimm->nr_pages = npages;
+		if (npages) {
+			total_pages += npages;
+
+			dimm->grain = 32;
+			dimm->dtype = (priv->mtr[chan][rank].width == 4) ?
+				DEV_X4 : DEV_X8;
+			dimm->mtype = MEM_RDDR2;
+			dimm->edac_mode = EDAC_SECDED;
+			snprintf(dimm->label, sizeof(dimm->label),
+				"DIMM%u",
+				i5100_rank_to_slot(mci, chan, rank));
+		}
 	}
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 4a23813..784d6dc 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1156,7 +1156,7 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	int empty, channel_count;
 	int max_csrows;
 	int mtr;
-	int csrow_megs;
+	int size_mb;
 	int channel;
 	int csrow;
 	struct dimm_info *dimm;
@@ -1171,8 +1171,6 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 	for (csrow = 0; csrow < max_csrows; csrow++) {
 		p_csrow = &mci->csrows[csrow];
 
-		p_csrow->csrow_idx = csrow;
-
 		/* use branch 0 for the basis */
 		mtr = determine_mtr(pvt, csrow, 0);
 
@@ -1180,12 +1178,11 @@ static int i5400_init_csrows(struct mem_ctl_info *mci)
 		if (!MTR_DIMMS_PRESENT(mtr))
 			continue;
 
-		csrow_megs = 0;
 		for (channel = 0; channel < pvt->maxch; channel++) {
-			csrow_megs += pvt->dimm_info[csrow][channel].megabytes;
+			size_mb = pvt->dimm_info[csrow][channel].megabytes;
 
-			p_csrow->nr_pages = csrow_megs << 8;
 			dimm = p_csrow->channels[channel].dimm;
+			dimm->nr_pages = size_mb << 8;
 			dimm->grain = 8;
 			dimm->dtype = MTR_DRAM_WIDTH(mtr) ? DEV_X8 : DEV_X4;
 			dimm->mtype = MEM_RDDR2;
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index df6cd59..5e594ae 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -617,9 +617,7 @@ static void i7300_enable_error_reporting(struct mem_ctl_info *mci)
 static int decode_mtr(struct i7300_pvt *pvt,
 		      int slot, int ch, int branch,
 		      struct i7300_dimm_info *dinfo,
-		      struct csrow_info *p_csrow,
-		      struct dimm_info *dimm,
-		      u32 *nr_pages)
+		      struct dimm_info *dimm)
 {
 	int mtr, ans, addrBits, channel;
 
@@ -651,7 +649,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	addrBits -= 3;	/* 8 bits per bytes */
 
 	dinfo->megabytes = 1 << addrBits;
-	*nr_pages = dinfo->megabytes << 8;
 
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 
@@ -664,8 +661,6 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
-	p_csrow->csrow_idx = slot;
-
 	/*
 	 * The type of error detection actually depends of the
 	 * mode of operation. When it is just one single memory chip, at
@@ -675,6 +670,7 @@ static int decode_mtr(struct i7300_pvt *pvt,
 	 * See datasheet Sections 7.3.6 to 7.3.8
 	 */
 
+	dimm->nr_pages = MiB_TO_PAGES(dinfo->megabytes);
 	dimm->grain = 8;
 	dimm->mtype = MEM_FB_DDR2;
 	if (IS_SINGLE_MODE(pvt->mc_settings_a)) {
@@ -774,11 +770,9 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 {
 	struct i7300_pvt *pvt;
 	struct i7300_dimm_info *dinfo;
-	struct csrow_info *p_csrow;
 	int rc = -ENODEV;
 	int mtr;
 	int ch, branch, slot, channel;
-	u32 nr_pages;
 	struct dimm_info *dimm;
 
 	pvt = mci->pvt_info;
@@ -804,7 +798,6 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 	}
 
 	/* Get the set of MTR[0-7] regs by each branch */
-	nr_pages = 0;
 	for (slot = 0; slot < MAX_SLOTS; slot++) {
 		int where = mtr_regs[slot];
 		for (branch = 0; branch < MAX_BRANCHES; branch++) {
@@ -815,21 +808,18 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 				int channel = to_channel(ch, branch);
 
 				dinfo = &pvt->dimm_info[slot][channel];
-				p_csrow = &mci->csrows[slot];
 
-				dimm = p_csrow->channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
+				dimm = mci->csrows[slot].channels[branch * MAX_CH_PER_BRANCH + ch].dimm;
 
 				mtr = decode_mtr(pvt, slot, ch, branch,
-						 dinfo, p_csrow, dimm,
-						 &nr_pages);
+						 dinfo, dimm);
+
 				/* if no DIMMS on this row, continue */
 				if (!MTR_DIMMS_PRESENT(mtr))
 					continue;
 
-				/* Update per_csrow memory count */
-				p_csrow->nr_pages += nr_pages;
-
 				rc = 0;
+
 			}
 		}
 	}
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index 89ccec6..d566797 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -715,17 +715,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			npages = MiB_TO_PAGES(size);
 
 			csr = &mci->csrows[csrow];
-			csr->nr_pages = npages;
-
-			csr->csrow_idx = csrow;
-			csr->nr_channels = 1;
-
-			csr->channels[0].chan_idx = i;
-			csr->channels[0].ce_count = 0;
 
 			pvt->csrow_map[i][j] = csrow;
 
 			dimm = csr->channels[0].dimm;
+			dimm->nr_pages = npages;
+
 			switch (banks) {
 			case 4:
 				dimm->dtype = DEV_X4;
@@ -746,6 +741,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			dimm->grain = 8;
 			dimm->edac_mode = mode;
 			dimm->mtype = mtype;
+			csrow++;
 		}
 
 		pci_read_config_dword(pdev, MC_SAG_CH_0, &value[0]);
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 1e19492..74166ae 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -220,7 +220,7 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		row_base = row_high_limit_last;
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* EAP reports in 4kilobyte granularity [61] */
 		dimm->grain = 1 << 12;
 		dimm->mtype = mtype;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index acbd924..48e0ecd 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		dimm->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 		dimm->grain = 1 << 12;	/* I82860_EAP has 4KiB reolution */
 		dimm->mtype = MEM_RMBS;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index 81f79e2..dc207dc 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -347,7 +347,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 	unsigned long last_cumul_size;
 	u8 value;
 	u32 drc_ddim;		/* DRAM Data Integrity Mode 0=none,2=edac */
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, j;
 
 	drc_ddim = (drc >> 18) & 0x1;
@@ -371,12 +371,13 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
+		nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 
 		for (j = 0; j < nr_chans; j++) {
 			dimm = csrow->channels[j].dimm;
 
+			dimm->nr_pages = nr_pages / nr_chans;
 			dimm->grain = 1 << 12;	/* I82875P_EAP has 4KiB reolution */
 			dimm->mtype = MEM_DDR;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 0b40e11..304af1d 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -370,7 +370,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 	struct csrow_info *csrow;
 	unsigned long last_cumul_size;
 	u8 value;
-	u32 cumul_size;
+	u32 cumul_size, nr_pages;
 	int index, chan;
 	struct dimm_info *dimm;
 	enum dev_type dtype;
@@ -402,6 +402,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
 			cumul_size);
 
+		nr_pages = cumul_size - last_cumul_size;
 		/*
 		 * Initialise dram labels
 		 * index values:
@@ -411,6 +412,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		dtype = i82975x_dram_type(mch_window, index);
 		for (chan = 0; chan < csrow->nr_channels; chan++) {
 			dimm = mci->csrows[index].channels[chan].dimm;
+
+			if (!nr_pages)
+				continue;
+
+			dimm->nr_pages = nr_pages / csrow->nr_channels;
 			strncpy(csrow->channels[chan].dimm->label,
 					labels[(index >> 1) + (chan * 2)],
 					EDAC_MC_LABEL_LEN);
@@ -420,12 +426,11 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 			dimm->edac_mode = EDAC_SECDED; /* only supported */
 		}
 
-		if (cumul_size == last_cumul_size)
+		if (!nr_pages)
 			continue;	/* not populated */
 
 		csrow->first_page = last_cumul_size;
 		csrow->last_page = cumul_size - 1;
-		csrow->nr_pages = cumul_size - last_cumul_size;
 		last_cumul_size = cumul_size;
 	}
 }
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index fb92916..c1d9e15 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -947,7 +947,8 @@ static void __devinit mpc85xx_init_csrows(struct mem_ctl_info *mci)
 
 		csrow->first_page = start;
 		csrow->last_page = end;
-		csrow->nr_pages = end + 1 - start;
+
+		dimm->nr_pages = end + 1 - start;
 		dimm->grain = 8;
 		dimm->mtype = mtype;
 		dimm->dtype = DEV_UNKNOWN;
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index d2e3c39..281e245 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -667,7 +667,8 @@ static void mv64x60_init_csrows(struct mem_ctl_info *mci,
 
 	csrow = &mci->csrows[0];
 	dimm = csrow->channels[0].dimm;
-	csrow->nr_pages = pdata->total_mem >> PAGE_SHIFT;
+
+	dimm->nr_pages = pdata->total_mem >> PAGE_SHIFT;
 	dimm->grain = 8;
 
 	dimm->mtype = (ctl & MV64X60_SDRAM_REGISTERED) ? MEM_RDDR : MEM_DDR;
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 4e53270..3fcefda 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -153,20 +153,20 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		switch ((rankcfg & MCDRAM_RANKCFG_TYPE_SIZE_M) >>
 			MCDRAM_RANKCFG_TYPE_SIZE_S) {
 		case 0:
-			csrow->nr_pages = 128 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 128 << (20 - PAGE_SHIFT);
 			break;
 		case 1:
-			csrow->nr_pages = 256 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 256 << (20 - PAGE_SHIFT);
 			break;
 		case 2:
 		case 3:
-			csrow->nr_pages = 512 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 512 << (20 - PAGE_SHIFT);
 			break;
 		case 4:
-			csrow->nr_pages = 1024 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 1024 << (20 - PAGE_SHIFT);
 			break;
 		case 5:
-			csrow->nr_pages = 2048 << (20 - PAGE_SHIFT);
+			dimm->nr_pages = 2048 << (20 - PAGE_SHIFT);
 			break;
 		default:
 			edac_mc_printk(mci, KERN_ERR,
@@ -176,8 +176,8 @@ static int pasemi_edac_init_csrows(struct mem_ctl_info *mci,
 		}
 
 		csrow->first_page = last_page_in_mmc;
-		csrow->last_page = csrow->first_page + csrow->nr_pages - 1;
-		last_page_in_mmc += csrow->nr_pages;
+		csrow->last_page = csrow->first_page + dimm->nr_pages - 1;
+		last_page_in_mmc += dimm->nr_pages;
 		csrow->page_mask = 0;
 		dimm->grain = PASEMI_EDAC_ERROR_GRAIN;
 		dimm->mtype = MEM_DDR;
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index ec5e529..95cfc0f 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -896,7 +896,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 	enum dev_type dtype;
 	enum edac_type edac_mode;
 	int row, j;
-	u32 mbxcf, size;
+	u32 mbxcf, size, nr_pages;
 
 	/* Establish the memory type and width */
 
@@ -947,7 +947,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		case SDRAM_MBCF_SZ_2GB:
 		case SDRAM_MBCF_SZ_4GB:
 		case SDRAM_MBCF_SZ_8GB:
-			csi->nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
+			nr_pages = SDRAM_MBCF_SZ_TO_PAGES(size);
 			break;
 		default:
 			ppc4xx_edac_mc_printk(KERN_ERR, mci,
@@ -973,6 +973,7 @@ ppc4xx_edac_init_csrows(struct mem_ctl_info *mci, u32 mcopt1)
 		for (j = 0; j < csi->nr_channels; j++) {
 			struct dimm_info *dimm = csi->channels[j].dimm;
 
+			dimm->nr_pages  = nr_pages / csi->nr_channels;
 			dimm->grain	= 1;
 
 			dimm->mtype	= mtype;
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 414a532..19f3a10 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -249,7 +249,8 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 
 		csrow->first_page = row_base >> PAGE_SHIFT;
 		csrow->last_page = (row_high_limit >> PAGE_SHIFT) - 1;
-		csrow->nr_pages = csrow->last_page - csrow->first_page + 1;
+
+		dimm->nr_pages = csrow->last_page - csrow->first_page + 1;
 		/* Error address is top 19 bits - so granularity is      *
 		 * 14 bits                                               */
 		dimm->grain = 1 << 14;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index cf53007..ee1543d 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -561,7 +561,6 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	u32 reg;
 	enum edac_type mode;
 	enum mem_type mtype;
-	struct dimm_info *dimm;
 
 	pci_read_config_dword(pvt->pci_br, SAD_TARGET, &reg);
 	pvt->sbridge_dev->source_id = SOURCE_ID(reg);
@@ -613,11 +612,11 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 	/* On all supported DDR3 DIMM types, there are 8 banks available */
 	banks = 8;
 
-	dimm = mci->dimms;
 	for (i = 0; i < NUM_CHANNELS; i++) {
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
+			struct dimm_info *dimm = &mci->dimms[j];
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
 			debugf4("Channel #%d  MTR%d = %x\n", i, j, mtr);
@@ -642,15 +641,12 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				 * csrows.
 				 */
 				csr = &mci->csrows[csrow];
-				csr->nr_pages = npages;
-				csr->csrow_idx = csrow;
-				csr->nr_channels = 1;
-				csr->channels[0].chan_idx = i;
 				pvt->csrow_map[i][j] = csrow;
 				last_page += npages;
 				csrow++;
 
 				csr->channels[0].dimm = dimm;
+				dimm->nr_pages = npages;
 				dimm->grain = 32;
 				dimm->dtype = (banks == 8) ? DEV_X8 : DEV_X4;
 				dimm->mtype = mtype;
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index ba0917b..6314ff9 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -110,7 +110,7 @@ static int __devinit tile_edac_init_csrows(struct mem_ctl_info *mci)
 		return -1;
 	}
 
-	csrow->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
+	dimm->nr_pages = mem_info.mem_size >> PAGE_SHIFT;
 	dimm->grain = TILE_EDAC_ERROR_GRAIN;
 	dimm->dtype = DEV_UNKNOWN;
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 7be10dd..0de288f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -373,10 +373,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 		if (nr_pages == 0)
 			continue;
 
-		csrow->nr_pages = nr_pages;
-
 		for (j = 0; j < x38_channel_num; j++) {
 			struct dimm_info *dimm = csrow->channels[j].dimm;
+
+			dimm->nr_pages = nr_pages / x38_channel_num;
 			dimm->grain = nr_pages << PAGE_SHIFT;
 			dimm->mtype = MEM_DDR2;
 			dimm->dtype = DEV_UNKNOWN;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 5244193..8b78bd0 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -320,6 +320,8 @@ struct dimm_info {
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
+	u32 nr_pages;			/* number of pages in dimm */
+
 	u32 ce_count;		/* Correctable Errors for this dimm */
 };
 
@@ -346,12 +348,12 @@ struct rank_info {
 };
 
 struct csrow_info {
+	/* Used only by edac_mc_find_csrow_by_page() */
 	unsigned long first_page;	/* first page number in csrow */
 	unsigned long last_page;	/* last page number in csrow */
-	u32 nr_pages;			/* number of pages in csrow */
 	unsigned long page_mask;	/* used for interleaving -
-					 * 0UL for non intlv
-					 */
+					 * 0UL for non intlv */
+
 	int csrow_idx;			/* the chip-select row */
 
 	u32 ue_count;		/* Uncorrectable Errors for this csrow */
-- 
1.7.8


* Re: [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr()
  2012-04-18 14:06     ` Borislav Petkov
  2012-04-18 15:25       ` Borislav Petkov
@ 2012-04-18 18:15       ` Mauro Carvalho Chehab
  2012-04-18 18:19       ` [PATCH] " Mauro Carvalho Chehab
  2 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 18:15 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On 18-04-2012 11:06, Borislav Petkov wrote:
> On Mon, Apr 16, 2012 at 05:12:11PM -0300, Mauro Carvalho Chehab wrote:
>> The edac_align_ptr() function is used to prepare data for a single
>> memory allocation kzalloc() call. It counts how many bytes are needed
>> by some data structure.
>>
>> Using it as-is is not that trivial, as the quantity of memory elements
>> reserved is not there, but, instead, it is on a next call.
>>
>> In order to avoid mistakes when using it, move the number of allocated
>> elements into it, making easier to use it.
>>
>> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
> 
> AFAICT, this is a new patch so Aristeu cannot have reviewed it too. In
> such case, you can't simply keep the Reviewed-by tagging. Unless he
> really did that and I missed his mail with the tag somehow...?
> 
>> Cc: Doug Thompson <norsk5@yahoo.com>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  drivers/edac/edac_device.c |   27 +++++++++++----------------
>>  drivers/edac/edac_mc.c     |   19 +++++++++++++------
>>  drivers/edac/edac_module.h |    2 +-
>>  drivers/edac/edac_pci.c    |    7 ++++---
>>  4 files changed, 29 insertions(+), 26 deletions(-)
>>
>> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
>> index 4b15459..cb397d9 100644
>> --- a/drivers/edac/edac_device.c
>> +++ b/drivers/edac/edac_device.c
>> @@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>>  	unsigned total_size;
>>  	unsigned count;
>>  	unsigned instance, block, attr;
>> -	void *pvt;
>> +	void *pvt, *p;
>>  	int err;
>>  
>>  	debugf4("%s() instances=%d blocks=%d\n",
>> @@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>>  	 * to be at least as stringent as what the compiler would
>>  	 * provide if we could simply hardcode everything into a single struct.
>>  	 */
>> -	dev_ctl = (struct edac_device_ctl_info *)NULL;
>> +	p = NULL;
>> +	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
>>  
>>  	/* Calc the 'end' offset past end of ONE ctl_info structure
>>  	 * which will become the start of the 'instance' array
>>  	 */
>> -	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
>> +	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
>>  
>>  	/* Calc the 'end' offset past the instance array within the ctl_info
>>  	 * which will become the start of the block array
>>  	 */
>> -	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
>> +	count = nr_instances * nr_blocks;
>> +	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
>>  
>>  	/* Calc the 'end' offset past the dev_blk array
>>  	 * which will become the start of the attrib array, if any.
>>  	 */
>> -	count = nr_instances * nr_blocks;
>> -	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
>> -
>> -	/* Check for case of when an attribute array is specified */
>> -	if (nr_attrib > 0) {
>> -		/* calc how many nr_attrib we need */
>> +	/* calc how many nr_attrib we need */
>> +	if (nr_attrib > 0)
>>  		count *= nr_attrib;
>> +	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
>>  
>> -		/* Calc the 'end' offset past the attributes array */
>> -		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
>> -	} else {
>> -		/* no attribute array specificed */
>> -		pvt = edac_align_ptr(dev_attrib, sz_private);
>> -	}
>> +	/* Calc the 'end' offset past the attributes array */
>> +	pvt = edac_align_ptr(&p, sz_private, 1);
>>  
>>  	/* 'pvt' now points to where the private data area is.
>>  	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index ffedae9..98de5d1 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -108,9 +108,12 @@ EXPORT_SYMBOL_GPL(edac_mem_types);
>>   * If 'size' is a constant, the compiler will optimize this whole function
>>   * down to either a no-op or the addition of a constant to the value of 'ptr'.
>>   */
>> -void *edac_align_ptr(void *ptr, unsigned size)
>> +void *edac_align_ptr(void **p, unsigned size, int quant)
> 
> Oh, no, pls write it out as 'quantity'. 'quant' only means nothing...
> ok, it does but it does not fit in this here context:
> 
> From The Collaborative International Dictionary of English v.0.48 [gcide]:
> 
>   Quant \Quant\, n.
>      A punting pole with a broad flange near the end to prevent it
>      from sinking into the mud; a setting pole.
>      [1913 Webster]
> 
> :-)

:)

quantity is too big. Changed to "count".

> 
>>  {
>>  	unsigned align, r;
>> +	void *ptr = *p;
>> +
>> +	*p += size * quant;
>>  
>>  	/* Here we assume that the alignment of a "long long" is the most
>>  	 * stringent alignment that the compiler will ever provide by default.
>> @@ -132,6 +135,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
>>  	if (r == 0)
>>  		return (char *)ptr;
>>  
>> +	*p += align - r;
>> +
> 
> Why increment *p here too - we're returning ptr below?

Yes, ptr is returned, while *p is left pointing past the reserved area,
padding included. That's why p is declared as void **.

> Or are we keeping
> the alignment in the original pointer too? Why can't we pass the aligned
> pointer from the previous pass? I.e., do
> 
> 	p = NULL;
> 	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
> 
> and then do
> 
> 	dev_inst = edac_align_ptr(&dev_ctl, sizeof(*dev_inst), nr_instances);

This makes it harder to add new calls to edac_align_ptr() in the caller function,
and easier to commit allocation mistakes: if the developer forgets to change the
next line, there will be two vars pointing to the same memory range.

Btw, I think this actually occurred during edac/bluesmoke development, as there
are several debug printk's for the pointer values.

With the way I've patched it, adding new stuff to be allocated is easier and
requires fewer line changes, as you can just add another:

	foo = edac_align_ptr(&ptr, sizeof(*foo), n);

without needing to touch the other calls to edac_align_ptr().
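
As a rough illustration of that calling convention, here is a user-space sketch. It is not the patch's code: reserve() and pick_align() are made-up names, and the sketch rounds the cursor address up before reserving, which is a simplification of what edac_align_ptr() does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pick an alignment from the element size, like the size ladder
 * in edac_align_ptr(). */
static size_t pick_align(size_t size)
{
	if (size > sizeof(long))
		return sizeof(long long);
	if (size > sizeof(int))
		return sizeof(long);
	if (size > sizeof(short))
		return sizeof(int);
	if (size > sizeof(char))
		return sizeof(short);
	return 1;
}

/*
 * Hypothetical stand-in: return the aligned start of the next area
 * and advance *cursor past 'count' elements of 'size' bytes.
 */
static void *reserve(void **cursor, size_t size, size_t count)
{
	uintptr_t start;
	size_t align, r;

	assert(cursor != NULL);
	start = (uintptr_t)*cursor;
	align = pick_align(size);
	r = start % align;
	if (r)
		start += align - r;	/* pad up to the alignment */

	*cursor = (void *)(start + size * count);
	return (void *)start;
}
```

Each caller only names the cursor, the element size and the element count, so inserting one more reserve(&cursor, ...) line between two existing ones does not require editing its neighbours.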

> In any case, this is not trivial so the function needs a bunch of comments.

I'll properly document the parameters and return value using
Documentation/kernel-doc-nano-HOWTO.txt format.

> 
>>  	return (void *)(((unsigned long)ptr) + align - r);
>>  }
>>  
>> @@ -154,6 +159,7 @@ void *edac_align_ptr(void *ptr, unsigned size)
>>  struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  				unsigned nr_chans, int edac_index)
>>  {
>> +	void *ptr;
>>  	struct mem_ctl_info *mci;
>>  	struct csrow_info *csi, *csrow;
>>  	struct rank_info *chi, *chp, *chan;
>> @@ -168,11 +174,12 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	 * stringent as what the compiler would provide if we could simply
>>  	 * hardcode everything into a single struct.
>>  	 */
>> -	mci = (struct mem_ctl_info *)0;
>> -	csi = edac_align_ptr(&mci[1], sizeof(*csi));
>> -	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
>> -	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
>> -	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
>> +	ptr = 0;
> 
> Declare it above like this:
> 
> 	void *ptr = NULL;

Ok.

> 
>> +	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
>> +	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
>> +	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
>> +	dimm = edac_align_ptr(ptr, sizeof(*dimm), nr_csrows * nr_chans);

That was wrong: a badly resolved conflict from a rebase. It should be, instead:

dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);

This is very likely what caused the oops you got.
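
For context on why a single missing '&' matters: in C a void * converts implicitly to void **, so passing ptr instead of &ptr compiles without complaint, and the callee then dereferences what is really just an offset. The whole layout pass depends on one cursor being threaded through every call. Below is a user-space sketch of that two-pass pattern, with calloc() standing in for kzalloc(); take(), alloc_layout() and the struct names are hypothetical, and the alignment is fixed to that of long long for brevity.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, trimmed stand-ins for mem_ctl_info and csrow_info */
struct ctl { int nr_rows; };
struct row { unsigned long first_page; };

struct layout {
	struct ctl *ctl;
	struct row *rows;
	void *pvt;
};

/* Round the cursor up to long long alignment, reserve size * count bytes */
static void *take(void **cursor, size_t size, size_t count)
{
	const uintptr_t mask = sizeof(long long) - 1;
	uintptr_t start = ((uintptr_t)*cursor + mask) & ~mask;

	*cursor = (void *)(start + size * count);
	return (void *)start;
}

/* Returns the block to free(), or NULL on allocation failure */
static void *alloc_layout(struct layout *out, size_t nr_rows, size_t sz_pvt)
{
	void *cursor = NULL;	/* pass 1: "pointers" are really offsets */
	uintptr_t off_ctl, off_rows, off_pvt;
	char *base;

	off_ctl  = (uintptr_t)take(&cursor, sizeof(struct ctl), 1);
	off_rows = (uintptr_t)take(&cursor, sizeof(struct row), nr_rows);
	off_pvt  = (uintptr_t)take(&cursor, 1, sz_pvt);
	assert(off_ctl == 0);	/* first object sits at the block start */

	base = calloc(1, off_pvt + sz_pvt);	/* stands in for kzalloc() */
	if (!base)
		return NULL;

	/* pass 2: turn the offsets into real pointers */
	out->ctl  = (struct ctl *)(base + off_ctl);
	out->rows = (struct row *)(base + off_rows);
	out->pvt  = base + off_pvt;
	out->ctl->nr_rows = (int)nr_rows;

	return base;
}
```

If any one of the take() calls received a stale or wrong cursor, two of the resulting areas would overlap, which is the class of mistake the reworked interface is meant to make harder.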

>> +	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
>>  	size = ((unsigned long)pvt) + sz_pvt;
>>  
>>  	mci = kzalloc(size, GFP_KERNEL);
>> diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
>> index 00f81b4..0be4b01 100644
>> --- a/drivers/edac/edac_module.h
>> +++ b/drivers/edac/edac_module.h
>> @@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
>>  					   *edac_dev, unsigned long value);
>>  extern void edac_mc_reset_delay_period(int value);
>>  
>> -extern void *edac_align_ptr(void *ptr, unsigned size);
>> +extern void *edac_align_ptr(void **p, unsigned size, int quant);
>>  
>>  /*
>>   * EDAC PCI functions
>> diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
>> index 63af1c5..9016560 100644
>> --- a/drivers/edac/edac_pci.c
>> +++ b/drivers/edac/edac_pci.c
>> @@ -42,13 +42,14 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
>>  						const char *edac_pci_name)
>>  {
>>  	struct edac_pci_ctl_info *pci;
>> -	void *pvt;
>> +	void *p, *pvt;
>>  	unsigned int size;
>>  
>>  	debugf1("%s()\n", __func__);
>>  
>> -	pci = (struct edac_pci_ctl_info *)0;
>> -	pvt = edac_align_ptr(&pci[1], sz_pvt);
>> +	p = 0;
> 
> ditto.

Ok.

> 
>> +	pci = edac_align_ptr(&p, sizeof(*pci), 1);
>> +	pvt = edac_align_ptr(&p, 1, sz_pvt);
>>  	size = ((unsigned long)pvt) + sz_pvt;
>>  
>>  	/* Alloc the needed control struct memory */
>> -- 
>> 1.7.8
>>
>>
> 



* [PATCH] edac: rewrite edac_align_ptr()
  2012-04-18 14:06     ` Borislav Petkov
  2012-04-18 15:25       ` Borislav Petkov
  2012-04-18 18:15       ` Mauro Carvalho Chehab
@ 2012-04-18 18:19       ` Mauro Carvalho Chehab
  2012-04-23 14:05         ` Borislav Petkov
  2 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 18:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson

The edac_align_ptr() function is used to prepare data for a single
memory allocation kzalloc() call. It counts how many bytes are needed
by some data structure.

Using it as-is is not trivial, as the number of elements to reserve is
not passed to the call that aligns them; instead, it has to be folded
into the pointer passed on the next call.

In order to avoid mistakes when using it, move the number of allocated
elements into it, making it easier to use.
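
To illustrate the idea outside of the Kernel, here is a minimal userspace
sketch. It is not the code from this patch: reserve(), plan_layout(),
struct ctl and struct item are hypothetical names, and the cursor is a
size_t offset rather than a void pointer. Each call aligns the cursor,
returns the start offset of the region and advances the cursor by
size * count, so the caller never has to compute the end of the previous
array by hand.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for the reworked helper: align the cursor for an
 * element of 'size' bytes, reserve 'count' elements and return the offset
 * where the reserved region starts.
 *
 * Assumption (same as in the EDAC comment): "long long" has the most
 * stringent alignment the compiler provides by default.
 */
static size_t reserve(size_t *cursor, size_t size, size_t count)
{
	size_t align, r, start;

	if (size > sizeof(long))
		align = sizeof(long long);
	else if (size > sizeof(int))
		align = sizeof(long);
	else if (size > sizeof(short))
		align = sizeof(int);
	else if (size > sizeof(char))
		align = sizeof(short);
	else
		align = 1;

	/* Round the cursor up to the next multiple of 'align' */
	r = *cursor % align;
	start = r ? *cursor + (align - r) : *cursor;

	/* The next region starts right after the elements reserved here */
	*cursor = start + size * count;

	return start;
}

/* Hypothetical structures, only used to have something to lay out */
struct ctl  { long long id; char name[3]; };
struct item { int value; };

struct layout { size_t ctl_off, items_off, pvt_off, total; };

/* Compute all offsets first; one allocation of 'total' bytes follows */
static struct layout plan_layout(size_t n_items, size_t sz_pvt)
{
	struct layout l;
	size_t cursor = 0;

	l.ctl_off   = reserve(&cursor, sizeof(struct ctl), 1);
	l.items_off = reserve(&cursor, sizeof(struct item), n_items);
	l.pvt_off   = reserve(&cursor, sz_pvt, 1);
	l.total     = cursor;

	return l;
}
```

After a single zeroed allocation of l.total bytes, each region is found by
adding its offset to the base address, which is the same rebasing step the
EDAC allocators do after kzalloc().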

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v14: fixes a badly-solved rebase conflict, uses NULL instead of 0, adds more comments
     and renames the counter for the number of structures to "count"

 drivers/edac/edac_device.c |   27 +++++++++++----------------
 drivers/edac/edac_mc.c     |   31 +++++++++++++++++++++++--------
 drivers/edac/edac_module.h |    2 +-
 drivers/edac/edac_pci.c    |    6 +++---
 4 files changed, 38 insertions(+), 28 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 4b15459..cb397d9 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	unsigned total_size;
 	unsigned count;
 	unsigned instance, block, attr;
-	void *pvt;
+	void *pvt, *p;
 	int err;
 
 	debugf4("%s() instances=%d blocks=%d\n",
@@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	 * to be at least as stringent as what the compiler would
 	 * provide if we could simply hardcode everything into a single struct.
 	 */
-	dev_ctl = (struct edac_device_ctl_info *)NULL;
+	p = NULL;
+	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
 
 	/* Calc the 'end' offset past end of ONE ctl_info structure
 	 * which will become the start of the 'instance' array
 	 */
-	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
+	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
 
 	/* Calc the 'end' offset past the instance array within the ctl_info
 	 * which will become the start of the block array
 	 */
-	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
+	count = nr_instances * nr_blocks;
+	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
 
 	/* Calc the 'end' offset past the dev_blk array
 	 * which will become the start of the attrib array, if any.
 	 */
-	count = nr_instances * nr_blocks;
-	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
-
-	/* Check for case of when an attribute array is specified */
-	if (nr_attrib > 0) {
-		/* calc how many nr_attrib we need */
+	/* calc how many nr_attrib we need */
+	if (nr_attrib > 0)
 		count *= nr_attrib;
+	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
 
-		/* Calc the 'end' offset past the attributes array */
-		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
-	} else {
-		/* no attribute array specificed */
-		pvt = edac_align_ptr(dev_attrib, sz_private);
-	}
+	/* Calc the 'end' offset past the attributes array */
+	pvt = edac_align_ptr(&p, sz_private, 1);
 
 	/* 'pvt' now points to where the private data area is.
 	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index ffedae9..775a3ff 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -101,16 +101,28 @@ const char *edac_mem_types[] = {
 };
 EXPORT_SYMBOL_GPL(edac_mem_types);
 
-/* 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
+/**
+ * edac_align_ptr - Prepares the pointer offsets for a single-shot allocation
+ * @p:		pointer to a pointer with the memory offset to be used. At
+ *		return, this will be incremented to point to the next offset
+ * @size:	Size of the data structure to be reserved
+ * @count:	Number of elements that should be reserved
+ *
+ * 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
  * Adjust 'ptr' so that its alignment is at least as stringent as what the
  * compiler would provide for X and return the aligned result.
  *
  * If 'size' is a constant, the compiler will optimize this whole function
- * down to either a no-op or the addition of a constant to the value of 'ptr'.
+ * down to either a no-op or the addition of a constant to the value of '*p'.
+ *
+ * At return, the pointer 'p' will be incremented.
  */
-void *edac_align_ptr(void *ptr, unsigned size)
+void *edac_align_ptr(void **p, unsigned size, int count)
 {
 	unsigned align, r;
+	void *ptr = *p;
+
+	*p += size * count;
 
 	/* Here we assume that the alignment of a "long long" is the most
 	 * stringent alignment that the compiler will ever provide by default.
@@ -132,6 +144,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
 	if (r == 0)
 		return (char *)ptr;
 
+	*p += align - r;
+
 	return (void *)(((unsigned long)ptr) + align - r);
 }
 
@@ -154,6 +168,7 @@ void *edac_align_ptr(void *ptr, unsigned size)
 struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 				unsigned nr_chans, int edac_index)
 {
+	void *ptr = NULL;
 	struct mem_ctl_info *mci;
 	struct csrow_info *csi, *csrow;
 	struct rank_info *chi, *chp, *chan;
@@ -168,11 +183,11 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * stringent as what the compiler would provide if we could simply
 	 * hardcode everything into a single struct.
 	 */
-	mci = (struct mem_ctl_info *)0;
-	csi = edac_align_ptr(&mci[1], sizeof(*csi));
-	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
-	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
+	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	mci = kzalloc(size, GFP_KERNEL);
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 00f81b4..7a19b1b 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
 extern void edac_mc_reset_delay_period(int value);
 
-extern void *edac_align_ptr(void *ptr, unsigned size);
+extern void *edac_align_ptr(void **p, unsigned size, int count);
 
 /*
  * EDAC PCI functions
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 63af1c5..f1ac866 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -42,13 +42,13 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 						const char *edac_pci_name)
 {
 	struct edac_pci_ctl_info *pci;
-	void *pvt;
+	void *p = NULL, *pvt;
 	unsigned int size;
 
 	debugf1("%s()\n", __func__);
 
-	pci = (struct edac_pci_ctl_info *)0;
-	pvt = edac_align_ptr(&pci[1], sz_pvt);
+	pci = edac_align_ptr(&p, sizeof(*pci), 1);
+	pvt = edac_align_ptr(&p, 1, sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	/* Alloc the needed control struct memory */
-- 
1.7.8



* [PATCH] edac: Change internal representation to work with layers
  2012-04-16 20:12   ` [EDAC PATCH v13 7/7] edac: Change internal representation to work with layers Mauro Carvalho Chehab
@ 2012-04-18 18:22     ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-18 18:22 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Aristeu Rozanski,
	Hitoshi Mitake, Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. They see memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks in the FB-DIMM and RAMBUS
memory controller drivers in the next patches.
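
As an illustration of the layered view, here is a minimal userspace sketch.
It is not the code from this patch: struct layer, total_dimms(),
flat_index() and next_pos() are hypothetical names, and the row-major
ordering is an assumption about how a position is turned into an array
index. A controller is described by up to three layers (for example branch,
channel and slot), the number of DIMMs is the product of the layer sizes,
and a position is walked like an odometer, with the innermost layer changing
fastest.

```c
#include <stddef.h>

/* Hypothetical, reduced form of a layer description: only its size */
struct layer {
	unsigned size;
};

/* Total number of DIMM slots: the product of all layer sizes */
static unsigned total_dimms(const struct layer *layers, unsigned n_layers)
{
	unsigned i, tot = 1;

	for (i = 0; i < n_layers; i++)
		tot *= layers[i].size;

	return tot;
}

/* Row-major index of pos[] inside a flat array of total_dimms() entries */
static unsigned flat_index(const struct layer *layers, unsigned n_layers,
			   const unsigned *pos)
{
	unsigned i, index = 0;

	for (i = 0; i < n_layers; i++)
		index = index * layers[i].size + pos[i];

	return index;
}

/*
 * Advance pos[] to the next location, innermost layer first.
 * Returns 0 once every layer has wrapped back to zero.
 */
static int next_pos(const struct layer *layers, unsigned n_layers,
		    unsigned *pos)
{
	int j;

	for (j = (int)n_layers - 1; j >= 0; j--) {
		if (++pos[j] < layers[j].size)
			return 1;
		pos[j] = 0;
	}

	return 0;
}
```

With this ordering, walking all positions with next_pos() visits the flat
array in increasing index order, so a csrow/channel controller is just the
two-layer case of the same scheme.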

Several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries at struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks of the same DIMM.
This bug has been there since the beginning of EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

PS.: I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger on the drivers for FB-DIMMs.

FIXME: until the FB-DIMM drivers are converted to use the new
design, uncorrected errors will show just one channel. In
the past, all changes were in one big patch of about 150K.
As it needed to be split in order to be accepted by the
EDAC ML at vger, we've opted to accept this small drawback.
As an advantage, it is now easier to review the patch series.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v14: Contextual changes fix

 drivers/edac/edac_core.h |   92 ++++++-
 drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
 include/linux/edac.h     |   40 ++-
 3 files changed, 526 insertions(+), 288 deletions(-)

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..7201bb1 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page,
 			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
+
+
 
 /*
  * edac_device APIs
@@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 775a3ff..a22b5d4 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -150,10 +168,25 @@ void *edac_align_ptr(void **p, unsigned size, int count)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels at the reverse order
+ * @size_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories on different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -165,18 +198,41 @@ void *edac_align_ptr(void **p, unsigned size, int count)
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *lay;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_cschannels;
+	int i, j;
 	int err;
+	int row, chn;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_cschannels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_cschannels *= layers[i].size;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -184,12 +240,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+	}
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
+		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -197,42 +262,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = lay;
+	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_cschannels;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_cschannels;
+		chp = &chi[row * tot_cschannels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_cschannels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = GET_POS(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_csrow)
+					break;
+			chn++;
+			if (chn == tot_cschannels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -256,6 +378,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Nu
+mber of layers at the MC hierarchy
+ * layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels at the reverse order
+ * @size_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories on different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -521,7 +694,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -548,6 +720,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -705,261 +879,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    unsigned pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    unsigned pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning it, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: dimm csrows (%d,%d)\n",
+				__func__, dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 0fdf6ba..1439670 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -392,18 +392,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -423,9 +425,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -477,6 +480,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -521,13 +529,16 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
 
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -542,12 +553,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -560,7 +574,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes
-- 
1.7.8
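
A note on the counter indexing used by edac_increment_ue_error() in the
patch above: each layer's counter array is flat, so the index for layer i
is built from the positions of layers 0..i, multiplying by the size of the
next layer at each step. What follows is only a minimal userspace sketch of
that arithmetic. The helper name layer_counter_index() and the plain
sizes[] array (standing in for mci->layers[i].size) are assumptions made
for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LAYERS 3	/* stands in for EDAC_MAX_LAYERS */

/*
 * Hypothetical helper: flat index into the counter array of 'layer' for
 * a given position, following the loop in edac_increment_ue_error().
 * Returns -1 when 'layer' (or any layer above it) is unknown.
 */
static int layer_counter_index(const unsigned sizes[], int n_layers,
			       const int pos[], int layer)
{
	int i, index = 0;

	assert(n_layers <= MAX_LAYERS);
	assert(layer >= 0 && layer < n_layers);

	for (i = 0; i <= layer; i++) {
		if (pos[i] < 0)
			return -1;	/* location unknown from here down */
		index += pos[i];
		if (i < layer)
			index *= sizes[i + 1];
	}
	return index;
}
```

With layer sizes {2, 4, 8} and position {1, 2, 3}, the indexes for layers
0, 1 and 2 are 1, 6 and 51. With position {1, -1, -1}, only the layer 0
counter can be addressed.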

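
The DIMM filter in edac_mc_handle_error() treats a negative layer value as
"unknown" and collects the labels of every populated DIMM that is
compatible with the known layers, joined by " or ". The following is a
rough userspace sketch of that matching, not the kernel code: struct dimm,
dimm_matches() and build_label() are hypothetical names, and the sketch
deliberately swaps the patch's strcpy() into a variable-length stack array
for a bounded snprintf() into a caller-supplied buffer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define N_LAYERS 3	/* stands in for EDAC_MAX_LAYERS */

/* Hypothetical, trimmed-down stand-in for struct dimm_info. */
struct dimm {
	const char *label;
	int location[N_LAYERS];
	unsigned nr_pages;
};

/* A negative pos[] entry is a wildcard: that layer is unknown. */
static int dimm_matches(const struct dimm *d, const int pos[N_LAYERS])
{
	int i;

	for (i = 0; i < N_LAYERS; i++)
		if (pos[i] >= 0 && pos[i] != d->location[i])
			return 0;
	return 1;
}

/*
 * Join the labels of all populated, matching DIMMs with " or ".
 * Returns the number of labels written in full.
 */
static int build_label(char *buf, size_t len, const struct dimm *dimms,
		       int n, const int pos[N_LAYERS])
{
	size_t used = 0;
	int i, hits = 0;

	assert(len > 0);
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		if (!dimms[i].nr_pages || !dimm_matches(&dimms[i], pos))
			continue;
		used += snprintf(buf + used, len - used, "%s%s",
				 hits ? " or " : "", dimms[i].label);
		if (used >= len)
			break;	/* output was truncated, stop here */
		hits++;
	}
	return hits;
}
```

A return value of 0 corresponds to the "unknown memory" case in the patch:
the reported location did not match any populated DIMM.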

* Re: [PATCH] edac: rewrite edac_align_ptr()
  2012-04-18 18:19       ` [PATCH] " Mauro Carvalho Chehab
@ 2012-04-23 14:05         ` Borislav Petkov
  2012-04-23 15:19           ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-23 14:05 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

On Wed, Apr 18, 2012 at 03:19:34PM -0300, Mauro Carvalho Chehab wrote:
> The edac_align_ptr() function is used to prepare data for a single
> memory allocation kzalloc() call. It counts how many bytes are needed
> by some data structure.
> 
> Using it as-is is not that trivial, as the quantity of memory elements
> reserved is not there, but, instead, it is on a next call.
> 
> In order to avoid mistakes when using it, move the number of allocated
> elements into it, making easier to use it.
> 
> Cc: Aristeu Rozanski <arozansk@redhat.com>
> Cc: Doug Thompson <norsk5@yahoo.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
> 
> v14: fixes a badly-solved rebase conflict, uses NULL instead of 0, adds more comments
>      and renames the counter for the number of structures to "count"
> 
>  drivers/edac/edac_device.c |   27 +++++++++++----------------
>  drivers/edac/edac_mc.c     |   31 +++++++++++++++++++++++--------
>  drivers/edac/edac_module.h |    2 +-
>  drivers/edac/edac_pci.c    |    6 +++---
>  4 files changed, 38 insertions(+), 28 deletions(-)
> 
> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
> index 4b15459..cb397d9 100644
> --- a/drivers/edac/edac_device.c
> +++ b/drivers/edac/edac_device.c
> @@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>  	unsigned total_size;
>  	unsigned count;
>  	unsigned instance, block, attr;
> -	void *pvt;
> +	void *pvt, *p;
>  	int err;
>  
>  	debugf4("%s() instances=%d blocks=%d\n",
> @@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>  	 * to be at least as stringent as what the compiler would
>  	 * provide if we could simply hardcode everything into a single struct.
>  	 */
> -	dev_ctl = (struct edac_device_ctl_info *)NULL;
> +	p = NULL;
> +	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
>  
>  	/* Calc the 'end' offset past end of ONE ctl_info structure
>  	 * which will become the start of the 'instance' array
>  	 */
> -	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
> +	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
>  
>  	/* Calc the 'end' offset past the instance array within the ctl_info
>  	 * which will become the start of the block array
>  	 */
> -	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
> +	count = nr_instances * nr_blocks;
> +	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
>  
>  	/* Calc the 'end' offset past the dev_blk array
>  	 * which will become the start of the attrib array, if any.
>  	 */
> -	count = nr_instances * nr_blocks;
> -	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
> -
> -	/* Check for case of when an attribute array is specified */
> -	if (nr_attrib > 0) {
> -		/* calc how many nr_attrib we need */
> +	/* calc how many nr_attrib we need */
> +	if (nr_attrib > 0)
>  		count *= nr_attrib;
> +	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
>  
> -		/* Calc the 'end' offset past the attributes array */
> -		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
> -	} else {
> -		/* no attribute array specificed */
> -		pvt = edac_align_ptr(dev_attrib, sz_private);
> -	}
> +	/* Calc the 'end' offset past the attributes array */
> +	pvt = edac_align_ptr(&p, sz_private, 1);
>  
>  	/* 'pvt' now points to where the private data area is.
>  	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index ffedae9..775a3ff 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -101,16 +101,28 @@ const char *edac_mem_types[] = {
>  };
>  EXPORT_SYMBOL_GPL(edac_mem_types);
>  
> -/* 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
> +/**
> + * edac_align_ptr - Prepares the pointer offsets for a single-shot allocation
> + * @p:		pointer to a pointer with the memory offset to be used. At
> + *		return, this will be incremented to point to the next offset
> + * @size:	Size of the data structure to be reserved
> + * @count:	Number of elements that should be reserved
> + *
> + * 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.

There's no 'ptr' argument anymore. Also, the text doesn't apply anymore
since the ptr is not possibly unaligned but the returned pointer *p is
properly aligned to size * count.

Also, this pointer is absolutely needed to keep advancing through memory
to the proper offsets when allocating the struct along with its embedded
structs, as edac_device_alloc_ctl_info() does above, for example.
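
To illustrate the point about advancing offsets, here is a rough userspace
sketch of the single-allocation layout pattern: a first pass reserves
aligned offsets for each part, and the running offset ends up being the
total size to allocate. Everything in it is hypothetical (struct ctl,
struct inst, reserve(), alloc_ctl()). Unlike edac_align_ptr(), the sketch
takes the alignment as an explicit argument and works on size_t offsets
rather than on a NULL-based pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct ctl  { int id; void *inst; void *pvt; };
struct inst { long long counter; };

/*
 * Round *off up to 'align', return that aligned offset, and advance
 * *off past n_elems items of 'size' bytes, ready for the next call.
 */
static size_t reserve(size_t *off, size_t size, size_t align,
		      size_t n_elems)
{
	size_t start;

	assert(align != 0);
	start = (*off + align - 1) / align * align;
	*off = start + size * n_elems;
	return start;
}

/* Lay out one ctl, n_inst inst entries and a private area at once. */
static struct ctl *alloc_ctl(size_t n_inst, size_t sz_pvt)
{
	size_t off = 0;
	size_t ctl_off  = reserve(&off, sizeof(struct ctl),
				  _Alignof(struct ctl), 1);
	size_t inst_off = reserve(&off, sizeof(struct inst),
				  _Alignof(struct inst), n_inst);
	size_t pvt_off  = reserve(&off, sz_pvt, _Alignof(long long), 1);
	char *base = calloc(1, off);	/* 'off' is now the total size */
	struct ctl *ctl;

	if (!base)
		return NULL;

	/* ctl_off is 0, so the returned pointer can be passed to free() */
	ctl = (struct ctl *)(base + ctl_off);
	ctl->inst = base + inst_off;
	ctl->pvt = sz_pvt ? base + pvt_off : NULL;
	return ctl;
}
```

The caller only sees one allocation and one free(), which is the property
the running pointer is there to preserve.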

>   * Adjust 'ptr' so that its alignment is at least as stringent as what the
>   * compiler would provide for X and return the aligned result.
>   *
>   * If 'size' is a constant, the compiler will optimize this whole function
> - * down to either a no-op or the addition of a constant to the value of 'ptr'.
> + * down to either a no-op or the addition of a constant to the value of '*p'.
> + *
> + * At return, the pointer 'p' will be incremented.
>   */
> -void *edac_align_ptr(void *ptr, unsigned size)
> +void *edac_align_ptr(void **p, unsigned size, int count)

'count' is non-descriptive and at least ambiguous as to what it relates
to - call it 'n_elems' instead.

>  {
>  	unsigned align, r;
> +	void *ptr = *p;
> +
> +	*p += size * count;
>  
>  	/* Here we assume that the alignment of a "long long" is the most
>  	 * stringent alignment that the compiler will ever provide by default.
> @@ -132,6 +144,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
>  	if (r == 0)
>  		return (char *)ptr;
>  
> +	*p += align - r;
> +
>  	return (void *)(((unsigned long)ptr) + align - r);
>  }

In general, this edac_align_ptr is not really helpful because it requires
the caller to know the exact layout of the struct it allocates memory
for and what structs it has embedded. And frankly, I don't know how much
it would help but I hear unaligned pointers are something bad on some
!x86 architectures.

Oh well...
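
For reference, a sketch of the kind of rounding under discussion. It
assumes that the part of edac_align_ptr() not shown in the hunk picks an
alignment from 'size' (capped at that of a long long, as the comment in
the function says) and then rounds the address up by 'align - r'. That is
an assumption about the elided code, and pick_align() and align_up() are
hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Pick an alignment from the item size, capped at sizeof(long long). */
static unsigned pick_align(unsigned size)
{
	if (size > sizeof(long))
		return sizeof(long long);
	if (size > sizeof(int))
		return sizeof(long);
	if (size > sizeof(short))
		return sizeof(int);
	if (size > sizeof(char))
		return sizeof(short);
	return 1;	/* single bytes need no alignment */
}

/* Round 'addr' up to the next multiple of 'align' (no-op if aligned). */
static uintptr_t align_up(uintptr_t addr, unsigned align)
{
	unsigned r;

	assert(align != 0);
	r = addr % align;
	return r ? addr + align - r : addr;
}
```

An item larger than a long gets the alignment of a long long, and an
address of 13 with an alignment of 8 is rounded up to 16.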

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH] edac: rewrite edac_align_ptr()
  2012-04-23 14:05         ` Borislav Petkov
@ 2012-04-23 15:19           ` Mauro Carvalho Chehab
  2012-04-23 15:26             ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 15:19 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

On 23-04-2012 14:05, Borislav Petkov wrote:
> On Wed, Apr 18, 2012 at 03:19:34PM -0300, Mauro Carvalho Chehab wrote:
>> The edac_align_ptr() function is used to prepare data for a single
>> memory allocation kzalloc() call. It counts how many bytes are needed
>> by some data structure.
>>
>> Using it as-is is not that trivial, as the quantity of memory elements
>> reserved is not there, but, instead, it is on a next call.
>>
>> In order to avoid mistakes when using it, move the number of allocated
>> elements into it, making easier to use it.
>>
>> Cc: Aristeu Rozanski <arozansk@redhat.com>
>> Cc: Doug Thompson <norsk5@yahoo.com>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>
>> v14: fixes a badly-solved rebase conflict, uses NULL instead of 0, adds more comments
>>      and renames the counter for the number of structures to "count"
>>
>>  drivers/edac/edac_device.c |   27 +++++++++++----------------
>>  drivers/edac/edac_mc.c     |   31 +++++++++++++++++++++++--------
>>  drivers/edac/edac_module.h |    2 +-
>>  drivers/edac/edac_pci.c    |    6 +++---
>>  4 files changed, 38 insertions(+), 28 deletions(-)
>>
>> diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
>> index 4b15459..cb397d9 100644
>> --- a/drivers/edac/edac_device.c
>> +++ b/drivers/edac/edac_device.c
>> @@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>>  	unsigned total_size;
>>  	unsigned count;
>>  	unsigned instance, block, attr;
>> -	void *pvt;
>> +	void *pvt, *p;
>>  	int err;
>>  
>>  	debugf4("%s() instances=%d blocks=%d\n",
>> @@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
>>  	 * to be at least as stringent as what the compiler would
>>  	 * provide if we could simply hardcode everything into a single struct.
>>  	 */
>> -	dev_ctl = (struct edac_device_ctl_info *)NULL;
>> +	p = NULL;
>> +	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
>>  
>>  	/* Calc the 'end' offset past end of ONE ctl_info structure
>>  	 * which will become the start of the 'instance' array
>>  	 */
>> -	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
>> +	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
>>  
>>  	/* Calc the 'end' offset past the instance array within the ctl_info
>>  	 * which will become the start of the block array
>>  	 */
>> -	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
>> +	count = nr_instances * nr_blocks;
>> +	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
>>  
>>  	/* Calc the 'end' offset past the dev_blk array
>>  	 * which will become the start of the attrib array, if any.
>>  	 */
>> -	count = nr_instances * nr_blocks;
>> -	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
>> -
>> -	/* Check for case of when an attribute array is specified */
>> -	if (nr_attrib > 0) {
>> -		/* calc how many nr_attrib we need */
>> +	/* calc how many nr_attrib we need */
>> +	if (nr_attrib > 0)
>>  		count *= nr_attrib;
>> +	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
>>  
>> -		/* Calc the 'end' offset past the attributes array */
>> -		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
>> -	} else {
>> -		/* no attribute array specificed */
>> -		pvt = edac_align_ptr(dev_attrib, sz_private);
>> -	}
>> +	/* Calc the 'end' offset past the attributes array */
>> +	pvt = edac_align_ptr(&p, sz_private, 1);
>>  
>>  	/* 'pvt' now points to where the private data area is.
>>  	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index ffedae9..775a3ff 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -101,16 +101,28 @@ const char *edac_mem_types[] = {
>>  };
>>  EXPORT_SYMBOL_GPL(edac_mem_types);
>>  
>> -/* 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
>> +/**
>> + * edac_align_ptr - Prepares the pointer offsets for a single-shot allocation
>> + * @p:		pointer to a pointer with the memory offset to be used. At
>> + *		return, this will be incremented to point to the next offset
>> + * @size:	Size of the data structure to be reserved
>> + * @count:	Number of elements that should be reserved
>> + *
>> + * 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
> 
> There's no 'ptr' argument anymore. Also, the text doesn't apply anymore
> since the ptr is not possibly unaligned but the returned pointer *p is
> properly aligned to size * count.

While this comment was placed before the function, it actually belongs to
the logic that aligns the pointer, so I'll move it inside the code.

> Also, this pointer is absolutely needed to keep the proper advancing
> further in memory to the proper offsets when allocating the struct along
> with its embedded structs, as edac_device_alloc_ctl_info() does it
> above, for example.

I'll add the above comment.

> 
>>   * Adjust 'ptr' so that its alignment is at least as stringent as what the
>>   * compiler would provide for X and return the aligned result.
>>   *
>>   * If 'size' is a constant, the compiler will optimize this whole function
>> - * down to either a no-op or the addition of a constant to the value of 'ptr'.
>> + * down to either a no-op or the addition of a constant to the value of '*p'.
>> + *
>> + * At return, the pointer 'p' will be incremented.
>>   */
>> -void *edac_align_ptr(void *ptr, unsigned size)
>> +void *edac_align_ptr(void **p, unsigned size, int count)
> 
> 'count' is non-descriptive and at least ambiguous as to what it relates
> to - call it 'n_elems' instead.

Ok.

>>  {
>>  	unsigned align, r;
>> +	void *ptr = *p;
>> +
>> +	*p += size * count;
>>  
>>  	/* Here we assume that the alignment of a "long long" is the most
>>  	 * stringent alignment that the compiler will ever provide by default.
>> @@ -132,6 +144,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
>>  	if (r == 0)
>>  		return (char *)ptr;
>>  
>> +	*p += align - r;
>> +
>>  	return (void *)(((unsigned long)ptr) + align - r);
>>  }
> 
> In general, this edac_align_ptr is not really helpful because it requires
> the caller to know the exact layout of the struct it allocates memory
> for and what structs it has embedded. And frankly, I don't know how much
> it would help but I hear unaligned pointers are something bad on some
> !x86 architectures.
> 
> Oh well...

AFAICT, badly aligned data can have a serious performance impact on some
RISC processors.

Anyway, all of the above points are addressed by the diff below, which
I'll fold into the original patch.

Thanks for the review,
Mauro

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 775a3ff..6ec967a 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -106,25 +106,32 @@ EXPORT_SYMBOL_GPL(edac_mem_types);
  * @p:		pointer to a pointer with the memory offset to be used. At
  *		return, this will be incremented to point to the next offset
  * @size:	Size of the data structure to be reserved
- * @count:	Number of elements that should be reserved
- *
- * 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
- * Adjust 'ptr' so that its alignment is at least as stringent as what the
- * compiler would provide for X and return the aligned result.
+ * @n_elems:	Number of elements that should be reserved
  *
  * If 'size' is a constant, the compiler will optimize this whole function
  * down to either a no-op or the addition of a constant to the value of '*p'.
  *
- * At return, the pointer 'p' will be incremented.
+ * The 'p' pointer is absolutely needed to keep advancing through memory
+ * to the proper offsets when allocating a struct along with its
+ * embedded structs in a single allocation, as
+ * edac_device_alloc_ctl_info() does, for example.
+ *
+ * At return, the pointer 'p' will be incremented to be used on a next call
+ * to this function.
  */
-void *edac_align_ptr(void **p, unsigned size, int count)
+void *edac_align_ptr(void **p, unsigned size, int n_elems)
 {
 	unsigned align, r;
 	void *ptr = *p;
 
-	*p += size * count;
+	*p += size * n_elems;
 
-	/* Here we assume that the alignment of a "long long" is the most
+	/*
+	 * 'p' can possibly be an unaligned item X such that sizeof(X) is
+	 * 'size'.  Adjust 'p' so that its alignment is at least as
+	 * stringent as what the compiler would provide for X and return
+	 * the aligned result.
+	 * Here we assume that the alignment of a "long long" is the most
 	 * stringent alignment that the compiler will ever provide by default.
 	 * As far as I know, this is a reasonable assumption.
 	 */
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 7a19b1b..0ea7d14 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
 extern void edac_mc_reset_delay_period(int value);
 
-extern void *edac_align_ptr(void **p, unsigned size, int count);
+extern void *edac_align_ptr(void **p, unsigned size, int n_elems);
 
 /*
  * EDAC PCI functions





* [PATCH] edac: rewrite edac_align_ptr()
  2012-04-23 15:19           ` Mauro Carvalho Chehab
@ 2012-04-23 15:26             ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 15:26 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson

The edac_align_ptr() function is used to prepare data for a single
memory allocation kzalloc() call. It counts how many bytes are needed
by some data structure.

Using it as-is is not that trivial, as the number of elements to reserve
is not passed to it; instead, it is accounted for on the next call.

In order to avoid mistakes when using it, pass the number of elements to
reserve as an argument, making it easier to use.

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
v15: Fixes some comments and renames one variable in edac_align_ptr(),
     in order to improve the code readability, as per Borislav's review.

 drivers/edac/edac_device.c |   27 +++++++++++----------------
 drivers/edac/edac_mc.c     |   44 +++++++++++++++++++++++++++++++++-----------
 drivers/edac/edac_module.h |    2 +-
 drivers/edac/edac_pci.c    |    6 +++---
 4 files changed, 48 insertions(+), 31 deletions(-)

diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index 4b15459..cb397d9 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -79,7 +79,7 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	unsigned total_size;
 	unsigned count;
 	unsigned instance, block, attr;
-	void *pvt;
+	void *pvt, *p;
 	int err;
 
 	debugf4("%s() instances=%d blocks=%d\n",
@@ -92,35 +92,30 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	 * to be at least as stringent as what the compiler would
 	 * provide if we could simply hardcode everything into a single struct.
 	 */
-	dev_ctl = (struct edac_device_ctl_info *)NULL;
+	p = NULL;
+	dev_ctl = edac_align_ptr(&p, sizeof(*dev_ctl), 1);
 
 	/* Calc the 'end' offset past end of ONE ctl_info structure
 	 * which will become the start of the 'instance' array
 	 */
-	dev_inst = edac_align_ptr(&dev_ctl[1], sizeof(*dev_inst));
+	dev_inst = edac_align_ptr(&p, sizeof(*dev_inst), nr_instances);
 
 	/* Calc the 'end' offset past the instance array within the ctl_info
 	 * which will become the start of the block array
 	 */
-	dev_blk = edac_align_ptr(&dev_inst[nr_instances], sizeof(*dev_blk));
+	count = nr_instances * nr_blocks;
+	dev_blk = edac_align_ptr(&p, sizeof(*dev_blk), count);
 
 	/* Calc the 'end' offset past the dev_blk array
 	 * which will become the start of the attrib array, if any.
 	 */
-	count = nr_instances * nr_blocks;
-	dev_attrib = edac_align_ptr(&dev_blk[count], sizeof(*dev_attrib));
-
-	/* Check for case of when an attribute array is specified */
-	if (nr_attrib > 0) {
-		/* calc how many nr_attrib we need */
+	/* calc how many nr_attrib we need */
+	if (nr_attrib > 0)
 		count *= nr_attrib;
+	dev_attrib = edac_align_ptr(&p, sizeof(*dev_attrib), count);
 
-		/* Calc the 'end' offset past the attributes array */
-		pvt = edac_align_ptr(&dev_attrib[count], sz_private);
-	} else {
-		/* no attribute array specificed */
-		pvt = edac_align_ptr(dev_attrib, sz_private);
-	}
+	/* Calc the 'end' offset past the attributes array */
+	pvt = edac_align_ptr(&p, sz_private, 1);
 
 	/* 'pvt' now points to where the private data area is.
 	 * At this point 'pvt' (like dev_inst,dev_blk and dev_attrib)
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index ffedae9..6ec967a 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -101,18 +101,37 @@ const char *edac_mem_types[] = {
 };
 EXPORT_SYMBOL_GPL(edac_mem_types);
 
-/* 'ptr' points to a possibly unaligned item X such that sizeof(X) is 'size'.
- * Adjust 'ptr' so that its alignment is at least as stringent as what the
- * compiler would provide for X and return the aligned result.
+/**
+ * edac_align_ptr - Prepares the pointer offsets for a single-shot allocation
+ * @p:		pointer to a pointer with the memory offset to be used. At
+ *		return, this will be incremented to point to the next offset
+ * @size:	Size of the data structure to be reserved
+ * @n_elems:	Number of elements that should be reserved
  *
  * If 'size' is a constant, the compiler will optimize this whole function
- * down to either a no-op or the addition of a constant to the value of 'ptr'.
+ * down to either a no-op or the addition of a constant to the value of '*p'.
+ *
+ * The 'p' pointer is absolutely needed to keep advancing through memory
+ * to the proper offsets when allocating the struct along with its
+ * embedded structs, as edac_device_alloc_ctl_info() does, for
+ * example.
+ *
+ * At return, the pointer 'p' will be incremented to be used on a next call
+ * to this function.
  */
-void *edac_align_ptr(void *ptr, unsigned size)
+void *edac_align_ptr(void **p, unsigned size, int n_elems)
 {
 	unsigned align, r;
+	void *ptr = *p;
 
-	/* Here we assume that the alignment of a "long long" is the most
+	*p += size * n_elems;
+
+	/*
+	 * 'p' can possibly be an unaligned item X such that sizeof(X) is
+	 * 'size'.  Adjust 'p' so that its alignment is at least as
+	 * stringent as what the compiler would provide for X and return
+	 * the aligned result.
+	 * Here we assume that the alignment of a "long long" is the most
 	 * stringent alignment that the compiler will ever provide by default.
 	 * As far as I know, this is a reasonable assumption.
 	 */
@@ -132,6 +151,8 @@ void *edac_align_ptr(void *ptr, unsigned size)
 	if (r == 0)
 		return (char *)ptr;
 
+	*p += align - r;
+
 	return (void *)(((unsigned long)ptr) + align - r);
 }
 
@@ -154,6 +175,7 @@ void *edac_align_ptr(void *ptr, unsigned size)
 struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 				unsigned nr_chans, int edac_index)
 {
+	void *ptr = NULL;
 	struct mem_ctl_info *mci;
 	struct csrow_info *csi, *csrow;
 	struct rank_info *chi, *chp, *chan;
@@ -168,11 +190,11 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * stringent as what the compiler would provide if we could simply
 	 * hardcode everything into a single struct.
 	 */
-	mci = (struct mem_ctl_info *)0;
-	csi = edac_align_ptr(&mci[1], sizeof(*csi));
-	chi = edac_align_ptr(&csi[nr_csrows], sizeof(*chi));
-	dimm = edac_align_ptr(&chi[nr_chans * nr_csrows], sizeof(*dimm));
-	pvt = edac_align_ptr(&dimm[nr_chans * nr_csrows], sz_pvt);
+	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	mci = kzalloc(size, GFP_KERNEL);
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 00f81b4..0ea7d14 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -50,7 +50,7 @@ extern void edac_device_reset_delay_period(struct edac_device_ctl_info
 					   *edac_dev, unsigned long value);
 extern void edac_mc_reset_delay_period(int value);
 
-extern void *edac_align_ptr(void *ptr, unsigned size);
+extern void *edac_align_ptr(void **p, unsigned size, int n_elems);
 
 /*
  * EDAC PCI functions
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index 63af1c5..f1ac866 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -42,13 +42,13 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 						const char *edac_pci_name)
 {
 	struct edac_pci_ctl_info *pci;
-	void *pvt;
+	void *p = NULL, *pvt;
 	unsigned int size;
 
 	debugf1("%s()\n", __func__);
 
-	pci = (struct edac_pci_ctl_info *)0;
-	pvt = edac_align_ptr(&pci[1], sz_pvt);
+	pci = edac_align_ptr(&p, sizeof(*pci), 1);
+	pvt = edac_align_ptr(&p, 1, sz_pvt);
 	size = ((unsigned long)pvt) + sz_pvt;
 
 	/* Alloc the needed control struct memory */
-- 
1.7.8
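
As an illustration of the single-shot allocation pattern that
edac_align_ptr() supports, here is a minimal user-space sketch. It is
not the kernel code: reserve(), alloc_ctl(), struct ctl and struct item
are hypothetical names, and the sketch tracks a byte offset instead of a
fake pointer. It always aligns to sizeof(long long), following the
patch's assumption that this is the most stringent default alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the EDAC structures; only their sizes matter. */
struct ctl  { int id; long long flags; };
struct item { short a; char b; };

/* Assumption borrowed from the patch: "long long" has the strictest alignment. */
#define SKETCH_ALIGN sizeof(long long)

/*
 * Reserve room for n_elems objects of 'size' bytes. Returns the aligned
 * offset of the first element and advances *cursor past the reserved area,
 * so that the next call continues from there.
 */
static size_t reserve(size_t *cursor, size_t size, size_t n_elems)
{
	size_t start;

	/* The rounding below only works for power-of-two alignments. */
	assert((SKETCH_ALIGN & (SKETCH_ALIGN - 1)) == 0);

	start = (*cursor + SKETCH_ALIGN - 1) & ~(SKETCH_ALIGN - 1);
	*cursor = start + size * n_elems;
	return start;
}

/* Pass 1: compute offsets from zero. Pass 2: allocate once and rebase them. */
static struct ctl *alloc_ctl(size_t n_items, size_t sz_pvt,
			     struct item **items, void **pvt)
{
	size_t cursor = 0;
	size_t off_ctl  = reserve(&cursor, sizeof(struct ctl), 1);
	size_t off_item = reserve(&cursor, sizeof(struct item), n_items);
	size_t off_pvt  = reserve(&cursor, sz_pvt, 1);
	char *base = calloc(1, cursor);	/* zeroed, like kzalloc() */

	if (!base)
		return NULL;

	*items = (struct item *)(base + off_item);
	*pvt = base + off_pvt;
	/* off_ctl is 0, so the returned pointer is also the one to free(). */
	return (struct ctl *)(base + off_ctl);
}
```

Each reserve() call carries its own element count, which is the point of
the API change: the caller no longer has to encode the count of one
array in the pointer passed to the following call.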



* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-16 20:12   ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Mauro Carvalho Chehab
@ 2012-04-23 17:49     ` Borislav Petkov
  2012-04-23 18:30       ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-23 17:49 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Subject: "edac.h: Prepare to handle with generic layers"

what does that even mean?

Do you per chance mean

	"Add generic layers for describing a memory location"

or something similar?

On Mon, Apr 16, 2012 at 05:12:12PM -0300, Mauro Carvalho Chehab wrote:
> The edac core were written with the idea that memory controllers
> are able to directly access csrows, and that the channels are
> used inside a csrows select.
> 
> This is not true for FB-DIMM and RAMBUS memory controllers.
> 
> Also, some recent advanced memory controllers don't present a per-csrows
> view. Instead, they view memories as DIMM's, instead of ranks, accessed

				       DIMMs

> via csrow/channel.
> 
> So, changes are needed in order to allow the EDAC core to
> work with all types of architectures.
> 
> As a preparation for handling non-csrows based memory controllers,

 In preparation...

> adds some memory structs and a macro:

  add some...

> enum hw_event_mc_err_type: describes the type of error
> 			   (corrected, uncorrected, fatal)
> 
> To be used by the new edac_mc_handle_error function;
> 
> enum edac_mc_layer: describes the type of a given Memory

						    memory

> architecture layer (branch, channel, slot, csrow).
> 
> struct edac_mc_layer: describes the properties of a memory
> 		      layer (type, size, and if the layer
> 		      will be used on a virtual csrow.
> 
> GET_POS() - as the number of layers can vary from 1 to 3,
> this macro converts from an address with up to 3 layers into
> a linear address.
> 
> Cc: Doug Thompson <norsk5@yahoo.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
>  include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 files changed, 82 insertions(+), 1 deletions(-)
> 
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 8b78bd0..0fdf6ba 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -67,6 +67,25 @@ enum dev_type {
>  #define DEV_FLAG_X64		BIT(DEV_X64)
>  
>  /**
> + * enum hw_event_mc_err_type - type of the detected error
> + *
> + * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
> + *				corrected error was detected
> + * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
> + *				can't be corrected by ECC, but it is not
> + *				factal (maybe it is on an unused memory area,

				fatal

> + *				or the memory controller could recover from
> + *				it for example, by re-trying the operation).
> + * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
> + *				be recovered.
> + */
> +enum hw_event_mc_err_type {
> +	HW_EVENT_ERR_CORRECTED,
> +	HW_EVENT_ERR_UNCORRECTED,
> +	HW_EVENT_ERR_FATAL,

Need a terminating elem here:
	HW_EVENT_ERR_NUM,

> +};
> +
> +/**
>   * enum mem_type - memory types. For a more detailed reference, please see
>   *			http://en.wikipedia.org/wiki/DRAM
>   *
> @@ -308,7 +327,69 @@ enum scrub_type {
>   * PS - I enjoyed writing all that about as much as you enjoyed reading it.
>   */
>  
> -/* FIXME: add a per-dimm ce error count */
> +/**
> + * enum edac_mc_layer - memory controller hierarchy layer
> + *
> + * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
> + * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
> + * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
> + * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
> + *
> + * This enum is used by the drivers to tell edac_mc_sysfs what name should
> + * be used when describing a memory stick location.
> + */
> +enum edac_mc_layer_type {
> +	EDAC_MC_LAYER_BRANCH,
> +	EDAC_MC_LAYER_CHANNEL,
> +	EDAC_MC_LAYER_SLOT,
> +	EDAC_MC_LAYER_CHIP_SELECT,

ditto.

> +};
> +
> +/**
> + * struct edac_mc_layer - describes the memory controller hierarchy
> + * @layer:		layer type
> + * @size:maximum size of the layer
> + * @is_csrow:		This layer is part of the "csrow" when old API
> + *			compatibility mode is enabled. Otherwise, it is
> + *			a channel
> + */
> +struct edac_mc_layer {
> +	enum edac_mc_layer_type	type;
> +	unsigned		size;
> +	bool			is_csrow;
> +};

Huh, why do you need is_csrow? Can't do

	type = EDAC_MC_LAYER_CHIP_SELECT;

?

> +
> +/*
> + * Maximum number of layers used by the memory controller to uniquelly

								uniquely

> + * identify a single memory stick.
> + * NOTE: incrementing it would require changes at edac_mc_handle_error()
> + * and at the routines at edac_mc_sysfs that create layers

Maybe add their names here with a regex or so: edac_mc_blabla_*
?

> + */
> +#define EDAC_MAX_LAYERS		3
> +
> +/*
> + * A loop could be used here to make it more generic, but, as we only have
> + * 3 layers, this is a little faster. By design, layers can never be 0 or
> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
> + * during the memory allocation routine, with would point to the developer
> + * that he's doing something wrong.
> + */
> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\

This is returning size per layers so it cannot be GET_POS(), AFAICT.
EDAC_GET_SIZE or similar maybe?

> +	typeof(var) __p;						\
> +	if ((nlayers) == 1)						\
> +		__p = &var[lay0];					\
> +	else if ((nlayers) == 2)					\
> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
> +	else if ((nlayers) == 3)					\
> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
> +			    ((layers[1]).size * (lay0))))];		\
> +	else								\
> +		__p = NULL;						\
> +	__p;								\
> +})

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-23 17:49     ` Borislav Petkov
@ 2012-04-23 18:30       ` Mauro Carvalho Chehab
  2012-04-23 18:56         ` Mauro Carvalho Chehab
  2012-04-24 10:40         ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
  0 siblings, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 18:30 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 23-04-2012 17:49, Borislav Petkov escreveu:
> Subject: "edac.h: Prepare to handle with generic layers"
> 
> what does that even mean?
> 
> Do you per chance mean
> 
> 	"Add generic layers for describing a memory location"
> 
> or something similar?
> 
> On Mon, Apr 16, 2012 at 05:12:12PM -0300, Mauro Carvalho Chehab wrote:
>> The edac core were written with the idea that memory controllers
>> are able to directly access csrows, and that the channels are
>> used inside a csrows select.
>>
>> This is not true for FB-DIMM and RAMBUS memory controllers.
>>
>> Also, some recent advanced memory controllers don't present a per-csrows
>> view. Instead, they view memories as DIMM's, instead of ranks, accessed
> 
> 				       DIMMs
> 
>> via csrow/channel.
>>
>> So, changes are needed in order to allow the EDAC core to
>> work with all types of architectures.
>>
>> As a preparation for handling non-csrows based memory controllers,
> 
>  In preparation...
> 
>> adds some memory structs and a macro:
> 
>   add some...
> 
>> enum hw_event_mc_err_type: describes the type of error
>> 			   (corrected, uncorrected, fatal)
>>
>> To be used by the new edac_mc_handle_error function;
>>
>> enum edac_mc_layer: describes the type of a given Memory
> 
> 						    memory
> 
>> architecture layer (branch, channel, slot, csrow).
>>
>> struct edac_mc_layer: describes the properties of a memory
>> 		      layer (type, size, and if the layer
>> 		      will be used on a virtual csrow.
>>
>> GET_POS() - as the number of layers can vary from 1 to 3,
>> this macro converts from an address with up to 3 layers into
>> a linear address.
>>
>> Cc: Doug Thompson <norsk5@yahoo.com>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>> ---
>>  include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>  1 files changed, 82 insertions(+), 1 deletions(-)
>>
>> diff --git a/include/linux/edac.h b/include/linux/edac.h
>> index 8b78bd0..0fdf6ba 100644
>> --- a/include/linux/edac.h
>> +++ b/include/linux/edac.h
>> @@ -67,6 +67,25 @@ enum dev_type {
>>  #define DEV_FLAG_X64		BIT(DEV_X64)
>>  
>>  /**
>> + * enum hw_event_mc_err_type - type of the detected error
>> + *
>> + * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
>> + *				corrected error was detected
>> + * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
>> + *				can't be corrected by ECC, but it is not
>> + *				factal (maybe it is on an unused memory area,
> 
> 				fatal
> 

Fixed all the above.

>> + *				or the memory controller could recover from
>> + *				it for example, by re-trying the operation).
>> + * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
>> + *				be recovered.
>> + */
>> +enum hw_event_mc_err_type {
>> +	HW_EVENT_ERR_CORRECTED,
>> +	HW_EVENT_ERR_UNCORRECTED,
>> +	HW_EVENT_ERR_FATAL,
> 
> Need a terminating elem here:
> 	HW_EVENT_ERR_NUM,

Why? There's no place where the number of types is needed. Note that
no other EDAC enum has an element for the count.

IMHO, we shouldn't add any code there that won't be used. If needed later, such a
change can be added at any time.
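
For reference, a count element of the kind suggested above is typically
used to size per-type tables. The sketch below only illustrates that
idiom: HW_EVENT_ERR_NUM and count_error() are hypothetical and are not
part of this patch.

```c
#include <assert.h>

enum hw_event_mc_err_type {
	HW_EVENT_ERR_CORRECTED,
	HW_EVENT_ERR_UNCORRECTED,
	HW_EVENT_ERR_FATAL,
	HW_EVENT_ERR_NUM,	/* hypothetical terminator, not in the patch */
};

/* Hypothetical helper: bump a per-type counter sized by the terminator. */
static void count_error(unsigned counts[HW_EVENT_ERR_NUM],
			enum hw_event_mc_err_type type)
{
	/* The terminator doubles as a bound for range checks. */
	assert((unsigned)type < HW_EVENT_ERR_NUM);
	counts[type]++;
}
```

Without such a user, the extra element would be dead code.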

> 
>> +};
>> +
>> +/**
>>   * enum mem_type - memory types. For a more detailed reference, please see
>>   *			http://en.wikipedia.org/wiki/DRAM
>>   *
>> @@ -308,7 +327,69 @@ enum scrub_type {
>>   * PS - I enjoyed writing all that about as much as you enjoyed reading it.
>>   */
>>  
>> -/* FIXME: add a per-dimm ce error count */
>> +/**
>> + * enum edac_mc_layer - memory controller hierarchy layer
>> + *
>> + * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
>> + * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
>> + * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
>> + * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
>> + *
>> + * This enum is used by the drivers to tell edac_mc_sysfs what name should
>> + * be used when describing a memory stick location.
>> + */
>> +enum edac_mc_layer_type {
>> +	EDAC_MC_LAYER_BRANCH,
>> +	EDAC_MC_LAYER_CHANNEL,
>> +	EDAC_MC_LAYER_SLOT,
>> +	EDAC_MC_LAYER_CHIP_SELECT,
> 
> ditto.

ditto.

> 
>> +};
>> +
>> +/**
>> + * struct edac_mc_layer - describes the memory controller hierarchy
>> + * @layer:		layer type
>> + * @size:maximum size of the layer
>> + * @is_csrow:		This layer is part of the "csrow" when old API
>> + *			compatibility mode is enabled. Otherwise, it is
>> + *			a channel
>> + */
>> +struct edac_mc_layer {
>> +	enum edac_mc_layer_type	type;
>> +	unsigned		size;
>> +	bool			is_csrow;
>> +};
> 
> Huh, why do you need is_csrow? Can't do
> 
> 	type = EDAC_MC_LAYER_CHIP_SELECT;
> 
> ?

No, that's different. For a csrow-based memory controller, is_csrow is equal to
type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
is used to mark which layers will be used for the "fake csrow" exported by the
EDAC core through the legacy API.

This field will be dropped together with the legacy API in some future kernel,
but, for now, it is needed in order to avoid breaking the userspace API.
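
As a rough sketch of the idea (not code from this series), the legacy
csrow/channel grid could be derived from the layer flags as shown below.
legacy_dims() is a hypothetical helper, and the rule that flagged layers
multiply into the csrow count while the others multiply into the channel
count is an assumption based on the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum edac_mc_layer_type {
	EDAC_MC_LAYER_BRANCH,
	EDAC_MC_LAYER_CHANNEL,
	EDAC_MC_LAYER_SLOT,
	EDAC_MC_LAYER_CHIP_SELECT,
};

struct edac_mc_layer {
	enum edac_mc_layer_type	type;
	unsigned		size;
	bool			is_csrow;
};

/*
 * Hypothetical helper: compute the size of the "fake" csrow/channel grid
 * exposed by the legacy API. Layers flagged with is_csrow contribute to
 * the csrow count; all other layers contribute to the channel count.
 */
static void legacy_dims(const struct edac_mc_layer *layers, unsigned n_layers,
			unsigned *csrows, unsigned *channels)
{
	unsigned i;

	assert(layers != NULL);

	*csrows = 1;
	*channels = 1;
	for (i = 0; i < n_layers; i++) {
		if (layers[i].is_csrow)
			*csrows *= layers[i].size;
		else
			*channels *= layers[i].size;
	}
}
```

For a made-up controller with 4 channels and 3 slots per channel, where
only the slot layer is flagged, this gives 3 virtual csrows and
4 channels.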

> 
>> +
>> +/*
>> + * Maximum number of layers used by the memory controller to uniquelly
> 
> 								uniquely

Fixed.

> 
>> + * identify a single memory stick.
>> + * NOTE: incrementing it would require changes at edac_mc_handle_error()
>> + * and at the routines at edac_mc_sysfs that create layers
> 
> Maybe add their names here with a regex or so: edac_mc_blabla_*
> ?

With regard to the changes in edac_mc_sysfs, they will likely affect all per-dimm
routines, plus the counter reset logic. The problem with pointing to a set of
routines that need changes is that this list can/will change over time.

So, the intention behind this note is not to give an exhaustive list of what should
be changed if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
clue that incrementing the number of layers is not as easy as just changing
the define: the code that handles the layers would need to change as well.

> 
>> + */
>> +#define EDAC_MAX_LAYERS		3
>> +
>> +/*
>> + * A loop could be used here to make it more generic, but, as we only have
>> + * 3 layers, this is a little faster. By design, layers can never be 0 or
>> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
>> + * during the memory allocation routine, with would point to the developer
>> + * that he's doing something wrong.
>> + */
>> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
> 
> This is returning size per layers so it cannot be GET_POS(), AFAICT.
> EDAC_GET_SIZE or similar maybe?

This is not returning the size per layer. It is returning a pointer to the
structure that holds the dimm.

> 
>> +	typeof(var) __p;						\
>> +	if ((nlayers) == 1)						\
>> +		__p = &var[lay0];					\
>> +	else if ((nlayers) == 2)					\
>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>> +	else if ((nlayers) == 3)					\
>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>> +			    ((layers[1]).size * (lay0))))];		\
>> +	else								\
>> +		__p = NULL;						\
>> +	__p;								\
>> +})
> 

Regards,
Mauro


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-23 18:30       ` Mauro Carvalho Chehab
@ 2012-04-23 18:56         ` Mauro Carvalho Chehab
  2012-04-23 19:19           ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
  2012-04-24 10:40         ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 18:56 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 23-04-2012 18:30, Mauro Carvalho Chehab escreveu:
> Em 23-04-2012 17:49, Borislav Petkov escreveu:
>> Subject: "edac.h: Prepare to handle with generic layers"
>>
>> what does that even mean?
>>
>> Do you per chance mean
>>
>> 	"Add generic layers for describing a memory location"
>>
>> or something similar?
>>
>> On Mon, Apr 16, 2012 at 05:12:12PM -0300, Mauro Carvalho Chehab wrote:
>>> The edac core were written with the idea that memory controllers
>>> are able to directly access csrows, and that the channels are
>>> used inside a csrows select.
>>>
>>> This is not true for FB-DIMM and RAMBUS memory controllers.
>>>
>>> Also, some recent advanced memory controllers don't present a per-csrows
>>> view. Instead, they view memories as DIMM's, instead of ranks, accessed
>>
>> 				       DIMMs
>>
>>> via csrow/channel.
>>>
>>> So, changes are needed in order to allow the EDAC core to
>>> work with all types of architectures.
>>>
>>> As a preparation for handling non-csrows based memory controllers,
>>
>>  In preparation...
>>
>>> adds some memory structs and a macro:
>>
>>   add some...
>>
>>> enum hw_event_mc_err_type: describes the type of error
>>> 			   (corrected, uncorrected, fatal)
>>>
>>> To be used by the new edac_mc_handle_error function;
>>>
>>> enum edac_mc_layer: describes the type of a given Memory
>>
>> 						    memory
>>
>>> architecture layer (branch, channel, slot, csrow).
>>>
>>> struct edac_mc_layer: describes the properties of a memory
>>> 		      layer (type, size, and if the layer
>>> 		      will be used on a virtual csrow.
>>>
>>> GET_POS() - as the number of layers can vary from 1 to 3,
>>> this macro converts from an address with up to 3 layers into
>>> a linear address.
>>>
>>> Cc: Doug Thompson <norsk5@yahoo.com>
>>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>>> ---
>>>  include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
>>>  1 files changed, 82 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/include/linux/edac.h b/include/linux/edac.h
>>> index 8b78bd0..0fdf6ba 100644
>>> --- a/include/linux/edac.h
>>> +++ b/include/linux/edac.h
>>> @@ -67,6 +67,25 @@ enum dev_type {
>>>  #define DEV_FLAG_X64		BIT(DEV_X64)
>>>  
>>>  /**
>>> + * enum hw_event_mc_err_type - type of the detected error
>>> + *
>>> + * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
>>> + *				corrected error was detected
>>> + * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
>>> + *				can't be corrected by ECC, but it is not
>>> + *				factal (maybe it is on an unused memory area,
>>
>> 				fatal
>>
> 
> Fixed all the above.
> 
>>> + *				or the memory controller could recover from
>>> + *				it for example, by re-trying the operation).
>>> + * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
>>> + *				be recovered.
>>> + */
>>> +enum hw_event_mc_err_type {
>>> +	HW_EVENT_ERR_CORRECTED,
>>> +	HW_EVENT_ERR_UNCORRECTED,
>>> +	HW_EVENT_ERR_FATAL,
>>
>> Need a terminating elem here:
>> 	HW_EVENT_ERR_NUM,
> 
> Why? There's no place where the number of types is needed. It should be noticed
> no other EDAC enum's have an element for the count.
> 
> IMHO, we should't add any code there that won't be used. If latter needed, such
> change can be added anytime.
> 
>>
>>> +};
>>> +
>>> +/**
>>>   * enum mem_type - memory types. For a more detailed reference, please see
>>>   *			http://en.wikipedia.org/wiki/DRAM
>>>   *
>>> @@ -308,7 +327,69 @@ enum scrub_type {
>>>   * PS - I enjoyed writing all that about as much as you enjoyed reading it.
>>>   */
>>>  
>>> -/* FIXME: add a per-dimm ce error count */
>>> +/**
>>> + * enum edac_mc_layer - memory controller hierarchy layer
>>> + *
>>> + * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
>>> + * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
>>> + * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
>>> + * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
>>> + *
>>> + * This enum is used by the drivers to tell edac_mc_sysfs what name should
>>> + * be used when describing a memory stick location.
>>> + */
>>> +enum edac_mc_layer_type {
>>> +	EDAC_MC_LAYER_BRANCH,
>>> +	EDAC_MC_LAYER_CHANNEL,
>>> +	EDAC_MC_LAYER_SLOT,
>>> +	EDAC_MC_LAYER_CHIP_SELECT,
>>
>> ditto.
> 
> ditto.
> 
>>
>>> +};
>>> +
>>> +/**
>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>> + * @layer:		layer type
>>> + * @size:maximum size of the layer
>>> + * @is_csrow:		This layer is part of the "csrow" when old API
>>> + *			compatibility mode is enabled. Otherwise, it is
>>> + *			a channel
>>> + */
>>> +struct edac_mc_layer {
>>> +	enum edac_mc_layer_type	type;
>>> +	unsigned		size;
>>> +	bool			is_csrow;
>>> +};
>>
>> Huh, why do you need is_csrow? Can't do
>>
>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
>>
>> ?
> 
> No, that's different. For a csrow-based memory controller, is_csrow is equal to
> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
> is used to mark with layers will be used for the "fake csrow" exported by the
> EDAC core by the legacy API.
> 
> This field will be dropped together with the legacy API on some future Kernel,
> but, for now, it is needed, in order to avoid breaking the userspace API.

I don't like big var names, but, if you're not comfortable with is_csrow, then
we can call it "is_virtual_csrow".
> 
>>
>>> +
>>> +/*
>>> + * Maximum number of layers used by the memory controller to uniquelly
>>
>> 								uniquely
> 
> Fixed.
> 
>>
>>> + * identify a single memory stick.
>>> + * NOTE: incrementing it would require changes at edac_mc_handle_error()
>>> + * and at the routines at edac_mc_sysfs that create layers
>>
>> Maybe add their names here with a regex or so: edac_mc_blabla_*
>> ?
> 
> With regards to the changes at edac_mc_sysfs,  it will likely affect all per-dimm
> routines, plus the counters reset logic. The problem of pointing to a set of
> routines that need changes is that this list can/will change with time.
> 
> So, the intention behind this note is not to give an exhaustive list of what should
> be changed, if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
> clue that incrementing the number of layers is not as easy as just changing
> it: it would require to change the number of layers also at the code.
> 
>>
>>> + */
>>> +#define EDAC_MAX_LAYERS		3
>>> +
>>> +/*
>>> + * A loop could be used here to make it more generic, but, as we only have
>>> + * 3 layers, this is a little faster. By design, layers can never be 0 or
>>> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
>>> + * during the memory allocation routine, with would point to the developer
>>> + * that he's doing something wrong.
>>> + */
>>> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
>>
>> This is returning size per layers so it cannot be GET_POS(), AFAICT.
>> EDAC_GET_SIZE or similar maybe?
> 
> This is not returning the size, per layers. It is returning a pointer to the
> structure that holds the dimm.

Maybe it could be called EDAC_DIMM_PTR() instead.

> 
>>
>>> +	typeof(var) __p;						\
>>> +	if ((nlayers) == 1)						\
>>> +		__p = &var[lay0];					\
>>> +	else if ((nlayers) == 2)					\
>>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>>> +	else if ((nlayers) == 3)					\
>>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>>> +			    ((layers[1]).size * (lay0))))];		\
>>> +	else								\
>>> +		__p = NULL;						\
>>> +	__p;								\
>>> +})
>>
> 
> Regards,
> Mauro



* [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-23 18:56         ` Mauro Carvalho Chehab
@ 2012-04-23 19:19           ` Mauro Carvalho Chehab
  2012-04-23 20:07             ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 19:19 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The EDAC core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. They view memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrows based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
			   (corrected, uncorrected, fatal)

To be used by the new edac_mc_handle_error function;

enum edac_mc_layer_type: describes the type of a given memory
architecture layer (branch, channel, slot, csrow).

struct edac_mc_layer: describes the properties of a memory
		      layer (type, size, and whether the layer
		      will be used on a virtual csrow).

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts a location given by up to 3 layer indexes
into a pointer to the matching element of a linear array.
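
The index arithmetic can be sketched as a plain function. This is only
an illustration of the macro's logic: dimm_ptr() is a hypothetical name,
and struct dimm_info is reduced to a stand-in here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum edac_mc_layer_type {
	EDAC_MC_LAYER_BRANCH,
	EDAC_MC_LAYER_CHANNEL,
	EDAC_MC_LAYER_SLOT,
	EDAC_MC_LAYER_CHIP_SELECT,
};

struct edac_mc_layer {
	enum edac_mc_layer_type	type;
	unsigned		size;
	bool			is_virt_csrow;
};

/* Hypothetical stand-in for the per-DIMM structure. */
struct dimm_info { int id; };

/*
 * Row-major linearization: the last layer varies fastest. Returns NULL
 * when n_layers is outside the 1..3 range, as the macro does.
 */
static struct dimm_info *dimm_ptr(const struct edac_mc_layer *layers,
				  struct dimm_info *dimms, unsigned n_layers,
				  unsigned lay0, unsigned lay1, unsigned lay2)
{
	assert(layers != NULL && dimms != NULL);

	switch (n_layers) {
	case 1:
		return &dimms[lay0];
	case 2:
		return &dimms[lay1 + layers[1].size * lay0];
	case 3:
		return &dimms[lay2 + layers[2].size *
			      (lay1 + layers[1].size * lay0)];
	default:
		return NULL;
	}
}
```

With layer sizes 2, 3 and 4, the location (1, 2, 3) maps to index
3 + 4 * (2 + 3 * 1) = 23, the last of the 24 elements.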

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
v15: Fixed some comments, is_csrow renamed to is_virt_csrow,
     GET_POS renamed to EDAC_DIMM_PTR


 include/linux/edac.h |   83 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 82 insertions(+), 1 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 8b78bd0..243a92b 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -67,6 +67,25 @@ enum dev_type {
 #define DEV_FLAG_X64		BIT(DEV_X64)
 
 /**
+ * enum hw_event_mc_err_type - type of the detected error
+ *
+ * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
+ *				corrected error was detected
+ * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
+ *				can't be corrected by ECC, but it is not
+ *				fatal (maybe it is on an unused memory area,
+ *				or the memory controller could recover from
+ *				it for example, by re-trying the operation).
+ * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
+ *				be recovered.
+ */
+enum hw_event_mc_err_type {
+	HW_EVENT_ERR_CORRECTED,
+	HW_EVENT_ERR_UNCORRECTED,
+	HW_EVENT_ERR_FATAL,
+};
+
+/**
  * enum mem_type - memory types. For a more detailed reference, please see
  *			http://en.wikipedia.org/wiki/DRAM
  *
@@ -308,7 +327,69 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
-/* FIXME: add a per-dimm ce error count */
+/**
+ * enum edac_mc_layer_type - memory controller hierarchy layer
+ *
+ * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
+ * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
+ * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
+ * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
+ *
+ * This enum is used by the drivers to tell edac_mc_sysfs what name should
+ * be used when describing a memory stick location.
+ */
+enum edac_mc_layer_type {
+	EDAC_MC_LAYER_BRANCH,
+	EDAC_MC_LAYER_CHANNEL,
+	EDAC_MC_LAYER_SLOT,
+	EDAC_MC_LAYER_CHIP_SELECT,
+};
+
+/**
+ * struct edac_mc_layer - describes the memory controller hierarchy
+ * @layer:		layer type
+ * @size:maximum size of the layer
+ * @is_virt_csrow:	This layer is part of the "csrow" when old API
+ *			compatibility mode is enabled. Otherwise, it is
+ *			a channel
+ */
+struct edac_mc_layer {
+	enum edac_mc_layer_type	type;
+	unsigned		size;
+	bool			is_virt_csrow;
+};
+
+/*
+ * Maximum number of layers used by the memory controller to uniquely
+ * identify a single memory stick.
+ * NOTE: incrementing it would require changes at edac_mc_handle_error()
+ * and at the routines at edac_mc_sysfs that create layers
+ */
+#define EDAC_MAX_LAYERS		3
+
+/*
+ * A loop could be used here to make it more generic, but, as we only have
+ * 3 layers, this is a little faster. By design, layers can never be 0 or
+ * more than 3. If that ever happens, a NULL is returned, causing an OOPS
+ * during the memory allocation routine, which would show the developer
+ * that something is wrong.
+ */
+#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({		\
+	typeof(var) __p;						\
+	if ((nlayers) == 1)						\
+		__p = &var[lay0];					\
+	else if ((nlayers) == 2)					\
+		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
+	else if ((nlayers) == 3)					\
+		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
+			    ((layers[1]).size * (lay0))))];		\
+	else								\
+		__p = NULL;						\
+	__p;								\
+})
+
+
+/* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
 	unsigned memory_controller;
-- 
1.7.8



* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-23 19:19           ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
@ 2012-04-23 20:07             ` Mauro Carvalho Chehab
  2012-04-24 10:46               ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-23 20:07 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 23-04-2012 19:19, Mauro Carvalho Chehab escreveu:
> The edac core were written with the idea that memory controllers
> are able to directly access csrows, and that the channels are
> used inside a csrows select.
> 
> This is not true for FB-DIMM and RAMBUS memory controllers.
> 
> Also, some recent advanced memory controllers don't present a per-csrows
> view. Instead, they view memories as DIMMs, instead of ranks, accessed
> via csrow/channel.
> 
> So, changes are needed in order to allow the EDAC core to
> work with all types of architectures.
> 
> In preparation for handling non-csrows based memory controllers,
> add some memory structs and a macro:
> 
> enum hw_event_mc_err_type: describes the type of error
> 			   (corrected, uncorrected, fatal)
> 
> To be used by the new edac_mc_handle_error function;
> 
> enum edac_mc_layer: describes the type of a given memory
> architecture layer (branch, channel, slot, csrow).
> 
> struct edac_mc_layer: describes the properties of a memory
> 		      layer (type, size, and if the layer
> 		      will be used on a virtual csrow.
> 
> EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
> this macro converts from an address with up to 3 layers into
> a linear address.
> 
> Cc: Doug Thompson <norsk5@yahoo.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
> v15: Fixed some comments, is_csrow renamed to is_virt_csrow,
>      GET_POS renamed to EDAC_DIMM_PTR

There are 55 patches affected by this change. Applying them locally
was as simple as running this small script on the submitted patches:

$ for i in `quilt series`; do sed s,is_csrow,is_virt_csrow,g $i | sed s,GET_POS,EDAC_DIMM_PTR,g >a && mv a $i; done

However, mailbombing 55 patches just because of the above rename
would probably not be welcomed by the people on the ML. Also, at least
to me, it seems more logical to add a patch like that at the end of
the patch series than to force people to re-analyze the entire patchset.

Because of that, I won't resend the entire patchbomb to the ML.
The patches will be available anyway on my tree, after the kernel.org
mirror sync, at:

	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac.git hw_events_v15

The previous tree, with the enclosed rename patch at the end, is at:
	git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac.git hw_events_v14

PS.: I'm also storing the very latest version of this patch series at:
	git://git.infradead.org/users/mchehab/edac.git experimental

The only difference is that the experimental branch there will be rebased
every time I need to modify a patch on this series, while I'll create a new
branch at the kernel.org tree with the changes, instead of rebasing an
existing one.

-

[PATCH] edac: rename is_csrow and GET_POS

Those names don't describe well what they stand for.
So, rename them to be meaningful.

As "is_csrow" is used to indicate that a layer is part of the
virtual csrow, let's name it "is_virt_csrow".

As "GET_POS" is used to get the EDAC mci dimm_info pointer for
a given layer address, let's call it "EDAC_DIMM_PTR".
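
To make the role of the is_virt_csrow flag concrete, here is a minimal
standalone sketch of how the per-layer sizes are folded into totals,
modelled on the loop in edac_mc_alloc() touched below. The names
struct layer_desc, struct layer_totals and count_layers() are
hypothetical, and starting all three totals at 1 is an assumption (only
the tot_csrows initialization is visible in the hunk).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct edac_mc_layer (type field omitted). */
struct layer_desc {
	unsigned size;
	bool is_virt_csrow;
};

/* Hypothetical container for the three products computed below. */
struct layer_totals {
	unsigned dimms;
	unsigned csrows;
	unsigned cschannels;
};

/*
 * Every layer multiplies the DIMM count. A layer flagged is_virt_csrow
 * also multiplies the virtual csrow count; any other layer multiplies
 * the virtual channel count instead.
 */
struct layer_totals count_layers(const struct layer_desc *layers,
				 unsigned n_layers)
{
	struct layer_totals t = { 1, 1, 1 };
	unsigned i;

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.cschannels *= layers[i].size;
	}
	return t;
}
```

For a branch/channel/slot layout with made-up sizes 2, 2 and 4, where
only the slot layer is flagged, this gives 16 DIMMs exposed as 4 virtual
csrows with 4 virtual channels each.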

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 08af66c..50467a0 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -2568,10 +2568,10 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	ret = -ENOMEM;
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = pvt->csels[0].b_cnt;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = pvt->channel_count;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;
diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index 99d8d56..be6c225 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -247,10 +247,10 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = AMD76X_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 
 	if (mci == NULL)
diff --git a/drivers/edac/cell_edac.c b/drivers/edac/cell_edac.c
index ee61f0d..d28167b 100644
--- a/drivers/edac/cell_edac.c
+++ b/drivers/edac/cell_edac.c
@@ -200,10 +200,10 @@ static int __devinit cell_edac_probe(struct platform_device *pdev)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = 1;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = num_chans;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct cell_edac_priv));
 	if (mci == NULL)
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 0203089..31b3c91 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -978,10 +978,10 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = CPC925_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct cpc925_mc_pdata));
 	if (!mci) {
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 35f282e..7e601c1 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -1293,10 +1293,10 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = E752X_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = drc_chan + 1;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 	if (mci == NULL)
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 93695e4..2defa96 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -445,10 +445,10 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = E7XXX_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = drc_chan + 1;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index d3dd0dd..6853935 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -240,7 +240,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	tot_csrows = 1;
 	for (i = 0; i < n_layers; i++) {
 		tot_dimms *= layers[i].size;
-		if (layers[i].is_csrow)
+		if (layers[i].is_virt_csrow)
 			tot_csrows *= layers[i].size;
 		else
 			tot_cschannels *= layers[i].size;
@@ -380,7 +380,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		/* Increment csrow location */
 		if (!rev_order) {
 			for (j = n_layers - 1; j >= 0; j--)
-				if (!layers[j].is_csrow)
+				if (!layers[j].is_virt_csrow)
 					break;
 			chn++;
 			if (chn == tot_cschannels) {
@@ -389,7 +389,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 			}
 		} else {
 			for (j = n_layers - 1; j >= 0; j--)
-				if (layers[j].is_csrow)
+				if (layers[j].is_virt_csrow)
 					break;
 			row++;
 			if (row == tot_csrows) {
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 15df2bc..55eff02 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -358,10 +358,10 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = I3000_RANKS / nr_channels;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index acb5d39..818ee6f 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -343,10 +343,10 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = I3200_DIMMS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_channels;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(struct i3200_priv));
 	if (!mci)
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 3626225..fda19b4 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1280,7 +1280,7 @@ static int i5000_init_csrows(struct mem_ctl_info *mci)
 			if (!MTR_DIMMS_PRESENT(mtr))
 				continue;
 
-			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers,
 				       channel / MAX_BRANCHES,
 				       channel % MAX_BRANCHES, slot);
 
@@ -1397,13 +1397,13 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_BRANCH;
 	layers[0].size = MAX_BRANCHES;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = num_channels / MAX_BRANCHES;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = num_dimms_per_channel;
-	layers[2].is_csrow = true;
+	layers[2].is_virt_csrow = true;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
diff --git a/drivers/edac/i5100_edac.c b/drivers/edac/i5100_edac.c
index dd260c8..a5a7ca4 100644
--- a/drivers/edac/i5100_edac.c
+++ b/drivers/edac/i5100_edac.c
@@ -844,7 +844,7 @@ static void __devinit i5100_init_csrows(struct mem_ctl_info *mci)
 		if (!npages)
 			continue;
 
-		dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+		dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers,
 			       chan, rank, 0);
 
 		dimm->nr_pages = npages;
@@ -932,10 +932,10 @@ static int __devinit i5100_init_one(struct pci_dev *pdev,
 
 	layers[0].type = EDAC_MC_LAYER_CHANNEL;
 	layers[0].size = 2;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = ranksperch;
-	layers[1].is_csrow = true;
+	layers[1].is_virt_csrow = true;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*priv));
 	if (!mci) {
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 74b64c6..676591e 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1198,7 +1198,7 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
 			if (!MTR_DIMMS_PRESENT(mtr))
 				continue;
 
-			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers,
 				       channel / 2, channel % 2, slot);
 
 			size_mb =  pvt->dimm_info[slot][channel].megabytes;
@@ -1286,13 +1286,13 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	layers[0].type = EDAC_MC_LAYER_BRANCH;
 	layers[0].size = MAX_BRANCHES;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = CHANNELS_PER_BRANCH;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = DIMMS_PER_CHANNEL;
-	layers[2].is_csrow = true;
+	layers[2].is_virt_csrow = true;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index f9a4fa4..7425f17 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -795,7 +795,7 @@ static int i7300_init_csrows(struct mem_ctl_info *mci)
 			for (ch = 0; ch < MAX_CH_PER_BRANCH; ch++) {
 				int channel = to_channel(ch, branch);
 
-				dimm = GET_POS(mci->layers, mci->dimms,
+				dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms,
 					       mci->n_layers, branch, ch, slot);
 
 				dinfo = &pvt->dimm_info[slot][channel];
@@ -1044,13 +1044,13 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	/* allocate a new MC control structure */
 	layers[0].type = EDAC_MC_LAYER_BRANCH;
 	layers[0].size = MAX_BRANCHES;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = MAX_CH_PER_BRANCH;
-	layers[1].is_csrow = true;
+	layers[1].is_virt_csrow = true;
 	layers[2].type = EDAC_MC_LAYER_SLOT;
 	layers[2].size = MAX_SLOTS;
-	layers[2].is_csrow = true;
+	layers[2].is_virt_csrow = true;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 
 	if (mci == NULL)
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index cf27af8..dfdee48 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -596,7 +596,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 			if (!DIMM_PRESENT(dimm_dod[j]))
 				continue;
 
-			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers,
 				       i, j, 0);
 			banks = numbank(MC_DOD_NUMBANK(dimm_dod[j]));
 			ranks = numrank(MC_DOD_NUMRANK(dimm_dod[j]));
@@ -2229,10 +2229,10 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 
 	layers[0].type = EDAC_MC_LAYER_CHANNEL;
 	layers[0].size = NUM_CHANS;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = MAX_DIMMS;
-	layers[1].is_csrow = true;
+	layers[1].is_virt_csrow = true;
 	mci = edac_mc_alloc(i7core_dev->socket, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 	if (unlikely(!mci))
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index 877ba54..c0249f3 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -251,10 +251,10 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = I82443BXGX_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = I82443BXGX_NR_CHANS;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index f493758..6ff59b0 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -202,10 +202,10 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	 */
 	layers[0].type = EDAC_MC_LAYER_CHANNEL;
 	layers[0].size = 2;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = 8;
-	layers[1].is_csrow = true;
+	layers[1].is_virt_csrow = true;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index a42a5bd..c943904 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -416,10 +416,10 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = I82875P_NR_CSROWS(nr_chans);
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = nr_chans;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index 717f208..a4a6768 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -548,10 +548,10 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	/* assuming only one controller, index thus is 0 */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = I82975X_NR_DIMMS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = I82975X_NR_CSROWS(chans);
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, sizeof(*pvt));
 	if (!mci) {
 		rc = -ENOMEM;
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 42e209c..1640d54 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -988,10 +988,10 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = 4;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(*pdata));
 	if (!mci) {
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 87139ca..59c399a 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -711,10 +711,10 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = 1;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = 1;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(edac_mc_idx, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct mv64x60_mc_pdata));
 	if (!mci) {
diff --git a/drivers/edac/pasemi_edac.c b/drivers/edac/pasemi_edac.c
index 634b919..267e9cc 100644
--- a/drivers/edac/pasemi_edac.c
+++ b/drivers/edac/pasemi_edac.c
@@ -211,10 +211,10 @@ static int __devinit pasemi_edac_probe(struct pci_dev *pdev,
 
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = PASEMI_EDAC_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = PASEMI_EDAC_NR_CHANS;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(system_mmc_id++, ARRAY_SIZE(layers), layers, false,
 			    0);
 	if (mci == NULL)
diff --git a/drivers/edac/ppc4xx_edac.c b/drivers/edac/ppc4xx_edac.c
index 3917b0f..77908cd 100644
--- a/drivers/edac/ppc4xx_edac.c
+++ b/drivers/edac/ppc4xx_edac.c
@@ -1287,10 +1287,10 @@ static int __devinit ppc4xx_edac_probe(struct platform_device *op)
 	 */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = ppc4xx_edac_nr_csrows;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = ppc4xx_edac_nr_chans;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(ppc4xx_edac_instance, ARRAY_SIZE(layers), layers,
 			    false, sizeof(struct ppc4xx_edac_pdata));
 	if (mci == NULL) {
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 6a7a2ce..7b7eaf2 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -287,10 +287,10 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = R82600_NR_CHANS;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (mci == NULL)
 		return -ENOMEM;
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index ff07f34..bb7e95f 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -572,7 +572,7 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 		u32 mtr;
 
 		for (j = 0; j < ARRAY_SIZE(mtr_regs); j++) {
-			dimm = GET_POS(mci->layers, mci->dimms, mci->n_layers,
+			dimm = EDAC_DIMM_PTR(mci->layers, mci->dimms, mci->n_layers,
 				       i, j, 0);
 			pci_read_config_dword(pvt->pci_tad[i],
 					      mtr_regs[j], &mtr);
@@ -1635,10 +1635,10 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 	/* allocate a new MC control structure */
 	layers[0].type = EDAC_MC_LAYER_CHANNEL;
 	layers[0].size = NUM_CHANNELS;
-	layers[0].is_csrow = false;
+	layers[0].is_virt_csrow = false;
 	layers[1].type = EDAC_MC_LAYER_SLOT;
 	layers[1].size = MAX_DIMMS;
-	layers[1].is_csrow = true;
+	layers[1].is_virt_csrow = true;
 	mci = edac_mc_alloc(sbridge_dev->mc, ARRAY_SIZE(layers), layers,
 			    false, sizeof(*pvt));
 
diff --git a/drivers/edac/tile_edac.c b/drivers/edac/tile_edac.c
index 4aecb06..56b0ab0 100644
--- a/drivers/edac/tile_edac.c
+++ b/drivers/edac/tile_edac.c
@@ -137,10 +137,10 @@ static int __devinit tile_edac_mc_probe(struct platform_device *pdev)
 	/* A TILE MC has a single channel and one chip-select row. */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = TILE_EDAC_NR_CSROWS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = TILE_EDAC_NR_CHANS;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(pdev->id, ARRAY_SIZE(layers), layers, false,
 			    sizeof(struct tile_edac_priv));
 	if (mci == NULL)
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index c5e54ef..219530b 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -344,10 +344,10 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	/* FIXME: unconventional pvt_info usage */
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = X38_RANKS;
-	layers[0].is_csrow = true;
+	layers[0].is_virt_csrow = true;
 	layers[1].type = EDAC_MC_LAYER_CHANNEL;
 	layers[1].size = x38_channel_num;
-	layers[1].is_csrow = false;
+	layers[1].is_virt_csrow = false;
 	mci = edac_mc_alloc(0, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		return -ENOMEM;
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 062a1a7..e348afb 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -384,14 +384,14 @@ enum edac_mc_layer_type {
  * struct edac_mc_layer - describes the memory controller hierarchy
  * @layer:		layer type
  * @size:maximum size of the layer
- * @is_csrow:		This layer is part of the "csrow" when old API
+ * @is_virt_csrow:	This layer is part of the "csrow" when old API
  *			compatibility mode is enabled. Otherwise, it is
  *			a channel
  */
 struct edac_mc_layer {
 	enum edac_mc_layer_type	type;
 	unsigned		size;
-	bool			is_csrow;
+	bool			is_virt_csrow;
 };
 
 /*
@@ -424,7 +424,7 @@ struct edac_mc_layer {
 	__i;								\
 })
 
-#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
+#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({		\
 	typeof(*var) __p;						\
 	int ___i = GET_OFFSET(layers, nlayers, lay0, lay1, lay2);	\
 	if (___i < 0)							\


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-23 18:30       ` Mauro Carvalho Chehab
  2012-04-23 18:56         ` Mauro Carvalho Chehab
@ 2012-04-24 10:40         ` Borislav Petkov
  2012-04-24 11:46           ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 10:40 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
> >> +};
> >> +
> >> +/**
> >> + * struct edac_mc_layer - describes the memory controller hierarchy
> >> + * @layer:		layer type
> >> + * @size:maximum size of the layer
> >> + * @is_csrow:		This layer is part of the "csrow" when old API
> >> + *			compatibility mode is enabled. Otherwise, it is
> >> + *			a channel
> >> + */
> >> +struct edac_mc_layer {
> >> +	enum edac_mc_layer_type	type;
> >> +	unsigned		size;
> >> +	bool			is_csrow;
> >> +};
> > 
> > Huh, why do you need is_csrow? Can't do
> > 
> > 	type = EDAC_MC_LAYER_CHIP_SELECT;
> > 
> > ?
> 
> No, that's different. For a csrow-based memory controller, is_csrow is equal to
> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
> is used to mark with layers will be used for the "fake csrow" exported by the
> EDAC core by the legacy API.

I don't understand this, do you mean: "this will be used to mark which
layer will be used to fake a csrow"...?

[..]

> With regards to the changes at edac_mc_sysfs,  it will likely affect all per-dimm
> routines, plus the counters reset logic. The problem of pointing to a set of
> routines that need changes is that this list can/will change with time.
> 
> So, the intention behind this note is not to give an exhaustive list of what should
> be changed, if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
> clue that incrementing the number of layers is not as easy as just changing
> it: it would require to change the number of layers also at the code.

Then write that instead of adding a clueless note which only confuses readers.

> 
> > 
> >> + */
> >> +#define EDAC_MAX_LAYERS		3
> >> +
> >> +/*
> >> + * A loop could be used here to make it more generic, but, as we only have
> >> + * 3 layers, this is a little faster. By design, layers can never be 0 or
> >> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
> >> + * during the memory allocation routine, with would point to the developer
> >> + * that he's doing something wrong.
> >> + */
> >> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
> > 
> > This is returning size per layers so it cannot be GET_POS(), AFAICT.
> > EDAC_GET_SIZE or similar maybe?
> 
> This is not returning the size, per layers. It is returning a pointer to the
> structure that holds the dimm.
> 
> > 
> >> +	typeof(var) __p;						\
> >> +	if ((nlayers) == 1)						\
> >> +		__p = &var[lay0];					\
> >> +	else if ((nlayers) == 2)					\
> >> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
> >> +	else if ((nlayers) == 3)					\
> >> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
> >> +			    ((layers[1]).size * (lay0))))];		\
> >> +	else								\
> >> +		__p = NULL;						\
> >> +	__p;								\
> >> +})

Ok, I'm looking at your next patch trying to understand this thing:

+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = GET_POS(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);

pos is an unsigned[3] array with all its elements set to 0 in the memset
above. Which means I need a run-variable like that all the time whenever
I iterate over the layers.

Now, say nlayers == 3, then your macro does this:

	__p = &var[(lay2) + ((layers[2]).size * ((lay1) + ((layers[1]).size * (lay0))))];

So I'm multiplying a loop variable with layers[i].size which is the
maximum size of the layer. What does that mean, where is this size
initialized?

I can imagine that I'll get an element in the mci->dimms array in the
end but this is very confusing.

So please explain what each argument of this macro exactly means.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-23 20:07             ` Mauro Carvalho Chehab
@ 2012-04-24 10:46               ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 10:46 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Mon, Apr 23, 2012 at 08:07:13PM +0000, Mauro Carvalho Chehab wrote:
> There are 55 patches affected by this change. Applying them locally
> was as simple as running this small script to the submitted patches:
> 
> $ for i in `quilt series`; do sed s,is_csrow,is_virt_csrow,g $i|sed s,GET_POS,EDAC_DIMM_PTR,g| >a && mv a $i; done
> 
> However, mailbombing 55 patches just because of the above rename
> is probably not very welcome by the people at the ML. Also, at least
> for me, it seems more logical to add a patch like that at the end of
> the patch series, than to force people to re-analyze the entire patchset.
> 
> Due to that, I won't resend the entire patchbomb to the ML.

This is exactly one of the reasons why one shouldn't send mailbombs like
that in the first place.

Documentation/SubmittingPatches does not state this for no reason:

"If you cannot condense your patch set into a smaller set of patches,
then only post say 15 or so at a time and wait for review and
integration."

And having patches which change earlier patches in your patchset is
absolutely a no-no.

This means at least that your previous patches haven't been reviewed
properly and still suffer some churn, causing the later ones to need
readjustment.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 10:40         ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
@ 2012-04-24 11:46           ` Mauro Carvalho Chehab
  2012-04-24 12:42             ` Mauro Carvalho Chehab
  2012-04-24 12:55             ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
  0 siblings, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 11:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 07:40, Borislav Petkov escreveu:
> On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
>>>> +};
>>>> +
>>>> +/**
>>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>>> + * @layer:		layer type
>>>> + * @size:maximum size of the layer
>>>> + * @is_csrow:		This layer is part of the "csrow" when old API
>>>> + *			compatibility mode is enabled. Otherwise, it is
>>>> + *			a channel
>>>> + */
>>>> +struct edac_mc_layer {
>>>> +	enum edac_mc_layer_type	type;
>>>> +	unsigned		size;
>>>> +	bool			is_csrow;
>>>> +};
>>>
>>> Huh, why do you need is_csrow? Can't do
>>>
>>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
>>>
>>> ?
>>
>> No, that's different. For a csrow-based memory controller, is_csrow is equal to
>> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
>> is used to mark with layers will be used for the "fake csrow" exported by the
>> EDAC core by the legacy API.
> 
> I don't understand this, do you mean: "this will be used to mark which
> layer will be used to fake a csrow"...?

I've already explained this dozens of times: on x86, except for amd64_edac and
the drivers for legacy hardware (7+ years old), the information filled in at struct
csrow_info is FAKE. That's basically one of the main reasons for this patchset.

There are no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
Intel memory controllers it is possible to populate different channels with memories of
different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
with an Intel W3505 CPU:

$ ./edac-ctl --layout
       +-----------------------------------+
       |                mc0                |
       | channel0  | channel1  | channel2  |
-------+-----------------------------------+
slot2: |     0 MB  |     0 MB  |     0 MB  |
slot1: |  1024 MB  |     0 MB  |     0 MB  |
slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
-------+-----------------------------------+

These are the logs that dump the memory controller registers:

[  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
[  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
[  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
[  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
[  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

The Nehalem memory controllers allow up to 3 DIMMs per channel, and have 3 channels (so,
a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
so it isn't possible to have all channels and dimms filled on them.

On this motherboard, DIMM1 to DIMM3 are mapped to the first dimm# at channels 0 to 2, and
DIMM4 goes to the second dimm# at channel 0.

See? On slot 1, only channel 0 is filled.

Even if this memory controller were rank-based[1], the channel information
couldn't be mapped using the legacy EDAC API because, on the old API, all channels need to be
filled with memories of the same size. So, this driver uses both the slot layer and
the channel layer as the fake csrow.

[1] As you can see from the logs and from the source code, the MC registers aren't per rank,
they are per DIMM. The number of ranks is just one attribute of the register that describes
a DIMM. The MCA error registers, however, don't identify the rank when reporting an error,
nor are the error counters per rank. So, while it is possible to enumerate information
per rank, the error detection is always per DIMM.

The "is_csrow" property (or "is_virt_csrow" after the renaming patch) is there to indicate
which memory layers will compose the "csrow" when using the legacy API.
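
To make that concrete, here is a rough sketch (not code from the patchset) of
how the legacy dimensions could be derived from the layers. It assumes the core
simply multiplies the sizes of the layers flagged as virtual csrow to get the
number of csrows, and the sizes of the remaining layers to get the number of
channels. The enum and struct mirror the ones in the patch; virt_csrow_dims()
is a hypothetical helper name:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Local stand-ins mirroring the definitions in the patch under review. */
enum edac_mc_layer_type {
	EDAC_MC_LAYER_BRANCH,
	EDAC_MC_LAYER_CHANNEL,
	EDAC_MC_LAYER_SLOT,
	EDAC_MC_LAYER_CHIP_SELECT,
};

struct edac_mc_layer {
	enum edac_mc_layer_type type;
	unsigned size;
	bool is_virt_csrow;
};

/*
 * Hypothetical helper (not part of the patch): split the layers into the
 * legacy "csrow" and "channel" dimensions by multiplying the layer sizes.
 * Layers flagged is_virt_csrow contribute to the csrow count, all other
 * layers contribute to the channel count.
 */
static void virt_csrow_dims(const struct edac_mc_layer *layers,
			    size_t n_layers,
			    unsigned *tot_csrows, unsigned *tot_channels)
{
	size_t i;

	assert(layers && tot_csrows && tot_channels);
	*tot_csrows = 1;
	*tot_channels = 1;
	for (i = 0; i < n_layers; i++) {
		if (layers[i].is_virt_csrow)
			*tot_csrows *= layers[i].size;
		else
			*tot_channels *= layers[i].size;
	}
}
```

With a channel layer and a slot layer of size 3 each, flagging both layers
gives 9 virtual csrows with 1 channel each, while flagging only the slot layer
gives 3 virtual csrows with 3 channels each.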

> [..]
> 
>> With regards to the changes at edac_mc_sysfs,  it will likely affect all per-dimm
>> routines, plus the counters reset logic. The problem of pointing to a set of
>> routines that need changes is that this list can/will change with time.
>>
>> So, the intention behind this note is not to give an exhaustive list of what should
>> be changed, if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
>> clue that incrementing the number of layers is not as easy as just changing
>> it: it would require to change the number of layers also at the code.
> 
> Then write that instead of adding a clueless note which only confuses readers.
> 
>>
>>>
>>>> + */
>>>> +#define EDAC_MAX_LAYERS		3
>>>> +
>>>> +/*
>>>> + * A loop could be used here to make it more generic, but, as we only have
>>>> + * 3 layers, this is a little faster. By design, layers can never be 0 or
>>>> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
>>>> + * during the memory allocation routine, with would point to the developer
>>>> + * that he's doing something wrong.
>>>> + */
>>>> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
>>>
>>> This is returning size per layers so it cannot be GET_POS(), AFAICT.
>>> EDAC_GET_SIZE or similar maybe?
>>
>> This is not returning the size, per layers. It is returning a pointer to the
>> structure that holds the dimm.
>>
>>>
>>>> +	typeof(var) __p;						\
>>>> +	if ((nlayers) == 1)						\
>>>> +		__p = &var[lay0];					\
>>>> +	else if ((nlayers) == 2)					\
>>>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>>>> +	else if ((nlayers) == 3)					\
>>>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>>>> +			    ((layers[1]).size * (lay0))))];		\
>>>> +	else								\
>>>> +		__p = NULL;						\
>>>> +	__p;								\
>>>> +})
> 
> Ok, I'm looking at your next patch trying to understand this thing:

(That's why this was merged with the big patch. The meaning of this
patch and the next one is explained by the changes made to the
drivers. Those two patches, plus the 26 patches, are part of a single logical
change.)
> 
> +	/*
> +	 * Fills the dimm struct
> +	 */
> +	memset(&pos, 0, sizeof(pos));
> +	row = 0;
> +	chn = 0;
> +	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
> +	for (i = 0; i < tot_dimms; i++) {
> +		chan = &csi[row].channels[chn];
> +		dimm = GET_POS(lay, mci->dimms, n_layers,
> +			       pos[0], pos[1], pos[2]);
> 
> pos is an unsigned[3] array with all its elements set to 0 in the memset
> above. Which means I need a run-variable like that all the time whenever
> I iterate over the layers.

Yes. 
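
A rough sketch of how such a run-variable can be advanced (illustration only:
next_pos() and MAX_LAYERS are hypothetical names, and the carry order is an
assumption, chosen so that consecutive positions map to consecutive entries of
a row-major array):

```c
#include <assert.h>

#define MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS */

/*
 * Hypothetical helper: advance pos[] to the next location, last layer
 * first, carrying into the previous layer on overflow. Returns 1 while
 * there is a next position, and 0 once every position has been visited
 * and pos[] has wrapped back to all zeroes.
 */
static int next_pos(const unsigned *sizes, unsigned n_layers, unsigned *pos)
{
	int j;

	assert(n_layers >= 1 && n_layers <= MAX_LAYERS);
	for (j = (int)n_layers - 1; j >= 0; j--) {
		if (++pos[j] < sizes[j])
			return 1;
		pos[j] = 0;	/* this layer overflowed: reset and carry */
	}
	return 0;
}
```

Starting from a zeroed pos[] and calling this after each element is handled
walks every [lay0][lay1][lay2] combination exactly once.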

> 
> Now, say nlayers == 3, then your macro does this:
> 
> 	__p = &var[(lay2) + ((layers[2]).size * ((lay1) + ((layers[1]).size * (lay0))))];
> 
> So I'm multiplying a loop variable with layers[i].size which is the
> maximum size of the layer. What does that mean, where is this size
> initialized?

The layers array is initialized by the drivers. For example, at amd64_edac (see
patch 01/26):

@@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	ret = -ENOMEM;
-	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = pvt->csels[0].b_cnt;
+	layers[0].is_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = pvt->channel_count;
+	layers[1].is_csrow = false;
+	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;

> 
> I can imagine that I'll get an element in the mci->dimms array in the
> end 

Yes. In the case of a 2-layer MC, this would be similar to creating a
two-dimensional array and using:
	dimm = &mci->dimms[x][y]
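
Spelling the arithmetic out as a plain function may help. This is again only a
sketch: dimm_index() and struct layer are hypothetical stand-ins, and the
function returns the index into the flat array rather than a pointer:

```c
#include <assert.h>

struct layer {
	unsigned size;	/* stand-in: only .size matters for the index */
};

/*
 * Hypothetical function form of the macro's arithmetic: returns the
 * row-major index for position [lay0][lay1][lay2] in a flat array, or
 * -1 when the number of layers is not 1, 2 or 3.
 */
static long dimm_index(const struct layer *layers, unsigned n_layers,
		       unsigned lay0, unsigned lay1, unsigned lay2)
{
	assert(layers);
	switch (n_layers) {
	case 1:
		return lay0;
	case 2:
		return lay1 + layers[1].size * lay0;
	case 3:
		return lay2 + layers[2].size * (lay1 + layers[1].size * lay0);
	default:
		return -1;
	}
}
```

Note that layers[0].size never appears in the formula: the sizes of the inner
layers act as strides, and the outermost size only bounds lay0. For layer
sizes {8, 2}, position [3][1] maps to index 1 + 2 * 3 = 7.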

> but this is very confusing.
> 
> So please explain what each argument of this macro exactly means.
> 
I will.

Regards,
Mauro

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 11:46           ` Mauro Carvalho Chehab
@ 2012-04-24 12:42             ` Mauro Carvalho Chehab
  2012-04-24 12:49               ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
  2012-04-24 12:55             ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 12:42 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 08:46, Mauro Carvalho Chehab escreveu:
> Em 24-04-2012 07:40, Borislav Petkov escreveu:
>> On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
>>>>> +};
>>>>> +
>>>>> +/**
>>>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>>>> + * @layer:		layer type
>>>>> + * @size:maximum size of the layer
>>>>> + * @is_csrow:		This layer is part of the "csrow" when old API
>>>>> + *			compatibility mode is enabled. Otherwise, it is
>>>>> + *			a channel
>>>>> + */
>>>>> +struct edac_mc_layer {
>>>>> +	enum edac_mc_layer_type	type;
>>>>> +	unsigned		size;
>>>>> +	bool			is_csrow;
>>>>> +};
>>>>
>>>> Huh, why do you need is_csrow? Can't do
>>>>
>>>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
>>>>
>>>> ?
>>>
>>> No, that's different. For a csrow-based memory controller, is_csrow is equal to
>>> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
>>> is used to mark with layers will be used for the "fake csrow" exported by the
>>> EDAC core by the legacy API.
>>
>> I don't understand this, do you mean: "this will be used to mark which
>> layer will be used to fake a csrow"...?
> 
> I've already explained this dozens of times: on x86, except for amd64_edac and
> the drivers for legacy hardware (+7 years old), the information filled at struct 
> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
> 
> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
> Intel memory controllers, it is possible to fill memories on different channels with
> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
> with a Intel W3505 CPU:
> 
> $ ./edac-ctl --layout
>        +-----------------------------------+
>        |                mc0                |
>        | channel0  | channel1  | channel2  |
> -------+-----------------------------------+
> slot2: |     0 MB  |     0 MB  |     0 MB  |
> slot1: |  1024 MB  |     0 MB  |     0 MB  |
> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
> -------+-----------------------------------+
> 
> Those are the logs that dump the Memory Controller registers: 
> 
> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> 
> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
> so it isn't possible to have all channels and dimms filled on them.
> 
> On this motherboard, DIMM1 to DIMM3 are mapped to the the first dimm# at channels 0 to 2, and
> DIMM4 goes to the second dimm# at channel 0.
> 
> See? On slot 1, only channel 0 is filled.
> 
> Even if this memory controller would be rank-based[1], the channel information
> can't be mapped using the legacy EDAC API, as, on the old API, all channels need to be
> filled with memories with the same size. So, this driver uses both the slot layer and
> the channel layer as the fake csrow.
> 
> [1] As you can see from the logs and from the source code, the MC registers aren't per rank,
> they are per DIMM. The number of ranks is just one attribute of the register that describes
> a DIMM. The MCA Error registers, however, don't map the rank when reporting an errors,
> nor the error counters are per rank. So, while it is possible to enumerate information
> per rank, the error detection is always per DIMM.
> 
> the "is_csrow" property (or "is_virt_csrow" after the renaming patch) is there to indicate
> what memory layer will compose the "csrow" when using the legacy API. 
> 
>> [..]
>>
>>> With regards to the changes at edac_mc_sysfs,  it will likely affect all per-dimm
>>> routines, plus the counters reset logic. The problem of pointing to a set of
>>> routines that need changes is that this list can/will change with time.
>>>
>>> So, the intention behind this note is not to give an exhaustive list of what should
>>> be changed, if EDAC_MAX_LAYERS is incremented. Instead, it is meant to give a
>>> clue that incrementing the number of layers is not as easy as just changing
>>> it: it would require to change the number of layers also at the code.
>>
>> Then write that instead of adding a clueless note which only confuses readers.
>>
>>>
>>>>
>>>>> + */
>>>>> +#define EDAC_MAX_LAYERS		3
>>>>> +
>>>>> +/*
>>>>> + * A loop could be used here to make it more generic, but, as we only have
>>>>> + * 3 layers, this is a little faster. By design, layers can never be 0 or
>>>>> + * more than 3. If that ever happens, a NULL is returned, causing an OOPS
>>>>> + * during the memory allocation routine, with would point to the developer
>>>>> + * that he's doing something wrong.
>>>>> + */
>>>>> +#define GET_POS(layers, var, nlayers, lay0, lay1, lay2) ({		\
>>>>
>>>> This is returning size per layers so it cannot be GET_POS(), AFAICT.
>>>> EDAC_GET_SIZE or similar maybe?
>>>
>>> This is not returning the size, per layers. It is returning a pointer to the
>>> structure that holds the dimm.
>>>
>>>>
>>>>> +	typeof(var) __p;						\
>>>>> +	if ((nlayers) == 1)						\
>>>>> +		__p = &var[lay0];					\
>>>>> +	else if ((nlayers) == 2)					\
>>>>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>>>>> +	else if ((nlayers) == 3)					\
>>>>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>>>>> +			    ((layers[1]).size * (lay0))))];		\
>>>>> +	else								\
>>>>> +		__p = NULL;						\
>>>>> +	__p;								\
>>>>> +})
>>
>> Ok, I'm looking at your next patch trying to understand this thing:
> 
> (That's why this were merged with the big patch. The meaning of this
> patch and the next one is explained by the changes that happened at the
> drivers. Those two patches, plus the 26 patches are part of a single logical
> change)
>>
>> +	/*
>> +	 * Fills the dimm struct
>> +	 */
>> +	memset(&pos, 0, sizeof(pos));
>> +	row = 0;
>> +	chn = 0;
>> +	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
>> +	for (i = 0; i < tot_dimms; i++) {
>> +		chan = &csi[row].channels[chn];
>> +		dimm = GET_POS(lay, mci->dimms, n_layers,
>> +			       pos[0], pos[1], pos[2]);
>>
>> pos is an unsigned[3] array with all its elements set to 0 in the memset
>> above. Which means I need a run-variable like that all the time whenever
>> I iterate over the layers.
> 
> Yes. 
> 
>>
>> Now, say nlayers == 3, then your macro does this:
>>
>> 	__p = &var[(lay2) + ((layers[2]).size * ((lay1) + ((layers[1]).size * (lay0))))];
>>
>> So I'm multiplying a loop variable with layers[i].size which is the
>> maximum size of the layer. What does that mean, where is this size
>> initialized?
> 
> The layers array is initialized by the drivers.  For example, at amd664_edac (see
> patch 01/26):
> 
> @@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
>  		goto err_siblings;
>  
>  	ret = -ENOMEM;
> -	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
> +	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +	layers[0].size = pvt->csels[0].b_cnt;
> +	layers[0].is_csrow = true;
> +	layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +	layers[1].size = pvt->channel_count;
> +	layers[1].is_csrow = false;
> +	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
>  	if (!mci)
>  		goto err_siblings;
> 
>>
>> I can imagine that I'll get an element in the mci->dimms array in the
>> end 
> 
> Yes. In the case of a 2 layer MC, this would be similar to create a
> bi-dimensional array and use:
> 	dimm = mci->dimm[x][y]
> 
>> but this is very confusing.
>>
>> So please explain what each argument of this macro exactly means.
>>
> I'll.

See the enclosed changes. They improve the comments as requested.

I'll fold them into the main patch and send the complete patch in a separate
email.

Regards,
Mauro


diff --git a/include/linux/edac.h b/include/linux/edac.h
index 243a92b..671b27b 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -362,19 +362,38 @@ struct edac_mc_layer {
 /*
  * Maximum number of layers used by the memory controller to uniquely
  * identify a single memory stick.
- * NOTE: incrementing it would require changes at edac_mc_handle_error()
- * and at the routines at edac_mc_sysfs that create layers
+ * NOTE: Changing this requires updating not only the constant below, but
+ * also the existing code in the core, as some of the code there is
+ * optimized for 3 layers.
  */
 #define EDAC_MAX_LAYERS		3
 
-/*
+/**
+ * EDAC_DIMM_PTR - Macro responsible for finding a pointer inside a pointer
+ *		   array for the element given by the [lay0,lay1,lay2] position
+ *
+ * @layers:	a struct edac_mc_layer array, describing how many elements
+ *		were allocated for each layer
+ * @var:	name of the var where we want to get the pointer
+ *		(like mci->dimms)
+ * @nlayers:	Number of layers in the @layers array
+ * @lay0:	layer0 position
+ * @lay1:	layer1 position. Unused if nlayers < 2
+ * @lay2:	layer2 position. Unused if nlayers < 3
+ *
+ * For 1 layer, this macro returns &var[lay0]
+ * For 2 layers, this macro is similar to allocating a two-dimensional array
+ *		and returning "&var[lay0][lay1]"
+ * For 3 layers, this macro is similar to allocating a three-dimensional array
+ *		and returning "&var[lay0][lay1][lay2]"
+ *
  * A loop could be used here to make it more generic, but, as we only have
- * 3 layers, this is a little faster. By design, layers can never be 0 or
- * more than 3. If that ever happens, a NULL is returned, causing an OOPS
- * during the memory allocation routine, with would point to the developer
- * that he's doing something wrong.
+ * 3 layers, this is a little faster.
+ * By design, layers can never be 0 or more than 3. If that ever happens,
+ * a NULL is returned, causing an OOPS during the memory allocation routine,
+ * which would point out to the developer that he's doing something wrong.
  */
-#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({		\
+#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({	\
 	typeof(var) __p;						\
 	if ((nlayers) == 1)						\
 		__p = &var[lay0];					\

* [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 12:42             ` Mauro Carvalho Chehab
@ 2012-04-24 12:49               ` Mauro Carvalho Chehab
  2012-04-24 13:09                 ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 12:49 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The EDAC core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs rather than as ranks accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrow-based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
			   (corrected, uncorrected, fatal)

To be used by the new edac_mc_handle_error function;

enum edac_mc_layer_type: describes the type of a given memory
architecture layer (branch, channel, slot, csrow).

struct edac_mc_layer: describes the properties of a memory
		      layer (type, size, and whether the layer
		      will be used in a virtual csrow).

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts from an address with up to 3 layers into
a linear address.

Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
 include/linux/edac.h |  102 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 101 insertions(+), 1 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 8b78bd0..671b27b 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -67,6 +67,25 @@ enum dev_type {
 #define DEV_FLAG_X64		BIT(DEV_X64)
 
 /**
+ * enum hw_event_mc_err_type - type of the detected error
+ *
+ * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
+ *				corrected error was detected
+ * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
+ *				can't be corrected by ECC, but it is not
+ *				fatal (maybe it is on an unused memory area,
+ *				or the memory controller could recover from
+ *				it for example, by re-trying the operation).
+ * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
+ *				be recovered.
+ */
+enum hw_event_mc_err_type {
+	HW_EVENT_ERR_CORRECTED,
+	HW_EVENT_ERR_UNCORRECTED,
+	HW_EVENT_ERR_FATAL,
+};
+
+/**
  * enum mem_type - memory types. For a more detailed reference, please see
  *			http://en.wikipedia.org/wiki/DRAM
  *
@@ -308,7 +327,88 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
-/* FIXME: add a per-dimm ce error count */
+/**
+ * enum edac_mc_layer_type - memory controller hierarchy layer
+ *
+ * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
+ * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
+ * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
+ * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
+ *
+ * This enum is used by the drivers to tell edac_mc_sysfs what name should
+ * be used when describing a memory stick location.
+ */
+enum edac_mc_layer_type {
+	EDAC_MC_LAYER_BRANCH,
+	EDAC_MC_LAYER_CHANNEL,
+	EDAC_MC_LAYER_SLOT,
+	EDAC_MC_LAYER_CHIP_SELECT,
+};
+
+/**
+ * struct edac_mc_layer - describes the memory controller hierarchy
+ * @type:		layer type
+ * @size:		maximum size of the layer
+ * @is_virt_csrow:	This layer is part of the "csrow" when old API
+ *			compatibility mode is enabled. Otherwise, it is
+ *			a channel
+ */
+struct edac_mc_layer {
+	enum edac_mc_layer_type	type;
+	unsigned		size;
+	bool			is_virt_csrow;
+};
+
+/*
+ * Maximum number of layers used by the memory controller to uniquely
+ * identify a single memory stick.
+ * NOTE: Changing this requires updating not only the constant below, but
+ * also the existing code in the core, as some of the code there is
+ * optimized for 3 layers.
+ */
+#define EDAC_MAX_LAYERS		3
+
+/**
+ * EDAC_DIMM_PTR - Macro responsible for finding a pointer inside a pointer
+ *		   array for the element given by the [lay0,lay1,lay2] position
+ *
+ * @layers:	a struct edac_mc_layer array, describing how many elements
+ *		were allocated for each layer
+ * @var:	name of the var where we want to get the pointer
+ *		(like mci->dimms)
+ * @nlayers:	Number of layers in the @layers array
+ * @lay0:	layer0 position
+ * @lay1:	layer1 position. Unused if nlayers < 2
+ * @lay2:	layer2 position. Unused if nlayers < 3
+ *
+ * For 1 layer, this macro returns &var[lay0]
+ * For 2 layers, this macro is similar to allocating a two-dimensional array
+ *		and returning "&var[lay0][lay1]"
+ * For 3 layers, this macro is similar to allocating a three-dimensional array
+ *		and returning "&var[lay0][lay1][lay2]"
+ *
+ * A loop could be used here to make it more generic, but, as we only have
+ * 3 layers, this is a little faster.
+ * By design, layers can never be 0 or more than 3. If that ever happens,
+ * a NULL is returned, causing an OOPS during the memory allocation routine,
+ * which would point out to the developer that he's doing something wrong.
+ */
+#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({	\
+	typeof(var) __p;						\
+	if ((nlayers) == 1)						\
+		__p = &var[lay0];					\
+	else if ((nlayers) == 2)					\
+		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
+	else if ((nlayers) == 3)					\
+		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
+			    ((layers[1]).size * (lay0))))];		\
+	else								\
+		__p = NULL;						\
+	__p;								\
+})
+
+
+/* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
 	unsigned memory_controller;
-- 
1.7.8


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 11:46           ` Mauro Carvalho Chehab
  2012-04-24 12:42             ` Mauro Carvalho Chehab
@ 2012-04-24 12:55             ` Borislav Petkov
  2012-04-24 13:11               ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 12:55 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 08:46:53AM -0300, Mauro Carvalho Chehab wrote:
> Em 24-04-2012 07:40, Borislav Petkov escreveu:
> > On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
> >>>> +};
> >>>> +
> >>>> +/**
> >>>> + * struct edac_mc_layer - describes the memory controller hierarchy
> >>>> + * @layer:		layer type
> >>>> + * @size:maximum size of the layer
> >>>> + * @is_csrow:		This layer is part of the "csrow" when old API
> >>>> + *			compatibility mode is enabled. Otherwise, it is
> >>>> + *			a channel
> >>>> + */
> >>>> +struct edac_mc_layer {
> >>>> +	enum edac_mc_layer_type	type;
> >>>> +	unsigned		size;
> >>>> +	bool			is_csrow;
> >>>> +};
> >>>
> >>> Huh, why do you need is_csrow? Can't do
> >>>
> >>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
> >>>
> >>> ?
> >>
> >> No, that's different. For a csrow-based memory controller, is_csrow is equal to
> >> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
> >> is used to mark with layers will be used for the "fake csrow" exported by the
> >> EDAC core by the legacy API.
> > 
> > I don't understand this, do you mean: "this will be used to mark which
> > layer will be used to fake a csrow"...?
> 
> I've already explained this dozens of times: on x86, except for amd64_edac and
> the drivers for legacy hardware (+7 years old), the information filled at struct 
> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
> 
> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
> Intel memory controllers, it is possible to fill memories on different channels with
> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
> with a Intel W3505 CPU:
> 
> $ ./edac-ctl --layout
>        +-----------------------------------+
>        |                mc0                |
>        | channel0  | channel1  | channel2  |
> -------+-----------------------------------+
> slot2: |     0 MB  |     0 MB  |     0 MB  |
> slot1: |  1024 MB  |     0 MB  |     0 MB  |
> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
> -------+-----------------------------------+
> 
> Those are the logs that dump the Memory Controller registers: 
> 
> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> 
> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
> so it isn't possible to have all channels and dimms filled on them.
> 
> On this motherboard, DIMM1 to DIMM3 are mapped to the the first dimm# at channels 0 to 2, and
> DIMM4 goes to the second dimm# at channel 0.
> 
> See? On slot 1, only channel 0 is filled.

Ok, wait a second, wait a second.

It's good that you brought up an example, that will probably help
clarify things better.

So, how many physical DIMMs are we talking in the example above? 4, and
all of them single-ranked? They must be because it says "rank: 1" above.

How would the table look if you had dual-ranked or quad-ranked DIMMs on
the motherboard?

I understand channel{0,1,2} so what is slot now, is that the physical
DIMM slot on the motherboard?

If so, why are there 9 slots (3x3) when you say that most motherboards
support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
view from the memory controller or what you physically have on the
motherboard?

> Even if this memory controller would be rank-based[1], the channel
> information can't be mapped using the legacy EDAC API, as, on the old
> API, all channels need to be filled with memories with the same size.
> So, this driver uses both the slot layer and the channel layer as the
> fake csrow.

So what is the slot layer, is it something you've come up with or is it
a real DIMM slot on the motherboard?

> [1] As you can see from the logs and from the source code, the MC
> registers aren't per rank, they are per DIMM. The number of ranks
> is just one attribute of the register that describes a DIMM. The
> MCA Error registers, however, don't map the rank when reporting an
> errors, nor the error counters are per rank. So, while it is possible
> to enumerate information per rank, the error detection is always per
> DIMM.

Ok.

[..]

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 12:49               ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
@ 2012-04-24 13:09                 ` Borislav Petkov
  2012-04-24 13:22                   ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 13:09 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 09:49:59AM -0300, Mauro Carvalho Chehab wrote:
> + * EDAC_DIMM_PTR - Macro responsible to find a pointer inside a pointer array
> + *		   for the element given by [lay0,lay1,lay2] position
> + *
> + * @layers:	a struct edac_mc_layer array, describing how many elements
> + *		were allocated for each layer
> + * @var:	name of the var where we want to get the pointer
> + *		(like mci->dimms)
> + * @n_layers:	Number of layers at the @layers array
> + * @lay0:	layer0 position
> + * @lay1:	layer1 position. Unused if n_layers < 2
> + * @lay2:	layer2 position. Unused if n_layers < 3

Ok, just call them "layer", you're not saving anything by chomping
off the last two letters. Besides, "layer" actually means what it is
supposed to, versus "lay" which means something else.

:-)

> + *
> + * For 1 layer, this macro returns &var[lay0]
> + * For 2 layers, this macro is similar to allocate a bi-dimensional array
> + *		and to return "&var[lay0][lay1]"
> + * For 3 layers, this macro is similar to allocate a tri-dimensional array
> + *		and to return "&var[lay0][lay1][lay2]"
> + *
> + * A loop could be used here to make it more generic, but, as we only have
> + * 3 layers, this is a little faster.
> + * By design, layers can never be 0 or more than 3. If that ever happens,
> + * a NULL is returned, causing an OOPS during the memory allocation routine,
> + * which would point to the developer that he's doing something wrong.
> + */
> +#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({	\
> +	typeof(var) __p;						\
> +	if ((nlayers) == 1)						\
> +		__p = &var[lay0];					\
> +	else if ((nlayers) == 2)					\
> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
> +	else if ((nlayers) == 3)					\
> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
> +			    ((layers[1]).size * (lay0))))];		\
> +	else								\
> +		__p = NULL;						\
> +	__p;								\

Ok, I see it now,

@@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
                goto err_siblings;

        ret = -ENOMEM;
-       mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+       layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+       layers[0].size = pvt->csels[0].b_cnt;
+       layers[0].is_csrow = true;
+       layers[1].type = EDAC_MC_LAYER_CHANNEL;
+       layers[1].size = pvt->channel_count;
+       layers[1].is_csrow = false;
+       mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
        if (!mci)
                goto err_siblings;

size is not "size"! doh, but the _count_ _of_ _elements_ this layer can
have. In the example above, layer0's size is actually the amount of chip
selects you can have per channel. WTF don't you call it that way:

Your diff says

> +/**
> + * struct edac_mc_layer - describes the memory controller hierarchy
> + * @layer:           layer type
> + * @size:maximum size of the layer
> + * @is_virt_csrow:   This layer is part of the "csrow" when old API
> + *                   compatibility mode is enabled. Otherwise, it is
> + *                   a channel
> + */
> +struct edac_mc_layer {
> +     enum edac_mc_layer_type type;
> +     unsigned                size;
> +     bool                    is_virt_csrow;
> +};

WTF am I, or anyone for that matter, to understand that with "size" you
mean "num_elems" or something like that? The explanation of that struct
member "maximum size of the layer" doesn't bring me any further either!

So call this thing properly and explain properly what it means - no one
else can look in your brain and actually understand what you mean by
this non-meaning-anything "size".

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 12:55             ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
@ 2012-04-24 13:11               ` Mauro Carvalho Chehab
  2012-04-24 13:32                 ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 13:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 09:55, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 08:46:53AM -0300, Mauro Carvalho Chehab wrote:
>> Em 24-04-2012 07:40, Borislav Petkov escreveu:
>>> On Mon, Apr 23, 2012 at 06:30:54PM +0000, Mauro Carvalho Chehab wrote:
>>>>>> +};
>>>>>> +
>>>>>> +/**
>>>>>> + * struct edac_mc_layer - describes the memory controller hierarchy
>>>>>> + * @layer:		layer type
>>>>>> + * @size:maximum size of the layer
>>>>>> + * @is_csrow:		This layer is part of the "csrow" when old API
>>>>>> + *			compatibility mode is enabled. Otherwise, it is
>>>>>> + *			a channel
>>>>>> + */
>>>>>> +struct edac_mc_layer {
>>>>>> +	enum edac_mc_layer_type	type;
>>>>>> +	unsigned		size;
>>>>>> +	bool			is_csrow;
>>>>>> +};
>>>>>
>>>>> Huh, why do you need is_csrow? Can't do
>>>>>
>>>>> 	type = EDAC_MC_LAYER_CHIP_SELECT;
>>>>>
>>>>> ?
>>>>
>>>> No, that's different. For a csrow-based memory controller, is_csrow is equal to
>>>> type == EDAC_MC_LAYER_CHIP_SELECT, but, for the other memory controllers, this
>>>> is used to mark which layers will be used for the "fake csrow" exported by the
>>>> EDAC core by the legacy API.
>>>
>>> I don't understand this, do you mean: "this will be used to mark which
>>> layer will be used to fake a csrow"...?
>>
>> I've already explained this dozens of times: on x86, except for amd64_edac and
>> the drivers for legacy hardware (+7 years old), the information filled at struct 
>> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
>>
>> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
>> Intel memory controllers, it is possible to fill memories on different channels with
>> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
>> with an Intel W3505 CPU:
>>
>> $ ./edac-ctl --layout
>>        +-----------------------------------+
>>        |                mc0                |
>>        | channel0  | channel1  | channel2  |
>> -------+-----------------------------------+
>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>> -------+-----------------------------------+
>>
>> Those are the logs that dump the Memory Controller registers: 
>>
>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
>> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>
>> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
>> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
>> so it isn't possible to have all channels and dimms filled on them.
>>
>> On this motherboard, DIMM1 to DIMM3 are mapped to the first dimm# at channels 0 to 2, and
>> DIMM4 goes to the second dimm# at channel 0.
>>
>> See? On slot 1, only channel 0 is filled.
> 
> Ok, wait a second, wait a second.
> 
> It's good that you brought up an example, that will probably help
> clarify things better.
> 
> So, how many physical DIMMs are we talking in the example above? 4, and
> all of them single-ranked? They must be because it says "rank: 1" above.
> 
> How would the table look if you had dual-ranked or quad-ranked DIMMs on
> the motherboard?

It won't change. The only changes will be at the debug logs. It would print
something like:

EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 4 ranks, UDIMMs
EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 2, row: 0x4000, col: 0x400
EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 2, row: 0x4000, col: 0x400

> I understand channel{0,1,2} so what is slot now, is that the physical
> DIMM slot on the motherboard?

physical slots:
	DIMM1 - at MCU channel 0, dimm slot#0
	DIMM2 - at MCU channel 1, dimm slot#0
	DIMM3 - at MCU channel 2, dimm slot#0
	DIMM4 - at MCU channel 0, dimm slot#1

This motherboard has only 4 slots.

The i7core_edac driver is not able to discover how many physical DIMM slots
there are on the motherboard.
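
A minimal sketch of how the layout above could be held as a channel x slot
table. The array and helper names are made up for illustration and are not
taken from i7core_edac; only the sizes and the DIMM1..DIMM4 positions come
from the edac-ctl output and the list above.

```c
#include <stddef.h>

#define NR_CHANNELS 3	/* channels addressed by the memory controller */
#define NR_SLOTS    3	/* dimm slots the controller can address per channel */

/*
 * Size in MB of the DIMM at [channel][slot]; 0 means nothing is plugged in.
 * Values follow the HP Z400 layout: only 4 of the 9 addressable positions
 * have a physical slot on that board.
 */
static const unsigned dimm_mb[NR_CHANNELS][NR_SLOTS] = {
	{ 1024, 1024, 0 },	/* channel 0: DIMM1 at slot#0, DIMM4 at slot#1 */
	{ 1024,    0, 0 },	/* channel 1: DIMM2 at slot#0 */
	{ 1024,    0, 0 },	/* channel 2: DIMM3 at slot#0 */
};

/* Count the positions that actually hold a DIMM. */
static unsigned populated_dimms(void)
{
	unsigned ch, slot, n = 0;

	for (ch = 0; ch < NR_CHANNELS; ch++)
		for (slot = 0; slot < NR_SLOTS; slot++)
			if (dimm_mb[ch][slot])
				n++;
	return n;
}

/* Sum the memory over all channel/slot positions. */
static unsigned total_mb(void)
{
	unsigned ch, slot, sum = 0;

	for (ch = 0; ch < NR_CHANNELS; ch++)
		for (slot = 0; slot < NR_SLOTS; slot++)
			sum += dimm_mb[ch][slot];
	return sum;
}
```

The table always has 9 entries because that is what the controller can
address; how many of them map to a real socket is a property of the board.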

> If so, why are there 9 slots (3x3) when you say that most motherboards
> support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
> view from the memory controller or what you physically have on the
> motherboard?

slot{0,1,2} channel{0,1,2} are the addresses given by the memory controller.
Not all motherboards add 9 DIMM physical slots though. Only high-end
motherboards provide 9 slots per MCU.

We have one Nehalem motherboard with 18 DIMM slots, and 2 CPUs. On that
machine, it is possible to use the maximum supported range of DIMMs.

> 
>> Even if this memory controller would be rank-based[1], the channel
>> information can't be mapped using the legacy EDAC API, as, on the old
>> API, all channels need to be filled with memories with the same size.
>> So, this driver uses both the slot layer and the channel layer as the
>> fake csrow.
> 
> So what is the slot layer, is it something you've come up with or is it
> a real DIMM slot on the motherboard?

It is the slot# inside each channel.

>> [1] As you can see from the logs and from the source code, the MC
>> registers aren't per rank, they are per DIMM. The number of ranks
>> is just one attribute of the register that describes a DIMM. The
>> MCA Error registers, however, don't map the rank when reporting an
>> errors, nor the error counters are per rank. So, while it is possible
>> to enumerate information per rank, the error detection is always per
>> DIMM.
> 
> Ok.
> 
> [..]
> 


* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 13:09                 ` Borislav Petkov
@ 2012-04-24 13:22                   ` Mauro Carvalho Chehab
  2012-04-24 13:38                     ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 13:22 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 10:09, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 09:49:59AM -0300, Mauro Carvalho Chehab wrote:
>> + * EDAC_DIMM_PTR - Macro responsible to find a pointer inside a pointer array
>> + *		   for the element given by [lay0,lay1,lay2] position
>> + *
>> + * @layers:	a struct edac_mc_layer array, describing how many elements
>> + *		were allocated for each layer
>> + * @var:	name of the var where we want to get the pointer
>> + *		(like mci->dimms)
>> + * @n_layers:	Number of layers at the @layers array
>> + * @lay0:	layer0 position
>> + * @lay1:	layer1 position. Unused if n_layers < 2
>> + * @lay2:	layer2 position. Unused if n_layers < 3
> 
> Ok, just call them "layer", you're not saving anything by chomping
> off the last two letters. Besides, "layer" actually means what it is
> supposed to, versus "lay" which means something else.
> 
> :-)

True.
> 
>> + *
>> + * For 1 layer, this macro returns &var[lay0]
>> + * For 2 layers, this macro is similar to allocate a bi-dimensional array
>> + *		and to return "&var[lay0][lay1]"
>> + * For 3 layers, this macro is similar to allocate a tri-dimensional array
>> + *		and to return "&var[lay0][lay1][lay2]"
>> + *
>> + * A loop could be used here to make it more generic, but, as we only have
>> + * 3 layers, this is a little faster.
>> + * By design, layers can never be 0 or more than 3. If that ever happens,
>> + * a NULL is returned, causing an OOPS during the memory allocation routine,
>> + * which would point to the developer that he's doing something wrong.
>> + */
>> +#define EDAC_DIMM_PTR(layers, var, nlayers, lay0, lay1, lay2) ({	\
>> +	typeof(var) __p;						\
>> +	if ((nlayers) == 1)						\
>> +		__p = &var[lay0];					\
>> +	else if ((nlayers) == 2)					\
>> +		__p = &var[(lay1) + ((layers[1]).size * (lay0))];	\
>> +	else if ((nlayers) == 3)					\
>> +		__p = &var[(lay2) + ((layers[2]).size * ((lay1) +	\
>> +			    ((layers[1]).size * (lay0))))];		\
>> +	else								\
>> +		__p = NULL;						\
>> +	__p;								\
> 
> Ok, I see it now,
> 
> @@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
>                 goto err_siblings;
> 
>         ret = -ENOMEM;
> -       mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
> +       layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +       layers[0].size = pvt->csels[0].b_cnt;
> +       layers[0].is_csrow = true;
> +       layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +       layers[1].size = pvt->channel_count;
> +       layers[1].is_csrow = false;
> +       mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
>         if (!mci)
>                 goto err_siblings;
> 
> size is not "size"! doh, but the _count_ _of_ _elements_ this layer can
> have. In the example above, layer0's size is actually the amount of chip
> selects you can have per channel. WTF don't you call it that way:

The count of elements of a layer is the size of the layer. The Kernel macro
that gets the number of elements of an array is called "ARRAY_SIZE", and not
"ARRAY_N_ELEMS".

layers->size is the dimension of the layer. So, the term "size" fits better.
For example, according to [1], size means:
	"the spatial dimensions, proportions, magnitude, or bulk of anything:
	 the size of a farm; the size of the fish you caught."

so, "size" fits better for a "dimension" measure.

I don't mind renaming it to n_elems, if this makes you happy.
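
To make the "number of elements" reading concrete, here is a minimal sketch
of the index arithmetic that EDAC_DIMM_PTR performs, written as a plain C
function. The names layer_sketch and flat_index are hypothetical and exist
only for illustration; only the arithmetic comes from the macro quoted above.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct edac_mc_layer; only the count matters. */
struct layer_sketch {
	unsigned size;	/* number of elements this layer can hold */
};

/*
 * Row-major flat index for 1 to 3 layers, following the same arithmetic
 * as the EDAC_DIMM_PTR macro. Returns -1 for an unsupported layer count,
 * where the macro would yield NULL.
 */
static long flat_index(const struct layer_sketch *layers, unsigned n_layers,
		       unsigned pos0, unsigned pos1, unsigned pos2)
{
	switch (n_layers) {
	case 1:
		return (long)pos0;
	case 2:
		return (long)pos1 + (long)layers[1].size * (long)pos0;
	case 3:
		return (long)pos2 + (long)layers[2].size *
			((long)pos1 + (long)layers[1].size * (long)pos0);
	default:
		return -1;
	}
}
```

Note that layers[0].size never enters the index; it only bounds how many
elements have to be allocated in total.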

> 
> Your diff says
> 
>> +/**
>> + * struct edac_mc_layer - describes the memory controller hierarchy
>> + * @layer:           layer type
>> + * @size:maximum size of the layer
>> + * @is_virt_csrow:   This layer is part of the "csrow" when old API
>> + *                   compatibility mode is enabled. Otherwise, it is
>> + *                   a channel
>> + */
>> +struct edac_mc_layer {
>> +     enum edac_mc_layer_type type;
>> +     unsigned                size;
>> +     bool                    is_virt_csrow;
>> +};
> 
> WTF am I, or anyone for that matter, to understand that with "size" you
> mean "num_elems" or something like that? The explanation of that struct
> member "maximum size of the layer" doesn't bring me any further either!
> 
> So call this thing properly and explain properly what it means - no one
> else can look in your brain and actually understand what you mean by
> this non-meaning-anything "size".
> 


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 13:11               ` Mauro Carvalho Chehab
@ 2012-04-24 13:32                 ` Borislav Petkov
  2012-04-24 14:24                   ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 13:32 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 10:11:50AM -0300, Mauro Carvalho Chehab wrote:
> >> I've already explained this dozens of times: on x86, except for amd64_edac and
> >> the drivers for legacy hardware (+7 years old), the information filled at struct 
> >> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
> >>
> >> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
> >> Intel memory controllers, it is possible to fill memories on different channels with
> >> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
> >> with an Intel W3505 CPU:
> >>
> >> $ ./edac-ctl --layout
> >>        +-----------------------------------+
> >>        |                mc0                |
> >>        | channel0  | channel1  | channel2  |
> >> -------+-----------------------------------+
> >> slot2: |     0 MB  |     0 MB  |     0 MB  |
> >> slot1: |  1024 MB  |     0 MB  |     0 MB  |
> >> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
> >> -------+-----------------------------------+
> >>
> >> Those are the logs that dump the Memory Controller registers: 
> >>
> >> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
> >> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
> >> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
> >> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >>
> >> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
> >> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
> >> so it isn't possible to have all channels and dimms filled on them.
> >>
> >> On this motherboard, DIMM1 to DIMM3 are mapped to the first dimm# at channels 0 to 2, and
> >> DIMM4 goes to the second dimm# at channel 0.
> >>
> >> See? On slot 1, only channel 0 is filled.
> > 
> > Ok, wait a second, wait a second.
> > 
> > It's good that you brought up an example, that will probably help
> > clarify things better.
> > 
> > So, how many physical DIMMs are we talking in the example above? 4, and
> > all of them single-ranked? They must be because it says "rank: 1" above.
> > 
> > How would the table look if you had dual-ranked or quad-ranked DIMMs on
> > the motherboard?
> 
> It won't change. The only changes will be at the debug logs. It would print
> something like:
> 
> EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 4 ranks, UDIMMs
> EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 2, row: 0x4000, col: 0x400
> EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 2, row: 0x4000, col: 0x400
> 
> > I understand channel{0,1,2} so what is slot now, is that the physical
> > DIMM slot on the motherboard?
> 
> physical slots:
> 	DIMM1 - at MCU channel 0, dimm slot#0
> 	DIMM2 - at MCU channel 1, dimm slot#0
> 	DIMM3 - at MCU channel 2, dimm slot#0
> 	DIMM4 - at MCU channel 0, dimm slot#1
> 
> This motherboard has only 4 slots.

I see, so each of those slots has physically a DIMM in it of 1024MB, and
each of those DIMMs is single-ranked.

So yes, those are physical slots.

The edac-ctl output above contains "virtual" slots, the way the memory
controller and thus the hardware sees them.

> The i7core_edac driver is not able to discover how many physical DIMM slots
> there are on the motherboard.
> 
> > If so, why are there 9 slots (3x3) when you say that most motherboards
> > support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
> > view from the memory controller or what you physically have on the
> > motherboard?
> 
> slot{0,1,2} channel{0,1,2} are the addresses given by the memory controller.
> Not all motherboards add 9 DIMM physical slots though. Only high-end
> motherboards provide 9 slots per MCU.
> 
> We have one Nehalem motherboard with 18 DIMM slots, and 2 CPUs. On that
> machine, it is possible to use the maximum supported range of DIMMs.
> 
> > 
> >> Even if this memory controller would be rank-based[1], the channel
> >> information can't be mapped using the legacy EDAC API, as, on the old
> >> API, all channels need to be filled with memories with the same size.
> >> So, this driver uses both the slot layer and the channel layer as the
> >> fake csrow.
> > 
> > So what is the slot layer, is it something you've come up with or is it
> > a real DIMM slot on the motherboard?
> 
> It is the slot# inside each channel.

I hope you can understand my confusion now:

On the one hand, there are the physical slots that the DIMMs are
plugged into.

OTOH, there are the slots==ranks which the memory controllers use to
talk to the DIMMs.

So the box above with 18 physical DIMM slots, i.e. 9 per socket (I think
with "CPU" you mean here physical processor on the node) you can have 9
single-ranked DIMMs, or 4 dual-ranked and 1 single-ranked (if this is
supported) on a node, or 2 quad-ranked...

So, if all of the above is true, we need to distinguish between
"virtual" slots, i.e. the ranks the memory controller can talk to, and
physical slots, i.e. where the DIMMs go.

Correct?

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 13:22                   ` Mauro Carvalho Chehab
@ 2012-04-24 13:38                     ` Borislav Petkov
  2012-04-24 16:39                       ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 13:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 10:22:09AM -0300, Mauro Carvalho Chehab wrote:
> The count of elements of a layer is the size of the layer. The Kernel macro
> that gets the number of elements of an array is called "ARRAY_SIZE", and not
> "ARRAY_N_ELEMS".
> 
> layers->size is the dimension of the layer. So, the term "size" fits better.
> For example, according to [1], size means:
> 	"the spatial dimensions, proportions, magnitude, or bulk of anything:
> 	 the size of a farm; the size of the fish you caught."
> 
> so, "size" fits better for a "dimension" measure.
> 
> I don't mind renaming it to n_elems, if this makes you happy.

Ok, let's do a simple comparison:

1. Imagine you look at ARRAY_SIZE(), what does it mean? Well, the size
of an array is pretty well defined to be the number of elements in it.
Easy.

2. Now imagine a struct member ->size. Out of context it could mean
anything, the size of something this struct represents, what the hell do
I know...

Now let's put it in context, layer->size: It could mean the size of the
layer in MB, it could mean how thick the layer is in meters, it could
... I can go on with these forever.

So your example with ARRAY_SIZE does not apply here.

If it is a badly explained struct and size is "maximum size" then this
means sh*t. So either leave it "size" but make sure to explain thoroughly
what exactly size means here, or change its name to something more
meaningful.
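
As a rough sketch of the second option, this is what a self-describing
field could look like. The names layer_desc, num_elems and total_locations
are purely hypothetical and are not a proposal for the actual edac.h
layout; the point is only that the element count multiplies across layers.

```c
#include <stddef.h>

/* Hypothetical layer descriptor with an unambiguous field name. */
struct layer_desc {
	const char *name;	/* illustrative label, e.g. "channel" or "slot" */
	unsigned num_elems;	/* how many elements this layer can address */
};

/*
 * Total number of addressable memory locations: the product of the
 * per-layer element counts. Returns 0 when there are no layers.
 */
static unsigned total_locations(const struct layer_desc *layers,
				unsigned n_layers)
{
	unsigned i, total;

	if (n_layers == 0)
		return 0;

	total = 1;
	for (i = 0; i < n_layers; i++)
		total *= layers[i].num_elems;
	return total;
}
```

With a "channel" layer of 3 elements and a "slot" layer of 3 elements this
gives the 9 positions from the Nehalem example.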

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 13:32                 ` Borislav Petkov
@ 2012-04-24 14:24                   ` Mauro Carvalho Chehab
  2012-04-24 16:27                     ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 14:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 10:32, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 10:11:50AM -0300, Mauro Carvalho Chehab wrote:
>>>> I've already explained this dozens of times: on x86, except for amd64_edac and
>>>> the drivers for legacy hardware (+7 years old), the information filled at struct 
>>>> csrow_info is FAKE. That's basically one of the main reasons for this patchset.
>>>>
>>>> There's no csrow signals accessed by the memory controller on FB-DIMM/RAMBUS, and on DDR3
>>>> Intel memory controllers, it is possible to fill memories on different channels with
>>>> different sizes. For example, this is how the 4 DIMM banks are filled on an HP Z400
>>>> with an Intel W3505 CPU:
>>>>
>>>> $ ./edac-ctl --layout
>>>>        +-----------------------------------+
>>>>        |                mc0                |
>>>>        | channel0  | channel1  | channel2  |
>>>> -------+-----------------------------------+
>>>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>>>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>>>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>>>> -------+-----------------------------------+
>>>>
>>>> Those are the logs that dump the Memory Controller registers: 
>>>>
>>>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>>>> [  115.818950] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>> [  115.818955] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>>>> [  115.818985] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
>>>> [  115.819016] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>>
>>>> The Nehalem memory controllers allow up to 3 DIMMs per channel, and has 3 channels (so,
>>>> a total of 9 DIMMs). Most motherboards, however, expose either 4 or 8 DIMMs per CPU,
>>>> so it isn't possible to have all channels and dimms filled on them.
>>>>
>>>> On this motherboard, DIMM1 to DIMM3 are mapped to the first dimm# at channels 0 to 2, and
>>>> DIMM4 goes to the second dimm# at channel 0.
>>>>
>>>> See? On slot 1, only channel 0 is filled.
>>>
>>> Ok, wait a second, wait a second.
>>>
>>> It's good that you brought up an example, that will probably help
>>> clarify things better.
>>>
>>> So, how many physical DIMMs are we talking in the example above? 4, and
>>> all of them single-ranked? They must be because it says "rank: 1" above.
>>>
>>> How would the table look if you had dual-ranked or quad-ranked DIMMs on
>>> the motherboard?
>>
>> It won't change. The only changes will be at the debug logs. It would print
>> something like:
>>
>> EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 4 ranks, UDIMMs
>> EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 2, row: 0x4000, col: 0x400
>> EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 2, row: 0x4000, col: 0x400
>>
>>> I understand channel{0,1,2} so what is slot now, is that the physical
>>> DIMM slot on the motherboard?
>>
>> physical slots:
>> 	DIMM1 - at MCU channel 0, dimm slot#0
>> 	DIMM2 - at MCU channel 1, dimm slot#0
>> 	DIMM3 - at MCU channel 2, dimm slot#0
>> 	DIMM4 - at MCU channel 0, dimm slot#1
>>
>> This motherboard has only 4 slots.
> 
> I see, so each of those slots has physically a DIMM in it of 1024MB, and
> each of those DIMMs is single-ranked.
> 
> So yes, those are physical slots.
> 
> The edac-ctl output above contains "virtual" slots, the way the memory
> controller and thus the hardware sees them.

Yes (well, except that Nehalem has also a concept of "virtual channel", so
calling it "virtual" can mislead into a different view).

> 
>> The i7core_edac driver is not able to discover how many physical DIMM slots
>> there are on the motherboard.
>>
>>> If so, why are there 9 slots (3x3) when you say that most motherboards
>>> support 4 or 8 DIMMs per socket? Are the "slot{0,1,2}" things the
>>> view from the memory controller or what you physically have on the
>>> motherboard?
>>
>> slot{0,1,2} channel{0,1,2} are the addresses given by the memory controller.
>> Not all motherboards add 9 DIMM physical slots though. Only high-end
>> motherboards provide 9 slots per MCU.
>>
>> We have one Nehalem motherboard with 18 DIMM slots, and 2 CPUs. On that
>> machine, it is possible to use the maximum supported range of DIMMs.
>>
>>>
>>>> Even if this memory controller would be rank-based[1], the channel
>>>> information can't be mapped using the legacy EDAC API, as, on the old
>>>> API, all channels need to be filled with memories with the same size.
>>>> So, this driver uses both the slot layer and the channel layer as the
>>>> fake csrow.
>>>
>>> So what is the slot layer, is it something you've come up with or is it
>>> a real DIMM slot on the motherboard?
>>
>> It is the slot# inside each channel.
> 
> I hope you can understand my confusion now:
> 
> On the one hand, there are the physical slots that the DIMMs are
> plugged into.
> 
> OTOH, there are the slots==ranks which the memory controllers use to
> talk to the DIMMs.

This only applies to amd64 and other csrows-based memory controllers.

A memory controller like the one at Nehalem abstracts csrows (I suspect
that they have internally something functionally similar to a FB-DIMM
AMB internally). They do memory interleaving between the memory channels
in order to produce a cachesize bigger than 64 bits, but they don't
actually care about how many ranks are there on each DIMM.

It should be noted that the EDAC developers who wrote drivers for FB-DIMMs
also seem to have misunderstood those concepts, thinking that the memory
controllers were just hiding, for no real purpose, some information that
they had.

> 
> So the box above with 18 physical DIMM slots, i.e. 9 per socket (I think
> with "CPU" you mean here physical processor on the node)

Yes.

> you can have 9
> single-ranked DIMMs, or 4 dual-ranked and 1 single-ranked (if this is
> supported) on a node, or 2 quad-ranked...

No. As far as I can tell, they can have 9 quad-ranked DIMMs (the machines
I've looked so far are all equipped with single rank memories, so I don't 
have a real scenario with 2R or 4R for Nehalem yet).

At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
with dual rank memories. The number of ranks there is just a DIMM property.

# ./edac-ctl --layout
       +-----------------------------------------------------------------------------------------------+
       |                      mc0                      |                      mc1                      |
       | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
-------+-----------------------------------------------------------------------------------------------+
slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
-------+-----------------------------------------------------------------------------------------------+

(this machine doesn't have physical DIMM sockets for slot#2)

All memories there are 2R:

Handle 0x0040, DMI type 17, 28 bytes
Memory Device
        Array Handle: 0x003E
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 4096 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM_A1
        Bank Locator: NODE 0 CHANNEL 0 DIMM 0
        Type: DDR3
        Type Detail: Synchronous
        Speed: 1333 MHz
        Manufacturer: Samsung         
        Serial Number: 82766209  
        Asset Tag: Unknown         
        Part Number: M393B5273CH0-YH9  
        Rank: 2

Handle 0x0042, DMI type 17, 28 bytes
Memory Device
        Array Handle: 0x003E
        Error Information Handle: Not Provided
        Total Width: 72 bits
        Data Width: 64 bits
        Size: 4096 MB
        Form Factor: DIMM
        Set: None
        Locator: DIMM_A2
        Bank Locator: NODE 0 CHANNEL 0 DIMM 1
        Type: DDR3
        Type Detail: Synchronous
        Speed: 1333 MHz
        Manufacturer: Samsung         
        Serial Number: 827661D3  
        Asset Tag: Unknown         
        Part Number: M393B5273CH0-YH9  
        Rank: 2

...

The Bank Locator information at the DMI table matches the MCU layout:
node is the CPU socket #, channel is the channel, and DIMM is the dimm
slot # inside each channel.
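
As a minimal sketch of that mapping, the Bank Locator string can be split
into its three fields. The helper and struct names below are hypothetical
(they are not part of edac-utils or of the Kernel), and the string format
is assumed to be the one shown in the dump above; other BIOSes may fill
this field differently.

```c
#include <stdio.h>

/* Hypothetical container for the fields of a Bank Locator string. */
struct bank_loc {
	int node;	/* CPU socket number */
	int channel;	/* channel inside the memory controller */
	int dimm;	/* DIMM slot number inside the channel */
};

/*
 * Parse a string like "NODE 0 CHANNEL 0 DIMM 1".
 * Returns 0 on success, -1 if the string has a different format.
 */
static int parse_bank_locator(const char *s, struct bank_loc *loc)
{
	if (sscanf(s, "NODE %d CHANNEL %d DIMM %d",
		   &loc->node, &loc->channel, &loc->dimm) != 3)
		return -1;
	return 0;
}
```

With the second DIMM above, "NODE 0 CHANNEL 0 DIMM 1" gives node 0,
channel 0, dimm 1, which is the slot1/channel0 cell under mc0 at the
edac-ctl layout.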

> So, if all of the above is true, we need to distinguish between
> "virtual" slots, i.e. the ranks the memory controller can talk to, and
> physical slots, i.e. where the DIMMs go.
> 
> Correct?

The association between channel/dimm and a physical DIMM slot is done by
the edac-utils userspace tools, which fill in the silkscreen labels for each
channel/slot. Each channel/slot pair matches a single DIMM slot, as indicated
by the "Bank Locator".

Regards,
Mauro


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 14:24                   ` Mauro Carvalho Chehab
@ 2012-04-24 16:27                     ` Borislav Petkov
  2012-04-24 17:24                       ` Mauro Carvalho Chehab
  2012-04-24 17:31                       ` Luck, Tony
  0 siblings, 2 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 16:27 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Tony Luck
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 11:24:03AM -0300, Mauro Carvalho Chehab wrote:
> Yes (well, except that Nehalem has also a concept of "virtual channel", so
> calling it "virtual" can mislead into a different view).

No, it cannot. It is a very simple question: Am I looking at virtual
slots/channels or not, when I'm looking at edac-ctl output?

[..]

> > I hope you can understand my confusion now:
> > 
> > On the one hand, there are the physical slots where the DIMMs are
> > sticked into.
> > 
> > OTOH, there are the slots==ranks which the memory controllers use to
> > talk to the DIMMs.
> 
> This only applies to amd64 and other csrows-based memory controllers.
> 
> A memory controller like the one at Nehalem abstracts csrows (I suspect
> that they have internally something functionally similar to a FB-DIMM
> AMB internally). They do memory interleaving between the memory channels
> in order to produce a cachesize bigger than 64 bits, but they don't

You mean cacheline here.

> actually care about how many ranks are there on each DIMM.

This cannot be right - you need the chip select to talk to a rank.
This is basic DDR functionality.

I can imagine that they're doing some tricks like channel/chip
select/memory controller interleaving.

In the end of the day, it is smallest row that gives you 64 bits of
data.

@Tony: hey Tony, can you point us to an Intel document explaining how
Sandy Bridge or NH or one of the new ones does the memory addressing wrt
ranks, channels etc? Thanks.

[..]

> No. As far as I can tell, they can have 9 quad-ranked DIMMs (the machines
> I've looked so far are all equipped with single rank memories, so I don't 
> have a real scenario with 2R or 4R for Nehalem yet).
> 
> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
> with dual rank memories. The number of ranks there is just a DIMM property.
> 
> # ./edac-ctl --layout
>        +-----------------------------------------------------------------------------------------------+
>        |                      mc0                      |                      mc1                      |
>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
> -------+-----------------------------------------------------------------------------------------------+
> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> -------+-----------------------------------------------------------------------------------------------+
> 
> (this machine doesn't have physical DIMM sockets for slot#2)

Ok, I can count 8 2R DIMMs here and each rank or slot in your
nomenclature is 4G. slot#2 has to be something virtual since each rank
occupies one slot, i.e. slot0 and slot1 on a channel.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 13:38                     ` Borislav Petkov
@ 2012-04-24 16:39                       ` Mauro Carvalho Chehab
  2012-04-24 16:49                         ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 16:39 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 10:38, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 10:22:09AM -0300, Mauro Carvalho Chehab wrote:
>> The count of elements of a layer is the size of the layer. The Kernel macro
>> that gets the number of elements of an array is called "ARRAY_SIZE", and not
>> "ARRAY_N_ELEMS".
>>
>> layers->size is the dimension of the layer. So, the term "size" fits better.
>> For example, according with [1], size means:
>> 	"the spatial dimensions, proportions, magnitude, or bulk of anything:
>> 	 the size of a farm; the size of the fish you caught."
>>
>> so, "size" fits better for a "dimension" measure.
>>
>> I don't mind renaming it to n_elems, if this makes you happy.
> 
> Ok, let's do a simple comparison:
> 
> 1. Imagine you look at ARRAY_SIZE(), what does it mean? Well, the size
> of an array is pretty well defined to be the number of elements in it.
> Easy.
> 
> 2. Now imagine a struct member ->size. Out of context it could mean
> anything, the size of something this struct represents, what the hell do
> I know...

The size of a memory architecture layer is the number of physical components
in it. For example (from i7300_edac, chosen because it has 3 layers):

...
#define MAX_SLOTS		8
#define MAX_BRANCHES		2
#define MAX_CH_PER_BRANCH	2
...
	layers[0].type = EDAC_MC_LAYER_BRANCH;
	layers[0].size = MAX_BRANCHES;
	layers[0].is_virt_csrow = false;
	layers[1].type = EDAC_MC_LAYER_CHANNEL;
	layers[1].size = MAX_CH_PER_BRANCH;
	layers[1].is_virt_csrow = true;
	layers[2].type = EDAC_MC_LAYER_SLOT;
	layers[2].size = MAX_SLOTS;
	layers[2].is_virt_csrow = true;

This means that there are 2 branches at the branch layer, 2 channels per
branch at the channel layer, and 8 DIMM slots per channel at the slot layer.

So the maximum number of DIMMs on this MCU is 2 * 2 * 8 = 32.
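
A minimal sketch of that arithmetic, assuming a simplified stand-in for
the layer description (the struct, enum and function names below are
hypothetical, and only the type and size fields are modelled): the
maximum number of DIMMs is the product of the sizes of all layers.

```c
#include <stddef.h>

/* Simplified stand-ins, declared here only to keep the sketch
 * self-contained. They are not the Kernel definitions. */
enum layer_type {
	LAYER_BRANCH,
	LAYER_CHANNEL,
	LAYER_SLOT,
};

struct layer {
	enum layer_type	type;
	unsigned	size;	/* number of components on this layer */
};

/* The maximum number of DIMMs is the product of all layer sizes. */
static unsigned total_dimms(const struct layer *layers, size_t n_layers)
{
	unsigned total = 1;
	size_t i;

	for (i = 0; i < n_layers; i++)
		total *= layers[i].size;

	return total;
}
```

Filling three layers with sizes 2, 2 and 8, as on i7300_edac, gives 32.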

Describing "2 channels" as n_elems = 2 doesn't sound nice.

> 
> Now let's put it in context, layer->size: It could mean the size of the
> layer in MB, it could mean how thick the layer is in meters, it could
> ... I can go on with these forever.
> 
> So your example with ARRAY_SIZE does not apply here.
> 
> If it is a badly explained struct and size is "maximum size" then this
> means sh*t. So either leave it "size" but make sure to explain it
> thoroughly what exactly size means here or change its name to something
> more meaningful.

Ok, I'll apply then the explanation below:

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 671b27b..934c196 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -348,7 +348,8 @@ enum edac_mc_layer_type {
 /**
  * struct edac_mc_layer - describes the memory controller hierarchy
  * @layer:		layer type
- * @size:maximum size of the layer
+ * @size:		number of components on that layer. For example,
+ *			if the channel layer have two channels, size = 2
  * @is_virt_csrow:	This layer is part of the "csrow" when old API
  *			compatibility mode is enabled. Otherwise, it is
  *			a channel

* Re: [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 16:39                       ` Mauro Carvalho Chehab
@ 2012-04-24 16:49                         ` Borislav Petkov
  2012-04-24 17:38                           ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-24 16:49 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 01:39:56PM -0300, Mauro Carvalho Chehab wrote:
> Ok, I'll apply then the explanation below:
> 
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 671b27b..934c196 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -348,7 +348,8 @@ enum edac_mc_layer_type {
>  /**
>   * struct edac_mc_layer - describes the memory controller hierarchy
>   * @layer:		layer type
> - * @size:maximum size of the layer
> + * @size:		number of components on that layer. For example,

					     per layer.

> + *			if the channel layer have two channels, size = 2

					     has

>   * @is_virt_csrow:	This layer is part of the "csrow" when old API
>   *			compatibility mode is enabled. Otherwise, it is
>   *			a channel

Ok, fair enough.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 16:27                     ` Borislav Petkov
@ 2012-04-24 17:24                       ` Mauro Carvalho Chehab
  2012-04-25 17:19                         ` Borislav Petkov
  2012-04-24 17:31                       ` Luck, Tony
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 17:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson

Em 24-04-2012 13:27, Borislav Petkov escreveu:
> On Tue, Apr 24, 2012 at 11:24:03AM -0300, Mauro Carvalho Chehab wrote:
>> Yes (well, except that Nehalem has also a concept of "virtual channel", so
>> calling it "virtual" can mislead into a different view).
> 
> No, it cannot. It is a very simple question: Am I looking at virtual
> slots/channels or not, when I'm looking at edac-ctl output?

The edac-ctl output shows physical slots/channels.

> [..]
> 
>>> I hope you can understand my confusion now:
>>>
>>> On the one hand, there are the physical slots where the DIMMs are
>>> sticked into.
>>>
>>> OTOH, there are the slots==ranks which the memory controllers use to
>>> talk to the DIMMs.
>>
>> This only applies to amd64 and other csrows-based memory controllers.
>>
>> A memory controller like the one at Nehalem abstracts csrows (I suspect
>> that they have internally something functionally similar to a FB-DIMM
>> AMB internally). They do memory interleaving between the memory channels
>> in order to produce a cachesize bigger than 64 bits, but they don't
> 
> You mean cacheline here.

Yes. Sorry for the typo.

>> actually care about how many ranks are there on each DIMM.
> 
> This cannot be right - you need the chip select to talk to a rank.
> This is basic DDR functionality.

Yes, but this seems to be hidden in some lower-level layer of their hardware.
The number of ranks is just a piece of information inside their per-DIMM
registers.

> I can imagine that they're doing some tricks like channel/chip
> select/memory controller interleaving.

They can do several different types of interleaving, using from 1
(no interleaving) to 4 channels. The interleave is done by address range,
not by csrow.

This is a dump of what sb_edac reads from Sandy Bridge EP registers:

[52803.640136] EDAC DEBUG: get_dimm_config: mc#1: Node ID: 1, source ID: 1
[52803.640141] EDAC DEBUG: get_dimm_config: Memory mirror is disabled
[52803.640154] EDAC DEBUG: get_dimm_config: Lockstep is disabled
[52803.640156] EDAC DEBUG: get_dimm_config: address map is on open page mode
[52803.640157] EDAC DEBUG: get_dimm_config: Memory is unregistered
[52803.640159] EDAC DEBUG: get_dimm_config: Channel #0  MTR0 = 500c
[52803.640162] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640165] EDAC DEBUG: get_dimm_config: Channel #0  MTR1 = 500c
[52803.640168] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640171] EDAC DEBUG: get_dimm_config: Channel #0  MTR2 = 0
[52803.640174] EDAC DEBUG: get_dimm_config: Channel #1  MTR0 = 500c
[52803.640176] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640180] EDAC DEBUG: get_dimm_config: Channel #1  MTR1 = 500c
[52803.640182] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640185] EDAC DEBUG: get_dimm_config: Channel #1  MTR2 = 0
[52803.640188] EDAC DEBUG: get_dimm_config: Channel #2  MTR0 = 500c
[52803.640190] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640193] EDAC DEBUG: get_dimm_config: Channel #2  MTR1 = 500c
[52803.640195] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640199] EDAC DEBUG: get_dimm_config: Channel #2  MTR2 = 0
[52803.640201] EDAC DEBUG: get_dimm_config: Channel #3  MTR0 = 500c
[52803.640203] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640218] EDAC DEBUG: get_dimm_config: Channel #3  MTR1 = 500c
[52803.640220] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
[52803.640223] EDAC DEBUG: get_dimm_config: Channel #3  MTR2 = 0
[52803.640226] EDAC DEBUG: get_memory_layout: TOLM: 3.136 GB (0x00000000c3ffffff)
[52803.640228] EDAC DEBUG: get_memory_layout: TOHM: 66.624 GB (0x0000001043ffffff)
[52803.640231] EDAC DEBUG: get_memory_layout: SAD#0 DRAM up to 33.792 GB (0x0000000840000000) Interleave: 8:6 reg=0x000083c3
[52803.640234] EDAC DEBUG: get_memory_layout: SAD#0, interleave #0: 0
[52803.640237] EDAC DEBUG: get_memory_layout: SAD#1 DRAM up to 66.560 GB (0x0000001040000000) Interleave: 8:6 reg=0x000103c3
[52803.640239] EDAC DEBUG: get_memory_layout: SAD#1, interleave #0: 1
[52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4
[52803.640249] EDAC DEBUG: get_memory_layout: TAD CH#0, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
[52803.640252] EDAC DEBUG: get_memory_layout: TAD CH#1, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
[52803.640255] EDAC DEBUG: get_memory_layout: TAD CH#2, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
[52803.640258] EDAC DEBUG: get_memory_layout: TAD CH#3, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
[52803.640261] EDAC DEBUG: get_memory_layout: CH#0 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
[52803.640264] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
[52803.640278] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
[52803.640281] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
[52803.640283] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
[52803.640287] EDAC DEBUG: get_memory_layout: CH#1 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
[52803.640290] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
[52803.640293] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
[52803.640296] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
[52803.640299] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
[52803.640303] EDAC DEBUG: get_memory_layout: CH#2 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
[52803.640306] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
[52803.640309] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
[52803.640312] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
[52803.640315] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
[52803.640319] EDAC DEBUG: get_memory_layout: CH#3 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
[52803.640322] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
[52803.640324] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
[52803.640327] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
[52803.640330] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000

In this case, all 4 channels are used for interleave:

[52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4

It doesn't do DIMM socket interleave (socket interleave 0). It does channel interleave
among channels 0 to 3 (TGT: 0, 1, 2, 3). 

It also does an interleave at the physical memory address on bits 6 to 8:

[52803.640231] EDAC DEBUG: get_memory_layout: SAD#0 DRAM up to 33.792 GB (0x0000000840000000) Interleave: 8:6 reg=0x000083c3
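
A minimal sketch of what "Interleave: 8:6" selects, assuming it simply
means that bits 6 to 8 of the physical address form the interleave
index. The helper name is hypothetical, and how the hardware maps that
index to a target is not modelled here.

```c
#include <stdint.h>

/*
 * Extract bits [hi:lo] (both inclusive) of a physical address.
 * Assumes hi >= lo and a field narrower than 64 bits.
 */
static unsigned addr_bits(uint64_t addr, unsigned hi, unsigned lo)
{
	uint64_t mask = (UINT64_C(1) << (hi - lo + 1)) - 1;

	return (unsigned)((addr >> lo) & mask);
}
```

With hi = 8 and lo = 6, addresses 0x000, 0x040 and 0x1c0 give indexes
0, 1 and 7, and the index wraps back to 0 at 0x200. So, consecutive
64-byte chunks get different interleave indexes.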

This memory controller has thousands (literally) of different BIOS setups
that change how interleaving can happen on it. The above is the default
setup.

The interleaves are based on DIMM socket, MCU channel and physical address
ranges.

> In the end of the day, it is smallest row that gives you 64 bits of
> data.

Yes, but the memory controller views memories per DIMM socket.

> @Tony: hey Tony, can you point us to an Intel document explaining how
> Sandy Bridge or NH or one of the new ones does the memory addressing wrt
> ranks, channels etc? Thanks.

For Nehalem, see i7core_edac comments that I added at the beginning of the
driver:

 * Based on the following public Intel datasheets:
 * Intel Core i7 Processor Extreme Edition and Intel Core i7 Processor
 * Datasheet, Volume 2:
 *	http://download.intel.com/design/processor/datashts/320835.pdf
 * Intel Xeon Processor 5500 Series Datasheet Volume 2
 *	http://www.intel.com/Assets/PDF/datasheet/321322.pdf
 * also available at:
 * 	http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf

> 
> [..]
> 
>> No. As far as I can tell, they can have 9 quad-ranked DIMMs (the machines
>> I've looked so far are all equipped with single rank memories, so I don't 
>> have a real scenario with 2R or 4R for Nehalem yet).
>>
>> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
>> with dual rank memories. The number of ranks there is just a DIMM property.
>>
>> # ./edac-ctl --layout
>>        +-----------------------------------------------------------------------------------------------+
>>        |                      mc0                      |                      mc1                      |
>>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
>> -------+-----------------------------------------------------------------------------------------------+
>> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
>> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>> -------+-----------------------------------------------------------------------------------------------+
>>
>> (this machine doesn't have physical DIMM sockets for slot#2)
> 
> Ok, I can count 8 2R DIMMs here and each rank or slot in your
> nomenclature is 4G. slot#2 has to be something virtual since each rank
> occupies one slot, i.e. slot0 and slot1 on a channel.

No. This machine has 64 GB of RAM, and it was physically filled with 16 DIMMs
of 4 GB each. Each entry in the layout above represents one DIMM (and not a
rank).

Btw, the above logs are for this machine.
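
The numbers can be cross-checked with a small sketch. The helper names
are hypothetical, and a page size of 4 KiB is an assumption (it is what
makes 1048576 pages match the 4096 Mb shown at the logs).

```c
#include <stdint.h>

/* Assumption: 4 KiB pages. */
#define SKETCH_PAGE_SIZE	UINT64_C(4096)

/* Convert a number of pages into MB (2^20 bytes). */
static uint64_t pages_to_mb(uint64_t pages)
{
	return (pages * SKETCH_PAGE_SIZE) >> 20;
}

/* Total memory, in MB, when every populated slot has the same DIMM. */
static uint64_t total_mb(unsigned mcs, unsigned channels,
			 unsigned populated_slots, uint64_t dimm_mb)
{
	return (uint64_t)mcs * channels * populated_slots * dimm_mb;
}
```

Each DIMM reports 1048576 pages, which is 4096 MB, and 2 memory
controllers with 4 channels and 2 populated slots per channel give
16 * 4096 MB = 65536 MB, i.e. the 64 GB installed on this machine.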

# free
             total       used       free     shared    buffers     cached
Mem:      65933268    1166384   64766884          0      60572     363712
-/+ buffers/cache:     742100   65191168
Swap:     68157436      18680   68138756

The DMI decode info also clearly states that:

# dmidecode|grep -e "Memory Device$" -e Size -e "Bank Locat" -e "Serial Number" |grep -v Range
...
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 0 DIMM 0
	Serial Number: 82766209  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 0 DIMM 1
	Serial Number: 827661D3  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 1 DIMM 0
	Serial Number: 82766197  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 1 DIMM 1
	Serial Number: 82766204  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 2 DIMM 0
	Serial Number: 827661D7  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 2 DIMM 1
	Serial Number: 82766200  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 3 DIMM 0
	Serial Number: 827661F9  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 0 CHANNEL 3 DIMM 1
	Serial Number: 827661B3  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 0 DIMM 0
	Serial Number: 47473B79  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 0 DIMM 1
	Serial Number: 440FF77F  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 1 DIMM 0
	Serial Number: 47473B5A  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 1 DIMM 1
	Serial Number: 47473B71  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 2 DIMM 0
	Serial Number: 47473B62  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 2 DIMM 1
	Serial Number: 440FF7FC  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 3 DIMM 0
	Serial Number: 440FF7C1  
Memory Device
	Size: 4096 MB
	Bank Locator: NODE 1 CHANNEL 3 DIMM 1
	Serial Number: 440FF7F4  

As I said, for this memory controller, and for Nehalem, the memories are
mapped per DIMM socket (and not per rank).

Mauro.

* RE: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 16:27                     ` Borislav Petkov
  2012-04-24 17:24                       ` Mauro Carvalho Chehab
@ 2012-04-24 17:31                       ` Luck, Tony
  1 sibling, 0 replies; 206+ messages in thread
From: Luck, Tony @ 2012-04-24 17:31 UTC (permalink / raw)
  To: Borislav Petkov, Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

> @Tony: hey Tony, can you point us to an Intel document explaining how
> Sandy Bridge or NH or one of the new ones does the memory addressing wrt
> ranks, channels etc? Thanks.

Data sheets for the E5-26xx series (vol 1 & 2) are linked off this page:

http://www.intel.com/content/www/us/en/processors/xeon/xeon-processor-5000-sequence/Xeon5000TechnicalResources.html

volume 2, section 4.4 is where this stuff is described ... BUT I'm
not sure if everything needed made it into them from the internal docs (which
we let Mauro use to write the sb_edac.c driver and then got a special exemption
to let him post the driver before the public docs existed ... which means there
*might* be some things that are only publicly documented in the EDAC driver).

-Tony

* [PATCH] edac.h: Add generic layers for describing a memory location
  2012-04-24 16:49                         ` Borislav Petkov
@ 2012-04-24 17:38                           ` Mauro Carvalho Chehab
  2012-04-24 18:15                             ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 17:38 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

The EDAC core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs, rather than as ranks accessed
via csrow/channel.

So, changes are needed in order to allow the EDAC core to
work with all types of architectures.

In preparation for handling non-csrows based memory controllers,
add some memory structs and a macro:

enum hw_event_mc_err_type: describes the type of error
			   (corrected, uncorrected, fatal). To be used
			   by the new edac_mc_handle_error function;

enum edac_mc_layer_type: describes the type of a given memory
			 architecture layer (branch, channel, slot, csrow);

struct edac_mc_layer: describes the properties of a memory
		      layer (type, size, and whether the layer
		      will be used on a virtual csrow);

EDAC_DIMM_PTR() - as the number of layers can vary from 1 to 3,
this macro converts from an address with up to 3 layers into
a linear address.
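
A minimal sketch of that conversion, using a plain function instead of
the macro. The function name is hypothetical, and it returns an index
(or -1 for an invalid number of layers) where the macro returns a
pointer (or NULL):

```c
/*
 * Convert a [layer0, layer1, layer2] position into a linear index.
 * sizes[i] is the number of components on layer i.
 */
static long dimm_index(const unsigned *sizes, unsigned n_layers,
		       unsigned layer0, unsigned layer1, unsigned layer2)
{
	switch (n_layers) {
	case 1:
		return layer0;
	case 2:
		return layer1 + sizes[1] * layer0;
	case 3:
		return layer2 + sizes[2] * (layer1 + sizes[1] * layer0);
	default:
		return -1;	/* by design, never 0 or more than 3 layers */
	}
}
```

For 3 layers with sizes 2, 2 and 8, position [1, 1, 7] maps to index
7 + 8 * (1 + 2 * 1) = 31, the last element of a 32 elements array.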

Reviewed-by: Borislav Petkov <bp@amd64.org>
Cc: Doug Thompson <norsk5@yahoo.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v15 - Added/improved comments and some vars/macros got renamed

 include/linux/edac.h |  103 +++++++++++++++++++++++++++++++++++++++++++++++++-
 1 files changed, 102 insertions(+), 1 deletions(-)

diff --git a/include/linux/edac.h b/include/linux/edac.h
index 8b78bd0..3b8798d 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -67,6 +67,25 @@ enum dev_type {
 #define DEV_FLAG_X64		BIT(DEV_X64)
 
 /**
+ * enum hw_event_mc_err_type - type of the detected error
+ *
+ * @HW_EVENT_ERR_CORRECTED:	Corrected Error - Indicates that an ECC
+ *				corrected error was detected
+ * @HW_EVENT_ERR_UNCORRECTED:	Uncorrected Error - Indicates an error that
+ *				can't be corrected by ECC, but it is not
+ *				fatal (maybe it is on an unused memory area,
+ *				or the memory controller could recover from
+ *				it for example, by re-trying the operation).
+ * @HW_EVENT_ERR_FATAL:		Fatal Error - Uncorrected error that could not
+ *				be recovered.
+ */
+enum hw_event_mc_err_type {
+	HW_EVENT_ERR_CORRECTED,
+	HW_EVENT_ERR_UNCORRECTED,
+	HW_EVENT_ERR_FATAL,
+};
+
+/**
  * enum mem_type - memory types. For a more detailed reference, please see
  *			http://en.wikipedia.org/wiki/DRAM
  *
@@ -308,7 +327,89 @@ enum scrub_type {
  * PS - I enjoyed writing all that about as much as you enjoyed reading it.
  */
 
-/* FIXME: add a per-dimm ce error count */
+/**
+ * enum edac_mc_layer_type - memory controller hierarchy layer
+ *
+ * @EDAC_MC_LAYER_BRANCH:	memory layer is named "branch"
+ * @EDAC_MC_LAYER_CHANNEL:	memory layer is named "channel"
+ * @EDAC_MC_LAYER_SLOT:		memory layer is named "slot"
+ * @EDAC_MC_LAYER_CHIP_SELECT:	memory layer is named "chip select"
+ *
+ * This enum is used by the drivers to tell edac_mc_sysfs what name should
+ * be used when describing a memory stick location.
+ */
+enum edac_mc_layer_type {
+	EDAC_MC_LAYER_BRANCH,
+	EDAC_MC_LAYER_CHANNEL,
+	EDAC_MC_LAYER_SLOT,
+	EDAC_MC_LAYER_CHIP_SELECT,
+};
+
+/**
+ * struct edac_mc_layer - describes the memory controller hierarchy
+ * @type:		layer type
+ * @size:		number of components per layer. For example,
+ *			if the channel layer has two channels, size = 2
+ * @is_virt_csrow:	This layer is part of the "csrow" when old API
+ *			compatibility mode is enabled. Otherwise, it is
+ *			a channel
+ */
+struct edac_mc_layer {
+	enum edac_mc_layer_type	type;
+	unsigned		size;
+	bool			is_virt_csrow;
+};
+
+/*
+ * Maximum number of layers used by the memory controller to uniquely
+ * identify a single memory stick.
+ * NOTE: Changing this constant requires not only changing the constant
+ * below, but also changing the existing code at the core, as some of
+ * the code there is optimized for 3 layers.
+ */
+#define EDAC_MAX_LAYERS		3
+
+/**
+ * EDAC_DIMM_PTR - Macro responsible for finding a pointer inside a pointer
+ *		   array for the element given by [layer0,layer1,layer2] position
+ *
+ * @layers:	a struct edac_mc_layer array, describing how many elements
+ *		were allocated for each layer
+ * @var:	name of the var where we want to get the pointer
+ *		(like mci->dimms)
+ * @nlayers:	Number of layers at the @layers array
+ * @layer0:	layer0 position
+ * @layer1:	layer1 position. Unused if n_layers < 2
+ * @layer2:	layer2 position. Unused if n_layers < 3
+ *
+ * For 1 layer, this macro returns &var[layer0]
+ * For 2 layers, this macro is similar to allocating a bi-dimensional array
+ *		and returning "&var[layer0][layer1]"
+ * For 3 layers, this macro is similar to allocating a tri-dimensional array
+ *		and returning "&var[layer0][layer1][layer2]"
+ *
+ * A loop could be used here to make it more generic, but, as we only have
+ * 3 layers, this is a little faster.
+ * By design, layers can never be 0 or more than 3. If that ever happens,
+ * a NULL is returned, causing an OOPS during the memory allocation routine,
+ * which would point out to the developer that something is wrong.
+ */
+#define EDAC_DIMM_PTR(layers, var, nlayers, layer0, layer1, layer2) ({	\
+	typeof(var) __p;						\
+	if ((nlayers) == 1)						\
+		__p = &var[layer0];					\
+	else if ((nlayers) == 2)					\
+		__p = &var[(layer1) + ((layers[1]).size * (layer0))];	\
+	else if ((nlayers) == 3)					\
+		__p = &var[(layer2) + ((layers[2]).size * ((layer1) +	\
+			    ((layers[1]).size * (layer0))))];		\
+	else								\
+		__p = NULL;						\
+	__p;								\
+})
+
+
+/* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
 	unsigned memory_controller;
-- 
1.7.8



* [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-24 17:38                           ` Mauro Carvalho Chehab
@ 2012-04-24 18:15                             ` Mauro Carvalho Chehab
  2012-04-24 18:15                               ` [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
  2012-04-27 13:33                                 ` Borislav Petkov
  0 siblings, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 18:15 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Aristeu Rozanski,
	Hitoshi Mitake, Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The EDAC core was written with the idea that memory controllers
are able to directly access csrows, and that the channels are
used inside a csrow select.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs, rather than as ranks
accessed via csrow/channel.

So, change the allocation and error report routines to allow
them to work with all types of architectures.
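
As an illustration only (this is not part of the patch), the layered
addressing can be pictured as a row-major index into one flat array of
DIMMs. The userspace sketch below mirrors the arithmetic of the
EDAC_DIMM_PTR macro added to include/linux/edac.h; the names
layer_desc and dimm_index are hypothetical and exist only for this
example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct edac_mc_layer: only the size matters. */
struct layer_desc {
	unsigned size;	/* number of components in this layer */
};

/*
 * Map a (l0, l1, l2) position to an index into a flat array, using the
 * same row-major arithmetic as EDAC_DIMM_PTR. Positions of layers beyond
 * n_layers are ignored. Returns -1 for an unsupported number of layers.
 */
static long dimm_index(const struct layer_desc *layers, unsigned n_layers,
		       unsigned l0, unsigned l1, unsigned l2)
{
	assert(layers != NULL);

	switch (n_layers) {
	case 1:
		return (long)l0;
	case 2:
		return (long)l1 + (long)layers[1].size * (long)l0;
	case 3:
		return (long)l2 + (long)layers[2].size *
		       ((long)l1 + (long)layers[1].size * (long)l0);
	default:
		return -1;
	}
}
```

For a controller described as 2 branches x 2 channels x 4 slots,
position (1, 0, 2) maps to index 2 + 4 * (0 + 2 * 1) = 10.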

This will allow the removal of several hacks in the FB-DIMM and RAMBUS
memory controller drivers in the next patches.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks of the same DIMM.
This bug has been there since the beginning of EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

PS.: I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger on the drivers for FB-DIMMs.

FIXME: while the FB-DIMM drivers are not converted to use the new
design, uncorrected errors will show just one channel. In
the past, all changes were in one big patch of about 150K.
As it needed to be split in order to be accepted by the
EDAC ML at vger, we've opted to accept this small drawback.
As an advantage, it is now easier to review the patch series.
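
For context, the new edac_mc_handle_error() takes one position per
layer, and a negative value means that the layer is not known; the
core then reports every DIMM that matches the known layers. The sketch
below is a userspace approximation of that filter, assuming the
hypothetical names dimm_loc and location_matches. Unlike the patch,
which always compares all three layers, it only loops over the layers
actually in use.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* mirrors EDAC_MAX_LAYERS in the patch */

/* Hypothetical stand-in for the location[] field of struct dimm_info. */
struct dimm_loc {
	int location[MAX_LAYERS];
};

/*
 * Return true when the DIMM is a candidate for an error reported at pos[].
 * A negative pos[i] means "layer i unknown" and matches any value, in the
 * spirit of the filter loop of edac_mc_handle_error().
 */
static bool location_matches(const struct dimm_loc *dimm,
			     const int pos[MAX_LAYERS], int n_layers)
{
	int i;

	assert(n_layers >= 1 && n_layers <= MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		if (pos[i] >= 0 && pos[i] != dimm->location[i])
			return false;
	}
	return true;
}
```

With a DIMM at branch 1, channel 0, slot 2, both the fully specified
position {1, 0, 2} and the channel-less position {1, -1, 2} match,
which is how a converted FB-DIMM driver could report an uncorrected
error on a branch without naming a single channel.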

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v16: Only context changes

 drivers/edac/edac_core.h |   92 ++++++-
 drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
 include/linux/edac.h     |   40 ++-
 3 files changed, 526 insertions(+), 288 deletions(-)

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..7201bb1 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page,
 			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
 			      unsigned long page_frame_number,
 			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+			      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+				      const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
+
+
 
 /*
  * edac_device APIs
@@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..4d4d8b7 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,25 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * new_edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels at the reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -172,18 +205,41 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *lay;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_cschannels;
+	int i, j;
 	int err;
+	int row, chn;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_cschannels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_cschannels *= layers[i].size;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +247,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+	}
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
+		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +269,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = lay;
+	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_cschannels;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_cschannels;
+		chp = &chi[row * tot_cschannels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_cschannels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_virt_csrow)
+					break;
+			chn++;
+			if (chn == tot_cschannels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_virt_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +385,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels at the reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +701,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +727,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +886,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning it, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for unpopulated DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: dimm csrows (%d,%d)\n",
+				__func__, dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..412d5cd 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -541,13 +549,16 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
 
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +573,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access these fields directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +594,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes
-- 
1.7.8
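
For readers following the thread, the DIMM filter at the top of the new
edac_mc_handle_error() can be modelled in isolation. This is only a
minimal sketch under stated assumptions: struct toy_dimm,
toy_dimm_matches() and toy_compat_csrow() are hypothetical names that do
not exist in the kernel, the enable_filter flag is left out, and only
the csrow half of the compat bookkeeping is shown. It illustrates two
conventions visible in the hunk above: a negative layer position matches
every DIMM on that layer, and the legacy csrow collapses to -2 as soon
as the matching DIMMs disagree.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_LAYERS 3	/* stands in for EDAC_MAX_LAYERS */

/* Hypothetical, cut-down stand-in for struct dimm_info */
struct toy_dimm {
	int location[TOY_MAX_LAYERS];	/* position on each layer */
	unsigned nr_pages;		/* 0 means the slot is empty */
	int csrow;			/* legacy csrow this DIMM maps to */
};

/* A negative position is a wildcard: it matches any DIMM on that layer. */
static int toy_dimm_matches(const struct toy_dimm *dimm, const int *pos)
{
	for (int i = 0; i < TOY_MAX_LAYERS; i++)
		if (pos[i] >= 0 && pos[i] != dimm->location[i])
			return 0;
	return 1;
}

/*
 * Return the legacy csrow shared by every populated DIMM that matches
 * pos, -1 if nothing matched, or -2 if the matches span several csrows.
 */
static int toy_compat_csrow(const struct toy_dimm *dimms, size_t n,
			    const int *pos)
{
	int row = -1;

	assert(dimms && pos);
	for (size_t i = 0; i < n; i++) {
		/* skip empty slots and DIMMs outside the reported location */
		if (!dimms[i].nr_pages || !toy_dimm_matches(&dimms[i], pos))
			continue;
		if (row == -1)
			row = dimms[i].csrow;
		else if (row >= 0 && row != dimms[i].csrow)
			row = -2;
	}
	return row;
}
```

With a result of -1 or -2 the patch leaves the per-csrow compat counters
untouched, which is what the "if (row >= 0)" checks in the hunk guard.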


* [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI
  2012-04-24 18:15                             ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
@ 2012-04-24 18:15                               ` Mauro Carvalho Chehab
  2012-04-27 10:42                                 ` Mauro Carvalho Chehab
  2012-04-27 13:33                                 ` Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-24 18:15 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Borislav Petkov

The legacy EDAC API is going to be removed. Port the driver to use,
and benefit from, the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v16: Only context changes

 drivers/edac/amd64_edac.c |  137 ++++++++++++++++++++++++++++++---------------
 1 files changed, 92 insertions(+), 45 deletions(-)

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 6d6ec68..b13d5a0 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	int channel, csrow;
 	u32 page, offset;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
+	/*
+	 * Find out which node the error address belongs to. This may be
+	 * different from the node that detected the error.
+	 */
+	src_mci = find_mc_by_sys_addr(mci, sys_addr);
+	if (!src_mci) {
+		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
+			     (unsigned long)sys_addr);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a node",
+				     NULL);
+		return;
+	}
+
+	/* Now map the sys_addr to a CSROW */
+	csrow = sys_addr_to_csrow(src_mci, sys_addr);
+	if (csrow < 0) {
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
+		return;
+	}
+
 	/* CHIPKILL enabled */
 	if (pvt->nbcfg & NBCFG_CHIPKILL) {
 		channel = get_channel_from_ecc_syndrome(mci, syndrome);
@@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 			 * 2 DIMMs is in error. So we need to ID 'both' of them
 			 * as suspect.
 			 */
-			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
-					   "error reporting race\n", syndrome);
-			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
+				      "possible error reporting race\n",
+				      syndrome);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, offset, syndrome,
+					     csrow, -1, -1,
+					     EDAC_MOD_STR,
+					     "unknown syndrome - possible error reporting race",
+					     NULL);
 			return;
 		}
 	} else {
@@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 		channel = ((sys_addr & BIT(3)) != 0);
 	}
 
-	/*
-	 * Find out which node the error address belongs to. This may be
-	 * different from the node that detected the error.
-	 */
-	src_mci = find_mc_by_sys_addr(mci, sys_addr);
-	if (!src_mci) {
-		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
-			     (unsigned long)sys_addr);
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
-		return;
-	}
-
-	/* Now map the sys_addr to a CSROW */
-	csrow = sys_addr_to_csrow(src_mci, sys_addr);
-	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
-	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-
-		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
-				  channel, EDAC_MOD_STR);
-	}
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
+			     page, offset, syndrome,
+			     csrow, channel, -1,
+			     EDAC_MOD_STR, "", NULL);
 }
 
 static int ddr2_cs_size(unsigned i, bool dct_width)
@@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	u32 page, offset;
 	int nid, csrow, chan = 0;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
 	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
 
 	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
 		return;
 	}
 
-	error_address_to_page_and_offset(sys_addr, &page, &offset);
-
 	/*
 	 * We need the syndromes for channel detection only when we're
 	 * ganged. Otherwise @chan should already contain the channel at
@@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	if (dct_ganging_enabled(pvt))
 		chan = get_channel_from_ecc_syndrome(mci, syndrome);
 
-	if (chan >= 0)
-		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
-				  EDAC_MOD_STR);
-	else
-		/*
-		 * Channel unknown, report all channels on this CSROW as failed.
-		 */
-		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
-			edac_mc_handle_ce(mci, page, offset, syndrome,
-					  csrow, chan, EDAC_MOD_STR);
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				page, offset, syndrome,
+				csrow, chan, -1,
+				EDAC_MOD_STR, "", NULL);
 }
 
 /*
@@ -1875,7 +1893,12 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
 	/* Ensure that the Error Address is VALID */
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
@@ -1899,11 +1922,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
 	sys_addr = get_error_address(m);
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
 
 	/*
 	 * Find out which node the error address belongs to. This may be
@@ -1913,7 +1942,11 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (!src_mci) {
 		amd64_mc_err(mci, "ERROR ADDRESS (0x%lx) NOT mapped to a MC\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to a MC", NULL);
 		return;
 	}
 
@@ -1923,10 +1956,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (csrow < 0) {
 		amd64_mc_err(mci, "ERROR_ADDRESS (0x%lx) NOT mapped to CS\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to CS",
+				     NULL);
 	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-		edac_mc_handle_ue(log_mci, page, offset, csrow, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     csrow, -1, -1,
+				     EDAC_MOD_STR, "", NULL);
 	}
 }
 
@@ -2486,6 +2526,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	struct amd64_pvt *pvt = NULL;
 	struct amd64_family_type *fam_type = NULL;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	int err = 0, ret;
 	u8 nid = get_node_id(F2);
 
@@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	ret = -ENOMEM;
-	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = pvt->csels[0].b_cnt;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = pvt->channel_count;
+	layers[1].is_virt_csrow = false;
+	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
 	if (!mci)
 		goto err_siblings;
 
-- 
1.7.8
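
The converted call sites above pass -1 for every layer they cannot
resolve (for example "csrow, -1, -1" when only the chip select is
known). The effect on the logged location string can be sketched as
follows. This is an illustration only: toy_layer_name[] and
toy_format_location() are hypothetical, the real name table is
edac_layer_name[], and the bounds handling here is an addition of this
sketch rather than something taken from the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical layer names; the real table is edac_layer_name[] */
static const char *const toy_layer_name[] = { "csrow", "channel" };

/*
 * Append "<layer> <index> " to buf for every layer whose position is
 * known and return the number of characters written. Layers with a
 * negative position are skipped, mirroring the "Fill the RAM location
 * data" loop in edac_mc_handle_error().
 */
static int toy_format_location(char *buf, size_t len,
			       const int *pos, size_t n_layers)
{
	size_t used = 0;

	assert(buf && pos && len > 0);
	assert(n_layers <= sizeof(toy_layer_name) / sizeof(toy_layer_name[0]));
	buf[0] = '\0';
	for (size_t i = 0; i < n_layers; i++) {
		int n;

		if (pos[i] < 0)
			continue;
		n = snprintf(buf + used, len - used, "%s %d ",
			     toy_layer_name[i], pos[i]);
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the partial entry */
			break;
		}
		used += (size_t)n;
	}
	return (int)used;
}
```

Under these assumptions, positions {3, 1} give "csrow 3 channel 1 ",
{3, -1} give "csrow 3 ", and {-1, -1} give an empty string, which is the
case the old *_no_info() helpers used to cover.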



* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-24 17:24                       ` Mauro Carvalho Chehab
@ 2012-04-25 17:19                         ` Borislav Petkov
  2012-04-25 17:47                           ` Mauro Carvalho Chehab
  2012-04-25 17:55                           ` Luck, Tony
  0 siblings, 2 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-25 17:19 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Tony Luck, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Tue, Apr 24, 2012 at 02:24:59PM -0300, Mauro Carvalho Chehab wrote:
> Yes, but this seems to be hidden on some lower level layer on their
> hardware. The rank information is only an information inside their
> per-DIMM registers.

Yep, it looks like it.

[..]

> [52803.640136] EDAC DEBUG: get_dimm_config: mc#1: Node ID: 1, source ID: 1
> [52803.640141] EDAC DEBUG: get_dimm_config: Memory mirror is disabled
> [52803.640154] EDAC DEBUG: get_dimm_config: Lockstep is disabled
> [52803.640156] EDAC DEBUG: get_dimm_config: address map is on open page mode
> [52803.640157] EDAC DEBUG: get_dimm_config: Memory is unregistered
> [52803.640159] EDAC DEBUG: get_dimm_config: Channel #0  MTR0 = 500c
> [52803.640162] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640165] EDAC DEBUG: get_dimm_config: Channel #0  MTR1 = 500c
> [52803.640168] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640171] EDAC DEBUG: get_dimm_config: Channel #0  MTR2 = 0
> [52803.640174] EDAC DEBUG: get_dimm_config: Channel #1  MTR0 = 500c
> [52803.640176] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640180] EDAC DEBUG: get_dimm_config: Channel #1  MTR1 = 500c
> [52803.640182] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640185] EDAC DEBUG: get_dimm_config: Channel #1  MTR2 = 0
> [52803.640188] EDAC DEBUG: get_dimm_config: Channel #2  MTR0 = 500c
> [52803.640190] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640193] EDAC DEBUG: get_dimm_config: Channel #2  MTR1 = 500c
> [52803.640195] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640199] EDAC DEBUG: get_dimm_config: Channel #2  MTR2 = 0
> [52803.640201] EDAC DEBUG: get_dimm_config: Channel #3  MTR0 = 500c
> [52803.640203] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> [52803.640218] EDAC DEBUG: get_dimm_config: Channel #3  MTR1 = 500c
> [52803.640220] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400

Ok, this looks like output from those MC_DOD_CH{0,1,2}_{0,1,2}
registers. And those are per-channel, actually, with a NUMRANK field
which tells you how many ranks the DIMM on this channel has.

(Btw, I'm looking at the corei7 datasheet, doc# 320835-003, couldn't
find those MC_DOD*s in the xeon datasheets).

So, the channels displayed in edac-ctl are the 3 channels, and
slot{0,1,2} are the physical slots on each channel.

Now let's look at your output from earlier:

> $ ./edac-ctl --layout
>        +-----------------------------------+
>        |                mc0                |
>        | channel0  | channel1  | channel2  |
> -------+-----------------------------------+
> slot2: |     0 MB  |     0 MB  |     0 MB  |
> slot1: |  1024 MB  |     0 MB  |     0 MB  |
> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
> -------+-----------------------------------+
>
> Those are the logs that dump the Memory Controller registers:
>
> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs

it says here 2 ranks

> [  115.818950] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818955] EDAC DEBUG: get_dimm_config:   dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs

and here 2 too although there's only one single-ranked DIMM here. So
which is it?

> [  115.818985] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
> [  115.819016] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

So, I'd say this machine has 4 DIMMs on the node, all 4 of them
single-ranked: 2 are connected to channel0, and the other two channels
each have a single DIMM of a single rank.

Looking further at the i7 doc above, there are other registers like
MC_SAG_CH{0,1,2}_{0-7} which look like rank descriptors and there's
even a small pseudo-code thing which can give you the memory address by
"unwinding" the interleaving.

> [52803.640223] EDAC DEBUG: get_dimm_config: Channel #3  MTR2 = 0
> [52803.640226] EDAC DEBUG: get_memory_layout: TOLM: 3.136 GB (0x00000000c3ffffff)
> [52803.640228] EDAC DEBUG: get_memory_layout: TOHM: 66.624 GB (0x0000001043ffffff)
> [52803.640231] EDAC DEBUG: get_memory_layout: SAD#0 DRAM up to 33.792 GB (0x0000000840000000) Interleave: 8:6 reg=0x000083c3
> [52803.640234] EDAC DEBUG: get_memory_layout: SAD#0, interleave #0: 0
> [52803.640237] EDAC DEBUG: get_memory_layout: SAD#1 DRAM up to 66.560 GB (0x0000001040000000) Interleave: 8:6 reg=0x000103c3
> [52803.640239] EDAC DEBUG: get_memory_layout: SAD#1, interleave #0: 1
> [52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4
> [52803.640249] EDAC DEBUG: get_memory_layout: TAD CH#0, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
> [52803.640252] EDAC DEBUG: get_memory_layout: TAD CH#1, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
> [52803.640255] EDAC DEBUG: get_memory_layout: TAD CH#2, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
> [52803.640258] EDAC DEBUG: get_memory_layout: TAD CH#3, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
> [52803.640261] EDAC DEBUG: get_memory_layout: CH#0 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
> [52803.640264] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
> [52803.640278] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
> [52803.640281] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
> [52803.640283] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
> [52803.640287] EDAC DEBUG: get_memory_layout: CH#1 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
> [52803.640290] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
> [52803.640293] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
> [52803.640296] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
> [52803.640299] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
> [52803.640303] EDAC DEBUG: get_memory_layout: CH#2 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
> [52803.640306] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
> [52803.640309] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
> [52803.640312] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
> [52803.640315] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
> [52803.640319] EDAC DEBUG: get_memory_layout: CH#3 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
> [52803.640322] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
> [52803.640324] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
> [52803.640327] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
> [52803.640330] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
> 
> In this case, all 4 channels are used for interleave:

Ok, this has 4 channels.

> [52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4
> 
> It doesn't do DIMM socket interleave (socket interleave 0). It does channel interleave
> among channels 0 to 3 (TGT: 0, 1, 2, 3). 
> 
> It also does an interleave at the physical memory address on bits 6 to 8:

Ok.
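
To make the "Interleave: 8:6" notation from the log concrete: pulling
bits 8:6 out of a physical address gives a value in the range 0..7. A
minimal sketch, where toy_addr_bits() is a hypothetical helper and not
something from the driver; how the hardware maps that value onto a
target is not stated in this thread, so only the bit extraction is
shown.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Extract the inclusive bit range hi:lo from a physical address.
 * With hi = 8 and lo = 6 the result is a 3-bit value (0..7).
 */
static uint64_t toy_addr_bits(uint64_t addr, unsigned hi, unsigned lo)
{
	assert(hi >= lo && hi < 64);

	unsigned width = hi - lo + 1;
	uint64_t mask = (width == 64) ? ~0ULL : ((1ULL << width) - 1);

	return (addr >> lo) & mask;
}
```

Since bit 6 is the lowest bit of the field, the extracted value changes
every 64 bytes and wraps every 512 bytes.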

[..]

> For Nehalem, see i7core_edac comments that I added at the beginning of the
> driver:
> 
>  * Based on the following public Intel datasheets:
>  * Intel Core i7 Processor Extreme Edition and Intel Core i7 Processor
>  * Datasheet, Volume 2:
>  *	http://download.intel.com/design/processor/datashts/320835.pdf
>  * Intel Xeon Processor 5500 Series Datasheet Volume 2
>  *	http://www.intel.com/Assets/PDF/datasheet/321322.pdf

This is 404.

>  * also available at:
>  * 	http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf

This one works.

> >> No. As far as I can tell, they can have 9 quad-ranked DIMMs (the machines
> >> I've looked so far are all equipped with single rank memories, so I don't 
> >> have a real scenario with 2R or 4R for Nehalem yet).

Well, the xeon 5500 datasheet, vol2 has a table 3-2 of RDIMM population
configs and according to it, it can do only one 4R DIMM in the farthest
slot, page 127 from here:

http://www.intel.com/content/www/us/en/processors/xeon/xeon-processor-5000-sequence/Xeon5000TechnicalResources.html

?

> >> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
> >> with dual rank memories. The number of ranks there is just a DIMM property.
> >>
> >> # ./edac-ctl --layout
> >>        +-----------------------------------------------------------------------------------------------+
> >>        |                      mc0                      |                      mc1                      |
> >>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
> >> -------+-----------------------------------------------------------------------------------------------+
> >> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
> >> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> >> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> >> -------+-----------------------------------------------------------------------------------------------+
> >>
> >> (this machine doesn't have physical DIMM sockets for slot#2)

This looks like a 4-channel memory controller with 3 physical slots per
channel.

> > Ok, I can count 8 2R DIMMs here and each rank or slot in your
> > nomenclature is 4G. slot#2 has to be something virtual since each rank
> > occupies one slot, i.e. slot0 and slot1 on a channel.
> 
> No. This machine has 64 GB of RAM, and it was physically filled with 16 DIMMs, 
> each with 4GB. Each of the above represents one DIMM (and not a rank).

Yep, I see that now.
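
The numbers can be cross-checked with a few lines of arithmetic. This is
a sketch only, assuming 4 KiB pages (TOY_PAGE_SHIFT is a hypothetical
stand-in for the kernel's PAGE_SHIFT on x86) and using helper names that
are not part of any driver: a 4096 MB DIMM comes out at 1048576 pages,
as in the get_dimm_config log, and 2 controllers x 4 channels x 2
populated slots x 4096 MB comes out at 65536 MB, i.e. 64 GB.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SHIFT 12	/* assumption: 4 KiB pages, as on x86 */

/* Convert a size in MB (2^20 bytes) into a number of pages. */
static uint64_t toy_mb_to_pages(uint64_t mb)
{
	/* guard against overflow in the shift below */
	assert(mb <= (UINT64_MAX >> (20 - TOY_PAGE_SHIFT)));
	return mb << (20 - TOY_PAGE_SHIFT);
}

/* Total memory, in MB, of a uniformly populated machine. */
static uint64_t toy_total_mb(unsigned mcs, unsigned channels,
			     unsigned dimms_per_channel, uint64_t dimm_mb)
{
	return (uint64_t)mcs * channels * dimms_per_channel * dimm_mb;
}
```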

> 
> Btw, the above logs are for this machine.
> 
> # free
>              total       used       free     shared    buffers     cached
> Mem:      65933268    1166384   64766884          0      60572     363712
> -/+ buffers/cache:     742100   65191168
> Swap:     68157436      18680   68138756
> 
> The DMI decode info also clearly states that:
> 
> # dmidecode|grep -e "Memory Device$" -e Size -e "Bank Locat" -e "Serial Number" |grep -v Range
> ...
> Memory Device
> 	Size: 4096 MB
> 	Bank Locator: NODE 0 CHANNEL 0 DIMM 0
> 	Serial Number: 82766209  
> Memory Device
> 	Size: 4096 MB
> 	Bank Locator: NODE 0 CHANNEL 0 DIMM 1
> 	Serial Number: 827661D3  
> Memory Device
> 	Size: 4096 MB
> 	Bank Locator: NODE 0 CHANNEL 1 DIMM 0
> 	Serial Number: 82766197

[..]

> As I said, for this memory controller, and for Nehalem, the memories are
> mapped per DIMM socket (and not per rank).

Ok, so there are still ranks and this is how the memory controller
addresses them, but they can be interleaved (or not) depending on the
configuration. The registers describing the DIMMs are per-DIMM and have
fields like NUMRANK, which tells you how many ranks a DIMM has.

Then there are the MC_SAG_CH{0,1,2}_{0-7} registers, which describe 8
interleave ranges, and those are actually the chip select rows == ranks.

And now the question is, when you get a DRAM ECC error, how does the
hardware point to the DIMM in error: does it give you a (channel, slot)
tuple or a virtual address which you have to un-interleave? From MCA,
you're getting a virtual address in MC4_ADDR, so how do you map this one
back to a DIMM?

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-25 17:19                         ` Borislav Petkov
@ 2012-04-25 17:47                           ` Mauro Carvalho Chehab
  2012-04-25 18:32                             ` Luck, Tony
  2012-04-26 14:11                             ` Borislav Petkov
  2012-04-25 17:55                           ` Luck, Tony
  1 sibling, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-25 17:47 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson

On 25-04-2012 14:19, Borislav Petkov wrote:
> On Tue, Apr 24, 2012 at 02:24:59PM -0300, Mauro Carvalho Chehab wrote:
>> Yes, but this seems to be hidden on some lower level layer on their
>> hardware. The rank information is only an information inside their
>> per-DIMM registers.
> 
> Yep, it looks like it.
> 
> [..]
> 
>> [52803.640136] EDAC DEBUG: get_dimm_config: mc#1: Node ID: 1, source ID: 1
>> [52803.640141] EDAC DEBUG: get_dimm_config: Memory mirror is disabled
>> [52803.640154] EDAC DEBUG: get_dimm_config: Lockstep is disabled
>> [52803.640156] EDAC DEBUG: get_dimm_config: address map is on open page mode
>> [52803.640157] EDAC DEBUG: get_dimm_config: Memory is unregistered
>> [52803.640159] EDAC DEBUG: get_dimm_config: Channel #0  MTR0 = 500c
>> [52803.640162] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640165] EDAC DEBUG: get_dimm_config: Channel #0  MTR1 = 500c
>> [52803.640168] EDAC DEBUG: get_dimm_config: mc#1: channel 0, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640171] EDAC DEBUG: get_dimm_config: Channel #0  MTR2 = 0
>> [52803.640174] EDAC DEBUG: get_dimm_config: Channel #1  MTR0 = 500c
>> [52803.640176] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640180] EDAC DEBUG: get_dimm_config: Channel #1  MTR1 = 500c
>> [52803.640182] EDAC DEBUG: get_dimm_config: mc#1: channel 1, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640185] EDAC DEBUG: get_dimm_config: Channel #1  MTR2 = 0
>> [52803.640188] EDAC DEBUG: get_dimm_config: Channel #2  MTR0 = 500c
>> [52803.640190] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640193] EDAC DEBUG: get_dimm_config: Channel #2  MTR1 = 500c
>> [52803.640195] EDAC DEBUG: get_dimm_config: mc#1: channel 2, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640199] EDAC DEBUG: get_dimm_config: Channel #2  MTR2 = 0
>> [52803.640201] EDAC DEBUG: get_dimm_config: Channel #3  MTR0 = 500c
>> [52803.640203] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 0, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
>> [52803.640218] EDAC DEBUG: get_dimm_config: Channel #3  MTR1 = 500c
>> [52803.640220] EDAC DEBUG: get_dimm_config: mc#1: channel 3, dimm 1, 4096 Mb (1048576 pages) bank: 8, rank: 2, row: 0x8000, col: 0x400
> 
> Ok, this looks like output from those MC_DOD_CH{0,1,2}_{0,1,2}
> registers. And those are per-channel, actually, with a NUMRANK field
> which tells you how many ranks the DIMM on this channel has.

No, there's one register per DIMM there. They're inside a per-channel
PCI device.
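
Each of the per-DIMM log lines quoted above is self-consistent: the
reported size and page count follow from the geometry fields. A minimal
sketch of that arithmetic, assuming a 64-bit data bus (8 bytes per column
access) and 4 KiB pages; the helper names are illustrative and are not
taken from the driver:

```python
# Hypothetical helpers, not taken from the driver source.
PAGE_SIZE = 4096  # assumption: 4 KiB pages, as on x86


def dimm_size_mib(banks, ranks, rows, cols, bytes_per_col=8):
    """Capacity in MiB; bytes_per_col=8 assumes a 64-bit data bus."""
    return (banks * ranks * rows * cols * bytes_per_col) >> 20


def dimm_pages(size_mib):
    """Number of PAGE_SIZE pages covered by a DIMM of size_mib MiB."""
    return (size_mib << 20) // PAGE_SIZE


# Sandy Bridge log line: bank: 8, rank: 2, row: 0x8000, col: 0x400
sb = dimm_size_mib(8, 2, 0x8000, 0x400)
# Nehalem log line: bank: 8, rank: 1, row: 0x4000, col: 0x400
nhm = dimm_size_mib(8, 1, 0x4000, 0x400)
print(sb, dimm_pages(sb), nhm)
```

With the values from the logs this gives 4096 MiB (1048576 pages) for the
dual-rank Sandy Bridge DIMMs and 1024 MiB for the single-rank Nehalem ones.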

> (Btw, I'm looking at the corei7 datasheet, doc# 320835-003, couldn't
> find those MC_DOD*s in the xeon datasheets).
> 
> So, the channels display in edac-ctl are the 3 channels, slot{0,1,2} are the
> physical slots on each channel.

Yes.

> 
> Now let's look at your output from earlier:
> 
>> $ ./edac-ctl --layout
>>        +-----------------------------------+
>>        |                mc0                |
>>        | channel0  | channel1  | channel2  |
>> -------+-----------------------------------+
>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>> -------+-----------------------------------+
>>
>> Those are the logs that dump the Memory Controller registers:
>>
>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
> 
> it says here 2 ranks

The above output is for the Nehalem machine, with 4 dimms, all single ranked.

>> [  115.818950] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818955] EDAC DEBUG: get_dimm_config:   dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
> 
> and here 2 too although there's only one single-ranked DIMM here. So
> which is it?

The # of ranks there is the total number of ranks on the channel: the
per-channel register shows the channel total, while the per-DIMM register
shows the number of ranks of each DIMM.

> 
>> [  115.818985] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>> [  115.819012] EDAC DEBUG: get_dimm_config: Ch2 phy rd3, wr3 (0x063f4031): 2 ranks, UDIMMs
>> [  115.819016] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> 
> So, I'd say this machine has 4 DIMMs on the node, all 4 of them are
> single-ranked and 2 are connected to channel0, the other two channels
> have each a single DIMM of a single rank.

Yes.

> Looking further the i7 doc above, there are other registers like
> MC_SAG_CH{0,1,2}_{0-7} which look like rank descriptors and there's
> even a small pseudo-code thing which can give you the memory address by
> "unwinding" the interleaving.

In the case of the EDAC driver, we rely on the per-DIMM information that
is reported via the MCE misc register. There are also per-DIMM error
counters. So, while it would in theory be possible to use the per-rank
registers and decode the error without MCA, this can cause trouble in
practice: some BIOSes may access the same registers, which would cause
race conditions between BIOS and Linux.

> 
>> [52803.640223] EDAC DEBUG: get_dimm_config: Channel #3  MTR2 = 0
>> [52803.640226] EDAC DEBUG: get_memory_layout: TOLM: 3.136 GB (0x00000000c3ffffff)
>> [52803.640228] EDAC DEBUG: get_memory_layout: TOHM: 66.624 GB (0x0000001043ffffff)
>> [52803.640231] EDAC DEBUG: get_memory_layout: SAD#0 DRAM up to 33.792 GB (0x0000000840000000) Interleave: 8:6 reg=0x000083c3
>> [52803.640234] EDAC DEBUG: get_memory_layout: SAD#0, interleave #0: 0
>> [52803.640237] EDAC DEBUG: get_memory_layout: SAD#1 DRAM up to 66.560 GB (0x0000001040000000) Interleave: 8:6 reg=0x000103c3
>> [52803.640239] EDAC DEBUG: get_memory_layout: SAD#1, interleave #0: 1
>> [52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4
>> [52803.640249] EDAC DEBUG: get_memory_layout: TAD CH#0, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
>> [52803.640252] EDAC DEBUG: get_memory_layout: TAD CH#1, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
>> [52803.640255] EDAC DEBUG: get_memory_layout: TAD CH#2, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
>> [52803.640258] EDAC DEBUG: get_memory_layout: TAD CH#3, offset #0: 33.792 GB (0x0000000840000000), reg=0x00008400
>> [52803.640261] EDAC DEBUG: get_memory_layout: CH#0 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
>> [52803.640264] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
>> [52803.640278] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
>> [52803.640281] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
>> [52803.640283] EDAC DEBUG: get_memory_layout: CH#0 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
>> [52803.640287] EDAC DEBUG: get_memory_layout: CH#1 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
>> [52803.640290] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
>> [52803.640293] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
>> [52803.640296] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
>> [52803.640299] EDAC DEBUG: get_memory_layout: CH#1 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
>> [52803.640303] EDAC DEBUG: get_memory_layout: CH#2 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
>> [52803.640306] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
>> [52803.640309] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
>> [52803.640312] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
>> [52803.640315] EDAC DEBUG: get_memory_layout: CH#2 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
>> [52803.640319] EDAC DEBUG: get_memory_layout: CH#3 RIR#0, limit: 8.191 GB (0x00000001fff00000), way: 4, reg=0xa000001e
>> [52803.640322] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#0, offset 0.000 GB (0x0000000000000000), tgt: 0, reg=0x00000000
>> [52803.640324] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#1, offset 0.000 GB (0x0000000000000000), tgt: 4, reg=0x00040000
>> [52803.640327] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#2, offset 0.000 GB (0x0000000000000000), tgt: 1, reg=0x00010000
>> [52803.640330] EDAC DEBUG: get_memory_layout: CH#3 RIR#0 INTL#3, offset 0.000 GB (0x0000000000000000), tgt: 5, reg=0x00050000
>>
>> In this case, all 4 channels are used for interleave:
> 
> Ok, this has 4 channels.
> 
>> [52803.640245] EDAC DEBUG: get_memory_layout: TAD#0: up to 66.560 GB (0x0000001040000000), socket interleave 0, memory interleave 3, TGT: 0, 1, 2, 3, reg=0x0040f3e4
>>
>> It doesn't do DIMM socket interleave (socket interleave 0). It does channel interleave
>> among channels 0 to 3 (TGT: 0, 1, 2, 3). 
>>
>> It also does an interleave at the physical memory address on bits 6 to 8:
> 
> Ok.
> 
> [..]
> 
>> For Nehalem, see i7core_edac comments that I added at the beginning of the
>> driver:
>>
>>  * Based on the following public Intel datasheets:
>>  * Intel Core i7 Processor Extreme Edition and Intel Core i7 Processor
>>  * Datasheet, Volume 2:
>>  *	http://download.intel.com/design/processor/datashts/320835.pdf
>>  * Intel Xeon Processor 5500 Series Datasheet Volume 2
>>  *	http://www.intel.com/Assets/PDF/datasheet/321322.pdf
> 
> This is 404.

They likely moved it to some other address. The datasheet was there at the time
the code was written. I'll find the right place on Intel's site and update the
link, as the datasheet can be very useful for anyone patching the driver.

> 
>>  * also available at:
>>  * 	http://www.arrownac.com/manufacturers/intel/s/nehalem/5500-datasheet-v2.pdf
> 
> This one works.
> 
>>>> No. As far as I can tell, they can have 9 quad-ranked DIMMs (the machines
>>>> I've looked so far are all equipped with single rank memories, so I don't 
>>>> have a real scenario with 2R or 4R for Nehalem yet).
> 
> Well, the xeon 5500 datasheet, vol2 has a table 3-2 of RDIMM population
> configs and according to it, it can do only one 4R DIMM in the farthest
> slot, page 127 from here:
> 
> http://www.intel.com/content/www/us/en/processors/xeon/xeon-processor-5000-sequence/Xeon5000TechnicalResources.html
> 
> ?

There are several restrictions on how the DIMM slots can be populated.
The i7core_edac driver actually supports a few different versions of the Nehalem MCU.
I'm not sure if the restrictions are the same for all of them.

> 
>>>> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
>>>> with dual rank memories. The number of ranks there is just a DIMM property.
>>>>
>>>> # ./edac-ctl --layout
>>>>        +-----------------------------------------------------------------------------------------------+
>>>>        |                      mc0                      |                      mc1                      |
>>>>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
>>>> -------+-----------------------------------------------------------------------------------------------+
>>>> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
>>>> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>>>> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>>>> -------+-----------------------------------------------------------------------------------------------+
>>>>
>>>> (this machine doesn't have physical DIMM sockets for slot#2)
> 
> This looks like a 4-channel memory controller with 3 physical slots per
> channel.

Yes, except that this specific motherboard has only 16 physical slots. In
theory, it is possible to have a motherboard with 24 physical slots.

The driver is not able to detect how many physical slots the motherboard
has, so it assumes the maximum number of slots that the memory
controller supports.
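
To make the "maximum number of slots" point concrete, here is a minimal
sketch of one memory controller from the layout above, modelled as a
channel-by-slot grid. The constants and the dimm_index() helper are
illustrative assumptions, not the EDAC core API:

```python
# Hypothetical model of a (channel, slot) layout; names are illustrative.
N_CHANNELS = 4   # channels per memory controller on this machine
MAX_SLOTS = 3    # maximum slots per channel that the driver assumes


def dimm_index(channel, slot):
    """Flatten a (channel, slot) location into one array index."""
    if not (0 <= channel < N_CHANNELS and 0 <= slot < MAX_SLOTS):
        raise ValueError("location outside the controller's layout")
    return channel * MAX_SLOTS + slot


# Sizes in MB as shown by edac-ctl for mc0: slot0 and slot1 are filled,
# slot2 has no physical socket on this board and therefore reads as 0 MB.
sizes = [0] * (N_CHANNELS * MAX_SLOTS)
for ch in range(N_CHANNELS):
    for slot in (0, 1):
        sizes[dimm_index(ch, slot)] = 4096

populated = sum(1 for s in sizes if s)
total_mb = sum(sizes)
print(populated, total_mb)
```

The grid always has 12 entries per controller; the 4 entries for slot2
stay at 0 MB, which is what edac-ctl displays for them.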

> 
>>> Ok, I can count 8 2R DIMMs here and each rank or slot in your
>>> nomenclature is 4G. slot#2 has to be something virtual since each rank
>>> occupies one slot, i.e. slot0 and slot1 on a channel.
>>
>> No. This machine has 64 GB of RAM, and it was physically filled with 16 DIMMs, 
>> each with 4GB. Each of the above represents one DIMM (and not a rank).
> 
> Yep, I see that now.
> 
>>
>> Btw, the above logs are for this machine.
>>
>> # free
>>              total       used       free     shared    buffers     cached
>> Mem:      65933268    1166384   64766884          0      60572     363712
>> -/+ buffers/cache:     742100   65191168
>> Swap:     68157436      18680   68138756
>>
>> The DMI decode info also clearly states that:
>>
>> # dmidecode|grep -e "Memory Device$" -e Size -e "Bank Locat" -e "Serial Number" |grep -v Range
>> ...
>> Memory Device
>> 	Size: 4096 MB
>> 	Bank Locator: NODE 0 CHANNEL 0 DIMM 0
>> 	Serial Number: 82766209  
>> Memory Device
>> 	Size: 4096 MB
>> 	Bank Locator: NODE 0 CHANNEL 0 DIMM 1
>> 	Serial Number: 827661D3  
>> Memory Device
>> 	Size: 4096 MB
>> 	Bank Locator: NODE 0 CHANNEL 1 DIMM 0
>> 	Serial Number: 82766197
> 
> [..]
> 
>> As I said, for this memory controller, and for Nehalem, the memories are
>> mapped per DIMM socket (and not per rank).
> 
> Ok, so there are still ranks and this is how the memory controller
> addresses them but they can be interleaved (or not) depending on the
> configuration. The registers describing the DIMMs are per-DIMM and have
> fields like NUMRANK etc which tells you how many ranks a DIMM has, etc.
> 
> Then there are the MC_SAG_CH{0,1,2}_{1-7} which describes 8 interleave
> ranges and those are actually the chip select rows == ranks.
> 
> And now the question is, when you get a DRAM ECC, how does the hardware
> point to the DIMM in error, does it give you a (channel, slot) tuple
> or a virtual address which you have to un-interleave? From MCA, you're
> getting a virtual address in MC4_ADDR so how do you compute this one
> back to a DIMM?

See the driver: the only useful information provided by the MCA log is
that an error happened, its physical address, and the type of the
error. Unlike the Nehalem MCA, the MCE_MISC registers won't point to the
DIMM in error.

So, the driver needs to dig into all those MC_* registers in order
to convert a physical address into a DIMM slot (or into a set of DIMM
slots, if mirror and/or lockstep mode is enabled).
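
As a rough illustration of the first two steps of such a translation,
the sketch below uses the SAD and TAD values from the logs quoted
earlier. It is heavily simplified and is not the driver's actual
algorithm: it assumes that the logged SAD limits are exclusive upper
bounds, that channels are interleaved round-robin on 64-byte cache
lines, and it ignores the memory hole below 4 GB and the per-channel
RIR step that would select the rank and DIMM.

```python
# Simplified sketch; not the driver's actual decoding logic.
SAD_RULES = [  # (assumed exclusive upper bound, target socket)
    (0x0000000840000000, 0),  # from the "SAD#0 ... interleave #0: 0" lines
    (0x0000001040000000, 1),  # from the "SAD#1 ... interleave #0: 1" lines
]
TAD_TARGETS = [0, 1, 2, 3]    # "TGT: 0, 1, 2, 3" from the TAD#0 line
CACHE_LINE_SHIFT = 6          # assumption: 64-byte interleave granularity


def decode(addr):
    """Return (socket, channel) for a physical address, or None if unmapped."""
    for limit, socket in SAD_RULES:
        if addr < limit:
            way = (addr >> CACHE_LINE_SHIFT) % len(TAD_TARGETS)
            return socket, TAD_TARGETS[way]
    return None


print(decode(0x0), decode(0x40), decode(0x840000000))
```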

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* RE: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-25 17:19                         ` Borislav Petkov
  2012-04-25 17:47                           ` Mauro Carvalho Chehab
@ 2012-04-25 17:55                           ` Luck, Tony
  1 sibling, 0 replies; 206+ messages in thread
From: Luck, Tony @ 2012-04-25 17:55 UTC (permalink / raw)
  To: Borislav Petkov, Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

> And now the question is, when you get a DRAM ECC, how does the hardware
> point to the DIMM in error, does it give you a (channel, slot) tuple
> or a virtual address which you have to un-interleave? From MCA, you're
> getting a virtual address in MC4_ADDR so how do you compute this one
> back to a DIMM?

Right now we have the EDAC driver doing a reverse translation from the
physical address it finds in MC5_ADDR using the SAD/TAD/... register
information to get to a DIMM address.

Some of the same information does get reported by BIOS via HEST to
the ghes driver ... but Linux currently isn't looking at it (this
was the code path to get physical address on Nehalem/Westmere
generations where the h/w didn't always provide a valid address)
See apei_mce_report_mem_error() in mce-apei.c ... the error record
passed in may have a bunch more fields valid which would help in
identifying the DIMM.

-Tony

^ permalink raw reply	[flat|nested] 206+ messages in thread

* RE: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-25 17:47                           ` Mauro Carvalho Chehab
@ 2012-04-25 18:32                             ` Luck, Tony
  2012-04-25 18:44                               ` Mauro Carvalho Chehab
  2012-04-26 14:11                             ` Borislav Petkov
  1 sibling, 1 reply; 206+ messages in thread
From: Luck, Tony @ 2012-04-25 18:32 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

> See the driver: the only useful information provided by the MCA log is
> that an error happened, their physical address, and the type of the 
> error. Unlikely the Nehalem MCA, the MCE_MISC registers won't point to the
> DIMM in the error.

There's a bit more information in the MCA log than just the physical address:

The cpu number that finds the data in its bank will provide socket information.
[/proc/cpuinfo maps logical cpu numbers to "physical id"]

Low order bits of the MCi_STATUS register will give the channel. See the SDM.

So the only missing information from the MCA log is which DIMM within
the channel.  I.e. we can pin the fault to a group of either two or
three DIMMs depending on how many DIMMS/channel the motherboard supports.

If you only have one DIMM per channel populated, then socket/channel is
sufficient to identify the DIMM.
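
A minimal sketch of extracting the channel from the low bits of
MCi_STATUS, assuming the memory controller compound error code layout
described in the SDM (000F 0000 1MMM CCCC in bits 15:0). The function
name and the returned dictionary are illustrative only:

```python
# Illustrative decode of the MCA error code (MCi_STATUS bits 15:0),
# assuming the memory controller format 000F 0000 1MMM CCCC.
def decode_mem_error(status):
    code = status & 0xFFFF
    # Ignore the F bit (bit 12); bit 7 must be the only other high bit set.
    if (code & 0xEF80) != 0x0080:
        return None                       # not a memory controller error
    channel = code & 0xF                  # CCCC; 0xF means "not specified"
    return {
        "transaction": (code >> 4) & 0x7,  # MMM
        "channel": None if channel == 0xF else channel,
    }


print(decode_mem_error(0x0091))
```

A value of 1111 in the channel field means the hardware did not specify
a channel, so callers still need a fallback for that case.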

[We also don't have any intra-DIMM information for those customers who
would like to diagnose the device on the DIMM, or which bits within
the cache line had the error]

-Tony

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-25 18:32                             ` Luck, Tony
@ 2012-04-25 18:44                               ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-25 18:44 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

Em 25-04-2012 15:32, Luck, Tony escreveu:
>> See the driver: the only useful information provided by the MCA log is
>> that an error happened, their physical address, and the type of the 
>> error. Unlikely the Nehalem MCA, the MCE_MISC registers won't point to the
>> DIMM in the error.
> 
> There's a bit more information in the MCA log than just the physical address:
> 
> The cpu number that finds the data in its bank will provide socket information.
> [/proc/cpuinfo maps logical cpu numbers to "physical id"]

Yes, but this seems to be different from the CPU that actually has the memory
controller. The MCA registers have a bit that marks whether the error is at the
same CPU or on another one. So, when there are just 2 CPUs (sockets), this could
be used, but for more than 2 CPUs this field is useless.

So, I opted not to trust it.

> Low order bits of the MCi_STATUS register will give the channel. See the SDM.

In all the tests I did, the channel information reported via MCi_STATUS didn't
match the channel reported via the decoding logic. This might be due
to some bug in the pre-release CPUs I have used so far.

> So the only missing information from the MCA log is which DIMM within
> the channel.  I.e. we can pin the fault to a group of either two or
> three DIMMs depending on how many DIMMS/channel the motherboard supports.
> 
> If you only have one DIMM per channel populated than socket/channel is
> sufficient to identify the DIMM.
> 
> [We also don't have any intra-DIMM information for those customers who
> would like to diagnose the device on the DIMM, or which bits within
> the cache line had the error]
> 
> -Tony


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-25 17:47                           ` Mauro Carvalho Chehab
  2012-04-25 18:32                             ` Luck, Tony
@ 2012-04-26 14:11                             ` Borislav Petkov
  2012-04-26 14:25                               ` Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-26 14:11 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Tony Luck, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Wed, Apr 25, 2012 at 02:47:39PM -0300, Mauro Carvalho Chehab wrote:
> > Ok, this looks like output from those MC_DOD_CH{0,1,2}_{0,1,2}
> > registers. And those are per-channel, actually, with a NUMRANK field
> > which tells you how many ranks the DIMM on this channel has.
> 
> No. there's one register per DIMM there. They're inside a PCI device
> per channel.

Yeah, that's what I meant - I just typed something else :-)

> 
> > (Btw, I'm looking at the corei7 datasheet, doc# 320835-003, couldn't
> > find those MC_DOD*s in the xeon datasheets).
> > 
> > So, the channels display in edac-ctl are the 3 channels, slot{0,1,2} are the
> > physical slots on each channel.
> 
> Yes.
> 
> > 
> > Now let's look at your output from earlier:
> > 
> >> $ ./edac-ctl --layout
> >>        +-----------------------------------+
> >>        |                mc0                |
> >>        | channel0  | channel1  | channel2  |
> >> -------+-----------------------------------+
> >> slot2: |     0 MB  |     0 MB  |     0 MB  |
> >> slot1: |  1024 MB  |     0 MB  |     0 MB  |
> >> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
> >> -------+-----------------------------------+
> >>
> >> Those are the logs that dump the Memory Controller registers:
> >>
> >> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
> > 
> > it says here 2 ranks
> 
> The above output is for the Nehalem machine, with 4 dimms, all single ranked.
> 
> >> [  115.818950] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >> [  115.818955] EDAC DEBUG: get_dimm_config:   dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
> >> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
> > 
> > and here 2 too although there's only one single-ranked DIMM here. So
> > which is it?
> 
> The # of ranks there is the total amount of ranks at the channel.

The total number of ranks of what? The number the channel supports, the
number present on the channel, or the number of physical slots?

I'm just saying it is puzzling because your output says "2 ranks" when
there are 2 single-ranked DIMMs connected to ch0 and also "2 ranks" when
there's only one DIMM connected to ch1.

[..]

> In the case of the EDAC driver, we're relying at the per-DIMM
> information, that is reported via the MCE misc register. Also, there
> are per-DIMM error counters out there. So, while it could, in thesis,
> be possible to use the per-RANK registers and do the error decoding
> without MCA, this can have troubles, in practice, as some BIOSes
> can also be accessing the same registers, which would cause race
> conditions between BIOS and Linux.

BIOS accessing those registers while the OS is running: what is that,
SMM? APEI?

[..]

> >>>> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
> >>>> with dual rank memories. The number of ranks there is just a DIMM property.
> >>>>
> >>>> # ./edac-ctl --layout
> >>>>        +-----------------------------------------------------------------------------------------------+
> >>>>        |                      mc0                      |                      mc1                      |
> >>>>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
> >>>> -------+-----------------------------------------------------------------------------------------------+
> >>>> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
> >>>> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> >>>> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
> >>>> -------+-----------------------------------------------------------------------------------------------+
> >>>>
> >>>> (this machine doesn't have physical DIMM sockets for slot#2)
> > 
> > This looks like a 4-channel memory controller with 3 physical slots per
> > channel.
> 
> Yes, except that this specific motherboard has only 16 physical slots. In
> thesis, it is possible to have a motherboard with 24 physical slots.

Ok, this probably means the memory controller supports 3 slots per
channel but the mobo designer laid out only 2 per channel.

> The driver is not able to detect how many physical slots are inside
> the motherboard, so, it assumes the maximum number of slot that the
> memory controller supports.

Yep.

[..]

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-26 14:11                             ` Borislav Petkov
@ 2012-04-26 14:25                               ` Mauro Carvalho Chehab
  2012-04-26 14:59                                 ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-26 14:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson

Em 26-04-2012 11:11, Borislav Petkov escreveu:
> On Wed, Apr 25, 2012 at 02:47:39PM -0300, Mauro Carvalho Chehab wrote:
>>> Now let's look at your output from earlier:
>>>
>>>> $ ./edac-ctl --layout
>>>>        +-----------------------------------+
>>>>        |                mc0                |
>>>>        | channel0  | channel1  | channel2  |
>>>> -------+-----------------------------------+
>>>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>>>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>>>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>>>> -------+-----------------------------------+
>>>>
>>>> Those are the logs that dump the Memory Controller registers:
>>>>
>>>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>>>
>>> it says here 2 ranks
>>
>> The above output is for the Nehalem machine, with 4 dimms, all single ranked.
>>
>>>> [  115.818950] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>> [  115.818955] EDAC DEBUG: get_dimm_config:   dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>>>
>>> and here 2 too although there's only one single-ranked DIMM here. So
>>> which is it?
>>
>> The # of ranks there is the total amount of ranks at the channel.
> 
> The total amount of ranks what? The channel supports, are present on the
> channel, the number of physical slots?
> 
> I'm just saying it is puzzling because your output says "2 ranks" whent
> there are 2 single-ranked DIMMs connected to ch0 and also "2 ranks" when
> there's only one DIMM connected to ch1.

Ah, ok, now I understand what you meant: yeah, channels 1 and 2 also say
that there are two ranks.

I'll double check what's happening there.

> 
> [..]
> 
>> In the case of the EDAC driver, we're relying at the per-DIMM
>> information, that is reported via the MCE misc register. Also, there
>> are per-DIMM error counters out there. So, while it could, in thesis,
>> be possible to use the per-RANK registers and do the error decoding
>> without MCA, this can have troubles, in practice, as some BIOSes
>> can also be accessing the same registers, which would cause race
>> conditions between BIOS and Linux.
> 
> BIOS accessing those registers while OS is running, what is that SMM?
> APEI?

I was thinking of guarding against races with SMM when I wrote
the code that uses the MCA registers instead of accessing the registers
directly.

> 
> [..]
> 
>>>>>> At Sandy Bridge-EP (E. g. Intel E5 CPUs), we have one machine fully equipped
>>>>>> with dual rank memories. The number of ranks there is just a DIMM property.
>>>>>>
>>>>>> # ./edac-ctl --layout
>>>>>>        +-----------------------------------------------------------------------------------------------+
>>>>>>        |                      mc0                      |                      mc1                      |
>>>>>>        | channel0  | channel1  | channel2  | channel3  | channel0  | channel1  | channel2  | channel3  |
>>>>>> -------+-----------------------------------------------------------------------------------------------+
>>>>>> slot2: |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |     0 MB  |
>>>>>> slot1: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>>>>>> slot0: |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |  4096 MB  |
>>>>>> -------+-----------------------------------------------------------------------------------------------+
>>>>>>
>>>>>> (this machine doesn't have physical DIMM sockets for slot#2)
>>>
>>> This looks like a 4-channel memory controller with 3 physical slots per
>>> channel.
>>
>> Yes, except that this specific motherboard has only 16 physical slots. In
>> thesis, it is possible to have a motherboard with 24 physical slots.
> 
> Ok, this probably means the memory controller supports 3 slots per
> channel but the mobo designer laid out only 2 per channel.

Yes.

Regards,
Mauro.

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it
  2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
@ 2012-04-26 14:26     ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-26 14:26 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Ranganathan Desikan, Arvind R.,
	Niklas Söderlund

On Mon, Apr 16, 2012 at 05:12:07PM -0300, Mauro Carvalho Chehab wrote:
> The way a DIMM is currently represented implies that they're
> linked into a per-csrow struct. However, some drivers don't see
> csrows, as they're ridden behind some chip like the AMB's
> on FBDIMM's, for example.
> 
> This forced drivers to fake^Wvirtualize a csrow struct, and to create
> a mess under csrow/channel original's concept.
> 
> Move the DIMM labels into a per-DIMM struct, and add there
> the real location of the socket, in terms of csrow/channel.
> Later patches will modify the location to properly represent the
> memory architecture.
> 
> All other drivers will use a per-csrow type of location.
> Some of those drivers will require a later conversion, as
> they also fake the csrows internally.
> 
> TODO: While this patch doesn't change the existing behavior, on
> csrows-based memory controllers, a csrow/channel pair points to a memory
> rank. There's a known bug at the EDAC core that allows having different
> labels for the same DIMM, if it has more than one rank. A later patch
> is needed to merge the several ranks for a DIMM into the same dimm_info
> struct, in order to avoid having different labels for the same DIMM.
> 
> The edac_mc_alloc() will now contain a per-dimm initialization loop that
> will be changed by later patches in order to match other types of
> memory architectures.
> 
> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
> Cc: "Arvind R." <arvino55@gmail.com>
> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info
  2012-04-16 20:12   ` [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
@ 2012-04-26 14:26     ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-26 14:26 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Mike Williams, Shaohui Xie, Jason Uhlenkott, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	James Bottomley, Linux Kernel Mailing List, Joe Perches,
	Andrew Morton, linuxppc-dev

On Mon, Apr 16, 2012 at 05:12:08PM -0300, Mauro Carvalho Chehab wrote:
> On systems based on chip select rows, all channels need to use memories
> with the same properties, otherwise the memories on channels A and B
> won't be recognized.
> 
> However, that assumption does not hold for all types of memory
> controllers.
> 
> Controllers for FB-DIMM's don't have such requirements.
> 
> Also, modern Intel controllers seem to be capable of handling such
> differences.
> 
> So, we need to stop storing the DIMM information in per-csrow data
> and store it in the right place instead.
> 
> The first step is to move grain, mtype, dtype and edac_mode to the
> per-dimm struct.
> 
> Reviewed-by: Aristeu Rozanski <arozansk@redhat.com>
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Borislav Petkov <borislav.petkov@amd.com>
> Cc: Mark Gross <mark.gross@intel.com>
> Cc: Jason Uhlenkott <juhlenko@akamai.com>
> Cc: Tim Small <tim@buttersideup.com>
> Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
> Cc: "Arvind R." <arvino55@gmail.com>
> Cc: Olof Johansson <olof@lixom.net>
> Cc: Egor Martovetsky <egor@pasemi.com>
> Cc: Chris Metcalf <cmetcalf@tilera.com>
> Cc: Michal Marek <mmarek@suse.cz>
> Cc: Jiri Kosina <jkosina@suse.cz>
> Cc: Joe Perches <joe@perches.com>
> Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
> Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> Cc: Hitoshi Mitake <h.mitake@gmail.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: James Bottomley <James.Bottomley@parallels.com>
> Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
> Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
> Cc: Josh Boyer <jwboyer@gmail.com>
> Cc: Mike Williams <mike@mikebwilliams.com>
> Cc: linuxppc-dev@lists.ozlabs.org
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

For the amd64_edac and core changes:

Reviewed-by: Borislav Petkov <borislav.petkov@amd.com>

-- 
Regards/Gruss,
Boris.



* Re: [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers
  2012-04-26 14:25                               ` Mauro Carvalho Chehab
@ 2012-04-26 14:59                                 ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-26 14:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Tony Luck, Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson

Em 26-04-2012 11:25, Mauro Carvalho Chehab escreveu:
> Em 26-04-2012 11:11, Borislav Petkov escreveu:
>> On Wed, Apr 25, 2012 at 02:47:39PM -0300, Mauro Carvalho Chehab wrote:
>>>> Now let's look at your output from earlier:
>>>>
>>>>> $ ./edac-ctl --layout
>>>>>        +-----------------------------------+
>>>>>        |                mc0                |
>>>>>        | channel0  | channel1  | channel2  |
>>>>> -------+-----------------------------------+
>>>>> slot2: |     0 MB  |     0 MB  |     0 MB  |
>>>>> slot1: |  1024 MB  |     0 MB  |     0 MB  |
>>>>> slot0: |  1024 MB  |  1024 MB  |  1024 MB  |
>>>>> -------+-----------------------------------+
>>>>>
>>>>> Those are the logs that dump the Memory Controller registers:
>>>>>
>>>>> [  115.818947] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f4031): 2 ranks, UDIMMs
>>>>
>>>> it says here 2 ranks
>>>
>>> The above output is for the Nehalem machine, with 4 dimms, all single ranked.
>>>
>>>>> [  115.818950] EDAC DEBUG: get_dimm_config:   dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>>> [  115.818955] EDAC DEBUG: get_dimm_config:   dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400
>>>>> [  115.818982] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f4031): 2 ranks, UDIMMs
>>>>
>>>> and here 2 too although there's only one single-ranked DIMM here. So
>>>> which is it?
>>>
>>> The # of ranks there is the total amount of ranks at the channel.
>>
>> The total amount of ranks what? The channel supports, are present on the
>> channel, the number of physical slots?
>>
>> I'm just saying it is puzzling because your output says "2 ranks" when
>> there are 2 single-ranked DIMMs connected to ch0 and also "2 ranks" when
>> there's only one DIMM connected to ch1.
> 
> Ah, ok, now I understand what you meant: yeah, channels 1 and 2 also say
> that there are two ranks.
> 
> I'll double check what's happening there.
> 

This was due to the way the driver reports that this channel doesn't
have any 4 Rank memories (i.e. all memories are either 1R or 2R).

The enclosed patch should improve the debug output information.

Thanks for pointing it out,
Mauro

---

From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Thu, 26 Apr 2012 11:47:29 -0300
Subject: [PATCH] i7core: fix ranks information at the per-channel struct

There is a flag in the per-channel struct that indicates whether there is
any 4R DIMM on the channel. The way the presence of this flag was reported
is not ok, as it might give the false idea that the channel was filled
with 2R memories:

[  580.588701] EDAC DEBUG: get_dimm_config: Ch1 phy rd1, wr1 (0x063f7431): 2 ranks, UDIMMs
[  580.588704] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, only one 1R memory is populated on channel 1)

So, use a better way to represent the per-channel ranks information.
After the patch, it will show:

[ 2002.233978] EDAC DEBUG: get_dimm_config: Ch0 phy rd0, wr0 (0x063f7431): UDIMMs
[ 2002.233982] EDAC DEBUG: get_dimm_config: 	dimm 0 1024 Mb offset: 0, bank: 8, rank: 1, row: 0x4000, col: 0x400
[ 2002.233988] EDAC DEBUG: get_dimm_config: 	dimm 1 1024 Mb offset: 4, bank: 8, rank: 1, row: 0x4000, col: 0x400

(in this case, there aren't any 4R memories)

Reported-by: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index dfdee48..f4a0fe1 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -221,7 +221,9 @@ struct i7core_inject {
 };
 
 struct i7core_channel {
-	u32		ranks;
+	bool		is_3dimms_present;
+	bool		is_single_4rank;
+	bool		has_4rank;
 	u32		dimms;
 };
 
@@ -557,21 +559,20 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 		pci_read_config_dword(pvt->pci_ch[i][0],
 				MC_CHANNEL_DIMM_INIT_PARAMS, &data);
 
-		pvt->channel[i].ranks = (data & QUAD_RANK_PRESENT) ?
-						4 : 2;
+
+		if (data & THREE_DIMMS_PRESENT)
+			pvt->channel[i].is_3dimms_present = true;
+
+		if (data & SINGLE_QUAD_RANK_PRESENT)
+			pvt->channel[i].is_single_4rank = true;
+
+		if (data & QUAD_RANK_PRESENT)
+			pvt->channel[i].has_4rank = true;
 
 		if (data & REGISTERED_DIMM)
 			mtype = MEM_RDDR3;
 		else
 			mtype = MEM_DDR3;
-#if 0
-		if (data & THREE_DIMMS_PRESENT)
-			pvt->channel[i].dimms = 3;
-		else if (data & SINGLE_QUAD_RANK_PRESENT)
-			pvt->channel[i].dimms = 1;
-		else
-			pvt->channel[i].dimms = 2;
-#endif
 
 		/* Devices 4-6 function 1 */
 		pci_read_config_dword(pvt->pci_ch[i][1],
@@ -582,11 +583,13 @@ static int get_dimm_config(struct mem_ctl_info *mci)
 				MC_DOD_CH_DIMM2, &dimm_dod[2]);
 
 		debugf0("Ch%d phy rd%d, wr%d (0x%08x): "
-			"%d ranks, %cDIMMs\n",
+			"%s%s%s%cDIMMs\n",
 			i,
 			RDLCH(pvt->info.ch_map, i), WRLCH(pvt->info.ch_map, i),
 			data,
-			pvt->channel[i].ranks,
+			pvt->channel[i].is_3dimms_present ? "3DIMMS " : "",
+			pvt->channel[i].is_single_4rank ? "SINGLE_4R " : "",
+			pvt->channel[i].has_4rank ? "HAS_4R " : "",
 			(data & REGISTERED_DIMM) ? 'R' : 'U');
 
 		for (j = 0; j < 3; j++) {
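
For illustration, the reporting change described in the patch above can be
modelled outside the kernel. The sketch below decodes a channel register
value into independent flags and builds the same style of debug string. The
bit positions and every ex_/EX_ name are assumptions made for this example;
the authoritative definitions live in drivers/edac/i7core_edac.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed bit positions, for illustration only. */
#define EX_THREE_DIMMS_PRESENT       (1u << 24)
#define EX_SINGLE_QUAD_RANK_PRESENT  (1u << 23)
#define EX_QUAD_RANK_PRESENT         (1u << 22)
#define EX_REGISTERED_DIMM           (1u << 15)

/* One boolean per hardware flag instead of a derived "ranks" count. */
struct ex_channel {
    bool is_3dimms_present;
    bool is_single_4rank;
    bool has_4rank;
    bool is_registered;
};

/* Decode the raw register value into independent flags. */
struct ex_channel ex_decode_channel(uint32_t data)
{
    struct ex_channel ch;

    ch.is_3dimms_present = (data & EX_THREE_DIMMS_PRESENT) != 0;
    ch.is_single_4rank   = (data & EX_SINGLE_QUAD_RANK_PRESENT) != 0;
    ch.has_4rank         = (data & EX_QUAD_RANK_PRESENT) != 0;
    ch.is_registered     = (data & EX_REGISTERED_DIMM) != 0;
    return ch;
}

/*
 * Build the descriptive part of the debug line. Each flag is printed
 * only when set, so a channel without 4R memories shows just "UDIMMs"
 * or "RDIMMs" and never suggests a rank count.
 */
int ex_format_channel(char *buf, size_t len, uint32_t data)
{
    struct ex_channel ch = ex_decode_channel(data);

    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%s%s%s%cDIMMs",
                    ch.is_3dimms_present ? "3DIMMS " : "",
                    ch.is_single_4rank ? "SINGLE_4R " : "",
                    ch.has_4rank ? "HAS_4R " : "",
                    ch.is_registered ? 'R' : 'U');
}
```

Under these assumed bit positions, the register value 0x063f7431 from the
log above has none of the flags set, so the formatted text is "UDIMMs".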


* Re: [EDAC_ABI PATCH v13 24/26] tile_edac: convert driver to use the new edac ABI
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 24/26] tile_edac: " Mauro Carvalho Chehab
@ 2012-04-26 19:47       ` Chris Metcalf
  0 siblings, 0 replies; 206+ messages in thread
From: Chris Metcalf @ 2012-04-26 19:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab; +Cc: Linux Edac Mailing List, Linux Kernel Mailing List

On 4/16/2012 4:21 PM, Mauro Carvalho Chehab wrote:
> The legacy edac ABI is going to be removed. Port the driver to use
> and benefit from the new API functionality.
>
> Cc: Chris Metcalf <cmetcalf@tilera.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

For this and the other tile-related changes over the last couple of weeks:

Acked-by: Chris Metcalf <cmetcalf@tilera.com>

-- 
Chris Metcalf, Tilera Corp.
http://www.tilera.com



* Re: [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI
  2012-04-24 18:15                               ` [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
@ 2012-04-27 10:42                                 ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 10:42 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 24-04-2012 15:15, Mauro Carvalho Chehab escreveu:
> The legacy edac ABI is going to be removed. Port the driver to use
> and benefit from the new API functionality.
> 
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Borislav Petkov <borislav.petkov@amd.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Ping?

> ---
> 
> v16: Only context changes
> 
>  drivers/edac/amd64_edac.c |  137 ++++++++++++++++++++++++++++++---------------
>  1 files changed, 92 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 6d6ec68..b13d5a0 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	int channel, csrow;
>  	u32 page, offset;
>  
> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
> +
> +	/*
> +	 * Find out which node the error address belongs to. This may be
> +	 * different from the node that detected the error.
> +	 */
> +	src_mci = find_mc_by_sys_addr(mci, sys_addr);
> +	if (!src_mci) {
> +		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
> +			     (unsigned long)sys_addr);
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a node",
> +				     NULL);
> +		return;
> +	}
> +
> +	/* Now map the sys_addr to a CSROW */
> +	csrow = sys_addr_to_csrow(src_mci, sys_addr);
> +	if (csrow < 0) {
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a csrow",
> +				     NULL);
> +		return;
> +	}
> +
>  	/* CHIPKILL enabled */
>  	if (pvt->nbcfg & NBCFG_CHIPKILL) {
>  		channel = get_channel_from_ecc_syndrome(mci, syndrome);
> @@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  			 * 2 DIMMs is in error. So we need to ID 'both' of them
>  			 * as suspect.
>  			 */
> -			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
> -					   "error reporting race\n", syndrome);
> -			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> +			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
> +				      "possible error reporting race\n",
> +				      syndrome);
> +			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +					     page, offset, syndrome,
> +					     csrow, -1, -1,
> +					     EDAC_MOD_STR,
> +					     "unknown syndrome - possible error reporting race",
> +					     NULL);
>  			return;
>  		}
>  	} else {
> @@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  		channel = ((sys_addr & BIT(3)) != 0);
>  	}
>  
> -	/*
> -	 * Find out which node the error address belongs to. This may be
> -	 * different from the node that detected the error.
> -	 */
> -	src_mci = find_mc_by_sys_addr(mci, sys_addr);
> -	if (!src_mci) {
> -		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
> -			     (unsigned long)sys_addr);
> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> -		return;
> -	}
> -
> -	/* Now map the sys_addr to a CSROW */
> -	csrow = sys_addr_to_csrow(src_mci, sys_addr);
> -	if (csrow < 0) {
> -		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
> -	} else {
> -		error_address_to_page_and_offset(sys_addr, &page, &offset);
> -
> -		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
> -				  channel, EDAC_MOD_STR);
> -	}
> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
> +			     page, offset, syndrome,
> +			     csrow, channel, -1,
> +			     EDAC_MOD_STR, "", NULL);
>  }
>  
>  static int ddr2_cs_size(unsigned i, bool dct_width)
> @@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	u32 page, offset;
>  	int nid, csrow, chan = 0;
>  
> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
> +
>  	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
>  
>  	if (csrow < 0) {
> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a csrow",
> +				     NULL);
>  		return;
>  	}
>  
> -	error_address_to_page_and_offset(sys_addr, &page, &offset);
> -
>  	/*
>  	 * We need the syndromes for channel detection only when we're
>  	 * ganged. Otherwise @chan should already contain the channel at
> @@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	if (dct_ganging_enabled(pvt))
>  		chan = get_channel_from_ecc_syndrome(mci, syndrome);
>  
> -	if (chan >= 0)
> -		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
> -				  EDAC_MOD_STR);
> -	else
> -		/*
> -		 * Channel unknown, report all channels on this CSROW as failed.
> -		 */
> -		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
> -			edac_mc_handle_ce(mci, page, offset, syndrome,
> -					  csrow, chan, EDAC_MOD_STR);
> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				page, offset, syndrome,
> +				csrow, chan, -1,
> +				EDAC_MOD_STR, "", NULL);
>  }
>  
>  /*
> @@ -1875,7 +1893,12 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
>  	/* Ensure that the Error Address is VALID */
>  	if (!(m->status & MCI_STATUS_ADDRV)) {
>  		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     0, 0, 0,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "HW has no ERROR_ADDRESS available",
> +				     NULL);
>  		return;
>  	}
>  
> @@ -1899,11 +1922,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
>  
>  	if (!(m->status & MCI_STATUS_ADDRV)) {
>  		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
> -		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +				     0, 0, 0,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "HW has no ERROR_ADDRESS available",
> +				     NULL);
>  		return;
>  	}
>  
>  	sys_addr = get_error_address(m);
> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
>  
>  	/*
>  	 * Find out which node the error address belongs to. This may be
> @@ -1913,7 +1942,11 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
>  	if (!src_mci) {
>  		amd64_mc_err(mci, "ERROR ADDRESS (0x%lx) NOT mapped to a MC\n",
>  				  (unsigned long)sys_addr);
> -		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +				     page, offset, 0,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "ERROR ADDRESS NOT mapped to a MC", NULL);
>  		return;
>  	}
>  
> @@ -1923,10 +1956,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
>  	if (csrow < 0) {
>  		amd64_mc_err(mci, "ERROR_ADDRESS (0x%lx) NOT mapped to CS\n",
>  				  (unsigned long)sys_addr);
> -		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +				     page, offset, 0,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "ERROR ADDRESS NOT mapped to CS",
> +				     NULL);
>  	} else {
> -		error_address_to_page_and_offset(sys_addr, &page, &offset);
> -		edac_mc_handle_ue(log_mci, page, offset, csrow, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +				     page, offset, 0,
> +				     csrow, -1, -1,
> +				     EDAC_MOD_STR, "", NULL);
>  	}
>  }
>  
> @@ -2486,6 +2526,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
>  	struct amd64_pvt *pvt = NULL;
>  	struct amd64_family_type *fam_type = NULL;
>  	struct mem_ctl_info *mci = NULL;
> +	struct edac_mc_layer layers[2];
>  	int err = 0, ret;
>  	u8 nid = get_node_id(F2);
>  
> @@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
>  		goto err_siblings;
>  
>  	ret = -ENOMEM;
> -	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
> +	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +	layers[0].size = pvt->csels[0].b_cnt;
> +	layers[0].is_virt_csrow = true;
> +	layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +	layers[1].size = pvt->channel_count;
> +	layers[1].is_virt_csrow = false;
> +	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, false, 0);
>  	if (!mci)
>  		goto err_siblings;
>  



* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-24 18:15                             ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
@ 2012-04-27 13:33                                 ` Borislav Petkov
  2012-04-27 13:33                                 ` Borislav Petkov
  1 sibling, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-27 13:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Btw,

this patch gives

[    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
[    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
[    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
[    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
[    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
[    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
[    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
[    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
[    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
[    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
[    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
[    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
[    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
[    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
[    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
[    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1

and the memory controller has the following chip selects

[    8.137662] EDAC MC: DCT0 chip selects:
[    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
[    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
[    8.180499] EDAC MC: DCT1 chip selects:
[    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
[    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB

Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
misleading and has nothing to do with reality. So, if this is to use
your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
a rank.

Or, the most correct thing to do would be to have dimm0-dimm3, each
dual-ranked.

So either tot_dimms is computed wrongly or there's a more serious error
somewhere.
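
To make the numbers above concrete: the 16 entries in the debug output are
what falls out when every (csrow, channel) pair gets its own entry, i.e. 8
chip-select rows times 2 channels, regardless of how many are populated.
The sketch below reproduces that enumeration. It is a standalone model
written for this discussion, with assumed ex_ names, and is not the code
from edac_mc.c.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EX_MAX_LAYERS 3

/* Total number of entries: the product of all layer sizes. */
unsigned ex_tot_dimms(const unsigned *sizes, unsigned n_layers)
{
    unsigned i, tot = 1;

    assert(n_layers >= 1 && n_layers <= EX_MAX_LAYERS);
    for (i = 0; i < n_layers; i++)
        tot *= sizes[i];
    return tot;
}

/*
 * Split a flat entry index into one position per layer, with the last
 * layer varying fastest. For layers {csrow, channel} this gives
 * csrow = idx / nr_channels and channel = idx % nr_channels.
 */
void ex_index_to_pos(unsigned idx, const unsigned *sizes,
                     unsigned n_layers, unsigned *pos)
{
    unsigned i;

    assert(n_layers >= 1 && n_layers <= EX_MAX_LAYERS);
    for (i = n_layers; i-- > 0; ) {
        pos[i] = idx % sizes[i];
        idx /= sizes[i];
    }
}

/* Print one line per entry, in the style of the debug output above. */
void ex_dump_entries(const unsigned *sizes, unsigned n_layers)
{
    unsigned idx, pos[EX_MAX_LAYERS] = { 0, 0, 0 };
    unsigned tot = ex_tot_dimms(sizes, n_layers);

    for (idx = 0; idx < tot; idx++) {
        ex_index_to_pos(idx, sizes, n_layers, pos);
        printf("%u: dimm%u (%u:%u:%u)\n", idx, idx, pos[0], pos[1], pos[2]);
    }
}
```

With sizes {8, 2}, ex_tot_dimms() returns 16 and index 5 maps to csrow 2,
channel 1, matching the "dimm5 (2:1:0)" line in the log.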

I've reviewed almost half of the patch, will review the rest when/if we
sort out the above issue first.

Thanks.

On Tue, Apr 24, 2012 at 03:15:41PM -0300, Mauro Carvalho Chehab wrote:
> Change the EDAC internal representation to work with non-csrow
> based memory controllers.
> 
> There are lots of those memory controllers nowadays, and more
> are coming. So, the EDAC internal representation needs to be
> changed, in order to work with those memory controllers, while
> preserving backward compatibility with the old ones.
> 
> The edac core were written with the idea that memory controllers

		was

> are able to directly access csrows, and that the channels are
> used inside a csrows select.

This sounds funny, simply remove that second part about the channels.

> This is not true for FB-DIMM and RAMBUS memory controllers.
> 
> Also, some recent advanced memory controllers don't present a per-csrows
> view. Instead, they view memories as DIMM's, instead of ranks, accessed

					DIMMs instead of ranks.

Remove the rest.

> via csrow/channel.
> 
> So, change the allocation and error report routines to allow
> them to work with all types of architectures.
> 
> This will allow the removal of several hacks on FB-DIMM and RAMBUS

					       with

> memory controllers on the next patches.

		    . Remove the rest.

> 
> Also, several tests were done on different platforms using different
> x86 drivers.
> 
> TODO: a multi-rank DIMM's are currently represented by multiple DIMM

	Multi-rank DIMMs

> entries at struct dimm_info. That means that changing a label for one

	  in

> rank won't change the same label for the other ranks at the same dimm.

						       of the same DIMM.

> Such bug is there since the beginning of the EDAC, so it is not a big

  This bug is present ..

> deal. However, on several drivers, it is possible to fix this issue, but

		remove "on"

> it should be a per-driver fix, as the csrow => DIMM arrangement may not
> be equal for all. So, don't try to fix it here yet.
> 
> PS.: I tried to make this patch as short as possible, preceding it with

Remove "PS."

> several other patches that simplified the logic here. Yet, as the
> internal API changes, all drivers need changes. The changes are
> generally bigger on the drivers for FB-DIMM's.

		   in 		   for FB-DIMMs.

> 
> FIXME: while the FB-DIMMs are not converted to use the new
> design, uncorrected errors will show just one channel. In
> the past, all changes were on a big patch with about 150K.
> As it needed to be split, in order to be accepted by the
> EDAC ML at vger, we've opted to have this small drawback.
> As an advantage, it is now easier to review the patch series.

This whole paragraph above doesn't have anything to do with what the
patch does, so it can go.

[..]

> ---
> 
> v16: Only context changes
> 
>  drivers/edac/edac_core.h |   92 ++++++-
>  drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
>  include/linux/edac.h     |   40 ++-
>  3 files changed, 526 insertions(+), 288 deletions(-)
> 
> diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
> index e48ab31..7201bb1 100644
> --- a/drivers/edac/edac_core.h
> +++ b/drivers/edac/edac_core.h
> @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
>  
>  #endif				/* CONFIG_PCI */
>  
> -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> -					  unsigned nr_chans, int edac_index);
> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> +				   unsigned nr_chans, int edac_index);

Why not "extern"?

> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +				   unsigned n_layers,
> +				   struct edac_mc_layer *layers,
> +				   bool rev_order,
> +				   unsigned sz_pvt);

ditto.

>  extern int edac_mc_add_mc(struct mem_ctl_info *mci);
>  extern void edac_mc_free(struct mem_ctl_info *mci);
>  extern struct mem_ctl_info *edac_mc_find(int idx);
> @@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
>   * reporting logic and function interface - reduces conditional
>   * statement clutter and extra function arguments.
>   */
> -extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
> +
> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
> +			  struct mem_ctl_info *mci,
> +			  const unsigned long page_frame_number,
> +			  const unsigned long offset_in_page,
> +			  const unsigned long syndrome,
> +			  const int layer0,
> +			  const int layer1,
> +			  const int layer2,
> +			  const char *msg,
> +			  const char *other_detail,
> +			  const void *mcelog);

Why isn't this one "extern" either?

> +
> +static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  			      unsigned long page_frame_number,
>  			      unsigned long offset_in_page,
>  			      unsigned long syndrome, int row, int channel,
> -			      const char *msg);

Strange alignment, pls do

static inline void edac_mc_handle_ce(struct...,
				     unsigned...,
				     ...,
				     ...);


> -extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
> -				      const char *msg);
> -extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
> +			      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      page_frame_number, offset_in_page, syndrome,
> +		              row, channel, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
> +				      const char *msg)

ditto.

> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
>  			      unsigned long page_frame_number,
>  			      unsigned long offset_in_page, int row,
> -			      const char *msg);

ditto.

> -extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
> -				      const char *msg);
> -extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
> -				  unsigned int channel0, unsigned int channel1,
> -				  char *msg);
> -extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
> -				  unsigned int channel, char *msg);
> +			      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      page_frame_number, offset_in_page, 0,
> +		              row, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
> +				      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
> +					 unsigned int csrow,
> +					 unsigned int channel0,
> +					 unsigned int channel1,
> +					 char *msg)

Now this alignment looks correct.

> +{
> +	/*
> +	 *FIXME: The error can also be at channel1 (e. g. at the second
> +	 *	  channel of the same branch). The fix is to push
> +	 *	  edac_mc_handle_error() call into each driver
> +	 */
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      0, 0, 0,
> +		              csrow, channel0, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
> +					 unsigned int csrow,
> +					 unsigned int channel, char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      0, 0, 0,
> +		              csrow, channel, -1, msg, NULL, NULL);
> +}
> +
> +

Two superfluous newlines.

>  
>  /*
>   * edac_device APIs
> @@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
>  extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
>  				int inst_nr, int block_nr, const char *msg);
>  extern int edac_device_alloc_index(void);
> +extern const char *edac_layer_name[];
>  
>  /*
>   * edac_pci APIs
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 6ec967a..4d4d8b7 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  	debugf4("\tchannel = %p\n", chan);
>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
> -	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
> -	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
> -	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
> +	debugf4("\tchannel->dimm = %p\n", chan->dimm);
> +}
> +
> +static void edac_mc_dump_dimm(struct dimm_info *dimm)
> +{
> +	int i;
> +
> +	debugf4("\tdimm = %p\n", dimm);
> +	debugf4("\tdimm->label = '%s'\n", dimm->label);
> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
> +	debugf4("\tdimm location ");
> +	for (i = 0; i < dimm->mci->n_layers; i++) {
> +		printk(KERN_CONT "%d", dimm->location[i]);
> +		if (i < dimm->mci->n_layers - 1)
> +			printk(KERN_CONT ".");
> +	}
> +	printk(KERN_CONT "\n");

This looks hacky, but I don't have a good suggestion for what to do
instead here. Maybe snprintf the location into one complete string which
you can then issue with a single debugf4() call...
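
Something along these lines, as a rough userspace sketch only: the
helper name format_dimm_location() and the MAX_LAYERS value are
assumptions made for illustration, not anything from the patch. In the
kernel, scnprintf() would probably be the more natural choice than the
plain snprintf() used here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS; the value is assumed */

/*
 * Build "a.b.c" from the first n_layers entries of location[].
 * Returns the number of characters stored in buf, not counting the NUL.
 * Hypothetical helper, not part of the patch.
 */
static int format_dimm_location(char *buf, size_t len,
				const unsigned *location, unsigned n_layers)
{
	size_t used = 0;
	unsigned i;

	assert(buf != NULL && len > 0 && n_layers <= MAX_LAYERS);
	buf[0] = '\0';

	for (i = 0; i < n_layers; i++) {
		/* Prefix every element but the first with a dot */
		int n = snprintf(buf + used, len - used, "%s%u",
				 i ? "." : "", location[i]);

		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* drop the piece that did not fit */
			break;
		}
		used += (size_t)n;
	}

	return (int)used;
}
```

The resulting string could then go out in one
debugf4("\tdimm location %s\n", buf) call, which avoids the
KERN_CONT pieces being interleaved with other messages.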

> +	debugf4("\tdimm->grain = %d\n", dimm->grain);
> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>  }
>  
>  static void edac_mc_dump_csrow(struct csrow_info *csrow)
> @@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
>  	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
>  	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
>  		mci->nr_csrows, mci->csrows);
> +	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",

		      ->tot_dimms      dimms

> +		mci->tot_dimms, mci->dimms);
>  	debugf3("\tdev = %p\n", mci->dev);
>  	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
>  	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
> @@ -157,10 +175,25 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>  }
>  
>  /**
> - * edac_mc_alloc: Allocate a struct mem_ctl_info structure
> - * @size_pvt:	size of private storage needed
> - * @nr_csrows:	Number of CWROWS needed for this MC
> - * @nr_chans:	Number of channels for the MC
> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure

					    fill

> + * @edac_index:		Memory controller number
> + * @n_layers:		Number of layers at the MC hierarchy

				Number of MC hierarchy layers

> + * layers:		Describes each layer as seen by the Memory Controller
> + * @rev_order:		Fills csrows/cs channels at the reverse order

				      csrows/channels in reverse order

> + * @size_pvt:		size of private storage needed
> + *
> + *
> + * FIXME: drivers handle multi-rank memories on different ways: on some

						in		   in

> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,

			      memory stick			   in

> + * a single multi-rank DIMM would be mapped into several "dimms".

			  memory stick

> + *
> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong

				   csrow-based

> + * thing, as two chip select values are used for dual-rank memories (and 4, for
> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
> + *
> + * In summary, solving this issue is not easy, as it requires a lot of testing.
>   *
>   * Everything is kmalloc'ed as one big chunk - more efficient.
>   * Only can be used if all structures have the same lifetime - otherwise
> @@ -172,18 +205,41 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>   *	NULL allocation failed
>   *	struct mem_ctl_info pointer
>   */
> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> -				unsigned nr_chans, int edac_index)
> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +				   unsigned n_layers,
> +				   struct edac_mc_layer *layers,
> +				   bool rev_order,
> +				   unsigned sz_pvt)

strange function argument vertical alignment

>  {
>  	void *ptr = NULL;
>  	struct mem_ctl_info *mci;
> -	struct csrow_info *csi, *csrow;
> +	struct edac_mc_layer *lay;

As before, call this "layers" pls.

> +	struct csrow_info *csi, *csr;
>  	struct rank_info *chi, *chp, *chan;
>  	struct dimm_info *dimm;
> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>  	void *pvt;
> -	unsigned size;
> -	int row, chn;
> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> +	unsigned tot_csrows, tot_cschannels;

No need to call this "tot_cschannels" - "tot_channels" should be enough.

> +	int i, j;
>  	int err;
> +	int row, chn;

All those local variables should be sorted in reverse christmas tree
order, longest declaration first:

	u32 this_is_the_longest_array_name[LENGTH];
	void *shorter_named_variable;
	unsigned long size;
	int i;

	...

> +
> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);


Push this BUG_ON up into edac_mc_alloc as the first thing this function
does. Also, is it valid to have n_layers == 0? In that case the memcpy
call below will do nothing.
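
To illustrate the point, here is a minimal userspace sketch that checks
the layer count before doing anything else. struct layer, struct totals
and layer_totals() are hypothetical stand-ins, and treating
n_layers == 0 as invalid is an assumption, since that question is still
open:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS; the value is assumed */

/* Reduced stand-in for struct edac_mc_layer: only the fields used here */
struct layer {
	unsigned size;
	bool is_virt_csrow;
};

struct totals {
	unsigned dimms, csrows, channels;
};

/*
 * Validate the layer description first, then compute the totals.
 * Returns false where the kernel code would BUG_ON() or bail out.
 */
static bool layer_totals(const struct layer *layers, unsigned n_layers,
			 struct totals *out)
{
	/* Initialized at declaration time instead of in the function body */
	struct totals t = { .dimms = 1, .csrows = 1, .channels = 1 };
	unsigned i;

	assert(out != NULL);

	if (!layers || n_layers == 0 || n_layers > MAX_LAYERS)
		return false;

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}

	*out = t;
	return true;
}
```

With the two layers the compat edac_mc_alloc() wrapper sets up (a
csrow layer and a channel layer), this yields
csrows * channels dimms, matching what the patch computes.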


> +	/*
> +	 * Calculate the total amount of dimms and csrows/cschannels while
> +	 * in the old API emulation mode
> +	 */
> +	tot_dimms = 1;
> +	tot_cschannels = 1;
> +	tot_csrows = 1;

Those initializations can be done above at variable declaration time.

> +	for (i = 0; i < n_layers; i++) {
> +		tot_dimms *= layers[i].size;
> +		if (layers[i].is_virt_csrow)
> +			tot_csrows *= layers[i].size;
> +		else
> +			tot_cschannels *= layers[i].size;
> +	}
>  
>  	/* Figure out the offsets of the various items from the start of an mc
>  	 * structure.  We want the alignment of each item to be at least as
> @@ -191,12 +247,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 * hardcode everything into a single struct.
>  	 */
>  	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
> -	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
> -	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
> -	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
> +	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
> +	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
> +	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
> +	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
> +	count = 1;

ditto.

> +	for (i = 0; i < n_layers; i++) {
> +		count *= layers[i].size;
> +		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
> +		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
> +	}
>  	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
>  	size = ((unsigned long)pvt) + sz_pvt;
>  
> +	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
> +		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
>  	mci = kzalloc(size, GFP_KERNEL);
>  	if (mci == NULL)
>  		return NULL;
> @@ -204,42 +269,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	/* Adjust pointers so they point within the memory we just allocated
>  	 * rather than an imaginary chunk of memory located at address 0.
>  	 */
> +	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
>  	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
>  	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
>  	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
> +	for (i = 0; i < n_layers; i++) {
> +		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
> +		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
> +	}
>  	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
>  
>  	/* setup index and various internal pointers */
>  	mci->mc_idx = edac_index;
>  	mci->csrows = csi;
>  	mci->dimms  = dimm;
> +	mci->tot_dimms = tot_dimms;
>  	mci->pvt_info = pvt;
> -	mci->nr_csrows = nr_csrows;
> +	mci->n_layers = n_layers;
> +	mci->layers = lay;
> +	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
> +	mci->nr_csrows = tot_csrows;
> +	mci->num_cschannel = tot_cschannels;
>  
>  	/*
> -	 * For now, assumes that a per-csrow arrangement for dimms.
> -	 * This will be latter changed.
> +	 * Fills the csrow struct
>  	 */
> -	dimm = mci->dimms;
> -
> -	for (row = 0; row < nr_csrows; row++) {
> -		csrow = &csi[row];
> -		csrow->csrow_idx = row;
> -		csrow->mci = mci;
> -		csrow->nr_channels = nr_chans;
> -		chp = &chi[row * nr_chans];
> -		csrow->channels = chp;
> -
> -		for (chn = 0; chn < nr_chans; chn++) {
> +	for (row = 0; row < tot_csrows; row++) {
> +		csr = &csi[row];
> +		csr->csrow_idx = row;
> +		csr->mci = mci;
> +		csr->nr_channels = tot_cschannels;
> +		chp = &chi[row * tot_cschannels];
> +		csr->channels = chp;
> +
> +		for (chn = 0; chn < tot_cschannels; chn++) {
>  			chan = &chp[chn];
>  			chan->chan_idx = chn;
> -			chan->csrow = csrow;
> +			chan->csrow = csr;
> +		}
> +	}
>  
> -			mci->csrows[row].channels[chn].dimm = dimm;
> -			dimm->csrow = row;
> -			dimm->csrow_channel = chn;
> -			dimm++;
> -			mci->nr_dimms++;
> +	/*
> +	 * Fills the dimm struct
> +	 */
> +	memset(&pos, 0, sizeof(pos));
> +	row = 0;
> +	chn = 0;
> +	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
> +	for (i = 0; i < tot_dimms; i++) {
> +		chan = &csi[row].channels[chn];
> +		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
> +			       pos[0], pos[1], pos[2]);
> +		dimm->mci = mci;
> +
> +		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
> +			i, (dimm - mci->dimms),
> +			pos[0], pos[1], pos[2], row, chn);
> +
> +		/* Copy DIMM location */
> +		for (j = 0; j < n_layers; j++)
> +			dimm->location[j] = pos[j];
> +
> +		/* Link it to the csrows old API data */
> +		chan->dimm = dimm;
> +		dimm->csrow = row;
> +		dimm->cschannel = chn;
> +
> +		/* Increment csrow location */
> +		if (!rev_order) {
> +			for (j = n_layers - 1; j >= 0; j--)
> +				if (!layers[j].is_virt_csrow)
> +					break;
> +			chn++;
> +			if (chn == tot_cschannels) {
> +				chn = 0;
> +				row++;
> +			}
> +		} else {
> +			for (j = n_layers - 1; j >= 0; j--)
> +				if (layers[j].is_virt_csrow)
> +					break;
> +			row++;
> +			if (row == tot_csrows) {
> +				row = 0;
> +				chn++;
> +			}
> +		}
> +
> +		/* Increment dimm location */
> +		for (j = n_layers - 1; j >= 0; j--) {
> +			pos[j]++;
> +			if (pos[j] < layers[j].size)
> +				break;
> +			pos[j] = 0;
>  		}
>  	}
>  
> @@ -263,6 +385,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 */
>  	return mci;
>  }
> +EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
> +
> +/**
> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
> + * @edac_index:		Memory controller number
> + * @n_layers:		Number of layers at the MC hierarchy
> + * layers:		Describes each layer as seen by the Memory Controller
> + * @rev_order:		Fills csrows/cs channels at the reverse order
> + * @size_pvt:		size of private storage needed
> + *
> + *
> + * FIXME: drivers handle multi-rank memories on different ways: on some
> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
> + * a single multi-rank DIMM would be mapped into several "dimms".
> + *
> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
> + * thing, as two chip select values are used for dual-rank memories (and 4, for
> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
> + *
> + * In summary, solving this issue is not easy, as it requires a lot of testing.
> + *
> + * Everything is kmalloc'ed as one big chunk - more efficient.
> + * Only can be used if all structures have the same lifetime - otherwise
> + * you have to allocate and initialize your own structures.
> + *
> + * Use edac_mc_free() to free mc structures allocated by this function.
> + *
> + * Returns:
> + *	NULL allocation failed
> + *	struct mem_ctl_info pointer
> + */
> +
> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> +				   unsigned nr_chans, int edac_index)
> +{
> +	unsigned n_layers = 2;
> +	struct edac_mc_layer layers[n_layers];
> +
> +	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +	layers[0].size = nr_csrows;
> +	layers[0].is_virt_csrow = true;
> +	layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +	layers[1].size = nr_chans;
> +	layers[1].is_virt_csrow = false;
> +
> +	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
> +			  false, sz_pvt);
> +}
>  EXPORT_SYMBOL_GPL(edac_mc_alloc);
>  
>  /**
> @@ -528,7 +701,6 @@ EXPORT_SYMBOL(edac_mc_find);
>   * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
>   *                 create sysfs entries associated with mci structure
>   * @mci: pointer to the mci structure to be added to the list
> - * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
>   *
>   * Return:
>   *	0	Success
> @@ -555,6 +727,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
>  				edac_mc_dump_channel(&mci->csrows[i].
>  						channels[j]);
>  		}
> +		for (i = 0; i < mci->tot_dimms; i++)
> +			edac_mc_dump_dimm(&mci->dimms[i]);
>  	}
>  #endif
>  	mutex_lock(&mem_ctls_mutex);
> @@ -712,261 +886,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
>  }
>  EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
>  
> -/* FIXME - setable log (warning/emerg) levels */
> -/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
> -void edac_mc_handle_ce(struct mem_ctl_info *mci,
> -		unsigned long page_frame_number,
> -		unsigned long offset_in_page, unsigned long syndrome,
> -		int row, int channel, const char *msg)
> +const char *edac_layer_name[] = {
> +	[EDAC_MC_LAYER_BRANCH] = "branch",
> +	[EDAC_MC_LAYER_CHANNEL] = "channel",
> +	[EDAC_MC_LAYER_SLOT] = "slot",
> +	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
> +};
> +EXPORT_SYMBOL_GPL(edac_layer_name);
> +
> +static void edac_increment_ce_error(struct mem_ctl_info *mci,
> +				    bool enable_filter,
> +				    unsigned pos[EDAC_MAX_LAYERS])
>  {
> -	unsigned long remapped_page;
> -	char *label = NULL;
> -	u32 grain;
> +	int i, index = 0;
>  
> -	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
> +	mci->ce_mc++;
>  
> -	/* FIXME - maybe make panic on INTERNAL ERROR an option */
> -	if (row >= mci->nr_csrows || row < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range "
> -			"(%d >= %d)\n", row, mci->nr_csrows);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> +	if (!enable_filter) {
> +		mci->ce_noinfo_count++;
>  		return;
>  	}
>  
> -	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel out of range "
> -			"(%d >= %d)\n", channel,
> -			mci->csrows[row].nr_channels);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	label = mci->csrows[row].channels[channel].dimm->label;
> -	grain = mci->csrows[row].channels[channel].dimm->grain;
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			break;
> +		index += pos[i];
> +		mci->ce_per_layer[i][index]++;
>  
> -	if (edac_mc_get_log_ce())
> -		/* FIXME - put in DIMM location */
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
> -			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
> -			page_frame_number, offset_in_page,
> -			grain, syndrome, row, channel,
> -			label, msg);
> +		if (i < mci->n_layers - 1)
> +			index *= mci->layers[i + 1].size;
> +	}
> +}
>  
> -	mci->ce_count++;
> -	mci->csrows[row].ce_count++;
> -	mci->csrows[row].channels[channel].dimm->ce_count++;
> -	mci->csrows[row].channels[channel].ce_count++;
> +static void edac_increment_ue_error(struct mem_ctl_info *mci,
> +				    bool enable_filter,
> +				    unsigned pos[EDAC_MAX_LAYERS])
> +{
> +	int i, index = 0;
>  
> -	if (mci->scrub_mode & SCRUB_SW_SRC) {
> -		/*
> -		 * Some MC's can remap memory so that it is still available
> -		 * at a different address when PCI devices map into memory.
> -		 * MC's that can't do this lose the memory where PCI devices
> -		 * are mapped.  This mapping is MC dependent and so we call
> -		 * back into the MC driver for it to map the MC page to
> -		 * a physical (CPU) page which can then be mapped to a virtual
> -		 * page - which can then be scrubbed.
> -		 */
> -		remapped_page = mci->ctl_page_to_phys ?
> -			mci->ctl_page_to_phys(mci, page_frame_number) :
> -			page_frame_number;
> +	mci->ue_mc++;
>  
> -		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
> +	if (!enable_filter) {
> +		mci->ce_noinfo_count++;
> +		return;
>  	}
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
>  
> -void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
> -{
> -	if (edac_mc_get_log_ce())
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE - no information available: %s\n", msg);
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			break;
> +		index += pos[i];
> +		mci->ue_per_layer[i][index]++;
>  
> -	mci->ce_noinfo_count++;
> -	mci->ce_count++;
> +		if (i < mci->n_layers - 1)
> +			index *= mci->layers[i + 1].size;
> +	}
>  }
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
>  
> -void edac_mc_handle_ue(struct mem_ctl_info *mci,
> -		unsigned long page_frame_number,
> -		unsigned long offset_in_page, int row, const char *msg)
> +#define OTHER_LABEL " or "
> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
> +			  struct mem_ctl_info *mci,
> +			  const unsigned long page_frame_number,
> +			  const unsigned long offset_in_page,
> +			  const unsigned long syndrome,
> +			  const int layer0,
> +			  const int layer1,
> +			  const int layer2,
> +			  const char *msg,
> +			  const char *other_detail,
> +			  const void *mcelog)
>  {
> -	int len = EDAC_MC_LABEL_LEN * 4;
> -	char labels[len + 1];
> -	char *pos = labels;
> -	int chan;
> -	int chars;
> -	char *label = NULL;
> +	unsigned long remapped_page;
> +	/* FIXME: too much for stack: move it to some pre-alocated area */
> +	char detail[80], location[80];
> +	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
> +	char *p;
> +	int row = -1, chan = -1;
> +	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
> +	int i;
>  	u32 grain;
> +	bool enable_filter = false;
>  
>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>  
> -	/* FIXME - maybe make panic on INTERNAL ERROR an option */
> -	if (row >= mci->nr_csrows || row < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range "
> -			"(%d >= %d)\n", row, mci->nr_csrows);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	grain = mci->csrows[row].channels[0].dimm->grain;
> -	label = mci->csrows[row].channels[0].dimm->label;
> -	chars = snprintf(pos, len + 1, "%s", label);
> -	len -= chars;
> -	pos += chars;
> -
> -	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
> -		chan++) {
> -		label = mci->csrows[row].channels[chan].dimm->label;
> -		chars = snprintf(pos, len + 1, ":%s", label);
> -		len -= chars;
> -		pos += chars;
> +	/* Check if the event report is consistent */
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] >= (int)mci->layers[i].size) {
> +			if (type == HW_EVENT_ERR_CORRECTED) {
> +				p = "CE";
> +				mci->ce_mc++;
> +			} else {
> +				p = "UE";
> +				mci->ue_mc++;
> +			}
> +			edac_mc_printk(mci, KERN_ERR,
> +				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
> +				       edac_layer_name[mci->layers[i].type],
> +				       pos[i], mci->layers[i].size);
> +			/*
> +			 * Instead of just returning it, let's use what's
> +			 * known about the error. The increment routines and
> +			 * the DIMM filter logic will do the right thing by
> +			 * pointing the likely damaged DIMMs.
> +			 */
> +			pos[i] = -1;
> +		}
> +		if (pos[i] >= 0)
> +			enable_filter = true;
>  	}
>  
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_EMERG,
> -			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
> -			"labels \"%s\": %s\n", page_frame_number,
> -			offset_in_page, grain, row, labels, msg);
> -
> -	if (edac_mc_get_panic_on_ue())
> -		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
> -			"row %d, labels \"%s\": %s\n", mci->mc_idx,
> -			page_frame_number, offset_in_page,
> -			grain, row, labels, msg);
> -
> -	mci->ue_count++;
> -	mci->csrows[row].ue_count++;
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
> +	/*
> +	 * Get the dimm label/grain that applies to the match criteria.
> +	 * As the error algorithm may not be able to point to just one memory,
> +	 * the logic here will get all possible labels that could pottentially
> +	 * be affected by the error.
> +	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
> +	 * to have only the MC channel and the MC dimm (also called as "rank")
> +	 * but the channel is not known, as the memory is arranged in pairs,
> +	 * where each memory belongs to a separate channel within the same
> +	 * branch.
> +	 * It will also get the max grain, over the error match range
> +	 */
> +	grain = 0;
> +	p = label;
> +	*p = '\0';
> +	for (i = 0; i < mci->tot_dimms; i++) {
> +		struct dimm_info *dimm = &mci->dimms[i];
>  
> -void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
> -{
> -	if (edac_mc_get_panic_on_ue())
> -		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
> +		if (layer0 >= 0 && layer0 != dimm->location[0])
> +			continue;
> +		if (layer1 >= 0 && layer1 != dimm->location[1])
> +			continue;
> +		if (layer2 >= 0 && layer2 != dimm->location[2])
> +			continue;
>  
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"UE - no information available: %s\n", msg);
> -	mci->ue_noinfo_count++;
> -	mci->ue_count++;
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
> +		if (dimm->grain > grain)
> +			grain = dimm->grain;
>  
> -/*************************************************************
> - * On Fully Buffered DIMM modules, this help function is
> - * called to process UE events
> - */
> -void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
> -			unsigned int csrow,
> -			unsigned int channela,
> -			unsigned int channelb, char *msg)
> -{
> -	int len = EDAC_MC_LABEL_LEN * 4;
> -	char labels[len + 1];
> -	char *pos = labels;
> -	int chars;
> -	char *label;
> -
> -	if (csrow >= mci->nr_csrows) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range (%d >= %d)\n",
> -			csrow, mci->nr_csrows);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +		/*
> +		 * If the error is memory-controller wide, there's no sense
> +		 * on seeking for the affected DIMMs, as everything may be
> +		 * affected. Also, don't show errors for non-filled dimm's.
> +		 */
> +		if (enable_filter && dimm->nr_pages) {
> +			if (p != label) {
> +				strcpy(p, OTHER_LABEL);
> +				p += strlen(OTHER_LABEL);
> +			}
> +			strcpy(p, dimm->label);
> +			p += strlen(p);
> +			*p = '\0';
> +
> +			/*
> +			 * get csrow/channel of the dimm, in order to allow
> +			 * incrementing the compat API counters
> +			 */
> +			debugf4("%s: dimm csrows (%d,%d)\n",
> +				__func__, dimm->csrow, dimm->cschannel);
> +			if (row == -1)
> +				row = dimm->csrow;
> +			else if (row >= 0 && row != dimm->csrow)
> +				row = -2;
> +			if (chan == -1)
> +				chan = dimm->cschannel;
> +			else if (chan >= 0 && chan != dimm->cschannel)
> +				chan = -2;
> +		}
>  	}
> -
> -	if (channela >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel-a out of range "
> -			"(%d >= %d)\n",
> -			channela, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +	if (!enable_filter) {
> +		strcpy(label, "any memory");
> +	} else {
> +		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
> +			__func__, row, chan);
> +		if (p == label)
> +			strcpy(label, "unknown memory");
> +		if (type == HW_EVENT_ERR_CORRECTED) {
> +			if (row >= 0) {
> +				mci->csrows[row].ce_count++;
> +				if (chan >= 0)
> +					mci->csrows[row].channels[chan].ce_count++;
> +			}
> +		} else
> +			if (row >= 0)
> +				mci->csrows[row].ue_count++;
>  	}
>  
> -	if (channelb >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel-b out of range "
> -			"(%d >= %d)\n",
> -			channelb, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +	/* Fill the RAM location data */
> +	p = location;
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			continue;
> +		p += sprintf(p, "%s %d ",
> +			     edac_layer_name[mci->layers[i].type],
> +			     pos[i]);
>  	}
>  
> -	mci->ue_count++;
> -	mci->csrows[csrow].ue_count++;
> -
> -	/* Generate the DIMM labels from the specified channels */
> -	label = mci->csrows[csrow].channels[channela].dimm->label;
> -	chars = snprintf(pos, len + 1, "%s", label);
> -	len -= chars;
> -	pos += chars;
> -
> -	chars = snprintf(pos, len + 1, "-%s",
> -			mci->csrows[csrow].channels[channelb].dimm->label);
> -
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_EMERG,
> -			"UE row %d, channel-a= %d channel-b= %d "
> -			"labels \"%s\": %s\n", csrow, channela, channelb,
> -			labels, msg);
> -
> -	if (edac_mc_get_panic_on_ue())
> -		panic("UE row %d, channel-a= %d channel-b= %d "
> -			"labels \"%s\": %s\n", csrow, channela,
> -			channelb, labels, msg);
> -}
> -EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
> +	/* Memory type dependent details about the error */
> +	if (type == HW_EVENT_ERR_CORRECTED)
> +		snprintf(detail, sizeof(detail),
> +			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
> +			page_frame_number, offset_in_page,
> +			grain, syndrome);
> +	else
> +		snprintf(detail, sizeof(detail),
> +			"page 0x%lx offset 0x%lx grain %d",
> +			page_frame_number, offset_in_page, grain);
> +
> +	if (type == HW_EVENT_ERR_CORRECTED) {
> +		if (edac_mc_get_log_ce())
> +			edac_mc_printk(mci, KERN_WARNING,
> +				       "CE %s on %s (%s%s %s)\n",
> +				       msg, label, location,
> +				       detail, other_detail);
> +		edac_increment_ce_error(mci, enable_filter, pos);
> +
> +		if (mci->scrub_mode & SCRUB_SW_SRC) {
> +			/*
> +			 * Some MC's can remap memory so that it is still
> +			 * available at a different address when PCI devices
> +			 * map into memory.
> +			 * MC's that can't do this lose the memory where PCI
> +			 * devices are mapped. This mapping is MC dependent
> +			 * and so we call back into the MC driver for it to
> +			 * map the MC page to a physical (CPU) page which can
> +			 * then be mapped to a virtual page - which can then
> +			 * be scrubbed.
> +			 */
> +			remapped_page = mci->ctl_page_to_phys ?
> +				mci->ctl_page_to_phys(mci, page_frame_number) :
> +				page_frame_number;
> +
> +			edac_mc_scrub_block(remapped_page,
> +					    offset_in_page, grain);
> +		}
> +	} else {
> +		if (edac_mc_get_log_ue())
> +			edac_mc_printk(mci, KERN_WARNING,
> +				"UE %s on %s (%s%s %s)\n",
> +				msg, label, location, detail, other_detail);
>  
> -/*************************************************************
> - * On Fully Buffered DIMM modules, this help function is
> - * called to process CE events
> - */
> -void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
> -			unsigned int csrow, unsigned int channel, char *msg)
> -{
> -	char *label = NULL;
> +		if (edac_mc_get_panic_on_ue())
> +			panic("UE %s on %s (%s%s %s)\n",
> +			      msg, label, location, detail, other_detail);
>  
> -	/* Ensure boundary values */
> -	if (csrow >= mci->nr_csrows) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range (%d >= %d)\n",
> -			csrow, mci->nr_csrows);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> +		edac_increment_ue_error(mci, enable_filter, pos);
>  	}
> -	if (channel >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
> -			channel, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	label = mci->csrows[csrow].channels[channel].dimm->label;
> -
> -	if (edac_mc_get_log_ce())
> -		/* FIXME - put in DIMM location */
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE row %d, channel %d, label \"%s\": %s\n",
> -			csrow, channel, label, msg);
> -
> -	mci->ce_count++;
> -	mci->csrows[csrow].ce_count++;
> -	mci->csrows[csrow].channels[channel].dimm->ce_count++;
> -	mci->csrows[csrow].channels[channel].ce_count++;
>  }
> -EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
> +EXPORT_SYMBOL_GPL(edac_mc_handle_error);
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 3b8798d..412d5cd 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -412,18 +412,20 @@ struct edac_mc_layer {
>  /* FIXME: add the proper per-location error counts */
>  struct dimm_info {
>  	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
> -	unsigned memory_controller;
> -	unsigned csrow;
> -	unsigned csrow_channel;
> +
> +	/* Memory location data */
> +	unsigned location[EDAC_MAX_LAYERS];
> +
> +	struct mem_ctl_info *mci;	/* the parent */
>  
>  	u32 grain;		/* granularity of reported error in bytes */
>  	enum dev_type dtype;	/* memory device type */
>  	enum mem_type mtype;	/* memory dimm type */
>  	enum edac_type edac_mode;	/* EDAC mode for this dimm */
>  
> -	u32 nr_pages;			/* number of pages in csrow */
> +	u32 nr_pages;			/* number of pages on this dimm */
>  
> -	u32 ce_count;		/* Correctable Errors for this dimm */
> +	unsigned csrow, cschannel;	/* Points to the old API data */
>  };
>  
>  /**
> @@ -443,9 +445,10 @@ struct dimm_info {
>   */
>  struct rank_info {
>  	int chan_idx;
> -	u32 ce_count;
>  	struct csrow_info *csrow;
>  	struct dimm_info *dimm;
> +
> +	u32 ce_count;		/* Correctable Errors for this csrow */
>  };
>  
>  struct csrow_info {
> @@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
>          ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
>  };
>  
> +struct edac_hierarchy {
> +	char		*name;
> +	unsigned	nr;
> +};
> +
>  /* MEMORY controller information structure
>   */
>  struct mem_ctl_info {
> @@ -541,13 +549,16 @@ struct mem_ctl_info {
>  	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
>  					   unsigned long page);
>  	int mc_idx;
> -	int nr_csrows;
>  	struct csrow_info *csrows;
> +	unsigned nr_csrows, num_cschannel;
>  
> +	/* Memory Controller hierarchy */
> +	unsigned n_layers;
> +	struct edac_mc_layer *layers;
>  	/*
>  	 * DIMM info. Will eventually remove the entire csrows_info some day
>  	 */
> -	unsigned nr_dimms;
> +	unsigned tot_dimms;
>  	struct dimm_info *dimms;
>  
>  	/*
> @@ -562,12 +573,15 @@ struct mem_ctl_info {
>  	const char *dev_name;
>  	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
>  	void *pvt_info;
> -	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
> -	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
> -	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
> -	u32 ce_count;		/* Total Correctable Errors for this MC */
> +	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
> +	u32 ce_count;           /* Total Correctable Errors for this MC */
>  	unsigned long start_time;	/* mci load start time (in jiffies) */
>  
> +	/* drivers shouldn't access this struct directly */
> +	unsigned ce_noinfo_count, ue_noinfo_count;
> +	unsigned ce_mc, ue_mc;
> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
> +
>  	struct completion complete;
>  
>  	/* edac sysfs device control */
> @@ -580,7 +594,7 @@ struct mem_ctl_info {
>  	 * by the low level driver.
>  	 *
>  	 * Set by the low level driver to provide attributes at the
> -	 * controller level, same level as 'ue_count' and 'ce_count' above.
> +	 * controller level.
>  	 * An array of structures, NULL terminated
>  	 *
>  	 * If attributes are desired, then set to array of attributes
> -- 
> 1.7.8
> 
> 

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-27 13:33                                 ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-27 13:33 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Btw,

this patch gives

[    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
[    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
[    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
[    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
[    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
[    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
[    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
[    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
[    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
[    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
[    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
[    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
[    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
[    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
[    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
[    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1

and the memory controller has the following chip selects

[    8.137662] EDAC MC: DCT0 chip selects:
[    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
[    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
[    8.180499] EDAC MC: DCT1 chip selects:
[    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
[    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB

Those are 4 dual-ranked DIMMs on this node; DCT0 is one channel and DCT1
is another, and I have 4 ranks per channel. Having dimm0-dimm15 is very
misleading and has nothing to do with reality. So, if this is to use
your nomenclature with layers, I'd have dimm0-dimm7 where each dimm is
a rank.

Or, the most correct thing to do would be to have dimm0-dimm3, each
dual-ranked.

So either tot_dimms is computed wrongly or there's a more serious error
somewhere.
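
To illustrate where the 16 comes from, here is a minimal userspace sketch
of the tot_dimms loop in new_edac_mc_alloc(). The struct and function names
are made up for the illustration, and the layer sizes (8 chip selects, 2
channels) are an assumption read off the (row, chan) pairs in the debug
output above, not taken from the driver source:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the patch's struct edac_mc_layer. */
struct layer_desc {
	unsigned size;
	bool is_virt_csrow;
};

/* Same arithmetic as the patch: tot_dimms is the product of all layer sizes. */
static unsigned count_tot_dimms(const struct layer_desc *layers, size_t n_layers)
{
	unsigned tot_dimms = 1;
	size_t i;

	for (i = 0; i < n_layers; i++)
		tot_dimms *= layers[i].size;

	return tot_dimms;
}

/* Assumed layout for this node: 8 csrows x 2 channels (DCT0/DCT1). */
static const struct layer_desc amd64_layers[] = {
	{ .size = 8, .is_virt_csrow = true  },	/* chip selects */
	{ .size = 2, .is_virt_csrow = false },	/* channels */
};
```

With those sizes count_tot_dimms(amd64_layers, 2) yields 8 * 2 = 16, i.e.
one "dimm" per (csrow, channel) pair, which would explain dimm0-dimm15
above even though only 4 physical DIMMs are present.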

I've reviewed almost half of the patch and will review the rest when/if we
sort out the above issue.

Thanks.

On Tue, Apr 24, 2012 at 03:15:41PM -0300, Mauro Carvalho Chehab wrote:
> Change the EDAC internal representation to work with non-csrow
> based memory controllers.
> 
> There are lots of those memory controllers nowadays, and more
> are coming. So, the EDAC internal representation needs to be
> changed, in order to work with those memory controllers, while
> preserving backward compatibility with the old ones.
> 
> The edac core were written with the idea that memory controllers

		was

> are able to directly access csrows, and that the channels are
> used inside a csrows select.

This sounds funny; simply remove the second part about the channels.

> This is not true for FB-DIMM and RAMBUS memory controllers.
> 
> Also, some recent advanced memory controllers don't present a per-csrows
> view. Instead, they view memories as DIMM's, instead of ranks, accessed

					DIMMs instead of ranks.

Remove the rest.

> via csrow/channel.
> 
> So, change the allocation and error report routines to allow
> them to work with all types of architectures.
> 
> This will allow the removal of several hacks on FB-DIMM and RAMBUS

					       with

> memory controllers on the next patches.

		    . Remove the rest.

> 
> Also, several tests were done on different platforms using different
> x86 drivers.
> 
> TODO: a multi-rank DIMM's are currently represented by multiple DIMM

	Multi-rank DIMMs

> entries at struct dimm_info. That means that changing a label for one

	  in

> rank won't change the same label for the other ranks at the same dimm.

						       of the same DIMM.

> Such bug is there since the beginning of the EDAC, so it is not a big

  This bug is present ..

> deal. However, on several drivers, it is possible to fix this issue, but

		remove "on"

> it should be a per-driver fix, as the csrow => DIMM arrangement may not
> be equal for all. So, don't try to fix it here yet.
> 
> PS.: I tried to make this patch as short as possible, preceding it with

Remove "PS."

> several other patches that simplified the logic here. Yet, as the
> internal API changes, all drivers need changes. The changes are
> generally bigger on the drivers for FB-DIMM's.

		   in 		   for FB-DIMMs.

> 
> FIXME: while the FB-DIMMs are not converted to use the new
> design, uncorrected errors will show just one channel. In
> the past, all changes were on a big patch with about 150K.
> As it needed to be split, in order to be accepted by the
> EDAC ML at vger, we've opted to have this small drawback.
> As an advantage, it is now easier to review the patch series.

This whole paragraph above doesn't have anything to do with what the
patch does, so it can go.

[..]

> ---
> 
> v16: Only context changes
> 
>  drivers/edac/edac_core.h |   92 ++++++-
>  drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
>  include/linux/edac.h     |   40 ++-
>  3 files changed, 526 insertions(+), 288 deletions(-)
> 
> diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
> index e48ab31..7201bb1 100644
> --- a/drivers/edac/edac_core.h
> +++ b/drivers/edac/edac_core.h
> @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
>  
>  #endif				/* CONFIG_PCI */
>  
> -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> -					  unsigned nr_chans, int edac_index);
> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> +				   unsigned nr_chans, int edac_index);

Why not "extern"?

> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +				   unsigned n_layers,
> +				   struct edac_mc_layer *layers,
> +				   bool rev_order,
> +				   unsigned sz_pvt);

ditto.

>  extern int edac_mc_add_mc(struct mem_ctl_info *mci);
>  extern void edac_mc_free(struct mem_ctl_info *mci);
>  extern struct mem_ctl_info *edac_mc_find(int idx);
> @@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
>   * reporting logic and function interface - reduces conditional
>   * statement clutter and extra function arguments.
>   */
> -extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
> +
> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
> +			  struct mem_ctl_info *mci,
> +			  const unsigned long page_frame_number,
> +			  const unsigned long offset_in_page,
> +			  const unsigned long syndrome,
> +			  const int layer0,
> +			  const int layer1,
> +			  const int layer2,
> +			  const char *msg,
> +			  const char *other_detail,
> +			  const void *mcelog);

Why isn't this one "extern" either?

> +
> +static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
>  			      unsigned long page_frame_number,
>  			      unsigned long offset_in_page,
>  			      unsigned long syndrome, int row, int channel,
> -			      const char *msg);

Strange alignment, pls do

static inline void edac_mc_handle_ce(struct...,
				     unsigned...,
				     ...,
				     ...);


> -extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
> -				      const char *msg);
> -extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
> +			      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      page_frame_number, offset_in_page, syndrome,
> +		              row, channel, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
> +				      const char *msg)

ditto.

> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
>  			      unsigned long page_frame_number,
>  			      unsigned long offset_in_page, int row,
> -			      const char *msg);

ditto.

> -extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
> -				      const char *msg);
> -extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
> -				  unsigned int channel0, unsigned int channel1,
> -				  char *msg);
> -extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
> -				  unsigned int channel, char *msg);
> +			      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      page_frame_number, offset_in_page, 0,
> +		              row, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
> +				      const char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
> +					 unsigned int csrow,
> +					 unsigned int channel0,
> +					 unsigned int channel1,
> +					 char *msg)

Now this alignment looks correct.

> +{
> +	/*
> +	 *FIXME: The error can also be at channel1 (e. g. at the second
> +	 *	  channel of the same branch). The fix is to push
> +	 *	  edac_mc_handle_error() call into each driver
> +	 */
> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
> +			      0, 0, 0,
> +		              csrow, channel0, -1, msg, NULL, NULL);
> +}
> +
> +static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
> +					 unsigned int csrow,
> +					 unsigned int channel, char *msg)
> +{
> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +			      0, 0, 0,
> +		              csrow, channel, -1, msg, NULL, NULL);
> +}
> +
> +

Two superfluous newlines.

>  
>  /*
>   * edac_device APIs
> @@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
>  extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
>  				int inst_nr, int block_nr, const char *msg);
>  extern int edac_device_alloc_index(void);
> +extern const char *edac_layer_name[];
>  
>  /*
>   * edac_pci APIs
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index 6ec967a..4d4d8b7 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  	debugf4("\tchannel = %p\n", chan);
>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
> -	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
> -	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
> -	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
> +	debugf4("\tchannel->dimm = %p\n", chan->dimm);
> +}
> +
> +static void edac_mc_dump_dimm(struct dimm_info *dimm)
> +{
> +	int i;
> +
> +	debugf4("\tdimm = %p\n", dimm);
> +	debugf4("\tdimm->label = '%s'\n", dimm->label);
> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
> +	debugf4("\tdimm location ");
> +	for (i = 0; i < dimm->mci->n_layers; i++) {
> +		printk(KERN_CONT "%d", dimm->location[i]);
> +		if (i < dimm->mci->n_layers - 1)
> +			printk(KERN_CONT ".");
> +	}
> +	printk(KERN_CONT "\n");

This looks hacky, but I don't have a good suggestion for what to do
instead. Maybe snprintf the location into a complete string which you can
then issue with a single debugf4()...
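
Roughly what I mean, as an untested userspace sketch. format_location()
and MAX_LAYERS are names invented for this example (MAX_LAYERS stands in
for EDAC_MAX_LAYERS, assumed to be 3 here), and the unsigned element type
is an assumption about dimm->location[]:

```c
#include <stdio.h>
#include <stddef.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/*
 * Build a dotted location string such as "1.0.2" in buf.
 * Returns the length of the string left in buf. If an element does not
 * fit, it is dropped and the string ends at the last complete element.
 */
static int format_location(char *buf, size_t len,
			   const unsigned *location, unsigned n_layers)
{
	size_t used = 0;
	unsigned i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n_layers && i < MAX_LAYERS; i++) {
		int n = snprintf(buf + used, len - used, "%s%u",
				 i ? "." : "", location[i]);

		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';	/* discard the partial element */
			break;
		}
		used += (size_t)n;
	}

	return (int)used;
}
```

The caller would then emit the whole line in one go, e.g. with a "%s"
in the debugf4() format string. In the kernel, scnprintf() would make the
length bookkeeping simpler than plain snprintf().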

> +	debugf4("\tdimm->grain = %d\n", dimm->grain);
> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>  }
>  
>  static void edac_mc_dump_csrow(struct csrow_info *csrow)
> @@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
>  	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
>  	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
>  		mci->nr_csrows, mci->csrows);
> +	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",

		      ->tot_dimms      dimms

> +		mci->tot_dimms, mci->dimms);
>  	debugf3("\tdev = %p\n", mci->dev);
>  	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
>  	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
> @@ -157,10 +175,25 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>  }
>  
>  /**
> - * edac_mc_alloc: Allocate a struct mem_ctl_info structure
> - * @size_pvt:	size of private storage needed
> - * @nr_csrows:	Number of CWROWS needed for this MC
> - * @nr_chans:	Number of channels for the MC
> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure

					    fill

> + * @edac_index:		Memory controller number
> + * @n_layers:		Number of layers at the MC hierarchy

				Number of MC hierarchy layers

> + * layers:		Describes each layer as seen by the Memory Controller
> + * @rev_order:		Fills csrows/cs channels at the reverse order

				      csrows/channels in reverse order

> + * @size_pvt:		size of private storage needed
> + *
> + *
> + * FIXME: drivers handle multi-rank memories on different ways: on some

						in		   in

> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,

			      memory stick			   in

> + * a single multi-rank DIMM would be mapped into several "dimms".

			  memory stick

> + *
> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong

				   csrow-based

> + * thing, as two chip select values are used for dual-rank memories (and 4, for
> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
> + *
> + * In summary, solving this issue is not easy, as it requires a lot of testing.
>   *
>   * Everything is kmalloc'ed as one big chunk - more efficient.
>   * Only can be used if all structures have the same lifetime - otherwise
> @@ -172,18 +205,41 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>   *	NULL allocation failed
>   *	struct mem_ctl_info pointer
>   */
> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> -				unsigned nr_chans, int edac_index)
> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +				   unsigned n_layers,
> +				   struct edac_mc_layer *layers,
> +				   bool rev_order,
> +				   unsigned sz_pvt)

strange function argument vertical alignment

>  {
>  	void *ptr = NULL;
>  	struct mem_ctl_info *mci;
> -	struct csrow_info *csi, *csrow;
> +	struct edac_mc_layer *lay;

As before, call this "layers" pls.

> +	struct csrow_info *csi, *csr;
>  	struct rank_info *chi, *chp, *chan;
>  	struct dimm_info *dimm;
> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>  	void *pvt;
> -	unsigned size;
> -	int row, chn;
> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> +	unsigned tot_csrows, tot_cschannels;

No need to call this "tot_cschannels" - "tot_channels" should be enough.

> +	int i, j;
>  	int err;
> +	int row, chn;

All those local variables should be sorted in reverse Christmas tree
order:

	u32 this_is_the_longest_array_name[LENGTH];
	void *shorter_named_variable;
	unsigned long size;
	int i;

	...

> +
> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);


Push this BUG_ON up into edac_mc_alloc as the first thing this function
does. Also, is it valid to have n_layers == 0? The memcpy call below
will do nothing.
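
If zero layers turns out to be invalid, the check could cover both ends of
the range. A minimal sketch, with a made-up helper name and MAX_LAYERS
standing in for EDAC_MAX_LAYERS (assumed to be 3, judging by the
layer0..layer2 arguments of edac_mc_handle_error()):

```c
#include <stdbool.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/* True only if there is at least one layer and no more than the maximum. */
static bool n_layers_valid(unsigned n_layers)
{
	return n_layers >= 1 && n_layers <= MAX_LAYERS;
}
```

Note that with n_layers == 0 none of the sizing loops run, so tot_dimms,
tot_csrows and tot_cschannels all stay at their initial value of 1 and
the allocation proceeds with an empty layer description.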


> +	/*
> +	 * Calculate the total amount of dimms and csrows/cschannels while
> +	 * in the old API emulation mode
> +	 */
> +	tot_dimms = 1;
> +	tot_cschannels = 1;
> +	tot_csrows = 1;

Those initializations can be done above at variable declaration time.

> +	for (i = 0; i < n_layers; i++) {
> +		tot_dimms *= layers[i].size;
> +		if (layers[i].is_virt_csrow)
> +			tot_csrows *= layers[i].size;
> +		else
> +			tot_cschannels *= layers[i].size;
> +	}
>  
>  	/* Figure out the offsets of the various items from the start of an mc
>  	 * structure.  We want the alignment of each item to be at least as
> @@ -191,12 +247,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 * hardcode everything into a single struct.
>  	 */
>  	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
> -	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
> -	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
> -	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
> +	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
> +	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
> +	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
> +	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
> +	count = 1;

ditto.

> +	for (i = 0; i < n_layers; i++) {
> +		count *= layers[i].size;
> +		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
> +		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
> +	}
>  	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
>  	size = ((unsigned long)pvt) + sz_pvt;
>  
> +	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
> +		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
>  	mci = kzalloc(size, GFP_KERNEL);
>  	if (mci == NULL)
>  		return NULL;
> @@ -204,42 +269,99 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	/* Adjust pointers so they point within the memory we just allocated
>  	 * rather than an imaginary chunk of memory located at address 0.
>  	 */
> +	lay = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)lay));
>  	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
>  	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
>  	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
> +	for (i = 0; i < n_layers; i++) {
> +		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
> +		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
> +	}
>  	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
>  
>  	/* setup index and various internal pointers */
>  	mci->mc_idx = edac_index;
>  	mci->csrows = csi;
>  	mci->dimms  = dimm;
> +	mci->tot_dimms = tot_dimms;
>  	mci->pvt_info = pvt;
> -	mci->nr_csrows = nr_csrows;
> +	mci->n_layers = n_layers;
> +	mci->layers = lay;
> +	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
> +	mci->nr_csrows = tot_csrows;
> +	mci->num_cschannel = tot_cschannels;
>  
>  	/*
> -	 * For now, assumes that a per-csrow arrangement for dimms.
> -	 * This will be latter changed.
> +	 * Fills the csrow struct
>  	 */
> -	dimm = mci->dimms;
> -
> -	for (row = 0; row < nr_csrows; row++) {
> -		csrow = &csi[row];
> -		csrow->csrow_idx = row;
> -		csrow->mci = mci;
> -		csrow->nr_channels = nr_chans;
> -		chp = &chi[row * nr_chans];
> -		csrow->channels = chp;
> -
> -		for (chn = 0; chn < nr_chans; chn++) {
> +	for (row = 0; row < tot_csrows; row++) {
> +		csr = &csi[row];
> +		csr->csrow_idx = row;
> +		csr->mci = mci;
> +		csr->nr_channels = tot_cschannels;
> +		chp = &chi[row * tot_cschannels];
> +		csr->channels = chp;
> +
> +		for (chn = 0; chn < tot_cschannels; chn++) {
>  			chan = &chp[chn];
>  			chan->chan_idx = chn;
> -			chan->csrow = csrow;
> +			chan->csrow = csr;
> +		}
> +	}
>  
> -			mci->csrows[row].channels[chn].dimm = dimm;
> -			dimm->csrow = row;
> -			dimm->csrow_channel = chn;
> -			dimm++;
> -			mci->nr_dimms++;
> +	/*
> +	 * Fills the dimm struct
> +	 */
> +	memset(&pos, 0, sizeof(pos));
> +	row = 0;
> +	chn = 0;
> +	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
> +	for (i = 0; i < tot_dimms; i++) {
> +		chan = &csi[row].channels[chn];
> +		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
> +			       pos[0], pos[1], pos[2]);
> +		dimm->mci = mci;
> +
> +		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
> +			i, (dimm - mci->dimms),
> +			pos[0], pos[1], pos[2], row, chn);
> +
> +		/* Copy DIMM location */
> +		for (j = 0; j < n_layers; j++)
> +			dimm->location[j] = pos[j];
> +
> +		/* Link it to the csrows old API data */
> +		chan->dimm = dimm;
> +		dimm->csrow = row;
> +		dimm->cschannel = chn;
> +
> +		/* Increment csrow location */
> +		if (!rev_order) {
> +			for (j = n_layers - 1; j >= 0; j--)
> +				if (!layers[j].is_virt_csrow)
> +					break;
> +			chn++;
> +			if (chn == tot_cschannels) {
> +				chn = 0;
> +				row++;
> +			}
> +		} else {
> +			for (j = n_layers - 1; j >= 0; j--)
> +				if (layers[j].is_virt_csrow)
> +					break;
> +			row++;
> +			if (row == tot_csrows) {
> +				row = 0;
> +				chn++;
> +			}
> +		}
> +
> +		/* Increment dimm location */
> +		for (j = n_layers - 1; j >= 0; j--) {
> +			pos[j]++;
> +			if (pos[j] < layers[j].size)
> +				break;
> +			pos[j] = 0;
>  		}
>  	}
>  
> @@ -263,6 +385,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>  	 */
>  	return mci;
>  }
> +EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
> +
> +/**
> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
> + * @edac_index:		Memory controller number
> + * @n_layers:		Nu
> +mber of layers at the MC hierarchy
> + * layers:		Describes each layer as seen by the Memory Controller
> + * @rev_order:		Fills csrows/cs channels at the reverse order
> + * @size_pvt:		size of private storage needed
> + *
> + *
> + * FIXME: drivers handle multi-rank memories on different ways: on some
> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
> + * a single multi-rank DIMM would be mapped into several "dimms".
> + *
> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
> + * thing, as two chip select values are used for dual-rank memories (and 4, for
> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
> + *
> + * In summary, solving this issue is not easy, as it requires a lot of testing.
> + *
> + * Everything is kmalloc'ed as one big chunk - more efficient.
> + * Only can be used if all structures have the same lifetime - otherwise
> + * you have to allocate and initialize your own structures.
> + *
> + * Use edac_mc_free() to free mc structures allocated by this function.
> + *
> + * Returns:
> + *	NULL allocation failed
> + *	struct mem_ctl_info pointer
> + */
> +
> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> +				   unsigned nr_chans, int edac_index)
> +{
> +	unsigned n_layers = 2;
> +	struct edac_mc_layer layers[n_layers];
> +
> +	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
> +	layers[0].size = nr_csrows;
> +	layers[0].is_virt_csrow = true;
> +	layers[1].type = EDAC_MC_LAYER_CHANNEL;
> +	layers[1].size = nr_chans;
> +	layers[1].is_virt_csrow = false;
> +
> +	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
> +			  false, sz_pvt);
> +}
>  EXPORT_SYMBOL_GPL(edac_mc_alloc);
>  
>  /**
> @@ -528,7 +701,6 @@ EXPORT_SYMBOL(edac_mc_find);
>   * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
>   *                 create sysfs entries associated with mci structure
>   * @mci: pointer to the mci structure to be added to the list
> - * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
>   *
>   * Return:
>   *	0	Success
> @@ -555,6 +727,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
>  				edac_mc_dump_channel(&mci->csrows[i].
>  						channels[j]);
>  		}
> +		for (i = 0; i < mci->tot_dimms; i++)
> +			edac_mc_dump_dimm(&mci->dimms[i]);
>  	}
>  #endif
>  	mutex_lock(&mem_ctls_mutex);
> @@ -712,261 +886,249 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
>  }
>  EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
>  
> -/* FIXME - setable log (warning/emerg) levels */
> -/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
> -void edac_mc_handle_ce(struct mem_ctl_info *mci,
> -		unsigned long page_frame_number,
> -		unsigned long offset_in_page, unsigned long syndrome,
> -		int row, int channel, const char *msg)
> +const char *edac_layer_name[] = {
> +	[EDAC_MC_LAYER_BRANCH] = "branch",
> +	[EDAC_MC_LAYER_CHANNEL] = "channel",
> +	[EDAC_MC_LAYER_SLOT] = "slot",
> +	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
> +};
> +EXPORT_SYMBOL_GPL(edac_layer_name);
> +
> +static void edac_increment_ce_error(struct mem_ctl_info *mci,
> +				    bool enable_filter,
> +				    unsigned pos[EDAC_MAX_LAYERS])
>  {
> -	unsigned long remapped_page;
> -	char *label = NULL;
> -	u32 grain;
> +	int i, index = 0;
>  
> -	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
> +	mci->ce_mc++;
>  
> -	/* FIXME - maybe make panic on INTERNAL ERROR an option */
> -	if (row >= mci->nr_csrows || row < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range "
> -			"(%d >= %d)\n", row, mci->nr_csrows);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> +	if (!enable_filter) {
> +		mci->ce_noinfo_count++;
>  		return;
>  	}
>  
> -	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel out of range "
> -			"(%d >= %d)\n", channel,
> -			mci->csrows[row].nr_channels);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	label = mci->csrows[row].channels[channel].dimm->label;
> -	grain = mci->csrows[row].channels[channel].dimm->grain;
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			break;
> +		index += pos[i];
> +		mci->ce_per_layer[i][index]++;
>  
> -	if (edac_mc_get_log_ce())
> -		/* FIXME - put in DIMM location */
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
> -			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
> -			page_frame_number, offset_in_page,
> -			grain, syndrome, row, channel,
> -			label, msg);
> +		if (i < mci->n_layers - 1)
> +			index *= mci->layers[i + 1].size;
> +	}
> +}
>  
> -	mci->ce_count++;
> -	mci->csrows[row].ce_count++;
> -	mci->csrows[row].channels[channel].dimm->ce_count++;
> -	mci->csrows[row].channels[channel].ce_count++;
> +static void edac_increment_ue_error(struct mem_ctl_info *mci,
> +				    bool enable_filter,
> +				    unsigned pos[EDAC_MAX_LAYERS])
> +{
> +	int i, index = 0;
>  
> -	if (mci->scrub_mode & SCRUB_SW_SRC) {
> -		/*
> -		 * Some MC's can remap memory so that it is still available
> -		 * at a different address when PCI devices map into memory.
> -		 * MC's that can't do this lose the memory where PCI devices
> -		 * are mapped.  This mapping is MC dependent and so we call
> -		 * back into the MC driver for it to map the MC page to
> -		 * a physical (CPU) page which can then be mapped to a virtual
> -		 * page - which can then be scrubbed.
> -		 */
> -		remapped_page = mci->ctl_page_to_phys ?
> -			mci->ctl_page_to_phys(mci, page_frame_number) :
> -			page_frame_number;
> +	mci->ue_mc++;
>  
> -		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
> +	if (!enable_filter) {
> +		mci->ce_noinfo_count++;
> +		return;
>  	}
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
>  
> -void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
> -{
> -	if (edac_mc_get_log_ce())
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE - no information available: %s\n", msg);
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			break;
> +		index += pos[i];
> +		mci->ue_per_layer[i][index]++;
>  
> -	mci->ce_noinfo_count++;
> -	mci->ce_count++;
> +		if (i < mci->n_layers - 1)
> +			index *= mci->layers[i + 1].size;
> +	}
>  }
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
>  
> -void edac_mc_handle_ue(struct mem_ctl_info *mci,
> -		unsigned long page_frame_number,
> -		unsigned long offset_in_page, int row, const char *msg)
> +#define OTHER_LABEL " or "
> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
> +			  struct mem_ctl_info *mci,
> +			  const unsigned long page_frame_number,
> +			  const unsigned long offset_in_page,
> +			  const unsigned long syndrome,
> +			  const int layer0,
> +			  const int layer1,
> +			  const int layer2,
> +			  const char *msg,
> +			  const char *other_detail,
> +			  const void *mcelog)
>  {
> -	int len = EDAC_MC_LABEL_LEN * 4;
> -	char labels[len + 1];
> -	char *pos = labels;
> -	int chan;
> -	int chars;
> -	char *label = NULL;
> +	unsigned long remapped_page;
> +	/* FIXME: too much for stack: move it to some pre-alocated area */
> +	char detail[80], location[80];
> +	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
> +	char *p;
> +	int row = -1, chan = -1;
> +	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
> +	int i;
>  	u32 grain;
> +	bool enable_filter = false;
>  
>  	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
>  
> -	/* FIXME - maybe make panic on INTERNAL ERROR an option */
> -	if (row >= mci->nr_csrows || row < 0) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range "
> -			"(%d >= %d)\n", row, mci->nr_csrows);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	grain = mci->csrows[row].channels[0].dimm->grain;
> -	label = mci->csrows[row].channels[0].dimm->label;
> -	chars = snprintf(pos, len + 1, "%s", label);
> -	len -= chars;
> -	pos += chars;
> -
> -	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
> -		chan++) {
> -		label = mci->csrows[row].channels[chan].dimm->label;
> -		chars = snprintf(pos, len + 1, ":%s", label);
> -		len -= chars;
> -		pos += chars;
> +	/* Check if the event report is consistent */
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] >= (int)mci->layers[i].size) {
> +			if (type == HW_EVENT_ERR_CORRECTED) {
> +				p = "CE";
> +				mci->ce_mc++;
> +			} else {
> +				p = "UE";
> +				mci->ue_mc++;
> +			}
> +			edac_mc_printk(mci, KERN_ERR,
> +				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
> +				       edac_layer_name[mci->layers[i].type],
> +				       pos[i], mci->layers[i].size);
> +			/*
> +			 * Instead of just returning it, let's use what's
> +			 * known about the error. The increment routines and
> +			 * the DIMM filter logic will do the right thing by
> +			 * pointing the likely damaged DIMMs.
> +			 */
> +			pos[i] = -1;
> +		}
> +		if (pos[i] >= 0)
> +			enable_filter = true;
>  	}
>  
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_EMERG,
> -			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
> -			"labels \"%s\": %s\n", page_frame_number,
> -			offset_in_page, grain, row, labels, msg);
> -
> -	if (edac_mc_get_panic_on_ue())
> -		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
> -			"row %d, labels \"%s\": %s\n", mci->mc_idx,
> -			page_frame_number, offset_in_page,
> -			grain, row, labels, msg);
> -
> -	mci->ue_count++;
> -	mci->csrows[row].ue_count++;
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
> +	/*
> +	 * Get the dimm label/grain that applies to the match criteria.
> +	 * As the error algorithm may not be able to point to just one memory,
> +	 * the logic here will get all possible labels that could potentially
> +	 * be affected by the error.
> +	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
> +	 * to have only the MC channel and the MC dimm (also called as "rank")
> +	 * but the channel is not known, as the memory is arranged in pairs,
> +	 * where each memory belongs to a separate channel within the same
> +	 * branch.
> +	 * It will also get the max grain, over the error match range
> +	 */
> +	grain = 0;
> +	p = label;
> +	*p = '\0';
> +	for (i = 0; i < mci->tot_dimms; i++) {
> +		struct dimm_info *dimm = &mci->dimms[i];
>  
> -void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
> -{
> -	if (edac_mc_get_panic_on_ue())
> -		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
> +		if (layer0 >= 0 && layer0 != dimm->location[0])
> +			continue;
> +		if (layer1 >= 0 && layer1 != dimm->location[1])
> +			continue;
> +		if (layer2 >= 0 && layer2 != dimm->location[2])
> +			continue;
>  
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"UE - no information available: %s\n", msg);
> -	mci->ue_noinfo_count++;
> -	mci->ue_count++;
> -}
> -EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
> +		if (dimm->grain > grain)
> +			grain = dimm->grain;
>  
> -/*************************************************************
> - * On Fully Buffered DIMM modules, this help function is
> - * called to process UE events
> - */
> -void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
> -			unsigned int csrow,
> -			unsigned int channela,
> -			unsigned int channelb, char *msg)
> -{
> -	int len = EDAC_MC_LABEL_LEN * 4;
> -	char labels[len + 1];
> -	char *pos = labels;
> -	int chars;
> -	char *label;
> -
> -	if (csrow >= mci->nr_csrows) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range (%d >= %d)\n",
> -			csrow, mci->nr_csrows);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +		/*
> +		 * If the error is memory-controller wide, there's no sense
> +		 * on seeking for the affected DIMMs, as everything may be
> +		 * affected. Also, don't show errors for non-filled dimm's.
> +		 */
> +		if (enable_filter && dimm->nr_pages) {
> +			if (p != label) {
> +				strcpy(p, OTHER_LABEL);
> +				p += strlen(OTHER_LABEL);
> +			}
> +			strcpy(p, dimm->label);
> +			p += strlen(p);
> +			*p = '\0';
> +
> +			/*
> +			 * get csrow/channel of the dimm, in order to allow
> +			 * incrementing the compat API counters
> +			 */
> +			debugf4("%s: dimm csrows (%d,%d)\n",
> +				__func__, dimm->csrow, dimm->cschannel);
> +			if (row == -1)
> +				row = dimm->csrow;
> +			else if (row >= 0 && row != dimm->csrow)
> +				row = -2;
> +			if (chan == -1)
> +				chan = dimm->cschannel;
> +			else if (chan >= 0 && chan != dimm->cschannel)
> +				chan = -2;
> +		}
>  	}
> -
> -	if (channela >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel-a out of range "
> -			"(%d >= %d)\n",
> -			channela, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +	if (!enable_filter) {
> +		strcpy(label, "any memory");
> +	} else {
> +		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
> +			__func__, row, chan);
> +		if (p == label)
> +			strcpy(label, "unknown memory");
> +		if (type == HW_EVENT_ERR_CORRECTED) {
> +			if (row >= 0) {
> +				mci->csrows[row].ce_count++;
> +				if (chan >= 0)
> +					mci->csrows[row].channels[chan].ce_count++;
> +			}
> +		} else
> +			if (row >= 0)
> +				mci->csrows[row].ue_count++;
>  	}
>  
> -	if (channelb >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel-b out of range "
> -			"(%d >= %d)\n",
> -			channelb, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
> -		return;
> +	/* Fill the RAM location data */
> +	p = location;
> +	for (i = 0; i < mci->n_layers; i++) {
> +		if (pos[i] < 0)
> +			continue;
> +		p += sprintf(p, "%s %d ",
> +			     edac_layer_name[mci->layers[i].type],
> +			     pos[i]);
>  	}
>  
> -	mci->ue_count++;
> -	mci->csrows[csrow].ue_count++;
> -
> -	/* Generate the DIMM labels from the specified channels */
> -	label = mci->csrows[csrow].channels[channela].dimm->label;
> -	chars = snprintf(pos, len + 1, "%s", label);
> -	len -= chars;
> -	pos += chars;
> -
> -	chars = snprintf(pos, len + 1, "-%s",
> -			mci->csrows[csrow].channels[channelb].dimm->label);
> -
> -	if (edac_mc_get_log_ue())
> -		edac_mc_printk(mci, KERN_EMERG,
> -			"UE row %d, channel-a= %d channel-b= %d "
> -			"labels \"%s\": %s\n", csrow, channela, channelb,
> -			labels, msg);
> -
> -	if (edac_mc_get_panic_on_ue())
> -		panic("UE row %d, channel-a= %d channel-b= %d "
> -			"labels \"%s\": %s\n", csrow, channela,
> -			channelb, labels, msg);
> -}
> -EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
> +	/* Memory type dependent details about the error */
> +	if (type == HW_EVENT_ERR_CORRECTED)
> +		snprintf(detail, sizeof(detail),
> +			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
> +			page_frame_number, offset_in_page,
> +			grain, syndrome);
> +	else
> +		snprintf(detail, sizeof(detail),
> +			"page 0x%lx offset 0x%lx grain %d",
> +			page_frame_number, offset_in_page, grain);
> +
> +	if (type == HW_EVENT_ERR_CORRECTED) {
> +		if (edac_mc_get_log_ce())
> +			edac_mc_printk(mci, KERN_WARNING,
> +				       "CE %s on %s (%s%s %s)\n",
> +				       msg, label, location,
> +				       detail, other_detail);
> +		edac_increment_ce_error(mci, enable_filter, pos);
> +
> +		if (mci->scrub_mode & SCRUB_SW_SRC) {
> +			/*
> +			 * Some MC's can remap memory so that it is still
> +			 * available at a different address when PCI devices
> +			 * map into memory.
> +			 * MC's that can't do this lose the memory where PCI
> +			 * devices are mapped. This mapping is MC dependent
> +			 * and so we call back into the MC driver for it to
> +			 * map the MC page to a physical (CPU) page which can
> +			 * then be mapped to a virtual page - which can then
> +			 * be scrubbed.
> +			 */
> +			remapped_page = mci->ctl_page_to_phys ?
> +				mci->ctl_page_to_phys(mci, page_frame_number) :
> +				page_frame_number;
> +
> +			edac_mc_scrub_block(remapped_page,
> +					    offset_in_page, grain);
> +		}
> +	} else {
> +		if (edac_mc_get_log_ue())
> +			edac_mc_printk(mci, KERN_WARNING,
> +				"UE %s on %s (%s%s %s)\n",
> +				msg, label, location, detail, other_detail);
>  
> -/*************************************************************
> - * On Fully Buffered DIMM modules, this help function is
> - * called to process CE events
> - */
> -void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
> -			unsigned int csrow, unsigned int channel, char *msg)
> -{
> -	char *label = NULL;
> +		if (edac_mc_get_panic_on_ue())
> +			panic("UE %s on %s (%s%s %s)\n",
> +			      msg, label, location, detail, other_detail);
>  
> -	/* Ensure boundary values */
> -	if (csrow >= mci->nr_csrows) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: row out of range (%d >= %d)\n",
> -			csrow, mci->nr_csrows);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> +		edac_increment_ue_error(mci, enable_filter, pos);
>  	}
> -	if (channel >= mci->csrows[csrow].nr_channels) {
> -		/* something is wrong */
> -		edac_mc_printk(mci, KERN_ERR,
> -			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
> -			channel, mci->csrows[csrow].nr_channels);
> -		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
> -		return;
> -	}
> -
> -	label = mci->csrows[csrow].channels[channel].dimm->label;
> -
> -	if (edac_mc_get_log_ce())
> -		/* FIXME - put in DIMM location */
> -		edac_mc_printk(mci, KERN_WARNING,
> -			"CE row %d, channel %d, label \"%s\": %s\n",
> -			csrow, channel, label, msg);
> -
> -	mci->ce_count++;
> -	mci->csrows[csrow].ce_count++;
> -	mci->csrows[csrow].channels[channel].dimm->ce_count++;
> -	mci->csrows[csrow].channels[channel].ce_count++;
>  }
> -EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
> +EXPORT_SYMBOL_GPL(edac_mc_handle_error);
> diff --git a/include/linux/edac.h b/include/linux/edac.h
> index 3b8798d..412d5cd 100644
> --- a/include/linux/edac.h
> +++ b/include/linux/edac.h
> @@ -412,18 +412,20 @@ struct edac_mc_layer {
>  /* FIXME: add the proper per-location error counts */
>  struct dimm_info {
>  	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
> -	unsigned memory_controller;
> -	unsigned csrow;
> -	unsigned csrow_channel;
> +
> +	/* Memory location data */
> +	unsigned location[EDAC_MAX_LAYERS];
> +
> +	struct mem_ctl_info *mci;	/* the parent */
>  
>  	u32 grain;		/* granularity of reported error in bytes */
>  	enum dev_type dtype;	/* memory device type */
>  	enum mem_type mtype;	/* memory dimm type */
>  	enum edac_type edac_mode;	/* EDAC mode for this dimm */
>  
> -	u32 nr_pages;			/* number of pages in csrow */
> +	u32 nr_pages;			/* number of pages on this dimm */
>  
> -	u32 ce_count;		/* Correctable Errors for this dimm */
> +	unsigned csrow, cschannel;	/* Points to the old API data */
>  };
>  
>  /**
> @@ -443,9 +445,10 @@ struct dimm_info {
>   */
>  struct rank_info {
>  	int chan_idx;
> -	u32 ce_count;
>  	struct csrow_info *csrow;
>  	struct dimm_info *dimm;
> +
> +	u32 ce_count;		/* Correctable Errors for this csrow */
>  };
>  
>  struct csrow_info {
> @@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
>          ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
>  };
>  
> +struct edac_hierarchy {
> +	char		*name;
> +	unsigned	nr;
> +};
> +
>  /* MEMORY controller information structure
>   */
>  struct mem_ctl_info {
> @@ -541,13 +549,16 @@ struct mem_ctl_info {
>  	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
>  					   unsigned long page);
>  	int mc_idx;
> -	int nr_csrows;
>  	struct csrow_info *csrows;
> +	unsigned nr_csrows, num_cschannel;
>  
> +	/* Memory Controller hierarchy */
> +	unsigned n_layers;
> +	struct edac_mc_layer *layers;
>  	/*
>  	 * DIMM info. Will eventually remove the entire csrows_info some day
>  	 */
> -	unsigned nr_dimms;
> +	unsigned tot_dimms;
>  	struct dimm_info *dimms;
>  
>  	/*
> @@ -562,12 +573,15 @@ struct mem_ctl_info {
>  	const char *dev_name;
>  	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
>  	void *pvt_info;
> -	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
> -	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
> -	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
> -	u32 ce_count;		/* Total Correctable Errors for this MC */
> +	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
> +	u32 ce_count;           /* Total Correctable Errors for this MC */
>  	unsigned long start_time;	/* mci load start time (in jiffies) */
>  
> +	/* drivers shouldn't access this struct directly */
> +	unsigned ce_noinfo_count, ue_noinfo_count;
> +	unsigned ce_mc, ue_mc;
> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
> +
>  	struct completion complete;
>  
>  	/* edac sysfs device control */
> @@ -580,7 +594,7 @@ struct mem_ctl_info {
>  	 * by the low level driver.
>  	 *
>  	 * Set by the low level driver to provide attributes at the
> -	 * controller level, same level as 'ue_count' and 'ce_count' above.
> +	 * controller level.
>  	 * An array of structures, NULL terminated
>  	 *
>  	 * If attributes are desired, then set to array of attributes
> -- 
> 1.7.8
> 
> 
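
For readers following the quoted edac_mc_handle_error() hunk, the DIMM
selection step can be sketched in stand-alone C. This is a simplified
illustration under stated assumptions, not the kernel code: struct
demo_dimm, collect_labels() and the " or " separator are hypothetical
names and values. A negative layer position acts as a wildcard, and empty
slots (nr_pages == 0) are skipped, as in the quoted patch.

```c
#include <stdio.h>

#define DEMO_MAX_LAYERS 3
#define DEMO_LABEL_LEN  31
#define DEMO_SEPARATOR  " or "	/* hypothetical stand-in for OTHER_LABEL */

struct demo_dimm {
	char label[DEMO_LABEL_LEN + 1];
	int location[DEMO_MAX_LAYERS];
	unsigned nr_pages;	/* 0 means the slot is empty */
};

/*
 * Concatenate the labels of every populated DIMM whose location matches
 * pos[]; a negative pos[l] matches any value in that layer.
 * Returns the number of DIMMs whose label was appended.
 */
static int collect_labels(const struct demo_dimm *dimms, int n_dimms,
			  const int pos[DEMO_MAX_LAYERS],
			  char *out, size_t out_len)
{
	size_t used = 0;
	int matches = 0;
	int i, l, n;

	if (out_len == 0)
		return 0;
	out[0] = '\0';

	for (i = 0; i < n_dimms; i++) {
		int match = 1;

		for (l = 0; l < DEMO_MAX_LAYERS; l++)
			if (pos[l] >= 0 && pos[l] != dimms[i].location[l])
				match = 0;

		/* Skip non-matching entries and empty slots. */
		if (!match || dimms[i].nr_pages == 0)
			continue;

		/* snprintf() bounds every write to the space that is left. */
		n = snprintf(out + used, out_len - used, "%s%s",
			     matches ? DEMO_SEPARATOR : "", dimms[i].label);
		if (n < 0 || (size_t)n >= out_len - used)
			break;	/* buffer full: stop, output may be truncated */
		used += (size_t)n;
		matches++;
	}
	return matches;
}
```

The quoted patch instead sizes the label buffer for the worst case and
uses strcpy(); the bounded variant here is only a choice made for the
sketch.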

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 13:33                                 ` Borislav Petkov
@ 2012-04-27 14:11                                   ` Joe Perches
  -1 siblings, 0 replies; 206+ messages in thread
From: Joe Perches @ 2012-04-27 14:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Fri, 2012-04-27 at 15:33 +0200, Borislav Petkov wrote:
> this patch gives
> 
> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0

One too many __func__'s in some combination of the
pr_fmt and/or dbg call and/or the actual call site?
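
A minimal sketch of how the doubled function name can arise, assuming
(hypothetically) that the debug macro already prepends __func__ while the
call site passes __func__ again. demo_debug(), demo_alloc() and
demo_alloc_fixed() are made-up names, and the output goes to a buffer
rather than the kernel log. The ##__VA_ARGS__ form is a GNU extension.

```c
#include <stdio.h>

static char demo_log[256];

/* Hypothetical debug macro that already prepends the caller's name. */
#define demo_debug(fmt, ...) \
	snprintf(demo_log, sizeof(demo_log), "%s: " fmt, __func__, ##__VA_ARGS__)

static void demo_alloc(int idx)
{
	/* The call site adds __func__ again, so the name shows up twice. */
	demo_debug("%s: %d: dimm%d\n", __func__, idx, idx);
}

static void demo_alloc_fixed(int idx)
{
	/* Dropping the explicit __func__ leaves a single prefix. */
	demo_debug("%d: dimm%d\n", idx, idx);
}
```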

> > diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
[]
> > @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
> >  
> >  #endif				/* CONFIG_PCI */
> >  
> > -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> > -					  unsigned nr_chans, int edac_index);
> > +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> > +				   unsigned nr_chans, int edac_index);
> 
> Why not "extern"?

Using extern function prototypes in .h files
isn't generally necessary nor is extern the
more common kernel style.
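
As a small illustration of that point: in C, a function declaration
without a storage-class specifier has external linkage anyway, so the two
prototypes below declare the same function and can coexist.
demo_alloc_size() is a hypothetical name used only for this sketch.

```c
/* Both prototypes are compatible: "extern" is implied for functions. */
extern unsigned demo_alloc_size(unsigned sz_pvt, unsigned nr_items);
unsigned demo_alloc_size(unsigned sz_pvt, unsigned nr_items);

/* A single definition satisfies both declarations. */
unsigned demo_alloc_size(unsigned sz_pvt, unsigned nr_items)
{
	return sz_pvt + nr_items * (unsigned)sizeof(unsigned);
}
```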

> > +static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
> >  			      unsigned long page_frame_number,
> >  			      unsigned long offset_in_page,
> >  			      unsigned long syndrome, int row, int channel,
> > -			      const char *msg);
> 
> Strange alignment, pls do
> 
> static inline void edac_mc_handle_ce(struct...,
> 				     unsigned...,
> 				     ...,
> 				     ...);
> 

or

static inline
void edac_mc_handle_ce(struct ..., etc)

or

static inline void
edac_mc_handle_ce(...)



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 14:11                                   ` Joe Perches
@ 2012-04-27 15:12                                     ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-27 15:12 UTC (permalink / raw)
  To: Joe Perches
  Cc: Borislav Petkov, Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Fri, Apr 27, 2012 at 10:11:35AM -0400, Joe Perches wrote:
> > > -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> > > -					  unsigned nr_chans, int edac_index);
> > > +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> > > +				   unsigned nr_chans, int edac_index);
> > 
> > Why not "extern"?
> 
> Using extern function prototypes in .h files
> isn't generally necessary nor is extern the
> more common kernel style.

Searching for it, there's this discussion, for example:
http://gcc.gnu.org/ml/gcc/2009-04/msg00812.html

Maybe we should put a small note in Documentation/CodingStyle what the
kernel preference is and hold people to it.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 13:33                                 ` Borislav Petkov
@ 2012-04-27 15:36                                   ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 15:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 27-04-2012 10:33, Borislav Petkov escreveu:
> Btw,
> 
> this patch gives
> 
> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
> [    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
> [    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
> [    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
> [    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
> [    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
> [    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
> [    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
> [    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
> [    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
> [    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
> [    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
> [    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
> [    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1
> 
> and the memory controller has the following chip selects
> 
> [    8.137662] EDAC MC: DCT0 chip selects:
> [    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
> [    8.180499] EDAC MC: DCT1 chip selects:
> [    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB
> 
> Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
> is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
> misleading and has nothing to do with the reality. So, if this is to use
> your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
> a rank.
> 
> Or, the most correct thing to do would be to have dimm0-dimm3, each
> dual-ranked.
> 
> So either tot_dimms is computed wrongly or there's a more serious error
> somewhere.
> 
> I've reviewed almost the half patch, will review the rest when/if we
> sort out the above issue first.
> 
> Thanks.


The fix for it was in another patch[1], as calling them "ranks" is also
needed in the sysfs API.

[1] http://lists-archives.com/linux-kernel/27623222-edac-add-a-new-per-dimm-api-and-make-the-old-per-virtual-rank-api-obsolete.html

I can just merge the fix on this patch, with the enclosed diff.
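
To illustrate why sixteen entries show up in the log above, here is a
stand-alone sketch. It assumes, as the log suggests, that the number of
entries is the product of the layer sizes, and that a chip-select layer
means each entry is a rank rather than a whole DIMM. The demo_* names are
hypothetical and not part of the EDAC core.

```c
#include <stdbool.h>

enum demo_layer_type { DEMO_LAYER_CHANNEL, DEMO_LAYER_CHIP_SELECT };

struct demo_layer {
	enum demo_layer_type type;
	unsigned size;
};

/* Product of all layer sizes: one entry per addressable location. */
static unsigned demo_tot_entries(const struct demo_layer *layers, int n_layers)
{
	unsigned tot = 1;
	int i;

	for (i = 0; i < n_layers; i++)
		tot *= layers[i].size;
	return tot;
}

/* A controller with a chip-select layer enumerates ranks, not DIMMs. */
static bool demo_is_per_rank(const struct demo_layer *layers, int n_layers)
{
	int i;

	for (i = 0; i < n_layers; i++)
		if (layers[i].type == DEMO_LAYER_CHIP_SELECT)
			return true;
	return false;
}
```

With 8 chip selects and 2 channels this gives 16 entries, matching dimm0
to dimm15 in the log; each one is really a rank, which is what the
"rank"/"dimm" naming in the enclosed diff reflects.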

Regards,
Mauro

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 4d4d8b7..e0d9481 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -86,7 +86,7 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
-	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
+	debugf3("\tmci->nr_dimms = %d, dimms = %p\n",
 		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
@@ -183,10 +183,6 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  * @size_pvt:		size of private storage needed
  *
  *
- * FIXME: drivers handle multi-rank memories on different ways: on some
- * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
- * a single multi-rank DIMM would be mapped into several "dimms".
- *
  * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
  * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
  * thing, as two chip select values are used for dual-rank memories (and 4, for
@@ -201,6 +197,12 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: on some
+ * drivers, one multi-rank memory is mapped as one entry, while, on others,
+ * a single multi-rank DIMM would be mapped into several entries. Currently,
+ * this function will allocate multiple struct dimm_info in such scenarios,
+ * as grouping the multiple ranks requires driver changes.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
@@ -220,10 +222,11 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
 	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
-	unsigned tot_csrows, tot_cschannels;
+	unsigned tot_csrows, tot_cschannels, tot_errcount = 0;
 	int i, j;
 	int err;
 	int row, chn;
+	bool per_rank = false;
 
 	BUG_ON(n_layers > EDAC_MAX_LAYERS);
 	/*
@@ -239,6 +242,9 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 			tot_csrows *= layers[i].size;
 		else
 			tot_cschannels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
 	}
 
 	/* Figure out the offsets of the various items from the start of an mc
@@ -254,14 +260,21 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
 	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
-	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
-		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -290,6 +303,7 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
 	mci->nr_csrows = tot_csrows;
 	mci->num_cschannel = tot_cschannels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
 	 * Fills the csrow struct
@@ -315,15 +329,16 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = &csi[row].channels[chn];
 		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
 			       pos[0], pos[1], pos[2]);
 		dimm->mci = mci;
 
-		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
-			i, (dimm - mci->dimms),
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
 			pos[0], pos[1], pos[2], row, chn);
 
 		/* Copy DIMM location */
@@ -1040,8 +1055,10 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			 * get csrow/channel of the dimm, in order to allow
 			 * incrementing the compat API counters
 			 */
-			debugf4("%s: dimm csrows (%d,%d)\n",
-				__func__, dimm->csrow, dimm->cschannel);
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
 			if (row == -1)
 				row = dimm->csrow;
 			else if (row >= 0 && row != dimm->csrow)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 412d5cd..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -555,6 +555,8 @@ struct mem_ctl_info {
 	/* Memory Controller hierarchy */
 	unsigned n_layers;
 	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
+
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-27 15:36                                   ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 15:36 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Em 27-04-2012 10:33, Borislav Petkov escreveu:
> Btw,
> 
> this patch gives
> 
> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
> [    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
> [    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
> [    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
> [    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
> [    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
> [    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
> [    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
> [    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
> [    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
> [    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
> [    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
> [    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
> [    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1
> 
> and the memory controller has the following chip selects
> 
> [    8.137662] EDAC MC: DCT0 chip selects:
> [    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
> [    8.180499] EDAC MC: DCT1 chip selects:
> [    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB
> 
> Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
> is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
> misleading and has nothing to do with the reality. So, if this is to use
> your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
> a rank.
> 
> Or, the most correct thing to do would be to have dimm0-dimm3, each
> dual-ranked.
> 
> So either tot_dimms is computed wrongly or there's a more serious error
> somewhere.
> 
> I've reviewed almost the half patch, will review the rest when/if we
> sort out the above issue first.
> 
> Thanks.


The fix for it was in another patch[1], as calling them "rank" is also
needed in the sysfs API.

[1] http://lists-archives.com/linux-kernel/27623222-edac-add-a-new-per-dimm-api-and-make-the-old-per-virtual-rank-api-obsolete.html

I can just merge the fix into this patch, with the enclosed diff.

Regards,
Mauro

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 4d4d8b7..e0d9481 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -86,7 +86,7 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
-	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
+	debugf3("\tmci->nr_dimms = %d, dimms = %p\n",
 		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
@@ -183,10 +183,6 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  * @size_pvt:		size of private storage needed
  *
  *
- * FIXME: drivers handle multi-rank memories on different ways: on some
- * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
- * a single multi-rank DIMM would be mapped into several "dimms".
- *
  * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
  * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
  * thing, as two chip select values are used for dual-rank memories (and 4, for
@@ -201,6 +197,12 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories on different ways: on some
+ * drivers, one multi-rank memory is mapped as one entry, while, on others,
+ * a single multi-rank DIMM would be mapped into several entries. Currently,
+ * this function will allocate multiple struct dimm_info on such scenarios,
+ * as grouping the multiple ranks require drivers change.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
@@ -220,10 +222,11 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
 	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
-	unsigned tot_csrows, tot_cschannels;
+	unsigned tot_csrows, tot_cschannels, tot_errcount = 0;
 	int i, j;
 	int err;
 	int row, chn;
+	bool per_rank = false;
 
 	BUG_ON(n_layers > EDAC_MAX_LAYERS);
 	/*
@@ -239,6 +242,9 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 			tot_csrows *= layers[i].size;
 		else
 			tot_cschannels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
 	}
 
 	/* Figure out the offsets of the various items from the start of an mc
@@ -254,14 +260,21 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
 	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
-	debugf1("%s(): allocating %u bytes for mci data (%d dimms, %d csrows/channels)\n",
-		__func__, size, tot_dimms, tot_csrows * tot_cschannels);
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_cschannels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -290,6 +303,7 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	memcpy(mci->layers, layers, sizeof(*lay) * n_layers);
 	mci->nr_csrows = tot_csrows;
 	mci->num_cschannel = tot_cschannels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
 	 * Fills the csrow struct
@@ -315,15 +329,16 @@ struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d dimms\n", __func__, tot_dimms);
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = &csi[row].channels[chn];
 		dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
 			       pos[0], pos[1], pos[2]);
 		dimm->mci = mci;
 
-		debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
-			i, (dimm - mci->dimms),
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
 			pos[0], pos[1], pos[2], row, chn);
 
 		/* Copy DIMM location */
@@ -1040,8 +1055,10 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			 * get csrow/channel of the dimm, in order to allow
 			 * incrementing the compat API counters
 			 */
-			debugf4("%s: dimm csrows (%d,%d)\n",
-				__func__, dimm->csrow, dimm->cschannel);
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
 			if (row == -1)
 				row = dimm->csrow;
 			else if (row >= 0 && row != dimm->csrow)
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 412d5cd..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -555,6 +555,8 @@ struct mem_ctl_info {
 	/* Memory Controller hierarchy */
 	unsigned n_layers;
 	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
+
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 14:11                                   ` Joe Perches
@ 2012-04-27 16:07                                     ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 16:07 UTC (permalink / raw)
  To: Joe Perches
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On 27-04-2012 11:11, Joe Perches wrote:
> On Fri, 2012-04-27 at 15:33 +0200, Borislav Petkov wrote:
>> this patch gives
>>
>> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> 
> One too many __func__'s in some combination of the
> pr_fmt and/or dbg call and/or the actual call site?

Yes. This is a common issue in the EDAC core: in several places, it calls the
EDAC debug macros (debugf0...debugf4) passing __func__ as an argument, while
the debug macros already handle that. I suspect that, in the past, __func__
was not in the macros, but some patch added it there and forgot to fix the
call sites.

This is something that needs to be reviewed across the entire EDAC core (and
likely in the drivers).

I opted not to touch the existing debug logic, as I think it is better to
address all those issues in one separate patch, after fixing the
EDAC core bugs.
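
A minimal sketch of how the doubled prefix arises, assuming a hypothetical
demo_debugf() macro that prepends __func__ the way the EDAC debug macros are
described above. It formats into a caller-supplied buffer instead of the
kernel log; none of these names exist in the kernel tree.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a debug macro that already prepends the
 * name of the calling function. Needs at least one variadic argument.
 */
#define demo_debugf(buf, len, fmt, ...) \
	snprintf(buf, len, "%s: " fmt, __func__, __VA_ARGS__)

/* Call site that also passes __func__: the name ends up in the output twice. */
int demo_alloc_redundant(char *buf, size_t len)
{
	return demo_debugf(buf, len, "%s: %d: dimm%d", __func__, 0, 0);
}

/* Call site that relies on the macro alone: the name appears once. */
int demo_alloc_clean(char *buf, size_t len)
{
	return demo_debugf(buf, len, "%d: dimm%d", 0, 0);
}
```

The first function reproduces the "name: name:" pattern seen in the quoted
log line; the second shows the call site after dropping the extra argument.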
> 
>>> diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
> []
>>> @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
>>>  
>>>  #endif				/* CONFIG_PCI */
>>>  
>>> -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>> -					  unsigned nr_chans, int edac_index);
>>> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>> +				   unsigned nr_chans, int edac_index);
>>
>> Why not "extern"?
> 
> Using extern function prototypes in .h files
> isn't generally necessary nor is extern the
> more common kernel style.

Yes. I never add extern to the code I write.

While CodingStyle doesn't explicitly say anything about that, its spirit
seems to indicate that the right thing is to avoid using it, like, for
example:
	"Chapter 4: Naming

	C is a Spartan language, and so should your naming be."

(also in other places, like avoiding {} for single-statement ifs).

So, useless keywords like "extern" don't seem to be the best choice.
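
As a small illustration of why the keyword is redundant on function
prototypes: in C, a file-scope function declaration has external linkage by
default, so the two declarations below are compatible and mean the same
thing. demo_add() is a made-up name used only for this sketch.

```c
/* Declaration without the keyword: external linkage is the default. */
int demo_add(int a, int b);

/* Same declaration with the keyword spelled out; it changes nothing. */
extern int demo_add(int a, int b);

/* A single definition satisfies both declarations. */
int demo_add(int a, int b)
{
	return a + b;
}
```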

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 13:33                                 ` Borislav Petkov
@ 2012-04-27 17:52                                   ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 17:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 27-04-2012 10:33, Borislav Petkov wrote:
> Btw,
> 
> this patch gives
> 
> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
> [    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
> [    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
> [    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
> [    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
> [    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
> [    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
> [    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
> [    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
> [    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
> [    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
> [    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
> [    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
> [    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1
> 
> and the memory controller has the following chip selects
> 
> [    8.137662] EDAC MC: DCT0 chip selects:
> [    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
> [    8.180499] EDAC MC: DCT1 chip selects:
> [    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB
> 
> Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
> is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
> misleading and has nothing to do with the reality. So, if this is to use
> your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
> a rank.
> 
> Or, the most correct thing to do would be to have dimm0-dimm3, each
> dual-ranked.
> 
> So either tot_dimms is computed wrongly or there's a more serious error
> somewhere.
> 
> I've reviewed almost the half patch, will review the rest when/if we
> sort out the above issue first.
> 
> Thanks.
> 
> On Tue, Apr 24, 2012 at 03:15:41PM -0300, Mauro Carvalho Chehab wrote:
>> Change the EDAC internal representation to work with non-csrow
>> based memory controllers.
>>
>> There are lots of those memory controllers nowadays, and more
>> are coming. So, the EDAC internal representation needs to be
>> changed, in order to work with those memory controllers, while
>> preserving backward compatibility with the old ones.
>>
>> The edac core were written with the idea that memory controllers
> 
> 		was
> 
>> are able to directly access csrows, and that the channels are
>> used inside a csrows select.
> 
> This sounds funny, simply remove that second part about the channels.
> 
>> This is not true for FB-DIMM and RAMBUS memory controllers.
>>
>> Also, some recent advanced memory controllers don't present a per-csrows
>> view. Instead, they view memories as DIMM's, instead of ranks, accessed
> 
> 					DIMMs instead of ranks."
> 
> Remove the rest.
> 
>> via csrow/channel.
>>
>> So, change the allocation and error report routines to allow
>> them to work with all types of architectures.
>>
>> This will allow the removal of several hacks on FB-DIMM and RAMBUS
> 
> 					       with
> 
>> memory controllers on the next patches.
> 
> 		    . Remove the rest.
> 
>>
>> Also, several tests were done on different platforms using different
>> x86 drivers.
>>
>> TODO: a multi-rank DIMM's are currently represented by multiple DIMM
> 
> 	Multi-rank DIMMs
> 
>> entries at struct dimm_info. That means that changing a label for one
> 
> 	  in
> 
>> rank won't change the same label for the other ranks at the same dimm.
> 
> 						       of the same DIMM.
> 
>> Such bug is there since the beginning of the EDAC, so it is not a big
> 
>   This bug is present ..
> 
>> deal. However, on several drivers, it is possible to fix this issue, but
> 
> 		remove "on"
> 
>> it should be a per-driver fix, as the csrow => DIMM arrangement may not
>> be equal for all. So, don't try to fix it here yet.
>>
>> PS.: I tried to make this patch as short as possible, preceding it with
> 
> Remove "PS."
> 
>> several other patches that simplified the logic here. Yet, as the
>> internal API changes, all drivers need changes. The changes are
>> generally bigger on the drivers for FB-DIMM's.
> 
> 		   in 		   for FB-DIMMs.
> 
>>
>> FIXME: while the FB-DIMMs are not converted to use the new
>> design, uncorrected errors will show just one channel. In
>> the past, all changes were on a big patch with about 150K.
>> As it needed to be split, in order to be accepted by the
>> EDAC ML at vger, we've opted to have this small drawback.
>> As an advantage, it is now easier to review the patch series.
> 
> This whole paragraph above doesn't have anything to do with what the
> patch does, so it can go.
> 
> [..]
> 
>> ---
>>
>> v16: Only context changes
>>
>>  drivers/edac/edac_core.h |   92 ++++++-
>>  drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
>>  include/linux/edac.h     |   40 ++-
>>  3 files changed, 526 insertions(+), 288 deletions(-)
>>
>> diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
>> index e48ab31..7201bb1 100644
>> --- a/drivers/edac/edac_core.h
>> +++ b/drivers/edac/edac_core.h
>> @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
>>  
>>  #endif				/* CONFIG_PCI */
>>  
>> -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -					  unsigned nr_chans, int edac_index);
>> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> +				   unsigned nr_chans, int edac_index);
> 
> Why not "extern"?
> 
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				   unsigned n_layers,
>> +				   struct edac_mc_layer *layers,
>> +				   bool rev_order,
>> +				   unsigned sz_pvt);
> 
> ditto.
> 
>>  extern int edac_mc_add_mc(struct mem_ctl_info *mci);
>>  extern void edac_mc_free(struct mem_ctl_info *mci);
>>  extern struct mem_ctl_info *edac_mc_find(int idx);
>> @@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
>>   * reporting logic and function interface - reduces conditional
>>   * statement clutter and extra function arguments.
>>   */
>> -extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
>> +
>> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
>> +			  struct mem_ctl_info *mci,
>> +			  const unsigned long page_frame_number,
>> +			  const unsigned long offset_in_page,
>> +			  const unsigned long syndrome,
>> +			  const int layer0,
>> +			  const int layer1,
>> +			  const int layer2,
>> +			  const char *msg,
>> +			  const char *other_detail,
>> +			  const void *mcelog);
> 
> Why isn't this one "extern" either?
> 
>> +
>> +static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  			      unsigned long page_frame_number,
>>  			      unsigned long offset_in_page,
>>  			      unsigned long syndrome, int row, int channel,
>> -			      const char *msg);
> 
> Strange alignment, pls do
> 
> static inline void edac_mc_handle_ce(struct...,
> 				     unsigned...,
> 				     ...,
> 				     ...);
> 
> 
>> -extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
>> -				      const char *msg);
>> -extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
>> +			      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      page_frame_number, offset_in_page, syndrome,
>> +		              row, channel, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
>> +				      const char *msg)
> 
> ditto.
> 
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
>>  			      unsigned long page_frame_number,
>>  			      unsigned long offset_in_page, int row,
>> -			      const char *msg);
> 
> ditto.
> 
>> -extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
>> -				      const char *msg);
>> -extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
>> -				  unsigned int channel0, unsigned int channel1,
>> -				  char *msg);
>> -extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
>> -				  unsigned int channel, char *msg);
>> +			      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      page_frame_number, offset_in_page, 0,
>> +		              row, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
>> +				      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
>> +					 unsigned int csrow,
>> +					 unsigned int channel0,
>> +					 unsigned int channel1,
>> +					 char *msg)
> 
> Now this alignment looks correct.
> 
>> +{
>> +	/*
>> +	 *FIXME: The error can also be at channel1 (e. g. at the second
>> +	 *	  channel of the same branch). The fix is to push
>> +	 *	  edac_mc_handle_error() call into each driver
>> +	 */
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      0, 0, 0,
>> +		              csrow, channel0, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
>> +					 unsigned int csrow,
>> +					 unsigned int channel, char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      0, 0, 0,
>> +		              csrow, channel, -1, msg, NULL, NULL);
>> +}
>> +
>> +
> 
> Two superfluous newlines.

Fixed all of the above (except for the "extern").

> 
>>  
>>  /*
>>   * edac_device APIs
>> @@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
>>  extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
>>  				int inst_nr, int block_nr, const char *msg);
>>  extern int edac_device_alloc_index(void);
>> +extern const char *edac_layer_name[];
>>  
>>  /*
>>   * edac_pci APIs
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index 6ec967a..4d4d8b7 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>>  	debugf4("\tchannel = %p\n", chan);
>>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>> -	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
>> -	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
>> -	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
>> +	debugf4("\tchannel->dimm = %p\n", chan->dimm);
>> +}
>> +
>> +static void edac_mc_dump_dimm(struct dimm_info *dimm)
>> +{
>> +	int i;
>> +
>> +	debugf4("\tdimm = %p\n", dimm);
>> +	debugf4("\tdimm->label = '%s'\n", dimm->label);
>> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>> +	debugf4("\tdimm location ");
>> +	for (i = 0; i < dimm->mci->n_layers; i++) {
>> +		printk(KERN_CONT "%d", dimm->location[i]);
>> +		if (i < dimm->mci->n_layers - 1)
>> +			printk(KERN_CONT ".");
>> +	}
>> +	printk(KERN_CONT "\n");
> 
> This looks hacky but I don't have a good suggestion what to do instead
> here. Maybe snprintf into a complete string which you can issue with
> debugf4()...

This is not hacky. There are several places in the Kernel doing loops like
that. Look, for example, at lib/hexdump.c (without KERN_CONT, as this
macro was added later - probably to avoid checkpatch.pl complaints).
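
For comparison, the snprintf alternative suggested in the review could look
roughly like the sketch below: it builds the dotted location into one buffer,
which could then be emitted with a single debug call. demo_format_location()
and its parameters are hypothetical and are not part of the patch.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Format loc[0..n_layers-1] as "a.b.c" into buf.
 * Returns the number of characters stored (excluding the NUL);
 * output is truncated if buf is too small.
 */
size_t demo_format_location(char *buf, size_t len,
			    const unsigned *loc, unsigned n_layers)
{
	size_t used = 0;
	unsigned i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n_layers; i++) {
		int n = snprintf(buf + used, len - used, "%s%u",
				 i ? "." : "", loc[i]);

		if (n < 0)
			break;
		if ((size_t)n >= len - used) {
			/* Truncated: snprintf already NUL-terminated buf. */
			used = len - 1;
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

The trade-off is a stack buffer and a length bound in exchange for a single
log call; the KERN_CONT loop avoids the buffer entirely.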

>> +	debugf4("\tdimm->grain = %d\n", dimm->grain);
>> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>>  }
>>  
>>  static void edac_mc_dump_csrow(struct csrow_info *csrow)
>> @@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
>>  	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
>>  	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
>>  		mci->nr_csrows, mci->csrows);
>> +	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
> 
> 		      ->tot_dimms      dimms

Fixed.

> 
>> +		mci->tot_dimms, mci->dimms);
>>  	debugf3("\tdev = %p\n", mci->dev);
>>  	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
>>  	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
>> @@ -157,10 +175,25 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>>  }
>>  
>>  /**
>> - * edac_mc_alloc: Allocate a struct mem_ctl_info structure
>> - * @size_pvt:	size of private storage needed
>> - * @nr_csrows:	Number of CWROWS needed for this MC
>> - * @nr_chans:	Number of channels for the MC
>> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
> 
> 					    fill
> 
>> + * @edac_index:		Memory controller number
>> + * @n_layers:		Number of layers at the MC hierarchy
> 
> 				Number of MC hierarchy layers
> 
>> + * layers:		Describes each layer as seen by the Memory Controller
>> + * @rev_order:		Fills csrows/cs channels at the reverse order
> 
> 				      csrows/channels in reverse order
> 
>> + * @size_pvt:		size of private storage needed
>> + *
>> + *
>> + * FIXME: drivers handle multi-rank memories on different ways: on some
> 
> 						in		   in
> 
>> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
> 
> 			      memory stick			   in
> 
>> + * a single multi-rank DIMM would be mapped into several "dimms".
> 
> 			  memory stick
> 
>> + *
>> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
>> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
> 
> 				   csrow-based
> 
>> + * thing, as two chip select values are used for dual-rank memories (and 4, for
>> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
>> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
>> + *
>> + * In summary, solving this issue is not easy, as it requires a lot of testing.
>>   *
>>   * Everything is kmalloc'ed as one big chunk - more efficient.
>>   * Only can be used if all structures have the same lifetime - otherwise
>> @@ -172,18 +205,41 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>>   *	NULL allocation failed
>>   *	struct mem_ctl_info pointer
>>   */
>> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -				unsigned nr_chans, int edac_index)
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				   unsigned n_layers,
>> +				   struct edac_mc_layer *layers,
>> +				   bool rev_order,
>> +				   unsigned sz_pvt)
> 
> strange function argument vertical alignment
> 
>>  {
>>  	void *ptr = NULL;
>>  	struct mem_ctl_info *mci;
>> -	struct csrow_info *csi, *csrow;
>> +	struct edac_mc_layer *lay;
> 
> As before, call this "layers" pls.
> 
>> +	struct csrow_info *csi, *csr;
>>  	struct rank_info *chi, *chp, *chan;
>>  	struct dimm_info *dimm;
>> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>  	void *pvt;
>> -	unsigned size;
>> -	int row, chn;
>> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +	unsigned tot_csrows, tot_cschannels;
> 
> No need to call this "tot_cschannels" - "tot_channels" should be enough.
> 
>> +	int i, j;
>>  	int err;
>> +	int row, chn;
> 
> All those local variables should be sorted in a reverse christmas tree
> order:
> 
> 	u32 this_is_the_longest_array_name[LENGTH];
> 	void *shorter_named_variable;
> 	unsigned long size;
> 	int i;
> 
> 	...

Why? There's nothing in CodingStyle about how the vars should be
ordered. If you want to enforce some particular order, please do it
yourself, but apply it consistently across the entire subsystem.

> 
>> +
>> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);
> 
> 
> Push this BUG_ON up into edac_mc_alloc as the first thing this function
> does.

It is already the first thing in the function.

> Also, is it valid to have n_layers == 0? The memcpy call below
> will do nothing.

Changed to:
	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);

>> +	/*
>> +	 * Calculate the total amount of dimms and csrows/cschannels while
>> +	 * in the old API emulation mode
>> +	 */
>> +	tot_dimms = 1;
>> +	tot_cschannels = 1;
>> +	tot_csrows = 1;
> 
> Those initializations can be done above at variable declaration time.

Yes, but the compiled code will be the same anyway, as gcc will optimize
it, either by using registers for those vars or by moving the initialization
to the top of the function.

This function is too complex, so it is better to initialize those vars
just before the loops that are calculating those totals.
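
A simplified, userspace sketch of the totals loop under discussion, with the
initializations kept next to the loop. The demo_* types are stand-ins for the
kernel structures, and the 8 x 2 layout used in the usage note is the one
implied by the quoted log (rows 0-7, channels 0-1).

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS */

/* Simplified stand-in for struct edac_mc_layer. */
struct demo_layer {
	unsigned size;
	bool is_virt_csrow;
};

struct demo_totals {
	unsigned dimms;
	unsigned csrows;
	unsigned cschannels;
};

struct demo_totals demo_count(const struct demo_layer *layers,
			      unsigned n_layers)
{
	/* All three totals are products, so they start at 1. */
	struct demo_totals t = { 1, 1, 1 };
	unsigned i;

	assert(n_layers > 0 && n_layers <= DEMO_MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.cschannels *= layers[i].size;
	}
	return t;
}
```

With one chip-select layer of size 8 and one channel layer of size 2, this
yields 16 entries, 8 csrows and 2 channels, which matches the dimm0-dimm15
numbering in the log.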

> 
>> +	for (i = 0; i < n_layers; i++) {
>> +		tot_dimms *= layers[i].size;
>> +		if (layers[i].is_virt_csrow)
>> +			tot_csrows *= layers[i].size;
>> +		else
>> +			tot_cschannels *= layers[i].size;
>> +	}
>>  
>>  	/* Figure out the offsets of the various items from the start of an mc
>>  	 * structure.  We want the alignment of each item to be at least as
>> @@ -191,12 +247,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	 * hardcode everything into a single struct.
>>  	 */
>>  	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
>> -	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
>> -	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
>> -	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
>> +	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
>> +	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
>> +	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
>> +	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
>> +	count = 1;
> 
> ditto.
>> +	for (i = 0; i < n_layers; i++) {
>> +		tot_dimms *= layers[i].size;
>> +		if (layers[i].is_virt_csrow)
>> +			tot_csrows *= layers[i].size;
>> +		else
>> +			tot_cschannels *= layers[i].size;
>> +	}

Ditto: let gcc optimize it.

Spreading the 'count' math code will only make it harder for a reviewer to
actually see what's there.

At the assembler level, count will likely be kept in a register anyway.
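
A rough sketch of the 'count' accumulation, assuming one corrected-error and
one uncorrected-error counter per addressable entry at each layer, as the
ce_per_layer/ue_per_layer arrays suggest. demo_tot_errcount() is a
hypothetical helper, not code from the patch.

```c
#include <assert.h>

#define DEMO_MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS */

/* Total number of per-layer error counters for the given layer sizes. */
unsigned demo_tot_errcount(const unsigned *layer_size, unsigned n_layers)
{
	unsigned count = 1;	/* running product, set right before the loop */
	unsigned total = 0;
	unsigned i;

	assert(n_layers > 0 && n_layers <= DEMO_MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		count *= layer_size[i];	/* entries addressable down to layer i */
		total += 2 * count;	/* one CE plus one UE counter per entry */
	}
	return total;
}
```

For layer sizes 8 and 2 the running product is 8 and then 16, giving
2 * 8 + 2 * 16 = 48 counters.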

<removed the rest of the email, as there aren't any comments after that point>

A patch with all the fixes is enclosed.

--

[PATCH EDACv17] edac: Change internal representation to work with layers

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. They see memories as DIMMs rather than as ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.
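
As an illustration of what a layered representation implies for the allocation side, the sketch below maps a per-layer position to a flat array index and walks all positions in order. It assumes a row-major layout with the last layer varying fastest, which is consistent with how the index is accumulated in edac_increment_ce_error() in this patch; `flat_index()`, `next_location()` and `MAX_LAYERS` are hypothetical userspace names, not the EDAC_DIMM_PTR() macro itself.

```c
#include <assert.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/* Row-major index of a location: ((pos0 * size1) + pos1) * size2 + pos2 */
static unsigned flat_index(const unsigned *sizes, unsigned n_layers,
			   const unsigned *pos)
{
	unsigned i, index = 0;

	assert(n_layers >= 1 && n_layers <= MAX_LAYERS);

	for (i = 0; i < n_layers; i++)
		index = index * sizes[i] + pos[i];
	return index;
}

/*
 * Advance pos to the next location, last layer first, like an odometer.
 * Returns 1 while there are more locations, 0 once it wraps back to zero.
 */
static int next_location(const unsigned *sizes, unsigned n_layers,
			 unsigned *pos)
{
	int j;

	for (j = (int)n_layers - 1; j >= 0; j--) {
		pos[j]++;
		if (pos[j] < sizes[j])
			return 1;
		pos[j] = 0;
	}
	return 0;
}
```

Starting from an all-zero position and calling next_location() repeatedly visits the flat indexes in increasing order, which is why a single linear array is enough to hold the per-location data for every layer arrangement.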

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.
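
One such case is an FB-DIMM uncorrected error where the branch and slot are known but the channel is not. The sketch below shows the wildcard matching idea used by edac_mc_handle_error() in this patch, where a negative layer value means "unknown". It is a userspace approximation: `location_matches()`, `count_candidates()` and `MAX_LAYERS` are hypothetical names, and the label handling and counters are left out.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/* A negative reported value acts as a wildcard for that layer */
static bool location_matches(const int *reported, const unsigned *location,
			     unsigned n_layers)
{
	unsigned i;

	assert(n_layers <= MAX_LAYERS);

	for (i = 0; i < n_layers; i++)
		if (reported[i] >= 0 && (unsigned)reported[i] != location[i])
			return false;
	return true;
}

/* Count how many known dimm locations are compatible with the report */
static unsigned count_candidates(const int *reported,
				 unsigned (*locations)[MAX_LAYERS],
				 unsigned n_dimms, unsigned n_layers)
{
	unsigned i, n = 0;

	for (i = 0; i < n_dimms; i++)
		if (location_matches(reported, locations[i], n_layers))
			n++;
	return n;
}
```

A report of branch 0, unknown channel, slot 0 matches both dimms of the pair, while a fully specified report matches exactly one.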

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks of the same DIMM.
This bug has been present since the beginning of EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be the same for all of them. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

---

v17: Several cosmetic changes.

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..b2dfdf5 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,78 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page,
+				     unsigned long syndrome, int row, int channel,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page, int row,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
 
 /*
  * edac_device APIs
@@ -496,6 +555,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..a9f7650 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,21 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of MC hierarchy layers
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/channels in reverse order
+ * @sz_pvt:		Size of private storage needed
+ *
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -168,22 +197,55 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory stick is mapped as one entry, while, in
+ * others, a single multi-rank memory stick would be mapped into several
+ * entries. Currently, this function will allocate multiple struct dimm_info
+ * in such scenarios, as grouping the multiple ranks requires driver changes.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				       unsigned n_layers,
+				       struct edac_mc_layer *layers,
+				       bool rev_order,
+				       unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *layer;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_channels, tot_errcount = 0;
+	int i, j;
 	int err;
+	int row, chn;
+	bool per_rank = false;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_channels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_channels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +253,28 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	layer = edac_align_ptr(&ptr, sizeof(*layer), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_channels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
+	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_channels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +282,101 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	layer = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)layer));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = layer;
+	memcpy(mci->layers, layers, sizeof(*layer) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_channels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_channels;
+		chp = &chi[row * tot_channels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_channels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(layer, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_virt_csrow)
+					break;
+			chn++;
+			if (chn == tot_channels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_virt_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +400,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels in reverse order
+ * @sz_pvt:		Size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, in others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the csrow-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +716,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +742,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +901,251 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called as "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no point
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -541,13 +549,18 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
+
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
 
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +575,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +596,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-27 17:52                                   ` Mauro Carvalho Chehab
From: Mauro Carvalho Chehab @ 2012-04-27 17:52 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On 27-04-2012 10:33, Borislav Petkov wrote:
> Btw,
> 
> this patch gives
> 
> [    8.278399] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [    8.287594] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [    8.296784] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: dimm2 (1:0:0): row 1, chan 0
> [    8.305968] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: dimm3 (1:1:0): row 1, chan 1
> [    8.315144] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: dimm4 (2:0:0): row 2, chan 0
> [    8.324326] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: dimm5 (2:1:0): row 2, chan 1
> [    8.333502] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: dimm6 (3:0:0): row 3, chan 0
> [    8.342684] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: dimm7 (3:1:0): row 3, chan 1
> [    8.351860] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: dimm8 (4:0:0): row 4, chan 0
> [    8.361049] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: dimm9 (4:1:0): row 4, chan 1
> [    8.370227] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: dimm10 (5:0:0): row 5, chan 0
> [    8.379582] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: dimm11 (5:1:0): row 5, chan 1
> [    8.388941] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: dimm12 (6:0:0): row 6, chan 0
> [    8.398315] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: dimm13 (6:1:0): row 6, chan 1
> [    8.407680] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: dimm14 (7:0:0): row 7, chan 0
> [    8.417047] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: dimm15 (7:1:0): row 7, chan 1
> 
> and the memory controller has the following chip selects
> 
> [    8.137662] EDAC MC: DCT0 chip selects:
> [    8.150291] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.155349] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.160408] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.165475] EDAC amd64: MC: 6:     0MB 7:     0MB
> [    8.180499] EDAC MC: DCT1 chip selects:
> [    8.184693] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [    8.189753] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [    8.194812] EDAC amd64: MC: 4:     0MB 5:     0MB
> [    8.199875] EDAC amd64: MC: 6:     0MB 7:     0MB
> 
> Those are 4 dual-ranked DIMMs on this node, DCT0 is one channel and DCT1
> is another and I have 4 ranks per channel. Having dimm0-dimm15 is very
> misleading and has nothing to do with the reality. So, if this is to use
> your nomenclature with layers, I'll have dimm0-dimm7 where each dimm is
> a rank.
> 
> Or, the most correct thing to do would be to have dimm0-dimm3, each
> dual-ranked.
> 
> So either tot_dimms is computed wrongly or there's a more serious error
> somewhere.
> 
> I've reviewed almost the half patch, will review the rest when/if we
> sort out the above issue first.
> 
> Thanks.
> 
> On Tue, Apr 24, 2012 at 03:15:41PM -0300, Mauro Carvalho Chehab wrote:
>> Change the EDAC internal representation to work with non-csrow
>> based memory controllers.
>>
>> There are lots of those memory controllers nowadays, and more
>> are coming. So, the EDAC internal representation needs to be
>> changed, in order to work with those memory controllers, while
>> preserving backward compatibility with the old ones.
>>
>> The edac core were written with the idea that memory controllers
> 
> 		was
> 
>> are able to directly access csrows, and that the channels are
>> used inside a csrows select.
> 
> This sounds funny, simply remove that second part about the channels.
> 
>> This is not true for FB-DIMM and RAMBUS memory controllers.
>>
>> Also, some recent advanced memory controllers don't present a per-csrows
>> view. Instead, they view memories as DIMM's, instead of ranks, accessed
> 
> 					DIMMs instead of ranks."
> 
> Remove the rest.
> 
>> via csrow/channel.
>>
>> So, change the allocation and error report routines to allow
>> them to work with all types of architectures.
>>
>> This will allow the removal of several hacks on FB-DIMM and RAMBUS
> 
> 					       with
> 
>> memory controllers on the next patches.
> 
> 		    . Remove the rest.
> 
>>
>> Also, several tests were done on different platforms using different
>> x86 drivers.
>>
>> TODO: a multi-rank DIMM's are currently represented by multiple DIMM
> 
> 	Multi-rank DIMMs
> 
>> entries at struct dimm_info. That means that changing a label for one
> 
> 	  in
> 
>> rank won't change the same label for the other ranks at the same dimm.
> 
> 						       of the same DIMM.
> 
>> Such bug is there since the beginning of the EDAC, so it is not a big
> 
>   This bug is present ..
> 
>> deal. However, on several drivers, it is possible to fix this issue, but
> 
> 		remove "on"
> 
>> it should be a per-driver fix, as the csrow => DIMM arrangement may not
>> be equal for all. So, don't try to fix it here yet.
>>
>> PS.: I tried to make this patch as short as possible, preceding it with
> 
> Remove "PS."
> 
>> several other patches that simplified the logic here. Yet, as the
>> internal API changes, all drivers need changes. The changes are
>> generally bigger on the drivers for FB-DIMM's.
> 
> 		   in 		   for FB-DIMMs.
> 
>>
>> FIXME: while the FB-DIMMs are not converted to use the new
>> design, uncorrected errors will show just one channel. In
>> the past, all changes were on a big patch with about 150K.
>> As it needed to be split, in order to be accepted by the
>> EDAC ML at vger, we've opted to have this small drawback.
>> As an advantage, it is now easier to review the patch series.
> 
> This whole paragraph above doesn't have anything to do with what the
> patch does, so it can go.
> 
> [..]
> 
>> ---
>>
>> v16: Only context changes
>>
>>  drivers/edac/edac_core.h |   92 ++++++-
>>  drivers/edac/edac_mc.c   |  682 ++++++++++++++++++++++++++++------------------
>>  include/linux/edac.h     |   40 ++-
>>  3 files changed, 526 insertions(+), 288 deletions(-)
>>
>> diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
>> index e48ab31..7201bb1 100644
>> --- a/drivers/edac/edac_core.h
>> +++ b/drivers/edac/edac_core.h
>> @@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
>>  
>>  #endif				/* CONFIG_PCI */
>>  
>> -extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -					  unsigned nr_chans, int edac_index);
>> +struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> +				   unsigned nr_chans, int edac_index);
> 
> Why not "extern"?
> 
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				   unsigned n_layers,
>> +				   struct edac_mc_layer *layers,
>> +				   bool rev_order,
>> +				   unsigned sz_pvt);
> 
> ditto.
> 
>>  extern int edac_mc_add_mc(struct mem_ctl_info *mci);
>>  extern void edac_mc_free(struct mem_ctl_info *mci);
>>  extern struct mem_ctl_info *edac_mc_find(int idx);
>> @@ -467,24 +472,80 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
>>   * reporting logic and function interface - reduces conditional
>>   * statement clutter and extra function arguments.
>>   */
>> -extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
>> +
>> +void edac_mc_handle_error(const enum hw_event_mc_err_type type,
>> +			  struct mem_ctl_info *mci,
>> +			  const unsigned long page_frame_number,
>> +			  const unsigned long offset_in_page,
>> +			  const unsigned long syndrome,
>> +			  const int layer0,
>> +			  const int layer1,
>> +			  const int layer2,
>> +			  const char *msg,
>> +			  const char *other_detail,
>> +			  const void *mcelog);
> 
> Why isn't this one "extern" either?
> 
>> +
>> +static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
>>  			      unsigned long page_frame_number,
>>  			      unsigned long offset_in_page,
>>  			      unsigned long syndrome, int row, int channel,
>> -			      const char *msg);
> 
> Strange alignment, pls do
> 
> static inline void edac_mc_handle_ce(struct...,
> 				     unsigned...,
> 				     ...,
> 				     ...);
> 
> 
>> -extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
>> -				      const char *msg);
>> -extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
>> +			      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      page_frame_number, offset_in_page, syndrome,
>> +		              row, channel, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
>> +				      const char *msg)
> 
> ditto.
> 
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
>>  			      unsigned long page_frame_number,
>>  			      unsigned long offset_in_page, int row,
>> -			      const char *msg);
> 
> ditto.
> 
>> -extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
>> -				      const char *msg);
>> -extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
>> -				  unsigned int channel0, unsigned int channel1,
>> -				  char *msg);
>> -extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
>> -				  unsigned int channel, char *msg);
>> +			      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      page_frame_number, offset_in_page, 0,
>> +		              row, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
>> +				      const char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
>> +					 unsigned int csrow,
>> +					 unsigned int channel0,
>> +					 unsigned int channel1,
>> +					 char *msg)
> 
> Now this alignment looks correct.
> 
>> +{
>> +	/*
>> +	 *FIXME: The error can also be at channel1 (e. g. at the second
>> +	 *	  channel of the same branch). The fix is to push
>> +	 *	  edac_mc_handle_error() call into each driver
>> +	 */
>> +	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
>> +			      0, 0, 0,
>> +		              csrow, channel0, -1, msg, NULL, NULL);
>> +}
>> +
>> +static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
>> +					 unsigned int csrow,
>> +					 unsigned int channel, char *msg)
>> +{
>> +	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +			      0, 0, 0,
>> +		              csrow, channel, -1, msg, NULL, NULL);
>> +}
>> +
>> +
> 
> Two superfluous newlines.

Fixed all above (except for the "extern").
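
On the "extern" point: as far as the C language goes, the keyword is
redundant on a file-scope function declaration, since such declarations
have external linkage by default. A minimal sketch with made-up function
names (nothing here is EDAC API):

```c
/* Both prototypes declare functions with external linkage. */
extern int sketch_with_extern(int x);
int sketch_without_extern(int x);

/* Definitions, so that the sketch links on its own. */
int sketch_with_extern(int x)
{
	return x + 1;
}

int sketch_without_extern(int x)
{
	return x + 1;
}
```

So keeping or dropping "extern" on those prototypes is a matter of style,
not of behavior.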

> 
>>  
>>  /*
>>   * edac_device APIs
>> @@ -496,6 +557,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
>>  extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
>>  				int inst_nr, int block_nr, const char *msg);
>>  extern int edac_device_alloc_index(void);
>> +extern const char *edac_layer_name[];
>>  
>>  /*
>>   * edac_pci APIs
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
>> index 6ec967a..4d4d8b7 100644
>> --- a/drivers/edac/edac_mc.c
>> +++ b/drivers/edac/edac_mc.c
>> @@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>>  	debugf4("\tchannel = %p\n", chan);
>>  	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
>>  	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
>> -	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
>> -	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
>> -	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
>> +	debugf4("\tchannel->dimm = %p\n", chan->dimm);
>> +}
>> +
>> +static void edac_mc_dump_dimm(struct dimm_info *dimm)
>> +{
>> +	int i;
>> +
>> +	debugf4("\tdimm = %p\n", dimm);
>> +	debugf4("\tdimm->label = '%s'\n", dimm->label);
>> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>> +	debugf4("\tdimm location ");
>> +	for (i = 0; i < dimm->mci->n_layers; i++) {
>> +		printk(KERN_CONT "%d", dimm->location[i]);
>> +		if (i < dimm->mci->n_layers - 1)
>> +			printk(KERN_CONT ".");
>> +	}
>> +	printk(KERN_CONT "\n");
> 
> This looks hacky but I don't have a good suggestion what to do instead
> here. Maybe snprintf into a complete string which you can issue with
> debugf4()...

This is not hacky. Several places in the kernel use loops like this one.
Look, for example, at lib/hexdump.c (without KERN_CONT, as that macro was
added later, probably to avoid checkpatch.pl complaints).
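
For comparison, the snprintf() alternative suggested above could look
roughly like the userspace sketch below. It is only an illustration: the
helper name format_location() is hypothetical, and it takes a plain
array of positions instead of a struct dimm_info.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): join the per-layer
 * positions with '.' into buf, e.g. {1, 0, 2} -> "1.0.2".
 * Returns the length of the resulting string; the output is always
 * NUL-terminated and is truncated if buf is too small.
 */
static size_t format_location(char *buf, size_t len,
			      const unsigned *location, unsigned n_layers)
{
	size_t used = 0;
	unsigned i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n_layers; i++) {
		int n = snprintf(buf + used, len - used, "%s%u",
				 i ? "." : "", location[i]);
		if (n < 0)
			break;
		if ((size_t)n >= len - used) {
			used = len - 1;	/* output was truncated */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

The resulting string could then be printed with a single debugf4() call,
at the cost of an extra buffer on the stack.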

>> +	debugf4("\tdimm->grain = %d\n", dimm->grain);
>> +	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>>  }
>>  
>>  static void edac_mc_dump_csrow(struct csrow_info *csrow)
>> @@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
>>  	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
>>  	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
>>  		mci->nr_csrows, mci->csrows);
>> +	debugf3("\tmci->nr_dimms = %d, dimns = %p\n",
> 
> 		      ->tot_dimms      dimms

Fixed.

> 
>> +		mci->tot_dimms, mci->dimms);
>>  	debugf3("\tdev = %p\n", mci->dev);
>>  	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
>>  	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
>> @@ -157,10 +175,25 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>>  }
>>  
>>  /**
>> - * edac_mc_alloc: Allocate a struct mem_ctl_info structure
>> - * @size_pvt:	size of private storage needed
>> - * @nr_csrows:	Number of CWROWS needed for this MC
>> - * @nr_chans:	Number of channels for the MC
>> + * edac_mc_alloc: Allocate and partially fills a struct mem_ctl_info structure
> 
> 					    fill
> 
>> + * @edac_index:		Memory controller number
>> + * @n_layers:		Number of layers at the MC hierarchy
> 
> 				Number of MC hierarchy layers
> 
>> + * layers:		Describes each layer as seen by the Memory Controller
>> + * @rev_order:		Fills csrows/cs channels at the reverse order
> 
> 				      csrows/channels in reverse order
> 
>> + * @size_pvt:		size of private storage needed
>> + *
>> + *
>> + * FIXME: drivers handle multi-rank memories on different ways: on some
> 
> 						in		   in
> 
>> + * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
> 
> 			      memory stick			   in
> 
>> + * a single multi-rank DIMM would be mapped into several "dimms".
> 
> 			  memory stick
> 
>> + *
>> + * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
>> + * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
> 
> 				   csrow-based
> 
>> + * thing, as two chip select values are used for dual-rank memories (and 4, for
>> + * quad-rank ones). I suspect that this issue could be solved inside the EDAC
>> + * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
>> + *
>> + * In summary, solving this issue is not easy, as it requires a lot of testing.
>>   *
>>   * Everything is kmalloc'ed as one big chunk - more efficient.
>>   * Only can be used if all structures have the same lifetime - otherwise
>> @@ -172,18 +205,41 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
>>   *	NULL allocation failed
>>   *	struct mem_ctl_info pointer
>>   */
>> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -				unsigned nr_chans, int edac_index)
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				   unsigned n_layers,
>> +				   struct edac_mc_layer *layers,
>> +				   bool rev_order,
>> +				   unsigned sz_pvt)
> 
> strange function argument vertical alignment
> 
>>  {
>>  	void *ptr = NULL;
>>  	struct mem_ctl_info *mci;
>> -	struct csrow_info *csi, *csrow;
>> +	struct edac_mc_layer *lay;
> 
> As before, call this "layers" pls.
> 
>> +	struct csrow_info *csi, *csr;
>>  	struct rank_info *chi, *chp, *chan;
>>  	struct dimm_info *dimm;
>> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>  	void *pvt;
>> -	unsigned size;
>> -	int row, chn;
>> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +	unsigned tot_csrows, tot_cschannels;
> 
> No need to call this "tot_cschannels" - "tot_channels" should be enough.
> 
>> +	int i, j;
>>  	int err;
>> +	int row, chn;
> 
> All those local variables should be sorted in a reverse christmas tree
> order:
> 
> 	u32 this_is_the_longest_array_name[LENGTH];
> 	void *shorter_named_variable;
> 	unsigned long size;
> 	int i;
> 
> 	...

Why? Nothing in CodingStyle says how the vars should be ordered. If you
want to enforce a particular order, please do it yourself, but apply it
consistently across the entire subsystem.

> 
>> +
>> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);
> 
> 
> Push this BUG_ON up into edac_mc_alloc as the first thing this function
> does.

It is already the first thing in the function.

> Also, is it valid to have n_layers == 0? The memcpy call below
> will do nothing.

Changed to:
	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);

>> +	/*
>> +	 * Calculate the total amount of dimms and csrows/cschannels while
>> +	 * in the old API emulation mode
>> +	 */
>> +	tot_dimms = 1;
>> +	tot_cschannels = 1;
>> +	tot_csrows = 1;
> 
> Those initializations can be done above at variable declaration time.

Yes, but the compiled code will be the same anyway, as gcc will optimize
it, either by using registers for those vars or by moving the initialization
to the top of the function.

This function is too complex, so it is better to initialize those vars
just before the loops that are calculating those totals.
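
To make the totals concrete, here is a small userspace sketch of the same
multiplication. The struct and function names are made up for the
example, and only the two layer fields used by the loop are modelled.

```c
#include <stdbool.h>

/* Simplified stand-in for struct edac_mc_layer: only the fields used here. */
struct layer_sketch {
	unsigned size;		/* number of elements in this layer */
	bool is_virt_csrow;	/* layer maps onto the legacy csrow index */
};

struct totals_sketch {
	unsigned dimms;
	unsigned csrows;
	unsigned channels;
};

/* Hypothetical helper mirroring the totals loop in the patch. */
static struct totals_sketch count_totals(const struct layer_sketch *layers,
					 unsigned n_layers)
{
	struct totals_sketch t = { 1, 1, 1 };
	unsigned i;

	for (i = 0; i < n_layers; i++) {
		/* every layer multiplies the number of dimm entries */
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}
	return t;
}
```

Assuming a two-layer layout of 8 csrows by 2 channels, which is what the
row/chan values in the log quoted at the top of this mail suggest, this
gives 16 dimm entries, i.e. the dimm0-dimm15 range shown there.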

> 
>> +	for (i = 0; i < n_layers; i++) {
>> +		tot_dimms *= layers[i].size;
>> +		if (layers[i].is_virt_csrow)
>> +			tot_csrows *= layers[i].size;
>> +		else
>> +			tot_cschannels *= layers[i].size;
>> +	}
>>  
>>  	/* Figure out the offsets of the various items from the start of an mc
>>  	 * structure.  We want the alignment of each item to be at least as
>> @@ -191,12 +247,21 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>>  	 * hardcode everything into a single struct.
>>  	 */
>>  	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
>> -	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
>> -	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
>> -	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
>> +	lay = edac_align_ptr(&ptr, sizeof(*lay), n_layers);
>> +	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
>> +	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_cschannels);
>> +	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
>> +	count = 1;
> 
> ditto.

Ditto: let gcc optimize it.

Spreading the 'count' math code around will only make it harder for a
reviewer to actually see what's there.

In the generated assembly, count will likely be kept in a register anyway.

<removed the rest of the email, as there aren't any comments after that point>

The patch with all the fixes is enclosed.

--

[PATCH EDACv17] edac: Change internal representation to work with layers

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view. Instead, they view memories as DIMMs rather than ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks of the same DIMM.
This bug has been present since the beginning of EDAC, so it is not a big
deal. However, in several drivers it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

---

v17: Several cosmetic changes.

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..b2dfdf5 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,78 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page,
+				     unsigned long syndrome, int row, int channel,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page, int row,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
 
 /*
  * edac_device APIs
@@ -496,6 +555,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..a9f7650 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,21 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of MC hierarchy layers
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/channels in reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -168,22 +197,55 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory stick is mapped as one entry, while, in
+ * others, a single multi-rank memory stick would be mapped into several
+ * entries. Currently, this function will allocate multiple struct dimm_info
+ * on such scenarios, as grouping the multiple ranks require drivers change.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				       unsigned n_layers,
+				       struct edac_mc_layer *layers,
+				       bool rev_order,
+				       unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *layer;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_channels, tot_errcount = 0;
+	int i, j;
 	int err;
+	int row, chn;
+	bool per_rank = false;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_channels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_channels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +253,28 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	layer = edac_align_ptr(&ptr, sizeof(*layer), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_channels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
+	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_channels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +282,101 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	layer = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)layer));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = layer;
+	memcpy(mci->layers, layers, sizeof(*layer) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_channels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_channels;
+		chp = &chi[row * tot_channels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_channels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(layer, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_virt_csrow)
+					break;
+			chn++;
+			if (chn == tot_channels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_virt_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +400,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels at the reverse order
+ * @size_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories on different ways: on some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, on others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the csrow-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +716,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +742,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +901,251 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning it, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called as "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * on seeking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled dimm's.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -541,13 +549,18 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
+
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
 
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +575,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +596,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes

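For readers following the per-layer error counters in the patch above: the index arithmetic used by edac_increment_ce_error() can be sketched in userspace as below. This is only an illustration. The helper name layer_index(), the MAX_LAYERS constant and the plain arrays are assumptions made for the example and are not part of the patch; the patch computes the same value incrementally inside its loop.

```c
#include <stddef.h>

#define MAX_LAYERS 3	/* assumption: stands in for EDAC_MAX_LAYERS */

/*
 * Hypothetical helper, not kernel code.
 *
 * Return the flat counter index for layer `depth`, given the size of
 * each layer and the position reported for the error. The counter array
 * for layer N holds size[0] * ... * size[N] entries, so the index is the
 * row-major offset of pos[0..N] within those dimensions.
 *
 * Returns -1 when the position is unknown (negative) or out of range at
 * or above `depth`, which is where the patch stops incrementing.
 */
static long layer_index(const unsigned *sizes, const int *pos, unsigned depth)
{
	long index = 0;
	unsigned i;

	if (depth >= MAX_LAYERS)
		return -1;

	for (i = 0; i <= depth; i++) {
		if (pos[i] < 0 || (unsigned)pos[i] >= sizes[i])
			return -1;
		index = index * sizes[i] + pos[i];
	}
	return index;
}
```

With layer sizes {2, 3, 4} and an error at position (1, 2, 3), the three per-layer counters would be indexed at 1, 5 and 23. Note that the position has to be a signed type for the "unknown" value of -1 to be detectable.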
* RE: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 17:52                                   ` Mauro Carvalho Chehab
  (?)
@ 2012-04-27 18:11                                   ` Luck, Tony
  2012-04-27 19:24                                     ` Mauro Carvalho Chehab
  -1 siblings, 1 reply; 206+ messages in thread
From: Luck, Tony @ 2012-04-27 18:11 UTC (permalink / raw)
  To: Mauro Carvalho Chehab, Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Gross, Mark, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

>>> +	for (i = 0; i < dimm->mci->n_layers; i++) {
>>> +		printk(KERN_CONT "%d", dimm->location[i]);
>>> +		if (i < dimm->mci->n_layers - 1)
>>> +			printk(KERN_CONT ".");
>>> +	}
>>> +	printk(KERN_CONT "\n");
>>
>> This looks hacky but I don't have a good suggestion what to do instead
>> here. Maybe snprintf into a complete string which you can issue with
>> debugf4()...
>
> This is not hacky. There are several places at the Kernel doing loops like
> that. Look, for example, at lib/hexdump.c (without KERN_CONT, as this
> macro was added later - probably to avoid checkpatch.pl complains).

There is some benefit to "one printk == one output line" ... it means
that console output will not be (as) jumbled if multiple cpus are
printk'ing at the same time.

-Tony
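
The snprintf() idea raised in the quoted review could look roughly like the sketch below. It is plain userspace C rather than code from the patch: format_location(), its signature and the buffer handling are assumptions made for illustration. In the kernel, the finished string would then be handed to a single debugf4() or printk() call, so the whole location reaches the log as one line.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical helper (not part of the patch): join the per-layer
 * location values with '.' into buf, so that the caller can emit one
 * complete line with a single print call.
 * Returns the number of characters stored, excluding the final NUL.
 */
static size_t format_location(char *buf, size_t len,
			      const unsigned *location, unsigned n_layers)
{
	size_t used = 0;
	unsigned i;

	if (len == 0)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n_layers; i++) {
		/* No separator before the first value. */
		int n = snprintf(buf + used, len - used, "%s%u",
				 i ? "." : "", location[i]);

		if (n < 0)
			break;			/* formatting error */
		if ((size_t)n >= len - used) {
			used = len - 1;		/* output was truncated */
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

A caller would declare a small stack buffer, call format_location(buf, sizeof(buf), dimm->location, n_layers), and print buf together with the trailing newline in one statement.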

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 18:11                                   ` Luck, Tony
@ 2012-04-27 19:24                                     ` Mauro Carvalho Chehab
  2012-04-28  8:58                                       ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-27 19:24 UTC (permalink / raw)
  To: Luck, Tony
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Gross, Mark, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Borislav Petkov, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Em 27-04-2012 15:11, Luck, Tony escreveu:
>>>> +	for (i = 0; i < dimm->mci->n_layers; i++) {
>>>> +		printk(KERN_CONT "%d", dimm->location[i]);
>>>> +		if (i < dimm->mci->n_layers - 1)
>>>> +			printk(KERN_CONT ".");
>>>> +	}
>>>> +	printk(KERN_CONT "\n");
>>>
>>> This looks hacky but I don't have a good suggestion what to do instead
>>> here. Maybe snprintf into a complete string which you can issue with
>>> debugf4()...
>>
>> This is not hacky. There are several places at the Kernel doing loops like
>> that. Look, for example, at lib/hexdump.c (without KERN_CONT, as this
>> macro was added later - probably to avoid checkpatch.pl complains).
> 
> There is some benefit to "one printk == one output line" ... it means
> that console output will not be (as) jumbled if multiple cpus are
> printk'ing at the same time.

Ok, but this message only appears when all the conditions below are met:
	- the driver is compiled with EDAC_DEBUG;
	- the edac_core is modprobed with edac_debug_level=4;
	- during the driver modprobe, when the EDAC driver is being registered.

Even on several-core machines, those messages won't mangle, in practice.

Let's not over-design a simple debug message.

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 16:07                                     ` Mauro Carvalho Chehab
@ 2012-04-28  8:52                                       ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-28  8:52 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Joe Perches, Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
> Yes. This is a common issue at the EDAC core: on several places, it calls the
> edac debug macros (DEBUGF0...DEBUGF4) passing a __func__ as an argument, while
> the debug macros already handles that. I suspect that, in the past, the __func__
> were not at the macros, but some patch added it there, and forgot to fix the
> occurrences of its call.

The patch that added it is d357cbb445208 and you reviewed it.

> This is something that needs to be reviewed at the entire EDAC core (and likely
> at the drivers).

Looks like a job for a newbie to get her/his feet wet with kernel work.

> I opted to not touch on this at the existing debug logic, as I think that the
> better is to address all those issues on one separate patch, after fixing the
> EDAC core bugs.

No,

you simply need to remove the __func__ argument in your newly added debug call:

                debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
                        i, (dimm - mci->dimms),
                        pos[0], pos[1], pos[2], row, chn);

And while you're at it, remove the rest of the __func__ arguments from
your newly added debugfX calls.
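
The effect of the redundant argument can be reproduced outside the kernel. The sketch below is an assumption-laden stand-in, not the EDAC definition: debugf_sketch() only imitates a debug macro that already prefixes __func__, and it formats into a caller-supplied buffer so the result can be inspected.

```c
#include <stdio.h>

/*
 * Stand-in for the EDAC debug macros (an assumption, not the kernel's
 * definition): like them, it puts __func__ in front of every message.
 */
#define debugf_sketch(buf, len, fmt, ...) \
	snprintf(buf, len, "%s: " fmt, __func__, __VA_ARGS__)

/* Passing __func__ again makes the function name appear twice. */
static void alloc_redundant(char *buf, size_t len)
{
	debugf_sketch(buf, len, "%s: %d: row %d, chan %d\n", __func__, 0, 0, 0);
}

/* Dropping the explicit argument leaves a single prefix. */
static void alloc_fixed(char *buf, size_t len)
{
	debugf_sketch(buf, len, "%d: row %d, chan %d\n", 0, 0, 0);
}
```

With this stand-in, alloc_redundant() produces a line that starts with the function name twice, which is the same pattern as the "new_edac_mc_alloc: new_edac_mc_alloc:" prefix in the debug log posted in this thread.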

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 19:24                                     ` Mauro Carvalho Chehab
@ 2012-04-28  8:58                                       ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-28  8:58 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Gross, Mark, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Luck, Tony, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On Fri, Apr 27, 2012 at 04:24:28PM -0300, Mauro Carvalho Chehab wrote:
> Em 27-04-2012 15:11, Luck, Tony escreveu:
> >>>> +	for (i = 0; i < dimm->mci->n_layers; i++) {
> >>>> +		printk(KERN_CONT "%d", dimm->location[i]);
> >>>> +		if (i < dimm->mci->n_layers - 1)
> >>>> +			printk(KERN_CONT ".");
> >>>> +	}
> >>>> +	printk(KERN_CONT "\n");
> >>>
> >>> This looks hacky but I don't have a good suggestion what to do instead
> >>> here. Maybe snprintf into a complete string which you can issue with
> >>> debugf4()...
> >>
> >> This is not hacky. There are several places at the Kernel doing loops like
> >> that. Look, for example, at lib/hexdump.c (without KERN_CONT, as this
> >> macro was added later - probably to avoid checkpatch.pl complains).
> > 
> > There is some benefit to "one printk == one output line" ... it means
> > that console output will not be (as) jumbled if multiple cpus are
> > printk'ing at the same time.
> 
> Ok, but this message only appears when all the conditions below are met:
> 	- the driver is compiled with EDAC_DEBUG;
> 	- the edac_core is modprobed with edac_debug_level=4;
> 	- during the driver modprobe, when the EDAC driver is being registered.

That means nothing.

> Even on several-core machines, those messages won't mangle, in practice.
> 
> Let's not over-design a simple debug message.

No, let's design a simple debug message correctly, regardless of when it
appears.

> 
> Regards,
> Mauro
> 

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 15:36                                   ` Mauro Carvalho Chehab
@ 2012-04-28  9:05                                     ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-28  9:05 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Fri, Apr 27, 2012 at 12:36:12PM -0300, Mauro Carvalho Chehab wrote:
> The fix for it were in another patch[1], as calling them as "rank" is
> needed also at the sysfs API.

No, this doesn't fix it either:

[   10.486440] EDAC MC: DCT0 chip selects:
[   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
[   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
[   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
[   10.486455] EDAC MC: DCT1 chip selects:
[   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
[   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
[   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
[   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
[   10.486467] EDAC amd64: using x8 syndromes.
[   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
[   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
[   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
[   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
[   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
[   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
[   10.486488] EDAC amd64: MCT channel count: 2
[   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
[   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
[   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
[   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
[   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
[   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
[   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
[   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
[   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
[   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
[   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
[   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
[   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
[   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
[   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
[   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
[   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1

DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.

Now your change is showing 16 ranks. Still b0rked.
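
One possible reading of the numbers, sketched in userspace C: the allocator multiplies the layer sizes, so a layout of 8 chip-select rows by 2 channels yields 16 slots whether or not they are populated, while counting the non-zero sizes from the log gives 8. The struct, the helper names and the 8x2 layout are assumptions inferred from the (row:chan:0) positions in the log, not taken from the amd64 driver.

```c
#include <stddef.h>

/* Pared-down stand-in for struct edac_mc_layer; only the size is needed. */
struct layer_sketch {
	unsigned size;
};

/* Product of all layer sizes: the number of slots that get allocated. */
static unsigned total_slots(const struct layer_sketch *layers, size_t n_layers)
{
	unsigned tot = 1;
	size_t i;

	for (i = 0; i < n_layers; i++)
		tot *= layers[i].size;
	return tot;
}

/* Number of entries with a non-zero size, i.e. populated chip selects. */
static unsigned populated(const unsigned *size_mb, size_t n)
{
	unsigned count = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (size_mb[i])
			count++;
	return count;
}

/* Chip-select sizes in MB as printed in the log, one row per DCT. */
static const unsigned dct_mb[2][8] = {
	{ 2048, 2048, 2048, 2048, 0, 0, 0, 0 },
	{ 2048, 2048, 2048, 2048, 0, 0, 0, 0 },
};

/* Assumed layout: 8 chip-select rows by 2 channels. */
static const struct layer_sketch amd64_layers[] = {
	{ 8 },
	{ 2 },
};
```

Here total_slots(amd64_layers, 2) is the product 8 * 2, and summing populated() over both rows of dct_mb counts only the chip selects that report memory.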

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-27 17:52                                   ` Mauro Carvalho Chehab
@ 2012-04-28  9:16                                     ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-28  9:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Fri, Apr 27, 2012 at 02:52:35PM -0300, Mauro Carvalho Chehab wrote:

[..]

> >> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> >> +				   unsigned n_layers,
> >> +				   struct edac_mc_layer *layers,
> >> +				   bool rev_order,
> >> +				   unsigned sz_pvt)
> > 
> > strange function argument vertical alignment
> > 
> >>  {
> >>  	void *ptr = NULL;
> >>  	struct mem_ctl_info *mci;
> >> -	struct csrow_info *csi, *csrow;
> >> +	struct edac_mc_layer *lay;
> > 
> > As before, call this "layers" pls.
> > 
> >> +	struct csrow_info *csi, *csr;
> >>  	struct rank_info *chi, *chp, *chan;
> >>  	struct dimm_info *dimm;
> >> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
> >>  	void *pvt;
> >> -	unsigned size;
> >> -	int row, chn;
> >> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> >> +	unsigned tot_csrows, tot_cschannels;
> > 
> > No need to call this "tot_cschannels" - "tot_channels" should be enough.
> > 
> >> +	int i, j;
> >>  	int err;
> >> +	int row, chn;
> > 
> > All those local variables should be sorted in a reverse christmas tree
> > order:
> > 
> > 	u32 this_is_the_longest_array_name[LENGTH];
> > 	void *shorter_named_variable;
> > 	unsigned long size;
> > 	int i;
> > 
> > 	...
> 
> Why? There's nothing at the CodingStyle saying about how the vars should
> be ordered. If you want to enforce some particular order, please do it
> yourself, but apply it consistently among the entire subsystem.

First of all, this way it is more readable. Second of all, maybe we
should hold it down in CodingStyle. Third of all, you touch this code so you
could fix it up to be more readable while you're at it.

> >> +
> >> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);
> > 
> > 
> > Push this BUG_ON up into edac_mc_alloc as the first thing this function
> > does.
> 
> It is already the first thing at the function.

Ah, that happens later in the patchset where you rename it back to
edac_mc_alloc, ok.

> > Also, is it valid to have n_layers == 0? The memcpy call below
> > will do nothing.
> 
> Changed to:
> 	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);

Really? Look below.

> 
> >> +	/*
> >> +	 * Calculate the total amount of dimms and csrows/cschannels while
> >> +	 * in the old API emulation mode
> >> +	 */
> >> +	tot_dimms = 1;
> >> +	tot_cschannels = 1;
> >> +	tot_csrows = 1;
> > 
> > Those initializations can be done above at variable declaration time.
> 
> Yes, but the compiled code will be the same anyway, as gcc will optimize

Hey, are you looking at compiled code or at source code? Because I'm
looking at source code, and it is a pretty safe bet the majority of the
people here do that too.

> it, either by using registers for those vars or by moving the initialization
> to the top of the function.
> 
> This function is too complex, so it is better to initialize those vars
> just before the loops that are calculating those totals.

Simply initialize those variables at declaration time and that's it.
Initializing them before the loop doesn't make the function less complex
- splitting it and sanitizing it does.

> > 
> >> +	for (i = 0; i < n_layers; i++) {
> >> +		tot_dimms *= layers[i].size;
> >> +		if (layers[i].is_virt_csrow)
> >> +			tot_csrows *= layers[i].size;
> >> +		else
> >> +			tot_cschannels *= layers[i].size;
> >> +	}

[..]

> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
> -				unsigned nr_chans, int edac_index)
> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +				       unsigned n_layers,
> +				       struct edac_mc_layer *layers,
> +				       bool rev_order,
> +				       unsigned sz_pvt)
>  {
>  	void *ptr = NULL;
>  	struct mem_ctl_info *mci;
> -	struct csrow_info *csi, *csrow;
> +	struct edac_mc_layer *layer;
> +	struct csrow_info *csi, *csr;
>  	struct rank_info *chi, *chp, *chan;
>  	struct dimm_info *dimm;
> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>  	void *pvt;
> -	unsigned size;
> -	int row, chn;
> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> +	unsigned tot_csrows, tot_channels, tot_errcount = 0;
> +	int i, j;
>  	int err;
> +	int row, chn;
> +	bool per_rank = false;
> +
> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);

	^^^^^^
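
For reference, the condition promised in the earlier reply covers two cases, an empty hierarchy and one with too many layers. A minimal userspace sketch follows; it returns an error code instead of calling BUG_ON(), and both check_n_layers() and the limit of 3 are assumptions made for illustration (the thread itself only shows three positions, pos[0] to pos[2]).

```c
#include <errno.h>

/* Assumed limit for illustration only. */
#define EDAC_MAX_LAYERS_SKETCH 3

/*
 * Hypothetical check mirroring the proposed condition
 * "n_layers > EDAC_MAX_LAYERS || n_layers == 0".
 * Returns 0 when n_layers is usable, -EINVAL otherwise.
 */
static int check_n_layers(unsigned n_layers)
{
	if (n_layers == 0 || n_layers > EDAC_MAX_LAYERS_SKETCH)
		return -EINVAL;
	return 0;
}
```

The reposted hunk quoted above still tests only the upper bound, so a call with n_layers == 0 would pass it, whereas the sketch rejects that case.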

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28  9:16                                     ` Borislav Petkov
@ 2012-04-28 17:07                                       ` Joe Perches
  -1 siblings, 0 replies; 206+ messages in thread
From: Joe Perches @ 2012-04-28 17:07 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Sat, 2012-04-28 at 11:16 +0200, Borislav Petkov wrote:
> On Fri, Apr 27, 2012 at 02:52:35PM -0300, Mauro Carvalho Chehab wrote:
> > > All those local variables should be sorted in a reverse christmas tree
> > > order:
> > > 
> > > 	u32 this_is_the_longest_array_name[LENGTH];
> > > 	void *shorter_named_variable;
> > > 	unsigned long size;
> > > 	int i;
> > > 
> > > 	...
> > 
> > Why? There's nothing at the CodingStyle saying about how the vars should
> > be ordered. If you want to enforce some particular order, please do it
> > yourself, but apply it consistently among the entire subsystem.
> 
> First of all, this way it is more readable.

Not in my opinion, and blindly using "reverse christmas tree"
can separate variables that should be declared together.



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28  8:52                                       ` Borislav Petkov
@ 2012-04-28 20:38                                         ` Joe Perches
  -1 siblings, 0 replies; 206+ messages in thread
From: Joe Perches @ 2012-04-28 20:38 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Ranganathan Desikan,
	Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Sat, 2012-04-28 at 10:52 +0200, Borislav Petkov wrote:
> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
> > Yes. This is a common issue at the EDAC core: on several places, it calls the
> > edac debug macros (DEBUGF0...DEBUGF4) passing a __func__ as an argument, while
> > the debug macros already handles that. I suspect that, in the past, the __func__
> > were not at the macros, but some patch added it there, and forgot to fix the
> > occurrences of its call.
> 
> The patch that added it is d357cbb445208 and you reviewed it.
> 
> > This is something that needs to be reviewed at the entire EDAC core (and likely
> > at the drivers).
> 
> Looks like a job for a newbie to get her/his feet wet with kernel work.

Looks to me more like a lazy maintainer/developer who doesn't
want to bother with a few minutes work.
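
For reference, the redundancy under discussion shows up in log lines such as
"new_edac_mc_alloc: new_edac_mc_alloc(): allocating ...", where the function
name is printed twice. Below is a minimal user-space sketch of how that can
happen. It assumes a debug macro that already prepends __func__; the names
debug_log(), debugf(), log_buf, alloc_redundant(), alloc_fixed() and
count_occurrences() are hypothetical and are not the real EDAC debugfN macros.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical capture buffer so the result can be inspected. */
static char log_buf[256];

/* Hypothetical helper: formats "EDAC DEBUG: <func>: <message>" into log_buf. */
static void debug_log(const char *func, const char *fmt, ...)
{
	va_list ap;
	int n;

	n = snprintf(log_buf, sizeof(log_buf), "EDAC DEBUG: %s: ", func);
	assert(n >= 0 && (size_t)n < sizeof(log_buf));

	va_start(ap, fmt);
	vsnprintf(log_buf + n, sizeof(log_buf) - (size_t)n, fmt, ap);
	va_end(ap);
}

/* The macro itself already supplies the caller's name. */
#define debugf(...) debug_log(__func__, __VA_ARGS__)

/* Caller that passes __func__ again: the name ends up in the line twice. */
static const char *alloc_redundant(void)
{
	debugf("%s(): allocating %u bytes", __func__, 3692u);
	return log_buf;
}

/* Caller that relies on the macro: the name appears once. */
static const char *alloc_fixed(void)
{
	debugf("allocating %u bytes", 3692u);
	return log_buf;
}

/* Counts non-overlapping occurrences of needle in s. */
static unsigned count_occurrences(const char *s, const char *needle)
{
	size_t len = strlen(needle);
	unsigned n = 0;

	if (!len)
		return 0;
	for (s = strstr(s, needle); s; s = strstr(s + len, needle))
		n++;
	return n;
}
```

With this sketch, alloc_redundant() yields a line carrying its own name twice,
while alloc_fixed() yields it once. The cleanup discussed above amounts to
turning callers of the first kind into the second.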



^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28  9:05                                     ` Borislav Petkov
@ 2012-04-29 13:49                                       ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 13:49 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 28-04-2012 06:05, Borislav Petkov escreveu:
> On Fri, Apr 27, 2012 at 12:36:12PM -0300, Mauro Carvalho Chehab wrote:
>> The fix for it were in another patch[1], as calling them as "rank" is
>> needed also at the sysfs API.
> 
> No, this doesn't fix it either:
> 
> [   10.486440] EDAC MC: DCT0 chip selects:
> [   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
> [   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
> [   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
> [   10.486455] EDAC MC: DCT1 chip selects:
> [   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> [   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> [   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
> [   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
> [   10.486467] EDAC amd64: using x8 syndromes.
> [   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
> [   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
> [   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
> [   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
> [   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
> [   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
> [   10.486488] EDAC amd64: MCT channel count: 2
> [   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
> [   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
> [   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
> [   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
> [   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
> [   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
> [   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
> [   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
> [   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
> [   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
> [   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
> [   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
> [   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
> [   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
> [   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
> [   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
> [   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1
> 
> DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
> 
> Now your change is showing 16 ranks. Still b0rked.
> 
No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.

As I said when you first pointed this out (likely at the v3 review), edac_mc_alloc
doesn't know how many ranks are filled: the driver logic first calls it to
allocate the maximum number of ranks, and then fills each rank with its info
(or leaves it untouched, with 0 pages, if it is empty).
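
To make the allocate-then-fill order concrete, here is a minimal user-space
sketch. It is not the EDAC code: struct rank, struct mc and the mc_*() helpers
are hypothetical stand-ins, and the only assumption carried over from the
discussion is that an empty rank is one left with 0 pages.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-ins; not the real EDAC structures. */
struct rank {
	unsigned long nr_pages;	/* 0 means the slot is empty */
};

struct mc {
	unsigned n_ranks;	/* slots allocated (the controller's maximum) */
	struct rank *ranks;
};

/* Step 1: allocate every slot the controller could address, zeroed. */
static struct mc *mc_alloc(unsigned max_ranks)
{
	struct mc *mc = calloc(1, sizeof(*mc));

	if (!mc)
		return NULL;
	mc->ranks = calloc(max_ranks, sizeof(*mc->ranks));
	if (!mc->ranks) {
		free(mc);
		return NULL;
	}
	mc->n_ranks = max_ranks;
	return mc;
}

/* Step 2: the driver fills in only the slots it finds populated. */
static void mc_fill_rank(struct mc *mc, unsigned idx, unsigned long nr_pages)
{
	assert(idx < mc->n_ranks);
	mc->ranks[idx].nr_pages = nr_pages;
}

/* Populated ranks are the ones with a non-zero page count. */
static unsigned mc_filled_ranks(const struct mc *mc)
{
	unsigned i, filled = 0;

	for (i = 0; i < mc->n_ranks; i++)
		if (mc->ranks[i].nr_pages)
			filled++;
	return filled;
}

static void mc_free(struct mc *mc)
{
	if (!mc)
		return;
	free(mc->ranks);
	free(mc);
}
```

Calling mc_alloc(16) and then mc_fill_rank() on 8 of the slots leaves
n_ranks at 16 while mc_filled_ranks() reports 8, which mirrors the
"16 ranks, 8 filled and 8 empty" situation described above.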

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28 17:07                                       ` Joe Perches
@ 2012-04-29 14:02                                         ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 14:02 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

Em 28-04-2012 14:07, Joe Perches escreveu:
> On Sat, 2012-04-28 at 11:16 +0200, Borislav Petkov wrote:
>> On Fri, Apr 27, 2012 at 02:52:35PM -0300, Mauro Carvalho Chehab wrote:
>>>> All those local variables should be sorted in a reverse christmas tree
>>>> order:
>>>>
>>>> 	u32 this_is_the_longest_array_name[LENGTH];
>>>> 	void *shorter_named_variable;
>>>> 	unsigned long size;
>>>> 	int i;
>>>>
>>>> 	...
>>>
>>> Why? There's nothing at the CodingStyle saying about how the vars should
>>> be ordered. If you want to enforce some particular order, please do it
>>> yourself, but apply it consistently among the entire subsystem.
>>
>> First of all, this way it is more readable.
> 
> Not in my opinion, and blindly using "reverse christmas tree"
> can separate variables that should be declared together.

I agree with Joe. The order won't make the code easier or harder to
read, nor would it improve code performance.

>> Second of all, maybe we should hold it down in CodingStyle.

Different developers have different opinions about how to order includes, 
functions, vars, etc. 

So, this is not in CodingStyle because there's no consensus about it, and
because it is not relevant for understanding the code.

A reviewer should not reject a patch just because he doesn't like the
order that the developer used.

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28  9:16                                     ` Borislav Petkov
@ 2012-04-29 14:16                                       ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 14:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 28-04-2012 06:16, Borislav Petkov escreveu:
> On Fri, Apr 27, 2012 at 02:52:35PM -0300, Mauro Carvalho Chehab wrote:
> 

>>> Also, is it valid to have n_layers == 0? The memcpy call below
>>> will do nothing.
>>
>> Changed to:
>> 	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
> 
> Really? Look below.

Weird, not sure what happened here... it seems I sent the version before
this change.

The patch I've made has the correct BUG_ON at:
	http://git.infradead.org/users/mchehab/edac.git/commitdiff/447b7929e633027ffe131f2f8f246bba5690cee7

> 
>>
>>>> +	/*
>>>> +	 * Calculate the total amount of dimms and csrows/cschannels while
>>>> +	 * in the old API emulation mode
>>>> +	 */
>>>> +	tot_dimms = 1;
>>>> +	tot_cschannels = 1;
>>>> +	tot_csrows = 1;
>>>
>>> Those initializations can be done above at variable declaration time.
>>
>> Yes, but the compiled code will be the same anyway, as gcc will optimize
> 
> Hey, are you looking at compiled code or at source code? Because I'm
> looking at source code, and it is a pretty safe bet the majority of the
> people here do that too.

What I said is that, from the source code POV, code where the loop variables
are initialized just before the loop is easier to read than code where the
initialization of those vars is in another part of the function.

That's basically why the "for" syntax starts with a var initialization clause.

The tot_dimms & friends are loop vars: their value is calculated within the loop.

In the object code, this won't make any difference.
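
As a standalone illustration of that point, here is a user-space sketch of the
totals calculation with the accumulators seeded immediately before the loop
that multiplies into them. It is a simplified sketch, not the patch itself:
struct layer, struct totals, MAX_LAYERS and count_totals() are hypothetical
stand-ins for struct edac_mc_layer, the tot_* locals and EDAC_MAX_LAYERS, and
assert() stands in for BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/* Simplified stand-in for struct edac_mc_layer: only the fields used here. */
struct layer {
	unsigned size;		/* number of elements at this layer */
	bool is_virt_csrow;	/* layer maps onto the legacy "csrow" axis */
};

struct totals {
	unsigned dimms;
	unsigned csrows;
	unsigned channels;
};

static struct totals count_totals(const struct layer *layers, unsigned n_layers)
{
	struct totals t;
	unsigned i;

	/* User-space analogue of the BUG_ON() range check. */
	assert(n_layers > 0 && n_layers <= MAX_LAYERS);

	/* Accumulators are seeded right before the loop that uses them. */
	t.dimms = 1;
	t.csrows = 1;
	t.channels = 1;
	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}

	return t;
}
```

For a two-layer description with 8 virtual csrows and 2 channels, this gives
16 dimm entries, 8 csrows and 2 channels, consistent with the 16 ranks in the
amd64 log quoted earlier in the thread.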

> 
>> it, either by using registers for those vars or by moving the initialization
>> to the top of the function.
>>
>> This function is too complex, so it is better to initialize those vars
>> just before the loops that are calculating those totals.
> 
> Simply initialize those variables at declaration time and that's it.
> Initializing them before the loop doesn't make the function less complex
> - splitting it and sanitizing it does.

Initializing loop-calculated vars just before the loop makes the code easier
to read, and may avoid issues that might happen during code lifecycle.
> 
>>>
>>>> +	for (i = 0; i < n_layers; i++) {
>>>> +		tot_dimms *= layers[i].size;
>>>> +		if (layers[i].is_virt_csrow)
>>>> +			tot_csrows *= layers[i].size;
>>>> +		else
>>>> +			tot_cschannels *= layers[i].size;
>>>> +	}
> 
> [..]
> 
>> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -				unsigned nr_chans, int edac_index)
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				       unsigned n_layers,
>> +				       struct edac_mc_layer *layers,
>> +				       bool rev_order,
>> +				       unsigned sz_pvt)
>>  {
>>  	void *ptr = NULL;
>>  	struct mem_ctl_info *mci;
>> -	struct csrow_info *csi, *csrow;
>> +	struct edac_mc_layer *layer;
>> +	struct csrow_info *csi, *csr;
>>  	struct rank_info *chi, *chp, *chan;
>>  	struct dimm_info *dimm;
>> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>  	void *pvt;
>> -	unsigned size;
>> -	int row, chn;
>> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +	unsigned tot_csrows, tot_channels, tot_errcount = 0;
>> +	int i, j;
>>  	int err;
>> +	int row, chn;
>> +	bool per_rank = false;
>> +
>> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);
> 
> 	^^^^^^
> 

Let me re-send it, with the right BUG_ON there.

commit 447b7929e633027ffe131f2f8f246bba5690cee7
Author: Mauro Carvalho Chehab <mchehab@redhat.com>
Date:   Wed Apr 18 15:20:50 2012 -0300

    edac: Change internal representation to work with layers
    
    Change the EDAC internal representation to work with non-csrow
    based memory controllers.
    
    There are lots of those memory controllers nowadays, and more
    are coming. So, the EDAC internal representation needs to be
    changed, in order to work with those memory controllers, while
    preserving backward compatibility with the old ones.
    
    The edac core was written with the idea that memory controllers
    are able to directly access csrows.
    
    This is not true for FB-DIMM and RAMBUS memory controllers.
    
    Also, some recent advanced memory controllers don't present a per-csrows
    view. Instead, they view memories as DIMMs, instead of ranks.
    
    So, change the allocation and error report routines to allow
    them to work with all types of architectures.
    
    This will allow the removal of several hacks with FB-DIMM and RAMBUS
    memory controllers.
    
    Also, several tests were done on different platforms using different
    x86 drivers.
    
    TODO: multi-rank DIMMs are currently represented by multiple DIMM
    entries in struct dimm_info. That means that changing a label for one
    rank won't change the same label for the other ranks at the same DIMM.
    This bug is present since the beginning of the EDAC, so it is not a big
    deal. However, on several drivers, it is possible to fix this issue, but
    it should be a per-driver fix, as the csrow => DIMM arrangement may not
    be equal for all. So, don't try to fix it here yet.
    
    I tried to make this patch as short as possible, preceding it with
    several other patches that simplified the logic here. Yet, as the
    internal API changes, all drivers need changes. The changes are
    generally bigger in the drivers for FB-DIMMs.
    
    Cc: Aristeu Rozanski <arozansk@redhat.com>
    Cc: Doug Thompson <norsk5@yahoo.com>
    Cc: Borislav Petkov <borislav.petkov@amd.com>
    Cc: Mark Gross <mark.gross@intel.com>
    Cc: Jason Uhlenkott <juhlenko@akamai.com>
    Cc: Tim Small <tim@buttersideup.com>
    Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
    Cc: "Arvind R." <arvino55@gmail.com>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Egor Martovetsky <egor@pasemi.com>
    Cc: Chris Metcalf <cmetcalf@tilera.com>
    Cc: Michal Marek <mmarek@suse.cz>
    Cc: Jiri Kosina <jkosina@suse.cz>
    Cc: Joe Perches <joe@perches.com>
    Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Hitoshi Mitake <h.mitake@gmail.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
    Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..b2dfdf5 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,78 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page,
+				     unsigned long syndrome, int row, int channel,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page, int row,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
 
 /*
  * edac_device APIs
@@ -496,6 +555,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..d837266 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->nr_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,21 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of MC hierarchy layers
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/channels at the reverse order
+ * @size_pvt:		size of private storage needed
+ *
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the CSROWS-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -168,22 +197,55 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory stick is mapped as one entry, while, in
+ * others, a single multi-rank memory stick would be mapped into several
+ * entries. Currently, this function will allocate multiple struct dimm_info
+ * on such scenarios, as grouping the multiple ranks require drivers change.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				       unsigned n_layers,
+				       struct edac_mc_layer *layers,
+				       bool rev_order,
+				       unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *layer;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_channels, tot_errcount = 0;
+	int i, j;
 	int err;
+	int row, chn;
+	bool per_rank = false;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_channels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_channels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +253,28 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	layer = edac_align_ptr(&ptr, sizeof(*layer), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_channels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
+	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_channels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +282,101 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	layer = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)layer));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = layer;
+	memcpy(mci->layers, layers, sizeof(*layer) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_channels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_channels;
+		chp = &chi[row * tot_channels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_channels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(layer, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_virt_csrow)
+					break;
+			chn++;
+			if (chn == tot_channels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_virt_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +400,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels in reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, in others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the csrow-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study at JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +716,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +742,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +901,251 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning it, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no sense
+		 * in seeking the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -541,13 +549,18 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
+
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
 
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +575,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access these fields directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +596,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes



* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-29 14:16                                       ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 14:16 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On 28-04-2012 06:16, Borislav Petkov wrote:
> On Fri, Apr 27, 2012 at 02:52:35PM -0300, Mauro Carvalho Chehab wrote:
> 

>>> Also, is it valid to have n_layers == 0? The memcpy call below
>>> will do nothing.
>>
>> Changed to:
>> 	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
> 
> Really? Look below.

Weird, not sure what happened here... it seems I sent the version before
this change.

The patch I've made has the correct BUG_ON at:
	http://git.infradead.org/users/mchehab/edac.git/commitdiff/447b7929e633027ffe131f2f8f246bba5690cee7

> 
>>
>>>> +	/*
>>>> +	 * Calculate the total amount of dimms and csrows/cschannels while
>>>> +	 * in the old API emulation mode
>>>> +	 */
>>>> +	tot_dimms = 1;
>>>> +	tot_cschannels = 1;
>>>> +	tot_csrows = 1;
>>>
>>> Those initializations can be done above at variable declaration time.
>>
>> Yes, but the compiled code will be the same anyway, as gcc will optimize
> 
> Hey, are you looking at compiled code or at source code? Because I'm
> looking at source code, and it is a pretty safe bet the majority of the
> people here do that too.

What I said is that, from the source code POV, code where the loop variables
are initialized just before the loop is easier to read than code where those
vars are initialized in another part of the function.

That's basically why the "for" syntax starts with a var initialization clause.

The tot_dimms & friends are loop vars: their value is calculated within the loop.

In the object code, this makes no difference.
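
To make the point concrete, here is a minimal userspace sketch of the totals
calculation, with the products initialized right before the loop that
accumulates them. It is not the kernel code: struct layer, struct totals,
count_totals() and MAX_LAYERS are hypothetical stand-ins for struct
edac_mc_layer and EDAC_MAX_LAYERS, and assert() merely plays the role of
BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* hypothetical stand-in for EDAC_MAX_LAYERS */

/* Hypothetical, simplified stand-in for struct edac_mc_layer */
struct layer {
	unsigned size;
	bool is_virt_csrow;
};

struct totals {
	unsigned dimms, csrows, channels;
};

static struct totals count_totals(const struct layer *layers,
				  unsigned n_layers)
{
	struct totals t;
	unsigned i;

	/* Userspace analogue of the BUG_ON() sanity check */
	assert(n_layers > 0 && n_layers <= MAX_LAYERS);

	/* The products start at 1, set right where the loop consumes them */
	t.dimms = 1;
	t.csrows = 1;
	t.channels = 1;
	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}
	return t;
}
```

With a 4-csrow by 2-channel layout, this would give 8 dimms, 4 csrows and
2 channels, which is what the old two-argument API would have produced.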

> 
>> it, either by using registers for those vars or by moving the initialization
>> to the top of the function.
>>
>> This function is too complex, so it is better to initialize those vars
>> just before the loops that are calculating those totals.
> 
> Simply initialize those variables at declaration time and that's it.
> Initializing them before the loop doesn't make the function less complex
> - splitting it and sanitizing it does.

Initializing loop-calculated vars just before the loop makes the code easier
to read, and may avoid issues that could arise over the code's lifetime.
> 
>>>
>>>> +	for (i = 0; i < n_layers; i++) {
>>>> +		tot_dimms *= layers[i].size;
>>>> +		if (layers[i].is_virt_csrow)
>>>> +			tot_csrows *= layers[i].size;
>>>> +		else
>>>> +			tot_cschannels *= layers[i].size;
>>>> +	}
> 
> [..]
> 
>> -struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
>> -				unsigned nr_chans, int edac_index)
>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +				       unsigned n_layers,
>> +				       struct edac_mc_layer *layers,
>> +				       bool rev_order,
>> +				       unsigned sz_pvt)
>>  {
>>  	void *ptr = NULL;
>>  	struct mem_ctl_info *mci;
>> -	struct csrow_info *csi, *csrow;
>> +	struct edac_mc_layer *layer;
>> +	struct csrow_info *csi, *csr;
>>  	struct rank_info *chi, *chp, *chan;
>>  	struct dimm_info *dimm;
>> +	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>  	void *pvt;
>> -	unsigned size;
>> -	int row, chn;
>> +	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +	unsigned tot_csrows, tot_channels, tot_errcount = 0;
>> +	int i, j;
>>  	int err;
>> +	int row, chn;
>> +	bool per_rank = false;
>> +
>> +	BUG_ON(n_layers > EDAC_MAX_LAYERS);
> 
> 	^^^^^^
> 

Let me re-send it, with the right BUG_ON there.

commit 447b7929e633027ffe131f2f8f246bba5690cee7
Author: Mauro Carvalho Chehab <mchehab@redhat.com>
Date:   Wed Apr 18 15:20:50 2012 -0300

    edac: Change internal representation to work with layers
    
    Change the EDAC internal representation to work with non-csrow
    based memory controllers.
    
    There are lots of those memory controllers nowadays, and more
    are coming. So, the EDAC internal representation needs to be
    changed, in order to work with those memory controllers, while
    preserving backward compatibility with the old ones.
    
    The edac core was written with the idea that memory controllers
    are able to directly access csrows.
    
    This is not true for FB-DIMM and RAMBUS memory controllers.
    
    Also, some recent advanced memory controllers don't present a per-csrows
    view. Instead, they view memories as DIMMs, instead of ranks.
    
    So, change the allocation and error report routines to allow
    them to work with all types of architectures.
    
    This will allow the removal of several hacks with FB-DIMM and RAMBUS
    memory controllers.
    
    Also, several tests were done on different platforms using different
    x86 drivers.
    
    TODO: multi-rank DIMMs are currently represented by multiple DIMM
    entries in struct dimm_info. That means that changing a label for one
    rank won't change the same label for the other ranks of the same DIMM.
    This bug has been present since the beginning of EDAC, so it is not a
    big deal. However, on several drivers, it is possible to fix this issue,
    but it should be a per-driver fix, as the csrow => DIMM arrangement may
    not be the same for all of them. So, don't try to fix it here yet.
    
    I tried to make this patch as short as possible, preceding it with
    several other patches that simplified the logic here. Yet, as the
    internal API changes, all drivers need changes. The changes are
    generally bigger in the drivers for FB-DIMMs.
    
    Cc: Aristeu Rozanski <arozansk@redhat.com>
    Cc: Doug Thompson <norsk5@yahoo.com>
    Cc: Borislav Petkov <borislav.petkov@amd.com>
    Cc: Mark Gross <mark.gross@intel.com>
    Cc: Jason Uhlenkott <juhlenko@akamai.com>
    Cc: Tim Small <tim@buttersideup.com>
    Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
    Cc: "Arvind R." <arvino55@gmail.com>
    Cc: Olof Johansson <olof@lixom.net>
    Cc: Egor Martovetsky <egor@pasemi.com>
    Cc: Chris Metcalf <cmetcalf@tilera.com>
    Cc: Michal Marek <mmarek@suse.cz>
    Cc: Jiri Kosina <jkosina@suse.cz>
    Cc: Joe Perches <joe@perches.com>
    Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
    Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Cc: Hitoshi Mitake <h.mitake@gmail.com>
    Cc: Andrew Morton <akpm@linux-foundation.org>
    Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
    Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
    Cc: Josh Boyer <jwboyer@gmail.com>
    Cc: linuxppc-dev@lists.ozlabs.org
    Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
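
As a reading aid for the layered addressing used in the diff, the following
user-space sketch shows the three ideas the patch relies on: a flat index
computed from a per-layer position (last layer varying fastest), an
odometer-style increment of that position, and a filter where a negative
layer value means "unknown, match anything". The sketch_* helpers are
hypothetical names for this example only and are not part of the EDAC API.

```c
#include <assert.h>

/*
 * Flat index of a location, with the last layer varying fastest.
 * A negative pos[] entry means "unknown" and stops the walk. Returns the
 * index reached at the deepest known layer and stores that layer number
 * in *depth (-1 if nothing is known).
 */
static unsigned sketch_flat_index(const unsigned *sizes, unsigned n_layers,
				  const int *pos, int *depth)
{
	unsigned i, index = 0;

	*depth = -1;
	for (i = 0; i < n_layers; i++) {
		if (pos[i] < 0)
			break;
		index = index * sizes[i] + (unsigned)pos[i];
		*depth = (int)i;
	}
	return index;
}

/* Advance pos[] to the next location; the last layer moves fastest. */
static void sketch_next_pos(const unsigned *sizes, unsigned n_layers,
			    unsigned *pos)
{
	int j;

	for (j = (int)n_layers - 1; j >= 0; j--) {
		pos[j]++;
		if (pos[j] < sizes[j])
			break;
		pos[j] = 0;	/* wrap this layer and carry into the next */
	}
}

/* A negative filter value matches any location at that layer. */
static int sketch_matches(const int *filter, const unsigned *location,
			  unsigned n_layers)
{
	unsigned i;

	for (i = 0; i < n_layers; i++)
		if (filter[i] >= 0 && (unsigned)filter[i] != location[i])
			return 0;
	return 1;
}
```

For layer sizes {2, 2, 4}, the location (1, 0, 3) maps to flat index
(1 * 2 + 0) * 4 + 3 = 11, while a report that only knows layer 0 = 1 stops at
depth 0 with index 1 and matches every location under that first-layer entry.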

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..b2dfdf5 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,13 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   bool rev_order,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +472,78 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page,
+				     unsigned long syndrome, int row, int channel,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+		              row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page, int row,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+		              row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+		              csrow, channel, -1, msg, NULL, NULL);
+}
 
 /*
  * edac_device APIs
@@ -496,6 +555,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..d837266 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,21 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * new_edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of MC hierarchy layers
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/channels in reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * multi-rank DIMMs properly, but the csrow-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study of JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -168,22 +197,55 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory stick is mapped as one entry, while, in
+ * others, a single multi-rank memory stick would be mapped into several
+ * entries. Currently, this function will allocate multiple struct dimm_info
+ * in such scenarios, as grouping the multiple ranks requires driver changes.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				       unsigned n_layers,
+				       struct edac_mc_layer *layers,
+				       bool rev_order,
+				       unsigned sz_pvt)
 {
 	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *layer;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
 	void *pvt;
-	unsigned size;
-	int row, chn;
+	unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
+	unsigned tot_csrows, tot_channels, tot_errcount = 0;
+	int i, j;
 	int err;
+	int row, chn;
+	bool per_rank = false;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
+	/*
+	 * Calculate the total number of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	tot_dimms = 1;
+	tot_channels = 1;
+	tot_csrows = 1;
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_channels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +253,28 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	layer = edac_align_ptr(&ptr, sizeof(*layer), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_channels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	count = 1;
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
+	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_channels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +282,101 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	layer = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)layer));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
 	mci->mc_idx = edac_index;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = layer;
+	memcpy(mci->layers, layers, sizeof(*layer) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_channels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fills the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_channels;
+		chp = &chi[row * tot_channels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_channels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+	/*
+	 * Fills the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(layer, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		if (!rev_order) {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (!layers[j].is_virt_csrow)
+					break;
+			chn++;
+			if (chn == tot_channels) {
+				chn = 0;
+				row++;
+			}
+		} else {
+			for (j = n_layers - 1; j >= 0; j--)
+				if (layers[j].is_virt_csrow)
+					break;
+			row++;
+			if (row == tot_csrows) {
+				row = 0;
+				chn++;
+			}
+		}
+
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +400,57 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @edac_index:		Memory controller number
+ * @n_layers:		Number of layers at the MC hierarchy
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @rev_order:		Fills csrows/cs channels in reverse order
+ * @sz_pvt:		size of private storage needed
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory is mapped as one DIMM, while, in others,
+ * a single multi-rank DIMM would be mapped into several "dimms".
+ *
+ * Non-csrow based drivers (like FB-DIMM and RAMBUS ones) will likely report
+ * such DIMMS properly, but the csrow-based ones will likely do the wrong
+ * thing, as two chip select values are used for dual-rank memories (and 4, for
+ * quad-rank ones). I suspect that this issue could be solved inside the EDAC
+ * core for SDRAM memories, but it requires further study of JEDEC JESD 21C.
+ *
+ * In summary, solving this issue is not easy, as it requires a lot of testing.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * Only can be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	NULL allocation failed
+ *	struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(edac_index, ARRAY_SIZE(layers), layers,
+			  false, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +716,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +742,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +901,251 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_increment_ce_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_mc++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_filter) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_increment_ue_error(struct mem_ctl_info *mci,
+				    bool enable_filter,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
 
-	if (mci->scrub_mode & SCRUB_SW_SRC) {
-		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
-		remapped_page = mci->ctl_page_to_phys ?
-			mci->ctl_page_to_phys(mci, page_frame_number) :
-			page_frame_number;
+	mci->ue_mc++;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+	if (!enable_filter) {
+		mci->ue_noinfo_count++;
+		return;
 	}
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	unsigned long remapped_page;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_filter = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/* Check if the event report is consistent */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED) {
+				p = "CE";
+				mci->ce_mc++;
+			} else {
+				p = "UE";
+				mci->ue_mc++;
+			}
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning it, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_filter = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory,
+	 * the logic here will get all possible labels that could potentially
+	 * be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "rank")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 * It will also get the max grain, over the error match range
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no point
+		 * in looking for the affected DIMMs, as everything may be
+		 * affected. Also, don't show errors for non-filled DIMMs.
+		 */
+		if (enable_filter && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the dimm, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
-
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_filter) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
+		p += sprintf(p, "%s %d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED)
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+	else
+		snprintf(detail, sizeof(detail),
+			"page 0x%lx offset 0x%lx grain %d",
+			page_frame_number, offset_in_page, grain);
+
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		if (edac_mc_get_log_ce())
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		edac_increment_ce_error(mci, enable_filter, pos);
+
+		if (mci->scrub_mode & SCRUB_SW_SRC) {
+			/*
+			 * Some MC's can remap memory so that it is still
+			 * available at a different address when PCI devices
+			 * map into memory.
+			 * MC's that can't do this lose the memory where PCI
+			 * devices are mapped. This mapping is MC dependent
+			 * and so we call back into the MC driver for it to
+			 * map the MC page to a physical (CPU) page which can
+			 * then be mapped to a virtual page - which can then
+			 * be scrubbed.
+			 */
+			remapped_page = mci->ctl_page_to_phys ?
+				mci->ctl_page_to_phys(mci, page_frame_number) :
+				page_frame_number;
+
+			edac_mc_scrub_block(remapped_page,
+					    offset_in_page, grain);
+		}
+	} else {
+		if (edac_mc_get_log_ue())
+			edac_mc_printk(mci, KERN_WARNING,
+				"UE %s on %s (%s%s %s)\n",
+				msg, label, location, detail, other_detail);
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+		if (edac_mc_get_panic_on_ue())
+			panic("UE %s on %s (%s%s %s)\n",
+			      msg, label, location, detail, other_detail);
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		edac_increment_ue_error(mci, enable_filter, pos);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
-
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
-
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..2b66109 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -497,6 +500,11 @@ struct mcidev_sysfs_attribute {
         ssize_t (*store)(struct mem_ctl_info *, const char *,size_t);
 };
 
+struct edac_hierarchy {
+	char		*name;
+	unsigned	nr;
+};
+
 /* MEMORY controller information structure
  */
 struct mem_ctl_info {
@@ -541,13 +549,18 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
+
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
 
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +575,15 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
+	u32 ue_count;           /* Total Uncorrectable Errors for this MC */
+	u32 ce_count;           /* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/* drivers shouldn't access this struct directly */
+	unsigned ce_noinfo_count, ue_noinfo_count;
+	unsigned ce_mc, ue_mc;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +596,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-28  8:52                                       ` Borislav Petkov
@ 2012-04-29 14:25                                         ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 14:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On 28-04-2012 05:52, Borislav Petkov wrote:
> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
>> Yes. This is a common issue in the EDAC core: in several places, it calls the
>> edac debug macros (DEBUGF0...DEBUGF4) passing __func__ as an argument, while
>> the debug macros already handle that. I suspect that, in the past, __func__
>> was not in the macros, but some patch added it there and forgot to fix the
>> existing callers.
> 
> The patch that added it is d357cbb445208 and you reviewed it.

And you wrote the patch that caused it.

> 
>> This is something that needs to be reviewed across the entire EDAC core (and
>> likely in the drivers).
> 
> Looks like a job for a newbie to get her/his feet wet with kernel work.

> 
>> I opted not to touch the existing debug logic, as I think it is better to
>> address all those issues in one separate patch, after fixing the
>> EDAC core bugs.
> 
> No,
> 
> you simply need to remove the __func__ argument in your newly added debug call:
> 
>                 debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
>                         i, (dimm - mci->dimms),
>                         pos[0], pos[1], pos[2], row, chn);
> 
> And while you're at it, remove the rest of the __func__ arguments from
> your newly added debugfX calls.

A single patch fixing this everywhere in drivers/edac is better and clearer than adding
an unrelated fix to this patch. This one is already complex enough without adding more
unrelated things to it.

Also, a simple perl/coccinelle script can replace all such __func__ occurrences
in one shot.

Regards,
Mauro



* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 14:25                                         ` Mauro Carvalho Chehab
@ 2012-04-29 15:11                                           ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 15:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On 29-04-2012 11:25, Mauro Carvalho Chehab wrote:
> On 28-04-2012 05:52, Borislav Petkov wrote:
>> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
>>> Yes. This is a common issue in the EDAC core: in several places, it calls the
>>> edac debug macros (DEBUGF0...DEBUGF4) passing __func__ as an argument, while
>>> the debug macros already handle that. I suspect that, in the past, __func__
>>> was not in the macros, but some patch added it there and forgot to fix the
>>> existing callers.
>>
>> The patch that added it is d357cbb445208 and you reviewed it.
> 
> And you wrote the patch that caused it.
> 
>>
>>> This is something that needs to be reviewed across the entire EDAC core (and
>>> likely in the drivers).
>>
>> Looks like a job for a newbie to get her/his feet wet with kernel work.
> 
>>
>>> I opted not to touch the existing debug logic, as I think it is better to
>>> address all those issues in one separate patch, after fixing the
>>> EDAC core bugs.
>>
>> No,
>>
>> you simply need to remove the __func__ argument in your newly added debug call:
>>
>>                 debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
>>                         i, (dimm - mci->dimms),
>>                         pos[0], pos[1], pos[2], row, chn);
>>
>> And while you're at it, remove the rest of the __func__ arguments from
>> your newly added debugfX calls.
> 
> A single patch fixing this everywhere in drivers/edac is better and clearer than adding
> an unrelated fix to this patch. This one is already complex enough without adding more
> unrelated things to it.
> 
> Also, a simple perl/coccinelle script can replace all such __func__ occurrences
> in one shot.

Most of the issues can be solved by the script-based approach mentioned above;
the resulting patch follows.

There are still 171 places (12 in the core, the rest in the drivers)
that will require either a more sophisticated patch or a manual fix.

-

From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Sun, 29 Apr 2012 11:59:14 -0300
Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macros already add that information. Generated by this small Perl script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

print $f;

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index be6c225..25198c8 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -180,7 +180,7 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 static void amd76x_check(struct mem_ctl_info *mci)
 {
 	struct amd76x_error_info info;
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	amd76x_get_error_info(mci, &info);
 	amd76x_process_error_info(mci, &info, 1);
 }
@@ -241,7 +241,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 ems_mode;
 	struct amd76x_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
 
@@ -256,7 +256,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -292,7 +292,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -304,7 +304,7 @@ fail:
 static int __devinit amd76x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return amd76x_probe1(pdev, ent->driver_data);
@@ -322,7 +322,7 @@ static void __devexit amd76x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (amd76x_pci)
 		edac_pci_release_generic_ctl(amd76x_pci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 31b3c91..9ee1194 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -316,13 +316,12 @@ static void get_total_mem(struct cpc925_mc_pdata *pdata)
 		reg += aw;
 		size = of_read_number(reg, sw);
 		reg += sw;
-		debugf1("%s: start 0x%lx, size 0x%lx\n", __func__,
-			start, size);
+		debugf1("start 0x%lx, size 0x%lx\n", start, size);
 		pdata->total_mem += size;
 	} while (reg < reg_end);
 
 	of_node_put(np);
-	debugf0("%s: total_mem 0x%lx\n", __func__, pdata->total_mem);
+	debugf0("total_mem 0x%lx\n", pdata->total_mem);
 }
 
 static void cpc925_init_csrows(struct mem_ctl_info *mci)
@@ -512,7 +511,7 @@ static void cpc925_mc_get_pfn(struct mem_ctl_info *mci, u32 mear,
 	*offset = pa & (PAGE_SIZE - 1);
 	*pfn = pa >> PAGE_SHIFT;
 
-	debugf0("%s: ECC physical address 0x%lx\n", __func__, pa);
+	debugf0("ECC physical address 0x%lx\n", pa);
 }
 
 static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
@@ -852,8 +851,8 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			goto err2;
 		}
 
-		debugf0("%s: Successfully added edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully added edac device for %s\n",
+			dev_info->ctl_name);
 
 		continue;
 
@@ -884,8 +883,8 @@ static void cpc925_del_edac_devices(void)
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
 
-		debugf0("%s: Successfully deleted edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully deleted edac device for %s\n",
+			dev_info->ctl_name);
 	}
 }
 
@@ -900,7 +899,7 @@ static int cpc925_get_sdram_scrub_rate(struct mem_ctl_info *mci)
 	mscr = __raw_readl(pdata->vbase + REG_MSCR_OFFSET);
 	si = (mscr & MSCR_SI_MASK) >> MSCR_SI_SHIFT;
 
-	debugf0("%s, Mem Scrub Ctrl Register 0x%x\n", __func__, mscr);
+	debugf0("Mem Scrub Ctrl Register 0x%x\n", mscr);
 
 	if (((mscr & MSCR_SCRUB_MOD_MASK) != MSCR_BACKGR_SCRUB) ||
 	    (si == 0)) {
@@ -928,8 +927,7 @@ static int cpc925_mc_get_channels(void __iomem *vbase)
 	    ((mbcr & MBCR_64BITBUS_MASK) == 0))
 		dual = 1;
 
-	debugf0("%s: %s channel\n", __func__,
-		(dual > 0) ? "Dual" : "Single");
+	debugf0("%s channel\n", (dual > 0) ? "Dual" : "Single");
 
 	return dual;
 }
@@ -944,7 +942,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	struct resource *r;
 	int res = 0, nr_channels;
 
-	debugf0("%s: %s platform device found!\n", __func__, pdev->name);
+	debugf0("%s platform device found!\n", pdev->name);
 
 	if (!devres_open_group(&pdev->dev, cpc925_probe, GFP_KERNEL)) {
 		res = -ENOMEM;
@@ -1026,7 +1024,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	cpc925_add_edac_devices(vbase);
 
 	/* get this far and it's successful */
-	debugf0("%s: success\n", __func__);
+	debugf0("success\n");
 
 	res = 0;
 	goto out;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7e601c1..edb6ff3 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -309,7 +309,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (page < pvt->tolm)
 		return page;
@@ -335,7 +335,7 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	int i;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* convert the addr to 4k page */
 	page = sec1_add >> (PAGE_SHIFT - 4);
@@ -394,7 +394,7 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 	int row;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (error_one & 0x0202) {
 		error_2b = ded_add;
@@ -453,7 +453,7 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 	if (!handle_error)
 		return;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
 			     -1, -1, -1,
 			     "e752x UE log memory write", "", NULL);
@@ -982,7 +982,7 @@ static void e752x_check(struct mem_ctl_info *mci)
 {
 	struct e752x_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e752x_get_error_info(mci, &info);
 	e752x_process_error_info(mci, &info, 1);
 }
@@ -1270,7 +1270,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
 	struct e752x_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 	debugf0("Starting Probe1\n");
 
 	/* check to see if device 0 function 1 is enabled; if it isn't, we
@@ -1302,7 +1302,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	/* 3100 IMCH supports SECDEC only */
 	mci->edac_ctl_cap = (dev_idx == I3100) ? EDAC_FLAG_SECDED :
@@ -1312,7 +1312,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_ver = E752X_REVISION;
 	mci->pdev = &pdev->dev;
 
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e752x_pvt *)mci->pvt_info;
 	pvt->dev_info = &e752x_devs[dev_idx];
 	pvt->mc_symmetric = ((ddrcsr & 0x10) != 0);
@@ -1322,7 +1322,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e752x_check;
@@ -1344,7 +1344,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		mci->edac_cap = EDAC_FLAG_SECDED; /* the only mode supported */
 	else
 		mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E752X_TOLM, &pci_data);
@@ -1379,7 +1379,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -1395,7 +1395,7 @@ fail:
 static int __devinit e752x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	if (pci_enable_device(pdev) < 0)
@@ -1409,7 +1409,7 @@ static void __devexit e752x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e752x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e752x_pci)
 		edac_pci_release_generic_ctl(e752x_pci);
@@ -1455,7 +1455,7 @@ static int __init e752x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1466,7 +1466,7 @@ static int __init e752x_init(void)
 
 static void __exit e752x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	pci_unregister_driver(&e752x_driver);
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 2defa96..253e878 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -166,7 +166,7 @@ static const struct e7xxx_dev_info e7xxx_devs[] = {
 /* FIXME - is this valid for both SECDED and S4ECD4ED? */
 static inline int e7xxx_find_channel(u16 syndrome)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((syndrome & 0xff00) == 0)
 		return 0;
@@ -186,7 +186,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e7xxx_pvt *pvt = (struct e7xxx_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((page < pvt->tolm) ||
 		((page >= 0x100000) && (page < pvt->remapbase)))
@@ -208,7 +208,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	int row;
 	int channel;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_1b = info->dram_celog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -225,7 +225,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx CE log register overflow", "", NULL);
 }
@@ -235,7 +235,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	u32 error_2b, block_page;
 	int row;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_2b = info->dram_uelog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -248,7 +248,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx UE log register overflow", "", NULL);
@@ -334,7 +334,7 @@ static void e7xxx_check(struct mem_ctl_info *mci)
 {
 	struct e7xxx_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e7xxx_get_error_info(mci, &info);
 	e7xxx_process_error_info(mci, &info, 1);
 }
@@ -430,7 +430,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;
 	struct e7xxx_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
@@ -453,7 +453,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED |
 		EDAC_FLAG_S4ECD4ED;
@@ -461,7 +461,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
 	mci->pdev = &pdev->dev;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
 	pvt->bridge_ck = pci_get_device(PCI_VENDOR_ID_INTEL,
@@ -474,14 +474,14 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e7xxx_check;
 	mci->ctl_page_to_phys = ctl_page_to_phys;
 	e7xxx_init_csrows(mci, pdev, dev_idx, drc);
 	mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E7XXX_TOLM, &pci_data);
 	pvt->tolm = ((u32) pci_data) << 4;
@@ -516,7 +516,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -532,7 +532,7 @@ fail0:
 static int __devinit e7xxx_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	return pci_enable_device(pdev) ?
@@ -544,7 +544,7 @@ static void __devexit e7xxx_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e7xxx_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e7xxx_pci)
 		edac_pci_release_generic_ctl(e7xxx_pci);
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index cb397d9..ed46949 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -82,8 +82,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	void *pvt, *p;
 	int err;
 
-	debugf4("%s() instances=%d blocks=%d\n",
-		__func__, nr_instances, nr_blocks);
+	debugf4("instances=%d blocks=%d\n",
+		nr_instances, nr_blocks);
 
 	/* Calculate the size of memory we need to allocate AND
 	 * determine the offsets of the various item arrays
@@ -156,8 +156,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	/* Name of this edac device */
 	snprintf(dev_ctl->name,sizeof(dev_ctl->name),"%s",edac_device_name);
 
-	debugf4("%s() edac_dev=%p next after end=%p\n",
-		__func__, dev_ctl, pvt + sz_private );
+	debugf4("edac_dev=%p next after end=%p\n",
+		dev_ctl, pvt + sz_private );
 
 	/* Initialize every Instance */
 	for (instance = 0; instance < nr_instances; instance++) {
@@ -178,9 +178,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			snprintf(blk->name, sizeof(blk->name),
 				 "%s%d", edac_block_name, block+offset_value);
 
-			debugf4("%s() instance=%d inst_p=%p block=#%d "
+			debugf4("instance=%d inst_p=%p block=#%d "
 				"block_p=%p name='%s'\n",
-				__func__, instance, inst, block,
+				instance, inst, block,
 				blk, blk->name);
 
 			/* if there are NO attributes OR no attribute pointer
@@ -194,8 +194,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			attrib_p = &dev_attrib[block*nr_instances*nr_attrib];
 			blk->block_attributes = attrib_p;
 
-			debugf4("%s() THIS BLOCK_ATTRIB=%p\n",
-				__func__, blk->block_attributes);
+			debugf4("THIS BLOCK_ATTRIB=%p\n",
+				blk->block_attributes);
 
 			/* Initialize every user specified attribute in this
 			 * block with the data the caller passed in
@@ -214,9 +214,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 
 				attrib->block = blk;	/* up link */
 
-				debugf4("%s() alloc-attrib=%p attrib_name='%s' "
+				debugf4("alloc-attrib=%p attrib_name='%s' "
 					"attrib-spec=%p spec-name=%s\n",
-					__func__, attrib, attrib->attr.name,
+					attrib, attrib->attr.name,
 					&attrib_spec[attr],
 					attrib_spec[attr].attr.name
 					);
@@ -273,7 +273,7 @@ static struct edac_device_ctl_info *find_edac_device_by_dev(struct device *dev)
 	struct edac_device_ctl_info *edac_dev;
 	struct list_head *item;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	list_for_each(item, &edac_device_list) {
 		edac_dev = list_entry(item, struct edac_device_ctl_info, link);
@@ -408,7 +408,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 				unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* take the arg 'msec' and set it into the control structure
 	 * to used in the time period calculation
@@ -496,7 +496,7 @@ EXPORT_SYMBOL_GPL(edac_device_alloc_index);
  */
 int edac_device_add_device(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -570,7 +570,7 @@ struct edac_device_ctl_info *edac_device_del_device(struct device *dev)
 {
 	struct edac_device_ctl_info *edac_dev;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&device_ctls_mutex);
 
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index b4ea185..3c589c2 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -202,7 +202,7 @@ static void edac_device_ctrl_master_release(struct kobject *kobj)
 {
 	struct edac_device_ctl_info *edac_dev = to_edacdev(kobj);
 
-	debugf4("%s() control index=%d\n", __func__, edac_dev->dev_idx);
+	debugf4("control index=%d\n", edac_dev->dev_idx);
 
 	/* decrement the EDAC CORE module ref count */
 	module_put(edac_dev->owner);
@@ -233,12 +233,12 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	struct bus_type *edac_subsys;
 	int err;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the /sys/devices/system/edac reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys error\n", __func__);
+		debugf1("no edac_subsys error\n");
 		err = -ENODEV;
 		goto err_out;
 	}
@@ -264,8 +264,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 				   &edac_subsys->dev_root->kobj,
 				   "%s", edac_dev->name);
 	if (err) {
-		debugf1("%s()Failed to register '.../edac/%s'\n",
-			__func__, edac_dev->name);
+		debugf1("Failed to register '.../edac/%s'\n",
+			edac_dev->name);
 		goto err_kobj_reg;
 	}
 	kobject_uevent(&edac_dev->kobj, KOBJ_ADD);
@@ -274,8 +274,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	 * edac_device_unregister_sysfs_main_kobj() must be used
 	 */
 
-	debugf4("%s() Registered '.../edac/%s' kobject\n",
-		__func__, edac_dev->name);
+	debugf4("Registered '.../edac/%s' kobject\n",
+		edac_dev->name);
 
 	return 0;
 
@@ -296,9 +296,9 @@ err_out:
  */
 void edac_device_unregister_sysfs_main_kobj(struct edac_device_ctl_info *dev)
 {
-	debugf0("%s()\n", __func__);
-	debugf4("%s() name of kobject is: %s\n",
-		__func__, kobject_name(&dev->kobj));
+	debugf0("\n");
+	debugf4("name of kobject is: %s\n",
+		kobject_name(&dev->kobj));
 
 	/*
 	 * Unregister the edac device's kobject and
@@ -336,7 +336,7 @@ static void edac_device_ctrl_instance_release(struct kobject *kobj)
 {
 	struct edac_device_instance *instance;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* map from this kobj to the main control struct
 	 * and then dec the main kobj count
@@ -442,7 +442,7 @@ static void edac_device_ctrl_block_release(struct kobject *kobj)
 {
 	struct edac_device_block *block;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the container of the kobj */
 	block = to_block(kobj);
@@ -524,10 +524,10 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	struct edac_dev_sysfs_block_attribute *sysfs_attrib;
 	struct kobject *main_kobj;
 
-	debugf4("%s() Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
-		__func__, instance->name, instance, block->name, block);
-	debugf4("%s() block kobj=%p  block kobj->parent=%p\n",
-		__func__, &block->kobj, &block->kobj.parent);
+	debugf4("Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
+		instance->name, instance, block->name, block);
+	debugf4("block kobj=%p  block kobj->parent=%p\n",
+		&block->kobj, &block->kobj.parent);
 
 	/* init this block's kobject */
 	memset(&block->kobj, 0, sizeof(struct kobject));
@@ -546,8 +546,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 				   &instance->kobj,
 				   "%s", block->name);
 	if (err) {
-		debugf1("%s() Failed to register instance '%s'\n",
-			__func__, block->name);
+		debugf1("Failed to register instance '%s'\n",
+			block->name);
 		kobject_put(main_kobj);
 		err = -ENODEV;
 		goto err_out;
@@ -560,9 +560,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	if (sysfs_attrib && block->nr_attribs) {
 		for (i = 0; i < block->nr_attribs; i++, sysfs_attrib++) {
 
-			debugf4("%s() creating block attrib='%s' "
+			debugf4("creating block attrib='%s' "
 				"attrib->%p to kobj=%p\n",
-				__func__,
 				sysfs_attrib->attr.name,
 				sysfs_attrib, &block->kobj);
 
@@ -647,14 +646,14 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	err = kobject_init_and_add(&instance->kobj, &ktype_instance_ctrl,
 				   &edac_dev->kobj, "%s", instance->name);
 	if (err != 0) {
-		debugf2("%s() Failed to register instance '%s'\n",
-			__func__, instance->name);
+		debugf2("Failed to register instance '%s'\n",
+			instance->name);
 		kobject_put(main_kobj);
 		goto err_out;
 	}
 
-	debugf4("%s() now register '%d' blocks for instance %d\n",
-		__func__, instance->nr_blocks, idx);
+	debugf4("now register '%d' blocks for instance %d\n",
+		instance->nr_blocks, idx);
 
 	/* register all blocks of this instance */
 	for (i = 0; i < instance->nr_blocks; i++) {
@@ -670,8 +669,8 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	}
 	kobject_uevent(&instance->kobj, KOBJ_ADD);
 
-	debugf4("%s() Registered instance %d '%s' kobject\n",
-		__func__, idx, instance->name);
+	debugf4("Registered instance %d '%s' kobject\n",
+		idx, instance->name);
 
 	return 0;
 
@@ -715,7 +714,7 @@ static int edac_device_create_instances(struct edac_device_ctl_info *edac_dev)
 	int i, j;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* iterate over creation of the instances */
 	for (i = 0; i < edac_dev->nr_instances; i++) {
@@ -817,12 +816,12 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	int err;
 	struct kobject *edac_kobj = &edac_dev->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, edac_dev->dev_idx);
+	debugf0("idx=%d\n", edac_dev->dev_idx);
 
 	/*  go create any main attributes callers wants */
 	err = edac_device_add_main_sysfs_attributes(edac_dev);
 	if (err) {
-		debugf0("%s() failed to add sysfs attribs\n", __func__);
+		debugf0("failed to add sysfs attribs\n");
 		goto err_out;
 	}
 
@@ -849,8 +848,8 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	}
 
 
-	debugf4("%s() create-instances done, idx=%d\n",
-		__func__, edac_dev->dev_idx);
+	debugf4("create-instances done, idx=%d\n",
+		edac_dev->dev_idx);
 
 	return 0;
 
@@ -873,7 +872,7 @@ err_out:
  */
 void edac_device_remove_sysfs(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* remove any main attributes for this device */
 	edac_device_remove_main_sysfs_attributes(edac_dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 65568e6..32ed17b 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -259,13 +259,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
-		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		debugf4("errcount layer %d size %d\n", i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		tot_errcount += 2 * count;
 	}
 
-	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
+	debugf4("allocating %d error counters\n", tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
@@ -337,7 +337,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+	debugf4("initializing %d %s\n", tot_dimms,
 		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
@@ -451,7 +451,7 @@ EXPORT_SYMBOL_GPL(edac_mc_alloc);
  */
 void edac_mc_free(struct mem_ctl_info *mci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* the mci instance is freed here, when the sysfs object is dropped */
 	edac_unregister_sysfs(mci);
@@ -471,7 +471,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	struct mem_ctl_info *mci;
 	struct list_head *item;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
@@ -539,7 +539,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  */
 static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* if this instance is not in the POLL state, then simply return */
 	if (mci->op_state != OP_RUNNING_POLL)
@@ -566,8 +566,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 
 	status = cancel_delayed_work(&mci->work);
 	if (status == 0) {
-		debugf0("%s() not canceled, flush the queue\n",
-			__func__);
+		debugf0("not canceled, flush the queue\n");
 
 		/* workq instance might be running, wait for it */
 		flush_workqueue(edac_workqueue);
@@ -714,7 +713,7 @@ EXPORT_SYMBOL(edac_mc_find);
 /* FIXME - should a warning be printed if no error detection? correction? */
 int edac_mc_add_mc(struct mem_ctl_info *mci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -785,7 +784,7 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&mem_ctls_mutex);
 
@@ -823,7 +822,7 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 	void *virt_addr;
 	unsigned long flags = 0;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* ECC error page was not in our memory. Ignore it. */
 	if (!pfn_valid(page))
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 81ca073..c4ce26e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -623,8 +623,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	err =  device_add(&dimm->dev);
 
-	debugf0("%s(): creating rank/dimm device %s\n", __func__,
-		dev_name(&dimm->dev));
+	debugf0("creating rank/dimm device %s\n", dev_name(&dimm->dev));
 
 	return err;
 }
@@ -981,8 +980,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	dev_set_drvdata(&mci->dev, mci);
 	pm_runtime_forbid(&mci->dev);
 
-	debugf0("%s(): creating device %s\n", __func__,
-		dev_name(&mci->dev));
+	debugf0("creating device %s\n", dev_name(&mci->dev));
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		bus_unregister(&mci->bus);
@@ -999,8 +997,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 		if (dimm->nr_pages == 0)
 			continue;
 #ifdef CONFIG_EDAC_DEBUG
-		debugf1("%s creating dimm%d, located at ",
-			__func__, i);
+		debugf1("creating dimm%d, located at ",
+			i);
 		if (edac_debug_level >= 1) {
 			int lay;
 			for (lay = 0; lay < mci->n_layers; lay++)
@@ -1012,8 +1010,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 #endif
 		err = edac_create_dimm_object(mci, dimm, i);
 		if (err) {
-			debugf1("%s() failure: create dimm %d obj\n",
-				__func__, i);
+			debugf1("failure: create dimm %d obj\n",
+				i);
 			goto fail;
 		}
 	}
@@ -1051,7 +1049,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
 	int i;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	debugfs_remove(mci->debugfs);
@@ -1064,8 +1062,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 		struct dimm_info *dimm = mci->dimms[i];
 		if (dimm->nr_pages == 0)
 			continue;
-		debugf0("%s(): removing device %s\n", __func__,
-			dev_name(&dimm->dev));
+		debugf0("removing device %s\n", dev_name(&dimm->dev));
 		put_device(&dimm->dev);
 		device_del(&dimm->dev);
 	}
@@ -1105,7 +1102,7 @@ int __init edac_mc_sysfs_init(void)
 	/* get the /sys/devices/system/edac subsys reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 8735a0d..9de2484 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -113,7 +113,7 @@ error:
  */
 static void __exit edac_exit(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* tear down the various subsystems */
 	edac_workqueue_teardown();
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index f1ac866..953959e 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -45,7 +45,7 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 	void *p = NULL, *pvt;
 	unsigned int size;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	pci = edac_align_ptr(&p, sizeof(*pci), 1);
 	pvt = edac_align_ptr(&p, 1, sz_pvt);
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_ctl_info);
  */
 void edac_pci_free_ctl_info(struct edac_pci_ctl_info *pci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	edac_pci_remove_sysfs(pci);
 }
@@ -97,7 +97,7 @@ static struct edac_pci_ctl_info *find_edac_pci_by_dev(struct device *dev)
 	struct edac_pci_ctl_info *pci;
 	struct list_head *item;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	list_for_each(item, &edac_pci_list) {
 		pci = list_entry(item, struct edac_pci_ctl_info, link);
@@ -122,7 +122,7 @@ static int add_edac_pci_to_global_list(struct edac_pci_ctl_info *pci)
 	struct list_head *item, *insert_before;
 	struct edac_pci_ctl_info *rover;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	insert_before = &edac_pci_list;
 
@@ -226,7 +226,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 	int msec;
 	unsigned long delay;
 
-	debugf3("%s() checking\n", __func__);
+	debugf3("checking\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -261,7 +261,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
 				 unsigned int msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	INIT_DELAYED_WORK(&pci->work, edac_pci_workq_function);
 	queue_delayed_work(edac_workqueue, &pci->work,
@@ -276,7 +276,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
 	int status;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	status = cancel_delayed_work(&pci->work);
 	if (status == 0)
@@ -293,7 +293,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 void edac_pci_reset_delay_period(struct edac_pci_ctl_info *pci,
 				 unsigned long value)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_workq_teardown(pci);
 
@@ -333,7 +333,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_index);
  */
 int edac_pci_add_device(struct edac_pci_ctl_info *pci, int edac_idx)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci->pci_idx = edac_idx;
 	pci->start_time = jiffies;
@@ -393,7 +393,7 @@ struct edac_pci_ctl_info *edac_pci_del_device(struct device *dev)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -430,7 +430,7 @@ EXPORT_SYMBOL_GPL(edac_pci_del_device);
  */
 static void edac_pci_generic_check(struct edac_pci_ctl_info *pci)
 {
-	debugf4("%s()\n", __func__);
+	debugf4("\n");
 	edac_pci_do_parity_check();
 }
 
@@ -491,7 +491,7 @@ EXPORT_SYMBOL_GPL(edac_pci_create_generic_ctl);
  */
 void edac_pci_release_generic_ctl(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() pci mod=%s\n", __func__, pci->mod_name);
+	debugf0("pci mod=%s\n", pci->mod_name);
 
 	edac_pci_del_device(pci->dev);
 	edac_pci_free_ctl_info(pci);
diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c
index 97f5064..330e820 100644
--- a/drivers/edac/edac_pci_sysfs.c
+++ b/drivers/edac/edac_pci_sysfs.c
@@ -78,7 +78,7 @@ static void edac_pci_instance_release(struct kobject *kobj)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Form pointer to containing struct, the pci control struct */
 	pci = to_instance(kobj);
@@ -161,7 +161,7 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	struct kobject *main_kobj;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* First bump the ref count on the top main kobj, which will
 	 * track the number of PCI instances we have, and thus nest
@@ -177,14 +177,14 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	err = kobject_init_and_add(&pci->kobj, &ktype_pci_instance,
 				   edac_pci_top_main_kobj, "pci%d", idx);
 	if (err != 0) {
-		debugf2("%s() failed to register instance pci%d\n",
-			__func__, idx);
+		debugf2("failed to register instance pci%d\n",
+			idx);
 		kobject_put(edac_pci_top_main_kobj);
 		goto error_out;
 	}
 
 	kobject_uevent(&pci->kobj, KOBJ_ADD);
-	debugf1("%s() Register instance 'pci%d' kobject\n", __func__, idx);
+	debugf1("Register instance 'pci%d' kobject\n", idx);
 
 	return 0;
 
@@ -201,7 +201,7 @@ error_out:
 static void edac_pci_unregister_sysfs_instance_kobj(
 			struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Unregister the instance kobject and allow its release
 	 * function release the main reference count and then
@@ -345,7 +345,7 @@ static int edac_pci_main_kobj_setup(void)
 	int err;
 	struct bus_type *edac_subsys;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* check and count if we have already created the main kobject */
 	if (atomic_inc_return(&edac_pci_sysfs_refcount) != 1)
@@ -356,7 +356,7 @@ static int edac_pci_main_kobj_setup(void)
 	 */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		err = -ENODEV;
 		goto decrement_count_fail;
 	}
@@ -421,15 +421,14 @@ decrement_count_fail:
  */
 static void edac_pci_main_kobj_teardown(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Decrement the count and only if no more controller instances
 	 * are connected perform the unregisteration of the top level
 	 * main kobj
 	 */
 	if (atomic_dec_return(&edac_pci_sysfs_refcount) == 0) {
-		debugf0("%s() called kobject_put on main kobj\n",
-			__func__);
+		debugf0("called kobject_put on main kobj\n");
 		kobject_put(edac_pci_top_main_kobj);
 	}
 	edac_put_sysfs_subsys();
@@ -446,7 +445,7 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 	int err;
 	struct kobject *edac_kobj = &pci->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, pci->pci_idx);
+	debugf0("idx=%d\n", pci->pci_idx);
 
 	/* create the top main EDAC PCI kobject, IF needed */
 	err = edac_pci_main_kobj_setup();
@@ -484,7 +483,7 @@ unregister_cleanup:
  */
 void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() index=%d\n", __func__, pci->pci_idx);
+	debugf0("index=%d\n", pci->pci_idx);
 
 	/* Remove the symlink */
 	sysfs_remove_link(&pci->kobj, EDAC_PCI_SYMLINK);
@@ -671,7 +670,7 @@ void edac_pci_do_parity_check(void)
 {
 	int before_count;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* if policy has PCI check off, leave now */
 	if (!check_pci_errors)
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 55eff02..3609742 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -322,7 +322,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	unsigned long mchbar;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	pci_read_config_dword(pdev, I3000_MCHBAR, (u32 *) & mchbar);
 	mchbar &= I3000_MCHBAR_MASK;
@@ -366,7 +366,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -445,7 +445,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -461,7 +461,7 @@ static int __devinit i3000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -477,7 +477,7 @@ static void __devexit i3000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i3000_pci)
 		edac_pci_release_generic_ctl(i3000_pci);
@@ -511,7 +511,7 @@ static int __init i3000_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -552,7 +552,7 @@ fail0:
 
 static void __exit i3000_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3000_driver);
 	if (!i3000_registered) {
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 818ee6f..c5fea07 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -332,7 +332,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	void __iomem *window;
 	struct i3200_priv *priv;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = i3200_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -408,7 +408,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -424,7 +424,7 @@ static int __devinit i3200_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -441,7 +441,7 @@ static void __devexit i3200_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i3200_priv *priv;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -475,7 +475,7 @@ static int __init i3200_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -516,7 +516,7 @@ fail0:
 
 static void __exit i3200_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3200_driver);
 	if (!i3200_registered) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 2a9f1dc..251544a 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1388,8 +1388,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
 
-	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
-		__func__, num_channels, num_dimms_per_channel);
+	debugf0("MC: Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
 
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 7425f17..2fbe4c5 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1143,7 +1143,7 @@ static void __devexit i7300_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	char *tmp;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	if (i7300_pci)
 		edac_pci_release_generic_ctl(i7300_pci);
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index ef237f4..2f7cc2a 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -824,7 +824,7 @@ static ssize_t i7core_inject_store_##param(			\
 	long value;						\
 	int rc;							\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");						\
 	pvt = mci->pvt_info;					\
 								\
 	if (pvt->inject.enable)					\
@@ -852,7 +852,7 @@ static ssize_t i7core_inject_show_##param(			\
 	struct i7core_pvt *pvt;					\
 								\
 	pvt = mci->pvt_info;					\
-	debugf1("%s() pvt=%p\n", __func__, pvt);		\
+	debugf1("pvt=%p\n", pvt);				\
 	if (pvt->inject.param < 0)				\
 		return sprintf(data, "any\n");			\
 	else							\
@@ -1059,7 +1059,7 @@ static ssize_t i7core_show_counter_##param(			\
 	struct mem_ctl_info *mci = to_mci(dev);			\
 	struct i7core_pvt *pvt = mci->pvt_info;			\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");						\
 	if (!pvt->ce_count_available || (pvt->is_registered))	\
 		return sprintf(data, "data unavailable\n");	\
 	return sprintf(data, "%lu\n",				\
@@ -1190,8 +1190,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 	dev_set_name(pvt->addrmatch_dev, "inject_addrmatch");
 	dev_set_drvdata(pvt->addrmatch_dev, mci);
 
-	debugf1("%s(): creating %s\n", __func__,
-		dev_name(pvt->addrmatch_dev));
+	debugf1("creating %s\n", dev_name(pvt->addrmatch_dev));
 
 	rc = device_add(pvt->addrmatch_dev);
 	if (rc < 0)
@@ -1213,8 +1212,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 		dev_set_name(pvt->chancounts_dev, "all_channel_counts");
 		dev_set_drvdata(pvt->chancounts_dev, mci);
 
-		debugf1("%s(): creating %s\n", __func__,
-			dev_name(pvt->chancounts_dev));
+		debugf1("creating %s\n", dev_name(pvt->chancounts_dev));
 
 		rc = device_add(pvt->chancounts_dev);
 		if (rc < 0)
@@ -1254,7 +1252,7 @@ static void i7core_put_devices(struct i7core_dev *i7core_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < i7core_dev->n_devs; i++) {
 		struct pci_dev *pdev = i7core_dev->pdev[i];
 		if (!pdev)
@@ -1652,7 +1650,7 @@ static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
 	int new0, new1, new2;
 
 	if (!pvt->pci_mcr[4]) {
-		debugf0("%s MCR registers not found\n", __func__);
+		debugf0("MCR registers not found\n");
 		return;
 	}
 
@@ -2402,7 +2400,7 @@ static void __devexit i7core_remove(struct pci_dev *pdev)
 {
 	struct i7core_dev *i7core_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index c0249f3..5361f9a 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -305,8 +305,8 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		edac_mode = EDAC_SECDED;
 		break;
 	default:
-		debugf0("%s(): Unknown/reserved ECC state "
-			"in NBXCFG register!\n", __func__);
+		debugf0("Unknown/reserved ECC state "
+			"in NBXCFG register!\n");
 		edac_mode = EDAC_UNKNOWN;
 		break;
 	}
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 6ff59b0..c097d7a 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -210,7 +210,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -245,7 +245,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -260,7 +260,7 @@ static int __devinit i82860_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82860_printk(KERN_INFO, "i82860 init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -278,7 +278,7 @@ static void __devexit i82860_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82860_pci)
 		edac_pci_release_generic_ctl(i82860_pci);
@@ -311,7 +311,7 @@ static int __init i82860_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -352,7 +352,7 @@ fail0:
 
 static void __exit i82860_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82860_driver);
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index c943904..66e5e98 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -405,7 +405,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 nr_chans;
 	struct i82875p_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
 
@@ -426,7 +426,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -437,7 +437,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82875p_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82875p_pvt *)mci->pvt_info;
 	pvt->ovrfl_pdev = ovrfl_pdev;
 	pvt->ovrfl_window = ovrfl_window;
@@ -464,7 +464,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -485,7 +485,7 @@ static int __devinit i82875p_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82875p_printk(KERN_INFO, "i82875p init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -504,7 +504,7 @@ static void __devexit i82875p_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82875p_pvt *pvt = NULL;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82875p_pci)
 		edac_pci_release_generic_ctl(i82875p_pci);
@@ -550,7 +550,7 @@ static int __init i82875p_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -593,7 +593,7 @@ fail0:
 
 static void __exit i82875p_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	i82875p_remove_one(mci_pdev);
 	pci_dev_put(mci_pdev);
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index a4a6768..dcb2182 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -489,11 +489,11 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	u8 c1drb[4];
 #endif
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci_read_config_dword(pdev, I82975X_MCHBAR, &mchbar);
 	if (!(mchbar & 1)) {
-		debugf3("%s(): failed, MCHBAR disabled!\n", __func__);
+		debugf3("failed, MCHBAR disabled!\n");
 		goto fail0;
 	}
 	mchbar &= 0xffffc000;	/* bits 31:14 used for 16K window */
@@ -558,7 +558,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail1;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -569,7 +569,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->mch_window = mch_window;
 	i82975x_init_csrows(mci, pdev, mch_window);
@@ -583,7 +583,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail2:
@@ -601,7 +601,7 @@ static int __devinit i82975x_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -619,7 +619,7 @@ static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82975x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci  == NULL)
@@ -655,7 +655,7 @@ static int __init i82975x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -697,7 +697,7 @@ fail0:
 
 static void __exit i82975x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82975x_driver);
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 1640d54..ecb59b4 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -303,7 +303,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
 
 	return 0;
@@ -321,7 +321,7 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 	struct edac_pci_ctl_info *pci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
 		 orig_pci_err_cap_dr);
@@ -610,7 +610,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 
 	devres_remove_group(&op->dev, mpc85xx_l2_err_probe);
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " L2 err registered\n");
 
 	return 0;
@@ -628,7 +628,7 @@ static int mpc85xx_l2_err_remove(struct platform_device *op)
 	struct edac_device_ctl_info *edac_dev = dev_get_drvdata(&op->dev);
 	struct mpc85xx_l2_pdata *pdata = edac_dev->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->l2_vbase + MPC85XX_L2_ERRINTEN, 0);
@@ -1038,7 +1038,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 |
 	    MEM_FLAG_DDR | MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -1104,7 +1104,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_mc_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " MC err registered\n");
 
 	return 0;
@@ -1122,7 +1122,7 @@ static int mpc85xx_mc_err_remove(struct platform_device *op)
 	struct mem_ctl_info *mci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_INT_EN, 0);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 59c399a..50bab21 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -194,7 +194,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_pci_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -210,7 +210,7 @@ static int mv64x60_pci_err_remove(struct platform_device *pdev)
 {
 	struct edac_pci_ctl_info *pci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_del_device(&pdev->dev);
 
@@ -363,7 +363,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_sram_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -379,7 +379,7 @@ static int mv64x60_sram_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -558,7 +558,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_cpu_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -574,7 +574,7 @@ static int mv64x60_cpu_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -766,7 +766,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_SECDED;
@@ -815,7 +815,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -831,7 +831,7 @@ static int mv64x60_mc_err_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 7b7eaf2..cd3ab28 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -236,13 +236,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
 
-		debugf1("%s() Row=%d DRBA = %#0x\n", __func__, index, drbar);
+		debugf1("Row=%d DRBA = %#0x\n", index, drbar);
 
 		row_high_limit = ((u32) drbar << 24);
 /*		row_high_limit = ((u32)drbar << 24) | 0xffffffUL; */
 
-		debugf1("%s() Row=%d, Boundary Address=%#0x, Last = %#0x\n",
-			__func__, index, row_high_limit, row_high_limit_last);
+		debugf1("Row=%d, Boundary Address=%#0x, Last = %#0x\n",
+			index, row_high_limit, row_high_limit_last);
 
 		/* Empty row [p.57] */
 		if (row_high_limit == row_high_limit_last)
@@ -277,14 +277,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 sdram_refresh_rate;
 	struct r82600_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_byte(pdev, R82600_DRAMC, &dramcr);
 	pci_read_config_dword(pdev, R82600_EAP, &eapr);
 	scrub_disabled = eapr & BIT(31);
 	sdram_refresh_rate = dramcr & (BIT(0) | BIT(1));
-	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
-		sdram_refresh_rate);
-	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
+	debugf2("sdram refresh rate = %#0x\n", sdram_refresh_rate);
+	debugf2("DRAMC register = %#0x\n", dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
 	layers[0].is_virt_csrow = true;
@@ -295,7 +294,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -311,8 +310,8 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 
 	if (ecc_enabled(dramcr)) {
 		if (scrub_disabled)
-			debugf3("%s(): mci = %p - Scrubbing disabled! EAP: "
-				"%#0x\n", __func__, mci, eapr);
+			debugf3("mci = %p - Scrubbing disabled! EAP: "
+				"%#0x\n", mci, eapr);
 	} else
 		mci->edac_cap = EDAC_FLAG_NONE;
 
@@ -352,7 +351,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -364,7 +363,7 @@ fail:
 static int __devinit r82600_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return r82600_probe1(pdev, ent->driver_data);
@@ -374,7 +373,7 @@ static void __devexit r82600_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (r82600_pci)
 		edac_pci_release_generic_ctl(r82600_pci);
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index bb7e95f..1dd6a98 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1064,7 +1064,7 @@ static void sbridge_put_devices(struct sbridge_dev *sbridge_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < sbridge_dev->n_devs; i++) {
 		struct pci_dev *pdev = sbridge_dev->pdev[i];
 		if (!pdev)
@@ -1760,7 +1760,7 @@ static void __devexit sbridge_remove(struct pci_dev *pdev)
 {
 	struct sbridge_dev *sbridge_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 219530b..771f78f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -331,7 +331,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	bool stacked;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	window = x38_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -407,7 +407,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -423,7 +423,7 @@ static int __devinit x38_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -439,7 +439,7 @@ static void __devexit x38_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -472,7 +472,7 @@ static int __init x38_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -513,7 +513,7 @@ fail0:
 
 static void __exit x38_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	pci_unregister_driver(&x38_driver);
 	if (!x38_registered) {


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-29 15:11                                           ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 15:11 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Joe Perches, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Olof Johansson, Andrew Morton,
	linuxppc-dev

On 29-04-2012 11:25, Mauro Carvalho Chehab wrote:
> On 28-04-2012 05:52, Borislav Petkov wrote:
>> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
>>> Yes. This is a common issue in the EDAC core: in several places, it calls the
>>> edac debug macros (DEBUGF0...DEBUGF4) passing __func__ as an argument, while
>>> the debug macros already handle that. I suspect that, in the past, __func__
>>> was not in the macros, but some patch added it there and forgot to fix the
>>> call sites.
>>
>> The patch that added it is d357cbb445208 and you reviewed it.
> 
> And you wrote the patch that caused it.
> 
>>
>>> This is something that needs to be reviewed across the entire EDAC core (and
>>> likely in the drivers).
>>
>> Looks like a job for a newbie to get her/his feet wet with kernel work.
> 
>>
>>> I opted not to touch this in the existing debug logic, as I think it is
>>> better to address all those issues in one separate patch, after fixing the
>>> EDAC core bugs.
>>
>> No,
>>
>> you simply need to remove the __func__ argument in your newly added debug call:
>>
>>                 debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
>>                         i, (dimm - mci->dimms),
>>                         pos[0], pos[1], pos[2], row, chn);
>>
>> And while you're at it, remove the rest of the __func__ arguments from
>> your newly added debugfX calls.
> 
> A single patch fixing this everywhere in drivers/edac is better and clearer than
> adding an unrelated fix to this patch. This patch is already complex enough
> without adding more unrelated things to it.
> 
> Also, a simple perl/coccinelle script can replace all such __func__ occurrences
> in one shot.

Most of the occurrences can be fixed with the script-generated patch below.

There are still 171 places (12 in the core, the rest in the drivers) that
will need either a more sophisticated patch or a manual fix.
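
For reference, the redundancy is easy to reproduce in user space. What
follows is only a minimal sketch, not the EDAC code itself: sketch_log(),
sketch_debugf(), sketch_debug_level and last_msg are hypothetical stand-ins,
and the macro is assumed to prepend the caller's __func__ in the way the
debug macros are described as doing in this thread. It formats into a buffer
instead of calling printk, so the result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the EDAC debug level and log sink. */
static int sketch_debug_level = 4;
static char last_msg[128];

/* Format "<func>: <message>" into last_msg if the level is enabled. */
static void sketch_log(int level, const char *func, const char *fmt, ...)
{
	va_list args;
	int n;

	if (level > sketch_debug_level)
		return;

	n = snprintf(last_msg, sizeof(last_msg), "%s: ", func);
	if (n < 0 || (size_t)n >= sizeof(last_msg))
		return;

	va_start(args, fmt);
	vsnprintf(last_msg + n, sizeof(last_msg) - (size_t)n, fmt, args);
	va_end(args);
}

/* Assumed macro shape: the caller's function name is added by the macro. */
#define sketch_debugf(level, ...) sketch_log((level), __func__, __VA_ARGS__)

/* Old call style: passes __func__ again, so the name is logged twice. */
static const char *old_style_call(void)
{
	sketch_debugf(3, "%s(): success\n", __func__);
	return last_msg;
}

/* New call style: the macro alone supplies the function name. */
static const char *new_style_call(void)
{
	sketch_debugf(3, "success\n");
	return last_msg;
}

/* Compare both call styles against the strings they should produce. */
static void sketch_self_check(void)
{
	assert(strcmp(old_style_call(),
		      "old_style_call: old_style_call(): success\n") == 0);
	assert(strcmp(new_style_call(), "new_style_call: success\n") == 0);
}
```

Calling sketch_self_check() from a small main() exercises both styles: the
old one logs the function name twice, the new one logs it once. That
duplication is all the script-generated patch removes.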

-

From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Sun, 29 Apr 2012 11:59:14 -0300
Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macros already add that. This patch was generated by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

print $f;

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index be6c225..25198c8 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -180,7 +180,7 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 static void amd76x_check(struct mem_ctl_info *mci)
 {
 	struct amd76x_error_info info;
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	amd76x_get_error_info(mci, &info);
 	amd76x_process_error_info(mci, &info, 1);
 }
@@ -241,7 +241,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 ems_mode;
 	struct amd76x_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
 
@@ -256,7 +256,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -292,7 +292,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -304,7 +304,7 @@ fail:
 static int __devinit amd76x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return amd76x_probe1(pdev, ent->driver_data);
@@ -322,7 +322,7 @@ static void __devexit amd76x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (amd76x_pci)
 		edac_pci_release_generic_ctl(amd76x_pci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 31b3c91..9ee1194 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -316,13 +316,12 @@ static void get_total_mem(struct cpc925_mc_pdata *pdata)
 		reg += aw;
 		size = of_read_number(reg, sw);
 		reg += sw;
-		debugf1("%s: start 0x%lx, size 0x%lx\n", __func__,
-			start, size);
+		debugf1("start 0x%lx, size 0x%lx\n", start, size);
 		pdata->total_mem += size;
 	} while (reg < reg_end);
 
 	of_node_put(np);
-	debugf0("%s: total_mem 0x%lx\n", __func__, pdata->total_mem);
+	debugf0("total_mem 0x%lx\n", pdata->total_mem);
 }
 
 static void cpc925_init_csrows(struct mem_ctl_info *mci)
@@ -512,7 +511,7 @@ static void cpc925_mc_get_pfn(struct mem_ctl_info *mci, u32 mear,
 	*offset = pa & (PAGE_SIZE - 1);
 	*pfn = pa >> PAGE_SHIFT;
 
-	debugf0("%s: ECC physical address 0x%lx\n", __func__, pa);
+	debugf0("ECC physical address 0x%lx\n", pa);
 }
 
 static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
@@ -852,8 +851,8 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			goto err2;
 		}
 
-		debugf0("%s: Successfully added edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully added edac device for %s\n",
+			dev_info->ctl_name);
 
 		continue;
 
@@ -884,8 +883,8 @@ static void cpc925_del_edac_devices(void)
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
 
-		debugf0("%s: Successfully deleted edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully deleted edac device for %s\n",
+			dev_info->ctl_name);
 	}
 }
 
@@ -900,7 +899,7 @@ static int cpc925_get_sdram_scrub_rate(struct mem_ctl_info *mci)
 	mscr = __raw_readl(pdata->vbase + REG_MSCR_OFFSET);
 	si = (mscr & MSCR_SI_MASK) >> MSCR_SI_SHIFT;
 
-	debugf0("%s, Mem Scrub Ctrl Register 0x%x\n", __func__, mscr);
+	debugf0("Mem Scrub Ctrl Register 0x%x\n", mscr);
 
 	if (((mscr & MSCR_SCRUB_MOD_MASK) != MSCR_BACKGR_SCRUB) ||
 	    (si == 0)) {
@@ -928,8 +927,7 @@ static int cpc925_mc_get_channels(void __iomem *vbase)
 	    ((mbcr & MBCR_64BITBUS_MASK) == 0))
 		dual = 1;
 
-	debugf0("%s: %s channel\n", __func__,
-		(dual > 0) ? "Dual" : "Single");
+	debugf0("%s channel\n", (dual > 0) ? "Dual" : "Single");
 
 	return dual;
 }
@@ -944,7 +942,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	struct resource *r;
 	int res = 0, nr_channels;
 
-	debugf0("%s: %s platform device found!\n", __func__, pdev->name);
+	debugf0("%s platform device found!\n", pdev->name);
 
 	if (!devres_open_group(&pdev->dev, cpc925_probe, GFP_KERNEL)) {
 		res = -ENOMEM;
@@ -1026,7 +1024,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	cpc925_add_edac_devices(vbase);
 
 	/* get this far and it's successful */
-	debugf0("%s: success\n", __func__);
+	debugf0("success\n");
 
 	res = 0;
 	goto out;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7e601c1..edb6ff3 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -309,7 +309,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (page < pvt->tolm)
 		return page;
@@ -335,7 +335,7 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	int i;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* convert the addr to 4k page */
 	page = sec1_add >> (PAGE_SHIFT - 4);
@@ -394,7 +394,7 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 	int row;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (error_one & 0x0202) {
 		error_2b = ded_add;
@@ -453,7 +453,7 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 	if (!handle_error)
 		return;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
 			     -1, -1, -1,
 			     "e752x UE log memory write", "", NULL);
@@ -982,7 +982,7 @@ static void e752x_check(struct mem_ctl_info *mci)
 {
 	struct e752x_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e752x_get_error_info(mci, &info);
 	e752x_process_error_info(mci, &info, 1);
 }
@@ -1270,7 +1270,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
 	struct e752x_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 	debugf0("Starting Probe1\n");
 
 	/* check to see if device 0 function 1 is enabled; if it isn't, we
@@ -1302,7 +1302,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	/* 3100 IMCH supports SECDEC only */
 	mci->edac_ctl_cap = (dev_idx == I3100) ? EDAC_FLAG_SECDED :
@@ -1312,7 +1312,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_ver = E752X_REVISION;
 	mci->pdev = &pdev->dev;
 
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e752x_pvt *)mci->pvt_info;
 	pvt->dev_info = &e752x_devs[dev_idx];
 	pvt->mc_symmetric = ((ddrcsr & 0x10) != 0);
@@ -1322,7 +1322,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e752x_check;
@@ -1344,7 +1344,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		mci->edac_cap = EDAC_FLAG_SECDED; /* the only mode supported */
 	else
 		mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E752X_TOLM, &pci_data);
@@ -1379,7 +1379,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -1395,7 +1395,7 @@ fail:
 static int __devinit e752x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	if (pci_enable_device(pdev) < 0)
@@ -1409,7 +1409,7 @@ static void __devexit e752x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e752x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e752x_pci)
 		edac_pci_release_generic_ctl(e752x_pci);
@@ -1455,7 +1455,7 @@ static int __init e752x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1466,7 +1466,7 @@ static int __init e752x_init(void)
 
 static void __exit e752x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	pci_unregister_driver(&e752x_driver);
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 2defa96..253e878 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -166,7 +166,7 @@ static const struct e7xxx_dev_info e7xxx_devs[] = {
 /* FIXME - is this valid for both SECDED and S4ECD4ED? */
 static inline int e7xxx_find_channel(u16 syndrome)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((syndrome & 0xff00) == 0)
 		return 0;
@@ -186,7 +186,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e7xxx_pvt *pvt = (struct e7xxx_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((page < pvt->tolm) ||
 		((page >= 0x100000) && (page < pvt->remapbase)))
@@ -208,7 +208,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	int row;
 	int channel;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_1b = info->dram_celog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -225,7 +225,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx CE log register overflow", "", NULL);
 }
@@ -235,7 +235,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	u32 error_2b, block_page;
 	int row;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_2b = info->dram_uelog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -248,7 +248,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx UE log register overflow", "", NULL);
@@ -334,7 +334,7 @@ static void e7xxx_check(struct mem_ctl_info *mci)
 {
 	struct e7xxx_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e7xxx_get_error_info(mci, &info);
 	e7xxx_process_error_info(mci, &info, 1);
 }
@@ -430,7 +430,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;
 	struct e7xxx_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
@@ -453,7 +453,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED |
 		EDAC_FLAG_S4ECD4ED;
@@ -461,7 +461,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
 	mci->pdev = &pdev->dev;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
 	pvt->bridge_ck = pci_get_device(PCI_VENDOR_ID_INTEL,
@@ -474,14 +474,14 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e7xxx_check;
 	mci->ctl_page_to_phys = ctl_page_to_phys;
 	e7xxx_init_csrows(mci, pdev, dev_idx, drc);
 	mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E7XXX_TOLM, &pci_data);
 	pvt->tolm = ((u32) pci_data) << 4;
@@ -516,7 +516,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -532,7 +532,7 @@ fail0:
 static int __devinit e7xxx_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	return pci_enable_device(pdev) ?
@@ -544,7 +544,7 @@ static void __devexit e7xxx_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e7xxx_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e7xxx_pci)
 		edac_pci_release_generic_ctl(e7xxx_pci);
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index cb397d9..ed46949 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -82,8 +82,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	void *pvt, *p;
 	int err;
 
-	debugf4("%s() instances=%d blocks=%d\n",
-		__func__, nr_instances, nr_blocks);
+	debugf4("instances=%d blocks=%d\n",
+		nr_instances, nr_blocks);
 
 	/* Calculate the size of memory we need to allocate AND
 	 * determine the offsets of the various item arrays
@@ -156,8 +156,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	/* Name of this edac device */
 	snprintf(dev_ctl->name,sizeof(dev_ctl->name),"%s",edac_device_name);
 
-	debugf4("%s() edac_dev=%p next after end=%p\n",
-		__func__, dev_ctl, pvt + sz_private );
+	debugf4("edac_dev=%p next after end=%p\n",
+		dev_ctl, pvt + sz_private );
 
 	/* Initialize every Instance */
 	for (instance = 0; instance < nr_instances; instance++) {
@@ -178,9 +178,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			snprintf(blk->name, sizeof(blk->name),
 				 "%s%d", edac_block_name, block+offset_value);
 
-			debugf4("%s() instance=%d inst_p=%p block=#%d "
+			debugf4("instance=%d inst_p=%p block=#%d "
 				"block_p=%p name='%s'\n",
-				__func__, instance, inst, block,
+				instance, inst, block,
 				blk, blk->name);
 
 			/* if there are NO attributes OR no attribute pointer
@@ -194,8 +194,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			attrib_p = &dev_attrib[block*nr_instances*nr_attrib];
 			blk->block_attributes = attrib_p;
 
-			debugf4("%s() THIS BLOCK_ATTRIB=%p\n",
-				__func__, blk->block_attributes);
+			debugf4("THIS BLOCK_ATTRIB=%p\n",
+				blk->block_attributes);
 
 			/* Initialize every user specified attribute in this
 			 * block with the data the caller passed in
@@ -214,9 +214,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 
 				attrib->block = blk;	/* up link */
 
-				debugf4("%s() alloc-attrib=%p attrib_name='%s' "
+				debugf4("alloc-attrib=%p attrib_name='%s' "
 					"attrib-spec=%p spec-name=%s\n",
-					__func__, attrib, attrib->attr.name,
+					attrib, attrib->attr.name,
 					&attrib_spec[attr],
 					attrib_spec[attr].attr.name
 					);
@@ -273,7 +273,7 @@ static struct edac_device_ctl_info *find_edac_device_by_dev(struct device *dev)
 	struct edac_device_ctl_info *edac_dev;
 	struct list_head *item;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	list_for_each(item, &edac_device_list) {
 		edac_dev = list_entry(item, struct edac_device_ctl_info, link);
@@ -408,7 +408,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 				unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* take the arg 'msec' and set it into the control structure
 	 * to used in the time period calculation
@@ -496,7 +496,7 @@ EXPORT_SYMBOL_GPL(edac_device_alloc_index);
  */
 int edac_device_add_device(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -570,7 +570,7 @@ struct edac_device_ctl_info *edac_device_del_device(struct device *dev)
 {
 	struct edac_device_ctl_info *edac_dev;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&device_ctls_mutex);
 
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index b4ea185..3c589c2 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -202,7 +202,7 @@ static void edac_device_ctrl_master_release(struct kobject *kobj)
 {
 	struct edac_device_ctl_info *edac_dev = to_edacdev(kobj);
 
-	debugf4("%s() control index=%d\n", __func__, edac_dev->dev_idx);
+	debugf4("control index=%d\n", edac_dev->dev_idx);
 
 	/* decrement the EDAC CORE module ref count */
 	module_put(edac_dev->owner);
@@ -233,12 +233,12 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	struct bus_type *edac_subsys;
 	int err;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the /sys/devices/system/edac reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys error\n", __func__);
+		debugf1("no edac_subsys error\n");
 		err = -ENODEV;
 		goto err_out;
 	}
@@ -264,8 +264,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 				   &edac_subsys->dev_root->kobj,
 				   "%s", edac_dev->name);
 	if (err) {
-		debugf1("%s()Failed to register '.../edac/%s'\n",
-			__func__, edac_dev->name);
+		debugf1("Failed to register '.../edac/%s'\n",
+			edac_dev->name);
 		goto err_kobj_reg;
 	}
 	kobject_uevent(&edac_dev->kobj, KOBJ_ADD);
@@ -274,8 +274,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	 * edac_device_unregister_sysfs_main_kobj() must be used
 	 */
 
-	debugf4("%s() Registered '.../edac/%s' kobject\n",
-		__func__, edac_dev->name);
+	debugf4("Registered '.../edac/%s' kobject\n",
+		edac_dev->name);
 
 	return 0;
 
@@ -296,9 +296,9 @@ err_out:
  */
 void edac_device_unregister_sysfs_main_kobj(struct edac_device_ctl_info *dev)
 {
-	debugf0("%s()\n", __func__);
-	debugf4("%s() name of kobject is: %s\n",
-		__func__, kobject_name(&dev->kobj));
+	debugf0("\n");
+	debugf4("name of kobject is: %s\n",
+		kobject_name(&dev->kobj));
 
 	/*
 	 * Unregister the edac device's kobject and
@@ -336,7 +336,7 @@ static void edac_device_ctrl_instance_release(struct kobject *kobj)
 {
 	struct edac_device_instance *instance;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* map from this kobj to the main control struct
 	 * and then dec the main kobj count
@@ -442,7 +442,7 @@ static void edac_device_ctrl_block_release(struct kobject *kobj)
 {
 	struct edac_device_block *block;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the container of the kobj */
 	block = to_block(kobj);
@@ -524,10 +524,10 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	struct edac_dev_sysfs_block_attribute *sysfs_attrib;
 	struct kobject *main_kobj;
 
-	debugf4("%s() Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
-		__func__, instance->name, instance, block->name, block);
-	debugf4("%s() block kobj=%p  block kobj->parent=%p\n",
-		__func__, &block->kobj, &block->kobj.parent);
+	debugf4("Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
+		instance->name, instance, block->name, block);
+	debugf4("block kobj=%p  block kobj->parent=%p\n",
+		&block->kobj, &block->kobj.parent);
 
 	/* init this block's kobject */
 	memset(&block->kobj, 0, sizeof(struct kobject));
@@ -546,8 +546,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 				   &instance->kobj,
 				   "%s", block->name);
 	if (err) {
-		debugf1("%s() Failed to register instance '%s'\n",
-			__func__, block->name);
+		debugf1("Failed to register instance '%s'\n",
+			block->name);
 		kobject_put(main_kobj);
 		err = -ENODEV;
 		goto err_out;
@@ -560,9 +560,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	if (sysfs_attrib && block->nr_attribs) {
 		for (i = 0; i < block->nr_attribs; i++, sysfs_attrib++) {
 
-			debugf4("%s() creating block attrib='%s' "
+			debugf4("creating block attrib='%s' "
 				"attrib->%p to kobj=%p\n",
-				__func__,
 				sysfs_attrib->attr.name,
 				sysfs_attrib, &block->kobj);
 
@@ -647,14 +646,14 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	err = kobject_init_and_add(&instance->kobj, &ktype_instance_ctrl,
 				   &edac_dev->kobj, "%s", instance->name);
 	if (err != 0) {
-		debugf2("%s() Failed to register instance '%s'\n",
-			__func__, instance->name);
+		debugf2("Failed to register instance '%s'\n",
+			instance->name);
 		kobject_put(main_kobj);
 		goto err_out;
 	}
 
-	debugf4("%s() now register '%d' blocks for instance %d\n",
-		__func__, instance->nr_blocks, idx);
+	debugf4("now register '%d' blocks for instance %d\n",
+		instance->nr_blocks, idx);
 
 	/* register all blocks of this instance */
 	for (i = 0; i < instance->nr_blocks; i++) {
@@ -670,8 +669,8 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	}
 	kobject_uevent(&instance->kobj, KOBJ_ADD);
 
-	debugf4("%s() Registered instance %d '%s' kobject\n",
-		__func__, idx, instance->name);
+	debugf4("Registered instance %d '%s' kobject\n",
+		idx, instance->name);
 
 	return 0;
 
@@ -715,7 +714,7 @@ static int edac_device_create_instances(struct edac_device_ctl_info *edac_dev)
 	int i, j;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* iterate over creation of the instances */
 	for (i = 0; i < edac_dev->nr_instances; i++) {
@@ -817,12 +816,12 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	int err;
 	struct kobject *edac_kobj = &edac_dev->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, edac_dev->dev_idx);
+	debugf0("idx=%d\n", edac_dev->dev_idx);
 
 	/*  go create any main attributes callers wants */
 	err = edac_device_add_main_sysfs_attributes(edac_dev);
 	if (err) {
-		debugf0("%s() failed to add sysfs attribs\n", __func__);
+		debugf0("failed to add sysfs attribs\n");
 		goto err_out;
 	}
 
@@ -849,8 +848,8 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	}
 
 
-	debugf4("%s() create-instances done, idx=%d\n",
-		__func__, edac_dev->dev_idx);
+	debugf4("create-instances done, idx=%d\n",
+		edac_dev->dev_idx);
 
 	return 0;
 
@@ -873,7 +872,7 @@ err_out:
  */
 void edac_device_remove_sysfs(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* remove any main attributes for this device */
 	edac_device_remove_main_sysfs_attributes(edac_dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 65568e6..32ed17b 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -259,13 +259,13 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
-		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		debugf4("errcount layer %d size %d\n", i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		tot_errcount += 2 * count;
 	}
 
-	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
+	debugf4("allocating %d error counters\n", tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
@@ -337,7 +337,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+	debugf4("initializing %d %s\n", tot_dimms,
 		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
@@ -451,7 +451,7 @@ EXPORT_SYMBOL_GPL(edac_mc_alloc);
  */
 void edac_mc_free(struct mem_ctl_info *mci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* the mci instance is freed here, when the sysfs object is dropped */
 	edac_unregister_sysfs(mci);
@@ -471,7 +471,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	struct mem_ctl_info *mci;
 	struct list_head *item;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
@@ -539,7 +539,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  */
 static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* if this instance is not in the POLL state, then simply return */
 	if (mci->op_state != OP_RUNNING_POLL)
@@ -566,8 +566,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 
 	status = cancel_delayed_work(&mci->work);
 	if (status == 0) {
-		debugf0("%s() not canceled, flush the queue\n",
-			__func__);
+		debugf0("not canceled, flush the queue\n");
 
 		/* workq instance might be running, wait for it */
 		flush_workqueue(edac_workqueue);
@@ -714,7 +713,7 @@ EXPORT_SYMBOL(edac_mc_find);
 /* FIXME - should a warning be printed if no error detection? correction? */
 int edac_mc_add_mc(struct mem_ctl_info *mci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -785,7 +784,7 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&mem_ctls_mutex);
 
@@ -823,7 +822,7 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 	void *virt_addr;
 	unsigned long flags = 0;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* ECC error page was not in our memory. Ignore it. */
 	if (!pfn_valid(page))
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 81ca073..c4ce26e 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -623,8 +623,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	err =  device_add(&dimm->dev);
 
-	debugf0("%s(): creating rank/dimm device %s\n", __func__,
-		dev_name(&dimm->dev));
+	debugf0("creating rank/dimm device %s\n", dev_name(&dimm->dev));
 
 	return err;
 }
@@ -981,8 +980,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	dev_set_drvdata(&mci->dev, mci);
 	pm_runtime_forbid(&mci->dev);
 
-	debugf0("%s(): creating device %s\n", __func__,
-		dev_name(&mci->dev));
+	debugf0("creating device %s\n", dev_name(&mci->dev));
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		bus_unregister(&mci->bus);
@@ -999,8 +997,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 		if (dimm->nr_pages == 0)
 			continue;
 #ifdef CONFIG_EDAC_DEBUG
-		debugf1("%s creating dimm%d, located at ",
-			__func__, i);
+		debugf1("creating dimm%d, located at ",
+			i);
 		if (edac_debug_level >= 1) {
 			int lay;
 			for (lay = 0; lay < mci->n_layers; lay++)
@@ -1012,8 +1010,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 #endif
 		err = edac_create_dimm_object(mci, dimm, i);
 		if (err) {
-			debugf1("%s() failure: create dimm %d obj\n",
-				__func__, i);
+			debugf1("failure: create dimm %d obj\n",
+				i);
 			goto fail;
 		}
 	}
@@ -1051,7 +1049,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
 	int i;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	debugfs_remove(mci->debugfs);
@@ -1064,8 +1062,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 		struct dimm_info *dimm = mci->dimms[i];
 		if (dimm->nr_pages == 0)
 			continue;
-		debugf0("%s(): removing device %s\n", __func__,
-			dev_name(&dimm->dev));
+		debugf0("removing device %s\n", dev_name(&dimm->dev));
 		put_device(&dimm->dev);
 		device_del(&dimm->dev);
 	}
@@ -1105,7 +1102,7 @@ int __init edac_mc_sysfs_init(void)
 	/* get the /sys/devices/system/edac subsys reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 8735a0d..9de2484 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -113,7 +113,7 @@ error:
  */
 static void __exit edac_exit(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* tear down the various subsystems */
 	edac_workqueue_teardown();
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index f1ac866..953959e 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -45,7 +45,7 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 	void *p = NULL, *pvt;
 	unsigned int size;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	pci = edac_align_ptr(&p, sizeof(*pci), 1);
 	pvt = edac_align_ptr(&p, 1, sz_pvt);
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_ctl_info);
  */
 void edac_pci_free_ctl_info(struct edac_pci_ctl_info *pci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	edac_pci_remove_sysfs(pci);
 }
@@ -97,7 +97,7 @@ static struct edac_pci_ctl_info *find_edac_pci_by_dev(struct device *dev)
 	struct edac_pci_ctl_info *pci;
 	struct list_head *item;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	list_for_each(item, &edac_pci_list) {
 		pci = list_entry(item, struct edac_pci_ctl_info, link);
@@ -122,7 +122,7 @@ static int add_edac_pci_to_global_list(struct edac_pci_ctl_info *pci)
 	struct list_head *item, *insert_before;
 	struct edac_pci_ctl_info *rover;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	insert_before = &edac_pci_list;
 
@@ -226,7 +226,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 	int msec;
 	unsigned long delay;
 
-	debugf3("%s() checking\n", __func__);
+	debugf3("checking\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -261,7 +261,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
 				 unsigned int msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	INIT_DELAYED_WORK(&pci->work, edac_pci_workq_function);
 	queue_delayed_work(edac_workqueue, &pci->work,
@@ -276,7 +276,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
 	int status;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	status = cancel_delayed_work(&pci->work);
 	if (status == 0)
@@ -293,7 +293,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 void edac_pci_reset_delay_period(struct edac_pci_ctl_info *pci,
 				 unsigned long value)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_workq_teardown(pci);
 
@@ -333,7 +333,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_index);
  */
 int edac_pci_add_device(struct edac_pci_ctl_info *pci, int edac_idx)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci->pci_idx = edac_idx;
 	pci->start_time = jiffies;
@@ -393,7 +393,7 @@ struct edac_pci_ctl_info *edac_pci_del_device(struct device *dev)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -430,7 +430,7 @@ EXPORT_SYMBOL_GPL(edac_pci_del_device);
  */
 static void edac_pci_generic_check(struct edac_pci_ctl_info *pci)
 {
-	debugf4("%s()\n", __func__);
+	debugf4("\n");
 	edac_pci_do_parity_check();
 }
 
@@ -491,7 +491,7 @@ EXPORT_SYMBOL_GPL(edac_pci_create_generic_ctl);
  */
 void edac_pci_release_generic_ctl(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() pci mod=%s\n", __func__, pci->mod_name);
+	debugf0("pci mod=%s\n", pci->mod_name);
 
 	edac_pci_del_device(pci->dev);
 	edac_pci_free_ctl_info(pci);
diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c
index 97f5064..330e820 100644
--- a/drivers/edac/edac_pci_sysfs.c
+++ b/drivers/edac/edac_pci_sysfs.c
@@ -78,7 +78,7 @@ static void edac_pci_instance_release(struct kobject *kobj)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Form pointer to containing struct, the pci control struct */
 	pci = to_instance(kobj);
@@ -161,7 +161,7 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	struct kobject *main_kobj;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* First bump the ref count on the top main kobj, which will
 	 * track the number of PCI instances we have, and thus nest
@@ -177,14 +177,14 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	err = kobject_init_and_add(&pci->kobj, &ktype_pci_instance,
 				   edac_pci_top_main_kobj, "pci%d", idx);
 	if (err != 0) {
-		debugf2("%s() failed to register instance pci%d\n",
-			__func__, idx);
+		debugf2("failed to register instance pci%d\n",
+			idx);
 		kobject_put(edac_pci_top_main_kobj);
 		goto error_out;
 	}
 
 	kobject_uevent(&pci->kobj, KOBJ_ADD);
-	debugf1("%s() Register instance 'pci%d' kobject\n", __func__, idx);
+	debugf1("Register instance 'pci%d' kobject\n", idx);
 
 	return 0;
 
@@ -201,7 +201,7 @@ error_out:
 static void edac_pci_unregister_sysfs_instance_kobj(
 			struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Unregister the instance kobject and allow its release
 	 * function release the main reference count and then
@@ -345,7 +345,7 @@ static int edac_pci_main_kobj_setup(void)
 	int err;
 	struct bus_type *edac_subsys;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* check and count if we have already created the main kobject */
 	if (atomic_inc_return(&edac_pci_sysfs_refcount) != 1)
@@ -356,7 +356,7 @@ static int edac_pci_main_kobj_setup(void)
 	 */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		err = -ENODEV;
 		goto decrement_count_fail;
 	}
@@ -421,15 +421,14 @@ decrement_count_fail:
  */
 static void edac_pci_main_kobj_teardown(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Decrement the count and only if no more controller instances
 	 * are connected perform the unregisteration of the top level
 	 * main kobj
 	 */
 	if (atomic_dec_return(&edac_pci_sysfs_refcount) == 0) {
-		debugf0("%s() called kobject_put on main kobj\n",
-			__func__);
+		debugf0("called kobject_put on main kobj\n");
 		kobject_put(edac_pci_top_main_kobj);
 	}
 	edac_put_sysfs_subsys();
@@ -446,7 +445,7 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 	int err;
 	struct kobject *edac_kobj = &pci->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, pci->pci_idx);
+	debugf0("idx=%d\n", pci->pci_idx);
 
 	/* create the top main EDAC PCI kobject, IF needed */
 	err = edac_pci_main_kobj_setup();
@@ -484,7 +483,7 @@ unregister_cleanup:
  */
 void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() index=%d\n", __func__, pci->pci_idx);
+	debugf0("index=%d\n", pci->pci_idx);
 
 	/* Remove the symlink */
 	sysfs_remove_link(&pci->kobj, EDAC_PCI_SYMLINK);
@@ -671,7 +670,7 @@ void edac_pci_do_parity_check(void)
 {
 	int before_count;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* if policy has PCI check off, leave now */
 	if (!check_pci_errors)
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 55eff02..3609742 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -322,7 +322,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	unsigned long mchbar;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	pci_read_config_dword(pdev, I3000_MCHBAR, (u32 *) & mchbar);
 	mchbar &= I3000_MCHBAR_MASK;
@@ -366,7 +366,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -445,7 +445,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -461,7 +461,7 @@ static int __devinit i3000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -477,7 +477,7 @@ static void __devexit i3000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i3000_pci)
 		edac_pci_release_generic_ctl(i3000_pci);
@@ -511,7 +511,7 @@ static int __init i3000_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -552,7 +552,7 @@ fail0:
 
 static void __exit i3000_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	pci_unregister_driver(&i3000_driver);
 	if (!i3000_registered) {
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 818ee6f..c5fea07 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -332,7 +332,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	void __iomem *window;
 	struct i3200_priv *priv;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	window = i3200_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -408,7 +408,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -424,7 +424,7 @@ static int __devinit i3200_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -441,7 +441,7 @@ static void __devexit i3200_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i3200_priv *priv;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -475,7 +475,7 @@ static int __init i3200_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -516,7 +516,7 @@ fail0:
 
 static void __exit i3200_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	pci_unregister_driver(&i3200_driver);
 	if (!i3200_registered) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 2a9f1dc..251544a 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1388,8 +1388,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
 
-	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
-		__func__, num_channels, num_dimms_per_channel);
+	debugf0("MC: Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
 
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 7425f17..2fbe4c5 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1143,7 +1143,7 @@ static void __devexit i7300_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	char *tmp;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	if (i7300_pci)
 		edac_pci_release_generic_ctl(i7300_pci);
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index ef237f4..2f7cc2a 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -824,7 +824,7 @@ static ssize_t i7core_inject_store_##param(			\
 	long value;						\
 	int rc;							\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	pvt = mci->pvt_info;					\
 								\
 	if (pvt->inject.enable)					\
@@ -852,7 +852,7 @@ static ssize_t i7core_inject_show_##param(			\
 	struct i7core_pvt *pvt;					\
 								\
 	pvt = mci->pvt_info;					\
-	debugf1("%s() pvt=%p\n", __func__, pvt);		\
+	debugf1("pvt=%p\n", pvt);		\
 	if (pvt->inject.param < 0)				\
 		return sprintf(data, "any\n");			\
 	else							\
@@ -1059,7 +1059,7 @@ static ssize_t i7core_show_counter_##param(			\
 	struct mem_ctl_info *mci = to_mci(dev);			\
 	struct i7core_pvt *pvt = mci->pvt_info;			\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	if (!pvt->ce_count_available || (pvt->is_registered))	\
 		return sprintf(data, "data unavailable\n");	\
 	return sprintf(data, "%lu\n",				\
@@ -1190,8 +1190,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 	dev_set_name(pvt->addrmatch_dev, "inject_addrmatch");
 	dev_set_drvdata(pvt->addrmatch_dev, mci);
 
-	debugf1("%s(): creating %s\n", __func__,
-		dev_name(pvt->addrmatch_dev));
+	debugf1("creating %s\n", dev_name(pvt->addrmatch_dev));
 
 	rc = device_add(pvt->addrmatch_dev);
 	if (rc < 0)
@@ -1213,8 +1212,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 		dev_set_name(pvt->chancounts_dev, "all_channel_counts");
 		dev_set_drvdata(pvt->chancounts_dev, mci);
 
-		debugf1("%s(): creating %s\n", __func__,
-			dev_name(pvt->chancounts_dev));
+		debugf1("creating %s\n", dev_name(pvt->chancounts_dev));
 
 		rc = device_add(pvt->chancounts_dev);
 		if (rc < 0)
@@ -1254,7 +1252,7 @@ static void i7core_put_devices(struct i7core_dev *i7core_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < i7core_dev->n_devs; i++) {
 		struct pci_dev *pdev = i7core_dev->pdev[i];
 		if (!pdev)
@@ -1652,7 +1650,7 @@ static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
 	int new0, new1, new2;
 
 	if (!pvt->pci_mcr[4]) {
-		debugf0("%s MCR registers not found\n", __func__);
+		debugf0("MCR registers not found\n");
 		return;
 	}
 
@@ -2402,7 +2400,7 @@ static void __devexit i7core_remove(struct pci_dev *pdev)
 {
 	struct i7core_dev *i7core_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index c0249f3..5361f9a 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -305,8 +305,8 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		edac_mode = EDAC_SECDED;
 		break;
 	default:
-		debugf0("%s(): Unknown/reserved ECC state "
-			"in NBXCFG register!\n", __func__);
+		debugf0("Unknown/reserved ECC state "
+			"in NBXCFG register!\n");
 		edac_mode = EDAC_UNKNOWN;
 		break;
 	}
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 6ff59b0..c097d7a 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -210,7 +210,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -245,7 +245,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -260,7 +260,7 @@ static int __devinit i82860_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82860_printk(KERN_INFO, "i82860 init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -278,7 +278,7 @@ static void __devexit i82860_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82860_pci)
 		edac_pci_release_generic_ctl(i82860_pci);
@@ -311,7 +311,7 @@ static int __init i82860_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -352,7 +352,7 @@ fail0:
 
 static void __exit i82860_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82860_driver);
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index c943904..66e5e98 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -405,7 +405,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 nr_chans;
 	struct i82875p_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
 
@@ -426,7 +426,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -437,7 +437,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82875p_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82875p_pvt *)mci->pvt_info;
 	pvt->ovrfl_pdev = ovrfl_pdev;
 	pvt->ovrfl_window = ovrfl_window;
@@ -464,7 +464,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -485,7 +485,7 @@ static int __devinit i82875p_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82875p_printk(KERN_INFO, "i82875p init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -504,7 +504,7 @@ static void __devexit i82875p_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82875p_pvt *pvt = NULL;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82875p_pci)
 		edac_pci_release_generic_ctl(i82875p_pci);
@@ -550,7 +550,7 @@ static int __init i82875p_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -593,7 +593,7 @@ fail0:
 
 static void __exit i82875p_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	i82875p_remove_one(mci_pdev);
 	pci_dev_put(mci_pdev);
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index a4a6768..dcb2182 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -489,11 +489,11 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	u8 c1drb[4];
 #endif
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci_read_config_dword(pdev, I82975X_MCHBAR, &mchbar);
 	if (!(mchbar & 1)) {
-		debugf3("%s(): failed, MCHBAR disabled!\n", __func__);
+		debugf3("failed, MCHBAR disabled!\n");
 		goto fail0;
 	}
 	mchbar &= 0xffffc000;	/* bits 31:14 used for 16K window */
@@ -558,7 +558,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail1;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -569,7 +569,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->mch_window = mch_window;
 	i82975x_init_csrows(mci, pdev, mch_window);
@@ -583,7 +583,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail2:
@@ -601,7 +601,7 @@ static int __devinit i82975x_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -619,7 +619,7 @@ static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82975x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci  == NULL)
@@ -655,7 +655,7 @@ static int __init i82975x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -697,7 +697,7 @@ fail0:
 
 static void __exit i82975x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82975x_driver);
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 1640d54..ecb59b4 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -303,7 +303,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
 
 	return 0;
@@ -321,7 +321,7 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 	struct edac_pci_ctl_info *pci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
 		 orig_pci_err_cap_dr);
@@ -610,7 +610,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 
 	devres_remove_group(&op->dev, mpc85xx_l2_err_probe);
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " L2 err registered\n");
 
 	return 0;
@@ -628,7 +628,7 @@ static int mpc85xx_l2_err_remove(struct platform_device *op)
 	struct edac_device_ctl_info *edac_dev = dev_get_drvdata(&op->dev);
 	struct mpc85xx_l2_pdata *pdata = edac_dev->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->l2_vbase + MPC85XX_L2_ERRINTEN, 0);
@@ -1038,7 +1038,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 |
 	    MEM_FLAG_DDR | MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -1104,7 +1104,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_mc_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " MC err registered\n");
 
 	return 0;
@@ -1122,7 +1122,7 @@ static int mpc85xx_mc_err_remove(struct platform_device *op)
 	struct mem_ctl_info *mci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_INT_EN, 0);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 59c399a..50bab21 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -194,7 +194,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_pci_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -210,7 +210,7 @@ static int mv64x60_pci_err_remove(struct platform_device *pdev)
 {
 	struct edac_pci_ctl_info *pci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_del_device(&pdev->dev);
 
@@ -363,7 +363,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_sram_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -379,7 +379,7 @@ static int mv64x60_sram_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -558,7 +558,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_cpu_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -574,7 +574,7 @@ static int mv64x60_cpu_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -766,7 +766,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_SECDED;
@@ -815,7 +815,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -831,7 +831,7 @@ static int mv64x60_mc_err_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 7b7eaf2..cd3ab28 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -236,13 +236,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
 
-		debugf1("%s() Row=%d DRBA = %#0x\n", __func__, index, drbar);
+		debugf1("Row=%d DRBA = %#0x\n", index, drbar);
 
 		row_high_limit = ((u32) drbar << 24);
 /*		row_high_limit = ((u32)drbar << 24) | 0xffffffUL; */
 
-		debugf1("%s() Row=%d, Boundary Address=%#0x, Last = %#0x\n",
-			__func__, index, row_high_limit, row_high_limit_last);
+		debugf1("Row=%d, Boundary Address=%#0x, Last = %#0x\n",
+			index, row_high_limit, row_high_limit_last);
 
 		/* Empty row [p.57] */
 		if (row_high_limit == row_high_limit_last)
@@ -277,14 +277,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 sdram_refresh_rate;
 	struct r82600_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_byte(pdev, R82600_DRAMC, &dramcr);
 	pci_read_config_dword(pdev, R82600_EAP, &eapr);
 	scrub_disabled = eapr & BIT(31);
 	sdram_refresh_rate = dramcr & (BIT(0) | BIT(1));
-	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
-		sdram_refresh_rate);
-	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
+	debugf2("sdram refresh rate = %#0x\n", sdram_refresh_rate);
+	debugf2("DRAMC register = %#0x\n", dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
 	layers[0].is_virt_csrow = true;
@@ -295,7 +294,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -311,8 +310,8 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 
 	if (ecc_enabled(dramcr)) {
 		if (scrub_disabled)
-			debugf3("%s(): mci = %p - Scrubbing disabled! EAP: "
-				"%#0x\n", __func__, mci, eapr);
+			debugf3("mci = %p - Scrubbing disabled! EAP: "
+				"%#0x\n", mci, eapr);
 	} else
 		mci->edac_cap = EDAC_FLAG_NONE;
 
@@ -352,7 +351,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -364,7 +363,7 @@ fail:
 static int __devinit r82600_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return r82600_probe1(pdev, ent->driver_data);
@@ -374,7 +373,7 @@ static void __devexit r82600_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (r82600_pci)
 		edac_pci_release_generic_ctl(r82600_pci);
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index bb7e95f..1dd6a98 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1064,7 +1064,7 @@ static void sbridge_put_devices(struct sbridge_dev *sbridge_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < sbridge_dev->n_devs; i++) {
 		struct pci_dev *pdev = sbridge_dev->pdev[i];
 		if (!pdev)
@@ -1760,7 +1760,7 @@ static void __devexit sbridge_remove(struct pci_dev *pdev)
 {
 	struct sbridge_dev *sbridge_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 219530b..771f78f 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -331,7 +331,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	bool stacked;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	window = x38_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -407,7 +407,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -423,7 +423,7 @@ static int __devinit x38_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC: \n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -439,7 +439,7 @@ static void __devexit x38_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -472,7 +472,7 @@ static int __init x38_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -513,7 +513,7 @@ fail0:
 
 static void __exit x38_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC: \n");
 
 	pci_unregister_driver(&x38_driver);
 	if (!x38_registered) {


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 15:11                                           ` Mauro Carvalho Chehab
@ 2012-04-29 16:03                                             ` Joe Perches
  -1 siblings, 0 replies; 206+ messages in thread
From: Joe Perches @ 2012-04-29 16:03 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

On Sun, 2012-04-29 at 12:11 -0300, Mauro Carvalho Chehab wrote:
> Em 29-04-2012 11:25, Mauro Carvalho Chehab escreveu:
> > Em 28-04-2012 05:52, Borislav Petkov escreveu:
> >> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
> >>> Yes. This is a common issue in the EDAC core: in several places, it calls the
> >>> edac debug macros (DEBUGF0...DEBUGF4) passing __func__ as an argument, while
> >>> the debug macros already handle that. I suspect that, in the past, __func__
> >>> was not in the macros, but some patch added it there and forgot to fix the
> >>> existing callers.
> >> The patch that added it is d357cbb445208 and you reviewed it.
> > And you wrote the patch that caused it.

And Boris should have also written the follow-on patches that
removed most/all of the debugfX and __func__ uses.

> > A single patch fixing this everywhere in drivers/edac is better and clearer than adding
> > an unrelated fix to this patch. This patch is already complex enough without adding more
> > unrelated things to it.
> > 
> > Also, a simple perl/coccinelle script can replace all such __func__ occurrences
> > in one shot.

You make it sound simple, but it'd be a pretty complicated
cocci script.  Some of the changes would have to be inspected
or changed by hand in any case.

[]

> Most of the issues can be solved with the above script-based patch.
> 
> There are still 171 places (12 in the core, the rest in the drivers)
> that will require a more sophisticated patch or a manual fix.
[]
> From: Mauro Carvalho Chehab <mchehab@redhat.com>
> Date: Sun, 29 Apr 2012 11:59:14 -0300
> Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

Thanks Mauro, you shouldn't have had to do this.




* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 15:11                                           ` Mauro Carvalho Chehab
  (?)
  (?)
@ 2012-04-29 16:20                                           ` Mauro Carvalho Chehab
  2012-04-29 16:43                                             ` Joe Perches
  -1 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 16:20 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

Em 29-04-2012 12:11, Mauro Carvalho Chehab escreveu:
> Em 29-04-2012 11:25, Mauro Carvalho Chehab escreveu:
>> Em 28-04-2012 05:52, Borislav Petkov escreveu:
>>> you simply need to remove the __func__ argument in your newly added debug call:
>>>
>>>                 debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
>>>                         i, (dimm - mci->dimms),
>>>                         pos[0], pos[1], pos[2], row, chn);
>>>
>>> And while you're at it, remove the rest of the __func__ arguments from
>>> your newly added debugfX calls.
>>
>> A single patch fixing this everywhere in drivers/edac is better and clearer than adding
>> an unrelated fix to this patch. This patch is already complex enough without adding more
>> unrelated things to it.
>>
>> Also, a simple perl/coccinelle script can replace all such __func__ occurrences
>> in one shot.
> 
> Most of the issues can be solved with the above script-based patch.
> 
> There are still 171 places (12 in the core, the rest in the drivers)
> that will require a more sophisticated patch or a manual fix.
> 
> -
> 
> From: Mauro Carvalho Chehab <mchehab@redhat.com>
> Date: Sun, 29 Apr 2012 11:59:14 -0300
> Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
> 
> The debug macro already adds that. Made by this small script:
> 
> $f .=$_ while (<>);
> 
> $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
> $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
> $f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;
> $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
> $f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
> 
> $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
> $f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*(\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
> 
> print $f;

The script below is even better. After running it, only 113 occurrences of
__func__ remain in drivers/edac, and some of them are not related to
debugf[0-9], so they shouldn't be covered by a patch like this.

I'll do some manual cleanup on top of it.
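
For readers following along, the redundancy being cleaned up can be sketched
in user space. This is only an illustration, not the kernel code:
demo_debugf(), probe_old(), probe_new() and count_occurrences() are
hypothetical names, and the macro merely assumes what the thread states,
namely that the debug macro already prepends the function name on its own.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical user-space stand-in for the EDAC debugfN() macros.
 * Assumption: like the in-kernel macros discussed in this thread, it
 * prepends the calling function's name on its own.
 * Note: ##__VA_ARGS__ is a GNU extension (accepted by gcc and clang).
 */
#define demo_debugf(buf, size, fmt, ...) \
	snprintf(buf, size, "%s: " fmt, __func__, ##__VA_ARGS__)

/* Old style: the caller passes __func__ again, so the name is printed twice. */
static void probe_old(char *buf, size_t size)
{
	assert(buf && size > 0);
	demo_debugf(buf, size, "%s(): success\n", __func__);
}

/* New style: only the macro supplies the function name. */
static void probe_new(char *buf, size_t size)
{
	assert(buf && size > 0);
	demo_debugf(buf, size, "success\n");
}

/* Count non-overlapping occurrences of needle in msg. */
static int count_occurrences(const char *msg, const char *needle)
{
	int n = 0;
	size_t len = strlen(needle);

	if (len == 0)
		return 0;
	for (msg = strstr(msg, needle); msg; msg = strstr(msg + len, needle))
		n++;
	return n;
}
```

With a 128-byte buffer, probe_old() yields "probe_old: probe_old(): success"
while probe_new() yields "probe_new: success", which is the difference the
patch below is after.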

From 7f175162e84eb2d2d4c401527d32fd1337826f7e Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Sun, 29 Apr 2012 11:59:14 -0300
Subject: [PATCH RFC] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macros already add those. Generated by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

print $f;

PS.: No manual fixes were applied after running the script.
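
As a rough illustration of the simplest rewrite the script performs, here is
a minimal C sketch for a single-line call. It is not the Perl script itself
and does not claim to match it case by case: strip_func_arg() is a
hypothetical helper, and it only handles calls whose format string starts
with "%s" and that fit on one line. Multi-line calls, which the script's
regexes also cover, are out of scope here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: rewrite one single-line call such as
 *     debugf3("%s(): success\n", __func__);
 * into
 *     debugf3("success\n");
 * Returns 1 if the line was rewritten, 0 if it was copied unchanged.
 */
static int strip_func_arg(const char *in, char *out, size_t size)
{
	const char *call, *fmt, *rest = NULL, *func = NULL, *mid_end, *tail;

	assert(in && out && size > 0);

	call = strstr(in, "debugf");
	fmt = call ? strstr(call, "(\"%s") : NULL;
	if (fmt) {
		/* Skip the leading %s plus any ':', ',', '(' or ')' and blanks. */
		rest = fmt + strlen("(\"%s");
		rest += strspn(rest, ":,()");
		rest += strspn(rest, " \t");
		func = strstr(rest, "__func__");
	}
	if (!func) {
		snprintf(out, size, "%s", in);
		return 0;
	}

	/* What follows __func__ decides which comma goes away. */
	tail = func + strlen("__func__");
	tail += strspn(tail, " \t");
	mid_end = func;
	if (*tail == ',') {
		/* More arguments follow: drop "__func__," and the blanks after it. */
		tail++;
		tail += strspn(tail, " \t");
	} else {
		/* Last argument: drop the comma in front of __func__ instead. */
		while (mid_end > rest &&
		       (mid_end[-1] == ' ' || mid_end[-1] == '\t'))
			mid_end--;
		if (mid_end > rest && mid_end[-1] == ',')
			mid_end--;
	}

	snprintf(out, size, "%.*s%.*s%s",
		 (int)(fmt + 2 - in), in,	/* up to and including (" */
		 (int)(mid_end - rest), rest,	/* remaining format and arguments */
		 tail);				/* rest of the line */
	return 1;
}
```

Lines that contain no debugfN call, or no __func__ argument, are copied
through unchanged, so the helper could be run over every line of a file.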

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index be6c225..4ed97bf 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -180,7 +180,7 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 static void amd76x_check(struct mem_ctl_info *mci)
 {
 	struct amd76x_error_info info;
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	amd76x_get_error_info(mci, &info);
 	amd76x_process_error_info(mci, &info, 1);
 }
@@ -241,7 +241,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 ems_mode;
 	struct amd76x_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
 
@@ -256,7 +256,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -276,7 +276,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -292,7 +292,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -304,7 +304,7 @@ fail:
 static int __devinit amd76x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return amd76x_probe1(pdev, ent->driver_data);
@@ -322,7 +322,7 @@ static void __devexit amd76x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (amd76x_pci)
 		edac_pci_release_generic_ctl(amd76x_pci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 31b3c91..9ee1194 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -316,13 +316,12 @@ static void get_total_mem(struct cpc925_mc_pdata *pdata)
 		reg += aw;
 		size = of_read_number(reg, sw);
 		reg += sw;
-		debugf1("%s: start 0x%lx, size 0x%lx\n", __func__,
-			start, size);
+		debugf1("start 0x%lx, size 0x%lx\n", start, size);
 		pdata->total_mem += size;
 	} while (reg < reg_end);
 
 	of_node_put(np);
-	debugf0("%s: total_mem 0x%lx\n", __func__, pdata->total_mem);
+	debugf0("total_mem 0x%lx\n", pdata->total_mem);
 }
 
 static void cpc925_init_csrows(struct mem_ctl_info *mci)
@@ -512,7 +511,7 @@ static void cpc925_mc_get_pfn(struct mem_ctl_info *mci, u32 mear,
 	*offset = pa & (PAGE_SIZE - 1);
 	*pfn = pa >> PAGE_SHIFT;
 
-	debugf0("%s: ECC physical address 0x%lx\n", __func__, pa);
+	debugf0("ECC physical address 0x%lx\n", pa);
 }
 
 static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
@@ -852,8 +851,8 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			goto err2;
 		}
 
-		debugf0("%s: Successfully added edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully added edac device for %s\n",
+			dev_info->ctl_name);
 
 		continue;
 
@@ -884,8 +883,8 @@ static void cpc925_del_edac_devices(void)
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
 
-		debugf0("%s: Successfully deleted edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully deleted edac device for %s\n",
+			dev_info->ctl_name);
 	}
 }
 
@@ -900,7 +899,7 @@ static int cpc925_get_sdram_scrub_rate(struct mem_ctl_info *mci)
 	mscr = __raw_readl(pdata->vbase + REG_MSCR_OFFSET);
 	si = (mscr & MSCR_SI_MASK) >> MSCR_SI_SHIFT;
 
-	debugf0("%s, Mem Scrub Ctrl Register 0x%x\n", __func__, mscr);
+	debugf0("Mem Scrub Ctrl Register 0x%x\n", mscr);
 
 	if (((mscr & MSCR_SCRUB_MOD_MASK) != MSCR_BACKGR_SCRUB) ||
 	    (si == 0)) {
@@ -928,8 +927,7 @@ static int cpc925_mc_get_channels(void __iomem *vbase)
 	    ((mbcr & MBCR_64BITBUS_MASK) == 0))
 		dual = 1;
 
-	debugf0("%s: %s channel\n", __func__,
-		(dual > 0) ? "Dual" : "Single");
+	debugf0("%s channel\n", (dual > 0) ? "Dual" : "Single");
 
 	return dual;
 }
@@ -944,7 +942,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	struct resource *r;
 	int res = 0, nr_channels;
 
-	debugf0("%s: %s platform device found!\n", __func__, pdev->name);
+	debugf0("%s platform device found!\n", pdev->name);
 
 	if (!devres_open_group(&pdev->dev, cpc925_probe, GFP_KERNEL)) {
 		res = -ENOMEM;
@@ -1026,7 +1024,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	cpc925_add_edac_devices(vbase);
 
 	/* get this far and it's successful */
-	debugf0("%s: success\n", __func__);
+	debugf0("success\n");
 
 	res = 0;
 	goto out;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7e601c1..5a599a3 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -309,7 +309,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (page < pvt->tolm)
 		return page;
@@ -335,7 +335,7 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	int i;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* convert the addr to 4k page */
 	page = sec1_add >> (PAGE_SHIFT - 4);
@@ -394,7 +394,7 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 	int row;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (error_one & 0x0202) {
 		error_2b = ded_add;
@@ -453,7 +453,7 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 	if (!handle_error)
 		return;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
 			     -1, -1, -1,
 			     "e752x UE log memory write", "", NULL);
@@ -982,7 +982,7 @@ static void e752x_check(struct mem_ctl_info *mci)
 {
 	struct e752x_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e752x_get_error_info(mci, &info);
 	e752x_process_error_info(mci, &info, 1);
 }
@@ -1102,7 +1102,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E752X_DRB + index, &value);
 		/* convert a 128 or 64 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -1270,7 +1270,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
 	struct e752x_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 	debugf0("Starting Probe1\n");
 
 	/* check to see if device 0 function 1 is enabled; if it isn't, we
@@ -1302,7 +1302,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	/* 3100 IMCH supports SECDEC only */
 	mci->edac_ctl_cap = (dev_idx == I3100) ? EDAC_FLAG_SECDED :
@@ -1312,7 +1312,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_ver = E752X_REVISION;
 	mci->pdev = &pdev->dev;
 
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e752x_pvt *)mci->pvt_info;
 	pvt->dev_info = &e752x_devs[dev_idx];
 	pvt->mc_symmetric = ((ddrcsr & 0x10) != 0);
@@ -1322,7 +1322,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e752x_check;
@@ -1344,7 +1344,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		mci->edac_cap = EDAC_FLAG_SECDED; /* the only mode supported */
 	else
 		mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E752X_TOLM, &pci_data);
@@ -1361,7 +1361,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -1379,7 +1379,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -1395,7 +1395,7 @@ fail:
 static int __devinit e752x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	if (pci_enable_device(pdev) < 0)
@@ -1409,7 +1409,7 @@ static void __devexit e752x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e752x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e752x_pci)
 		edac_pci_release_generic_ctl(e752x_pci);
@@ -1455,7 +1455,7 @@ static int __init e752x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1466,7 +1466,7 @@ static int __init e752x_init(void)
 
 static void __exit e752x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	pci_unregister_driver(&e752x_driver);
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 2defa96..2850d00 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -166,7 +166,7 @@ static const struct e7xxx_dev_info e7xxx_devs[] = {
 /* FIXME - is this valid for both SECDED and S4ECD4ED? */
 static inline int e7xxx_find_channel(u16 syndrome)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((syndrome & 0xff00) == 0)
 		return 0;
@@ -186,7 +186,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e7xxx_pvt *pvt = (struct e7xxx_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((page < pvt->tolm) ||
 		((page >= 0x100000) && (page < pvt->remapbase)))
@@ -208,7 +208,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	int row;
 	int channel;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_1b = info->dram_celog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -225,7 +225,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx CE log register overflow", "", NULL);
 }
@@ -235,7 +235,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	u32 error_2b, block_page;
 	int row;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_2b = info->dram_uelog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -248,7 +248,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx UE log register overflow", "", NULL);
@@ -334,7 +334,7 @@ static void e7xxx_check(struct mem_ctl_info *mci)
 {
 	struct e7xxx_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e7xxx_get_error_info(mci, &info);
 	e7xxx_process_error_info(mci, &info, 1);
 }
@@ -383,7 +383,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E7XXX_DRB + index, &value);
 		/* convert a 64 or 32 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -430,7 +430,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;
 	struct e7xxx_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
@@ -453,7 +453,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED |
 		EDAC_FLAG_S4ECD4ED;
@@ -461,7 +461,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
 	mci->pdev = &pdev->dev;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
 	pvt->bridge_ck = pci_get_device(PCI_VENDOR_ID_INTEL,
@@ -474,14 +474,14 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e7xxx_check;
 	mci->ctl_page_to_phys = ctl_page_to_phys;
 	e7xxx_init_csrows(mci, pdev, dev_idx, drc);
 	mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E7XXX_TOLM, &pci_data);
 	pvt->tolm = ((u32) pci_data) << 4;
@@ -500,7 +500,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -516,7 +516,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -532,7 +532,7 @@ fail0:
 static int __devinit e7xxx_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	return pci_enable_device(pdev) ?
@@ -544,7 +544,7 @@ static void __devexit e7xxx_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e7xxx_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e7xxx_pci)
 		edac_pci_release_generic_ctl(e7xxx_pci);
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index cb397d9..ed46949 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -82,8 +82,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	void *pvt, *p;
 	int err;
 
-	debugf4("%s() instances=%d blocks=%d\n",
-		__func__, nr_instances, nr_blocks);
+	debugf4("instances=%d blocks=%d\n",
+		nr_instances, nr_blocks);
 
 	/* Calculate the size of memory we need to allocate AND
 	 * determine the offsets of the various item arrays
@@ -156,8 +156,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	/* Name of this edac device */
 	snprintf(dev_ctl->name,sizeof(dev_ctl->name),"%s",edac_device_name);
 
-	debugf4("%s() edac_dev=%p next after end=%p\n",
-		__func__, dev_ctl, pvt + sz_private );
+	debugf4("edac_dev=%p next after end=%p\n",
+		dev_ctl, pvt + sz_private );
 
 	/* Initialize every Instance */
 	for (instance = 0; instance < nr_instances; instance++) {
@@ -178,9 +178,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			snprintf(blk->name, sizeof(blk->name),
 				 "%s%d", edac_block_name, block+offset_value);
 
-			debugf4("%s() instance=%d inst_p=%p block=#%d "
+			debugf4("instance=%d inst_p=%p block=#%d "
 				"block_p=%p name='%s'\n",
-				__func__, instance, inst, block,
+				instance, inst, block,
 				blk, blk->name);
 
 			/* if there are NO attributes OR no attribute pointer
@@ -194,8 +194,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			attrib_p = &dev_attrib[block*nr_instances*nr_attrib];
 			blk->block_attributes = attrib_p;
 
-			debugf4("%s() THIS BLOCK_ATTRIB=%p\n",
-				__func__, blk->block_attributes);
+			debugf4("THIS BLOCK_ATTRIB=%p\n",
+				blk->block_attributes);
 
 			/* Initialize every user specified attribute in this
 			 * block with the data the caller passed in
@@ -214,9 +214,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 
 				attrib->block = blk;	/* up link */
 
-				debugf4("%s() alloc-attrib=%p attrib_name='%s' "
+				debugf4("alloc-attrib=%p attrib_name='%s' "
 					"attrib-spec=%p spec-name=%s\n",
-					__func__, attrib, attrib->attr.name,
+					attrib, attrib->attr.name,
 					&attrib_spec[attr],
 					attrib_spec[attr].attr.name
 					);
@@ -273,7 +273,7 @@ static struct edac_device_ctl_info *find_edac_device_by_dev(struct device *dev)
 	struct edac_device_ctl_info *edac_dev;
 	struct list_head *item;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	list_for_each(item, &edac_device_list) {
 		edac_dev = list_entry(item, struct edac_device_ctl_info, link);
@@ -408,7 +408,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 				unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* take the arg 'msec' and set it into the control structure
 	 * to used in the time period calculation
@@ -496,7 +496,7 @@ EXPORT_SYMBOL_GPL(edac_device_alloc_index);
  */
 int edac_device_add_device(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -570,7 +570,7 @@ struct edac_device_ctl_info *edac_device_del_device(struct device *dev)
 {
 	struct edac_device_ctl_info *edac_dev;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&device_ctls_mutex);
 
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index b4ea185..1cee83e 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -202,7 +202,7 @@ static void edac_device_ctrl_master_release(struct kobject *kobj)
 {
 	struct edac_device_ctl_info *edac_dev = to_edacdev(kobj);
 
-	debugf4("%s() control index=%d\n", __func__, edac_dev->dev_idx);
+	debugf4("control index=%d\n", edac_dev->dev_idx);
 
 	/* decrement the EDAC CORE module ref count */
 	module_put(edac_dev->owner);
@@ -233,12 +233,12 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	struct bus_type *edac_subsys;
 	int err;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the /sys/devices/system/edac reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys error\n", __func__);
+		debugf1("no edac_subsys error\n");
 		err = -ENODEV;
 		goto err_out;
 	}
@@ -264,8 +264,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 				   &edac_subsys->dev_root->kobj,
 				   "%s", edac_dev->name);
 	if (err) {
-		debugf1("%s()Failed to register '.../edac/%s'\n",
-			__func__, edac_dev->name);
+		debugf1("Failed to register '.../edac/%s'\n",
+			edac_dev->name);
 		goto err_kobj_reg;
 	}
 	kobject_uevent(&edac_dev->kobj, KOBJ_ADD);
@@ -274,8 +274,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	 * edac_device_unregister_sysfs_main_kobj() must be used
 	 */
 
-	debugf4("%s() Registered '.../edac/%s' kobject\n",
-		__func__, edac_dev->name);
+	debugf4("Registered '.../edac/%s' kobject\n",
+		edac_dev->name);
 
 	return 0;
 
@@ -296,9 +296,9 @@ err_out:
  */
 void edac_device_unregister_sysfs_main_kobj(struct edac_device_ctl_info *dev)
 {
-	debugf0("%s()\n", __func__);
-	debugf4("%s() name of kobject is: %s\n",
-		__func__, kobject_name(&dev->kobj));
+	debugf0("\n");
+	debugf4("name of kobject is: %s\n",
+		kobject_name(&dev->kobj));
 
 	/*
 	 * Unregister the edac device's kobject and
@@ -336,7 +336,7 @@ static void edac_device_ctrl_instance_release(struct kobject *kobj)
 {
 	struct edac_device_instance *instance;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* map from this kobj to the main control struct
 	 * and then dec the main kobj count
@@ -442,7 +442,7 @@ static void edac_device_ctrl_block_release(struct kobject *kobj)
 {
 	struct edac_device_block *block;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the container of the kobj */
 	block = to_block(kobj);
@@ -524,10 +524,10 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	struct edac_dev_sysfs_block_attribute *sysfs_attrib;
 	struct kobject *main_kobj;
 
-	debugf4("%s() Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
-		__func__, instance->name, instance, block->name, block);
-	debugf4("%s() block kobj=%p  block kobj->parent=%p\n",
-		__func__, &block->kobj, &block->kobj.parent);
+	debugf4("Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
+		instance->name, instance, block->name, block);
+	debugf4("block kobj=%p  block kobj->parent=%p\n",
+		&block->kobj, &block->kobj.parent);
 
 	/* init this block's kobject */
 	memset(&block->kobj, 0, sizeof(struct kobject));
@@ -546,8 +546,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 				   &instance->kobj,
 				   "%s", block->name);
 	if (err) {
-		debugf1("%s() Failed to register instance '%s'\n",
-			__func__, block->name);
+		debugf1("Failed to register instance '%s'\n",
+			block->name);
 		kobject_put(main_kobj);
 		err = -ENODEV;
 		goto err_out;
@@ -560,9 +560,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	if (sysfs_attrib && block->nr_attribs) {
 		for (i = 0; i < block->nr_attribs; i++, sysfs_attrib++) {
 
-			debugf4("%s() creating block attrib='%s' "
+			debugf4("creating block attrib='%s' "
 				"attrib->%p to kobj=%p\n",
-				__func__,
 				sysfs_attrib->attr.name,
 				sysfs_attrib, &block->kobj);
 
@@ -647,14 +646,14 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	err = kobject_init_and_add(&instance->kobj, &ktype_instance_ctrl,
 				   &edac_dev->kobj, "%s", instance->name);
 	if (err != 0) {
-		debugf2("%s() Failed to register instance '%s'\n",
-			__func__, instance->name);
+		debugf2("Failed to register instance '%s'\n",
+			instance->name);
 		kobject_put(main_kobj);
 		goto err_out;
 	}
 
-	debugf4("%s() now register '%d' blocks for instance %d\n",
-		__func__, instance->nr_blocks, idx);
+	debugf4("now register '%d' blocks for instance %d\n",
+		instance->nr_blocks, idx);
 
 	/* register all blocks of this instance */
 	for (i = 0; i < instance->nr_blocks; i++) {
@@ -670,8 +669,8 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	}
 	kobject_uevent(&instance->kobj, KOBJ_ADD);
 
-	debugf4("%s() Registered instance %d '%s' kobject\n",
-		__func__, idx, instance->name);
+	debugf4("Registered instance %d '%s' kobject\n",
+		idx, instance->name);
 
 	return 0;
 
@@ -715,7 +714,7 @@ static int edac_device_create_instances(struct edac_device_ctl_info *edac_dev)
 	int i, j;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* iterate over creation of the instances */
 	for (i = 0; i < edac_dev->nr_instances; i++) {
@@ -817,12 +816,12 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	int err;
 	struct kobject *edac_kobj = &edac_dev->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, edac_dev->dev_idx);
+	debugf0("idx=%d\n", edac_dev->dev_idx);
 
 	/*  go create any main attributes callers wants */
 	err = edac_device_add_main_sysfs_attributes(edac_dev);
 	if (err) {
-		debugf0("%s() failed to add sysfs attribs\n", __func__);
+		debugf0("failed to add sysfs attribs\n");
 		goto err_out;
 	}
 
@@ -832,8 +831,8 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	err = sysfs_create_link(edac_kobj,
 				&edac_dev->dev->kobj, EDAC_DEVICE_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto err_remove_main_attribs;
 	}
 
@@ -843,14 +842,14 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	 */
 	err = edac_device_create_instances(edac_dev);
 	if (err) {
-		debugf0("%s() edac_device_create_instances() "
-			"returned err= %d\n", __func__, err);
+		debugf0("edac_device_create_instances() "
+			"returned err= %d\n", err);
 		goto err_remove_link;
 	}
 
 
-	debugf4("%s() create-instances done, idx=%d\n",
-		__func__, edac_dev->dev_idx);
+	debugf4("create-instances done, idx=%d\n",
+		edac_dev->dev_idx);
 
 	return 0;
 
@@ -873,7 +872,7 @@ err_out:
  */
 void edac_device_remove_sysfs(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* remove any main attributes for this device */
 	edac_device_remove_main_sysfs_attributes(edac_dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 65568e6..5c3d5eb 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -259,18 +259,18 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
-		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		debugf4("errcount layer %d size %d\n", i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		tot_errcount += 2 * count;
 	}
 
-	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
+	debugf4("allocating %d error counters\n", tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
-	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
-		__func__, size,
+	debugf1("allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		size,
 		tot_dimms,
 		per_rank ? "ranks" : "dimms",
 		tot_csrows * tot_channels);
@@ -337,7 +337,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+	debugf4("initializing %d %s\n", tot_dimms,
 		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
@@ -351,8 +351,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		debugf2("%s: %d: %s%i (%d:%d:%d): row %d, chan %d\n", __func__,
-			i, per_rank ? "rank" : "dimm", off,
+		debugf2("%d: %s%i (%d:%d:%d): row %d, chan %d\n", i, per_rank ? "rank" : "dimm", off,
 			pos[0], pos[1], pos[2], row, chn);
 
 		/*
@@ -451,7 +450,7 @@ EXPORT_SYMBOL_GPL(edac_mc_alloc);
  */
 void edac_mc_free(struct mem_ctl_info *mci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* the mci instance is freed here, when the sysfs object is dropped */
 	edac_unregister_sysfs(mci);
@@ -471,7 +470,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	struct mem_ctl_info *mci;
 	struct list_head *item;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
@@ -539,7 +538,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  */
 static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* if this instance is not in the POLL state, then simply return */
 	if (mci->op_state != OP_RUNNING_POLL)
@@ -566,8 +565,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 
 	status = cancel_delayed_work(&mci->work);
 	if (status == 0) {
-		debugf0("%s() not canceled, flush the queue\n",
-			__func__);
+		debugf0("not canceled, flush the queue\n");
 
 		/* workq instance might be running, wait for it */
 		flush_workqueue(edac_workqueue);
@@ -714,7 +712,7 @@ EXPORT_SYMBOL(edac_mc_find);
 /* FIXME - should a warning be printed if no error detection? correction? */
 int edac_mc_add_mc(struct mem_ctl_info *mci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -785,7 +783,7 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&mem_ctls_mutex);
 
@@ -823,7 +821,7 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 	void *virt_addr;
 	unsigned long flags = 0;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* ECC error page was not in our memory. Ignore it. */
 	if (!pfn_valid(page))
@@ -1043,8 +1041,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			 * get csrow/channel of the dimm, in order to allow
 			 * incrementing the compat API counters
 			 */
-			debugf4("%s: %s csrows map: (%d,%d)\n",
-				__func__,
+			debugf4("%s csrows map: (%d,%d)\n",
 				mci->mem_is_per_rank ? "rank" : "dimm",
 				dimm->csrow, dimm->cschannel);
 			if (row == -1)
@@ -1060,8 +1057,8 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	if (!enable_filter) {
 		strcpy(label, "any memory");
 	} else {
-		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
-			__func__, row, chan);
+		debugf4("csrow/channel to increment: (%d,%d)\n",
+			row, chan);
 		if (p == label)
 			strcpy(label, "unknown memory");
 		if (type == HW_EVENT_ERR_CORRECTED) {
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 81ca073..8f96c49 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -376,8 +376,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	dev_set_name(&csrow->dev, "csrow%d", index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
-	debugf0("%s(): creating (virtual) csrow node %s\n", __func__,
-		dev_name(&csrow->dev));
+	debugf0("creating (virtual) csrow node %s\n", dev_name(&csrow->dev));
 
 	err = device_add(&csrow->dev);
 	if (err < 0)
@@ -623,8 +622,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	err =  device_add(&dimm->dev);
 
-	debugf0("%s(): creating rank/dimm device %s\n", __func__,
-		dev_name(&dimm->dev));
+	debugf0("creating rank/dimm device %s\n", dev_name(&dimm->dev));
 
 	return err;
 }
@@ -981,8 +979,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	dev_set_drvdata(&mci->dev, mci);
 	pm_runtime_forbid(&mci->dev);
 
-	debugf0("%s(): creating device %s\n", __func__,
-		dev_name(&mci->dev));
+	debugf0("creating device %s\n", dev_name(&mci->dev));
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		bus_unregister(&mci->bus);
@@ -999,8 +996,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 		if (dimm->nr_pages == 0)
 			continue;
 #ifdef CONFIG_EDAC_DEBUG
-		debugf1("%s creating dimm%d, located at ",
-			__func__, i);
+		debugf1("creating dimm%d, located at ",
+			i);
 		if (edac_debug_level >= 1) {
 			int lay;
 			for (lay = 0; lay < mci->n_layers; lay++)
@@ -1012,8 +1009,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 #endif
 		err = edac_create_dimm_object(mci, dimm, i);
 		if (err) {
-			debugf1("%s() failure: create dimm %d obj\n",
-				__func__, i);
+			debugf1("failure: create dimm %d obj\n",
+				i);
 			goto fail;
 		}
 	}
@@ -1051,7 +1048,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
 	int i;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	debugfs_remove(mci->debugfs);
@@ -1064,8 +1061,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 		struct dimm_info *dimm = mci->dimms[i];
 		if (dimm->nr_pages == 0)
 			continue;
-		debugf0("%s(): removing device %s\n", __func__,
-			dev_name(&dimm->dev));
+		debugf0("removing device %s\n", dev_name(&dimm->dev));
 		put_device(&dimm->dev);
 		device_del(&dimm->dev);
 	}
@@ -1105,7 +1101,7 @@ int __init edac_mc_sysfs_init(void)
 	/* get the /sys/devices/system/edac subsys reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 8735a0d..9de2484 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -113,7 +113,7 @@ error:
  */
 static void __exit edac_exit(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* tear down the various subsystems */
 	edac_workqueue_teardown();
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index f1ac866..51dd4e0 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -45,7 +45,7 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 	void *p = NULL, *pvt;
 	unsigned int size;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	pci = edac_align_ptr(&p, sizeof(*pci), 1);
 	pvt = edac_align_ptr(&p, 1, sz_pvt);
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_ctl_info);
  */
 void edac_pci_free_ctl_info(struct edac_pci_ctl_info *pci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	edac_pci_remove_sysfs(pci);
 }
@@ -97,7 +97,7 @@ static struct edac_pci_ctl_info *find_edac_pci_by_dev(struct device *dev)
 	struct edac_pci_ctl_info *pci;
 	struct list_head *item;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	list_for_each(item, &edac_pci_list) {
 		pci = list_entry(item, struct edac_pci_ctl_info, link);
@@ -122,7 +122,7 @@ static int add_edac_pci_to_global_list(struct edac_pci_ctl_info *pci)
 	struct list_head *item, *insert_before;
 	struct edac_pci_ctl_info *rover;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	insert_before = &edac_pci_list;
 
@@ -226,7 +226,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 	int msec;
 	unsigned long delay;
 
-	debugf3("%s() checking\n", __func__);
+	debugf3("checking\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -261,7 +261,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
 				 unsigned int msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	INIT_DELAYED_WORK(&pci->work, edac_pci_workq_function);
 	queue_delayed_work(edac_workqueue, &pci->work,
@@ -276,7 +276,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
 	int status;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	status = cancel_delayed_work(&pci->work);
 	if (status == 0)
@@ -293,7 +293,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 void edac_pci_reset_delay_period(struct edac_pci_ctl_info *pci,
 				 unsigned long value)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_workq_teardown(pci);
 
@@ -333,7 +333,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_index);
  */
 int edac_pci_add_device(struct edac_pci_ctl_info *pci, int edac_idx)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci->pci_idx = edac_idx;
 	pci->start_time = jiffies;
@@ -393,7 +393,7 @@ struct edac_pci_ctl_info *edac_pci_del_device(struct device *dev)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -430,7 +430,7 @@ EXPORT_SYMBOL_GPL(edac_pci_del_device);
  */
 static void edac_pci_generic_check(struct edac_pci_ctl_info *pci)
 {
-	debugf4("%s()\n", __func__);
+	debugf4("\n");
 	edac_pci_do_parity_check();
 }
 
@@ -475,7 +475,7 @@ struct edac_pci_ctl_info *edac_pci_create_generic_ctl(struct device *dev,
 	pdata->edac_idx = edac_pci_idx++;
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		edac_pci_free_ctl_info(pci);
 		return NULL;
 	}
@@ -491,7 +491,7 @@ EXPORT_SYMBOL_GPL(edac_pci_create_generic_ctl);
  */
 void edac_pci_release_generic_ctl(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() pci mod=%s\n", __func__, pci->mod_name);
+	debugf0("pci mod=%s\n", pci->mod_name);
 
 	edac_pci_del_device(pci->dev);
 	edac_pci_free_ctl_info(pci);
diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c
index 97f5064..6678216 100644
--- a/drivers/edac/edac_pci_sysfs.c
+++ b/drivers/edac/edac_pci_sysfs.c
@@ -78,7 +78,7 @@ static void edac_pci_instance_release(struct kobject *kobj)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Form pointer to containing struct, the pci control struct */
 	pci = to_instance(kobj);
@@ -161,7 +161,7 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	struct kobject *main_kobj;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* First bump the ref count on the top main kobj, which will
 	 * track the number of PCI instances we have, and thus nest
@@ -177,14 +177,14 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	err = kobject_init_and_add(&pci->kobj, &ktype_pci_instance,
 				   edac_pci_top_main_kobj, "pci%d", idx);
 	if (err != 0) {
-		debugf2("%s() failed to register instance pci%d\n",
-			__func__, idx);
+		debugf2("failed to register instance pci%d\n",
+			idx);
 		kobject_put(edac_pci_top_main_kobj);
 		goto error_out;
 	}
 
 	kobject_uevent(&pci->kobj, KOBJ_ADD);
-	debugf1("%s() Register instance 'pci%d' kobject\n", __func__, idx);
+	debugf1("Register instance 'pci%d' kobject\n", idx);
 
 	return 0;
 
@@ -201,7 +201,7 @@ error_out:
 static void edac_pci_unregister_sysfs_instance_kobj(
 			struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Unregister the instance kobject and allow its release
 	 * function release the main reference count and then
@@ -317,7 +317,7 @@ static struct edac_pci_dev_attribute *edac_pci_attr[] = {
  */
 static void edac_pci_release_main_kobj(struct kobject *kobj)
 {
-	debugf0("%s() here to module_put(THIS_MODULE)\n", __func__);
+	debugf0("here to module_put(THIS_MODULE)\n");
 
 	kfree(kobj);
 
@@ -345,7 +345,7 @@ static int edac_pci_main_kobj_setup(void)
 	int err;
 	struct bus_type *edac_subsys;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* check and count if we have already created the main kobject */
 	if (atomic_inc_return(&edac_pci_sysfs_refcount) != 1)
@@ -356,7 +356,7 @@ static int edac_pci_main_kobj_setup(void)
 	 */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		err = -ENODEV;
 		goto decrement_count_fail;
 	}
@@ -366,7 +366,7 @@ static int edac_pci_main_kobj_setup(void)
 	 * level main kobj for EDAC PCI
 	 */
 	if (!try_module_get(THIS_MODULE)) {
-		debugf1("%s() try_module_get() failed\n", __func__);
+		debugf1("try_module_get() failed\n");
 		err = -ENODEV;
 		goto mod_get_fail;
 	}
@@ -421,15 +421,14 @@ decrement_count_fail:
  */
 static void edac_pci_main_kobj_teardown(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Decrement the count and only if no more controller instances
 	 * are connected perform the unregisteration of the top level
 	 * main kobj
 	 */
 	if (atomic_dec_return(&edac_pci_sysfs_refcount) == 0) {
-		debugf0("%s() called kobject_put on main kobj\n",
-			__func__);
+		debugf0("called kobject_put on main kobj\n");
 		kobject_put(edac_pci_top_main_kobj);
 	}
 	edac_put_sysfs_subsys();
@@ -446,7 +445,7 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 	int err;
 	struct kobject *edac_kobj = &pci->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, pci->pci_idx);
+	debugf0("idx=%d\n", pci->pci_idx);
 
 	/* create the top main EDAC PCI kobject, IF needed */
 	err = edac_pci_main_kobj_setup();
@@ -460,8 +459,8 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 
 	err = sysfs_create_link(edac_kobj, &pci->dev->kobj, EDAC_PCI_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto symlink_fail;
 	}
 
@@ -484,7 +483,7 @@ unregister_cleanup:
  */
 void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() index=%d\n", __func__, pci->pci_idx);
+	debugf0("index=%d\n", pci->pci_idx);
 
 	/* Remove the symlink */
 	sysfs_remove_link(&pci->kobj, EDAC_PCI_SYMLINK);
@@ -496,7 +495,7 @@ void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 	 * if this 'pci' is the last instance.
 	 * If it is, the main kobject will be unregistered as a result
 	 */
-	debugf0("%s() calling edac_pci_main_kobj_teardown()\n", __func__);
+	debugf0("calling edac_pci_main_kobj_teardown()\n");
 	edac_pci_main_kobj_teardown();
 }
 
@@ -671,7 +670,7 @@ void edac_pci_do_parity_check(void)
 {
 	int before_count;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* if policy has PCI check off, leave now */
 	if (!check_pci_errors)
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 55eff02..7cd4339 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -322,7 +322,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	unsigned long mchbar;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	pci_read_config_dword(pdev, I3000_MCHBAR, (u32 *) & mchbar);
 	mchbar &= I3000_MCHBAR_MASK;
@@ -366,7 +366,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -399,8 +399,8 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 		cumul_size = value << (I3000_DRB_SHIFT - PAGE_SHIFT);
 		if (interleaved)
 			cumul_size <<= 1;
-		debugf3("MC: %s(): (%d) cumul_size 0x%x\n",
-			__func__, i, cumul_size);
+		debugf3("MC: (%d) cumul_size 0x%x\n",
+			i, cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;
 
@@ -429,7 +429,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -445,7 +445,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -461,7 +461,7 @@ static int __devinit i3000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -477,7 +477,7 @@ static void __devexit i3000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i3000_pci)
 		edac_pci_release_generic_ctl(i3000_pci);
@@ -511,7 +511,7 @@ static int __init i3000_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -552,7 +552,7 @@ fail0:
 
 static void __exit i3000_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3000_driver);
 	if (!i3000_registered) {
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 818ee6f..1c5a9fc 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -332,7 +332,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	void __iomem *window;
 	struct i3200_priv *priv;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = i3200_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -403,12 +403,12 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -424,7 +424,7 @@ static int __devinit i3200_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -441,7 +441,7 @@ static void __devexit i3200_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i3200_priv *priv;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -475,7 +475,7 @@ static int __init i3200_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -516,7 +516,7 @@ fail0:
 
 static void __exit i3200_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3200_driver);
 	if (!i3200_registered) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 2a9f1dc..918a960 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -1363,9 +1363,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	int num_channels;
 	int num_dimms_per_channel;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s: pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1388,8 +1387,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
 
-	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
-		__func__, num_channels, num_dimms_per_channel);
+	debugf0("MC: Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
 
@@ -1407,7 +1406,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s(): mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1450,8 +1449,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s(): failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1495,7 +1494,7 @@ static int __devinit i5000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s()\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1514,7 +1513,7 @@ static void __devexit i5000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s()\n", __FILE__);
 
 	if (i5000_pci)
 		edac_pci_release_generic_ctl(i5000_pci);
@@ -1560,7 +1559,7 @@ static int __init i5000_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s()\n", __FILE__);
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1576,7 +1575,7 @@ static int __init i5000_init(void)
  */
 static void __exit i5000_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s()\n", __FILE__);
 	pci_unregister_driver(&i5000_driver);
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 676591e..8aec1b9 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -1203,8 +1203,7 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
 
 			size_mb =  pvt->dimm_info[slot][channel].megabytes;
 
-			debugf2("%s: dimm (branch %d channel %d slot %d): %d.%03d GB\n",
-				__func__,
+			debugf2("dimm (branch %d channel %d slot %d): %d.%03d GB\n",
 				channel / 2, channel % 2, slot,
 				size_mb / 1000, size_mb % 1000);
 
@@ -1270,9 +1269,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (dev_idx >= ARRAY_SIZE(i5400_devs))
 		return -EINVAL;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1298,7 +1296,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s(): mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1341,8 +1339,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s(): failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1386,7 +1384,7 @@ static int __devinit i5400_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s()\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1405,7 +1403,7 @@ static void __devexit i5400_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s()\n", __FILE__);
 
 	if (i5400_pci)
 		edac_pci_release_generic_ctl(i5400_pci);
@@ -1451,7 +1449,7 @@ static int __init i5400_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s()\n", __FILE__);
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1467,7 +1465,7 @@ static int __init i5400_init(void)
  */
 static void __exit i5400_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s()\n", __FILE__);
 	pci_unregister_driver(&i5400_driver);
 }
 
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 7425f17..2fbe4c5 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1143,7 +1143,7 @@ static void __devexit i7300_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	char *tmp;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	if (i7300_pci)
 		edac_pci_release_generic_ctl(i7300_pci);
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index ef237f4..2f7cc2a 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -824,7 +824,7 @@ static ssize_t i7core_inject_store_##param(			\
 	long value;						\
 	int rc;							\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	pvt = mci->pvt_info;					\
 								\
 	if (pvt->inject.enable)					\
@@ -852,7 +852,7 @@ static ssize_t i7core_inject_show_##param(			\
 	struct i7core_pvt *pvt;					\
 								\
 	pvt = mci->pvt_info;					\
-	debugf1("%s() pvt=%p\n", __func__, pvt);		\
+	debugf1("pvt=%p\n", pvt);		\
 	if (pvt->inject.param < 0)				\
 		return sprintf(data, "any\n");			\
 	else							\
@@ -1059,7 +1059,7 @@ static ssize_t i7core_show_counter_##param(			\
 	struct mem_ctl_info *mci = to_mci(dev);			\
 	struct i7core_pvt *pvt = mci->pvt_info;			\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	if (!pvt->ce_count_available || (pvt->is_registered))	\
 		return sprintf(data, "data unavailable\n");	\
 	return sprintf(data, "%lu\n",				\
@@ -1190,8 +1190,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 	dev_set_name(pvt->addrmatch_dev, "inject_addrmatch");
 	dev_set_drvdata(pvt->addrmatch_dev, mci);
 
-	debugf1("%s(): creating %s\n", __func__,
-		dev_name(pvt->addrmatch_dev));
+	debugf1("creating %s\n", dev_name(pvt->addrmatch_dev));
 
 	rc = device_add(pvt->addrmatch_dev);
 	if (rc < 0)
@@ -1213,8 +1212,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 		dev_set_name(pvt->chancounts_dev, "all_channel_counts");
 		dev_set_drvdata(pvt->chancounts_dev, mci);
 
-		debugf1("%s(): creating %s\n", __func__,
-			dev_name(pvt->chancounts_dev));
+		debugf1("creating %s\n", dev_name(pvt->chancounts_dev));
 
 		rc = device_add(pvt->chancounts_dev);
 		if (rc < 0)
@@ -1254,7 +1252,7 @@ static void i7core_put_devices(struct i7core_dev *i7core_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < i7core_dev->n_devs; i++) {
 		struct pci_dev *pdev = i7core_dev->pdev[i];
 		if (!pdev)
@@ -1652,7 +1650,7 @@ static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
 	int new0, new1, new2;
 
 	if (!pvt->pci_mcr[4]) {
-		debugf0("%s MCR registers not found\n", __func__);
+		debugf0("MCR registers not found\n");
 		return;
 	}
 
@@ -2402,7 +2400,7 @@ static void __devexit i7core_remove(struct pci_dev *pdev)
 {
 	struct i7core_dev *i7core_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index c0249f3..9f70b31 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -241,7 +241,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	enum mem_type mtype;
 	enum edac_type edac_mode;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s()\n", __FILE__);
 
 	/* Something is really hosed if PCI config space reads from
 	 * the MC aren't working.
@@ -259,7 +259,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s(): mci = %p\n", __FILE__, mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_EDO | MEM_FLAG_SDR | MEM_FLAG_RDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -305,8 +305,8 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		edac_mode = EDAC_SECDED;
 		break;
 	default:
-		debugf0("%s(): Unknown/reserved ECC state "
-			"in NBXCFG register!\n", __func__);
+		debugf0("Unknown/reserved ECC state "
+			"in NBXCFG register!\n");
 		edac_mode = EDAC_UNKNOWN;
 		break;
 	}
@@ -330,7 +330,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->ctl_page_to_phys = NULL;
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -345,7 +345,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("MC: %s: %s(): success\n", __FILE__, __func__);
+	debugf3("MC: %s(): success\n", __FILE__);
 	return 0;
 
 fail:
@@ -361,7 +361,7 @@ static int __devinit i82443bxgx_edacmc_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s()\n", __FILE__);
 
 	/* don't need to call pci_enable_device() */
 	rc = i82443bxgx_edacmc_probe1(pdev, ent->driver_data);
@@ -376,7 +376,7 @@ static void __devexit i82443bxgx_edacmc_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s()\n", __FILE__);
 
 	if (i82443bxgx_pci)
 		edac_pci_release_generic_ctl(i82443bxgx_pci);
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 6ff59b0..b0c97f0 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 		pci_read_config_word(pdev, I82860_GBA + index * 2, &value);
 		cumul_size = (value & I82860_GBA_MASK) <<
 			(I82860_GBA_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		if (cumul_size == last_cumul_size)
@@ -210,7 +210,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -229,7 +229,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -245,7 +245,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -260,7 +260,7 @@ static int __devinit i82860_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82860_printk(KERN_INFO, "i82860 init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -278,7 +278,7 @@ static void __devexit i82860_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82860_pci)
 		edac_pci_release_generic_ctl(i82860_pci);
@@ -311,7 +311,7 @@ static int __init i82860_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -352,7 +352,7 @@ fail0:
 
 static void __exit i82860_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82860_driver);
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index c943904..e8b1b4b 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -371,7 +371,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		value = readb(ovrfl_window + I82875P_DRB + index);
 		cumul_size = value << (I82875P_DRB_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -405,7 +405,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 nr_chans;
 	struct i82875p_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
 
@@ -426,7 +426,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -437,7 +437,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82875p_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82875p_pvt *)mci->pvt_info;
 	pvt->ovrfl_pdev = ovrfl_pdev;
 	pvt->ovrfl_window = ovrfl_window;
@@ -448,7 +448,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -464,7 +464,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -485,7 +485,7 @@ static int __devinit i82875p_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82875p_printk(KERN_INFO, "i82875p init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -504,7 +504,7 @@ static void __devexit i82875p_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82875p_pvt *pvt = NULL;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82875p_pci)
 		edac_pci_release_generic_ctl(i82875p_pci);
@@ -550,7 +550,7 @@ static int __init i82875p_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -593,7 +593,7 @@ fail0:
 
 static void __exit i82875p_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	i82875p_remove_one(mci_pdev);
 	pci_dev_put(mci_pdev);
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index a4a6768..26c7b73 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -406,7 +406,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 */
 		if (csrow->nr_channels > 1)
 			cumul_size <<= 1;
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		nr_pages = cumul_size - last_cumul_size;
@@ -489,11 +489,11 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	u8 c1drb[4];
 #endif
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci_read_config_dword(pdev, I82975X_MCHBAR, &mchbar);
 	if (!(mchbar & 1)) {
-		debugf3("%s(): failed, MCHBAR disabled!\n", __func__);
+		debugf3("failed, MCHBAR disabled!\n");
 		goto fail0;
 	}
 	mchbar &= 0xffffc000;	/* bits 31:14 used for 16K window */
@@ -558,7 +558,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail1;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -569,7 +569,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->mch_window = mch_window;
 	i82975x_init_csrows(mci, pdev, mch_window);
@@ -578,12 +578,12 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* finalize this instance of memory controller with edac core */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail2;
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail2:
@@ -601,7 +601,7 @@ static int __devinit i82975x_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -619,7 +619,7 @@ static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82975x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci  == NULL)
@@ -655,7 +655,7 @@ static int __init i82975x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -697,7 +697,7 @@ fail0:
 
 static void __exit i82975x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82975x_driver);
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 1640d54..17d000b 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -280,7 +280,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, ~0);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -303,7 +303,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
 
 	return 0;
@@ -321,7 +321,7 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 	struct edac_pci_ctl_info *pci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
 		 orig_pci_err_cap_dr);
@@ -582,7 +582,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -610,7 +610,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 
 	devres_remove_group(&op->dev, mpc85xx_l2_err_probe);
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " L2 err registered\n");
 
 	return 0;
@@ -628,7 +628,7 @@ static int mpc85xx_l2_err_remove(struct platform_device *op)
 	struct edac_device_ctl_info *edac_dev = dev_get_drvdata(&op->dev);
 	struct mpc85xx_l2_pdata *pdata = edac_dev->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->l2_vbase + MPC85XX_L2_ERRINTEN, 0);
@@ -1038,7 +1038,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 |
 	    MEM_FLAG_DDR | MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -1064,13 +1064,13 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, ~0);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
 	if (mpc85xx_create_sysfs_attributes(mci)) {
 		edac_mc_del_mc(mci->pdev);
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
@@ -1104,7 +1104,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_mc_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " MC err registered\n");
 
 	return 0;
@@ -1122,7 +1122,7 @@ static int mpc85xx_mc_err_remove(struct platform_device *op)
 	struct mem_ctl_info *mci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_INT_EN, 0);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 59c399a..35db597 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -169,7 +169,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 		 MV64X60_PCIx_ERR_MASK_VAL);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -194,7 +194,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_pci_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -210,7 +210,7 @@ static int mv64x60_pci_err_remove(struct platform_device *pdev)
 {
 	struct edac_pci_ctl_info *pci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_del_device(&pdev->dev);
 
@@ -336,7 +336,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -363,7 +363,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_sram_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -379,7 +379,7 @@ static int mv64x60_sram_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -531,7 +531,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -558,7 +558,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_cpu_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -574,7 +574,7 @@ static int mv64x60_cpu_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -766,7 +766,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_SECDED;
@@ -790,7 +790,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ECC_CNTL, ctl);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
@@ -815,7 +815,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -831,7 +831,7 @@ static int mv64x60_mc_err_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 7b7eaf2..2001f9a 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -236,13 +236,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
 
-		debugf1("%s() Row=%d DRBA = %#0x\n", __func__, index, drbar);
+		debugf1("Row=%d DRBA = %#0x\n", index, drbar);
 
 		row_high_limit = ((u32) drbar << 24);
 /*		row_high_limit = ((u32)drbar << 24) | 0xffffffUL; */
 
-		debugf1("%s() Row=%d, Boundary Address=%#0x, Last = %#0x\n",
-			__func__, index, row_high_limit, row_high_limit_last);
+		debugf1("Row=%d, Boundary Address=%#0x, Last = %#0x\n",
+			index, row_high_limit, row_high_limit_last);
 
 		/* Empty row [p.57] */
 		if (row_high_limit == row_high_limit_last)
@@ -277,14 +277,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 sdram_refresh_rate;
 	struct r82600_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_byte(pdev, R82600_DRAMC, &dramcr);
 	pci_read_config_dword(pdev, R82600_EAP, &eapr);
 	scrub_disabled = eapr & BIT(31);
 	sdram_refresh_rate = dramcr & (BIT(0) | BIT(1));
-	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
-		sdram_refresh_rate);
-	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
+	debugf2("sdram refresh rate = %#0x\n", sdram_refresh_rate);
+	debugf2("DRAMC register = %#0x\n", dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
 	layers[0].is_virt_csrow = true;
@@ -295,7 +294,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -311,8 +310,8 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 
 	if (ecc_enabled(dramcr)) {
 		if (scrub_disabled)
-			debugf3("%s(): mci = %p - Scrubbing disabled! EAP: "
-				"%#0x\n", __func__, mci, eapr);
+			debugf3("mci = %p - Scrubbing disabled! EAP: "
+				"%#0x\n", mci, eapr);
 	} else
 		mci->edac_cap = EDAC_FLAG_NONE;
 
@@ -329,15 +328,14 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
 
 	if (disable_hardware_scrub) {
-		debugf3("%s(): Disabling Hardware Scrub (scrub on error)\n",
-			__func__);
+		debugf3("Disabling Hardware Scrub (scrub on error)\n");
 		pci_write_bits32(pdev, R82600_EAP, BIT(31), BIT(31));
 	}
 
@@ -352,7 +350,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -364,7 +362,7 @@ fail:
 static int __devinit r82600_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return r82600_probe1(pdev, ent->driver_data);
@@ -374,7 +372,7 @@ static void __devexit r82600_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (r82600_pci)
 		edac_pci_release_generic_ctl(r82600_pci);
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index bb7e95f..1dd6a98 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1064,7 +1064,7 @@ static void sbridge_put_devices(struct sbridge_dev *sbridge_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < sbridge_dev->n_devs; i++) {
 		struct pci_dev *pdev = sbridge_dev->pdev[i];
 		if (!pdev)
@@ -1760,7 +1760,7 @@ static void __devexit sbridge_remove(struct pci_dev *pdev)
 {
 	struct sbridge_dev *sbridge_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 219530b..03d2326 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -331,7 +331,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	bool stacked;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = x38_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -402,12 +402,12 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -423,7 +423,7 @@ static int __devinit x38_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -439,7 +439,7 @@ static void __devexit x38_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -472,7 +472,7 @@ static int __init x38_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -513,7 +513,7 @@ fail0:
 
 static void __exit x38_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&x38_driver);
 	if (!x38_registered) {


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 16:20                                           ` Mauro Carvalho Chehab
@ 2012-04-29 16:43                                             ` Joe Perches
  2012-04-29 17:39                                               ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Joe Perches @ 2012-04-29 16:43 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson

On Sun, 2012-04-29 at 13:20 -0300, Mauro Carvalho Chehab wrote:
> The script below is even better. After that, only 113 occurrences of __func__
> is now found at drivers/edac, and some of them are not related to debugf[1-9],
> so they shouldn't be cover on a patch like that.
> I'll do some manual cleanup on it.

Hi Mauro.

Another thing you could do would be to
separate the level from the multiple macros,
use a single macro, and convert the uses.

#define debugf(level, fmt, ...)
and change the uses to
debugf([0-n], "some format", args...)

I believe that's the more predominant
kernel style for debugging macros with
a tested level or mask.

Perhaps also add !CONFIG_EDAC_DEBUG
format/args checking to the debug statements.
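
A minimal user-space sketch of that idea, not the actual EDAC code:
one macro takes the level as its first argument, and the disabled
variant keeps the call inside "if (0)" so the compiler still checks
the format string against its arguments. The names demo_debugf,
demo_debug_level, demo_emitted and DEMO_EDAC_DEBUG are made up for
illustration, printf() stands in for edac_printk(), and
##__VA_ARGS__ is the GNU extension the kernel already relies on.

```c
#include <stdio.h>

/* Hypothetical stand-in for CONFIG_EDAC_DEBUG; set to 0 to compile debugging out. */
#ifndef DEMO_EDAC_DEBUG
#define DEMO_EDAC_DEBUG 1
#endif

static int demo_debug_level = 2;	/* stand-in for edac_debug_level */
static int demo_emitted;		/* counts messages actually printed */

#if DEMO_EDAC_DEBUG
/* Single macro: the level is an argument instead of part of the macro name. */
#define demo_debugf(level, fmt, ...)					\
do {									\
	if ((level) <= demo_debug_level) {				\
		printf("%s: " fmt, __func__, ##__VA_ARGS__);		\
		demo_emitted++;						\
	}								\
} while (0)
#else
/* Dead code, but the format and arguments are still type-checked. */
#define demo_debugf(level, fmt, ...)					\
do {									\
	if (0)								\
		printf("%s: " fmt, __func__, ##__VA_ARGS__);		\
} while (0)
#endif

/* Returns how many of the two messages passed the level test. */
static int demo_probe(int idx)
{
	demo_emitted = 0;
	demo_debugf(0, "probing device %d\n", idx);		/* 0 <= 2: printed */
	demo_debugf(3, "verbose detail for device %d\n", idx);	/* 3 > 2: skipped */
	return demo_emitted;
}
```

With the level set to 2 as above, calling demo_probe() prints only
the level-0 message.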

Lastly, indenting the messages 2 tabs isn't
really useful; one or two spaces is probably
enough.

I did this a bit ago so it may not apply
after your changes:

commit 42f8f2d6fad6b62ab1d122c68984f4afd8a243f7
Author: Joe Perches <joe@perches.com>
Date:   Sat Apr 28 12:41:46 2012 -0700

edac: Use more normal debugging macro style

Convert macros to a simpler style and enforce appropriate
format checking when not CONFIG_EDAC_DEBUG.

Use fmt and __VA_ARGS__, neaten macros.

Move some string arrays to the debugfx uses and remove the
now unnecessary CONFIG_EDAC_DEBUG variable block definitions.
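
As a rough user-space illustration of why the tables can go (a sketch
under assumptions, not the driver code): once the disabled macro still
compiles its arguments, a table that only exists under #ifdef would no
longer build, whereas string literals chosen inline work in both
configurations. DEMO_DIMM_ROWS, its shift value, demo_debugf2 and the
helper demo_numrow_name are hypothetical; the patch itself places the
conditional chain directly in the debugf2() call.

```c
#include <stdio.h>

/* Debugging compiled out: never executed, but still parsed and type-checked. */
#define demo_debugf2(fmt, ...)						\
do {									\
	if (0)								\
		printf("%s: " fmt, __func__, ##__VA_ARGS__);		\
} while (0)

/* Hypothetical 2-bit field extractor; the shift is an assumption. */
#define DEMO_DIMM_ROWS(mtr)	(((mtr) >> 2) & 0x3)

/* Conditional chain instead of an #ifdef-guarded lookup table. */
static const char *demo_numrow_name(unsigned int mtr)
{
	return DEMO_DIMM_ROWS(mtr) == 0 ? "8,192 - 13 rows" :
	       DEMO_DIMM_ROWS(mtr) == 1 ? "16,384 - 14 rows" :
	       DEMO_DIMM_ROWS(mtr) == 2 ? "32,768 - 15 rows" :
	       "reserved";
}

static void demo_decode_mtr(unsigned int mtr)
{
	/* Nothing here depends on a debug-only symbol, so it builds either way. */
	demo_debugf2("\t\tNUMROW: %s\n", demo_numrow_name(mtr));
}
```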

Signed-off-by: Joe Perches <joe@perches.com>

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..6198181 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -71,29 +71,30 @@ extern const char *edac_mem_types[];
 #ifdef CONFIG_EDAC_DEBUG
 extern int edac_debug_level;
 
-#define edac_debug_printk(level, fmt, arg...)                           \
-	do {                                                            \
-		if (level <= edac_debug_level)                          \
-			edac_printk(KERN_DEBUG, EDAC_DEBUG,		\
-				    "%s: " fmt, __func__, ##arg);	\
-	} while (0)
-
-#define debugf0( ... ) edac_debug_printk(0, __VA_ARGS__ )
-#define debugf1( ... ) edac_debug_printk(1, __VA_ARGS__ )
-#define debugf2( ... ) edac_debug_printk(2, __VA_ARGS__ )
-#define debugf3( ... ) edac_debug_printk(3, __VA_ARGS__ )
-#define debugf4( ... ) edac_debug_printk(4, __VA_ARGS__ )
+#define edac_debug_printk(level, fmt, ...)				\
+do {									\
+	if (level <= edac_debug_level)					\
+		edac_printk(KERN_DEBUG, EDAC_DEBUG,			\
+			    "%s: " fmt, __func__, ##__VA_ARGS__);	\
+} while (0)
 
 #else				/* !CONFIG_EDAC_DEBUG */
 
-#define debugf0( ... )
-#define debugf1( ... )
-#define debugf2( ... )
-#define debugf3( ... )
-#define debugf4( ... )
+#define edac_debug_printk(level, fmt, ...)				\
+do {									\
+	if (0)								\
+		edac_printk(KERN_DEBUG, EDAC_DEBUG,			\
+			    "%s: " fmt, __func__, ##__VA_ARGS__);	\
+} while (0)
 
 #endif				/* !CONFIG_EDAC_DEBUG */
 
+#define debugf0(fmt, ...) edac_debug_printk(0, fmt, ##__VA_ARGS__)
+#define debugf1(fmt, ...) edac_debug_printk(1, fmt, ##__VA_ARGS__)
+#define debugf2(fmt, ...) edac_debug_printk(2, fmt, ##__VA_ARGS__)
+#define debugf3(fmt, ...) edac_debug_printk(3, fmt, ##__VA_ARGS__)
+#define debugf4(fmt, ...) edac_debug_printk(4, fmt, ##__VA_ARGS__)
+
 #define PCI_VEND_DEV(vend, dev) PCI_VENDOR_ID_ ## vend, \
 	PCI_DEVICE_ID_ ## vend ## _ ## dev
 
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index a2680d8..1745a38 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -272,7 +272,7 @@
 #define NUM_MTRS		4
 #define CHANNELS_PER_BRANCH	(2)
 
-/* Defines to extract the vaious fields from the
+/* Defines to extract the various fields from the
  *	MTRx - Memory Technology Registers
  */
 #define MTR_DIMMS_PRESENT(mtr)		((mtr) & (0x1 << 8))
@@ -286,22 +286,6 @@
 #define MTR_DIMM_COLS(mtr)		((mtr) & 0x3)
 #define MTR_DIMM_COLS_ADDR_BITS(mtr)	(MTR_DIMM_COLS(mtr) + 10)
 
-#ifdef CONFIG_EDAC_DEBUG
-static char *numrow_toString[] = {
-	"8,192 - 13 rows",
-	"16,384 - 14 rows",
-	"32,768 - 15 rows",
-	"reserved"
-};
-
-static char *numcol_toString[] = {
-	"1,024 - 10 columns",
-	"2,048 - 11 columns",
-	"4,096 - 12 columns",
-	"reserved"
-};
-#endif
-
 /* enables the report of miscellaneous messages as CE errors - default off */
 static int misc_messages;
 
@@ -984,8 +968,16 @@ static void decode_mtr(int slot_row, u16 mtr)
 	debugf2("\t\tWIDTH: x%d\n", MTR_DRAM_WIDTH(mtr));
 	debugf2("\t\tNUMBANK: %d bank(s)\n", MTR_DRAM_BANKS(mtr));
 	debugf2("\t\tNUMRANK: %s\n", MTR_DIMM_RANK(mtr) ? "double" : "single");
-	debugf2("\t\tNUMROW: %s\n", numrow_toString[MTR_DIMM_ROWS(mtr)]);
-	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
+	debugf2("\t\tNUMROW: %s\n",
+		MTR_DIMM_ROWS(mtr) == 0 ? "8,192 - 13 rows" :
+		MTR_DIMM_ROWS(mtr) == 1 ? "16,384 - 14 rows" :
+		MTR_DIMM_ROWS(mtr) == 2 ? "32,768 - 15 rows" :
+		"reserved");
+	debugf2("\t\tNUMCOL: %s\n",
+		MTR_DIMM_COLS(mtr) == 0 ? "1,024 - 10 columns" :
+		MTR_DIMM_COLS(mtr) == 1 ? "2,048 - 11 columns" :
+		MTR_DIMM_COLS(mtr) == 2 ? "4,096 - 12 columns" :
+		"reserved");
 }
 
 static void handle_channel(struct i5000_pvt *pvt, int csrow, int channel,
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 1869a10..4d17641 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -298,24 +298,6 @@ static inline int extract_fbdchan_indx(u32 x)
 	return (x>>28) & 0x3;
 }
 
-#ifdef CONFIG_EDAC_DEBUG
-/* MTR NUMROW */
-static const char *numrow_toString[] = {
-	"8,192 - 13 rows",
-	"16,384 - 14 rows",
-	"32,768 - 15 rows",
-	"65,536 - 16 rows"
-};
-
-/* MTR NUMCOL */
-static const char *numcol_toString[] = {
-	"1,024 - 10 columns",
-	"2,048 - 11 columns",
-	"4,096 - 12 columns",
-	"reserved"
-};
-#endif
-
 /* Device name and register DID (Device ID) */
 struct i5400_dev_info {
 	const char *ctl_name;	/* name for this device */
@@ -909,8 +891,16 @@ static void decode_mtr(int slot_row, u16 mtr)
 
 	debugf2("\t\tNUMBANK: %d bank(s)\n", MTR_DRAM_BANKS(mtr));
 	debugf2("\t\tNUMRANK: %s\n", MTR_DIMM_RANK(mtr) ? "double" : "single");
-	debugf2("\t\tNUMROW: %s\n", numrow_toString[MTR_DIMM_ROWS(mtr)]);
-	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
+	debugf2("\t\tNUMROW: %s\n",
+		MTR_DIMM_ROWS(mtr) == 0 ? "8,192 - 13 rows" :
+		MTR_DIMM_ROWS(mtr) == 1 ? "16,384 - 14 rows" :
+		MTR_DIMM_ROWS(mtr) == 2 ? "32,768 - 15 rows" :
+		"65,536 - 16 rows");
+	debugf2("\t\tNUMCOL: %s\n",
+		MTR_DIMM_COLS(mtr) == 0 ? "1,024 - 10 columns" :
+		MTR_DIMM_COLS(mtr) == 1 ? "2,048 - 11 columns" :
+		MTR_DIMM_COLS(mtr) == 2 ? "4,096 - 12 columns" :
+		"reserved");
 }
 
 static void handle_channel(struct i5400_pvt *pvt, int csrow, int channel,
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 3bafa3b..9271da3 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -182,24 +182,6 @@ static const u16 mtr_regs[MAX_SLOTS] = {
 #define MTR_DIMM_COLS(mtr)		((mtr) & 0x3)
 #define MTR_DIMM_COLS_ADDR_BITS(mtr)	(MTR_DIMM_COLS(mtr) + 10)
 
-#ifdef CONFIG_EDAC_DEBUG
-/* MTR NUMROW */
-static const char *numrow_toString[] = {
-	"8,192 - 13 rows",
-	"16,384 - 14 rows",
-	"32,768 - 15 rows",
-	"65,536 - 16 rows"
-};
-
-/* MTR NUMCOL */
-static const char *numcol_toString[] = {
-	"1,024 - 10 columns",
-	"2,048 - 11 columns",
-	"4,096 - 12 columns",
-	"reserved"
-};
-#endif
-
 /************************************************
  * i7300 Register definitions for error detection
  ************************************************/
@@ -659,8 +641,16 @@ static int decode_mtr(struct i7300_pvt *pvt,
 
 	debugf2("\t\tNUMBANK: %d bank(s)\n", MTR_DRAM_BANKS(mtr));
 	debugf2("\t\tNUMRANK: %s\n", MTR_DIMM_RANKS(mtr) ? "double" : "single");
-	debugf2("\t\tNUMROW: %s\n", numrow_toString[MTR_DIMM_ROWS(mtr)]);
-	debugf2("\t\tNUMCOL: %s\n", numcol_toString[MTR_DIMM_COLS(mtr)]);
+	debugf2("\t\tNUMROW: %s\n",
+		MTR_DIMM_ROWS(mtr) == 0 ? "8,192 - 13 rows" :
+		MTR_DIMM_ROWS(mtr) == 1 ? "16,384 - 14 rows" :
+		MTR_DIMM_ROWS(mtr) == 2 ? "32,768 - 15 rows" :
+		"65,536 - 16 rows");
+	debugf2("\t\tNUMCOL: %s\n",
+		MTR_DIMM_COLS(mtr) == 0 ? "1,024 - 10 columns" :
+		MTR_DIMM_COLS(mtr) == 1 ? "2,048 - 11 columns" :
+		MTR_DIMM_COLS(mtr) == 2 ? "4,096 - 12 columns" :
+		"reserved");
 	debugf2("\t\tSIZE: %d MB\n", dinfo->megabytes);
 
 	p_csrow->grain = 8;





* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 16:03                                             ` Joe Perches
@ 2012-04-29 17:18                                               ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 17:18 UTC (permalink / raw)
  To: Joe Perches
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson,
	Mark Gross, Jason Uhlenkott, Tim Small, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Dmitry Eremin-Solenikov, Benjamin Herrenschmidt,
	Hitoshi Mitake, Andrew Morton, Niklas Söderlund,
	Shaohui Xie, Josh Boyer, linuxppc-dev

Em 29-04-2012 13:03, Joe Perches escreveu:
> On Sun, 2012-04-29 at 12:11 -0300, Mauro Carvalho Chehab wrote:
>> Em 29-04-2012 11:25, Mauro Carvalho Chehab escreveu:
>>> Em 28-04-2012 05:52, Borislav Petkov escreveu:
>>>> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
>>>>> Yes. This is a common issue at the EDAC core: on several places, it calls the
>>>>> edac debug macros (DEBUGF0...DEBUGF4) passing a __func__ as an argument, while
>>>>> the debug macros already handles that. I suspect that, in the past, the __func__
>>>>> were not at the macros, but some patch added it there, and forgot to fix the
>>>>> occurrences of its call.
>>>> The patch that added it is d357cbb445208 and you reviewed it.
>>> And you wrote the patch that caused it.
> 
> And Boris should have also written the follow-on patches that
> removed most/all of the debugfX and __func__ uses.

Yes.

>>> A single patch fixing this everywhere at drivers/edac is better and clearer than adding 
>>> an unrelated fix on this patch. This is already complex enough to add more unrelated
>>> things there.
>>>
>>> Also, a simple perl/coccinelle script can replace all such __func__ occurrences 
>>> on one shot.
> 
> You make it sound simple, but it'd be a pretty complicated
> cocci script.  Some of the changes would have to be inspected
> or changed by hand in any case.

Yes, manual changes are needed to get rid of some less likely patterns,
but using a script helps to do most of the changes automatically.

> []
> 
>> Most of the issues can be solved with the above script-based patch. 
>>
>> There are still 171 places (12 places at the core, the rest are on the drivers)
>> that will require a more sophisticated patch or that requires a manual fix.
> []
>> From: Mauro Carvalho Chehab <mchehab@redhat.com>
>> Date: Sun, 29 Apr 2012 11:59:14 -0300
>> Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
> 
> Thanks Mauro, you shouldn't have had to do this.

I know, but the double __func__ was bothering me. Anyway, this change was quick ;)
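
For reference, a small user-space sketch of the duplication being
removed (an illustration only: demo_debugf0, demo_log, demo_before and
demo_after are hypothetical names, and snprintf() into a buffer stands
in for the kernel log). The macro already prepends __func__, so a
caller that passes __func__ again gets the function name twice:

```c
#include <stdio.h>

static char demo_log[128];	/* captures the last formatted message */

/* Like the debug macro discussed here: it prepends __func__ by itself. */
#define demo_debugf0(fmt, ...)						\
	snprintf(demo_log, sizeof(demo_log),				\
		 "%s: " fmt, __func__, ##__VA_ARGS__)

static const char *demo_before(void)
{
	/* Old call style: the caller passes __func__ a second time. */
	demo_debugf0("%s(): success\n", __func__);
	return demo_log;
}

static const char *demo_after(void)
{
	/* Cleaned-up call style: the macro alone supplies the function name. */
	demo_debugf0("success\n");
	return demo_log;
}
```

Here demo_before() formats "demo_before: demo_before(): success",
while demo_after() formats "demo_after: success".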

Btw, new (final) version attached. This replaces all debugf[1-4] occurrences.

From 476ed993148a6b9f0215051c98db1cb094bca8a9 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Sun, 29 Apr 2012 11:59:14 -0300
Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macro already adds that. Most of the work here was
done by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

print $f;

After running the script, manual cleanups were done to fix the remaining
places.

While here, removed __LINE__ in most places, as it doesn't actually give
useful info there.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

---

PS.: Patch should be applied at the end of my EDAC experimental tree:
http://git.infradead.org/users/mchehab/edac.git/commit/476ed993148a6b9f0215051c98db1cb094bca8a9


diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index be6c225..4ed97bf 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -180,7 +180,7 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 static void amd76x_check(struct mem_ctl_info *mci)
 {
 	struct amd76x_error_info info;
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	amd76x_get_error_info(mci, &info);
 	amd76x_process_error_info(mci, &info, 1);
 }
@@ -241,7 +241,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 ems_mode;
 	struct amd76x_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
 
@@ -256,7 +256,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -276,7 +276,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -292,7 +292,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -304,7 +304,7 @@ fail:
 static int __devinit amd76x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return amd76x_probe1(pdev, ent->driver_data);
@@ -322,7 +322,7 @@ static void __devexit amd76x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (amd76x_pci)
 		edac_pci_release_generic_ctl(amd76x_pci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 31b3c91..9ee1194 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -316,13 +316,12 @@ static void get_total_mem(struct cpc925_mc_pdata *pdata)
 		reg += aw;
 		size = of_read_number(reg, sw);
 		reg += sw;
-		debugf1("%s: start 0x%lx, size 0x%lx\n", __func__,
-			start, size);
+		debugf1("start 0x%lx, size 0x%lx\n", start, size);
 		pdata->total_mem += size;
 	} while (reg < reg_end);
 
 	of_node_put(np);
-	debugf0("%s: total_mem 0x%lx\n", __func__, pdata->total_mem);
+	debugf0("total_mem 0x%lx\n", pdata->total_mem);
 }
 
 static void cpc925_init_csrows(struct mem_ctl_info *mci)
@@ -512,7 +511,7 @@ static void cpc925_mc_get_pfn(struct mem_ctl_info *mci, u32 mear,
 	*offset = pa & (PAGE_SIZE - 1);
 	*pfn = pa >> PAGE_SHIFT;
 
-	debugf0("%s: ECC physical address 0x%lx\n", __func__, pa);
+	debugf0("ECC physical address 0x%lx\n", pa);
 }
 
 static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
@@ -852,8 +851,8 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			goto err2;
 		}
 
-		debugf0("%s: Successfully added edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully added edac device for %s\n",
+			dev_info->ctl_name);
 
 		continue;
 
@@ -884,8 +883,8 @@ static void cpc925_del_edac_devices(void)
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
 
-		debugf0("%s: Successfully deleted edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully deleted edac device for %s\n",
+			dev_info->ctl_name);
 	}
 }
 
@@ -900,7 +899,7 @@ static int cpc925_get_sdram_scrub_rate(struct mem_ctl_info *mci)
 	mscr = __raw_readl(pdata->vbase + REG_MSCR_OFFSET);
 	si = (mscr & MSCR_SI_MASK) >> MSCR_SI_SHIFT;
 
-	debugf0("%s, Mem Scrub Ctrl Register 0x%x\n", __func__, mscr);
+	debugf0("Mem Scrub Ctrl Register 0x%x\n", mscr);
 
 	if (((mscr & MSCR_SCRUB_MOD_MASK) != MSCR_BACKGR_SCRUB) ||
 	    (si == 0)) {
@@ -928,8 +927,7 @@ static int cpc925_mc_get_channels(void __iomem *vbase)
 	    ((mbcr & MBCR_64BITBUS_MASK) == 0))
 		dual = 1;
 
-	debugf0("%s: %s channel\n", __func__,
-		(dual > 0) ? "Dual" : "Single");
+	debugf0("%s channel\n", (dual > 0) ? "Dual" : "Single");
 
 	return dual;
 }
@@ -944,7 +942,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	struct resource *r;
 	int res = 0, nr_channels;
 
-	debugf0("%s: %s platform device found!\n", __func__, pdev->name);
+	debugf0("%s platform device found!\n", pdev->name);
 
 	if (!devres_open_group(&pdev->dev, cpc925_probe, GFP_KERNEL)) {
 		res = -ENOMEM;
@@ -1026,7 +1024,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	cpc925_add_edac_devices(vbase);
 
 	/* get this far and it's successful */
-	debugf0("%s: success\n", __func__);
+	debugf0("success\n");
 
 	res = 0;
 	goto out;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7e601c1..5a599a3 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -309,7 +309,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (page < pvt->tolm)
 		return page;
@@ -335,7 +335,7 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	int i;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* convert the addr to 4k page */
 	page = sec1_add >> (PAGE_SHIFT - 4);
@@ -394,7 +394,7 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 	int row;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (error_one & 0x0202) {
 		error_2b = ded_add;
@@ -453,7 +453,7 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 	if (!handle_error)
 		return;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
 			     -1, -1, -1,
 			     "e752x UE log memory write", "", NULL);
@@ -982,7 +982,7 @@ static void e752x_check(struct mem_ctl_info *mci)
 {
 	struct e752x_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e752x_get_error_info(mci, &info);
 	e752x_process_error_info(mci, &info, 1);
 }
@@ -1102,7 +1102,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E752X_DRB + index, &value);
 		/* convert a 128 or 64 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -1270,7 +1270,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
 	struct e752x_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 	debugf0("Starting Probe1\n");
 
 	/* check to see if device 0 function 1 is enabled; if it isn't, we
@@ -1302,7 +1302,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	/* 3100 IMCH supports SECDEC only */
 	mci->edac_ctl_cap = (dev_idx == I3100) ? EDAC_FLAG_SECDED :
@@ -1312,7 +1312,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_ver = E752X_REVISION;
 	mci->pdev = &pdev->dev;
 
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e752x_pvt *)mci->pvt_info;
 	pvt->dev_info = &e752x_devs[dev_idx];
 	pvt->mc_symmetric = ((ddrcsr & 0x10) != 0);
@@ -1322,7 +1322,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e752x_check;
@@ -1344,7 +1344,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		mci->edac_cap = EDAC_FLAG_SECDED; /* the only mode supported */
 	else
 		mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E752X_TOLM, &pci_data);
@@ -1361,7 +1361,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -1379,7 +1379,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -1395,7 +1395,7 @@ fail:
 static int __devinit e752x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	if (pci_enable_device(pdev) < 0)
@@ -1409,7 +1409,7 @@ static void __devexit e752x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e752x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e752x_pci)
 		edac_pci_release_generic_ctl(e752x_pci);
@@ -1455,7 +1455,7 @@ static int __init e752x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1466,7 +1466,7 @@ static int __init e752x_init(void)
 
 static void __exit e752x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	pci_unregister_driver(&e752x_driver);
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 2defa96..2850d00 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -166,7 +166,7 @@ static const struct e7xxx_dev_info e7xxx_devs[] = {
 /* FIXME - is this valid for both SECDED and S4ECD4ED? */
 static inline int e7xxx_find_channel(u16 syndrome)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((syndrome & 0xff00) == 0)
 		return 0;
@@ -186,7 +186,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e7xxx_pvt *pvt = (struct e7xxx_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((page < pvt->tolm) ||
 		((page >= 0x100000) && (page < pvt->remapbase)))
@@ -208,7 +208,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	int row;
 	int channel;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_1b = info->dram_celog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -225,7 +225,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx CE log register overflow", "", NULL);
 }
@@ -235,7 +235,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	u32 error_2b, block_page;
 	int row;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_2b = info->dram_uelog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -248,7 +248,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx UE log register overflow", "", NULL);
@@ -334,7 +334,7 @@ static void e7xxx_check(struct mem_ctl_info *mci)
 {
 	struct e7xxx_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e7xxx_get_error_info(mci, &info);
 	e7xxx_process_error_info(mci, &info, 1);
 }
@@ -383,7 +383,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E7XXX_DRB + index, &value);
 		/* convert a 64 or 32 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -430,7 +430,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;
 	struct e7xxx_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
@@ -453,7 +453,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED |
 		EDAC_FLAG_S4ECD4ED;
@@ -461,7 +461,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
 	mci->pdev = &pdev->dev;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
 	pvt->bridge_ck = pci_get_device(PCI_VENDOR_ID_INTEL,
@@ -474,14 +474,14 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e7xxx_check;
 	mci->ctl_page_to_phys = ctl_page_to_phys;
 	e7xxx_init_csrows(mci, pdev, dev_idx, drc);
 	mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E7XXX_TOLM, &pci_data);
 	pvt->tolm = ((u32) pci_data) << 4;
@@ -500,7 +500,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -516,7 +516,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -532,7 +532,7 @@ fail0:
 static int __devinit e7xxx_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	return pci_enable_device(pdev) ?
@@ -544,7 +544,7 @@ static void __devexit e7xxx_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e7xxx_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e7xxx_pci)
 		edac_pci_release_generic_ctl(e7xxx_pci);
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index cb397d9..ed46949 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -82,8 +82,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	void *pvt, *p;
 	int err;
 
-	debugf4("%s() instances=%d blocks=%d\n",
-		__func__, nr_instances, nr_blocks);
+	debugf4("instances=%d blocks=%d\n",
+		nr_instances, nr_blocks);
 
 	/* Calculate the size of memory we need to allocate AND
 	 * determine the offsets of the various item arrays
@@ -156,8 +156,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	/* Name of this edac device */
 	snprintf(dev_ctl->name,sizeof(dev_ctl->name),"%s",edac_device_name);
 
-	debugf4("%s() edac_dev=%p next after end=%p\n",
-		__func__, dev_ctl, pvt + sz_private );
+	debugf4("edac_dev=%p next after end=%p\n",
+		dev_ctl, pvt + sz_private );
 
 	/* Initialize every Instance */
 	for (instance = 0; instance < nr_instances; instance++) {
@@ -178,9 +178,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			snprintf(blk->name, sizeof(blk->name),
 				 "%s%d", edac_block_name, block+offset_value);
 
-			debugf4("%s() instance=%d inst_p=%p block=#%d "
+			debugf4("instance=%d inst_p=%p block=#%d "
 				"block_p=%p name='%s'\n",
-				__func__, instance, inst, block,
+				instance, inst, block,
 				blk, blk->name);
 
 			/* if there are NO attributes OR no attribute pointer
@@ -194,8 +194,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			attrib_p = &dev_attrib[block*nr_instances*nr_attrib];
 			blk->block_attributes = attrib_p;
 
-			debugf4("%s() THIS BLOCK_ATTRIB=%p\n",
-				__func__, blk->block_attributes);
+			debugf4("THIS BLOCK_ATTRIB=%p\n",
+				blk->block_attributes);
 
 			/* Initialize every user specified attribute in this
 			 * block with the data the caller passed in
@@ -214,9 +214,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 
 				attrib->block = blk;	/* up link */
 
-				debugf4("%s() alloc-attrib=%p attrib_name='%s' "
+				debugf4("alloc-attrib=%p attrib_name='%s' "
 					"attrib-spec=%p spec-name=%s\n",
-					__func__, attrib, attrib->attr.name,
+					attrib, attrib->attr.name,
 					&attrib_spec[attr],
 					attrib_spec[attr].attr.name
 					);
@@ -273,7 +273,7 @@ static struct edac_device_ctl_info *find_edac_device_by_dev(struct device *dev)
 	struct edac_device_ctl_info *edac_dev;
 	struct list_head *item;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	list_for_each(item, &edac_device_list) {
 		edac_dev = list_entry(item, struct edac_device_ctl_info, link);
@@ -408,7 +408,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 				unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* take the arg 'msec' and set it into the control structure
 	 * to used in the time period calculation
@@ -496,7 +496,7 @@ EXPORT_SYMBOL_GPL(edac_device_alloc_index);
  */
 int edac_device_add_device(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -570,7 +570,7 @@ struct edac_device_ctl_info *edac_device_del_device(struct device *dev)
 {
 	struct edac_device_ctl_info *edac_dev;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&device_ctls_mutex);
 
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index b4ea185..1cee83e 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -202,7 +202,7 @@ static void edac_device_ctrl_master_release(struct kobject *kobj)
 {
 	struct edac_device_ctl_info *edac_dev = to_edacdev(kobj);
 
-	debugf4("%s() control index=%d\n", __func__, edac_dev->dev_idx);
+	debugf4("control index=%d\n", edac_dev->dev_idx);
 
 	/* decrement the EDAC CORE module ref count */
 	module_put(edac_dev->owner);
@@ -233,12 +233,12 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	struct bus_type *edac_subsys;
 	int err;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the /sys/devices/system/edac reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys error\n", __func__);
+		debugf1("no edac_subsys error\n");
 		err = -ENODEV;
 		goto err_out;
 	}
@@ -264,8 +264,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 				   &edac_subsys->dev_root->kobj,
 				   "%s", edac_dev->name);
 	if (err) {
-		debugf1("%s()Failed to register '.../edac/%s'\n",
-			__func__, edac_dev->name);
+		debugf1("Failed to register '.../edac/%s'\n",
+			edac_dev->name);
 		goto err_kobj_reg;
 	}
 	kobject_uevent(&edac_dev->kobj, KOBJ_ADD);
@@ -274,8 +274,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	 * edac_device_unregister_sysfs_main_kobj() must be used
 	 */
 
-	debugf4("%s() Registered '.../edac/%s' kobject\n",
-		__func__, edac_dev->name);
+	debugf4("Registered '.../edac/%s' kobject\n",
+		edac_dev->name);
 
 	return 0;
 
@@ -296,9 +296,9 @@ err_out:
  */
 void edac_device_unregister_sysfs_main_kobj(struct edac_device_ctl_info *dev)
 {
-	debugf0("%s()\n", __func__);
-	debugf4("%s() name of kobject is: %s\n",
-		__func__, kobject_name(&dev->kobj));
+	debugf0("\n");
+	debugf4("name of kobject is: %s\n",
+		kobject_name(&dev->kobj));
 
 	/*
 	 * Unregister the edac device's kobject and
@@ -336,7 +336,7 @@ static void edac_device_ctrl_instance_release(struct kobject *kobj)
 {
 	struct edac_device_instance *instance;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* map from this kobj to the main control struct
 	 * and then dec the main kobj count
@@ -442,7 +442,7 @@ static void edac_device_ctrl_block_release(struct kobject *kobj)
 {
 	struct edac_device_block *block;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the container of the kobj */
 	block = to_block(kobj);
@@ -524,10 +524,10 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	struct edac_dev_sysfs_block_attribute *sysfs_attrib;
 	struct kobject *main_kobj;
 
-	debugf4("%s() Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
-		__func__, instance->name, instance, block->name, block);
-	debugf4("%s() block kobj=%p  block kobj->parent=%p\n",
-		__func__, &block->kobj, &block->kobj.parent);
+	debugf4("Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
+		instance->name, instance, block->name, block);
+	debugf4("block kobj=%p  block kobj->parent=%p\n",
+		&block->kobj, &block->kobj.parent);
 
 	/* init this block's kobject */
 	memset(&block->kobj, 0, sizeof(struct kobject));
@@ -546,8 +546,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 				   &instance->kobj,
 				   "%s", block->name);
 	if (err) {
-		debugf1("%s() Failed to register instance '%s'\n",
-			__func__, block->name);
+		debugf1("Failed to register instance '%s'\n",
+			block->name);
 		kobject_put(main_kobj);
 		err = -ENODEV;
 		goto err_out;
@@ -560,9 +560,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	if (sysfs_attrib && block->nr_attribs) {
 		for (i = 0; i < block->nr_attribs; i++, sysfs_attrib++) {
 
-			debugf4("%s() creating block attrib='%s' "
+			debugf4("creating block attrib='%s' "
 				"attrib->%p to kobj=%p\n",
-				__func__,
 				sysfs_attrib->attr.name,
 				sysfs_attrib, &block->kobj);
 
@@ -647,14 +646,14 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	err = kobject_init_and_add(&instance->kobj, &ktype_instance_ctrl,
 				   &edac_dev->kobj, "%s", instance->name);
 	if (err != 0) {
-		debugf2("%s() Failed to register instance '%s'\n",
-			__func__, instance->name);
+		debugf2("Failed to register instance '%s'\n",
+			instance->name);
 		kobject_put(main_kobj);
 		goto err_out;
 	}
 
-	debugf4("%s() now register '%d' blocks for instance %d\n",
-		__func__, instance->nr_blocks, idx);
+	debugf4("now register '%d' blocks for instance %d\n",
+		instance->nr_blocks, idx);
 
 	/* register all blocks of this instance */
 	for (i = 0; i < instance->nr_blocks; i++) {
@@ -670,8 +669,8 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	}
 	kobject_uevent(&instance->kobj, KOBJ_ADD);
 
-	debugf4("%s() Registered instance %d '%s' kobject\n",
-		__func__, idx, instance->name);
+	debugf4("Registered instance %d '%s' kobject\n",
+		idx, instance->name);
 
 	return 0;
 
@@ -715,7 +714,7 @@ static int edac_device_create_instances(struct edac_device_ctl_info *edac_dev)
 	int i, j;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* iterate over creation of the instances */
 	for (i = 0; i < edac_dev->nr_instances; i++) {
@@ -817,12 +816,12 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	int err;
 	struct kobject *edac_kobj = &edac_dev->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, edac_dev->dev_idx);
+	debugf0("idx=%d\n", edac_dev->dev_idx);
 
 	/*  go create any main attributes callers wants */
 	err = edac_device_add_main_sysfs_attributes(edac_dev);
 	if (err) {
-		debugf0("%s() failed to add sysfs attribs\n", __func__);
+		debugf0("failed to add sysfs attribs\n");
 		goto err_out;
 	}
 
@@ -832,8 +831,8 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	err = sysfs_create_link(edac_kobj,
 				&edac_dev->dev->kobj, EDAC_DEVICE_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto err_remove_main_attribs;
 	}
 
@@ -843,14 +842,14 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	 */
 	err = edac_device_create_instances(edac_dev);
 	if (err) {
-		debugf0("%s() edac_device_create_instances() "
-			"returned err= %d\n", __func__, err);
+		debugf0("edac_device_create_instances() "
+			"returned err= %d\n", err);
 		goto err_remove_link;
 	}
 
 
-	debugf4("%s() create-instances done, idx=%d\n",
-		__func__, edac_dev->dev_idx);
+	debugf4("create-instances done, idx=%d\n",
+		edac_dev->dev_idx);
 
 	return 0;
 
@@ -873,7 +872,7 @@ err_out:
  */
 void edac_device_remove_sysfs(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* remove any main attributes for this device */
 	edac_device_remove_main_sysfs_attributes(edac_dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 65568e6..d8278b3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -259,18 +259,18 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
-		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		debugf4("errcount layer %d size %d\n", i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		tot_errcount += 2 * count;
 	}
 
-	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
+	debugf4("allocating %d error counters\n", tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
-	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
-		__func__, size,
+	debugf1("allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		size,
 		tot_dimms,
 		per_rank ? "ranks" : "dimms",
 		tot_csrows * tot_channels);
@@ -337,7 +337,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+	debugf4("initializing %d %s\n", tot_dimms,
 		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
@@ -351,8 +351,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		debugf2("%s: %d: %s%i (%d:%d:%d): row %d, chan %d\n", __func__,
-			i, per_rank ? "rank" : "dimm", off,
+		debugf2("%d: %s%i (%d:%d:%d): row %d, chan %d\n", i,
+			per_rank ? "rank" : "dimm", off,
 			pos[0], pos[1], pos[2], row, chn);
 
 		/*
@@ -451,7 +451,7 @@ EXPORT_SYMBOL_GPL(edac_mc_alloc);
  */
 void edac_mc_free(struct mem_ctl_info *mci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* the mci instance is freed here, when the sysfs object is dropped */
 	edac_unregister_sysfs(mci);
@@ -471,7 +471,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	struct mem_ctl_info *mci;
 	struct list_head *item;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
@@ -539,7 +539,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  */
 static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* if this instance is not in the POLL state, then simply return */
 	if (mci->op_state != OP_RUNNING_POLL)
@@ -566,8 +566,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 
 	status = cancel_delayed_work(&mci->work);
 	if (status == 0) {
-		debugf0("%s() not canceled, flush the queue\n",
-			__func__);
+		debugf0("not canceled, flush the queue\n");
 
 		/* workq instance might be running, wait for it */
 		flush_workqueue(edac_workqueue);
@@ -714,7 +713,7 @@ EXPORT_SYMBOL(edac_mc_find);
 /* FIXME - should a warning be printed if no error detection? correction? */
 int edac_mc_add_mc(struct mem_ctl_info *mci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -785,7 +784,7 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&mem_ctls_mutex);
 
@@ -823,7 +822,7 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 	void *virt_addr;
 	unsigned long flags = 0;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* ECC error page was not in our memory. Ignore it. */
 	if (!pfn_valid(page))
@@ -853,7 +852,7 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 	struct csrow_info **csrows = mci->csrows;
 	int row, i, j, n;
 
-	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
+	debugf1("MC%d: 0x%lx\n", mci->mc_idx, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
@@ -866,8 +865,8 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 		if (n == 0)
 			continue;
 
-		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
-			"mask(0x%lx)\n", mci->mc_idx, __func__,
+		debugf3("MC%d: first(0x%lx) page(0x%lx) last(0x%lx) "
+			"mask(0x%lx)\n", mci->mc_idx,
 			csrow->first_page, page, csrow->last_page,
 			csrow->page_mask);
 
@@ -969,7 +968,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	u32 grain;
 	bool enable_filter = false;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf3("MC%d\n", mci->mc_idx);
 
 	/* Check if the event report is consistent */
 	for (i = 0; i < mci->n_layers; i++) {
@@ -1043,8 +1042,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			 * get csrow/channel of the dimm, in order to allow
 			 * incrementing the compat API counters
 			 */
-			debugf4("%s: %s csrows map: (%d,%d)\n",
-				__func__,
+			debugf4("%s csrows map: (%d,%d)\n",
 				mci->mem_is_per_rank ? "rank" : "dimm",
 				dimm->csrow, dimm->cschannel);
 			if (row == -1)
@@ -1060,8 +1058,8 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	if (!enable_filter) {
 		strcpy(label, "any memory");
 	} else {
-		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
-			__func__, row, chan);
+		debugf4("csrow/channel to increment: (%d,%d)\n",
+			row, chan);
 		if (p == label)
 			strcpy(label, "unknown memory");
 		if (type == HW_EVENT_ERR_CORRECTED) {
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 81ca073..8f96c49 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -376,8 +376,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	dev_set_name(&csrow->dev, "csrow%d", index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
-	debugf0("%s(): creating (virtual) csrow node %s\n", __func__,
-		dev_name(&csrow->dev));
+	debugf0("creating (virtual) csrow node %s\n", dev_name(&csrow->dev));
 
 	err = device_add(&csrow->dev);
 	if (err < 0)
@@ -623,8 +622,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	err =  device_add(&dimm->dev);
 
-	debugf0("%s(): creating rank/dimm device %s\n", __func__,
-		dev_name(&dimm->dev));
+	debugf0("creating rank/dimm device %s\n", dev_name(&dimm->dev));
 
 	return err;
 }
@@ -981,8 +979,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	dev_set_drvdata(&mci->dev, mci);
 	pm_runtime_forbid(&mci->dev);
 
-	debugf0("%s(): creating device %s\n", __func__,
-		dev_name(&mci->dev));
+	debugf0("creating device %s\n", dev_name(&mci->dev));
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		bus_unregister(&mci->bus);
@@ -999,8 +996,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 		if (dimm->nr_pages == 0)
 			continue;
 #ifdef CONFIG_EDAC_DEBUG
-		debugf1("%s creating dimm%d, located at ",
-			__func__, i);
+		debugf1("creating dimm%d, located at ",
+			i);
 		if (edac_debug_level >= 1) {
 			int lay;
 			for (lay = 0; lay < mci->n_layers; lay++)
@@ -1012,8 +1009,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 #endif
 		err = edac_create_dimm_object(mci, dimm, i);
 		if (err) {
-			debugf1("%s() failure: create dimm %d obj\n",
-				__func__, i);
+			debugf1("failure: create dimm %d obj\n",
+				i);
 			goto fail;
 		}
 	}
@@ -1051,7 +1048,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
 	int i;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	debugfs_remove(mci->debugfs);
@@ -1064,8 +1061,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 		struct dimm_info *dimm = mci->dimms[i];
 		if (dimm->nr_pages == 0)
 			continue;
-		debugf0("%s(): removing device %s\n", __func__,
-			dev_name(&dimm->dev));
+		debugf0("removing device %s\n", dev_name(&dimm->dev));
 		put_device(&dimm->dev);
 		device_del(&dimm->dev);
 	}
@@ -1105,7 +1101,7 @@ int __init edac_mc_sysfs_init(void)
 	/* get the /sys/devices/system/edac subsys reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 8735a0d..9de2484 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -113,7 +113,7 @@ error:
  */
 static void __exit edac_exit(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* tear down the various subsystems */
 	edac_workqueue_teardown();
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index f1ac866..51dd4e0 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -45,7 +45,7 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 	void *p = NULL, *pvt;
 	unsigned int size;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	pci = edac_align_ptr(&p, sizeof(*pci), 1);
 	pvt = edac_align_ptr(&p, 1, sz_pvt);
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_ctl_info);
  */
 void edac_pci_free_ctl_info(struct edac_pci_ctl_info *pci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	edac_pci_remove_sysfs(pci);
 }
@@ -97,7 +97,7 @@ static struct edac_pci_ctl_info *find_edac_pci_by_dev(struct device *dev)
 	struct edac_pci_ctl_info *pci;
 	struct list_head *item;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	list_for_each(item, &edac_pci_list) {
 		pci = list_entry(item, struct edac_pci_ctl_info, link);
@@ -122,7 +122,7 @@ static int add_edac_pci_to_global_list(struct edac_pci_ctl_info *pci)
 	struct list_head *item, *insert_before;
 	struct edac_pci_ctl_info *rover;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	insert_before = &edac_pci_list;
 
@@ -226,7 +226,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 	int msec;
 	unsigned long delay;
 
-	debugf3("%s() checking\n", __func__);
+	debugf3("checking\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -261,7 +261,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
 				 unsigned int msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	INIT_DELAYED_WORK(&pci->work, edac_pci_workq_function);
 	queue_delayed_work(edac_workqueue, &pci->work,
@@ -276,7 +276,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
 	int status;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	status = cancel_delayed_work(&pci->work);
 	if (status == 0)
@@ -293,7 +293,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 void edac_pci_reset_delay_period(struct edac_pci_ctl_info *pci,
 				 unsigned long value)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_workq_teardown(pci);
 
@@ -333,7 +333,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_index);
  */
 int edac_pci_add_device(struct edac_pci_ctl_info *pci, int edac_idx)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci->pci_idx = edac_idx;
 	pci->start_time = jiffies;
@@ -393,7 +393,7 @@ struct edac_pci_ctl_info *edac_pci_del_device(struct device *dev)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -430,7 +430,7 @@ EXPORT_SYMBOL_GPL(edac_pci_del_device);
  */
 static void edac_pci_generic_check(struct edac_pci_ctl_info *pci)
 {
-	debugf4("%s()\n", __func__);
+	debugf4("\n");
 	edac_pci_do_parity_check();
 }
 
@@ -475,7 +475,7 @@ struct edac_pci_ctl_info *edac_pci_create_generic_ctl(struct device *dev,
 	pdata->edac_idx = edac_pci_idx++;
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		edac_pci_free_ctl_info(pci);
 		return NULL;
 	}
@@ -491,7 +491,7 @@ EXPORT_SYMBOL_GPL(edac_pci_create_generic_ctl);
  */
 void edac_pci_release_generic_ctl(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() pci mod=%s\n", __func__, pci->mod_name);
+	debugf0("pci mod=%s\n", pci->mod_name);
 
 	edac_pci_del_device(pci->dev);
 	edac_pci_free_ctl_info(pci);
diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c
index 97f5064..6678216 100644
--- a/drivers/edac/edac_pci_sysfs.c
+++ b/drivers/edac/edac_pci_sysfs.c
@@ -78,7 +78,7 @@ static void edac_pci_instance_release(struct kobject *kobj)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Form pointer to containing struct, the pci control struct */
 	pci = to_instance(kobj);
@@ -161,7 +161,7 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	struct kobject *main_kobj;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* First bump the ref count on the top main kobj, which will
 	 * track the number of PCI instances we have, and thus nest
@@ -177,14 +177,14 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	err = kobject_init_and_add(&pci->kobj, &ktype_pci_instance,
 				   edac_pci_top_main_kobj, "pci%d", idx);
 	if (err != 0) {
-		debugf2("%s() failed to register instance pci%d\n",
-			__func__, idx);
+		debugf2("failed to register instance pci%d\n",
+			idx);
 		kobject_put(edac_pci_top_main_kobj);
 		goto error_out;
 	}
 
 	kobject_uevent(&pci->kobj, KOBJ_ADD);
-	debugf1("%s() Register instance 'pci%d' kobject\n", __func__, idx);
+	debugf1("Register instance 'pci%d' kobject\n", idx);
 
 	return 0;
 
@@ -201,7 +201,7 @@ error_out:
 static void edac_pci_unregister_sysfs_instance_kobj(
 			struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Unregister the instance kobject and allow its release
 	 * function release the main reference count and then
@@ -317,7 +317,7 @@ static struct edac_pci_dev_attribute *edac_pci_attr[] = {
  */
 static void edac_pci_release_main_kobj(struct kobject *kobj)
 {
-	debugf0("%s() here to module_put(THIS_MODULE)\n", __func__);
+	debugf0("here to module_put(THIS_MODULE)\n");
 
 	kfree(kobj);
 
@@ -345,7 +345,7 @@ static int edac_pci_main_kobj_setup(void)
 	int err;
 	struct bus_type *edac_subsys;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* check and count if we have already created the main kobject */
 	if (atomic_inc_return(&edac_pci_sysfs_refcount) != 1)
@@ -356,7 +356,7 @@ static int edac_pci_main_kobj_setup(void)
 	 */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		err = -ENODEV;
 		goto decrement_count_fail;
 	}
@@ -366,7 +366,7 @@ static int edac_pci_main_kobj_setup(void)
 	 * level main kobj for EDAC PCI
 	 */
 	if (!try_module_get(THIS_MODULE)) {
-		debugf1("%s() try_module_get() failed\n", __func__);
+		debugf1("try_module_get() failed\n");
 		err = -ENODEV;
 		goto mod_get_fail;
 	}
@@ -421,15 +421,14 @@ decrement_count_fail:
  */
 static void edac_pci_main_kobj_teardown(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Decrement the count and only if no more controller instances
 	 * are connected perform the unregisteration of the top level
 	 * main kobj
 	 */
 	if (atomic_dec_return(&edac_pci_sysfs_refcount) == 0) {
-		debugf0("%s() called kobject_put on main kobj\n",
-			__func__);
+		debugf0("called kobject_put on main kobj\n");
 		kobject_put(edac_pci_top_main_kobj);
 	}
 	edac_put_sysfs_subsys();
@@ -446,7 +445,7 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 	int err;
 	struct kobject *edac_kobj = &pci->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, pci->pci_idx);
+	debugf0("idx=%d\n", pci->pci_idx);
 
 	/* create the top main EDAC PCI kobject, IF needed */
 	err = edac_pci_main_kobj_setup();
@@ -460,8 +459,8 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 
 	err = sysfs_create_link(edac_kobj, &pci->dev->kobj, EDAC_PCI_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto symlink_fail;
 	}
 
@@ -484,7 +483,7 @@ unregister_cleanup:
  */
 void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() index=%d\n", __func__, pci->pci_idx);
+	debugf0("index=%d\n", pci->pci_idx);
 
 	/* Remove the symlink */
 	sysfs_remove_link(&pci->kobj, EDAC_PCI_SYMLINK);
@@ -496,7 +495,7 @@ void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 	 * if this 'pci' is the last instance.
 	 * If it is, the main kobject will be unregistered as a result
 	 */
-	debugf0("%s() calling edac_pci_main_kobj_teardown()\n", __func__);
+	debugf0("calling edac_pci_main_kobj_teardown()\n");
 	edac_pci_main_kobj_teardown();
 }
 
@@ -671,7 +670,7 @@ void edac_pci_do_parity_check(void)
 {
 	int before_count;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* if policy has PCI check off, leave now */
 	if (!check_pci_errors)
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 55eff02..1f05480 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -275,7 +275,7 @@ static void i3000_check(struct mem_ctl_info *mci)
 {
 	struct i3000_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i3000_get_error_info(mci, &info);
 	i3000_process_error_info(mci, &info, 1);
 }
@@ -322,7 +322,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	unsigned long mchbar;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	pci_read_config_dword(pdev, I3000_MCHBAR, (u32 *) & mchbar);
 	mchbar &= I3000_MCHBAR_MASK;
@@ -366,7 +366,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -399,8 +399,8 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 		cumul_size = value << (I3000_DRB_SHIFT - PAGE_SHIFT);
 		if (interleaved)
 			cumul_size <<= 1;
-		debugf3("MC: %s(): (%d) cumul_size 0x%x\n",
-			__func__, i, cumul_size);
+		debugf3("MC: (%d) cumul_size 0x%x\n",
+			i, cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;
 
@@ -429,7 +429,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -445,7 +445,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -461,7 +461,7 @@ static int __devinit i3000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -477,7 +477,7 @@ static void __devexit i3000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i3000_pci)
 		edac_pci_release_generic_ctl(i3000_pci);
@@ -511,7 +511,7 @@ static int __init i3000_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -552,7 +552,7 @@ fail0:
 
 static void __exit i3000_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3000_driver);
 	if (!i3000_registered) {
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 818ee6f..ce2d60c 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -245,7 +245,7 @@ static void i3200_check(struct mem_ctl_info *mci)
 {
 	struct i3200_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i3200_get_and_clear_error_info(mci, &info);
 	i3200_process_error_info(mci, &info);
 }
@@ -332,7 +332,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	void __iomem *window;
 	struct i3200_priv *priv;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = i3200_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -403,12 +403,12 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -424,7 +424,7 @@ static int __devinit i3200_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -441,7 +441,7 @@ static void __devexit i3200_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i3200_priv *priv;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -475,7 +475,7 @@ static int __init i3200_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -516,7 +516,7 @@ fail0:
 
 static void __exit i3200_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3200_driver);
 	if (!i3200_registered) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 2a9f1dc..0292a06 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -779,7 +779,7 @@ static void i5000_clear_error(struct mem_ctl_info *mci)
 static void i5000_check_error(struct mem_ctl_info *mci)
 {
 	struct i5000_error_info info;
-	debugf4("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf4("MC%d\n", mci->mc_idx);
 	i5000_get_error_info(mci, &info);
 	i5000_process_error_info(mci, &info, 1);
 }
@@ -1363,9 +1363,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	int num_channels;
 	int num_dimms_per_channel;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s: pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1388,8 +1387,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
 
-	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
-		__func__, num_channels, num_dimms_per_channel);
+	debugf0("MC: Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
 
@@ -1407,7 +1406,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s: mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1450,8 +1449,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s: failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1495,7 +1494,7 @@ static int __devinit i5000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1514,7 +1513,7 @@ static void __devexit i5000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s\n", __FILE__);
 
 	if (i5000_pci)
 		edac_pci_release_generic_ctl(i5000_pci);
@@ -1560,7 +1559,7 @@ static int __init i5000_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1576,7 +1575,7 @@ static int __init i5000_init(void)
  */
 static void __exit i5000_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 	pci_unregister_driver(&i5000_driver);
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 676591e..a736b98 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -700,7 +700,7 @@ static void i5400_clear_error(struct mem_ctl_info *mci)
 static void i5400_check_error(struct mem_ctl_info *mci)
 {
 	struct i5400_error_info info;
-	debugf4("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf4("MC%d\n", mci->mc_idx);
 	i5400_get_error_info(mci, &info);
 	i5400_process_error_info(mci, &info);
 }
@@ -1203,8 +1203,7 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
 
 			size_mb =  pvt->dimm_info[slot][channel].megabytes;
 
-			debugf2("%s: dimm (branch %d channel %d slot %d): %d.%03d GB\n",
-				__func__,
+			debugf2("dimm (branch %d channel %d slot %d): %d.%03d GB\n",
 				channel / 2, channel % 2, slot,
 				size_mb / 1000, size_mb % 1000);
 
@@ -1270,9 +1269,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (dev_idx >= ARRAY_SIZE(i5400_devs))
 		return -EINVAL;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s: pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1298,7 +1296,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s: mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1341,8 +1339,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s: failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1386,7 +1384,7 @@ static int __devinit i5400_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1405,7 +1403,7 @@ static void __devexit i5400_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s\n", __FILE__);
 
 	if (i5400_pci)
 		edac_pci_release_generic_ctl(i5400_pci);
@@ -1451,7 +1449,7 @@ static int __init i5400_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1467,7 +1465,7 @@ static int __init i5400_init(void)
  */
 static void __exit i5400_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 	pci_unregister_driver(&i5400_driver);
 }
 
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 7425f17..aa3eb98 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1032,8 +1032,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (rc == -EIO)
 		return rc;
 
-	debugf0("MC: " __FILE__ ": %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__func__,
+	debugf0("MC: pdev bus %u dev=0x%x fn=0x%x\n",
 		pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
@@ -1056,7 +1055,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p\n", __func__, mci);
+	debugf0("MC: mci = %p\n", mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1100,8 +1099,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1143,7 +1141,7 @@ static void __devexit i7300_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	char *tmp;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	if (i7300_pci)
 		edac_pci_release_generic_ctl(i7300_pci);
@@ -1190,7 +1188,7 @@ static int __init i7300_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1205,7 +1203,7 @@ static int __init i7300_init(void)
  */
 static void __exit i7300_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&i7300_driver);
 }
 
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index ef237f4..fcf9cfc 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -824,7 +824,7 @@ static ssize_t i7core_inject_store_##param(			\
 	long value;						\
 	int rc;							\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");						\
 	pvt = mci->pvt_info;					\
 								\
 	if (pvt->inject.enable)					\
@@ -852,7 +852,7 @@ static ssize_t i7core_inject_show_##param(			\
 	struct i7core_pvt *pvt;					\
 								\
 	pvt = mci->pvt_info;					\
-	debugf1("%s() pvt=%p\n", __func__, pvt);		\
+	debugf1("pvt=%p\n", pvt);				\
 	if (pvt->inject.param < 0)				\
 		return sprintf(data, "any\n");			\
 	else							\
@@ -1059,7 +1059,7 @@ static ssize_t i7core_show_counter_##param(			\
 	struct mem_ctl_info *mci = to_mci(dev);			\
 	struct i7core_pvt *pvt = mci->pvt_info;			\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");						\
 	if (!pvt->ce_count_available || (pvt->is_registered))	\
 		return sprintf(data, "data unavailable\n");	\
 	return sprintf(data, "%lu\n",				\
@@ -1190,8 +1190,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 	dev_set_name(pvt->addrmatch_dev, "inject_addrmatch");
 	dev_set_drvdata(pvt->addrmatch_dev, mci);
 
-	debugf1("%s(): creating %s\n", __func__,
-		dev_name(pvt->addrmatch_dev));
+	debugf1("creating %s\n", dev_name(pvt->addrmatch_dev));
 
 	rc = device_add(pvt->addrmatch_dev);
 	if (rc < 0)
@@ -1213,8 +1212,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 		dev_set_name(pvt->chancounts_dev, "all_channel_counts");
 		dev_set_drvdata(pvt->chancounts_dev, mci);
 
-		debugf1("%s(): creating %s\n", __func__,
-			dev_name(pvt->chancounts_dev));
+		debugf1("creating %s\n", dev_name(pvt->chancounts_dev));
 
 		rc = device_add(pvt->chancounts_dev);
 		if (rc < 0)
@@ -1254,7 +1252,7 @@ static void i7core_put_devices(struct i7core_dev *i7core_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < i7core_dev->n_devs; i++) {
 		struct pci_dev *pdev = i7core_dev->pdev[i];
 		if (!pdev)
@@ -1652,7 +1650,7 @@ static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
 	int new0, new1, new2;
 
 	if (!pvt->pci_mcr[4]) {
-		debugf0("%s MCR registers not found\n", __func__);
+		debugf0("MCR registers not found\n");
 		return;
 	}
 
@@ -2190,8 +2188,7 @@ static void i7core_unregister_mci(struct i7core_dev *i7core_dev)
 	struct i7core_pvt *pvt;
 
 	if (unlikely(!mci || !mci->pvt_info)) {
-		debugf0("MC: " __FILE__ ": %s(): dev = %p\n",
-			__func__, &i7core_dev->pdev[0]->dev);
+		debugf0("MC: dev = %p\n", &i7core_dev->pdev[0]->dev);
 
 		i7core_printk(KERN_ERR, "Couldn't find mci handler\n");
 		return;
@@ -2199,8 +2196,7 @@ static void i7core_unregister_mci(struct i7core_dev *i7core_dev)
 
 	pvt = mci->pvt_info;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &i7core_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n", mci, &i7core_dev->pdev[0]->dev);
 
 	/* Disable scrubrate setting */
 	if (pvt->enable_scrub)
@@ -2241,8 +2237,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &i7core_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n", mci, &i7core_dev->pdev[0]->dev);
 
 	pvt = mci->pvt_info;
 	memset(pvt, 0, sizeof(*pvt));
@@ -2285,8 +2280,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (unlikely(edac_mc_add_mc(mci))) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -2295,8 +2289,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 		goto fail0;
 	}
 	if (i7core_create_sysfs_devices(mci)) {
-		debugf0("MC: " __FILE__
-			": %s(): failed to create sysfs nodes\n", __func__);
+		debugf0("MC: failed to create sysfs nodes\n");
 		edac_mc_del_mc(mci->pdev);
 		rc = -EINVAL;
 		goto fail0;
@@ -2402,7 +2395,7 @@ static void __devexit i7core_remove(struct pci_dev *pdev)
 {
 	struct i7core_dev *i7core_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
@@ -2451,7 +2444,7 @@ static int __init i7core_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -2476,7 +2469,7 @@ static int __init i7core_init(void)
  */
 static void __exit i7core_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&i7core_driver);
 }
 
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index c0249f3..b9215e8 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -178,7 +178,7 @@ static void i82443bxgx_edacmc_check(struct mem_ctl_info *mci)
 {
 	struct i82443bxgx_edacmc_error_info info;
 
-	debugf1("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82443bxgx_edacmc_get_error_info(mci, &info);
 	i82443bxgx_edacmc_process_error_info(mci, &info, 1);
 }
@@ -201,13 +201,13 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		dimm = csrow->channels[0]->dimm;
 
 		pci_read_config_byte(pdev, I82443BXGX_DRB + index, &drbar);
-		debugf1("MC%d: %s: %s() Row=%d DRB = %#0x\n",
-			mci->mc_idx, __FILE__, __func__, index, drbar);
+		debugf1("MC%d: Row=%d DRB = %#0x\n",
+			mci->mc_idx, index, drbar);
 		row_high_limit = ((u32) drbar << 23);
 		/* find the DRAM Chip Select Base address and mask */
-		debugf1("MC%d: %s: %s() Row=%d, "
+		debugf1("MC%d: Row=%d, "
 			"Boundary Address=%#0x, Last = %#0x\n",
-			mci->mc_idx, __FILE__, __func__, index, row_high_limit,
+			mci->mc_idx, index, row_high_limit,
 			row_high_limit_last);
 
 		/* 440GX goes to 2GB, represented with a DRB of 0. */
@@ -241,7 +241,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	enum mem_type mtype;
 	enum edac_type edac_mode;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* Something is really hosed if PCI config space reads from
 	 * the MC aren't working.
@@ -259,7 +259,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s: mci = %p\n", __FILE__, mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_EDO | MEM_FLAG_SDR | MEM_FLAG_RDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -305,8 +305,8 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		edac_mode = EDAC_SECDED;
 		break;
 	default:
-		debugf0("%s(): Unknown/reserved ECC state "
-			"in NBXCFG register!\n", __func__);
+		debugf0("Unknown/reserved ECC state "
+			"in NBXCFG register!\n");
 		edac_mode = EDAC_UNKNOWN;
 		break;
 	}
@@ -330,7 +330,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->ctl_page_to_phys = NULL;
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -345,7 +345,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("MC: %s: %s(): success\n", __FILE__, __func__);
+	debugf3("MC: %s: success\n", __FILE__);
 	return 0;
 
 fail:
@@ -361,7 +361,7 @@ static int __devinit i82443bxgx_edacmc_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* don't need to call pci_enable_device() */
 	rc = i82443bxgx_edacmc_probe1(pdev, ent->driver_data);
@@ -376,7 +376,7 @@ static void __devexit i82443bxgx_edacmc_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s\n", __FILE__);
 
 	if (i82443bxgx_pci)
 		edac_pci_release_generic_ctl(i82443bxgx_pci);
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 6ff59b0..ae5b2e1 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -136,7 +136,7 @@ static void i82860_check(struct mem_ctl_info *mci)
 {
 	struct i82860_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82860_get_error_info(mci, &info);
 	i82860_process_error_info(mci, &info, 1);
 }
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 		pci_read_config_word(pdev, I82860_GBA + index * 2, &value);
 		cumul_size = (value & I82860_GBA_MASK) <<
 			(I82860_GBA_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		if (cumul_size == last_cumul_size)
@@ -210,7 +210,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -229,7 +229,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -245,7 +245,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -260,7 +260,7 @@ static int __devinit i82860_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82860_printk(KERN_INFO, "i82860 init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -278,7 +278,7 @@ static void __devexit i82860_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82860_pci)
 		edac_pci_release_generic_ctl(i82860_pci);
@@ -311,7 +311,7 @@ static int __init i82860_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -352,7 +352,7 @@ fail0:
 
 static void __exit i82860_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82860_driver);
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index c943904..e24e703 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -263,7 +263,7 @@ static void i82875p_check(struct mem_ctl_info *mci)
 {
 	struct i82875p_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82875p_get_error_info(mci, &info);
 	i82875p_process_error_info(mci, &info, 1);
 }
@@ -371,7 +371,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		value = readb(ovrfl_window + I82875P_DRB + index);
 		cumul_size = value << (I82875P_DRB_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -405,7 +405,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 nr_chans;
 	struct i82875p_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
 
@@ -426,7 +426,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -437,7 +437,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82875p_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82875p_pvt *)mci->pvt_info;
 	pvt->ovrfl_pdev = ovrfl_pdev;
 	pvt->ovrfl_window = ovrfl_window;
@@ -448,7 +448,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -464,7 +464,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -485,7 +485,7 @@ static int __devinit i82875p_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82875p_printk(KERN_INFO, "i82875p init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -504,7 +504,7 @@ static void __devexit i82875p_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82875p_pvt *pvt = NULL;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82875p_pci)
 		edac_pci_release_generic_ctl(i82875p_pci);
@@ -550,7 +550,7 @@ static int __init i82875p_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -593,7 +593,7 @@ fail0:
 
 static void __exit i82875p_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	i82875p_remove_one(mci_pdev);
 	pci_dev_put(mci_pdev);
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index a4a6768..6a367ba 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -331,7 +331,7 @@ static void i82975x_check(struct mem_ctl_info *mci)
 {
 	struct i82975x_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82975x_get_error_info(mci, &info);
 	i82975x_process_error_info(mci, &info, 1);
 }
@@ -406,7 +406,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 */
 		if (csrow->nr_channels > 1)
 			cumul_size <<= 1;
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		nr_pages = cumul_size - last_cumul_size;
@@ -489,11 +489,11 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	u8 c1drb[4];
 #endif
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci_read_config_dword(pdev, I82975X_MCHBAR, &mchbar);
 	if (!(mchbar & 1)) {
-		debugf3("%s(): failed, MCHBAR disabled!\n", __func__);
+		debugf3("failed, MCHBAR disabled!\n");
 		goto fail0;
 	}
 	mchbar &= 0xffffc000;	/* bits 31:14 used for 16K window */
@@ -558,7 +558,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail1;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -569,7 +569,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->mch_window = mch_window;
 	i82975x_init_csrows(mci, pdev, mch_window);
@@ -578,12 +578,12 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* finalize this instance of memory controller with edac core */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail2;
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail2:
@@ -601,7 +601,7 @@ static int __devinit i82975x_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -619,7 +619,7 @@ static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82975x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci  == NULL)
@@ -655,7 +655,7 @@ static int __init i82975x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -697,7 +697,7 @@ fail0:
 
 static void __exit i82975x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82975x_driver);
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 1640d54..17d000b 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -280,7 +280,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, ~0);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -303,7 +303,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
 
 	return 0;
@@ -321,7 +321,7 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 	struct edac_pci_ctl_info *pci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
 		 orig_pci_err_cap_dr);
@@ -582,7 +582,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -610,7 +610,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 
 	devres_remove_group(&op->dev, mpc85xx_l2_err_probe);
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " L2 err registered\n");
 
 	return 0;
@@ -628,7 +628,7 @@ static int mpc85xx_l2_err_remove(struct platform_device *op)
 	struct edac_device_ctl_info *edac_dev = dev_get_drvdata(&op->dev);
 	struct mpc85xx_l2_pdata *pdata = edac_dev->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->l2_vbase + MPC85XX_L2_ERRINTEN, 0);
@@ -1038,7 +1038,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 |
 	    MEM_FLAG_DDR | MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -1064,13 +1064,13 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, ~0);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
 	if (mpc85xx_create_sysfs_attributes(mci)) {
 		edac_mc_del_mc(mci->pdev);
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed mpc85xx_create_sysfs_attributes()\n");
 		goto err;
 	}
 
@@ -1104,7 +1104,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_mc_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " MC err registered\n");
 
 	return 0;
@@ -1122,7 +1122,7 @@ static int mpc85xx_mc_err_remove(struct platform_device *op)
 	struct mem_ctl_info *mci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_INT_EN, 0);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 59c399a..35db597 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -169,7 +169,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 		 MV64X60_PCIx_ERR_MASK_VAL);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -194,7 +194,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_pci_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -210,7 +210,7 @@ static int mv64x60_pci_err_remove(struct platform_device *pdev)
 {
 	struct edac_pci_ctl_info *pci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_del_device(&pdev->dev);
 
@@ -336,7 +336,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -363,7 +363,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_sram_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -379,7 +379,7 @@ static int mv64x60_sram_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -531,7 +531,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -558,7 +558,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_cpu_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -574,7 +574,7 @@ static int mv64x60_cpu_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -766,7 +766,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_SECDED;
@@ -790,7 +790,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ECC_CNTL, ctl);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
@@ -815,7 +815,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -831,7 +831,7 @@ static int mv64x60_mc_err_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 7b7eaf2..36b870e 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -205,7 +205,7 @@ static void r82600_check(struct mem_ctl_info *mci)
 {
 	struct r82600_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	r82600_get_error_info(mci, &info);
 	r82600_process_error_info(mci, &info, 1);
 }
@@ -236,13 +236,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
 
-		debugf1("%s() Row=%d DRBA = %#0x\n", __func__, index, drbar);
+		debugf1("Row=%d DRBA = %#0x\n", index, drbar);
 
 		row_high_limit = ((u32) drbar << 24);
 /*		row_high_limit = ((u32)drbar << 24) | 0xffffffUL; */
 
-		debugf1("%s() Row=%d, Boundary Address=%#0x, Last = %#0x\n",
-			__func__, index, row_high_limit, row_high_limit_last);
+		debugf1("Row=%d, Boundary Address=%#0x, Last = %#0x\n",
+			index, row_high_limit, row_high_limit_last);
 
 		/* Empty row [p.57] */
 		if (row_high_limit == row_high_limit_last)
@@ -277,14 +277,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 sdram_refresh_rate;
 	struct r82600_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_byte(pdev, R82600_DRAMC, &dramcr);
 	pci_read_config_dword(pdev, R82600_EAP, &eapr);
 	scrub_disabled = eapr & BIT(31);
 	sdram_refresh_rate = dramcr & (BIT(0) | BIT(1));
-	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
-		sdram_refresh_rate);
-	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
+	debugf2("sdram refresh rate = %#0x\n", sdram_refresh_rate);
+	debugf2("DRAMC register = %#0x\n", dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
 	layers[0].is_virt_csrow = true;
@@ -295,7 +294,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -311,8 +310,8 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 
 	if (ecc_enabled(dramcr)) {
 		if (scrub_disabled)
-			debugf3("%s(): mci = %p - Scrubbing disabled! EAP: "
-				"%#0x\n", __func__, mci, eapr);
+			debugf3("mci = %p - Scrubbing disabled! EAP: "
+				"%#0x\n", mci, eapr);
 	} else
 		mci->edac_cap = EDAC_FLAG_NONE;
 
@@ -329,15 +328,14 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
 
 	if (disable_hardware_scrub) {
-		debugf3("%s(): Disabling Hardware Scrub (scrub on error)\n",
-			__func__);
+		debugf3("Disabling Hardware Scrub (scrub on error)\n");
 		pci_write_bits32(pdev, R82600_EAP, BIT(31), BIT(31));
 	}
 
@@ -352,7 +350,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -364,7 +362,7 @@ fail:
 static int __devinit r82600_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return r82600_probe1(pdev, ent->driver_data);
@@ -374,7 +372,7 @@ static void __devexit r82600_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (r82600_pci)
 		edac_pci_release_generic_ctl(r82600_pci);
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index bb7e95f..d1afa1a 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1064,7 +1064,7 @@ static void sbridge_put_devices(struct sbridge_dev *sbridge_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < sbridge_dev->n_devs; i++) {
 		struct pci_dev *pdev = sbridge_dev->pdev[i];
 		if (!pdev)
@@ -1597,8 +1597,7 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 	struct sbridge_pvt *pvt;
 
 	if (unlikely(!mci || !mci->pvt_info)) {
-		debugf0("MC: " __FILE__ ": %s(): dev = %p\n",
-			__func__, &sbridge_dev->pdev[0]->dev);
+		debugf0("MC: dev = %p\n", &sbridge_dev->pdev[0]->dev);
 
 		sbridge_printk(KERN_ERR, "Couldn't find mci handler\n");
 		return;
@@ -1606,8 +1605,8 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 
 	pvt = mci->pvt_info;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &sbridge_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n",
+		mci, &sbridge_dev->pdev[0]->dev);
 
 	mce_unregister_decode_chain(&sbridge_mce_dec);
 
@@ -1645,8 +1644,8 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &sbridge_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n",
+		mci, &sbridge_dev->pdev[0]->dev);
 
 	pvt = mci->pvt_info;
 	memset(pvt, 0, sizeof(*pvt));
@@ -1681,8 +1680,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (unlikely(edac_mc_add_mc(mci))) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		rc = -EINVAL;
 		goto fail0;
 	}
@@ -1760,7 +1758,7 @@ static void __devexit sbridge_remove(struct pci_dev *pdev)
 {
 	struct sbridge_dev *sbridge_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
@@ -1809,7 +1807,7 @@ static int __init sbridge_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1831,7 +1829,7 @@ static int __init sbridge_init(void)
  */
 static void __exit sbridge_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&sbridge_driver);
 }
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 219530b..a3d8a40 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -243,7 +243,7 @@ static void x38_check(struct mem_ctl_info *mci)
 {
 	struct x38_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	x38_get_and_clear_error_info(mci, &info);
 	x38_process_error_info(mci, &info);
 }
@@ -331,7 +331,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	bool stacked;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = x38_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -402,12 +402,12 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -423,7 +423,7 @@ static int __devinit x38_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -439,7 +439,7 @@ static void __devexit x38_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -472,7 +472,7 @@ static int __init x38_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -513,7 +513,7 @@ fail0:
 
 static void __exit x38_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&x38_driver);
 	if (!x38_registered) {


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-29 17:18                                               ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 17:18 UTC (permalink / raw)
  To: Joe Perches
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Borislav Petkov,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Doug Thompson, Linux Edac Mailing List,
	Michal Marek, Jiri Kosina, Linux Kernel Mailing List,
	Olof Johansson, Andrew Morton, linuxppc-dev

On 29-04-2012 13:03, Joe Perches wrote:
> On Sun, 2012-04-29 at 12:11 -0300, Mauro Carvalho Chehab wrote:
>> On 29-04-2012 11:25, Mauro Carvalho Chehab wrote:
>>> On 28-04-2012 05:52, Borislav Petkov wrote:
>>>> On Fri, Apr 27, 2012 at 01:07:38PM -0300, Mauro Carvalho Chehab wrote:
>>>>> Yes. This is a common issue in the EDAC core: in several places, it calls the
>>>>> edac debug macros (DEBUGF0...DEBUGF4) passing __func__ as an argument, while
>>>>> the debug macros already handle that. I suspect that, in the past, __func__
>>>>> was not in the macros, but some patch added it there and forgot to fix the
>>>>> call sites.
>>>> The patch that added it is d357cbb445208 and you reviewed it.
>>> And you wrote the patch that caused it.
> 
> And Boris should have also written the follow-on patches that
> removed most/all of the debugfX and __func__ uses.

Yes.

>>> A single patch fixing this everywhere in drivers/edac is better and clearer than adding
>>> an unrelated fix to this patch. This one is already complex enough without adding more
>>> unrelated things to it.
>>>
>>> Also, a simple perl/coccinelle script can replace all such __func__ occurrences
>>> in one shot.
> 
> You make it sound simple, but it'd be a pretty complicated
> cocci script.  Some of the changes would have to be inspected
> or changed by hand in any case.

Yes, manual changes are needed to get rid of some less likely patterns,
but using a script helps to do most of the changes automatically.

> []
> 
>> Most of the issues can be solved with the above script-based patch.
>>
>> There are still 171 places (12 in the core, the rest in the drivers)
>> that will require a more sophisticated patch or a manual fix.
> []
>> From: Mauro Carvalho Chehab <mchehab@redhat.com>
>> Date: Sun, 29 Apr 2012 11:59:14 -0300
>> Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs
> 
> Thanks Mauro, you shouldn't have had to do this.

I know, but the doubled __func__ was bothering me. Anyway, this change was quick ;)

Btw, new (final) version attached. This replaces all debugf[1-4] occurrences.

From 476ed993148a6b9f0215051c98db1cb094bca8a9 Mon Sep 17 00:00:00 2001
From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Sun, 29 Apr 2012 11:59:14 -0300
Subject: [PATCH] edac: Don't add __func__ or __FILE__ for debugf[0-9] msgs

The debug macro already adds those. Most of the work here was
done by this small script:

$f .=$_ while (<>);

$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*": /\1"/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*/\1/g;
$f =~ s/(debugf[0-9]\s*\(\s*)__FILE__\s*"MC: /\1"/g;

$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\")\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+)__func__\s*\,\s*/\1\2/g;
$f =~ s/(debugf[0-9]\s*\(\"MC\:\s*)\%s[\:\,\(\)]*\s*([^\"]*\s*[^\)]+),\s*__func__\s*\)/\1\2)/g;

$f =~ s/\"MC\: \\n\"/"MC:\\n"/g;

print $f;

After running the script, manual cleanups were done to fix the remaining
places.

While here, removed __LINE__ in most places, as it doesn't actually give
useful info there.

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

---

PS.: Patch should be applied at the end of my EDAC experimental tree:
http://git.infradead.org/users/mchehab/edac.git/commit/476ed993148a6b9f0215051c98db1cb094bca8a9


diff --git a/drivers/edac/amd76x_edac.c b/drivers/edac/amd76x_edac.c
index be6c225..4ed97bf 100644
--- a/drivers/edac/amd76x_edac.c
+++ b/drivers/edac/amd76x_edac.c
@@ -180,7 +180,7 @@ static int amd76x_process_error_info(struct mem_ctl_info *mci,
 static void amd76x_check(struct mem_ctl_info *mci)
 {
 	struct amd76x_error_info info;
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	amd76x_get_error_info(mci, &info);
 	amd76x_process_error_info(mci, &info, 1);
 }
@@ -241,7 +241,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 ems_mode;
 	struct amd76x_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_dword(pdev, AMD76X_ECC_MODE_STATUS, &ems);
 	ems_mode = (ems >> 10) & 0x3;
 
@@ -256,7 +256,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -276,7 +276,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -292,7 +292,7 @@ static int amd76x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -304,7 +304,7 @@ fail:
 static int __devinit amd76x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return amd76x_probe1(pdev, ent->driver_data);
@@ -322,7 +322,7 @@ static void __devexit amd76x_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (amd76x_pci)
 		edac_pci_release_generic_ctl(amd76x_pci);
diff --git a/drivers/edac/cpc925_edac.c b/drivers/edac/cpc925_edac.c
index 31b3c91..9ee1194 100644
--- a/drivers/edac/cpc925_edac.c
+++ b/drivers/edac/cpc925_edac.c
@@ -316,13 +316,12 @@ static void get_total_mem(struct cpc925_mc_pdata *pdata)
 		reg += aw;
 		size = of_read_number(reg, sw);
 		reg += sw;
-		debugf1("%s: start 0x%lx, size 0x%lx\n", __func__,
-			start, size);
+		debugf1("start 0x%lx, size 0x%lx\n", start, size);
 		pdata->total_mem += size;
 	} while (reg < reg_end);
 
 	of_node_put(np);
-	debugf0("%s: total_mem 0x%lx\n", __func__, pdata->total_mem);
+	debugf0("total_mem 0x%lx\n", pdata->total_mem);
 }
 
 static void cpc925_init_csrows(struct mem_ctl_info *mci)
@@ -512,7 +511,7 @@ static void cpc925_mc_get_pfn(struct mem_ctl_info *mci, u32 mear,
 	*offset = pa & (PAGE_SIZE - 1);
 	*pfn = pa >> PAGE_SHIFT;
 
-	debugf0("%s: ECC physical address 0x%lx\n", __func__, pa);
+	debugf0("ECC physical address 0x%lx\n", pa);
 }
 
 static int cpc925_mc_find_channel(struct mem_ctl_info *mci, u16 syndrome)
@@ -852,8 +851,8 @@ static void cpc925_add_edac_devices(void __iomem *vbase)
 			goto err2;
 		}
 
-		debugf0("%s: Successfully added edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully added edac device for %s\n",
+			dev_info->ctl_name);
 
 		continue;
 
@@ -884,8 +883,8 @@ static void cpc925_del_edac_devices(void)
 		if (dev_info->exit)
 			dev_info->exit(dev_info);
 
-		debugf0("%s: Successfully deleted edac device for %s\n",
-			__func__, dev_info->ctl_name);
+		debugf0("Successfully deleted edac device for %s\n",
+			dev_info->ctl_name);
 	}
 }
 
@@ -900,7 +899,7 @@ static int cpc925_get_sdram_scrub_rate(struct mem_ctl_info *mci)
 	mscr = __raw_readl(pdata->vbase + REG_MSCR_OFFSET);
 	si = (mscr & MSCR_SI_MASK) >> MSCR_SI_SHIFT;
 
-	debugf0("%s, Mem Scrub Ctrl Register 0x%x\n", __func__, mscr);
+	debugf0("Mem Scrub Ctrl Register 0x%x\n", mscr);
 
 	if (((mscr & MSCR_SCRUB_MOD_MASK) != MSCR_BACKGR_SCRUB) ||
 	    (si == 0)) {
@@ -928,8 +927,7 @@ static int cpc925_mc_get_channels(void __iomem *vbase)
 	    ((mbcr & MBCR_64BITBUS_MASK) == 0))
 		dual = 1;
 
-	debugf0("%s: %s channel\n", __func__,
-		(dual > 0) ? "Dual" : "Single");
+	debugf0("%s channel\n", (dual > 0) ? "Dual" : "Single");
 
 	return dual;
 }
@@ -944,7 +942,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	struct resource *r;
 	int res = 0, nr_channels;
 
-	debugf0("%s: %s platform device found!\n", __func__, pdev->name);
+	debugf0("%s platform device found!\n", pdev->name);
 
 	if (!devres_open_group(&pdev->dev, cpc925_probe, GFP_KERNEL)) {
 		res = -ENOMEM;
@@ -1026,7 +1024,7 @@ static int __devinit cpc925_probe(struct platform_device *pdev)
 	cpc925_add_edac_devices(vbase);
 
 	/* get this far and it's successful */
-	debugf0("%s: success\n", __func__);
+	debugf0("success\n");
 
 	res = 0;
 	goto out;
diff --git a/drivers/edac/e752x_edac.c b/drivers/edac/e752x_edac.c
index 7e601c1..5a599a3 100644
--- a/drivers/edac/e752x_edac.c
+++ b/drivers/edac/e752x_edac.c
@@ -309,7 +309,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (page < pvt->tolm)
 		return page;
@@ -335,7 +335,7 @@ static void do_process_ce(struct mem_ctl_info *mci, u16 error_one,
 	int i;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* convert the addr to 4k page */
 	page = sec1_add >> (PAGE_SHIFT - 4);
@@ -394,7 +394,7 @@ static void do_process_ue(struct mem_ctl_info *mci, u16 error_one,
 	int row;
 	struct e752x_pvt *pvt = (struct e752x_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if (error_one & 0x0202) {
 		error_2b = ded_add;
@@ -453,7 +453,7 @@ static inline void process_ue_no_info_wr(struct mem_ctl_info *mci,
 	if (!handle_error)
 		return;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0,
 			     -1, -1, -1,
 			     "e752x UE log memory write", "", NULL);
@@ -982,7 +982,7 @@ static void e752x_check(struct mem_ctl_info *mci)
 {
 	struct e752x_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e752x_get_error_info(mci, &info);
 	e752x_process_error_info(mci, &info, 1);
 }
@@ -1102,7 +1102,7 @@ static void e752x_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E752X_DRB + index, &value);
 		/* convert a 128 or 64 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -1270,7 +1270,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;		/* Number of channels 0=1chan,1=2chan */
 	struct e752x_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 	debugf0("Starting Probe1\n");
 
 	/* check to see if device 0 function 1 is enabled; if it isn't, we
@@ -1302,7 +1302,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	/* 3100 IMCH supports SECDEC only */
 	mci->edac_ctl_cap = (dev_idx == I3100) ? EDAC_FLAG_SECDED :
@@ -1312,7 +1312,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_ver = E752X_REVISION;
 	mci->pdev = &pdev->dev;
 
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e752x_pvt *)mci->pvt_info;
 	pvt->dev_info = &e752x_devs[dev_idx];
 	pvt->mc_symmetric = ((ddrcsr & 0x10) != 0);
@@ -1322,7 +1322,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		return -ENODEV;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e752x_check;
@@ -1344,7 +1344,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 		mci->edac_cap = EDAC_FLAG_SECDED; /* the only mode supported */
 	else
 		mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E752X_TOLM, &pci_data);
@@ -1361,7 +1361,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -1379,7 +1379,7 @@ static int e752x_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -1395,7 +1395,7 @@ fail:
 static int __devinit e752x_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	if (pci_enable_device(pdev) < 0)
@@ -1409,7 +1409,7 @@ static void __devexit e752x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e752x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e752x_pci)
 		edac_pci_release_generic_ctl(e752x_pci);
@@ -1455,7 +1455,7 @@ static int __init e752x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1466,7 +1466,7 @@ static int __init e752x_init(void)
 
 static void __exit e752x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	pci_unregister_driver(&e752x_driver);
 }
 
diff --git a/drivers/edac/e7xxx_edac.c b/drivers/edac/e7xxx_edac.c
index 2defa96..2850d00 100644
--- a/drivers/edac/e7xxx_edac.c
+++ b/drivers/edac/e7xxx_edac.c
@@ -166,7 +166,7 @@ static const struct e7xxx_dev_info e7xxx_devs[] = {
 /* FIXME - is this valid for both SECDED and S4ECD4ED? */
 static inline int e7xxx_find_channel(u16 syndrome)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((syndrome & 0xff00) == 0)
 		return 0;
@@ -186,7 +186,7 @@ static unsigned long ctl_page_to_phys(struct mem_ctl_info *mci,
 	u32 remap;
 	struct e7xxx_pvt *pvt = (struct e7xxx_pvt *)mci->pvt_info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	if ((page < pvt->tolm) ||
 		((page >= 0x100000) && (page < pvt->remapbase)))
@@ -208,7 +208,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	int row;
 	int channel;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_1b = info->dram_celog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -225,7 +225,7 @@ static void process_ce(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ce_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx CE log register overflow", "", NULL);
 }
@@ -235,7 +235,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 	u32 error_2b, block_page;
 	int row;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	/* read the error address */
 	error_2b = info->dram_uelog_add;
 	/* FIXME - should use PAGE_SHIFT */
@@ -248,7 +248,7 @@ static void process_ue(struct mem_ctl_info *mci, struct e7xxx_error_info *info)
 
 static void process_ue_no_info(struct mem_ctl_info *mci)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci, 0, 0, 0, -1, -1, -1,
 			     "e7xxx UE log register overflow", "", NULL);
@@ -334,7 +334,7 @@ static void e7xxx_check(struct mem_ctl_info *mci)
 {
 	struct e7xxx_error_info info;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 	e7xxx_get_error_info(mci, &info);
 	e7xxx_process_error_info(mci, &info, 1);
 }
@@ -383,7 +383,7 @@ static void e7xxx_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		pci_read_config_byte(pdev, E7XXX_DRB + index, &value);
 		/* convert a 64 or 32 MiB DRB to a page size. */
 		cumul_size = value << (25 + drc_drbg - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -430,7 +430,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	int drc_chan;
 	struct e7xxx_error_info discard;
 
-	debugf0("%s(): mci\n", __func__);
+	debugf0("mci\n");
 
 	pci_read_config_dword(pdev, E7XXX_DRC, &drc);
 
@@ -453,7 +453,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED |
 		EDAC_FLAG_S4ECD4ED;
@@ -461,7 +461,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->mod_name = EDAC_MOD_STR;
 	mci->mod_ver = E7XXX_REVISION;
 	mci->pdev = &pdev->dev;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct e7xxx_pvt *)mci->pvt_info;
 	pvt->dev_info = &e7xxx_devs[dev_idx];
 	pvt->bridge_ck = pci_get_device(PCI_VENDOR_ID_INTEL,
@@ -474,14 +474,14 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): more mci init\n", __func__);
+	debugf3("more mci init\n");
 	mci->ctl_name = pvt->dev_info->ctl_name;
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = e7xxx_check;
 	mci->ctl_page_to_phys = ctl_page_to_phys;
 	e7xxx_init_csrows(mci, pdev, dev_idx, drc);
 	mci->edac_cap |= EDAC_FLAG_NONE;
-	debugf3("%s(): tolm, remapbase, remaplimit\n", __func__);
+	debugf3("tolm, remapbase, remaplimit\n");
 	/* load the top of low memory, remap base, and remap limit vars */
 	pci_read_config_word(pdev, E7XXX_TOLM, &pci_data);
 	pvt->tolm = ((u32) pci_data) << 4;
@@ -500,7 +500,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -516,7 +516,7 @@ static int e7xxx_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -532,7 +532,7 @@ fail0:
 static int __devinit e7xxx_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* wake up and enable device */
 	return pci_enable_device(pdev) ?
@@ -544,7 +544,7 @@ static void __devexit e7xxx_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct e7xxx_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (e7xxx_pci)
 		edac_pci_release_generic_ctl(e7xxx_pci);
diff --git a/drivers/edac/edac_device.c b/drivers/edac/edac_device.c
index cb397d9..ed46949 100644
--- a/drivers/edac/edac_device.c
+++ b/drivers/edac/edac_device.c
@@ -82,8 +82,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	void *pvt, *p;
 	int err;
 
-	debugf4("%s() instances=%d blocks=%d\n",
-		__func__, nr_instances, nr_blocks);
+	debugf4("instances=%d blocks=%d\n",
+		nr_instances, nr_blocks);
 
 	/* Calculate the size of memory we need to allocate AND
 	 * determine the offsets of the various item arrays
@@ -156,8 +156,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 	/* Name of this edac device */
 	snprintf(dev_ctl->name,sizeof(dev_ctl->name),"%s",edac_device_name);
 
-	debugf4("%s() edac_dev=%p next after end=%p\n",
-		__func__, dev_ctl, pvt + sz_private );
+	debugf4("edac_dev=%p next after end=%p\n",
+		dev_ctl, pvt + sz_private );
 
 	/* Initialize every Instance */
 	for (instance = 0; instance < nr_instances; instance++) {
@@ -178,9 +178,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			snprintf(blk->name, sizeof(blk->name),
 				 "%s%d", edac_block_name, block+offset_value);
 
-			debugf4("%s() instance=%d inst_p=%p block=#%d "
+			debugf4("instance=%d inst_p=%p block=#%d "
 				"block_p=%p name='%s'\n",
-				__func__, instance, inst, block,
+				instance, inst, block,
 				blk, blk->name);
 
 			/* if there are NO attributes OR no attribute pointer
@@ -194,8 +194,8 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 			attrib_p = &dev_attrib[block*nr_instances*nr_attrib];
 			blk->block_attributes = attrib_p;
 
-			debugf4("%s() THIS BLOCK_ATTRIB=%p\n",
-				__func__, blk->block_attributes);
+			debugf4("THIS BLOCK_ATTRIB=%p\n",
+				blk->block_attributes);
 
 			/* Initialize every user specified attribute in this
 			 * block with the data the caller passed in
@@ -214,9 +214,9 @@ struct edac_device_ctl_info *edac_device_alloc_ctl_info(
 
 				attrib->block = blk;	/* up link */
 
-				debugf4("%s() alloc-attrib=%p attrib_name='%s' "
+				debugf4("alloc-attrib=%p attrib_name='%s' "
 					"attrib-spec=%p spec-name=%s\n",
-					__func__, attrib, attrib->attr.name,
+					attrib, attrib->attr.name,
 					&attrib_spec[attr],
 					attrib_spec[attr].attr.name
 					);
@@ -273,7 +273,7 @@ static struct edac_device_ctl_info *find_edac_device_by_dev(struct device *dev)
 	struct edac_device_ctl_info *edac_dev;
 	struct list_head *item;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	list_for_each(item, &edac_device_list) {
 		edac_dev = list_entry(item, struct edac_device_ctl_info, link);
@@ -408,7 +408,7 @@ static void edac_device_workq_function(struct work_struct *work_req)
 void edac_device_workq_setup(struct edac_device_ctl_info *edac_dev,
 				unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* take the arg 'msec' and set it into the control structure
 	 * to used in the time period calculation
@@ -496,7 +496,7 @@ EXPORT_SYMBOL_GPL(edac_device_alloc_index);
  */
 int edac_device_add_device(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -570,7 +570,7 @@ struct edac_device_ctl_info *edac_device_del_device(struct device *dev)
 {
 	struct edac_device_ctl_info *edac_dev;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&device_ctls_mutex);
 
diff --git a/drivers/edac/edac_device_sysfs.c b/drivers/edac/edac_device_sysfs.c
index b4ea185..1cee83e 100644
--- a/drivers/edac/edac_device_sysfs.c
+++ b/drivers/edac/edac_device_sysfs.c
@@ -202,7 +202,7 @@ static void edac_device_ctrl_master_release(struct kobject *kobj)
 {
 	struct edac_device_ctl_info *edac_dev = to_edacdev(kobj);
 
-	debugf4("%s() control index=%d\n", __func__, edac_dev->dev_idx);
+	debugf4("control index=%d\n", edac_dev->dev_idx);
 
 	/* decrement the EDAC CORE module ref count */
 	module_put(edac_dev->owner);
@@ -233,12 +233,12 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	struct bus_type *edac_subsys;
 	int err;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the /sys/devices/system/edac reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys error\n", __func__);
+		debugf1("no edac_subsys error\n");
 		err = -ENODEV;
 		goto err_out;
 	}
@@ -264,8 +264,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 				   &edac_subsys->dev_root->kobj,
 				   "%s", edac_dev->name);
 	if (err) {
-		debugf1("%s()Failed to register '.../edac/%s'\n",
-			__func__, edac_dev->name);
+		debugf1("Failed to register '.../edac/%s'\n",
+			edac_dev->name);
 		goto err_kobj_reg;
 	}
 	kobject_uevent(&edac_dev->kobj, KOBJ_ADD);
@@ -274,8 +274,8 @@ int edac_device_register_sysfs_main_kobj(struct edac_device_ctl_info *edac_dev)
 	 * edac_device_unregister_sysfs_main_kobj() must be used
 	 */
 
-	debugf4("%s() Registered '.../edac/%s' kobject\n",
-		__func__, edac_dev->name);
+	debugf4("Registered '.../edac/%s' kobject\n",
+		edac_dev->name);
 
 	return 0;
 
@@ -296,9 +296,9 @@ err_out:
  */
 void edac_device_unregister_sysfs_main_kobj(struct edac_device_ctl_info *dev)
 {
-	debugf0("%s()\n", __func__);
-	debugf4("%s() name of kobject is: %s\n",
-		__func__, kobject_name(&dev->kobj));
+	debugf0("\n");
+	debugf4("name of kobject is: %s\n",
+		kobject_name(&dev->kobj));
 
 	/*
 	 * Unregister the edac device's kobject and
@@ -336,7 +336,7 @@ static void edac_device_ctrl_instance_release(struct kobject *kobj)
 {
 	struct edac_device_instance *instance;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* map from this kobj to the main control struct
 	 * and then dec the main kobj count
@@ -442,7 +442,7 @@ static void edac_device_ctrl_block_release(struct kobject *kobj)
 {
 	struct edac_device_block *block;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* get the container of the kobj */
 	block = to_block(kobj);
@@ -524,10 +524,10 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	struct edac_dev_sysfs_block_attribute *sysfs_attrib;
 	struct kobject *main_kobj;
 
-	debugf4("%s() Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
-		__func__, instance->name, instance, block->name, block);
-	debugf4("%s() block kobj=%p  block kobj->parent=%p\n",
-		__func__, &block->kobj, &block->kobj.parent);
+	debugf4("Instance '%s' inst_p=%p  block '%s'  block_p=%p\n",
+		instance->name, instance, block->name, block);
+	debugf4("block kobj=%p  block kobj->parent=%p\n",
+		&block->kobj, &block->kobj.parent);
 
 	/* init this block's kobject */
 	memset(&block->kobj, 0, sizeof(struct kobject));
@@ -546,8 +546,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 				   &instance->kobj,
 				   "%s", block->name);
 	if (err) {
-		debugf1("%s() Failed to register instance '%s'\n",
-			__func__, block->name);
+		debugf1("Failed to register instance '%s'\n",
+			block->name);
 		kobject_put(main_kobj);
 		err = -ENODEV;
 		goto err_out;
@@ -560,9 +560,8 @@ static int edac_device_create_block(struct edac_device_ctl_info *edac_dev,
 	if (sysfs_attrib && block->nr_attribs) {
 		for (i = 0; i < block->nr_attribs; i++, sysfs_attrib++) {
 
-			debugf4("%s() creating block attrib='%s' "
+			debugf4("creating block attrib='%s' "
 				"attrib->%p to kobj=%p\n",
-				__func__,
 				sysfs_attrib->attr.name,
 				sysfs_attrib, &block->kobj);
 
@@ -647,14 +646,14 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	err = kobject_init_and_add(&instance->kobj, &ktype_instance_ctrl,
 				   &edac_dev->kobj, "%s", instance->name);
 	if (err != 0) {
-		debugf2("%s() Failed to register instance '%s'\n",
-			__func__, instance->name);
+		debugf2("Failed to register instance '%s'\n",
+			instance->name);
 		kobject_put(main_kobj);
 		goto err_out;
 	}
 
-	debugf4("%s() now register '%d' blocks for instance %d\n",
-		__func__, instance->nr_blocks, idx);
+	debugf4("now register '%d' blocks for instance %d\n",
+		instance->nr_blocks, idx);
 
 	/* register all blocks of this instance */
 	for (i = 0; i < instance->nr_blocks; i++) {
@@ -670,8 +669,8 @@ static int edac_device_create_instance(struct edac_device_ctl_info *edac_dev,
 	}
 	kobject_uevent(&instance->kobj, KOBJ_ADD);
 
-	debugf4("%s() Registered instance %d '%s' kobject\n",
-		__func__, idx, instance->name);
+	debugf4("Registered instance %d '%s' kobject\n",
+		idx, instance->name);
 
 	return 0;
 
@@ -715,7 +714,7 @@ static int edac_device_create_instances(struct edac_device_ctl_info *edac_dev)
 	int i, j;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* iterate over creation of the instances */
 	for (i = 0; i < edac_dev->nr_instances; i++) {
@@ -817,12 +816,12 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	int err;
 	struct kobject *edac_kobj = &edac_dev->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, edac_dev->dev_idx);
+	debugf0("idx=%d\n", edac_dev->dev_idx);
 
 	/*  go create any main attributes callers wants */
 	err = edac_device_add_main_sysfs_attributes(edac_dev);
 	if (err) {
-		debugf0("%s() failed to add sysfs attribs\n", __func__);
+		debugf0("failed to add sysfs attribs\n");
 		goto err_out;
 	}
 
@@ -832,8 +831,8 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	err = sysfs_create_link(edac_kobj,
 				&edac_dev->dev->kobj, EDAC_DEVICE_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto err_remove_main_attribs;
 	}
 
@@ -843,14 +842,14 @@ int edac_device_create_sysfs(struct edac_device_ctl_info *edac_dev)
 	 */
 	err = edac_device_create_instances(edac_dev);
 	if (err) {
-		debugf0("%s() edac_device_create_instances() "
-			"returned err= %d\n", __func__, err);
+		debugf0("edac_device_create_instances() "
+			"returned err= %d\n", err);
 		goto err_remove_link;
 	}
 
 
-	debugf4("%s() create-instances done, idx=%d\n",
-		__func__, edac_dev->dev_idx);
+	debugf4("create-instances done, idx=%d\n",
+		edac_dev->dev_idx);
 
 	return 0;
 
@@ -873,7 +872,7 @@ err_out:
  */
 void edac_device_remove_sysfs(struct edac_device_ctl_info *edac_dev)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* remove any main attributes for this device */
 	edac_device_remove_main_sysfs_attributes(edac_dev);
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 65568e6..d8278b3 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -259,18 +259,18 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	count = 1;
 	for (i = 0; i < n_layers; i++) {
 		count *= layers[i].size;
-		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		debugf4("errcount layer %d size %d\n", i, count);
 		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
 		tot_errcount += 2 * count;
 	}
 
-	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
+	debugf4("allocating %d error counters\n", tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
-	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
-		__func__, size,
+	debugf1("allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		size,
 		tot_dimms,
 		per_rank ? "ranks" : "dimms",
 		tot_csrows * tot_channels);
@@ -337,7 +337,7 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+	debugf4("initializing %d %s\n", tot_dimms,
 		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
@@ -351,8 +351,8 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		debugf2("%s: %d: %s%i (%d:%d:%d): row %d, chan %d\n", __func__,
-			i, per_rank ? "rank" : "dimm", off,
+		debugf2("%d: %s%i (%d:%d:%d): row %d, chan %d\n", i,
+			per_rank ? "rank" : "dimm", off,
 			pos[0], pos[1], pos[2], row, chn);
 
 		/*
@@ -451,7 +451,7 @@ EXPORT_SYMBOL_GPL(edac_mc_alloc);
  */
 void edac_mc_free(struct mem_ctl_info *mci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	/* the mci instance is freed here, when the sysfs object is dropped */
 	edac_unregister_sysfs(mci);
@@ -471,7 +471,7 @@ struct mem_ctl_info *find_mci_by_dev(struct device *dev)
 	struct mem_ctl_info *mci;
 	struct list_head *item;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	list_for_each(item, &mc_devices) {
 		mci = list_entry(item, struct mem_ctl_info, link);
@@ -539,7 +539,7 @@ static void edac_mc_workq_function(struct work_struct *work_req)
  */
 static void edac_mc_workq_setup(struct mem_ctl_info *mci, unsigned msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* if this instance is not in the POLL state, then simply return */
 	if (mci->op_state != OP_RUNNING_POLL)
@@ -566,8 +566,7 @@ static void edac_mc_workq_teardown(struct mem_ctl_info *mci)
 
 	status = cancel_delayed_work(&mci->work);
 	if (status == 0) {
-		debugf0("%s() not canceled, flush the queue\n",
-			__func__);
+		debugf0("not canceled, flush the queue\n");
 
 		/* workq instance might be running, wait for it */
 		flush_workqueue(edac_workqueue);
@@ -714,7 +713,7 @@ EXPORT_SYMBOL(edac_mc_find);
 /* FIXME - should a warning be printed if no error detection? correction? */
 int edac_mc_add_mc(struct mem_ctl_info *mci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	if (edac_debug_level >= 3)
@@ -785,7 +784,7 @@ struct mem_ctl_info *edac_mc_del_mc(struct device *dev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&mem_ctls_mutex);
 
@@ -823,7 +822,7 @@ static void edac_mc_scrub_block(unsigned long page, unsigned long offset,
 	void *virt_addr;
 	unsigned long flags = 0;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* ECC error page was not in our memory. Ignore it. */
 	if (!pfn_valid(page))
@@ -853,7 +852,7 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 	struct csrow_info **csrows = mci->csrows;
 	int row, i, j, n;
 
-	debugf1("MC%d: %s(): 0x%lx\n", mci->mc_idx, __func__, page);
+	debugf1("MC%d: 0x%lx\n", mci->mc_idx, page);
 	row = -1;
 
 	for (i = 0; i < mci->nr_csrows; i++) {
@@ -866,8 +865,8 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 		if (n == 0)
 			continue;
 
-		debugf3("MC%d: %s(): first(0x%lx) page(0x%lx) last(0x%lx) "
-			"mask(0x%lx)\n", mci->mc_idx, __func__,
+		debugf3("MC%d: first(0x%lx) page(0x%lx) last(0x%lx) "
+			"mask(0x%lx)\n", mci->mc_idx,
 			csrow->first_page, page, csrow->last_page,
 			csrow->page_mask);
 
@@ -969,7 +968,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	u32 grain;
 	bool enable_filter = false;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf3("MC%d\n", mci->mc_idx);
 
 	/* Check if the event report is consistent */
 	for (i = 0; i < mci->n_layers; i++) {
@@ -1043,8 +1042,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 			 * get csrow/channel of the dimm, in order to allow
 			 * incrementing the compat API counters
 			 */
-			debugf4("%s: %s csrows map: (%d,%d)\n",
-				__func__,
+			debugf4("%s csrows map: (%d,%d)\n",
 				mci->mem_is_per_rank ? "rank" : "dimm",
 				dimm->csrow, dimm->cschannel);
 			if (row == -1)
@@ -1060,8 +1058,8 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	if (!enable_filter) {
 		strcpy(label, "any memory");
 	} else {
-		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
-			__func__, row, chan);
+		debugf4("csrow/channel to increment: (%d,%d)\n",
+			row, chan);
 		if (p == label)
 			strcpy(label, "unknown memory");
 		if (type == HW_EVENT_ERR_CORRECTED) {
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 81ca073..8f96c49 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -376,8 +376,7 @@ static int edac_create_csrow_object(struct mem_ctl_info *mci,
 	dev_set_name(&csrow->dev, "csrow%d", index);
 	dev_set_drvdata(&csrow->dev, csrow);
 
-	debugf0("%s(): creating (virtual) csrow node %s\n", __func__,
-		dev_name(&csrow->dev));
+	debugf0("creating (virtual) csrow node %s\n", dev_name(&csrow->dev));
 
 	err = device_add(&csrow->dev);
 	if (err < 0)
@@ -623,8 +622,7 @@ static int edac_create_dimm_object(struct mem_ctl_info *mci,
 
 	err =  device_add(&dimm->dev);
 
-	debugf0("%s(): creating rank/dimm device %s\n", __func__,
-		dev_name(&dimm->dev));
+	debugf0("creating rank/dimm device %s\n", dev_name(&dimm->dev));
 
 	return err;
 }
@@ -981,8 +979,7 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 	dev_set_drvdata(&mci->dev, mci);
 	pm_runtime_forbid(&mci->dev);
 
-	debugf0("%s(): creating device %s\n", __func__,
-		dev_name(&mci->dev));
+	debugf0("creating device %s\n", dev_name(&mci->dev));
 	err = device_add(&mci->dev);
 	if (err < 0) {
 		bus_unregister(&mci->bus);
@@ -999,8 +996,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 		if (dimm->nr_pages == 0)
 			continue;
 #ifdef CONFIG_EDAC_DEBUG
-		debugf1("%s creating dimm%d, located at ",
-			__func__, i);
+		debugf1("creating dimm%d, located at ",
+			i);
 		if (edac_debug_level >= 1) {
 			int lay;
 			for (lay = 0; lay < mci->n_layers; lay++)
@@ -1012,8 +1009,8 @@ int edac_create_sysfs_mci_device(struct mem_ctl_info *mci)
 #endif
 		err = edac_create_dimm_object(mci, dimm, i);
 		if (err) {
-			debugf1("%s() failure: create dimm %d obj\n",
-				__func__, i);
+			debugf1("failure: create dimm %d obj\n",
+				i);
 			goto fail;
 		}
 	}
@@ -1051,7 +1048,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 {
 	int i;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 #ifdef CONFIG_EDAC_DEBUG
 	debugfs_remove(mci->debugfs);
@@ -1064,8 +1061,7 @@ void edac_remove_sysfs_mci_device(struct mem_ctl_info *mci)
 		struct dimm_info *dimm = mci->dimms[i];
 		if (dimm->nr_pages == 0)
 			continue;
-		debugf0("%s(): removing device %s\n", __func__,
-			dev_name(&dimm->dev));
+		debugf0("removing device %s\n", dev_name(&dimm->dev));
 		put_device(&dimm->dev);
 		device_del(&dimm->dev);
 	}
@@ -1105,7 +1101,7 @@ int __init edac_mc_sysfs_init(void)
 	/* get the /sys/devices/system/edac subsys reference */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		return -EINVAL;
 	}
 
diff --git a/drivers/edac/edac_module.c b/drivers/edac/edac_module.c
index 8735a0d..9de2484 100644
--- a/drivers/edac/edac_module.c
+++ b/drivers/edac/edac_module.c
@@ -113,7 +113,7 @@ error:
  */
 static void __exit edac_exit(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* tear down the various subsystems */
 	edac_workqueue_teardown();
diff --git a/drivers/edac/edac_pci.c b/drivers/edac/edac_pci.c
index f1ac866..51dd4e0 100644
--- a/drivers/edac/edac_pci.c
+++ b/drivers/edac/edac_pci.c
@@ -45,7 +45,7 @@ struct edac_pci_ctl_info *edac_pci_alloc_ctl_info(unsigned int sz_pvt,
 	void *p = NULL, *pvt;
 	unsigned int size;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	pci = edac_align_ptr(&p, sizeof(*pci), 1);
 	pvt = edac_align_ptr(&p, 1, sz_pvt);
@@ -80,7 +80,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_ctl_info);
  */
 void edac_pci_free_ctl_info(struct edac_pci_ctl_info *pci)
 {
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	edac_pci_remove_sysfs(pci);
 }
@@ -97,7 +97,7 @@ static struct edac_pci_ctl_info *find_edac_pci_by_dev(struct device *dev)
 	struct edac_pci_ctl_info *pci;
 	struct list_head *item;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	list_for_each(item, &edac_pci_list) {
 		pci = list_entry(item, struct edac_pci_ctl_info, link);
@@ -122,7 +122,7 @@ static int add_edac_pci_to_global_list(struct edac_pci_ctl_info *pci)
 	struct list_head *item, *insert_before;
 	struct edac_pci_ctl_info *rover;
 
-	debugf1("%s()\n", __func__);
+	debugf1("\n");
 
 	insert_before = &edac_pci_list;
 
@@ -226,7 +226,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 	int msec;
 	unsigned long delay;
 
-	debugf3("%s() checking\n", __func__);
+	debugf3("checking\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -261,7 +261,7 @@ static void edac_pci_workq_function(struct work_struct *work_req)
 static void edac_pci_workq_setup(struct edac_pci_ctl_info *pci,
 				 unsigned int msec)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	INIT_DELAYED_WORK(&pci->work, edac_pci_workq_function);
 	queue_delayed_work(edac_workqueue, &pci->work,
@@ -276,7 +276,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 {
 	int status;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	status = cancel_delayed_work(&pci->work);
 	if (status == 0)
@@ -293,7 +293,7 @@ static void edac_pci_workq_teardown(struct edac_pci_ctl_info *pci)
 void edac_pci_reset_delay_period(struct edac_pci_ctl_info *pci,
 				 unsigned long value)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_workq_teardown(pci);
 
@@ -333,7 +333,7 @@ EXPORT_SYMBOL_GPL(edac_pci_alloc_index);
  */
 int edac_pci_add_device(struct edac_pci_ctl_info *pci, int edac_idx)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci->pci_idx = edac_idx;
 	pci->start_time = jiffies;
@@ -393,7 +393,7 @@ struct edac_pci_ctl_info *edac_pci_del_device(struct device *dev)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mutex_lock(&edac_pci_ctls_mutex);
 
@@ -430,7 +430,7 @@ EXPORT_SYMBOL_GPL(edac_pci_del_device);
  */
 static void edac_pci_generic_check(struct edac_pci_ctl_info *pci)
 {
-	debugf4("%s()\n", __func__);
+	debugf4("\n");
 	edac_pci_do_parity_check();
 }
 
@@ -475,7 +475,7 @@ struct edac_pci_ctl_info *edac_pci_create_generic_ctl(struct device *dev,
 	pdata->edac_idx = edac_pci_idx++;
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		edac_pci_free_ctl_info(pci);
 		return NULL;
 	}
@@ -491,7 +491,7 @@ EXPORT_SYMBOL_GPL(edac_pci_create_generic_ctl);
  */
 void edac_pci_release_generic_ctl(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() pci mod=%s\n", __func__, pci->mod_name);
+	debugf0("pci mod=%s\n", pci->mod_name);
 
 	edac_pci_del_device(pci->dev);
 	edac_pci_free_ctl_info(pci);
diff --git a/drivers/edac/edac_pci_sysfs.c b/drivers/edac/edac_pci_sysfs.c
index 97f5064..6678216 100644
--- a/drivers/edac/edac_pci_sysfs.c
+++ b/drivers/edac/edac_pci_sysfs.c
@@ -78,7 +78,7 @@ static void edac_pci_instance_release(struct kobject *kobj)
 {
 	struct edac_pci_ctl_info *pci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Form pointer to containing struct, the pci control struct */
 	pci = to_instance(kobj);
@@ -161,7 +161,7 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	struct kobject *main_kobj;
 	int err;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* First bump the ref count on the top main kobj, which will
 	 * track the number of PCI instances we have, and thus nest
@@ -177,14 +177,14 @@ static int edac_pci_create_instance_kobj(struct edac_pci_ctl_info *pci, int idx)
 	err = kobject_init_and_add(&pci->kobj, &ktype_pci_instance,
 				   edac_pci_top_main_kobj, "pci%d", idx);
 	if (err != 0) {
-		debugf2("%s() failed to register instance pci%d\n",
-			__func__, idx);
+		debugf2("failed to register instance pci%d\n",
+			idx);
 		kobject_put(edac_pci_top_main_kobj);
 		goto error_out;
 	}
 
 	kobject_uevent(&pci->kobj, KOBJ_ADD);
-	debugf1("%s() Register instance 'pci%d' kobject\n", __func__, idx);
+	debugf1("Register instance 'pci%d' kobject\n", idx);
 
 	return 0;
 
@@ -201,7 +201,7 @@ error_out:
 static void edac_pci_unregister_sysfs_instance_kobj(
 			struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Unregister the instance kobject and allow its release
 	 * function release the main reference count and then
@@ -317,7 +317,7 @@ static struct edac_pci_dev_attribute *edac_pci_attr[] = {
  */
 static void edac_pci_release_main_kobj(struct kobject *kobj)
 {
-	debugf0("%s() here to module_put(THIS_MODULE)\n", __func__);
+	debugf0("here to module_put(THIS_MODULE)\n");
 
 	kfree(kobj);
 
@@ -345,7 +345,7 @@ static int edac_pci_main_kobj_setup(void)
 	int err;
 	struct bus_type *edac_subsys;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* check and count if we have already created the main kobject */
 	if (atomic_inc_return(&edac_pci_sysfs_refcount) != 1)
@@ -356,7 +356,7 @@ static int edac_pci_main_kobj_setup(void)
 	 */
 	edac_subsys = edac_get_sysfs_subsys();
 	if (edac_subsys == NULL) {
-		debugf1("%s() no edac_subsys\n", __func__);
+		debugf1("no edac_subsys\n");
 		err = -ENODEV;
 		goto decrement_count_fail;
 	}
@@ -366,7 +366,7 @@ static int edac_pci_main_kobj_setup(void)
 	 * level main kobj for EDAC PCI
 	 */
 	if (!try_module_get(THIS_MODULE)) {
-		debugf1("%s() try_module_get() failed\n", __func__);
+		debugf1("try_module_get() failed\n");
 		err = -ENODEV;
 		goto mod_get_fail;
 	}
@@ -421,15 +421,14 @@ decrement_count_fail:
  */
 static void edac_pci_main_kobj_teardown(void)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* Decrement the count and only if no more controller instances
 	 * are connected perform the unregisteration of the top level
 	 * main kobj
 	 */
 	if (atomic_dec_return(&edac_pci_sysfs_refcount) == 0) {
-		debugf0("%s() called kobject_put on main kobj\n",
-			__func__);
+		debugf0("called kobject_put on main kobj\n");
 		kobject_put(edac_pci_top_main_kobj);
 	}
 	edac_put_sysfs_subsys();
@@ -446,7 +445,7 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 	int err;
 	struct kobject *edac_kobj = &pci->kobj;
 
-	debugf0("%s() idx=%d\n", __func__, pci->pci_idx);
+	debugf0("idx=%d\n", pci->pci_idx);
 
 	/* create the top main EDAC PCI kobject, IF needed */
 	err = edac_pci_main_kobj_setup();
@@ -460,8 +459,8 @@ int edac_pci_create_sysfs(struct edac_pci_ctl_info *pci)
 
 	err = sysfs_create_link(edac_kobj, &pci->dev->kobj, EDAC_PCI_SYMLINK);
 	if (err) {
-		debugf0("%s() sysfs_create_link() returned err= %d\n",
-			__func__, err);
+		debugf0("sysfs_create_link() returned err= %d\n",
+			err);
 		goto symlink_fail;
 	}
 
@@ -484,7 +483,7 @@ unregister_cleanup:
  */
 void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 {
-	debugf0("%s() index=%d\n", __func__, pci->pci_idx);
+	debugf0("index=%d\n", pci->pci_idx);
 
 	/* Remove the symlink */
 	sysfs_remove_link(&pci->kobj, EDAC_PCI_SYMLINK);
@@ -496,7 +495,7 @@ void edac_pci_remove_sysfs(struct edac_pci_ctl_info *pci)
 	 * if this 'pci' is the last instance.
 	 * If it is, the main kobject will be unregistered as a result
 	 */
-	debugf0("%s() calling edac_pci_main_kobj_teardown()\n", __func__);
+	debugf0("calling edac_pci_main_kobj_teardown()\n");
 	edac_pci_main_kobj_teardown();
 }
 
@@ -671,7 +670,7 @@ void edac_pci_do_parity_check(void)
 {
 	int before_count;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	/* if policy has PCI check off, leave now */
 	if (!check_pci_errors)
diff --git a/drivers/edac/i3000_edac.c b/drivers/edac/i3000_edac.c
index 55eff02..1f05480 100644
--- a/drivers/edac/i3000_edac.c
+++ b/drivers/edac/i3000_edac.c
@@ -275,7 +275,7 @@ static void i3000_check(struct mem_ctl_info *mci)
 {
 	struct i3000_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i3000_get_error_info(mci, &info);
 	i3000_process_error_info(mci, &info, 1);
 }
@@ -322,7 +322,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	unsigned long mchbar;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	pci_read_config_dword(pdev, I3000_MCHBAR, (u32 *) & mchbar);
 	mchbar &= I3000_MCHBAR_MASK;
@@ -366,7 +366,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -399,8 +399,8 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 		cumul_size = value << (I3000_DRB_SHIFT - PAGE_SHIFT);
 		if (interleaved)
 			cumul_size <<= 1;
-		debugf3("MC: %s(): (%d) cumul_size 0x%x\n",
-			__func__, i, cumul_size);
+		debugf3("MC: (%d) cumul_size 0x%x\n",
+			i, cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;
 
@@ -429,7 +429,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -445,7 +445,7 @@ static int i3000_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -461,7 +461,7 @@ static int __devinit i3000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -477,7 +477,7 @@ static void __devexit i3000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i3000_pci)
 		edac_pci_release_generic_ctl(i3000_pci);
@@ -511,7 +511,7 @@ static int __init i3000_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -552,7 +552,7 @@ fail0:
 
 static void __exit i3000_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3000_driver);
 	if (!i3000_registered) {
diff --git a/drivers/edac/i3200_edac.c b/drivers/edac/i3200_edac.c
index 818ee6f..ce2d60c 100644
--- a/drivers/edac/i3200_edac.c
+++ b/drivers/edac/i3200_edac.c
@@ -245,7 +245,7 @@ static void i3200_check(struct mem_ctl_info *mci)
 {
 	struct i3200_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i3200_get_and_clear_error_info(mci, &info);
 	i3200_process_error_info(mci, &info);
 }
@@ -332,7 +332,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	void __iomem *window;
 	struct i3200_priv *priv;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = i3200_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -403,12 +403,12 @@ static int i3200_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -424,7 +424,7 @@ static int __devinit i3200_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -441,7 +441,7 @@ static void __devexit i3200_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i3200_priv *priv;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -475,7 +475,7 @@ static int __init i3200_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -516,7 +516,7 @@ fail0:
 
 static void __exit i3200_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&i3200_driver);
 	if (!i3200_registered) {
diff --git a/drivers/edac/i5000_edac.c b/drivers/edac/i5000_edac.c
index 2a9f1dc..0292a06 100644
--- a/drivers/edac/i5000_edac.c
+++ b/drivers/edac/i5000_edac.c
@@ -779,7 +779,7 @@ static void i5000_clear_error(struct mem_ctl_info *mci)
 static void i5000_check_error(struct mem_ctl_info *mci)
 {
 	struct i5000_error_info info;
-	debugf4("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf4("MC%d\n", mci->mc_idx);
 	i5000_get_error_info(mci, &info);
 	i5000_process_error_info(mci, &info, 1);
 }
@@ -1363,9 +1363,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	int num_channels;
 	int num_dimms_per_channel;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s: pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1388,8 +1387,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	i5000_get_dimm_and_channel_counts(pdev, &num_dimms_per_channel,
 					&num_channels);
 
-	debugf0("MC: %s(): Number of Branches=2 Channels= %d  DIMMS= %d\n",
-		__func__, num_channels, num_dimms_per_channel);
+	debugf0("MC: Number of Branches=2 Channels= %d  DIMMS= %d\n",
+		num_channels, num_dimms_per_channel);
 
 	/* allocate a new MC control structure */
 
@@ -1407,7 +1406,7 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s: mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1450,8 +1449,8 @@ static int i5000_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s: failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1495,7 +1494,7 @@ static int __devinit i5000_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1514,7 +1513,7 @@ static void __devexit i5000_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s\n", __FILE__);
 
 	if (i5000_pci)
 		edac_pci_release_generic_ctl(i5000_pci);
@@ -1560,7 +1559,7 @@ static int __init i5000_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -1576,7 +1575,7 @@ static int __init i5000_init(void)
  */
 static void __exit i5000_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 	pci_unregister_driver(&i5000_driver);
 }
 
diff --git a/drivers/edac/i5400_edac.c b/drivers/edac/i5400_edac.c
index 676591e..a736b98 100644
--- a/drivers/edac/i5400_edac.c
+++ b/drivers/edac/i5400_edac.c
@@ -700,7 +700,7 @@ static void i5400_clear_error(struct mem_ctl_info *mci)
 static void i5400_check_error(struct mem_ctl_info *mci)
 {
 	struct i5400_error_info info;
-	debugf4("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf4("MC%d\n", mci->mc_idx);
 	i5400_get_error_info(mci, &info);
 	i5400_process_error_info(mci, &info);
 }
@@ -1203,8 +1203,7 @@ static int i5400_init_dimms(struct mem_ctl_info *mci)
 
 			size_mb =  pvt->dimm_info[slot][channel].megabytes;
 
-			debugf2("%s: dimm (branch %d channel %d slot %d): %d.%03d GB\n",
-				__func__,
+			debugf2("dimm (branch %d channel %d slot %d): %d.%03d GB\n",
 				channel / 2, channel % 2, slot,
 				size_mb / 1000, size_mb % 1000);
 
@@ -1270,9 +1269,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (dev_idx >= ARRAY_SIZE(i5400_devs))
 		return -EINVAL;
 
-	debugf0("MC: %s: %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__FILE__, __func__,
-		pdev->bus->number,
+	debugf0("MC: %s: pdev bus %u dev=0x%x fn=0x%x\n",
+		__FILE__, pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
 	/* We only are looking for func 0 of the set */
@@ -1298,7 +1296,7 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: %s: mci = %p\n", __FILE__, mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1341,8 +1339,8 @@ static int i5400_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: %s: %s(): failed edac_mc_add_mc()\n",
-			__FILE__, __func__);
+		debugf0("MC: %s: failed edac_mc_add_mc()\n",
+			__FILE__);
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1386,7 +1384,7 @@ static int __devinit i5400_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC: %s\n", __FILE__);
 
 	/* wake up device */
 	rc = pci_enable_device(pdev);
@@ -1405,7 +1403,7 @@ static void __devexit i5400_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("%s\n", __FILE__);
 
 	if (i5400_pci)
 		edac_pci_release_generic_ctl(i5400_pci);
@@ -1451,7 +1449,7 @@ static int __init i5400_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1467,7 +1465,7 @@ static int __init i5400_init(void)
  */
 static void __exit i5400_exit(void)
 {
-	debugf2("MC: %s: %s()\n", __FILE__, __func__);
+	debugf2("MC: %s\n", __FILE__);
 	pci_unregister_driver(&i5400_driver);
 }
 
diff --git a/drivers/edac/i7300_edac.c b/drivers/edac/i7300_edac.c
index 7425f17..aa3eb98 100644
--- a/drivers/edac/i7300_edac.c
+++ b/drivers/edac/i7300_edac.c
@@ -1032,8 +1032,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (rc == -EIO)
 		return rc;
 
-	debugf0("MC: " __FILE__ ": %s(), pdev bus %u dev=0x%x fn=0x%x\n",
-		__func__,
+	debugf0("MC: pdev bus %u dev=0x%x fn=0x%x\n",
 		pdev->bus->number,
 		PCI_SLOT(pdev->devfn), PCI_FUNC(pdev->devfn));
 
@@ -1056,7 +1055,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p\n", __func__, mci);
+	debugf0("MC: mci = %p\n", mci);
 
 	mci->pdev = &pdev->dev;	/* record ptr  to the generic device */
 
@@ -1100,8 +1099,7 @@ static int __devinit i7300_init_one(struct pci_dev *pdev,
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (edac_mc_add_mc(mci)) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -1143,7 +1141,7 @@ static void __devexit i7300_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	char *tmp;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	if (i7300_pci)
 		edac_pci_release_generic_ctl(i7300_pci);
@@ -1190,7 +1188,7 @@ static int __init i7300_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1205,7 +1203,7 @@ static int __init i7300_init(void)
  */
 static void __exit i7300_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&i7300_driver);
 }
 
diff --git a/drivers/edac/i7core_edac.c b/drivers/edac/i7core_edac.c
index ef237f4..fcf9cfc 100644
--- a/drivers/edac/i7core_edac.c
+++ b/drivers/edac/i7core_edac.c
@@ -824,7 +824,7 @@ static ssize_t i7core_inject_store_##param(			\
 	long value;						\
 	int rc;							\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	pvt = mci->pvt_info;					\
 								\
 	if (pvt->inject.enable)					\
@@ -852,7 +852,7 @@ static ssize_t i7core_inject_show_##param(			\
 	struct i7core_pvt *pvt;					\
 								\
 	pvt = mci->pvt_info;					\
-	debugf1("%s() pvt=%p\n", __func__, pvt);		\
+	debugf1("pvt=%p\n", pvt);		\
 	if (pvt->inject.param < 0)				\
 		return sprintf(data, "any\n");			\
 	else							\
@@ -1059,7 +1059,7 @@ static ssize_t i7core_show_counter_##param(			\
 	struct mem_ctl_info *mci = to_mci(dev);			\
 	struct i7core_pvt *pvt = mci->pvt_info;			\
 								\
-	debugf1("%s()\n", __func__);				\
+	debugf1("\n");				\
 	if (!pvt->ce_count_available || (pvt->is_registered))	\
 		return sprintf(data, "data unavailable\n");	\
 	return sprintf(data, "%lu\n",				\
@@ -1190,8 +1190,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 	dev_set_name(pvt->addrmatch_dev, "inject_addrmatch");
 	dev_set_drvdata(pvt->addrmatch_dev, mci);
 
-	debugf1("%s(): creating %s\n", __func__,
-		dev_name(pvt->addrmatch_dev));
+	debugf1("creating %s\n", dev_name(pvt->addrmatch_dev));
 
 	rc = device_add(pvt->addrmatch_dev);
 	if (rc < 0)
@@ -1213,8 +1212,7 @@ static int i7core_create_sysfs_devices(struct mem_ctl_info *mci)
 		dev_set_name(pvt->chancounts_dev, "all_channel_counts");
 		dev_set_drvdata(pvt->chancounts_dev, mci);
 
-		debugf1("%s(): creating %s\n", __func__,
-			dev_name(pvt->chancounts_dev));
+		debugf1("creating %s\n", dev_name(pvt->chancounts_dev));
 
 		rc = device_add(pvt->chancounts_dev);
 		if (rc < 0)
@@ -1254,7 +1252,7 @@ static void i7core_put_devices(struct i7core_dev *i7core_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < i7core_dev->n_devs; i++) {
 		struct pci_dev *pdev = i7core_dev->pdev[i];
 		if (!pdev)
@@ -1652,7 +1650,7 @@ static void i7core_udimm_check_mc_ecc_err(struct mem_ctl_info *mci)
 	int new0, new1, new2;
 
 	if (!pvt->pci_mcr[4]) {
-		debugf0("%s MCR registers not found\n", __func__);
+		debugf0("MCR registers not found\n");
 		return;
 	}
 
@@ -2190,8 +2188,7 @@ static void i7core_unregister_mci(struct i7core_dev *i7core_dev)
 	struct i7core_pvt *pvt;
 
 	if (unlikely(!mci || !mci->pvt_info)) {
-		debugf0("MC: " __FILE__ ": %s(): dev = %p\n",
-			__func__, &i7core_dev->pdev[0]->dev);
+		debugf0("MC: dev = %p\n", &i7core_dev->pdev[0]->dev);
 
 		i7core_printk(KERN_ERR, "Couldn't find mci handler\n");
 		return;
@@ -2199,8 +2196,7 @@ static void i7core_unregister_mci(struct i7core_dev *i7core_dev)
 
 	pvt = mci->pvt_info;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &i7core_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n", mci, &i7core_dev->pdev[0]->dev);
 
 	/* Disable scrubrate setting */
 	if (pvt->enable_scrub)
@@ -2241,8 +2237,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &i7core_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n", mci, &i7core_dev->pdev[0]->dev);
 
 	pvt = mci->pvt_info;
 	memset(pvt, 0, sizeof(*pvt));
@@ -2285,8 +2280,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (unlikely(edac_mc_add_mc(mci))) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		/* FIXME: perhaps some code should go here that disables error
 		 * reporting if we just enabled it
 		 */
@@ -2295,8 +2289,7 @@ static int i7core_register_mci(struct i7core_dev *i7core_dev)
 		goto fail0;
 	}
 	if (i7core_create_sysfs_devices(mci)) {
-		debugf0("MC: " __FILE__
-			": %s(): failed to create sysfs nodes\n", __func__);
+		debugf0("MC: failed to create sysfs nodes\n");
 		edac_mc_del_mc(mci->pdev);
 		rc = -EINVAL;
 		goto fail0;
@@ -2402,7 +2395,7 @@ static void __devexit i7core_remove(struct pci_dev *pdev)
 {
 	struct i7core_dev *i7core_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
@@ -2451,7 +2444,7 @@ static int __init i7core_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -2476,7 +2469,7 @@ static int __init i7core_init(void)
  */
 static void __exit i7core_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&i7core_driver);
 }
 
diff --git a/drivers/edac/i82443bxgx_edac.c b/drivers/edac/i82443bxgx_edac.c
index c0249f3..b9215e8 100644
--- a/drivers/edac/i82443bxgx_edac.c
+++ b/drivers/edac/i82443bxgx_edac.c
@@ -178,7 +178,7 @@ static void i82443bxgx_edacmc_check(struct mem_ctl_info *mci)
 {
 	struct i82443bxgx_edacmc_error_info info;
 
-	debugf1("MC%d: %s: %s()\n", mci->mc_idx, __FILE__, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82443bxgx_edacmc_get_error_info(mci, &info);
 	i82443bxgx_edacmc_process_error_info(mci, &info, 1);
 }
@@ -201,13 +201,13 @@ static void i82443bxgx_init_csrows(struct mem_ctl_info *mci,
 		dimm = csrow->channels[0]->dimm;
 
 		pci_read_config_byte(pdev, I82443BXGX_DRB + index, &drbar);
-		debugf1("MC%d: %s: %s() Row=%d DRB = %#0x\n",
-			mci->mc_idx, __FILE__, __func__, index, drbar);
+		debugf1("MC%d: Row=%d DRB = %#0x\n",
+			mci->mc_idx, index, drbar);
 		row_high_limit = ((u32) drbar << 23);
 		/* find the DRAM Chip Select Base address and mask */
-		debugf1("MC%d: %s: %s() Row=%d, "
+		debugf1("MC%d: Row=%d, "
 			"Boundary Address=%#0x, Last = %#0x\n",
-			mci->mc_idx, __FILE__, __func__, index, row_high_limit,
+			mci->mc_idx, index, row_high_limit,
 			row_high_limit_last);
 
 		/* 440GX goes to 2GB, represented with a DRB of 0. */
@@ -241,7 +241,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	enum mem_type mtype;
 	enum edac_type edac_mode;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC:\n");
 
 	/* Something is really hosed if PCI config space reads from
 	 * the MC aren't working.
@@ -259,7 +259,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("MC: %s: %s(): mci = %p\n", __FILE__, __func__, mci);
+	debugf0("MC: mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_EDO | MEM_FLAG_SDR | MEM_FLAG_RDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -305,8 +305,8 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 		edac_mode = EDAC_SECDED;
 		break;
 	default:
-		debugf0("%s(): Unknown/reserved ECC state "
-			"in NBXCFG register!\n", __func__);
+		debugf0("Unknown/reserved ECC state "
+			"in NBXCFG register!\n");
 		edac_mode = EDAC_UNKNOWN;
 		break;
 	}
@@ -330,7 +330,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->ctl_page_to_phys = NULL;
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -345,7 +345,7 @@ static int i82443bxgx_edacmc_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("MC: %s: %s(): success\n", __FILE__, __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -361,7 +361,7 @@ static int __devinit i82443bxgx_edacmc_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s: %s()\n", __FILE__, __func__);
+	debugf0("MC:\n");
 
 	/* don't need to call pci_enable_device() */
 	rc = i82443bxgx_edacmc_probe1(pdev, ent->driver_data);
@@ -376,7 +376,7 @@ static void __devexit i82443bxgx_edacmc_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s: %s()\n", __FILE__, __func__);
+	debugf0("\n");
 
 	if (i82443bxgx_pci)
 		edac_pci_release_generic_ctl(i82443bxgx_pci);
diff --git a/drivers/edac/i82860_edac.c b/drivers/edac/i82860_edac.c
index 6ff59b0..ae5b2e1 100644
--- a/drivers/edac/i82860_edac.c
+++ b/drivers/edac/i82860_edac.c
@@ -136,7 +136,7 @@ static void i82860_check(struct mem_ctl_info *mci)
 {
 	struct i82860_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82860_get_error_info(mci, &info);
 	i82860_process_error_info(mci, &info, 1);
 }
@@ -167,7 +167,7 @@ static void i82860_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev)
 		pci_read_config_word(pdev, I82860_GBA + index * 2, &value);
 		cumul_size = (value & I82860_GBA_MASK) <<
 			(I82860_GBA_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		if (cumul_size == last_cumul_size)
@@ -210,7 +210,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -229,7 +229,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
@@ -245,7 +245,7 @@ static int i82860_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -260,7 +260,7 @@ static int __devinit i82860_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82860_printk(KERN_INFO, "i82860 init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -278,7 +278,7 @@ static void __devexit i82860_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82860_pci)
 		edac_pci_release_generic_ctl(i82860_pci);
@@ -311,7 +311,7 @@ static int __init i82860_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -352,7 +352,7 @@ fail0:
 
 static void __exit i82860_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82860_driver);
 
diff --git a/drivers/edac/i82875p_edac.c b/drivers/edac/i82875p_edac.c
index c943904..e24e703 100644
--- a/drivers/edac/i82875p_edac.c
+++ b/drivers/edac/i82875p_edac.c
@@ -263,7 +263,7 @@ static void i82875p_check(struct mem_ctl_info *mci)
 {
 	struct i82875p_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82875p_get_error_info(mci, &info);
 	i82875p_process_error_info(mci, &info, 1);
 }
@@ -371,7 +371,7 @@ static void i82875p_init_csrows(struct mem_ctl_info *mci,
 
 		value = readb(ovrfl_window + I82875P_DRB + index);
 		cumul_size = value << (I82875P_DRB_SHIFT - PAGE_SHIFT);
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 		if (cumul_size == last_cumul_size)
 			continue;	/* not populated */
@@ -405,7 +405,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 nr_chans;
 	struct i82875p_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	ovrfl_pdev = pci_get_device(PCI_VEND_DEV(INTEL, 82875_6), NULL);
 
@@ -426,7 +426,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail0;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -437,7 +437,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82875p_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82875p_pvt *)mci->pvt_info;
 	pvt->ovrfl_pdev = ovrfl_pdev;
 	pvt->ovrfl_window = ovrfl_window;
@@ -448,7 +448,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail1;
 	}
 
@@ -464,7 +464,7 @@ static int i82875p_probe1(struct pci_dev *pdev, int dev_idx)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail1:
@@ -485,7 +485,7 @@ static int __devinit i82875p_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	i82875p_printk(KERN_INFO, "i82875p init one\n");
 
 	if (pci_enable_device(pdev) < 0)
@@ -504,7 +504,7 @@ static void __devexit i82875p_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82875p_pvt *pvt = NULL;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (i82875p_pci)
 		edac_pci_release_generic_ctl(i82875p_pci);
@@ -550,7 +550,7 @@ static int __init i82875p_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -593,7 +593,7 @@ fail0:
 
 static void __exit i82875p_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	i82875p_remove_one(mci_pdev);
 	pci_dev_put(mci_pdev);
diff --git a/drivers/edac/i82975x_edac.c b/drivers/edac/i82975x_edac.c
index a4a6768..6a367ba 100644
--- a/drivers/edac/i82975x_edac.c
+++ b/drivers/edac/i82975x_edac.c
@@ -331,7 +331,7 @@ static void i82975x_check(struct mem_ctl_info *mci)
 {
 	struct i82975x_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	i82975x_get_error_info(mci, &info);
 	i82975x_process_error_info(mci, &info, 1);
 }
@@ -406,7 +406,7 @@ static void i82975x_init_csrows(struct mem_ctl_info *mci,
 		 */
 		if (csrow->nr_channels > 1)
 			cumul_size <<= 1;
-		debugf3("%s(): (%d) cumul_size 0x%x\n", __func__, index,
+		debugf3("(%d) cumul_size 0x%x\n", index,
 			cumul_size);
 
 		nr_pages = cumul_size - last_cumul_size;
@@ -489,11 +489,11 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	u8 c1drb[4];
 #endif
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	pci_read_config_dword(pdev, I82975X_MCHBAR, &mchbar);
 	if (!(mchbar & 1)) {
-		debugf3("%s(): failed, MCHBAR disabled!\n", __func__);
+		debugf3("failed, MCHBAR disabled!\n");
 		goto fail0;
 	}
 	mchbar &= 0xffffc000;	/* bits 31:14 used for 16K window */
@@ -558,7 +558,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 		goto fail1;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -569,7 +569,7 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 	mci->dev_name = pci_name(pdev);
 	mci->edac_check = i82975x_check;
 	mci->ctl_page_to_phys = NULL;
-	debugf3("%s(): init pvt\n", __func__);
+	debugf3("init pvt\n");
 	pvt = (struct i82975x_pvt *) mci->pvt_info;
 	pvt->mch_window = mch_window;
 	i82975x_init_csrows(mci, pdev, mch_window);
@@ -578,12 +578,12 @@ static int i82975x_probe1(struct pci_dev *pdev, int dev_idx)
 
 	/* finalize this instance of memory controller with edac core */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail2;
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail2:
@@ -601,7 +601,7 @@ static int __devinit i82975x_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -619,7 +619,7 @@ static void __devexit i82975x_remove_one(struct pci_dev *pdev)
 	struct mem_ctl_info *mci;
 	struct i82975x_pvt *pvt;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (mci  == NULL)
@@ -655,7 +655,7 @@ static int __init i82975x_init(void)
 {
 	int pci_rc;
 
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
        /* Ensure that the OPSTATE is set correctly for POLL or NMI */
        opstate_init();
@@ -697,7 +697,7 @@ fail0:
 
 static void __exit i82975x_exit(void)
 {
-	debugf3("%s()\n", __func__);
+	debugf3("\n");
 
 	pci_unregister_driver(&i82975x_driver);
 
diff --git a/drivers/edac/mpc85xx_edac.c b/drivers/edac/mpc85xx_edac.c
index 1640d54..17d000b 100644
--- a/drivers/edac/mpc85xx_edac.c
+++ b/drivers/edac/mpc85xx_edac.c
@@ -280,7 +280,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_DR, ~0);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -303,7 +303,7 @@ static int __devinit mpc85xx_pci_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_pci_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " PCI err registered\n");
 
 	return 0;
@@ -321,7 +321,7 @@ static int mpc85xx_pci_err_remove(struct platform_device *op)
 	struct edac_pci_ctl_info *pci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_pci_pdata *pdata = pci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	out_be32(pdata->pci_vbase + MPC85XX_PCI_ERR_CAP_DR,
 		 orig_pci_err_cap_dr);
@@ -582,7 +582,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -610,7 +610,7 @@ static int __devinit mpc85xx_l2_err_probe(struct platform_device *op)
 
 	devres_remove_group(&op->dev, mpc85xx_l2_err_probe);
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " L2 err registered\n");
 
 	return 0;
@@ -628,7 +628,7 @@ static int mpc85xx_l2_err_remove(struct platform_device *op)
 	struct edac_device_ctl_info *edac_dev = dev_get_drvdata(&op->dev);
 	struct mpc85xx_l2_pdata *pdata = edac_dev->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->l2_vbase + MPC85XX_L2_ERRINTEN, 0);
@@ -1038,7 +1038,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 		goto err;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_RDDR2 |
 	    MEM_FLAG_DDR | MEM_FLAG_DDR2;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
@@ -1064,13 +1064,13 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_DETECT, ~0);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
 	if (mpc85xx_create_sysfs_attributes(mci)) {
 		edac_mc_del_mc(mci->pdev);
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed mpc85xx_create_sysfs_attributes()\n");
 		goto err;
 	}
 
@@ -1104,7 +1104,7 @@ static int __devinit mpc85xx_mc_err_probe(struct platform_device *op)
 	}
 
 	devres_remove_group(&op->dev, mpc85xx_mc_err_probe);
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	printk(KERN_INFO EDAC_MOD_STR " MC err registered\n");
 
 	return 0;
@@ -1122,7 +1122,7 @@ static int mpc85xx_mc_err_remove(struct platform_device *op)
 	struct mem_ctl_info *mci = dev_get_drvdata(&op->dev);
 	struct mpc85xx_mc_pdata *pdata = mci->pvt_info;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (edac_op_state == EDAC_OPSTATE_INT) {
 		out_be32(pdata->mc_vbase + MPC85XX_MC_ERR_INT_EN, 0);
diff --git a/drivers/edac/mv64x60_edac.c b/drivers/edac/mv64x60_edac.c
index 59c399a..35db597 100644
--- a/drivers/edac/mv64x60_edac.c
+++ b/drivers/edac/mv64x60_edac.c
@@ -169,7 +169,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 		 MV64X60_PCIx_ERR_MASK_VAL);
 
 	if (edac_pci_add_device(pci, pdata->edac_idx) > 0) {
-		debugf3("%s(): failed edac_pci_add_device()\n", __func__);
+		debugf3("failed edac_pci_add_device()\n");
 		goto err;
 	}
 
@@ -194,7 +194,7 @@ static int __devinit mv64x60_pci_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_pci_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -210,7 +210,7 @@ static int mv64x60_pci_err_remove(struct platform_device *pdev)
 {
 	struct edac_pci_ctl_info *pci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_pci_del_device(&pdev->dev);
 
@@ -336,7 +336,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -363,7 +363,7 @@ static int __devinit mv64x60_sram_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_sram_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -379,7 +379,7 @@ static int mv64x60_sram_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -531,7 +531,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	pdata->edac_idx = edac_dev_idx++;
 
 	if (edac_device_add_device(edac_dev) > 0) {
-		debugf3("%s(): failed edac_device_add_device()\n", __func__);
+		debugf3("failed edac_device_add_device()\n");
 		goto err;
 	}
 
@@ -558,7 +558,7 @@ static int __devinit mv64x60_cpu_err_probe(struct platform_device *pdev)
 	devres_remove_group(&pdev->dev, mv64x60_cpu_err_probe);
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -574,7 +574,7 @@ static int mv64x60_cpu_err_remove(struct platform_device *pdev)
 {
 	struct edac_device_ctl_info *edac_dev = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_device_del_device(&pdev->dev);
 	edac_device_free_ctl_info(edac_dev);
@@ -766,7 +766,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 		goto err2;
 	}
 
-	debugf3("%s(): init mci\n", __func__);
+	debugf3("init mci\n");
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_SECDED;
 	mci->edac_cap = EDAC_FLAG_SECDED;
@@ -790,7 +790,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	out_le32(pdata->mc_vbase + MV64X60_SDRAM_ERR_ECC_CNTL, ctl);
 
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto err;
 	}
 
@@ -815,7 +815,7 @@ static int __devinit mv64x60_mc_err_probe(struct platform_device *pdev)
 	}
 
 	/* get this far and it's successful */
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 
 	return 0;
 
@@ -831,7 +831,7 @@ static int mv64x60_mc_err_remove(struct platform_device *pdev)
 {
 	struct mem_ctl_info *mci = platform_get_drvdata(pdev);
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	edac_mc_del_mc(&pdev->dev);
 	edac_mc_free(mci);
diff --git a/drivers/edac/r82600_edac.c b/drivers/edac/r82600_edac.c
index 7b7eaf2..36b870e 100644
--- a/drivers/edac/r82600_edac.c
+++ b/drivers/edac/r82600_edac.c
@@ -205,7 +205,7 @@ static void r82600_check(struct mem_ctl_info *mci)
 {
 	struct r82600_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	r82600_get_error_info(mci, &info);
 	r82600_process_error_info(mci, &info, 1);
 }
@@ -236,13 +236,13 @@ static void r82600_init_csrows(struct mem_ctl_info *mci, struct pci_dev *pdev,
 		/* find the DRAM Chip Select Base address and mask */
 		pci_read_config_byte(pdev, R82600_DRBA + index, &drbar);
 
-		debugf1("%s() Row=%d DRBA = %#0x\n", __func__, index, drbar);
+		debugf1("Row=%d DRBA = %#0x\n", index, drbar);
 
 		row_high_limit = ((u32) drbar << 24);
 /*		row_high_limit = ((u32)drbar << 24) | 0xffffffUL; */
 
-		debugf1("%s() Row=%d, Boundary Address=%#0x, Last = %#0x\n",
-			__func__, index, row_high_limit, row_high_limit_last);
+		debugf1("Row=%d, Boundary Address=%#0x, Last = %#0x\n",
+			index, row_high_limit, row_high_limit_last);
 
 		/* Empty row [p.57] */
 		if (row_high_limit == row_high_limit_last)
@@ -277,14 +277,13 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	u32 sdram_refresh_rate;
 	struct r82600_error_info discard;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 	pci_read_config_byte(pdev, R82600_DRAMC, &dramcr);
 	pci_read_config_dword(pdev, R82600_EAP, &eapr);
 	scrub_disabled = eapr & BIT(31);
 	sdram_refresh_rate = dramcr & (BIT(0) | BIT(1));
-	debugf2("%s(): sdram refresh rate = %#0x\n", __func__,
-		sdram_refresh_rate);
-	debugf2("%s(): DRAMC register = %#0x\n", __func__, dramcr);
+	debugf2("sdram refresh rate = %#0x\n", sdram_refresh_rate);
+	debugf2("DRAMC register = %#0x\n", dramcr);
 	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
 	layers[0].size = R82600_NR_CSROWS;
 	layers[0].is_virt_csrow = true;
@@ -295,7 +294,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	if (mci == NULL)
 		return -ENOMEM;
 
-	debugf0("%s(): mci = %p\n", __func__, mci);
+	debugf0("mci = %p\n", mci);
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_RDDR | MEM_FLAG_DDR;
 	mci->edac_ctl_cap = EDAC_FLAG_NONE | EDAC_FLAG_EC | EDAC_FLAG_SECDED;
@@ -311,8 +310,8 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 
 	if (ecc_enabled(dramcr)) {
 		if (scrub_disabled)
-			debugf3("%s(): mci = %p - Scrubbing disabled! EAP: "
-				"%#0x\n", __func__, mci, eapr);
+			debugf3("mci = %p - Scrubbing disabled! EAP: "
+				"%#0x\n", mci, eapr);
 	} else
 		mci->edac_cap = EDAC_FLAG_NONE;
 
@@ -329,15 +328,14 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 	 * type of memory controller.  The ID is therefore hardcoded to 0.
 	 */
 	if (edac_mc_add_mc(mci)) {
-		debugf3("%s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
 
 	if (disable_hardware_scrub) {
-		debugf3("%s(): Disabling Hardware Scrub (scrub on error)\n",
-			__func__);
+		debugf3("Disabling Hardware Scrub (scrub on error)\n");
 		pci_write_bits32(pdev, R82600_EAP, BIT(31), BIT(31));
 	}
 
@@ -352,7 +350,7 @@ static int r82600_probe1(struct pci_dev *pdev, int dev_idx)
 			__func__);
 	}
 
-	debugf3("%s(): success\n", __func__);
+	debugf3("success\n");
 	return 0;
 
 fail:
@@ -364,7 +362,7 @@ fail:
 static int __devinit r82600_init_one(struct pci_dev *pdev,
 				const struct pci_device_id *ent)
 {
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	/* don't need to call pci_enable_device() */
 	return r82600_probe1(pdev, ent->driver_data);
@@ -374,7 +372,7 @@ static void __devexit r82600_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	if (r82600_pci)
 		edac_pci_release_generic_ctl(r82600_pci);
diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index bb7e95f..d1afa1a 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -1064,7 +1064,7 @@ static void sbridge_put_devices(struct sbridge_dev *sbridge_dev)
 {
 	int i;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 	for (i = 0; i < sbridge_dev->n_devs; i++) {
 		struct pci_dev *pdev = sbridge_dev->pdev[i];
 		if (!pdev)
@@ -1597,8 +1597,7 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 	struct sbridge_pvt *pvt;
 
 	if (unlikely(!mci || !mci->pvt_info)) {
-		debugf0("MC: " __FILE__ ": %s(): dev = %p\n",
-			__func__, &sbridge_dev->pdev[0]->dev);
+		debugf0("MC: dev = %p\n", &sbridge_dev->pdev[0]->dev);
 
 		sbridge_printk(KERN_ERR, "Couldn't find mci handler\n");
 		return;
@@ -1606,8 +1605,8 @@ static void sbridge_unregister_mci(struct sbridge_dev *sbridge_dev)
 
 	pvt = mci->pvt_info;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &sbridge_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n",
+		mci, &sbridge_dev->pdev[0]->dev);
 
 	mce_unregister_decode_chain(&sbridge_mce_dec);
 
@@ -1645,8 +1644,8 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 	if (unlikely(!mci))
 		return -ENOMEM;
 
-	debugf0("MC: " __FILE__ ": %s(): mci = %p, dev = %p\n",
-		__func__, mci, &sbridge_dev->pdev[0]->dev);
+	debugf0("MC: mci = %p, dev = %p\n",
+		mci, &sbridge_dev->pdev[0]->dev);
 
 	pvt = mci->pvt_info;
 	memset(pvt, 0, sizeof(*pvt));
@@ -1681,8 +1680,7 @@ static int sbridge_register_mci(struct sbridge_dev *sbridge_dev)
 
 	/* add this new MC control structure to EDAC's list of MCs */
 	if (unlikely(edac_mc_add_mc(mci))) {
-		debugf0("MC: " __FILE__
-			": %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf0("MC: failed edac_mc_add_mc()\n");
 		rc = -EINVAL;
 		goto fail0;
 	}
@@ -1760,7 +1758,7 @@ static void __devexit sbridge_remove(struct pci_dev *pdev)
 {
 	struct sbridge_dev *sbridge_dev;
 
-	debugf0(__FILE__ ": %s()\n", __func__);
+	debugf0("\n");
 
 	/*
 	 * we have a trouble here: pdev value for removal will be wrong, since
@@ -1809,7 +1807,7 @@ static int __init sbridge_init(void)
 {
 	int pci_rc;
 
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -1831,7 +1829,7 @@ static int __init sbridge_init(void)
  */
 static void __exit sbridge_exit(void)
 {
-	debugf2("MC: " __FILE__ ": %s()\n", __func__);
+	debugf2("\n");
 	pci_unregister_driver(&sbridge_driver);
 }
 
diff --git a/drivers/edac/x38_edac.c b/drivers/edac/x38_edac.c
index 219530b..a3d8a40 100644
--- a/drivers/edac/x38_edac.c
+++ b/drivers/edac/x38_edac.c
@@ -243,7 +243,7 @@ static void x38_check(struct mem_ctl_info *mci)
 {
 	struct x38_error_info info;
 
-	debugf1("MC%d: %s()\n", mci->mc_idx, __func__);
+	debugf1("MC%d\n", mci->mc_idx);
 	x38_get_and_clear_error_info(mci, &info);
 	x38_process_error_info(mci, &info);
 }
@@ -331,7 +331,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	bool stacked;
 	void __iomem *window;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	window = x38_map_mchbar(pdev);
 	if (!window)
@@ -352,7 +352,7 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 	if (!mci)
 		return -ENOMEM;
 
-	debugf3("MC: %s(): init mci\n", __func__);
+	debugf3("MC: init mci\n");
 
 	mci->pdev = &pdev->dev;
 	mci->mtype_cap = MEM_FLAG_DDR2;
@@ -402,12 +402,12 @@ static int x38_probe1(struct pci_dev *pdev, int dev_idx)
 
 	rc = -ENODEV;
 	if (edac_mc_add_mc(mci)) {
-		debugf3("MC: %s(): failed edac_mc_add_mc()\n", __func__);
+		debugf3("MC: failed edac_mc_add_mc()\n");
 		goto fail;
 	}
 
 	/* get this far and it's successful */
-	debugf3("MC: %s(): success\n", __func__);
+	debugf3("MC: success\n");
 	return 0;
 
 fail:
@@ -423,7 +423,7 @@ static int __devinit x38_init_one(struct pci_dev *pdev,
 {
 	int rc;
 
-	debugf0("MC: %s()\n", __func__);
+	debugf0("MC:\n");
 
 	if (pci_enable_device(pdev) < 0)
 		return -EIO;
@@ -439,7 +439,7 @@ static void __devexit x38_remove_one(struct pci_dev *pdev)
 {
 	struct mem_ctl_info *mci;
 
-	debugf0("%s()\n", __func__);
+	debugf0("\n");
 
 	mci = edac_mc_del_mc(&pdev->dev);
 	if (!mci)
@@ -472,7 +472,7 @@ static int __init x38_init(void)
 {
 	int pci_rc;
 
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	/* Ensure that the OPSTATE is set correctly for POLL or NMI */
 	opstate_init();
@@ -513,7 +513,7 @@ fail0:
 
 static void __exit x38_exit(void)
 {
-	debugf3("MC: %s()\n", __func__);
+	debugf3("MC:\n");
 
 	pci_unregister_driver(&x38_driver);
 	if (!x38_registered) {


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 16:43                                             ` Joe Perches
@ 2012-04-29 17:39                                               ` Mauro Carvalho Chehab
  2012-04-30  7:47                                                 ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-29 17:39 UTC (permalink / raw)
  To: Joe Perches
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson

Em 29-04-2012 13:43, Joe Perches escreveu:
> On Sun, 2012-04-29 at 13:20 -0300, Mauro Carvalho Chehab wrote:
>> The script below is even better. After that, only 113 occurrences of __func__
>> is now found at drivers/edac, and some of them are not related to debugf[1-9],
>> so they shouldn't be cover on a patch like that.
>> I'll do some manual cleanup on it.
> 
> Hi Mauro.
> 
> Another thing you could do would be to
> separate the level from the multiple macros,
> use a single macro, and convert the uses.
> 
> #define debugf(level, fmt, ...)
> and change the uses to
> debugf([0-n], "some format", args...)
> 
> I believe that's the more predominate
> kernel style for debugging macros with
> a tested level or mask.

Agreed.
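
For reference, a minimal user-space sketch of that single-macro form is below.
It is only an illustration under stated assumptions: the names edac_dbg,
edac_dbg_emit, edac_debug_level and edac_dbg_count are hypothetical and are
not taken from Joe's patch, and fprintf() stands in for the kernel's logging.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical threshold: messages with a level above this are dropped. */
static int edac_debug_level = 2;

/* Counts emitted messages so the filtering is easy to observe. */
static int edac_dbg_count;

/* printf-style helper that adds the caller's function name once. */
static void edac_dbg_emit(const char *func, const char *fmt, ...)
{
	va_list ap;

	fprintf(stderr, "EDAC DEBUG: %s: ", func);
	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	edac_dbg_count++;
}

/*
 * One macro with the level as its first argument. __func__ is supplied
 * here, so callers no longer pass it in every format string.
 */
#define edac_dbg(level, ...)					\
	do {							\
		if ((level) <= edac_debug_level)		\
			edac_dbg_emit(__func__, __VA_ARGS__);	\
	} while (0)
```

A call such as edac_dbg(0, "failed edac_mc_add_mc()\n") would then replace
debugf0(...), and a level above the threshold is filtered at run time.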

> Perhaps also add !CONFIG_EDAC_DEBUG
> format/args checking to the debug statements.

Most, if not all, of the debug-only code already checks for CONFIG_EDAC_DEBUG.
A few static debug-only data structures and functions don't test for it,
but the compiler should remove the dead code anyway, so this shouldn't
cause any harm.
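
One common way to keep format/argument checking alive when debugging is
compiled out is to leave the call behind a constant-false test. The sketch
below assumes a hypothetical switch EDAC_DEBUG_SKETCH standing in for
CONFIG_EDAC_DEBUG and a hypothetical macro name edac_dbg_checked; it is
user-space C, not code from this series.

```c
#include <stdio.h>

#ifdef EDAC_DEBUG_SKETCH
/* Debugging enabled: just print. */
#define edac_dbg_checked(...)	fprintf(stderr, __VA_ARGS__)
#else
/*
 * Debugging disabled: the call stays in the source behind "if (0)", so
 * the compiler still parses it and can check the format string against
 * the arguments (for example with -Wformat), while the dead branch is
 * optimized away. The arguments are never evaluated at run time.
 */
#define edac_dbg_checked(...)				\
	do {						\
		if (0)					\
			fprintf(stderr, __VA_ARGS__);	\
	} while (0)
#endif

/* Helper with a visible side effect, to show arguments are not evaluated. */
static int edac_sketch_calls;

static int edac_sketch_next(void)
{
	return ++edac_sketch_calls;
}
```

With the switch undefined, edac_dbg_checked("value %d\n", edac_sketch_next())
prints nothing and leaves edac_sketch_calls untouched, but a mismatched
format would still be diagnosed at build time.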

> Lastly, indenting the messages 2 tabs isn't
> really useful, one or two spaces is probably
> enough.

Agreed.

> 
> I did this a bit ago so it may not apply
> after your changes:

Believe it or not, it applied without trouble ;)

I've added it at the end of my experimental series, at:


git://git.infradead.org/users/mchehab/edac.git experimental

Be careful if you use this branch, as I rebase it every time I need
to change something in this series.

I'm keeping a non-rebased version, with one branch per review, at:

git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac.git

The current review is at hw_events_v17. The patches were already pushed there;
they should show up after the usual kernel.org master/mirror replication
delay.

Regards,
Mauro


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 17:39                                               ` Mauro Carvalho Chehab
@ 2012-04-30  7:47                                                 ` Borislav Petkov
  2012-04-30 11:09                                                   ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30  7:47 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Joe Perches, Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Aristeu Rozanski, Doug Thompson

On Sun, Apr 29, 2012 at 02:39:04PM -0300, Mauro Carvalho Chehab wrote:
> Em 29-04-2012 13:43, Joe Perches escreveu:
> > On Sun, 2012-04-29 at 13:20 -0300, Mauro Carvalho Chehab wrote:
> >> The script below is even better. After that, only 113 occurrences of __func__
> >> is now found at drivers/edac, and some of them are not related to debugf[1-9],
> >> so they shouldn't be cover on a patch like that.
> >> I'll do some manual cleanup on it.
> > 
> > Hi Mauro.
> > 
> > Another thing you could do would be to
> > separate the level from the multiple macros,
> > use a single macro, and convert the uses.
> > 
> > #define debugf(level, fmt, ...)
> > and change the uses to
> > debugf([0-n], "some format", args...)
> > 
> > I believe that's the more predominate
> > kernel style for debugging macros with
> > a tested level or mask.
> 
> Agreed.
> 
> > Perhaps also add !CONFIG_EDAC_DEBUG
> > format/args checking to the debug statements.
> 
> Most/all debug-only stuff are already checking for CONFIG_EDAC_DEBUG.
> There are a few static debug-only data/functions that aren't testing for
> it, but the compiler should remove the dead code anyway, so this shouldn't
> cause any harm.
> 
> > Lastly, indenting the messages 2 tabs isn't
> > really useful, one or two spaces is probably
> > enough.
> 
> agreed.
> 
> > 
> > I did this a bit ago so it may not apply
> > after your changes:
> 
> Believe or not, it applied without troubles ;)
> 
> I've added at the end of my experimental series, at:
> 
> 
> git://git.infradead.org/users/mchehab/edac.git experimental
> 
> be careful if you use this branch, as I'm rebasing it every time I need
> to change something on this series.
> 
> I'm keeping a non-rebased version, with one branch per review, at:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac.git
> 
> The current review is at hw_events_v17. Patches were already pushed there.
> they should be there after the usual kernel.org master/mirror replication
> delay.

Now wait a minute,

you guys are so trigger-happy about applying humongous cleanup patches, but
let me ask this: can any of you really test those changes with each
driver? Do you have all the hardware that those patches touch?

I know, I know, it builds fine and it looks correct, but subtle bugs tend
to sneak in in exactly such situations.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 14:16                                       ` Mauro Carvalho Chehab
@ 2012-04-30  7:59                                         ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30  7:59 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Sun, Apr 29, 2012 at 11:16:53AM -0300, Mauro Carvalho Chehab wrote:
> > Hey, are you looking at compiled code or at source code? Because I'm
> > looking at source code, and it is a pretty safe bet the majority of the
> > people here do that too.
> 
> What I said is that, from source code POV, a code where the loop variables are
> initialized just before the loop is easier to read it when the initialization
> of those vars are on another part of the code.
> 
> That's basically why the "for" syntax starts with a var initialization clause.
> 
> The tot_dimms & friends are loop vars: their value is calculated within the loop.
> 
> At the object code, this won't bring any difference.
> 
> > 
> >> it, either by using registers for those vars or by moving the initialization
> >> to the top of the function.
> >>
> >> This function is too complex, so it is better to initialize those vars
> >> just before the loops that are calculating those totals.
> > 
> > Simply initialize those variables at declaration time and that's it.
> > Initializing them before the loop doesn't make the function less complex
> > - splitting it and sanitizing it does.
> 
> Initializing loop-calculated vars just before the loop makes the code easier
> to read, and may avoid issues that might happen during code lifecycle.

This is getting ridiculous: the variable declaration and initialization
are on the same screen as the loop (unless one uses a screen which can
only show fewer than 40ish lines).

So the argument about making the code easier to read is bogus.

This function is already very large and cluttered with a lot of crap, so
stashing the initialization away at declaration time, instead of adding
more lines, is better for readability.

Besides, every modern editor can jump to the declaration of a local
variable so that the user can see what it is initialized to.

> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> +                                    unsigned n_layers,
> +                                    struct edac_mc_layer *layers,
> +                                    bool rev_order,
> +                                    unsigned sz_pvt)
>  {
>       void *ptr = NULL;
>       struct mem_ctl_info *mci;
> -     struct csrow_info *csi, *csrow;
> +     struct edac_mc_layer *layer;
> +     struct csrow_info *csi, *csr;
>       struct rank_info *chi, *chp, *chan;
>       struct dimm_info *dimm;
> +     u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>       void *pvt;
> -     unsigned size;
> -     int row, chn;
> +     unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> +     unsigned tot_csrows, tot_channels, tot_errcount = 0;
> +     int i, j;
>       int err;
> +     int row, chn;
> +     bool per_rank = false;
> +
> +     BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
> +     /*
> +      * Calculate the total amount of dimms and csrows/cschannels while
> +      * in the old API emulation mode
> +      */
> +     tot_dimms = 1;
> +     tot_channels = 1;
> +     tot_csrows = 1;
> +     for (i = 0; i < n_layers; i++) {
> +             tot_dimms *= layers[i].size;
> +             if (layers[i].is_virt_csrow)
> +                     tot_csrows *= layers[i].size;
> +             else
> +                     tot_channels *= layers[i].size;
> +
> +             if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
> +                     per_rank = true;
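
For reference, the totals computation in the quoted hunk can be sketched as a
stand-alone function. This is a simplified illustration, not the patch itself:
the struct and function names with a _sketch suffix are hypothetical, the value
3 for EDAC_MAX_LAYERS is an assumption based on the pos[0..2] usage in the
series, and assert() stands in for BUG_ON(). Here the products are initialized
where they are declared; setting them right before the loop gives the same
result.

```c
#include <assert.h>
#include <stdbool.h>

#define EDAC_MAX_LAYERS 3	/* assumed value for this sketch */

/* Simplified stand-in for the layer description used by the patch. */
struct edac_mc_layer_sketch {
	unsigned size;		/* number of elements in this layer */
	bool is_virt_csrow;	/* layer counts toward the virtual csrows */
};

struct edac_totals_sketch {
	unsigned tot_dimms;
	unsigned tot_csrows;
	unsigned tot_channels;
};

static struct edac_totals_sketch
edac_layer_totals(const struct edac_mc_layer_sketch *layers, unsigned n_layers)
{
	/* All three are running products, so each starts at 1. */
	struct edac_totals_sketch t = { 1, 1, 1 };
	unsigned i;

	assert(n_layers > 0 && n_layers <= EDAC_MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		t.tot_dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.tot_csrows *= layers[i].size;
		else
			t.tot_channels *= layers[i].size;
	}
	return t;
}
```

For a controller described as 8 chip selects (virtual csrow layer) times 2
channels, this yields 16 dimms/ranks, 8 csrows and 2 channels.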


-- 
Regards/Gruss,
Boris.


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-29 13:49                                       ` Mauro Carvalho Chehab
@ 2012-04-30  8:15                                         ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30  8:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Sun, Apr 29, 2012 at 10:49:44AM -0300, Mauro Carvalho Chehab wrote:
> > [   10.486440] EDAC MC: DCT0 chip selects:
> > [   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> > [   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> > [   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
> > [   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
> > [   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
> > [   10.486455] EDAC MC: DCT1 chip selects:
> > [   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
> > [   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
> > [   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
> > [   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
> > [   10.486467] EDAC amd64: using x8 syndromes.
> > [   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
> > [   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
> > [   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
> > [   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
> > [   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
> > [   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
> > [   10.486488] EDAC amd64: MCT channel count: 2
> > [   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
> > [   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
> > [   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
> > [   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
> > [   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
> > [   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
> > [   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
> > [   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
> > [   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
> > [   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
> > [   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
> > [   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
> > [   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
> > [   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
> > [   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
> > [   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
> > [   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1
> > 
> > DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
> > 
> > Now your change is showing 16 ranks. Still b0rked.
> > 
> No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.
> 
> As I said before when you've pointed this bug (likel at v3 review), edac_mc_alloc
> doesn't know how many ranks are filled, as the driver logic first calls it to 
> allocate for the max amount of ranks, and then fills the rank with their info 
> (or let them untouched with 0 pages, if they're empty).

Basically you're saying you're generating dimm_info structs for all
_possible_ dimms and the loop where this debug message comes from goes
and merrily initializes them all although some of them are empty:

+       for (i = 0; i < tot_dimms; i++) {
+               chan = &csi[row].channels[chn];
+               dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
+                              pos[0], pos[1], pos[2]);
+               dimm->mci = mci;
+
+               debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+                       i, (dimm - mci->dimms),
+                       pos[0], pos[1], pos[2], row, chn);
+
+               /* Copy DIMM location */
+               for (j = 0; j < n_layers; j++)
+                       dimm->location[j] = pos[j];
...

Definitely superfluous.

Oh well, looking at edac_mc_alloc, it used to allocate structs for all
csrows on the controller even though some of them were empty...
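
The ordering in the log above (entry 1 at 0:1, entry 2 at 1:0, and so on) is
consistent with an odometer-style walk over the layers, with the last layer
advancing fastest. A minimal sketch of such a step follows. It is inferred
from the log output only, and edac_sketch_next_pos is a hypothetical helper,
not a function from the patch.

```c
#include <assert.h>

/*
 * Advance pos[] to the next location: the last layer moves fastest and
 * each layer wraps at its size, carrying into the layer before it.
 */
static void edac_sketch_next_pos(unsigned pos[], const unsigned size[],
				 unsigned n_layers)
{
	unsigned j = n_layers;

	assert(n_layers > 0);

	while (j-- > 0) {
		if (++pos[j] < size[j])
			return;
		pos[j] = 0;	/* wrapped: carry into the next-slower layer */
	}
}
```

Starting from (0:0) with layer sizes 8 and 2, successive calls give (0:1),
(1:0), (1:1), ... and the sixteenth call wraps back to (0:0).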

Ok, then please remove this debug call because it is misleading. Having

[   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)

is enough.

You probably want to say how many channels/csrows there are, though:

[   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 8 csrows, 2 channels)

or something similar. Simply dump tot_dimms, tot_channels and tot_csrows
and that's it.
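
A single summary line of that shape could be produced along these lines. This
is a user-space sketch under stated assumptions: edac_alloc_summary is a
hypothetical helper, snprintf() stands in for the debugf call, and choosing
"ranks" versus "dimms" from a per_rank flag is an assumption about how the
wording would be selected.

```c
#include <stdbool.h>
#include <stdio.h>

/*
 * Format one summary line from the three totals instead of printing a
 * line per possible rank. Returns what snprintf() returns.
 */
static int edac_alloc_summary(char *buf, size_t len, unsigned size,
			      unsigned tot_dimms, unsigned tot_csrows,
			      unsigned tot_channels, bool per_rank)
{
	return snprintf(buf, len,
			"allocating %u bytes for mci data (%u %s, %u csrows, %u channels)",
			size, tot_dimms, per_rank ? "ranks" : "dimms",
			tot_csrows, tot_channels);
}
```

Called with 3692, 16, 8, 2 and per_rank set, it fills the buffer with the
text of the second example line above (after the function-name prefix).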

-- 
Regards/Gruss,
Boris.


* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30  8:15                                         ` Borislav Petkov
@ 2012-04-30 10:58                                           ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 10:58 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 30-04-2012 05:15, Borislav Petkov escreveu:
> On Sun, Apr 29, 2012 at 10:49:44AM -0300, Mauro Carvalho Chehab wrote:
>>> [   10.486440] EDAC MC: DCT0 chip selects:
>>> [   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
>>> [   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
>>> [   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
>>> [   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
>>> [   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
>>> [   10.486455] EDAC MC: DCT1 chip selects:
>>> [   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
>>> [   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
>>> [   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
>>> [   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
>>> [   10.486467] EDAC amd64: using x8 syndromes.
>>> [   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
>>> [   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
>>> [   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
>>> [   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
>>> [   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
>>> [   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
>>> [   10.486488] EDAC amd64: MCT channel count: 2
>>> [   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
>>> [   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
>>> [   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
>>> [   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
>>> [   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
>>> [   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
>>> [   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
>>> [   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
>>> [   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
>>> [   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
>>> [   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
>>> [   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
>>> [   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
>>> [   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
>>> [   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
>>> [   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
>>> [   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1
>>>
>>> DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
>>>
>>> Now your change is showing 16 ranks. Still b0rked.
>>>
>> No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.
>>
>> As I said before when you pointed out this bug (likely at the v3 review), edac_mc_alloc
>> doesn't know how many ranks are filled, as the driver logic first calls it to
>> allocate for the maximum number of ranks, and then fills the ranks with their info
>> (or leaves them untouched, with 0 pages, if they're empty).
> 
> Basically you're saying you're generating dimm_info structs for all
> _possible_ dimms and the loop where this debug message comes from goes
> and merrily initializes them all although some of them are empty:
> 
> +       for (i = 0; i < tot_dimms; i++) {
> +               chan = &csi[row].channels[chn];
> +               dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
> +                              pos[0], pos[1], pos[2]);
> +               dimm->mci = mci;
> +
> +               debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
> +                       i, (dimm - mci->dimms),
> +                       pos[0], pos[1], pos[2], row, chn);
> +
> +               /* Copy DIMM location */
> +               for (j = 0; j < n_layers; j++)
> +                       dimm->location[j] = pos[j];
> ...
> 
> definitely superfluous.
> 
> Oh well, looking at edac_mc_alloc, it used to allocate structs for all
> csrows on the controller even though some of them were empty...
> 
> Ok, then please remove this debug call because it is misleading. Having
> 
> [   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
> 
> is enough.
> 
> You probably want to say how many channels/csrows there are, though:
> 
> [   10.486493] EDAC DEBUG: new_edac_mc_alloc: allocating 3692 bytes for mci data (16 ranks, 8 csrows, 2 channels)
> 
> or something similar. Simply dump tot_dimms, tot_channels and tot_csrows
> and that's it.
> 

It seems you have a very short memory. We had a similar discussion about that a while ago:
	https://lkml.org/lkml/2012/3/8/440

See my comments at:
	https://lkml.org/lkml/2012/3/9/101
	https://lkml.org/lkml/2012/3/9/267

As explained there, those debug messages provide a map between the legacy per-csrow
data used by the old API and the dimm_info representation. For a per-csrow memory controller,
the map is trivial, as the memory location matches the csrow/channel information, but
for modern memory controllers the map is not trivial, and it helps to check what is
expected to be found when retrieving information via the legacy EDAC API.

For example, this is the mapping used by the second memory controller of the SB machine
I'm using on my tests:

[52803.640043] EDAC DEBUG: sbridge_probe: Registering MC#1 (2 of 2)
...
[52803.640062] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc(): allocating 7196 bytes for mci data (12 dimms, 12 csrows/channels)
[52803.640070] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: initializing 12 dimms
[52803.640072] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
[52803.640074] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
[52803.640077] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 2: dimm2 (0:2:0): row 0, chan 2
[52803.640080] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 3: dimm3 (1:0:0): row 0, chan 3
[52803.640083] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 4: dimm4 (1:1:0): row 1, chan 0
[52803.640086] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 5: dimm5 (1:2:0): row 1, chan 1
[52803.640089] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 6: dimm6 (2:0:0): row 1, chan 2
[52803.640092] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 7: dimm7 (2:1:0): row 1, chan 3
[52803.640095] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 8: dimm8 (2:2:0): row 2, chan 0
[52803.640098] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 9: dimm9 (3:0:0): row 2, chan 1
[52803.640101] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 10: dimm10 (3:1:0): row 2, chan 2
[52803.640104] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 11: dimm11 (3:2:0): row 2, chan 3

With the above info, it is clear that the DIMM located at mc#1, channel#3 slot#2 is
called "dimm11" at the new API, and corresponds to "csrow 2, channel 3" for a legacy
EDAC API call.
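
The table above can be reproduced with a small user-space sketch. This is not
the edac_mc_alloc() code. It only assumes, from the log, that positions are
walked with the last layer changing fastest and that the legacy channel counter
wraps before the csrow counter advances. The names struct dimm_map and
build_dimm_map() are made up for the illustration.

```c
#include <assert.h>

#define MAX_LAYERS 3

/* One line of the debug table: new-API index, location, legacy row/chan. */
struct dimm_map {
	unsigned idx;
	unsigned pos[MAX_LAYERS];
	unsigned row, chan;
};

/*
 * Fill out[] for every possible DIMM position.
 * sizes[] holds the size of each layer; tot_channels is the number of
 * legacy (virtual) channels. Returns the number of entries written.
 */
static unsigned build_dimm_map(const unsigned *sizes, unsigned n_layers,
			       unsigned tot_channels,
			       struct dimm_map *out, unsigned max_out)
{
	unsigned pos[MAX_LAYERS] = { 0 };
	unsigned tot_dimms = 1, row = 0, chan = 0, i;
	int j;

	assert(n_layers > 0 && n_layers <= MAX_LAYERS && tot_channels > 0);
	for (i = 0; i < n_layers; i++)
		tot_dimms *= sizes[i];
	assert(tot_dimms <= max_out);

	for (i = 0; i < tot_dimms; i++) {
		out[i].idx = i;
		for (j = 0; j < MAX_LAYERS; j++)
			out[i].pos[j] = pos[j];
		out[i].row = row;
		out[i].chan = chan;

		/* Legacy view: the channel wraps first, then the csrow advances. */
		if (++chan == tot_channels) {
			chan = 0;
			row++;
		}
		/* Layered view: odometer increment, last layer fastest. */
		for (j = (int)n_layers - 1; j >= 0; j--) {
			if (++pos[j] < sizes[j])
				break;
			pos[j] = 0;
		}
	}
	return tot_dimms;
}
```

With layer sizes {4, 3} and 4 legacy channels, entry 11 comes out at location
3:2:0 with row 2, chan 3, which is the last line of the log.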

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30  7:47                                                 ` Borislav Petkov
@ 2012-04-30 11:09                                                   ` Mauro Carvalho Chehab
  2012-04-30 11:15                                                     ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:09 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

On 30-04-2012 04:47, Borislav Petkov wrote:
> On Sun, Apr 29, 2012 at 02:39:04PM -0300, Mauro Carvalho Chehab wrote:
>> On 29-04-2012 13:43, Joe Perches wrote:
>>> On Sun, 2012-04-29 at 13:20 -0300, Mauro Carvalho Chehab wrote:
>>>> The script below is even better. After that, only 113 occurrences of __func__
>>>> are found in drivers/edac, and some of them are not related to debugf[1-9],
>>>> so they shouldn't be covered by a patch like that.
>>>> I'll do some manual cleanup on it.
>>>
>>> Hi Mauro.
>>>
>>> Another thing you could do would be to
>>> separate the level from the multiple macros,
>>> use a single macro, and convert the uses.
>>>
>>> #define debugf(level, fmt, ...)
>>> and change the uses to
>>> debugf([0-n], "some format", args...)
>>>
>>> I believe that's the more predominant
>>> kernel style for debugging macros with
>>> a tested level or mask.
>>
>> Agreed.
>>
>>> Perhaps also add !CONFIG_EDAC_DEBUG
>>> format/args checking to the debug statements.
>>
>> Most/all debug-only stuff is already checking for CONFIG_EDAC_DEBUG.
>> There are a few static debug-only data/functions that aren't testing for
>> it, but the compiler should remove the dead code anyway, so this shouldn't
>> cause any harm.
>>
>>> Lastly, indenting the messages 2 tabs isn't
>>> really useful, one or two spaces is probably
>>> enough.
>>
>> agreed.
>>
>>>
>>> I did this a bit ago so it may not apply
>>> after your changes:
>>
>> Believe it or not, it applied without troubles ;)
>>
>> I've added it at the end of my experimental series, at:
>>
>>
>> git://git.infradead.org/users/mchehab/edac.git experimental
>>
>> be careful if you use this branch, as I'm rebasing it every time I need
>> to change something on this series.
>>
>> I'm keeping a non-rebased version, with one branch per review, at:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-edac.git
>>
>> The current review is at hw_events_v17. Patches were already pushed there.
>> They should be there after the usual kernel.org master/mirror replication
>> delay.
> 
> Now wait a minute,
> 
> you guys are so trigger-happy to apply humongous, cleanup patches but
> let me ask this: can anyone of you really test those changes with each
> driver? Do you have all the hardware that those patches touch?

Well, then why did you touch that in the first place, without even looking
at what would be affected in the EDAC core and in the drivers? Didn't you test
it on all the hardware that your patch affected?

Now that your patch got applied, reverting it is not a solution, as newer
stuff now assumes that __func__ will be in the debug messages. So, reverting
it will cause regressions.

> I know, I know, it builds fine and it looks correct but subtle bugs tend
> to sneak in in exactly such situations.

This patch touches only debug code that isn't even enabled on production
kernels. Even if a sneaky bug were introduced, it wouldn't cause much
harm, and any developer inspecting those debug messages would be able to discover
and fix what happened there.
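
For reference, a single level-taking debug macro of the kind discussed in this
thread could look roughly like the sketch below. It is plain user-space C, not
the EDAC header: edac_dbg(), edac_dbg_emit(), EDAC_DEBUG_ENABLED (standing in
for CONFIG_EDAC_DEBUG), edac_debug_level and the message buffer are all
hypothetical names, and formatting into a buffer stands in for printk so the
result can be inspected.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for CONFIG_EDAC_DEBUG; build with it set to 0 to silence output. */
#ifndef EDAC_DEBUG_ENABLED
#define EDAC_DEBUG_ENABLED 1
#endif

/* Ask gcc/clang to check format strings against their arguments. */
#ifdef __GNUC__
#define PRINTF_LIKE(f, a) __attribute__((format(printf, f, a)))
#else
#define PRINTF_LIKE(f, a)
#endif

static int edac_debug_level = 2;	/* hypothetical runtime threshold */
static char edac_dbg_buf[256];		/* stands in for the kernel log */

static void edac_dbg_emit(int level, const char *func, const char *fmt, ...)
	PRINTF_LIKE(3, 4);

static void edac_dbg_emit(int level, const char *func, const char *fmt, ...)
{
	va_list ap;
	int n;

	/* The call is still compiled and type-checked when debug is off. */
	if (!EDAC_DEBUG_ENABLED || level > edac_debug_level)
		return;

	n = snprintf(edac_dbg_buf, sizeof(edac_dbg_buf),
		     "EDAC DEBUG: %s: ", func);
	assert(n >= 0 && (size_t)n < sizeof(edac_dbg_buf));
	va_start(ap, fmt);
	vsnprintf(edac_dbg_buf + n, sizeof(edac_dbg_buf) - (size_t)n, fmt, ap);
	va_end(ap);
}

/* One macro for every level; __func__ is added here, never by the caller. */
#define edac_dbg(level, ...) edac_dbg_emit((level), __func__, __VA_ARGS__)

/* Helper for inspecting the last message that was emitted. */
static int edac_dbg_last_is(const char *expected)
{
	return strcmp(edac_dbg_buf, expected) == 0;
}

static void demo_probe(unsigned nr_dimms)
{
	edac_dbg(1, "initializing %u dimms\n", nr_dimms);
	/* Level 4 is above the threshold of 2, so this one is dropped. */
	edac_dbg(4, "too verbose at level %d\n", edac_debug_level);
}
```

Because the function name is prepended in one place, callers cannot end up
printing it twice, and the level becomes an ordinary argument instead of being
encoded in the macro name.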

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 10:58                                           ` Mauro Carvalho Chehab
@ 2012-04-30 11:11                                             ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 11:11 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:
> It seems you have a very short memory.

Oh, puh-lease, let's not start with the insults now. You're not a
saint yourself. And maybe the fact that I'm having a hard time grasping
your code is because it is a load of crap and you seem to generate
a lot of senseless drivel when explaining what it does. And don't get me
started on the patch bombs.

So, let's stay constructive here before I, as the last and only
person reviewing this stinking pile, stop messing with it (I've got other
stuff to do, you know) and NACK it completely.

> For example, this is the mapping used by the second memory controller of the SB machine
> I'm using on my tests:
> 
> [52803.640043] EDAC DEBUG: sbridge_probe: Registering MC#1 (2 of 2)
> ...
> [52803.640062] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc(): allocating 7196 bytes for mci data (12 dimms, 12 csrows/channels)
> [52803.640070] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: initializing 12 dimms
> [52803.640072] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> [52803.640074] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> [52803.640077] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 2: dimm2 (0:2:0): row 0, chan 2
> [52803.640080] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 3: dimm3 (1:0:0): row 0, chan 3
> [52803.640083] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 4: dimm4 (1:1:0): row 1, chan 0
> [52803.640086] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 5: dimm5 (1:2:0): row 1, chan 1
> [52803.640089] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 6: dimm6 (2:0:0): row 1, chan 2
> [52803.640092] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 7: dimm7 (2:1:0): row 1, chan 3
> [52803.640095] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 8: dimm8 (2:2:0): row 2, chan 0
> [52803.640098] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 9: dimm9 (3:0:0): row 2, chan 1
> [52803.640101] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 10: dimm10 (3:1:0): row 2, chan 2
> [52803.640104] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 11: dimm11 (3:2:0): row 2, chan 3
> 
> With the above info, it is clear that the DIMM located at mc#1, channel#3 slot#2 is
> called "dimm11" at the new API, and corresponds to "csrow 2, channel 3" for a legacy
> EDAC API call.

Are all those DIMM slots above populated? What happens if they're not,
are you issuing the same dimm0-dimm11 lines for slots which aren't even
populated?

I have a much better idea: Generally, this debug info should come from
the specific driver that allocates the dimm descriptors, not from the
EDAC core. This way, you know in the driver which slots are populated
and those which are not should be omitted.

As it stands, it says "initializing 12 dimms" and the user thinks there are
12 DIMMs on his system, which might not be true.
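
A rough sketch of that idea, in user-space C rather than driver code: the
driver walks its own slot table and prints only the slots that actually hold
memory. struct slot_info, the use of nr_pages == 0 as the "empty" test and
dump_populated() are assumptions made for the illustration; a real driver
would use its own descriptors and debug calls.

```c
#include <assert.h>
#include <stdio.h>

struct slot_info {
	unsigned channel, slot;
	unsigned long nr_pages;	/* 0 means nothing is plugged in */
};

/*
 * Print one line per populated slot and return how many were found.
 * Empty slots are skipped, so the count in the log matches the hardware.
 */
static unsigned dump_populated(const struct slot_info *slots, unsigned n,
			       FILE *out)
{
	unsigned i, populated = 0;

	assert(out != NULL);
	assert(slots != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (!slots[i].nr_pages)
			continue;
		fprintf(out, "dimm%u (channel %u, slot %u): %lu pages\n",
			i, slots[i].channel, slots[i].slot, slots[i].nr_pages);
		populated++;
	}
	fprintf(out, "%u of %u dimm slots populated\n", populated, n);
	return populated;
}
```

The dimm index printed is still the position in the full table, so the
numbering stays stable whether or not neighbouring slots are filled.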

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 11:09                                                   ` Mauro Carvalho Chehab
@ 2012-04-30 11:15                                                     ` Borislav Petkov
  2012-04-30 11:46                                                       ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 11:15 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

On Mon, Apr 30, 2012 at 08:09:20AM -0300, Mauro Carvalho Chehab wrote:
> > you guys are so trigger-happy to apply humongous, cleanup patches but
> > let me ask this: can anyone of you really test those changes with each
> > driver? Do you have all the hardware that those patches touch?
> 
> Well, then why did you touch that in the first place, without even looking
> at what would be affected in the EDAC core and in the drivers? Didn't you test
> it on all the hardware that your patch affected?

Because I simply missed that fact, which is my bad, sorry.

> This patch touches only debug code that isn't even enabled on
> production kernels. Even if a sneaky bug were introduced, it
> wouldn't cause much harm, and any developer inspecting those debug
> messages would be able to discover and fix what happened there.

Ok, fine, I will still review the amd64_edac side of the changes.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30  7:59                                         ` Borislav Petkov
@ 2012-04-30 11:23                                           ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 30-04-2012 04:59, Borislav Petkov wrote:
> On Sun, Apr 29, 2012 at 11:16:53AM -0300, Mauro Carvalho Chehab wrote:
>>> Hey, are you looking at compiled code or at source code? Because I'm
>>> looking at source code, and it is a pretty safe bet the majority of the
>>> people here do that too.
>>
>> What I said is that, from a source code POV, code where the loop variables are
>> initialized just before the loop is easier to read than code where the initialization
>> of those vars is in another part of the code.
>>
>> That's basically why the "for" syntax starts with a var initialization clause.
>>
>> The tot_dimms & friends are loop vars: their value is calculated within the loop.
>>
>> In the object code, this won't make any difference.
>>
>>>
>>>> it, either by using registers for those vars or by moving the initialization
>>>> to the top of the function.
>>>>
>>>> This function is too complex, so it is better to initialize those vars
>>>> just before the loops that are calculating those totals.
>>>
>>> Simply initialize those variables at declaration time and that's it.
>>> Initializing them before the loop doesn't make the function less complex
>>> - splitting it and sanitizing it does.
>>
>> Initializing loop-calculated vars just before the loop makes the code easier
>> to read, and may avoid issues that might happen during code lifecycle.
> 
> This is getting ridiculous:

With this I fully agree: you're nacking patches because they are not written the way you
write your code, not because the code there is doing anything wrong.

If you point out anything wrong in the way I wrote it, then I'll fix it. Otherwise, why
should I make a change that will obfuscate the code?

> the variable declaration and initialization
> are on the same screen as the loop (unless one uses a screen which can
> only show less than 40ish lines).
> 
> So the argument about making the code easier to read is bogus.
> 
> This function is already cluttered with a lot of crap, and is very large,
> so stashing the initialization away at declaration time, instead of
> adding more lines, is better for readability.
> 
> Besides, every modern editor can jump to the declaration of a local
> variable so that the user can see what it is initialized to.

The editor used by the developer is not relevant. This is not a reason
to obfuscate the code.

>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +                                    unsigned n_layers,
>> +                                    struct edac_mc_layer *layers,
>> +                                    bool rev_order,
>> +                                    unsigned sz_pvt)
>>  {
>>       void *ptr = NULL;
>>       struct mem_ctl_info *mci;
>> -     struct csrow_info *csi, *csrow;
>> +     struct edac_mc_layer *layer;
>> +     struct csrow_info *csi, *csr;
>>       struct rank_info *chi, *chp, *chan;
>>       struct dimm_info *dimm;
>> +     u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>       void *pvt;
>> -     unsigned size;
>> -     int row, chn;
>> +     unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +     unsigned tot_csrows, tot_channels, tot_errcount = 0;
>> +     int i, j;
>>       int err;
>> +     int row, chn;
>> +     bool per_rank = false;
>> +
>> +     BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
>> +     /*
>> +      * Calculate the total amount of dimms and csrows/cschannels while
>> +      * in the old API emulation mode
>> +      */
>> +     tot_dimms = 1;
>> +     tot_channels = 1;
>> +     tot_csrows = 1;
>> +     for (i = 0; i < n_layers; i++) {
>> +             tot_dimms *= layers[i].size;
>> +             if (layers[i].is_virt_csrow)
>> +                     tot_csrows *= layers[i].size;
>> +             else
>> +                     tot_channels *= layers[i].size;
>> +
>> +             if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
>> +                     per_rank = true;
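
For reference, the totals computed by the loop quoted above can be checked in
isolation with a stand-alone sketch. struct layer_desc, struct totals and
count_totals() are illustrative names, not the EDAC structures. The layer
sizes in the note after the code (8 csrows by 2 channels for amd64, 4 channels
by 3 slots for the SB machine) are read off the logs posted in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct layer_desc {
	unsigned size;		/* number of entries in this layer */
	bool is_virt_csrow;	/* counts as a csrow in the legacy view */
};

struct totals {
	unsigned dimms, csrows, channels;
};

static struct totals count_totals(const struct layer_desc *layers,
				  unsigned n_layers)
{
	/* The products start at 1, right before the loop that builds them. */
	struct totals t = { 1, 1, 1 };
	unsigned i;

	assert(layers != NULL && n_layers > 0);
	for (i = 0; i < n_layers; i++) {
		/* Every layer multiplies the number of possible DIMMs/ranks. */
		t.dimms *= layers[i].size;
		/* Each layer feeds exactly one of the two legacy dimensions. */
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}
	return t;
}
```

For the amd64 layout this yields 16 possible ranks (8 csrows by 2 channels),
which is why the allocator reports 16 even though only 8 of them are filled.
For the SB layout it yields 12 dimms, seen by the legacy API as 3 csrows by
4 channels.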

Regards,
Mauro


^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-30 11:23                                           ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On 30-04-2012 04:59, Borislav Petkov wrote:
> On Sun, Apr 29, 2012 at 11:16:53AM -0300, Mauro Carvalho Chehab wrote:
>>> Hey, are you looking at compiled code or at source code? Because I'm
>>> looking at source code, and it is a pretty safe bet the majority of the
>>> people here do that too.
>>
>> What I said is that, from a source code POV, code where the loop variables are
>> initialized just before the loop is easier to read than code where the initialization
>> of those vars is in another part of the code.
>>
>> That's basically why the "for" syntax starts with a var initialization clause.
>>
>> The tot_dimms & friends are loop vars: their value is calculated within the loop.
>>
>> At the object code, this won't bring any difference.
>>
>>>
>>>> it, either by using registers for those vars or by moving the initialization
>>>> to the top of the function.
>>>>
>>>> This function is too complex, so it is better to initialize those vars
>>>> just before the loops that are calculating those totals.
>>>
>>> Simply initialize those variables at declaration time and that's it.
>>> Initializing them before the loop doesn't make the function less complex
>>> - splitting it and sanitizing it does.
>>
>> Initializing loop-calculated vars just before the loop makes the code easier
>> to read, and may avoid issues that might happen during code lifecycle.
> 
> This is getting ridiculous:

With this I fully agree: you're nacking patches because it is not the way you
write your code, not because the code there is doing anything wrong.

If you point out anything wrong in the way I wrote it, I'll fix it. Otherwise,
why should I make a change that will obfuscate the code?

> the variable declaration and initialization
> are on the same screen as the loop (unless one uses a screen which can
> only show less than 40ish lines).
> 
> So the argument about making the code easier to read is bogus.
> 
> This function is already cluttered with a lot of crap, and is very large
> so adding more lines which can simply be stashed away at declaration
> time is better readability.
> 
> Besides, every modern editor can jump to the declaration of a local
> variable so that the user can see to what it is initialized to.

The editor used by the developer is not relevant. This is not a reason
to obfuscate the code.
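
For reference, the calculation under discussion can be reduced to a small
userspace sketch. This is only an illustration, not the patch itself:
struct layer, struct totals, calc_totals() and MAX_LAYERS are hypothetical
stand-ins for struct edac_mc_layer and EDAC_MAX_LAYERS, and assert() replaces
BUG_ON(). The three totals are running products, so they start at 1 and are
set right before the loop that builds them.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS (assumption) */

/* Hypothetical, simplified stand-in for struct edac_mc_layer. */
struct layer {
	unsigned size;		/* number of elements in this layer */
	bool is_virt_csrow;	/* layer counts towards the virtual csrows */
};

/* Hypothetical container for the three totals. */
struct totals {
	unsigned dimms, csrows, channels;
};

/* Multiply the layer sizes into the dimm, csrow and channel totals. */
static struct totals calc_totals(const struct layer *layers, unsigned n_layers)
{
	struct totals t;
	unsigned i;

	assert(n_layers > 0 && n_layers <= MAX_LAYERS);

	/* Running products: initialized to 1 just before the loop. */
	t.dimms = 1;
	t.csrows = 1;
	t.channels = 1;
	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}
	return t;
}
```

With a hypothetical two-layer layout of 4 channels by 3 slots, where only the
slot layer is flagged as a virtual csrow, calc_totals() yields 12 dimms,
3 csrows and 4 channels.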

>> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
>> +                                    unsigned n_layers,
>> +                                    struct edac_mc_layer *layers,
>> +                                    bool rev_order,
>> +                                    unsigned sz_pvt)
>>  {
>>       void *ptr = NULL;
>>       struct mem_ctl_info *mci;
>> -     struct csrow_info *csi, *csrow;
>> +     struct edac_mc_layer *layer;
>> +     struct csrow_info *csi, *csr;
>>       struct rank_info *chi, *chp, *chan;
>>       struct dimm_info *dimm;
>> +     u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
>>       void *pvt;
>> -     unsigned size;
>> -     int row, chn;
>> +     unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
>> +     unsigned tot_csrows, tot_channels, tot_errcount = 0;
>> +     int i, j;
>>       int err;
>> +     int row, chn;
>> +     bool per_rank = false;
>> +
>> +     BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
>> +     /*
>> +      * Calculate the total amount of dimms and csrows/cschannels while
>> +      * in the old API emulation mode
>> +      */
>> +     tot_dimms = 1;
>> +     tot_channels = 1;
>> +     tot_csrows = 1;
>> +     for (i = 0; i < n_layers; i++) {
>> +             tot_dimms *= layers[i].size;
>> +             if (layers[i].is_virt_csrow)
>> +                     tot_csrows *= layers[i].size;
>> +             else
>> +                     tot_channels *= layers[i].size;
>> +
>> +             if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
>> +                     per_rank = true;

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30  8:15                                         ` Borislav Petkov
@ 2012-04-30 11:37                                           ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:37 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 30-04-2012 05:15, Borislav Petkov escreveu:
> On Sun, Apr 29, 2012 at 10:49:44AM -0300, Mauro Carvalho Chehab wrote:
>>> [   10.486440] EDAC MC: DCT0 chip selects:
>>> [   10.486443] EDAC amd64: MC: 0:  2048MB 1:  2048MB
>>> [   10.486445] EDAC amd64: MC: 2:  2048MB 3:  2048MB
>>> [   10.486448] EDAC amd64: MC: 4:     0MB 5:     0MB
>>> [   10.486450] EDAC amd64: MC: 6:     0MB 7:     0MB
>>> [   10.486453] EDAC DEBUG: amd64_debug_display_dimm_sizes: F2x180 (DRAM Bank Address Mapping): 0x00000088
>>> [   10.486455] EDAC MC: DCT1 chip selects:
>>> [   10.486458] EDAC amd64: MC: 0:  2048MB 1:  2048MB
>>> [   10.486460] EDAC amd64: MC: 2:  2048MB 3:  2048MB
>>> [   10.486463] EDAC amd64: MC: 4:     0MB 5:     0MB
>>> [   10.486465] EDAC amd64: MC: 6:     0MB 7:     0MB
>>> [   10.486467] EDAC amd64: using x8 syndromes.
>>> [   10.486469] EDAC DEBUG: amd64_dump_dramcfg_low: F2x190 (DRAM Cfg Low): 0x00083100
>>> [   10.486472] EDAC DEBUG: amd64_dump_dramcfg_low:   DIMM type: buffered; all DIMMs support ECC: yes
>>> [   10.486475] EDAC DEBUG: amd64_dump_dramcfg_low:   PAR/ERR parity: enabled
>>> [   10.486478] EDAC DEBUG: amd64_dump_dramcfg_low:   DCT 128bit mode width: 64b
>>> [   10.486481] EDAC DEBUG: amd64_dump_dramcfg_low:   x4 logical DIMMs present: L0: yes L1: yes L2: no L3: no
>>> [   10.486485] EDAC DEBUG: f1x_early_channel_count: Data width is not 128 bits - need more decoding
>>> [   10.486488] EDAC amd64: MCT channel count: 2
>>> [   10.486493] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc(): allocating 3692 bytes for mci data (16 ranks, 16 csrows/channels)
>>> [   10.486501] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 0: rank0 (0:0:0): row 0, chan 0
>>> [   10.486506] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 1: rank1 (0:1:0): row 0, chan 1
>>> [   10.486510] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 2: rank2 (1:0:0): row 1, chan 0
>>> [   10.486514] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 3: rank3 (1:1:0): row 1, chan 1
>>> [   10.486518] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 4: rank4 (2:0:0): row 2, chan 0
>>> [   10.486522] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 5: rank5 (2:1:0): row 2, chan 1
>>> [   10.486526] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 6: rank6 (3:0:0): row 3, chan 0
>>> [   10.486530] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 7: rank7 (3:1:0): row 3, chan 1
>>> [   10.486534] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 8: rank8 (4:0:0): row 4, chan 0
>>> [   10.486538] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 9: rank9 (4:1:0): row 4, chan 1
>>> [   10.486542] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 10: rank10 (5:0:0): row 5, chan 0
>>> [   10.486546] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 11: rank11 (5:1:0): row 5, chan 1
>>> [   10.486550] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 12: rank12 (6:0:0): row 6, chan 0
>>> [   10.486554] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 13: rank13 (6:1:0): row 6, chan 1
>>> [   10.486558] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 14: rank14 (7:0:0): row 7, chan 0
>>> [   10.486562] EDAC DEBUG: new_edac_mc_alloc: new_edac_mc_alloc: 15: rank15 (7:1:0): row 7, chan 1
>>>
>>> DCT0 has 4 ranks + DCT1 also 4 ranks = 8 ranks total.
>>>
>>> Now your change is showing 16 ranks. Still b0rked.
>>>
>> No, DCT0+DCT1 have 16 ranks, 8 filled and 8 empty. So, it is OK.
>>
>> As I said before when you pointed out this bug (likely at the v3 review), edac_mc_alloc
>> doesn't know how many ranks are filled, as the driver logic first calls it to
>> allocate for the max amount of ranks, and then fills the ranks with their info
>> (or leaves them untouched with 0 pages, if they're empty).
> 
> Basically you're saying you're generating dimm_info structs for all
> _possible_ dimms and the loop where this debug message comes from goes
> and merrily initializes them all although some of them are empty:
> 
> +       for (i = 0; i < tot_dimms; i++) {
> +               chan = &csi[row].channels[chn];
> +               dimm = EDAC_DIMM_PTR(lay, mci->dimms, n_layers,
> +                              pos[0], pos[1], pos[2]);
> +               dimm->mci = mci;
> +
> +               debugf2("%s: %d: dimm%zd (%d:%d:%d): row %d, chan %d\n", __func__,
> +                       i, (dimm - mci->dimms),
> +                       pos[0], pos[1], pos[2], row, chn);
> +
> +               /* Copy DIMM location */
> +               for (j = 0; j < n_layers; j++)
> +                       dimm->location[j] = pos[j];
> ...
> 
> definitely superfluous.

This is the way the EDAC core works: everything is allocated in one shot when this
function is called and, in most drivers, before the code that probes how many DIMMs/ranks
are actually populated. That happens because the edac_mc_alloc() arguments provide the
total number of ranks/DIMMs, but say nothing about which of them are in use.
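
The allocate-everything-then-fill model described above can be sketched in
userspace C as follows. This is a simplified illustration and not the EDAC
core code: struct slot, alloc_slots(), fill_slot() and count_populated() are
hypothetical names, and calloc() stands in for the kernel allocator.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for struct dimm_info. */
struct slot {
	unsigned idx;		/* position in the array */
	unsigned nr_pages;	/* stays 0 when nothing is installed */
};

/* Core side: allocate every possible slot at once, all of them empty. */
static struct slot *alloc_slots(unsigned tot)
{
	struct slot *s = calloc(tot, sizeof(*s));
	unsigned i;

	if (!s)
		return NULL;
	for (i = 0; i < tot; i++)
		s[i].idx = i;
	return s;
}

/* Driver side, later on: record what the probe found for one slot. */
static void fill_slot(struct slot *s, unsigned tot, unsigned idx,
		      unsigned nr_pages)
{
	assert(idx < tot);
	s[idx].nr_pages = nr_pages;
}

/* Count the slots that ended up populated after the driver probe. */
static unsigned count_populated(const struct slot *s, unsigned tot)
{
	unsigned i, n = 0;

	for (i = 0; i < tot; i++)
		if (s[i].nr_pages)
			n++;
	return n;
}
```

In the amd64 case quoted above, this corresponds to allocating 16 slots and
having the driver fill 8 of them; the other 8 stay at 0 pages.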

Changing from this model to one that dynamically initializes the per-DIMM/rank data is
possible, but it would require another set of patches touching all drivers, and splitting
the edac_mc_alloc() function into 3 or 4 function calls, with the corresponding changes in
every driver. Also, the driver changes are unlikely to be trivial.

The patches that convert kobj into "struct device" do part of the job since, after them,
each dimm/csrow will be allocated by a separate kmalloc.

After this series is fully applied, it will be possible to work on such a solution.
I'll eventually do that, as it would simplify the code in i7core_edac and sb_edac.

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 11:11                                             ` Borislav Petkov
@ 2012-04-30 11:45                                               ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:45 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

Em 30-04-2012 08:11, Borislav Petkov escreveu:
> On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:

>> For example, this is the mapping used by the second memory controller of the SB machine
>> I'm using on my tests:
>>
>> [52803.640043] EDAC DEBUG: sbridge_probe: Registering MC#1 (2 of 2)
>> ...
>> [52803.640062] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc(): allocating 7196 bytes for mci data (12 dimms, 12 csrows/channels)
>> [52803.640070] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: initializing 12 dimms
>> [52803.640072] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
>> [52803.640074] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
>> [52803.640077] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 2: dimm2 (0:2:0): row 0, chan 2
>> [52803.640080] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 3: dimm3 (1:0:0): row 0, chan 3
>> [52803.640083] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 4: dimm4 (1:1:0): row 1, chan 0
>> [52803.640086] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 5: dimm5 (1:2:0): row 1, chan 1
>> [52803.640089] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 6: dimm6 (2:0:0): row 1, chan 2
>> [52803.640092] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 7: dimm7 (2:1:0): row 1, chan 3
>> [52803.640095] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 8: dimm8 (2:2:0): row 2, chan 0
>> [52803.640098] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 9: dimm9 (3:0:0): row 2, chan 1
>> [52803.640101] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 10: dimm10 (3:1:0): row 2, chan 2
>> [52803.640104] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 11: dimm11 (3:2:0): row 2, chan 3
>>
>> With the above info, it is clear that the DIMM located at mc#1, channel#3 slot#2 is
>> called "dimm11" at the new API, and corresponds to "csrow 2, channel 3" for a legacy
>> EDAC API call.
> 
> Are all those DIMM slots above populated? What happens if they're not,
> are you issuing the same dimm0-dimm11 lines for slots which aren't even
> populated?
> 
> I have a much better idea: Generally, this debug info should come from
> the specific driver that allocates the dimm descriptors, not from the
> EDAC core. This way, you know in the driver which slots are populated
> and those which are not should be omitted.

The drivers don't allocate the dimm descriptors. They're allocated by the
core.

> This way it says "initializing 12 dimms" and the user thinks there are
> 12 DIMMs on his system where this might not be true.


I'm OK to remove the "initializing 12 dimms" message. It doesn't add anything
new.

With regard to the other messages, if the debug messages are not clear,
then let's fix them instead of removing them. What if we print, instead,
a message like:

	"row 1, chan 1 will represent dimm5 (1:2:0) if not empty"
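
As a rough illustration of how such a line could be produced, here is a
userspace sketch. The index arithmetic is inferred from the SB log quoted
above (4 channels by 3 slots, legacy channel advancing first) and is an
assumption, not the code in the patch; struct dimm_map, map_dimm() and
format_dimm_line() are hypothetical names.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper type: where DIMM number i lands in both views. */
struct dimm_map {
	unsigned pos0, pos1;	/* position in the first and second layer */
	unsigned row, chn;	/* legacy virtual csrow and channel */
};

/* Map a flat DIMM index to its layer position and legacy row/channel. */
static struct dimm_map map_dimm(unsigned i, unsigned second_layer_size,
				unsigned tot_channels)
{
	struct dimm_map m;

	assert(second_layer_size > 0 && tot_channels > 0);

	m.pos0 = i / second_layer_size;
	m.pos1 = i % second_layer_size;
	m.row = i / tot_channels;
	m.chn = i % tot_channels;
	return m;
}

/* Format the proposed debug line; the third layer position is fixed at 0. */
static int format_dimm_line(char *buf, size_t len, unsigned i,
			    const struct dimm_map *m)
{
	return snprintf(buf, len,
			"row %u, chan %u will represent dimm%u (%u:%u:0) if not empty",
			m->row, m->chn, i, m->pos0, m->pos1);
}
```

For i = 5 with 3 slots per channel and 4 virtual channels, map_dimm() gives
position 1:2 and row 1, chan 1, which matches the dimm5 line in the log.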

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 11:15                                                     ` Borislav Petkov
@ 2012-04-30 11:46                                                       ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 11:46 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Joe Perches, Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson

Em 30-04-2012 08:15, Borislav Petkov escreveu:
> On Mon, Apr 30, 2012 at 08:09:20AM -0300, Mauro Carvalho Chehab wrote:
>>> you guys are so trigger-happy to apply humongous, cleanup patches but
>>> let me ask this: can anyone of you really test those changes with each
>>> driver? Do you have all the hardware that those patches touch?
>>
>> Well, then why you've touched that in the first place, without even looking
>> what would be affected at the EDAC core and at the drivers? Didn't you tested
>> it on all hardware that your patch affected?
> 
> Because I simply missed that fact, which is my bad, sorry.
> 
>> This patch touches only on debug code that aren't even enabled on
>> production kernels. Assuming that a sneaky bug were introduced, this
>> won't cause much hurt, and any developer inspecting those debug
>> messages will be able to discover and fix what happened there.
> 
> Ok, fine, I still will review the amd64_edac side of the changes.
> 

Ok, Thanks!
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 11:45                                               ` Mauro Carvalho Chehab
@ 2012-04-30 12:38                                                 ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 12:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Mon, Apr 30, 2012 at 08:45:09AM -0300, Mauro Carvalho Chehab wrote:
> Em 30-04-2012 08:11, Borislav Petkov escreveu:
> > On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:
> 
> >> For example, this is the mapping used by the second memory controller of the SB machine
> >> I'm using on my tests:
> >>
> >> [52803.640043] EDAC DEBUG: sbridge_probe: Registering MC#1 (2 of 2)
> >> ...
> >> [52803.640062] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc(): allocating 7196 bytes for mci data (12 dimms, 12 csrows/channels)
> >> [52803.640070] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: initializing 12 dimms
> >> [52803.640072] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> >> [52803.640074] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> >> [52803.640077] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 2: dimm2 (0:2:0): row 0, chan 2
> >> [52803.640080] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 3: dimm3 (1:0:0): row 0, chan 3
> >> [52803.640083] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 4: dimm4 (1:1:0): row 1, chan 0
> >> [52803.640086] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 5: dimm5 (1:2:0): row 1, chan 1
> >> [52803.640089] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 6: dimm6 (2:0:0): row 1, chan 2
> >> [52803.640092] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 7: dimm7 (2:1:0): row 1, chan 3
> >> [52803.640095] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 8: dimm8 (2:2:0): row 2, chan 0
> >> [52803.640098] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 9: dimm9 (3:0:0): row 2, chan 1
> >> [52803.640101] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 10: dimm10 (3:1:0): row 2, chan 2
> >> [52803.640104] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 11: dimm11 (3:2:0): row 2, chan 3
> >>
> >> With the above info, it is clear that the DIMM located at mc#1, channel#3 slot#2 is
> >> called "dimm11" at the new API, and corresponds to "csrow 2, channel 3" for a legacy
> >> EDAC API call.
> > 
> > Are all those DIMM slots above populated? What happens if they're not,
> > are you issuing the same dimm0-dimm11 lines for slots which aren't even
> > populated?
> > 
> > I have a much better idea: Generally, this debug info should come from
> > the specific driver that allocates the dimm descriptors, not from the
> > EDAC core. This way, you know in the driver which slots are populated
> > and those which are not should be omitted.
> 
> The drivers don't allocate the dimm descriptors. They're allocated by the
> core.

I know that. The drivers call into EDAC core using edac_mc_alloc, this
is what I meant above.

> > This way it says "initializing 12 dimms" and the user thinks there are
> > 12 DIMMs on his system where this might not be true.
> 
> 
> I'm OK to remove the "initializing 12 dimms" message. It doesn't add anything
> new.
> 
> With regards do the other messages, if the debug messages are not clear, 
> then let's fix them, instead of removing. What if we print, instead,
> on a message like:
> 
> 	"row 1, chan 1 will represent dimm5 (1:2:0) if not empty"

How about the following instead: the specific driver calls
edac_mc_alloc(), it gets the allocated dimm array in mci->dimms
_without_ dumping each dimm%d line. Then, each driver figures out which
subset of that dimms array actually has populated slots and prints only
the populated rank/slot/...

This information is much more valuable than saying how many _possible_
slots the edac core has allocated.

Then, each driver can decide whether it makes sense to dump that info or
not.
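
A minimal userspace sketch of that idea, assuming a driver that has already
filled in the page counts: it walks the array it got back and reports only
the populated entries. struct slot_info and dump_populated() are hypothetical
names, and fprintf() stands in for the EDAC debug macros.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical, simplified stand-in for one entry of mci->dimms. */
struct slot_info {
	unsigned pos[3];	/* location in up to three layers */
	unsigned nr_pages;	/* 0 means the slot is empty */
};

/* Print one line per populated slot; return how many lines were printed. */
static unsigned dump_populated(FILE *out, const struct slot_info *s,
			       unsigned tot)
{
	unsigned i, n = 0;

	assert(out != NULL);

	for (i = 0; i < tot; i++) {
		if (!s[i].nr_pages)
			continue;	/* skip slots the probe left empty */
		fprintf(out, "dimm%u (%u:%u:%u): %u pages\n", i,
			s[i].pos[0], s[i].pos[1], s[i].pos[2],
			s[i].nr_pages);
		n++;
	}
	return n;
}
```

Whether a given driver emits these lines at all would remain its own
decision, as suggested above.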

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
@ 2012-04-30 12:38                                                 ` Borislav Petkov
  0 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 12:38 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Shaohui Xie, Jason Uhlenkott, Aristeu Rozanski, Hitoshi Mitake,
	Mark Gross, Dmitry Eremin-Solenikov, Ranganathan Desikan,
	Egor Martovetsky, Niklas Söderlund, Tim Small, Arvind R.,
	Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

On Mon, Apr 30, 2012 at 08:45:09AM -0300, Mauro Carvalho Chehab wrote:
> Em 30-04-2012 08:11, Borislav Petkov escreveu:
> > On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:
> 
> >> For example, this is the mapping used by the second memory controller of the SB machine
> >> I'm using on my tests:
> >>
> >> [52803.640043] EDAC DEBUG: sbridge_probe: Registering MC#1 (2 of 2)
> >> ...
> >> [52803.640062] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc(): allocating 7196 bytes for mci data (12 dimms, 12 csrows/channels)
> >> [52803.640070] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: initializing 12 dimms
> >> [52803.640072] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 0: dimm0 (0:0:0): row 0, chan 0
> >> [52803.640074] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 1: dimm1 (0:1:0): row 0, chan 1
> >> [52803.640077] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 2: dimm2 (0:2:0): row 0, chan 2
> >> [52803.640080] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 3: dimm3 (1:0:0): row 0, chan 3
> >> [52803.640083] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 4: dimm4 (1:1:0): row 1, chan 0
> >> [52803.640086] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 5: dimm5 (1:2:0): row 1, chan 1
> >> [52803.640089] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 6: dimm6 (2:0:0): row 1, chan 2
> >> [52803.640092] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 7: dimm7 (2:1:0): row 1, chan 3
> >> [52803.640095] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 8: dimm8 (2:2:0): row 2, chan 0
> >> [52803.640098] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 9: dimm9 (3:0:0): row 2, chan 1
> >> [52803.640101] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 10: dimm10 (3:1:0): row 2, chan 2
> >> [52803.640104] EDAC DEBUG: edac_mc_alloc: edac_mc_alloc: 11: dimm11 (3:2:0): row 2, chan 3
> >>
> >> With the above info, it is clear that the DIMM located at mc#1, channel#3 slot#2 is
> >> called "dimm11" at the new API, and corresponds to "csrow 2, channel 3" for a legacy
> >> EDAC API call.
> > 
> > Are all those DIMM slots above populated? What happens if they're not,
> > are you issuing the same dimm0-dimm11 lines for slots which aren't even
> > populated?
> > 
> > I have a much better idea: Generally, this debug info should come from
> > the specific driver that allocates the dimm descriptors, not from the
> > EDAC core. This way, you know in the driver which slots are populated
> > and those which are not should be omitted.
> 
> The drivers don't allocate the dimm descriptors. They're allocated by the
> core.

I know that. The drivers call into EDAC core using edac_mc_alloc, this
is what I meant above.

> > This way it says "initializing 12 dimms" and the user thinks there are
> > 12 DIMMs on his system where this might not be true.
> 
> 
> I'm OK to remove the "initializing 12 dimms" message. It doesn't add anything
> new.
> 
> With regard to the other messages, if the debug messages are not clear,
> then let's fix them instead of removing them. What if we print, instead,
> a message like:
> 
> 	"row 1, chan 1 will represent dimm5 (1:2:0) if not empty"

How about the following instead: the specific driver calls
edac_mc_alloc(), it gets the allocated dimm array in mci->dimms
_without_ dumping each dimm%d line. Then, each driver figures out which
subset of that dimms array actually has populated slots and prints only
the populated rank/slot/...

This information is much more valuable than saying how many _possible_
slots the edac core has allocated.

Then, each driver can decide whether it makes sense to dump that info or
not.
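
A minimal userspace sketch of that idea, assuming the 4-channel by 3-slot
layout from the log quoted above and treating nr_pages == 0 as "slot is
empty". struct slot_info and dump_populated() are hypothetical names used
only for illustration; they are not EDAC core or driver functions.

```c
#include <stdio.h>

#define N_CHANNELS 4	/* assumed layout, taken from the quoted log */
#define N_SLOTS    3

/* Hypothetical, simplified stand-in for struct dimm_info. */
struct slot_info {
	unsigned nr_pages;	/* 0 means the slot is empty */
};

/*
 * Print one line per populated slot and return how many were printed.
 * The virtual row/chan pair follows the order seen in the log: the
 * channel index advances first, then the row.
 */
unsigned dump_populated(const struct slot_info *slots, FILE *out)
{
	unsigned i, shown = 0;

	for (i = 0; i < N_CHANNELS * N_SLOTS; i++) {
		if (!slots[i].nr_pages)
			continue;	/* skip empty slots */
		fprintf(out, "dimm%u (%u:%u): row %u, chan %u\n", i,
			i / N_SLOTS, i % N_SLOTS,
			i / N_CHANNELS, i % N_CHANNELS);
		shown++;
	}
	return shown;
}
```

With only slots 5 and 11 populated, this prints two lines, and the
row/chan values agree with the dimm5 and dimm11 lines of the log.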

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 11:23                                           ` Mauro Carvalho Chehab
@ 2012-04-30 12:51                                             ` Borislav Petkov
  -1 siblings, 0 replies; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 12:51 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On Mon, Apr 30, 2012 at 08:23:42AM -0300, Mauro Carvalho Chehab wrote:
> With this I fully agree: you're nacking patches because it is not the way you

Where? Have I written Nacked-by somewhere?

> write your code, not because the code there is doing anything wrong.
> 
> If you point out anything wrong in the way I wrote it, then I'll fix it. Otherwise,
> why should I make a change that will obfuscate the code?

What obfuscation are you talking about? Having the initialization of
variables along with their declaration is not it.

Now let's look at what you're doing:

> >> +     unsigned tot_csrows, tot_channels, tot_errcount = 0;
> >> +     unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];

just to reassign 1 to some of them

> >> +     tot_dimms = 1;
> >> +     tot_channels = 1;
> >> +     tot_csrows = 1;

a couple of lines below.

Now this is misleading.

Now let's look at what I'm proposing:

	unsigned tot_dimms = 1;
	unsigned tot_csrows = 1;
	unsigned tot_channels = 1;

How is this an obfuscation? It is basic code layout practice.
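
For reference, a self-contained userspace sketch of the totals
calculation with the start values placed next to the declarations.
struct layer, struct totals, layer_totals() and MAX_LAYERS are
simplified stand-ins invented for this sketch; they only mirror the
fields the quoted hunk uses and are not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_LAYERS 3	/* stand-in for EDAC_MAX_LAYERS */

/* Simplified stand-in for struct edac_mc_layer. */
struct layer {
	unsigned size;		/* number of entries in this layer */
	bool is_virt_csrow;	/* layer is counted as a "virtual csrow" */
};

struct totals {
	unsigned dimms;
	unsigned csrows;
	unsigned channels;
};

struct totals layer_totals(const struct layer *layers, unsigned n_layers)
{
	/* The products start at 1, set where the variable is declared. */
	struct totals t = { .dimms = 1, .csrows = 1, .channels = 1 };
	unsigned i;

	/* Userspace counterpart of the BUG_ON() in the quoted hunk. */
	assert(n_layers > 0 && n_layers <= MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		t.dimms *= layers[i].size;
		if (layers[i].is_virt_csrow)
			t.csrows *= layers[i].size;
		else
			t.channels *= layers[i].size;
	}
	return t;
}
```

Assuming a channel layer of size 4 and a slot layer of size 3 marked as
the virtual csrow, this yields 12 dimms, 3 csrows and 4 channels.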

> The editor used by the developer is not relevant. This is not a reason
> to obfuscate the code.
> 
> >> +struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
> >> +                                    unsigned n_layers,
> >> +                                    struct edac_mc_layer *layers,
> >> +                                    bool rev_order,
> >> +                                    unsigned sz_pvt)
> >>  {
> >>       void *ptr = NULL;
> >>       struct mem_ctl_info *mci;
> >> -     struct csrow_info *csi, *csrow;
> >> +     struct edac_mc_layer *layer;
> >> +     struct csrow_info *csi, *csr;
> >>       struct rank_info *chi, *chp, *chan;
> >>       struct dimm_info *dimm;
> >> +     u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
> >>       void *pvt;
> >> -     unsigned size;
> >> -     int row, chn;
> >> +     unsigned size, tot_dimms, count, pos[EDAC_MAX_LAYERS];
> >> +     unsigned tot_csrows, tot_channels, tot_errcount = 0;
> >> +     int i, j;
> >>       int err;
> >> +     int row, chn;
> >> +     bool per_rank = false;
> >> +
> >> +     BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
> >> +     /*
> >> +      * Calculate the total amount of dimms and csrows/cschannels while
> >> +      * in the old API emulation mode
> >> +      */
> >> +     tot_dimms = 1;
> >> +     tot_channels = 1;
> >> +     tot_csrows = 1;
> >> +     for (i = 0; i < n_layers; i++) {
> >> +             tot_dimms *= layers[i].size;
> >> +             if (layers[i].is_virt_csrow)
> >> +                     tot_csrows *= layers[i].size;
> >> +             else
> >> +                     tot_channels *= layers[i].size;
> >> +
> >> +             if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
> >> +                     per_rank = true;

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 12:38                                                 ` Borislav Petkov
@ 2012-04-30 13:00                                                   ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 13:00 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 30-04-2012 09:38, Borislav Petkov wrote:
> On Mon, Apr 30, 2012 at 08:45:09AM -0300, Mauro Carvalho Chehab wrote:
>> On 30-04-2012 08:11, Borislav Petkov wrote:
>>> On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:

>>> This way it says "initializing 12 dimms" and the user thinks there are
>>> 12 DIMMs on his system where this might not be true.
>>
>>
>> I'm OK to remove the "initializing 12 dimms" message. It doesn't add anything
>> new.
>>
>> With regard to the other messages, if the debug messages are not clear,
>> then let's fix them instead of removing them. What if we print, instead,
>> a message like:
>>
>> 	"row 1, chan 1 will represent dimm5 (1:2:0) if not empty"
> 
> How about the following instead: the specific driver calls
> edac_mc_alloc(), it gets the allocated dimm array in mci->dimms
> _without_ dumping each dimm%d line. Then, each driver figures out which
> subset of that dimms array actually has populated slots and prints only
> the populated rank/slot/...
> 
> This information is much more valuable than saying how many _possible_
> slots the edac core has allocated.
> 
> Then, each driver can decide whether it makes sense to dump that info or
> not.

No, that would add extra complexity at the driver level just for the sake of
debug messages. I think it is better to move this printk to the debug-specific
routine that is called only when the dimm is filled (edac_mc_dump_dimm).

With this change, the message will be printed only for the filled dimms.

This is a cleanup patch, so I'll write it, together with the change that
will get rid of the loop that uses KERN_CONT. It will use a function added
by a later patch at edac_mc_sysfs, so it can't be merged into this patch
anyway.

Regards,
Mauro

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [PATCH EDACv16 1/2] edac: Change internal representation to work with layers
  2012-04-30 13:00                                                   ` Mauro Carvalho Chehab
@ 2012-04-30 13:53                                                     ` Mauro Carvalho Chehab
  -1 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 13:53 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Aristeu Rozanski, Doug Thompson, Mark Gross, Jason Uhlenkott,
	Tim Small, Ranganathan Desikan, Arvind R.,
	Olof Johansson, Egor Martovetsky, Chris Metcalf, Michal Marek,
	Jiri Kosina, Joe Perches, Dmitry Eremin-Solenikov,
	Benjamin Herrenschmidt, Hitoshi Mitake, Andrew Morton,
	Niklas Söderlund, Shaohui Xie, Josh Boyer, linuxppc-dev

On 30-04-2012 10:00, Mauro Carvalho Chehab wrote:
> On 30-04-2012 09:38, Borislav Petkov wrote:
>> On Mon, Apr 30, 2012 at 08:45:09AM -0300, Mauro Carvalho Chehab wrote:
>>> On 30-04-2012 08:11, Borislav Petkov wrote:
>>>> On Mon, Apr 30, 2012 at 07:58:33AM -0300, Mauro Carvalho Chehab wrote:
> 
>>>> This way it says "initializing 12 dimms" and the user thinks there are
>>>> 12 DIMMs on his system where this might not be true.
>>>
>>>
>>> I'm OK to remove the "initializing 12 dimms" message. It doesn't add anything
>>> new.
>>>
>>> With regard to the other messages, if the debug messages are not clear,
>>> then let's fix them instead of removing them. What if we print, instead,
>>> a message like:
>>>
>>> 	"row 1, chan 1 will represent dimm5 (1:2:0) if not empty"
>>
>> How about the following instead: the specific driver calls
>> edac_mc_alloc(), it gets the allocated dimm array in mci->dimms
>> _without_ dumping each dimm%d line. Then, each driver figures out which
>> subset of that dimms array actually has populated slots and prints only
>> the populated rank/slot/...
>>
>> This information is much more valuable than saying how many _possible_
>> slots the edac core has allocated.
>>
>> Then, each driver can decide whether it makes sense to dump that info or
>> not.
> 
> No, that would add extra complexity at the driver level just for the sake of
> debug messages. I think it is better to move this printk to the debug-specific
> routine that is called only when the dimm is filled (edac_mc_dump_dimm).
> 
> With this change, the message will be printed only for the filled dimms.
> 
> This is a cleanup patch, so I'll write it, together with the change that
> will get rid of the loop that uses KERN_CONT. It will use a function added
> by a later patch at edac_mc_sysfs, so it can't be merged into this patch
> anyway.

The following patch does the debug cleanup. I'll add it at the end of my tree.

Regards,
Mauro.


From: Mauro Carvalho Chehab <mchehab@redhat.com>
Date: Mon, 30 Apr 2012 10:24:43 -0300
Subject: [PATCH] edac_mc: Cleanup per-dimm_info debug messages

The edac_mc_alloc() routine allocates one dimm_info device for all
possible memories, including the non-filled ones. The debug messages
there are somewhat confusing. So, clean them up by moving the code
that prints the memory location to edac_mc, and using it in both
edac_mc_sysfs and edac_mc.

After this patch, a dimm-based memory controller will print the debug
info as:

[  728.430828] EDAC DEBUG: edac_mc_dump_dimm: 	dimm2: channel 0 slot 2 mapped as virtual row 0, chan 2
[  728.430834] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->label = 'mc#0channel#0slot#2'
[  728.430839] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
[  728.430846] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->grain = 0
[  728.430850] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0

(a rank-based memory controller would print, instead, "rank2"
 on the above debug info)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index d8278b3..1bc2843 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -40,6 +40,25 @@
 static DEFINE_MUTEX(mem_ctls_mutex);
 static LIST_HEAD(mc_devices);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+			         int len)
+{
+	struct mem_ctl_info *mci = dimm->mci;
+	int i, n, count = 0;
+	char *p = buf;
+
+	for (i = 0; i < mci->n_layers; i++) {
+		n = snprintf(p, len, "%s %d ",
+			      edac_layer_name[mci->layers[i].type],
+			      dimm->location[i]);
+		p += n;
+		len -= n;
+		count += n;
+	}
+
+	return count;
+}
+
 #ifdef CONFIG_EDAC_DEBUG
 
 static void edac_mc_dump_channel(struct rank_info *chan)
@@ -50,20 +69,18 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel->dimm = %p\n", chan->dimm);
 }
 
-static void edac_mc_dump_dimm(struct dimm_info *dimm)
+static void edac_mc_dump_dimm(struct dimm_info *dimm, int number)
 {
-	int i;
+	char location[80];
+
+	edac_dimm_info_location(dimm, location, sizeof(location));
 
 	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\t%s%i: %smapped as virtual row %d, chan %d\n",
+		dimm->mci->mem_is_per_rank ? "rank" : "dimm",
+		number, location, dimm->csrow, dimm->cschannel);
 	debugf4("\tdimm->label = '%s'\n", dimm->label);
 	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
-	debugf4("\tdimm location ");
-	for (i = 0; i < dimm->mci->n_layers; i++) {
-		printk(KERN_CONT "%d", dimm->location[i]);
-		if (i < dimm->mci->n_layers - 1)
-			printk(KERN_CONT ".");
-	}
-	printk(KERN_CONT "\n");
 	debugf4("\tdimm->grain = %d\n", dimm->grain);
 	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
@@ -337,8 +354,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	debugf4("initializing %d %s\n", tot_dimms,
-		per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
 		off = EDAC_DIMM_OFF(layer, n_layers, pos[0], pos[1], pos[2]);
@@ -351,10 +366,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		debugf2("%d: %s%i (%d:%d:%d): row %d, chan %d\n", i,
-			per_rank ? "rank" : "dimm", off,
-			pos[0], pos[1], pos[2], row, chn);
-
 		/*
 		 * Copy DIMM location and initialize the memory location
 		 */
@@ -730,7 +741,7 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(mci->csrows[i]->channels[j]);
 		}
 		for (i = 0; i < mci->tot_dimms; i++)
-			edac_mc_dump_dimm(mci->dimms[i]);
+			edac_mc_dump_dimm(mci->dimms[i], i);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index 8f96c49..e3e9e75 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -488,13 +488,7 @@ static ssize_t dimmdev_location_show(struct device *dev,
 	int i;
 	char *p = data;
 
-	for (i = 0; i < mci->n_layers; i++) {
-		p += sprintf(p, "%s %d ",
-			     edac_layer_name[mci->layers[i].type],
-			     dimm->location[i]);
-	}
-
-	return p - data;
+	return edac_dimm_info_location(dimm, data, PAGE_SIZE);
 }
 
 static ssize_t dimmdev_label_show(struct device *dev,
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 1af1367..de92756 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -34,6 +34,9 @@ extern int edac_mc_get_panic_on_ue(void);
 extern int edac_get_poll_msec(void);
 extern int edac_mc_get_poll_msec(void);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+				 int len);
+
 	/* on edac_device.c */
 extern int edac_device_register_sysfs_main_kobj(
 				struct edac_device_ctl_info *edac_dev);
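
A note on the length handling in a helper like edac_dimm_info_location():
snprintf() returns the length the output would have had, not the number
of bytes actually stored, so a loop that advances by the return value
needs to clamp it once the buffer is full. The userspace sketch below
shows one way to do that. format_location() and its parameters are
illustrative names chosen for this sketch, not part of the patch.

```c
#include <stdio.h>

/*
 * Append "<name> <index> " for each layer into buf, never writing more
 * than len bytes. Returns the number of characters stored, excluding
 * the terminating NUL.
 */
size_t format_location(char *buf, size_t len, const char *const *names,
		       const unsigned *pos, unsigned n_layers)
{
	size_t count = 0;
	unsigned i;

	if (!len)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < n_layers && count < len - 1; i++) {
		int n = snprintf(buf + count, len - count, "%s %u ",
				 names[i], pos[i]);
		if (n < 0)
			break;			/* output error */
		if ((size_t)n >= len - count) {
			count = len - 1;	/* output was truncated */
			break;
		}
		count += (size_t)n;
	}
	return count;
}
```

For the layer names "channel" and "slot" at position 0, 2 and an 80-byte
buffer, this produces the same "channel 0 slot 2 " prefix that appears
in the sample debug output of the patch.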

^ permalink raw reply related	[flat|nested] 206+ messages in thread

* [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 13:53                                                     ` Mauro Carvalho Chehab
  (?)
@ 2012-04-30 15:02                                                     ` Mauro Carvalho Chehab
  2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
                                                                         ` (2 more replies)
  -1 siblings, 3 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 15:02 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Joe Perches

The edac_mc_alloc() routine allocates one dimm_info device for all
possible memories, including the non-filled ones. The debug messages
there are somewhat confusing. So, clean them up by moving the code
that prints the memory location to edac_mc, and using it in both
edac_mc_sysfs and edac_mc.

After this patch, a dimm-based memory controller will print the debug
info as:

[  728.430828] EDAC DEBUG: edac_mc_dump_dimm: 	dimm2: channel 0 slot 2 mapped as virtual row 0, chan 2
[  728.430834] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->label = 'mc#0channel#0slot#2'
[  728.430839] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
[  728.430846] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->grain = 0
[  728.430850] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0

(a rank-based memory controller would print, instead, "rank2"
 on the above debug info)

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Joe Perches <joe@perches.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v2: rebase this patch to apply after:
    http://git.infradead.org/users/mchehab/edac.git/commitdiff/a93f4f2a667afa1ff2f983b12d2062d7259c6c44
    Joe's patches renamed debugf[0-4]() to edac_dbg([0-4], ...), so the two
    patches would conflict otherwise.

 drivers/edac/edac_mc.c       |   44 ++++++++++++++++++++++++++---------------
 drivers/edac/edac_mc_sysfs.c |   11 +---------
 drivers/edac/edac_module.h   |    3 ++
 3 files changed, 32 insertions(+), 26 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f48011f..ecb23e5 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -40,6 +40,25 @@
 static DEFINE_MUTEX(mem_ctls_mutex);
 static LIST_HEAD(mc_devices);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+			         int len)
+{
+	struct mem_ctl_info *mci = dimm->mci;
+	int i, n, count = 0;
+	char *p = buf;
+
+	for (i = 0; i < mci->n_layers; i++) {
+		n = snprintf(p, len, "%s %d ",
+			      edac_layer_name[mci->layers[i].type],
+			      dimm->location[i]);
+		p += n;
+		len -= n;
+		count += n;
+	}
+
+	return count;
+}
+
 #ifdef CONFIG_EDAC_DEBUG
 
 static void edac_mc_dump_channel(struct rank_info *chan)
@@ -50,20 +69,19 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	edac_dbg(4, "\tchannel->dimm = %p\n", chan->dimm);
 }
 
-static void edac_mc_dump_dimm(struct dimm_info *dimm)
+static void edac_mc_dump_dimm(struct dimm_info *dimm, int number)
 {
-	int i;
+	char location[80];
+
+	edac_dimm_info_location(dimm, location, sizeof(location));
 
 	edac_dbg(4, "\tdimm = %p\n", dimm);
+	edac_dbg(4, "\tdimm = %p\n", dimm);
+	edac_dbg(4, "\t%s%i: %smapped as virtual row %d, chan %d\n",
+		 dimm->mci->mem_is_per_rank ? "rank" : "dimm",
+		 number, location, dimm->csrow, dimm->cschannel);
 	edac_dbg(4, "\tdimm->label = '%s'\n", dimm->label);
 	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
-	edac_dbg(4, "\tdimm location ");
-	for (i = 0; i < dimm->mci->n_layers; i++) {
-		printk(KERN_CONT "%d", dimm->location[i]);
-		if (i < dimm->mci->n_layers - 1)
-			printk(KERN_CONT ".");
-	}
-	printk(KERN_CONT "\n");
 	edac_dbg(4, "\tdimm->grain = %d\n", dimm->grain);
 	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
@@ -338,8 +356,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	edac_dbg(4, "initializing %d %s\n",
-		 tot_dimms, per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
 		off = EDAC_DIMM_OFF(layer, n_layers, pos[0], pos[1], pos[2]);
@@ -352,10 +368,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		edac_dbg(2, "%d: %s%i (%d:%d:%d): row %d, chan %d\n",
-			 i, per_rank ? "rank" : "dimm", off,
-			 pos[0], pos[1], pos[2], row, chn);
-
 		/*
 		 * Copy DIMM location and initialize the memory location
 		 */
@@ -731,7 +743,7 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(mci->csrows[i]->channels[j]);
 		}
 		for (i = 0; i < mci->tot_dimms; i++)
-			edac_mc_dump_dimm(mci->dimms[i]);
+			edac_mc_dump_dimm(mci->dimms[i], i);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index fe2d922..95865a0 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -485,17 +485,8 @@ static ssize_t dimmdev_location_show(struct device *dev,
 				     struct device_attribute *mattr, char *data)
 {
 	struct dimm_info *dimm = to_dimm(dev);
-	struct mem_ctl_info *mci = dimm->mci;
-	int i;
-	char *p = data;
-
-	for (i = 0; i < mci->n_layers; i++) {
-		p += sprintf(p, "%s %d ",
-			     edac_layer_name[mci->layers[i].type],
-			     dimm->location[i]);
-	}
 
-	return p - data;
+	return edac_dimm_info_location(dimm, data, PAGE_SIZE);
 }
 
 static ssize_t dimmdev_label_show(struct device *dev,
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 1af1367..de92756 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -34,6 +34,9 @@ extern int edac_mc_get_panic_on_ue(void);
 extern int edac_get_poll_msec(void);
 extern int edac_mc_get_poll_msec(void);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+				 int len);
+
 	/* on edac_device.c */
 extern int edac_device_register_sysfs_main_kobj(
 				struct edac_device_ctl_info *edac_dev);
-- 
1.7.8


* Re: [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 15:02                                                     ` [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
@ 2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
  2012-04-30 15:20                                                         ` Borislav Petkov
  2012-04-30 16:16                                                       ` Joe Perches
  2012-04-30 16:44                                                       ` [PATCHv3] " Mauro Carvalho Chehab
  2 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 15:10 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Joe Perches

Em 30-04-2012 12:02, Mauro Carvalho Chehab escreveu:
> The edac_mc_alloc() routine allocates one dimm_info device for all
> possible memories, including the non-filled ones. The debug messages
> there are somewhat confusing. So, cleans them, by moving the code
> that prints the memory location to edac_mc, and using it on both
> edac_mc_sysfs and edac_mc.
> 
> After this patch, a dimm-based memory controller will print the debug
> info as:
> 
> [  728.430828] EDAC DEBUG: edac_mc_dump_dimm: 	dimm2: channel 0 slot 2 mapped as virtual row 0, chan 2
> [  728.430834] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->label = 'mc#0channel#0slot#2'
> [  728.430839] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
> [  728.430846] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->grain = 0
> [  728.430850] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0

Hmm... I just noticed that, just like the per-csrow register dumps, this routine
is called even for empty memories (in this case: nr_pages = 0).

IMHO, as this is a register dump, it is better to keep it as-is, but it would
be simple to add a test there - and at edac_mc_dump_csrow() - to dump it only
when dimm->nr_pages is not 0.

What do you think?

Regards,
Mauro

> 
> (a rank-based memory controller would print, instead, "rank2"
>  on the above debug info)
> 
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Joe Perches <joe@perches.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> ---
> 
> v2: rebase this patch to apply after:
>     http://git.infradead.org/users/mchehab/edac.git/commitdiff/a93f4f2a667afa1ff2f983b12d2062d7259c6c44
>     As Joe's patches renamed debugf[0-4] to edac_dbg([0-4], and both patches
>     would conflict otherwise.
> 
>  drivers/edac/edac_mc.c       |   44 ++++++++++++++++++++++++++---------------
>  drivers/edac/edac_mc_sysfs.c |   11 +---------
>  drivers/edac/edac_module.h   |    3 ++
>  3 files changed, 32 insertions(+), 26 deletions(-)
> 
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> index f48011f..ecb23e5 100644
> --- a/drivers/edac/edac_mc.c
> +++ b/drivers/edac/edac_mc.c
> @@ -40,6 +40,25 @@
>  static DEFINE_MUTEX(mem_ctls_mutex);
>  static LIST_HEAD(mc_devices);
>  
> +unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
> +			         int len)
> +{
> +	struct mem_ctl_info *mci = dimm->mci;
> +	int i, n, count = 0;
> +	char *p = buf;
> +
> +	for (i = 0; i < mci->n_layers; i++) {
> +		n = snprintf(p, len, "%s %d ",
> +			      edac_layer_name[mci->layers[i].type],
> +			      dimm->location[i]);
> +		p += n;
> +		len -= n;
> +		count += n;
> +	}
> +
> +	return count;
> +}
> +
>  #ifdef CONFIG_EDAC_DEBUG
>  
>  static void edac_mc_dump_channel(struct rank_info *chan)
> @@ -50,20 +69,19 @@ static void edac_mc_dump_channel(struct rank_info *chan)
>  	edac_dbg(4, "\tchannel->dimm = %p\n", chan->dimm);
>  }
>  
> -static void edac_mc_dump_dimm(struct dimm_info *dimm)
> +static void edac_mc_dump_dimm(struct dimm_info *dimm, int number)
>  {
> -	int i;
> +	char location[80];
> +
> +	edac_dimm_info_location(dimm, location, sizeof(location));
>  
>  	edac_dbg(4, "\tdimm = %p\n", dimm);
> +	edac_dbg(4, "\tdimm = %p\n", dimm);
> +	edac_dbg(4, "\t%s%i: %smapped as virtual row %d, chan %d\n",
> +		 dimm->mci->mem_is_per_rank ? "rank" : "dimm",
> +		 number, location, dimm->csrow, dimm->cschannel);
>  	edac_dbg(4, "\tdimm->label = '%s'\n", dimm->label);
>  	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
> -	edac_dbg(4, "\tdimm location ");
> -	for (i = 0; i < dimm->mci->n_layers; i++) {
> -		printk(KERN_CONT "%d", dimm->location[i]);
> -		if (i < dimm->mci->n_layers - 1)
> -			printk(KERN_CONT ".");
> -	}
> -	printk(KERN_CONT "\n");
>  	edac_dbg(4, "\tdimm->grain = %d\n", dimm->grain);
>  	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
>  }
> @@ -338,8 +356,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
>  	memset(&pos, 0, sizeof(pos));
>  	row = 0;
>  	chn = 0;
> -	edac_dbg(4, "initializing %d %s\n",
> -		 tot_dimms, per_rank ? "ranks" : "dimms");
>  	for (i = 0; i < tot_dimms; i++) {
>  		chan = mci->csrows[row]->channels[chn];
>  		off = EDAC_DIMM_OFF(layer, n_layers, pos[0], pos[1], pos[2]);
> @@ -352,10 +368,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
>  		mci->dimms[off] = dimm;
>  		dimm->mci = mci;
>  
> -		edac_dbg(2, "%d: %s%i (%d:%d:%d): row %d, chan %d\n",
> -			 i, per_rank ? "rank" : "dimm", off,
> -			 pos[0], pos[1], pos[2], row, chn);
> -
>  		/*
>  		 * Copy DIMM location and initialize the memory location
>  		 */
> @@ -731,7 +743,7 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
>  				edac_mc_dump_channel(mci->csrows[i]->channels[j]);
>  		}
>  		for (i = 0; i < mci->tot_dimms; i++)
> -			edac_mc_dump_dimm(mci->dimms[i]);
> +			edac_mc_dump_dimm(mci->dimms[i], i);
>  	}
>  #endif
>  	mutex_lock(&mem_ctls_mutex);
> diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
> index fe2d922..95865a0 100644
> --- a/drivers/edac/edac_mc_sysfs.c
> +++ b/drivers/edac/edac_mc_sysfs.c
> @@ -485,17 +485,8 @@ static ssize_t dimmdev_location_show(struct device *dev,
>  				     struct device_attribute *mattr, char *data)
>  {
>  	struct dimm_info *dimm = to_dimm(dev);
> -	struct mem_ctl_info *mci = dimm->mci;
> -	int i;
> -	char *p = data;
> -
> -	for (i = 0; i < mci->n_layers; i++) {
> -		p += sprintf(p, "%s %d ",
> -			     edac_layer_name[mci->layers[i].type],
> -			     dimm->location[i]);
> -	}
>  
> -	return p - data;
> +	return edac_dimm_info_location(dimm, data, PAGE_SIZE);
>  }
>  
>  static ssize_t dimmdev_label_show(struct device *dev,
> diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
> index 1af1367..de92756 100644
> --- a/drivers/edac/edac_module.h
> +++ b/drivers/edac/edac_module.h
> @@ -34,6 +34,9 @@ extern int edac_mc_get_panic_on_ue(void);
>  extern int edac_get_poll_msec(void);
>  extern int edac_mc_get_poll_msec(void);
>  
> +unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
> +				 int len);
> +
>  	/* on edac_device.c */
>  extern int edac_device_register_sysfs_main_kobj(
>  				struct edac_device_ctl_info *edac_dev);


* Re: [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
@ 2012-04-30 15:20                                                         ` Borislav Petkov
  2012-04-30 15:33                                                           ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-04-30 15:20 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson, Joe Perches

On Mon, Apr 30, 2012 at 12:10:24PM -0300, Mauro Carvalho Chehab wrote:
> > [  728.430828] EDAC DEBUG: edac_mc_dump_dimm: 	dimm2: channel 0 slot 2 mapped as virtual row 0, chan 2
> > [  728.430834] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->label = 'mc#0channel#0slot#2'
> > [  728.430839] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
> > [  728.430846] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->grain = 0
> > [  728.430850] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
> 
> Hmm... just noticed that, just like the per-csrow register dumps, this routine
> is called even when empty memories are used (in this case: nr_pages = 0).
> 
> IMHO, as this is a registers dump, the better is to keep it as-is, but it would
> be simple to add a test there - and at edac_mc_dump_csrow() - to just dump it
> when dimm->nr_pages is not 0.
> 
> What do you think?

Or even better, test dimm->nr_pages != 0 before calling
edac_mc_dump_csrow() so that you can save yourself the function call.
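
A minimal userspace sketch of that idea, assuming simplified stand-in
structures (struct dimm, struct csrow and both helper names are
hypothetical, not the EDAC core's): a csrow counts as populated when any
DIMM behind its channels has a non-zero page count, and the dump callback
is only invoked for populated rows.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the EDAC structures. */
struct dimm { unsigned nr_pages; };
struct csrow { struct dimm **dimms; size_t nr_channels; };

/* A csrow is populated if any DIMM behind its channels has pages. */
static int csrow_is_populated(const struct csrow *row)
{
	size_t j;

	for (j = 0; j < row->nr_channels; j++)
		if (row->dimms[j]->nr_pages)
			return 1;
	return 0;
}

/* Call dump() only for populated csrows; returns how many were dumped. */
static size_t dump_populated(struct csrow *rows, size_t n,
			     void (*dump)(const struct csrow *))
{
	size_t i, dumped = 0;

	for (i = 0; i < n; i++) {
		if (!csrow_is_populated(&rows[i]))
			continue;	/* test first, so no call is made */
		if (dump)
			dump(&rows[i]);
		dumped++;
	}
	return dumped;
}
```

The check sits in the caller, so an empty row costs one loop over its
channels and no dump call at all.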

Btw, where are the latest versions of your patches so that I can
continue reviewing them?

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

* Re: [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 15:20                                                         ` Borislav Petkov
@ 2012-04-30 15:33                                                           ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 15:33 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Joe Perches

Em 30-04-2012 12:20, Borislav Petkov escreveu:
> On Mon, Apr 30, 2012 at 12:10:24PM -0300, Mauro Carvalho Chehab wrote:
>>> [  728.430828] EDAC DEBUG: edac_mc_dump_dimm: 	dimm2: channel 0 slot 2 mapped as virtual row 0, chan 2
>>> [  728.430834] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->label = 'mc#0channel#0slot#2'
>>> [  728.430839] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
>>> [  728.430846] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->grain = 0
>>> [  728.430850] EDAC DEBUG: edac_mc_dump_dimm: 	dimm->nr_pages = 0x0
>>
>> Hmm... just noticed that, just like the per-csrow register dumps, this routine
>> is called even when empty memories are used (in this case: nr_pages = 0).
>>
>> IMHO, as this is a registers dump, the better is to keep it as-is, but it would
>> be simple to add a test there - and at edac_mc_dump_csrow() - to just dump it
>> when dimm->nr_pages is not 0.
>>
>> What do you think?
> 
> Or even better, test dimm->nr_pages != 0 before calling
> edac_mc_dump_csrow() so that you can save yourself the function call.

Ok, I'll do that.

> Btw, where are the latest versions of your patches so that I can
> continue reviewing them?

The very latest series of patches are always at:
	git://git.infradead.org/users/mchehab/edac.git experimental

The latest version of "[PATCH EDACv16 1/2]edac: Change internal representation to work with layers"

is in one of the replies, and also at:

	http://git.infradead.org/users/mchehab/edac.git/commit/447b7929e633027ffe131f2f8f246bba5690cee7

I won't resend the 28 "foo: convert driver to use the new edac ABI" patches to the mailing list,
as there were no changes to them, except for a contextual change:
	the .is_csrow field got renamed to .is_virt_csrow

So, there's no sense in flooding the mailing lists with them again.

The only one that you're likely interested in is the amd64_edac one (the first one of the 28-patch
series). This patch was re-sent as PATCH EDACv16 2/2.

The next patch after them is "edac: Remove the legacy EDAC ABI"[1], which also didn't change
(except, maybe, for contextual changes).

After you review them, I'll re-send the other patches in my queue, probably breaking them into two
or three patch series.

[1] http://git.infradead.org/users/mchehab/edac.git/commit/d1390992c51323802e1ccddfa48b16f0be062621

Regards,
Mauro

* Re: [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 15:02                                                     ` [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
  2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
@ 2012-04-30 16:16                                                       ` Joe Perches
  2012-04-30 16:47                                                         ` Mauro Carvalho Chehab
  2012-04-30 16:44                                                       ` [PATCHv3] " Mauro Carvalho Chehab
  2 siblings, 1 reply; 206+ messages in thread
From: Joe Perches @ 2012-04-30 16:16 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

On Mon, 2012-04-30 at 12:02 -0300, Mauro Carvalho Chehab wrote:
> The edac_mc_alloc() routine allocates one dimm_info device for all
> possible memories, including the non-filled ones. The debug messages
> there are somewhat confusing. So, cleans them, by moving the code
> that prints the memory location to edac_mc, and using it on both
> edac_mc_sysfs and edac_mc.
[]
> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
[]
> @@ -40,6 +40,25 @@
>  static DEFINE_MUTEX(mem_ctls_mutex);
>  static LIST_HEAD(mc_devices);
>  
> +unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
> +			         int len)
> +{
> +	struct mem_ctl_info *mci = dimm->mci;
> +	int i, n, count = 0;
> +	char *p = buf;
> +
> +	for (i = 0; i < mci->n_layers; i++) {
> +		n = snprintf(p, len, "%s %d ",
> +			      edac_layer_name[mci->layers[i].type],
> +			      dimm->location[i]);
> +		p += n;
> +		len -= n;
> +		count += n;
> +	}

I believe this snprintf can be unsafe
when the buffer length len is exceeded.

If len becomes negative, it's promoted to size_t
and snprintf continues to write into p.
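
As a userspace sketch of the concern (not the patch's code): snprintf()
returns the length the output would have had without truncation, so after
a truncated write n can be larger than the remaining space, and "len -= n"
then goes negative (or wraps, if len is unsigned). One way to stay in
bounds is to compare n against the remaining room and stop on truncation.
The names format_location and struct location are hypothetical; in-kernel
code could get the same effect with scnprintf(), which returns the number
of characters actually written.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for the layered location of one DIMM. */
struct location {
	const char *name[3];
	int pos[3];
	int n_layers;
};

/*
 * Append "<name> <pos> " for each layer into buf without ever
 * writing past buf[len - 1]. Returns the number of characters
 * actually stored, not counting the trailing NUL.
 */
static size_t format_location(const struct location *loc, char *buf, size_t len)
{
	size_t count = 0;
	int i;

	if (!len)
		return 0;
	buf[0] = '\0';

	for (i = 0; i < loc->n_layers; i++) {
		size_t room = len - count;
		int n = snprintf(buf + count, room, "%s %d ",
				 loc->name[i], loc->pos[i]);

		if (n < 0)
			break;
		if ((size_t)n >= room) {
			/* Truncated: only room - 1 characters were stored. */
			count = len - 1;
			break;
		}
		count += (size_t)n;
	}
	return count;
}
```

Because the remaining room is recomputed from count on every pass, the
size handed to snprintf() can never wrap, whatever the number of layers.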



* [PATCHv3] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 15:02                                                     ` [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
  2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
  2012-04-30 16:16                                                       ` Joe Perches
@ 2012-04-30 16:44                                                       ` Mauro Carvalho Chehab
  2 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 16:44 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Linux Edac Mailing List,
	Linux Kernel Mailing List

The edac_mc_alloc() routine allocates one dimm_info device for each
possible memory, including the non-filled ones. The debug messages
there are somewhat confusing. So, clean them up by moving the code
that prints the memory location to edac_mc, and using it in both
edac_mc_sysfs and edac_mc.

Also, only dump information when DIMMs/ranks are actually
filled.

After this patch, a dimm-based memory controller will print the debug
info as:

[ 1011.380027] EDAC DEBUG: edac_mc_dump_csrow: csrow->csrow_idx = 0
[ 1011.380029] EDAC DEBUG: edac_mc_dump_csrow:   csrow = ffff8801169be000
[ 1011.380031] EDAC DEBUG: edac_mc_dump_csrow:   csrow->first_page = 0x0
[ 1011.380032] EDAC DEBUG: edac_mc_dump_csrow:   csrow->last_page = 0x0
[ 1011.380034] EDAC DEBUG: edac_mc_dump_csrow:   csrow->page_mask = 0x0
[ 1011.380035] EDAC DEBUG: edac_mc_dump_csrow:   csrow->nr_channels = 3
[ 1011.380037] EDAC DEBUG: edac_mc_dump_csrow:   csrow->channels = ffff8801149c2840
[ 1011.380039] EDAC DEBUG: edac_mc_dump_csrow:   csrow->mci = ffff880117426000
[ 1011.380041] EDAC DEBUG: edac_mc_dump_channel:   channel->chan_idx = 0
[ 1011.380042] EDAC DEBUG: edac_mc_dump_channel:     channel = ffff8801149c2860
[ 1011.380044] EDAC DEBUG: edac_mc_dump_channel:     channel->csrow = ffff8801169be000
[ 1011.380046] EDAC DEBUG: edac_mc_dump_channel:     channel->dimm = ffff88010fe90400
...
[ 1011.380095] EDAC DEBUG: edac_mc_dump_dimm: dimm0: channel 0 slot 0 mapped as virtual row 0, chan 0
[ 1011.380097] EDAC DEBUG: edac_mc_dump_dimm:   dimm = ffff88010fe90400
[ 1011.380099] EDAC DEBUG: edac_mc_dump_dimm:   dimm->label = 'CPU#0Channel#0_DIMM#0'
[ 1011.380101] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
[ 1011.380103] EDAC DEBUG: edac_mc_dump_dimm:   dimm->grain = 8
[ 1011.380104] EDAC DEBUG: edac_mc_dump_dimm:   dimm->nr_pages = 0x40000
...

(a rank-based memory controller would print "rank?" instead of "dimm?"
 in the above debug info)

Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---
v3: Only existing dimms/channels/csrows are printed;
    Use unsigned for 'len' in edac_dimm_info_location;
    Break the loop in edac_dimm_info_location if len reaches zero;
    Some cosmetic changes.

 drivers/edac/edac_mc.c       |   95 +++++++++++++++++++++++++----------------
 drivers/edac/edac_mc_sysfs.c |   11 +----
 drivers/edac/edac_module.h   |    3 +
 3 files changed, 62 insertions(+), 47 deletions(-)

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index f48011f..0e5c345 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -40,44 +40,63 @@
 static DEFINE_MUTEX(mem_ctls_mutex);
 static LIST_HEAD(mc_devices);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+			         unsigned len)
+{
+	struct mem_ctl_info *mci = dimm->mci;
+	int i, n, count = 0;
+	char *p = buf;
+
+	for (i = 0; i < mci->n_layers; i++) {
+		n = snprintf(p, len, "%s %d ",
+			      edac_layer_name[mci->layers[i].type],
+			      dimm->location[i]);
+		p += n;
+		len -= n;
+		count += n;
+		if (!len)
+			break;
+	}
+
+	return count;
+}
+
 #ifdef CONFIG_EDAC_DEBUG
 
 static void edac_mc_dump_channel(struct rank_info *chan)
 {
-	edac_dbg(4, "\tchannel = %p\n", chan);
-	edac_dbg(4, "\tchannel->chan_idx = %d\n", chan->chan_idx);
-	edac_dbg(4, "\tchannel->csrow = %p\n", chan->csrow);
-	edac_dbg(4, "\tchannel->dimm = %p\n", chan->dimm);
+	edac_dbg(4, "  channel->chan_idx = %d\n", chan->chan_idx);
+	edac_dbg(4, "    channel = %p\n", chan);
+	edac_dbg(4, "    channel->csrow = %p\n", chan->csrow);
+	edac_dbg(4, "    channel->dimm = %p\n", chan->dimm);
 }
 
-static void edac_mc_dump_dimm(struct dimm_info *dimm)
+static void edac_mc_dump_dimm(struct dimm_info *dimm, int number)
 {
-	int i;
-
-	edac_dbg(4, "\tdimm = %p\n", dimm);
-	edac_dbg(4, "\tdimm->label = '%s'\n", dimm->label);
-	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
-	edac_dbg(4, "\tdimm location ");
-	for (i = 0; i < dimm->mci->n_layers; i++) {
-		printk(KERN_CONT "%d", dimm->location[i]);
-		if (i < dimm->mci->n_layers - 1)
-			printk(KERN_CONT ".");
-	}
-	printk(KERN_CONT "\n");
-	edac_dbg(4, "\tdimm->grain = %d\n", dimm->grain);
-	edac_dbg(4, "\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	char location[80];
+
+	edac_dimm_info_location(dimm, location, sizeof(location));
+
+	edac_dbg(4, "%s%i: %smapped as virtual row %d, chan %d\n",
+		 dimm->mci->mem_is_per_rank ? "rank" : "dimm",
+		 number, location, dimm->csrow, dimm->cschannel);
+	edac_dbg(4, "  dimm = %p\n", dimm);
+	edac_dbg(4, "  dimm->label = '%s'\n", dimm->label);
+	edac_dbg(4, "  dimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	edac_dbg(4, "  dimm->grain = %d\n", dimm->grain);
+	edac_dbg(4, "  dimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
 {
-	edac_dbg(4, "\tcsrow = %p\n", csrow);
-	edac_dbg(4, "\tcsrow->csrow_idx = %d\n", csrow->csrow_idx);
-	edac_dbg(4, "\tcsrow->first_page = 0x%lx\n", csrow->first_page);
-	edac_dbg(4, "\tcsrow->last_page = 0x%lx\n", csrow->last_page);
-	edac_dbg(4, "\tcsrow->page_mask = 0x%lx\n", csrow->page_mask);
-	edac_dbg(4, "\tcsrow->nr_channels = %d\n", csrow->nr_channels);
-	edac_dbg(4, "\tcsrow->channels = %p\n", csrow->channels);
-	edac_dbg(4, "\tcsrow->mci = %p\n", csrow->mci);
+	edac_dbg(4, "csrow->csrow_idx = %d\n", csrow->csrow_idx);
+	edac_dbg(4, "  csrow = %p\n", csrow);
+	edac_dbg(4, "  csrow->first_page = 0x%lx\n", csrow->first_page);
+	edac_dbg(4, "  csrow->last_page = 0x%lx\n", csrow->last_page);
+	edac_dbg(4, "  csrow->page_mask = 0x%lx\n", csrow->page_mask);
+	edac_dbg(4, "  csrow->nr_channels = %d\n", csrow->nr_channels);
+	edac_dbg(4, "  csrow->channels = %p\n", csrow->channels);
+	edac_dbg(4, "  csrow->mci = %p\n", csrow->mci);
 }
 
 static void edac_mc_dump_mci(struct mem_ctl_info *mci)
@@ -338,8 +357,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 	memset(&pos, 0, sizeof(pos));
 	row = 0;
 	chn = 0;
-	edac_dbg(4, "initializing %d %s\n",
-		 tot_dimms, per_rank ? "ranks" : "dimms");
 	for (i = 0; i < tot_dimms; i++) {
 		chan = mci->csrows[row]->channels[chn];
 		off = EDAC_DIMM_OFF(layer, n_layers, pos[0], pos[1], pos[2]);
@@ -352,10 +369,6 @@ struct mem_ctl_info *edac_mc_alloc(unsigned edac_index,
 		mci->dimms[off] = dimm;
 		dimm->mci = mci;
 
-		edac_dbg(2, "%d: %s%i (%d:%d:%d): row %d, chan %d\n",
-			 i, per_rank ? "rank" : "dimm", off,
-			 pos[0], pos[1], pos[2], row, chn);
-
 		/*
 		 * Copy DIMM location and initialize the memory location
 		 */
@@ -724,14 +737,22 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 		int i;
 
 		for (i = 0; i < mci->nr_csrows; i++) {
+			struct csrow_info *csrow = mci->csrows[i];
+			u32 nr_pages = 0;
 			int j;
 
-			edac_mc_dump_csrow(mci->csrows[i]);
-			for (j = 0; j < mci->csrows[i]->nr_channels; j++)
-				edac_mc_dump_channel(mci->csrows[i]->channels[j]);
+			for (j = 0; j < csrow->nr_channels; j++)
+				nr_pages += csrow->channels[j]->dimm->nr_pages;
+			if (!nr_pages)
+				continue;
+			edac_mc_dump_csrow(csrow);
+			for (j = 0; j < csrow->nr_channels; j++)
+				if (csrow->channels[j]->dimm->nr_pages)
+					edac_mc_dump_channel(csrow->channels[j]);
 		}
 		for (i = 0; i < mci->tot_dimms; i++)
-			edac_mc_dump_dimm(mci->dimms[i]);
+			if (mci->dimms[i]->nr_pages)
+				edac_mc_dump_dimm(mci->dimms[i], i);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
diff --git a/drivers/edac/edac_mc_sysfs.c b/drivers/edac/edac_mc_sysfs.c
index fe2d922..95865a0 100644
--- a/drivers/edac/edac_mc_sysfs.c
+++ b/drivers/edac/edac_mc_sysfs.c
@@ -485,17 +485,8 @@ static ssize_t dimmdev_location_show(struct device *dev,
 				     struct device_attribute *mattr, char *data)
 {
 	struct dimm_info *dimm = to_dimm(dev);
-	struct mem_ctl_info *mci = dimm->mci;
-	int i;
-	char *p = data;
-
-	for (i = 0; i < mci->n_layers; i++) {
-		p += sprintf(p, "%s %d ",
-			     edac_layer_name[mci->layers[i].type],
-			     dimm->location[i]);
-	}
 
-	return p - data;
+	return edac_dimm_info_location(dimm, data, PAGE_SIZE);
 }
 
 static ssize_t dimmdev_label_show(struct device *dev,
diff --git a/drivers/edac/edac_module.h b/drivers/edac/edac_module.h
index 1af1367..62de640 100644
--- a/drivers/edac/edac_module.h
+++ b/drivers/edac/edac_module.h
@@ -34,6 +34,9 @@ extern int edac_mc_get_panic_on_ue(void);
 extern int edac_get_poll_msec(void);
 extern int edac_mc_get_poll_msec(void);
 
+unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
+				 unsigned len);
+
 	/* on edac_device.c */
 extern int edac_device_register_sysfs_main_kobj(
 				struct edac_device_ctl_info *edac_dev);
-- 
1.7.8


* Re: [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages
  2012-04-30 16:16                                                       ` Joe Perches
@ 2012-04-30 16:47                                                         ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-04-30 16:47 UTC (permalink / raw)
  To: Joe Perches
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List, Doug Thompson

Em 30-04-2012 13:16, Joe Perches escreveu:
> On Mon, 2012-04-30 at 12:02 -0300, Mauro Carvalho Chehab wrote:
>> The edac_mc_alloc() routine allocates one dimm_info device for all
>> possible memories, including the non-filled ones. The debug messages
>> there are somewhat confusing. So, cleans them, by moving the code
>> that prints the memory location to edac_mc, and using it on both
>> edac_mc_sysfs and edac_mc.
> []
>> diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
> []
>> @@ -40,6 +40,25 @@
>>  static DEFINE_MUTEX(mem_ctls_mutex);
>>  static LIST_HEAD(mc_devices);
>>  
>> +unsigned edac_dimm_info_location(struct dimm_info *dimm, char *buf,
>> +			         int len)
>> +{
>> +	struct mem_ctl_info *mci = dimm->mci;
>> +	int i, n, count = 0;
>> +	char *p = buf;
>> +
>> +	for (i = 0; i < mci->n_layers; i++) {
>> +		n = snprintf(p, len, "%s %d ",
>> +			      edac_layer_name[mci->layers[i].type],
>> +			      dimm->location[i]);
>> +		p += n;
>> +		len -= n;
>> +		count += n;
>> +	}
> 
> I believe this snprintf can be unsafe
> when the buffer length len is exceeded.
> 
> if len is negative, it's promoted to size_t
> and continues to write into p.

Ok, I've changed it to unsigned in the new version and added a break
to prevent this condition from happening. Version 3, with this fix and some
other improvements, was sent.

Yet, this condition will never happen in practice, as the edac core
currently supports only 3 layers.

Regards,
Mauro.

* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
@ 2012-05-07 14:31       ` Borislav Petkov
  2012-05-07 16:12         ` Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-05-07 14:31 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Borislav Petkov

Pasting latest version here:

> From bdd1d4a73e48676e1aaab0dc41fca81e42d5e644 Mon Sep 17 00:00:00 2001
> From: Mauro Carvalho Chehab <mchehab@redhat.com>
> Date: Mon, 16 Apr 2012 15:03:50 -0300
> Subject: [PATCH] amd64_edac: convert driver to use the new edac ABI
> 
> The legacy edac ABI is going to be removed. Port the driver to use
> and benefit from the new API functionality.
> 
> Cc: Doug Thompson <norsk5@yahoo.com>
> Cc: Borislav Petkov <borislav.petkov@amd.com>
> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

Btw,

now when I inject a correctable ECC error, I get:

[ 2377.733708] [Hardware Error]: CPU:0     MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x945f4000b1080a13
[ 2377.742143] [Hardware Error]:        MC4_ADDR: 0x000000005bac7388
[ 2377.742151] [Hardware Error]: Northbridge Error (node 0): DRAM ECC error detected on the NB.
[ 2377.742164] EDAC amd64 MC0: CE ERROR_ADDRESS= 0x5bac7388
[ 2377.742172] EDAC DEBUG: f1x_match_to_this_node: (range 0) SystemAddr= 0x5bac7388 Limit=0x437ffffff
[ 2377.742183] EDAC DEBUG: f1x_match_to_this_node:    Normalized DCT addr: 0x2dd63980
[ 2377.742190] EDAC DEBUG: f1x_lookup_addr_in_dct: input addr: 0x2dd63980, DCT: 1
[ 2377.742199] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=0 CSBase=0x0 CSMask=0xffffffe1fff9ffff
[ 2377.742206] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x0
[ 2377.742215] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=1 CSBase=0x20000 CSMask=0xffffffe1fff9ffff
[ 2377.742223] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x20000
[ 2377.742231] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=2 CSBase=0x40000 CSMask=0xffffffe1fff9ffff
[ 2377.742239] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x40000
[ 2377.742247] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=3 CSBase=0x60000 CSMask=0xffffffe1fff9ffff
[ 2377.742255] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x60000
[ 2377.742262] EDAC DEBUG: f1x_lookup_addr_in_dct:  MATCH csrow=3
[ 2377.742279] EDAC MC0: CE amd64_edac on unknown memory (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
					  ^^^^^^^^^^^^^^

which is misleading. I think that removing the line from
edac_mc_handle_error() is better:

-		if (p == label)
-			strcpy(label, "unknown memory");

because this way we don't puzzle the user even more with EDAC-internal
details of how we represent DIMMs and ranks etc.

IOW, simply having:

[ 2377.742279] EDAC MC0: CE amd64_edac (csrow:3 channel:1 page:0x5bac7 offset:0x388 grain:0 syndrome:0xb1be)

is much better IMO.

Also, formatting that info with ":" makes the data even more easily
understandable and parseable.
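
To illustrate the parseability point, here is a minimal userspace sketch of
pulling one field out of the proposed "key:value" location string. It is only
an illustration: edac_field() is a hypothetical helper, not an existing EDAC
or edac-utils function, and it assumes the fields are separated by single
spaces as in the sample line above.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of EDAC): find "key:value" in a
 * space-separated string and parse the value as a number. Base 0 lets
 * strtoul() accept both decimal ("3") and hex ("0x388") values.
 * Returns 0 on success, -1 if the key is absent or has no numeric value.
 */
static int edac_field(const char *s, const char *key, unsigned long *val)
{
	size_t klen = strlen(key);

	while (*s) {
		/* Skip the separators between tokens */
		while (*s == ' ')
			s++;

		/* A token matches only if the key is followed by ':' */
		if (!strncmp(s, key, klen) && s[klen] == ':') {
			const char *start = s + klen + 1;
			char *end;

			*val = strtoul(start, &end, 0);
			return end == start ? -1 : 0;
		}

		/* Advance to the end of this token */
		while (*s && *s != ' ')
			s++;
	}
	return -1;
}
```

With the old space-separated format, a parser has to know in advance which
words are names and which are values; with ":" each token carries its own
name, so a lookup by key is enough.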

Also, you have a trailing space at the end: "... syndrome 0xb1be <---HERE)

which needs to be removed.

[ 2377.742288] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: RES (no timeout)


Other than that, I have only a minor nitpick below.


> ---
>  drivers/edac/amd64_edac.c |  137 ++++++++++++++++++++++++++++++---------------
>  1 file changed, 92 insertions(+), 45 deletions(-)
> 
> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
> index 6d6ec6814a98..d00d75a602c1 100644
> --- a/drivers/edac/amd64_edac.c
> +++ b/drivers/edac/amd64_edac.c
> @@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	int channel, csrow;
>  	u32 page, offset;
>  
> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
> +
> +	/*
> +	 * Find out which node the error address belongs to. This may be
> +	 * different from the node that detected the error.
> +	 */
> +	src_mci = find_mc_by_sys_addr(mci, sys_addr);
> +	if (!src_mci) {
> +		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
> +			     (unsigned long)sys_addr);
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a node",
> +				     NULL);
> +		return;
> +	}
> +
> +	/* Now map the sys_addr to a CSROW */
> +	csrow = sys_addr_to_csrow(src_mci, sys_addr);
> +	if (csrow < 0) {
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a csrow",
> +				     NULL);
> +		return;
> +	}
> +
>  	/* CHIPKILL enabled */
>  	if (pvt->nbcfg & NBCFG_CHIPKILL) {
>  		channel = get_channel_from_ecc_syndrome(mci, syndrome);
> @@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  			 * 2 DIMMs is in error. So we need to ID 'both' of them
>  			 * as suspect.
>  			 */
> -			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
> -					   "error reporting race\n", syndrome);
> -			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> +			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
> +				      "possible error reporting race\n",
> +				      syndrome);
> +			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +					     page, offset, syndrome,
> +					     csrow, -1, -1,
> +					     EDAC_MOD_STR,
> +					     "unknown syndrome - possible error reporting race",
> +					     NULL);
>  			return;
>  		}
>  	} else {
> @@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  		channel = ((sys_addr & BIT(3)) != 0);
>  	}
>  
> -	/*
> -	 * Find out which node the error address belongs to. This may be
> -	 * different from the node that detected the error.
> -	 */
> -	src_mci = find_mc_by_sys_addr(mci, sys_addr);
> -	if (!src_mci) {
> -		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
> -			     (unsigned long)sys_addr);
> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> -		return;
> -	}
> -
> -	/* Now map the sys_addr to a CSROW */
> -	csrow = sys_addr_to_csrow(src_mci, sys_addr);
> -	if (csrow < 0) {
> -		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
> -	} else {
> -		error_address_to_page_and_offset(sys_addr, &page, &offset);
> -
> -		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
> -				  channel, EDAC_MOD_STR);
> -	}
> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
> +			     page, offset, syndrome,
> +			     csrow, channel, -1,
> +			     EDAC_MOD_STR, "", NULL);
>  }
>  
>  static int ddr2_cs_size(unsigned i, bool dct_width)
> @@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	u32 page, offset;
>  	int nid, csrow, chan = 0;
>  
> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
> +
>  	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
>  
>  	if (csrow < 0) {
> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				     page, offset, syndrome,
> +				     -1, -1, -1,
> +				     EDAC_MOD_STR,
> +				     "failed to map error addr to a csrow",
> +				     NULL);
>  		return;
>  	}
>  
> -	error_address_to_page_and_offset(sys_addr, &page, &offset);
> -
>  	/*
>  	 * We need the syndromes for channel detection only when we're
>  	 * ganged. Otherwise @chan should already contain the channel at
> @@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>  	if (dct_ganging_enabled(pvt))
>  		chan = get_channel_from_ecc_syndrome(mci, syndrome);
>  
> -	if (chan >= 0)
> -		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
> -				  EDAC_MOD_STR);
> -	else
> -		/*
> -		 * Channel unknown, report all channels on this CSROW as failed.
> -		 */
> -		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
> -			edac_mc_handle_ce(mci, page, offset, syndrome,
> -					  csrow, chan, EDAC_MOD_STR);
> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
> +				page, offset, syndrome,
> +				csrow, chan, -1,
> +				EDAC_MOD_STR, "", NULL);

Strange alignment.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-05-07 14:31       ` Borislav Petkov
@ 2012-05-07 16:12         ` Mauro Carvalho Chehab
  2012-05-07 16:17           ` Borislav Petkov
  2012-05-07 16:24           ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
  0 siblings, 2 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-05-07 16:12 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Borislav Petkov

Em 07-05-2012 11:31, Borislav Petkov escreveu:
> Pasting latest version here:
> 
>> From bdd1d4a73e48676e1aaab0dc41fca81e42d5e644 Mon Sep 17 00:00:00 2001
>> From: Mauro Carvalho Chehab <mchehab@redhat.com>
>> Date: Mon, 16 Apr 2012 15:03:50 -0300
>> Subject: [PATCH] amd64_edac: convert driver to use the new edac ABI
>>
>> The legacy edac ABI is going to be removed. Port the driver to use
>> and benefit from the new API functionality.
>>
>> Cc: Doug Thompson <norsk5@yahoo.com>
>> Cc: Borislav Petkov <borislav.petkov@amd.com>
>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
> 
> Btw,
> 
> now when I inject a correctable ECC error, I get:
> 
> [ 2377.733708] [Hardware Error]: CPU:0     MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x945f4000b1080a13
> [ 2377.742143] [Hardware Error]:        MC4_ADDR: 0x000000005bac7388
> [ 2377.742151] [Hardware Error]: Northbridge Error (node 0): DRAM ECC error detected on the NB.
> [ 2377.742164] EDAC amd64 MC0: CE ERROR_ADDRESS= 0x5bac7388
> [ 2377.742172] EDAC DEBUG: f1x_match_to_this_node: (range 0) SystemAddr= 0x5bac7388 Limit=0x437ffffff
> [ 2377.742183] EDAC DEBUG: f1x_match_to_this_node:    Normalized DCT addr: 0x2dd63980
> [ 2377.742190] EDAC DEBUG: f1x_lookup_addr_in_dct: input addr: 0x2dd63980, DCT: 1
> [ 2377.742199] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=0 CSBase=0x0 CSMask=0xffffffe1fff9ffff
> [ 2377.742206] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x0
> [ 2377.742215] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=1 CSBase=0x20000 CSMask=0xffffffe1fff9ffff
> [ 2377.742223] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x20000
> [ 2377.742231] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=2 CSBase=0x40000 CSMask=0xffffffe1fff9ffff
> [ 2377.742239] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x40000
> [ 2377.742247] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=3 CSBase=0x60000 CSMask=0xffffffe1fff9ffff
> [ 2377.742255] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x60000
> [ 2377.742262] EDAC DEBUG: f1x_lookup_addr_in_dct:  MATCH csrow=3
> [ 2377.742279] EDAC MC0: CE amd64_edac on unknown memory (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
> 					  ^^^^^^^^^^^^^^
> 
> which is misleading. I think that removing the line from
> edac_mc_handle_error() is better:
> 
> -		if (p == label)
> -			strcpy(label, "unknown memory");
> 
> because this way we don't puzzle the user even more with EDAC-internal
> details of how we represent DIMMs and ranks etc.

That happens because the EDAC core couldn't find any EDAC labels for the affected
memory.

There are two reasons for seeing "unknown memory":
	1) there's no label information. This is fixed by:
	   http://git.kernel.org/?p=linux/kernel/git/mchehab/linux-edac.git;a=commit;h=b14dbb9b8220f8e07634125bf0e315f739cbf501
	2) there's no info about the label, e.g. something like the old
	   edac_mc_handle_ce_no_info().

As the csrow and channel info is filled in your log, it is very likely
caused by (1), and you didn't load the labels for this system with edac-ctl.

If you had a table with the labels for your motherboard at /etc/edac/labels.db
and ran "edac-ctl --load" during system init (that's the default on RHEL/Fedora,
not sure what other distros do), the message would look like:

EDAC MC0: CE amd64_edac on DIMM_4B (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
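
To illustrate case (1), here is a minimal userspace sketch of how the fallback
arises. It is not the kernel code: struct fake_dimm and build_label() are
hypothetical names made up for this example, and only the final "p == label"
check mirrors the two lines quoted above.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a DIMM entry; not the kernel struct */
struct fake_dimm {
	int csrow, channel;
	const char *label;	/* empty string when no label was loaded */
};

/*
 * Build the label for an error at (csrow, channel) by appending the label
 * of every matching DIMM. If nothing was appended, the write pointer still
 * equals the start of the buffer and "unknown memory" is used instead.
 */
static void build_label(const struct fake_dimm *dimms, int n,
			int csrow, int channel, char *label, size_t len)
{
	char *p = label;
	int i;

	*p = '\0';
	for (i = 0; i < n; i++) {
		size_t room = len - (size_t)(p - label);
		int w;

		if (dimms[i].csrow != csrow || dimms[i].channel != channel)
			continue;
		if (!*dimms[i].label)
			continue;	/* slot matches, but no label loaded */

		w = snprintf(p, room, "%s%s",
			     p != label ? " or " : "", dimms[i].label);
		if (w < 0 || (size_t)w >= room) {
			*p = '\0';	/* drop a truncated entry */
			break;
		}
		p += w;
	}

	/* Nothing was appended: same condition as the quoted "p == label" */
	if (p == label)
		snprintf(label, len, "unknown memory");
}
```

In this model, a DIMM at csrow 3, channel 1 with an empty label yields
"unknown memory" even though the location is fully known, which matches the
log above; once a label such as DIMM_4B is loaded, that name is used instead.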

> IOW, simply having:
> 
> [ 2377.742279] EDAC MC0: CE amd64_edac (csrow:3 channel:1 page:0x5bac7 offset:0x388 grain:0 syndrome:0xb1be)
> 
> is much better IMO.
> 
> Also, formatting that info with ":" makes the data even more easily
> understandable and parseable.

Ok. I'll write a patch to replace it at the end of the series.

> Also, you have a trailing space at the end: "... syndrome 0xb1be <---HERE)
> 
> which needs to be removed.

I'll also do it in the printk cleanup patch at the end of the series.

> 
> [ 2377.742288] [Hardware Error]: cache level: L3/GEN, mem/io: MEM, mem-tx: RD, part-proc: RES (no timeout)
> 
> 
> Other than that, I have only a minor nitpick below.
> 
> 
>> ---
>>  drivers/edac/amd64_edac.c |  137 ++++++++++++++++++++++++++++++---------------
>>  1 file changed, 92 insertions(+), 45 deletions(-)
>>
>> diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
>> index 6d6ec6814a98..d00d75a602c1 100644
>> --- a/drivers/edac/amd64_edac.c
>> +++ b/drivers/edac/amd64_edac.c
>> @@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>>  	int channel, csrow;
>>  	u32 page, offset;
>>  
>> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
>> +
>> +	/*
>> +	 * Find out which node the error address belongs to. This may be
>> +	 * different from the node that detected the error.
>> +	 */
>> +	src_mci = find_mc_by_sys_addr(mci, sys_addr);
>> +	if (!src_mci) {
>> +		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
>> +			     (unsigned long)sys_addr);
>> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +				     page, offset, syndrome,
>> +				     -1, -1, -1,
>> +				     EDAC_MOD_STR,
>> +				     "failed to map error addr to a node",
>> +				     NULL);
>> +		return;
>> +	}
>> +
>> +	/* Now map the sys_addr to a CSROW */
>> +	csrow = sys_addr_to_csrow(src_mci, sys_addr);
>> +	if (csrow < 0) {
>> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +				     page, offset, syndrome,
>> +				     -1, -1, -1,
>> +				     EDAC_MOD_STR,
>> +				     "failed to map error addr to a csrow",
>> +				     NULL);
>> +		return;
>> +	}
>> +
>>  	/* CHIPKILL enabled */
>>  	if (pvt->nbcfg & NBCFG_CHIPKILL) {
>>  		channel = get_channel_from_ecc_syndrome(mci, syndrome);
>> @@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>>  			 * 2 DIMMs is in error. So we need to ID 'both' of them
>>  			 * as suspect.
>>  			 */
>> -			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
>> -					   "error reporting race\n", syndrome);
>> -			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
>> +			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
>> +				      "possible error reporting race\n",
>> +				      syndrome);
>> +			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +					     page, offset, syndrome,
>> +					     csrow, -1, -1,
>> +					     EDAC_MOD_STR,
>> +					     "unknown syndrome - possible error reporting race",
>> +					     NULL);
>>  			return;
>>  		}
>>  	} else {
>> @@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>>  		channel = ((sys_addr & BIT(3)) != 0);
>>  	}
>>  
>> -	/*
>> -	 * Find out which node the error address belongs to. This may be
>> -	 * different from the node that detected the error.
>> -	 */
>> -	src_mci = find_mc_by_sys_addr(mci, sys_addr);
>> -	if (!src_mci) {
>> -		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
>> -			     (unsigned long)sys_addr);
>> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
>> -		return;
>> -	}
>> -
>> -	/* Now map the sys_addr to a CSROW */
>> -	csrow = sys_addr_to_csrow(src_mci, sys_addr);
>> -	if (csrow < 0) {
>> -		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
>> -	} else {
>> -		error_address_to_page_and_offset(sys_addr, &page, &offset);
>> -
>> -		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
>> -				  channel, EDAC_MOD_STR);
>> -	}
>> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
>> +			     page, offset, syndrome,
>> +			     csrow, channel, -1,
>> +			     EDAC_MOD_STR, "", NULL);
>>  }
>>  
>>  static int ddr2_cs_size(unsigned i, bool dct_width)
>> @@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>>  	u32 page, offset;
>>  	int nid, csrow, chan = 0;
>>  
>> +	error_address_to_page_and_offset(sys_addr, &page, &offset);
>> +
>>  	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
>>  
>>  	if (csrow < 0) {
>> -		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
>> +		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +				     page, offset, syndrome,
>> +				     -1, -1, -1,
>> +				     EDAC_MOD_STR,
>> +				     "failed to map error addr to a csrow",
>> +				     NULL);
>>  		return;
>>  	}
>>  
>> -	error_address_to_page_and_offset(sys_addr, &page, &offset);
>> -
>>  	/*
>>  	 * We need the syndromes for channel detection only when we're
>>  	 * ganged. Otherwise @chan should already contain the channel at
>> @@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
>>  	if (dct_ganging_enabled(pvt))
>>  		chan = get_channel_from_ecc_syndrome(mci, syndrome);
>>  
>> -	if (chan >= 0)
>> -		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
>> -				  EDAC_MOD_STR);
>> -	else
>> -		/*
>> -		 * Channel unknown, report all channels on this CSROW as failed.
>> -		 */
>> -		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
>> -			edac_mc_handle_ce(mci, page, offset, syndrome,
>> -					  csrow, chan, EDAC_MOD_STR);
>> +	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
>> +				page, offset, syndrome,
>> +				csrow, chan, -1,
>> +				EDAC_MOD_STR, "", NULL);
> 
> Strange alignment.

Alignment fixed below. Infradead tree updated.

Thanks,
Mauro

-

amd64_edac: convert driver to use the new edac ABI

From: Mauro Carvalho Chehab <mchehab@redhat.com>

The legacy edac ABI is going to be removed. Port the driver to use
and benefit from the new API functionality.

Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 6d6ec68..ede3895 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -1039,6 +1039,37 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	int channel, csrow;
 	u32 page, offset;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
+	/*
+	 * Find out which node the error address belongs to. This may be
+	 * different from the node that detected the error.
+	 */
+	src_mci = find_mc_by_sys_addr(mci, sys_addr);
+	if (!src_mci) {
+		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
+			     (unsigned long)sys_addr);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a node",
+				     NULL);
+		return;
+	}
+
+	/* Now map the sys_addr to a CSROW */
+	csrow = sys_addr_to_csrow(src_mci, sys_addr);
+	if (csrow < 0) {
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
+		return;
+	}
+
 	/* CHIPKILL enabled */
 	if (pvt->nbcfg & NBCFG_CHIPKILL) {
 		channel = get_channel_from_ecc_syndrome(mci, syndrome);
@@ -1048,9 +1079,15 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 			 * 2 DIMMs is in error. So we need to ID 'both' of them
 			 * as suspect.
 			 */
-			amd64_mc_warn(mci, "unknown syndrome 0x%04x - possible "
-					   "error reporting race\n", syndrome);
-			edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+			amd64_mc_warn(src_mci, "unknown syndrome 0x%04x - "
+				      "possible error reporting race\n",
+				      syndrome);
+			edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+					     page, offset, syndrome,
+					     csrow, -1, -1,
+					     EDAC_MOD_STR,
+					     "unknown syndrome - possible error reporting race",
+					     NULL);
 			return;
 		}
 	} else {
@@ -1065,28 +1102,10 @@ static void k8_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 		channel = ((sys_addr & BIT(3)) != 0);
 	}
 
-	/*
-	 * Find out which node the error address belongs to. This may be
-	 * different from the node that detected the error.
-	 */
-	src_mci = find_mc_by_sys_addr(mci, sys_addr);
-	if (!src_mci) {
-		amd64_mc_err(mci, "failed to map error addr 0x%lx to a node\n",
-			     (unsigned long)sys_addr);
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
-		return;
-	}
-
-	/* Now map the sys_addr to a CSROW */
-	csrow = sys_addr_to_csrow(src_mci, sys_addr);
-	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(src_mci, EDAC_MOD_STR);
-	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-
-		edac_mc_handle_ce(src_mci, page, offset, syndrome, csrow,
-				  channel, EDAC_MOD_STR);
-	}
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, src_mci,
+			     page, offset, syndrome,
+			     csrow, channel, -1,
+			     EDAC_MOD_STR, "", NULL);
 }
 
 static int ddr2_cs_size(unsigned i, bool dct_width)
@@ -1568,15 +1587,20 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	u32 page, offset;
 	int nid, csrow, chan = 0;
 
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
+
 	csrow = f1x_translate_sysaddr_to_cs(pvt, sys_addr, &nid, &chan);
 
 	if (csrow < 0) {
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     page, offset, syndrome,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "failed to map error addr to a csrow",
+				     NULL);
 		return;
 	}
 
-	error_address_to_page_and_offset(sys_addr, &page, &offset);
-
 	/*
 	 * We need the syndromes for channel detection only when we're
 	 * ganged. Otherwise @chan should already contain the channel at
@@ -1585,16 +1609,10 @@ static void f1x_map_sysaddr_to_csrow(struct mem_ctl_info *mci, u64 sys_addr,
 	if (dct_ganging_enabled(pvt))
 		chan = get_channel_from_ecc_syndrome(mci, syndrome);
 
-	if (chan >= 0)
-		edac_mc_handle_ce(mci, page, offset, syndrome, csrow, chan,
-				  EDAC_MOD_STR);
-	else
-		/*
-		 * Channel unknown, report all channels on this CSROW as failed.
-		 */
-		for (chan = 0; chan < mci->csrows[csrow].nr_channels; chan++)
-			edac_mc_handle_ce(mci, page, offset, syndrome,
-					  csrow, chan, EDAC_MOD_STR);
+	edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			     page, offset, syndrome,
+			     csrow, chan, -1,
+			     EDAC_MOD_STR, "", NULL);
 }
 
 /*
@@ -1875,7 +1893,12 @@ static void amd64_handle_ce(struct mem_ctl_info *mci, struct mce *m)
 	/* Ensure that the Error Address is VALID */
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ce_no_info(mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
@@ -1899,11 +1922,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 
 	if (!(m->status & MCI_STATUS_ADDRV)) {
 		amd64_mc_err(mci, "HW has no ERROR_ADDRESS available\n");
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     0, 0, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "HW has no ERROR_ADDRESS available",
+				     NULL);
 		return;
 	}
 
 	sys_addr = get_error_address(m);
+	error_address_to_page_and_offset(sys_addr, &page, &offset);
 
 	/*
 	 * Find out which node the error address belongs to. This may be
@@ -1913,7 +1942,11 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (!src_mci) {
 		amd64_mc_err(mci, "ERROR ADDRESS (0x%lx) NOT mapped to a MC\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to a MC", NULL);
 		return;
 	}
 
@@ -1923,10 +1956,17 @@ static void amd64_handle_ue(struct mem_ctl_info *mci, struct mce *m)
 	if (csrow < 0) {
 		amd64_mc_err(mci, "ERROR_ADDRESS (0x%lx) NOT mapped to CS\n",
 				  (unsigned long)sys_addr);
-		edac_mc_handle_ue_no_info(log_mci, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     -1, -1, -1,
+				     EDAC_MOD_STR,
+				     "ERROR ADDRESS NOT mapped to CS",
+				     NULL);
 	} else {
-		error_address_to_page_and_offset(sys_addr, &page, &offset);
-		edac_mc_handle_ue(log_mci, page, offset, csrow, EDAC_MOD_STR);
+		edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+				     page, offset, 0,
+				     csrow, -1, -1,
+				     EDAC_MOD_STR, "", NULL);
 	}
 }
 
@@ -2486,6 +2526,7 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 	struct amd64_pvt *pvt = NULL;
 	struct amd64_family_type *fam_type = NULL;
 	struct mem_ctl_info *mci = NULL;
+	struct edac_mc_layer layers[2];
 	int err = 0, ret;
 	u8 nid = get_node_id(F2);
 
@@ -2520,7 +2561,13 @@ static int amd64_init_one_instance(struct pci_dev *F2)
 		goto err_siblings;
 
 	ret = -ENOMEM;
-	mci = edac_mc_alloc(0, pvt->csels[0].b_cnt, pvt->channel_count, nid);
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = pvt->csels[0].b_cnt;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = pvt->channel_count;
+	layers[1].is_virt_csrow = false;
+	mci = new_edac_mc_alloc(nid, ARRAY_SIZE(layers), layers, 0);
 	if (!mci)
 		goto err_siblings;
 


^ permalink raw reply related	[flat|nested] 206+ messages in thread

* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-05-07 16:12         ` Mauro Carvalho Chehab
@ 2012-05-07 16:17           ` Borislav Petkov
  2012-05-07 16:59             ` Mauro Carvalho Chehab
  2012-05-07 16:24           ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
  1 sibling, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-05-07 16:17 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Borislav Petkov

On Mon, May 07, 2012 at 01:12:24PM -0300, Mauro Carvalho Chehab wrote:
> That happens because the EDAC core couldn't find any EDAC labels for the affected
> memory.
> 
> There are two reasons for seeing "unknown memory":
> 	1) there's no label information. This is fixed by:
> 	   http://git.kernel.org/?p=linux/kernel/git/mchehab/linux-edac.git;a=commit;h=b14dbb9b8220f8e07634125bf0e315f739cbf501
> 	2) there's no info about the label, e.g. something like the old
> 	   edac_mc_handle_ce_no_info().
> 
> As the csrow and channel info is filled in your log, it is very likely
> caused by (1), and you didn't load the labels for this system with edac-ctl.
> 
> If you had a table with the labels for your motherboard at /etc/edac/labels.db
> and ran "edac-ctl --load" during system init (that's the default on RHEL/Fedora,
> not sure what other distros do), the message would look like:
> 
> EDAC MC0: CE amd64_edac on DIMM_4B (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
> 
> > IOW, simply having:
> > 
> > [ 2377.742279] EDAC MC0: CE amd64_edac (csrow:3 channel:1 page:0x5bac7 offset:0x388 grain:0 syndrome:0xb1be)
> > 
> > is much better IMO.
> > 
> > Also, formatting that info with ":" makes the data even more easily
> > understandable and parseable.
> 
> Ok. I'll write a patch to replace it at the end of the series.

Why not rebase this patch? Your tree is not merged anywhere yet.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551

^ permalink raw reply	[flat|nested] 206+ messages in thread

* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-05-07 16:12         ` Mauro Carvalho Chehab
  2012-05-07 16:17           ` Borislav Petkov
@ 2012-05-07 16:24           ` Mauro Carvalho Chehab
  1 sibling, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-05-07 16:24 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Borislav Petkov

Em 07-05-2012 13:12, Mauro Carvalho Chehab escreveu:
> Em 07-05-2012 11:31, Borislav Petkov escreveu:
>> Pasting latest version here:
>>
>>> From bdd1d4a73e48676e1aaab0dc41fca81e42d5e644 Mon Sep 17 00:00:00 2001
>>> From: Mauro Carvalho Chehab <mchehab@redhat.com>
>>> Date: Mon, 16 Apr 2012 15:03:50 -0300
>>> Subject: [PATCH] amd64_edac: convert driver to use the new edac ABI
>>>
>>> The legacy edac ABI is going to be removed. Port the driver to use
>>> and benefit from the new API functionality.
>>>
>>> Cc: Doug Thompson <norsk5@yahoo.com>
>>> Cc: Borislav Petkov <borislav.petkov@amd.com>
>>> Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
>>
>> Btw,
>>
>> now when I inject a correctable ECC error, I get:
>>
>> [ 2377.733708] [Hardware Error]: CPU:0     MC4_STATUS[-|CE|-|-|AddrV|CECC]: 0x945f4000b1080a13
>> [ 2377.742143] [Hardware Error]:        MC4_ADDR: 0x000000005bac7388
>> [ 2377.742151] [Hardware Error]: Northbridge Error (node 0): DRAM ECC error detected on the NB.
>> [ 2377.742164] EDAC amd64 MC0: CE ERROR_ADDRESS= 0x5bac7388
>> [ 2377.742172] EDAC DEBUG: f1x_match_to_this_node: (range 0) SystemAddr= 0x5bac7388 Limit=0x437ffffff
>> [ 2377.742183] EDAC DEBUG: f1x_match_to_this_node:    Normalized DCT addr: 0x2dd63980
>> [ 2377.742190] EDAC DEBUG: f1x_lookup_addr_in_dct: input addr: 0x2dd63980, DCT: 1
>> [ 2377.742199] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=0 CSBase=0x0 CSMask=0xffffffe1fff9ffff
>> [ 2377.742206] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x0
>> [ 2377.742215] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=1 CSBase=0x20000 CSMask=0xffffffe1fff9ffff
>> [ 2377.742223] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x20000
>> [ 2377.742231] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=2 CSBase=0x40000 CSMask=0xffffffe1fff9ffff
>> [ 2377.742239] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x40000
>> [ 2377.742247] EDAC DEBUG: f1x_lookup_addr_in_dct:     CSROW=3 CSBase=0x60000 CSMask=0xffffffe1fff9ffff
>> [ 2377.742255] EDAC DEBUG: f1x_lookup_addr_in_dct:     (InputAddr & ~CSMask)=0x60000 (CSBase & ~CSMask)=0x60000
>> [ 2377.742262] EDAC DEBUG: f1x_lookup_addr_in_dct:  MATCH csrow=3
>> [ 2377.742279] EDAC MC0: CE amd64_edac on unknown memory (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
>> 					  ^^^^^^^^^^^^^^
>>
>> which is misleading. I think that removing the line from
>> edac_mc_handle_error() is better:
>>
>> -		if (p == label)
>> -			strcpy(label, "unknown memory");
>>
>> because this way we don't puzzle the user even more with EDAC-internal
>> details of how we represent DIMMs and ranks etc.
> 
> That happens because the EDAC core couldn't find any EDAC labels for the affected
> memory.
> 
> There are two reasons for seeing "unknown memory":
> 	1) there's no label information. This is fixed by:
> 	   http://git.kernel.org/?p=linux/kernel/git/mchehab/linux-edac.git;a=commit;h=b14dbb9b8220f8e07634125bf0e315f739cbf501
> 	2) there's no info about the label, e.g. something like the old
> 	   edac_mc_handle_ce_no_info().
> 
> As the csrow and channel info is filled in your log, it is very likely
> caused by (1), and you didn't load the labels for this system with edac-ctl.
> 
> If you had a table with the labels for your motherboard at /etc/edac/labels.db
> and ran "edac-ctl --load" during system init (that's the default on RHEL/Fedora,
> not sure what other distros do), the message would look like:
> 
> EDAC MC0: CE amd64_edac on DIMM_4B (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
> 
>> IOW, simply having:
>>
>> [ 2377.742279] EDAC MC0: CE amd64_edac (csrow:3 channel:1 page:0x5bac7 offset:0x388 grain:0 syndrome:0xb1be)
>>
>> is much better IMO.
>>
>> Also, formatting that info with ":" makes the data even more easily
>> understandable and parseable.
> 
> Ok. I'll write a patch to replace it at the end of the series.
> 
>> Also, you have a trailing space at the end: "... syndrome 0xb1be <---HERE)
>>
>> which needs to be removed.
> 
> I'll also do it in the printk cleanup patch at the end of the series.

edac: Improve EDAC handle error logs

From: Mauro Carvalho Chehab <mchehab@redhat.com>

As suggested by Borislav:
- Instead of using a space to separate a field from its value,
  use ':';
- when no driver-specific error details are provided via the
  "other_detail" field, don't add an extra space.

Reported-by: Borislav Petkov <bp@amd64.org>
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>

diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 4f4953c..0adbfa9 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -951,11 +951,18 @@ static void edac_ce_error(struct mem_ctl_info *mci,
 {
 	unsigned long remapped_page;
 
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-				"CE %s on %s (%s%s %s)\n",
-				msg, label, location,
-				detail, other_detail);
+	if (edac_mc_get_log_ce()) {
+		if (other_detail && *other_detail)
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s - %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		else
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s)\n",
+				       msg, label, location,
+				       detail);
+	}
 	edac_inc_ce_error(mci, enable_per_layer_report, pos);
 
 	if (mci->scrub_mode & SCRUB_SW_SRC) {
@@ -988,14 +995,26 @@ static void edac_ue_error(struct mem_ctl_info *mci,
 			  const char *other_detail,
 			  const bool enable_per_layer_report)
 {
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE %s on %s (%s%s %s)\n",
-			msg, label, location, detail, other_detail);
+	if (edac_mc_get_log_ue()) {
+		if (other_detail && *other_detail)
+			edac_mc_printk(mci, KERN_WARNING,
+				       "UE %s on %s (%s%s - %s)\n",
+			               msg, label, location, detail,
+				       other_detail);
+		else
+			edac_mc_printk(mci, KERN_WARNING,
+				       "UE %s on %s (%s%s)\n",
+			               msg, label, location, detail);
+	}
 
-	if (edac_mc_get_panic_on_ue())
-		panic("UE %s on %s (%s%s %s)\n",
-			msg, label, location, detail, other_detail);
+	if (edac_mc_get_panic_on_ue()) {
+		if (other_detail && *other_detail)
+			panic("UE %s on %s (%s%s - %s)\n",
+			      msg, label, location, detail, other_detail);
+		else
+			panic("UE %s on %s (%s%s)\n",
+			      msg, label, location, detail);
+	}
 
 	edac_inc_ue_error(mci, enable_per_layer_report, pos);
 }
@@ -1139,7 +1158,7 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 		if (pos[i] < 0)
 			continue;
 
-		p += sprintf(p, "%s %d ",
+		p += sprintf(p, "%s:%d ",
 			     edac_layer_name[mci->layers[i].type],
 			     pos[i]);
 	}
@@ -1147,12 +1166,12 @@ void edac_mc_handle_error(const enum hw_event_mc_err_type type,
 	/* Memory type dependent details about the error */
 	if (type == HW_EVENT_ERR_CORRECTED)
 		snprintf(detail, sizeof(detail),
-			"page 0x%lx offset 0x%lx grain %d syndrome 0x%lx",
+			"page:0x%lx offset:0x%lx grain:%d syndrome:0x%lx",
 			page_frame_number, offset_in_page,
 			grain, syndrome);
 	else
 		snprintf(detail, sizeof(detail),
-			"page 0x%lx offset 0x%lx grain %d",
+			"page:0x%lx offset:0x%lx grain:%d",
 			page_frame_number, offset_in_page, grain);
 
 	/* Report the error via the trace interface */



* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-05-07 16:17           ` Borislav Petkov
@ 2012-05-07 16:59             ` Mauro Carvalho Chehab
  2012-05-07 19:49               ` Borislav Petkov
  0 siblings, 1 reply; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-05-07 16:59 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Linux Edac Mailing List, Linux Kernel Mailing List,
	Doug Thompson, Borislav Petkov

Em 07-05-2012 13:17, Borislav Petkov escreveu:
> On Mon, May 07, 2012 at 01:12:24PM -0300, Mauro Carvalho Chehab wrote:
>> That happens because the EDAC core couldn't find any EDAC labels for the affected
>> memory.
>>
>> There are two reasons for seeing a "unknown memory":
>> 	1) there's no label information. This is fixed by:
>> 	   http://git.kernel.org/?p=linux/kernel/git/mchehab/linux-edac.git;a=commit;h=b14dbb9b8220f8e07634125bf0e315f739cbf501
>> 	2) there's no info about the label e. g. something like the old 
>> 	   edac_mc_handle_ce_no_info().
>>
>> As csrow and channel info is filled on your log, it is very likely to be
>> caused by (1) and that you didn't load the labels for this system with edac-ctl.
>>
>> If you had a table with the labels for your motherboard at /etc/edac/labels.db
>> and run "edac-ctl --load" during your system init (that's the default on RHEL/Fedora,
>> not sure what other distros do), the message would be like:
>>
>> EDAC MC0: CE amd64_edac on DIMM_4B (csrow 3 channel 1 page 0x5bac7 offset 0x388 grain 0 syndrome 0xb1be )
>>
>>> IOW, simply having:
>>>
>>> [ 2377.742279] EDAC MC0: CE amd64_edac (csrow:3 channel:1 page:0x5bac7 offset:0x388 grain:0 syndrome:0xb1be)
>>>
>>> is much better IMO.
>>>
>>> Also, formatting that info with ":" makes the data even easily
>>> understandable and parseable.
>>
>> Ok. I'll write a patch to replace it at the end of the series.
> 
> Why not rebase this patch? Your tree is not merged anywhere yet.

This change has nothing to do with "amd64_edac: convert driver to use the new edac ABI":
it belongs to the previous patch.

It is possible to merge it with the previous patch that made the changes
at the edac_mc core, but that would require re-reviewing that patch, and would
lose the history about this specific change.

If all you want is to have this change applied before the amd64 patch, for you
to better test it, I've re-based it on my experimental tree to be between
the two patches:
	http://git.infradead.org/users/mchehab/edac.git/commitdiff/3f4610491cc58dda8c737378212559dce77adde2

Anyway, if you still prefer that I merge it with "edac: Change internal representation to work with layers"
patch, I can do it.

Regards,
Mauro


* Re: [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI
  2012-05-07 16:59             ` Mauro Carvalho Chehab
@ 2012-05-07 19:49               ` Borislav Petkov
  2012-05-08 13:51                 ` [PATCH] edac: Change internal representation to work with layers Mauro Carvalho Chehab
  0 siblings, 1 reply; 206+ messages in thread
From: Borislav Petkov @ 2012-05-07 19:49 UTC (permalink / raw)
  To: Mauro Carvalho Chehab
  Cc: Borislav Petkov, Linux Edac Mailing List,
	Linux Kernel Mailing List, Doug Thompson

On Mon, May 07, 2012 at 01:59:24PM -0300, Mauro Carvalho Chehab wrote:
> This change has nothing to do with "amd64_edac: convert driver to use the new edac ABI":
> it belongs to the previous patch.
> 
> It is possible to merge it with the previous patch that made the changes
> at the edac_mc core, but that would require re-reviewing that patch, and would
> lose the history about this specific change.

That's fine - the change is simple enough to merge it in the respective patch.

> If all you want is to have this change applied before the amd64 patch, for you
> to better test it, I've re-based it on my experimental tree to be between
> the two patches:
> 	http://git.infradead.org/users/mchehab/edac.git/commitdiff/3f4610491cc58dda8c737378212559dce77adde2
> 
> Anyway, if you still prefer that I merge it with "edac: Change internal representation to work with layers"
> patch, I can do it.

Yes please, this way you have only the patchset and not the patchset + n
fixup patches at the end.

I'll continue reviewing/testing from the one before "amd64_edac: convert
driver to use the new edac ABI."

Thanks.

-- 
Regards/Gruss,
Boris.

Advanced Micro Devices GmbH
Einsteinring 24, 85609 Dornach
GM: Alberto Bozzo
Reg: Dornach, Landkreis Muenchen
HRB Nr. 43632 WEEE Registernr: 129 19551


* [PATCH] edac: Change internal representation to work with layers
  2012-05-07 19:49               ` Borislav Petkov
@ 2012-05-08 13:51                 ` Mauro Carvalho Chehab
  0 siblings, 0 replies; 206+ messages in thread
From: Mauro Carvalho Chehab @ 2012-05-08 13:51 UTC (permalink / raw)
  Cc: Mauro Carvalho Chehab, Jason Uhlenkott, Aristeu Rozanski,
	Hitoshi Mitake, Shaohui Xie, Mark Gross, Dmitry Eremin-Solenikov,
	Ranganathan Desikan, Egor Martovetsky, Niklas Söderlund,
	Tim Small, Arvind R.,
	Borislav Petkov, Chris Metcalf, Olof Johansson, Doug Thompson,
	Linux Edac Mailing List, Michal Marek, Jiri Kosina,
	Linux Kernel Mailing List, Joe Perches, Andrew Morton,
	linuxppc-dev

Change the EDAC internal representation to work with non-csrow
based memory controllers.

There are lots of those memory controllers nowadays, and more
are coming. So, the EDAC internal representation needs to be
changed, in order to work with those memory controllers, while
preserving backward compatibility with the old ones.

The edac core was written with the idea that memory controllers
are able to directly access csrows.

This is not true for FB-DIMM and RAMBUS memory controllers.

Also, some recent advanced memory controllers don't present a per-csrow
view: they see memories as DIMMs rather than as ranks.

So, change the allocation and error report routines to allow
them to work with all types of architectures.

This will allow the removal of several hacks with FB-DIMM and RAMBUS
memory controllers.

Also, several tests were done on different platforms using different
x86 drivers.

TODO: multi-rank DIMMs are currently represented by multiple DIMM
entries in struct dimm_info. That means that changing a label for one
rank won't change the same label for the other ranks at the same DIMM.
This bug has been present since the beginning of EDAC, so it is not a big
deal. However, on several drivers, it is possible to fix this issue, but
it should be a per-driver fix, as the csrow => DIMM arrangement may not
be equal for all. So, don't try to fix it here yet.

I tried to make this patch as short as possible, preceding it with
several other patches that simplified the logic here. Yet, as the
internal API changes, all drivers need changes. The changes are
generally bigger in the drivers for FB-DIMMs.

Cc: Aristeu Rozanski <arozansk@redhat.com>
Cc: Doug Thompson <norsk5@yahoo.com>
Cc: Borislav Petkov <borislav.petkov@amd.com>
Cc: Mark Gross <mark.gross@intel.com>
Cc: Jason Uhlenkott <juhlenko@akamai.com>
Cc: Tim Small <tim@buttersideup.com>
Cc: Ranganathan Desikan <ravi@jetztechnologies.com>
Cc: "Arvind R." <arvino55@gmail.com>
Cc: Olof Johansson <olof@lixom.net>
Cc: Egor Martovetsky <egor@pasemi.com>
Cc: Chris Metcalf <cmetcalf@tilera.com>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Jiri Kosina <jkosina@suse.cz>
Cc: Joe Perches <joe@perches.com>
Cc: Dmitry Eremin-Solenikov <dbaryshkov@gmail.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Hitoshi Mitake <h.mitake@gmail.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Niklas Söderlund" <niklas.soderlund@ericsson.com>
Cc: Shaohui Xie <Shaohui.Xie@freescale.com>
Cc: Josh Boyer <jwboyer@gmail.com>
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
---

v.21: As requested by Borislav, fold two patches:
  http://git.infradead.org/users/mchehab/edac.git/commitdiff/3f4610491cc58dda8c737378212559dce77adde2
  http://git.infradead.org/users/mchehab/edac.git/commitdiff/97b826eaa8fe8ceb84220a2438551c4c73e55635

 drivers/edac/edac_core.h |   99 ++++++--
 drivers/edac/edac_mc.c   |  702 +++++++++++++++++++++++++++++-----------------
 include/linux/edac.h     |   38 ++-
 3 files changed, 552 insertions(+), 287 deletions(-)

diff --git a/drivers/edac/edac_core.h b/drivers/edac/edac_core.h
index e48ab31..1286c5e 100644
--- a/drivers/edac/edac_core.h
+++ b/drivers/edac/edac_core.h
@@ -447,8 +447,12 @@ static inline void pci_write_bits32(struct pci_dev *pdev, int offset,
 
 #endif				/* CONFIG_PCI */
 
-extern struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-					  unsigned nr_chans, int edac_index);
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int edac_index);
+struct mem_ctl_info *new_edac_mc_alloc(unsigned edac_index,
+				   unsigned n_layers,
+				   struct edac_mc_layer *layers,
+				   unsigned sz_pvt);
 extern int edac_mc_add_mc(struct mem_ctl_info *mci);
 extern void edac_mc_free(struct mem_ctl_info *mci);
 extern struct mem_ctl_info *edac_mc_find(int idx);
@@ -467,24 +471,78 @@ extern int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci,
  * reporting logic and function interface - reduces conditional
  * statement clutter and extra function arguments.
  */
-extern void edac_mc_handle_ce(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page,
-			      unsigned long syndrome, int row, int channel,
-			      const char *msg);
-extern void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_ue(struct mem_ctl_info *mci,
-			      unsigned long page_frame_number,
-			      unsigned long offset_in_page, int row,
-			      const char *msg);
-extern void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
-				      const char *msg);
-extern void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel0, unsigned int channel1,
-				  char *msg);
-extern void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci, unsigned int csrow,
-				  unsigned int channel, char *msg);
+
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog);
+
+static inline void edac_mc_handle_ce(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page,
+				     unsigned long syndrome, int row, int channel,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      page_frame_number, offset_in_page, syndrome,
+			      row, channel, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue(struct mem_ctl_info *mci,
+				     unsigned long page_frame_number,
+				     unsigned long offset_in_page, int row,
+				     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      page_frame_number, offset_in_page, 0,
+			      row, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci,
+					     const char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0, -1, -1, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel0,
+					 unsigned int channel1,
+					 char *msg)
+{
+	/*
+	 *FIXME: The error can also be at channel1 (e. g. at the second
+	 *	  channel of the same branch). The fix is to push
+	 *	  edac_mc_handle_error() call into each driver
+	 */
+	 edac_mc_handle_error(HW_EVENT_ERR_UNCORRECTED, mci,
+			      0, 0, 0,
+			      csrow, channel0, -1, msg, NULL, NULL);
+}
+
+static inline void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
+					 unsigned int csrow,
+					 unsigned int channel, char *msg)
+{
+	 edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci,
+			      0, 0, 0,
+			      csrow, channel, -1, msg, NULL, NULL);
+}
 
 /*
  * edac_device APIs
@@ -496,6 +554,7 @@ extern void edac_device_handle_ue(struct edac_device_ctl_info *edac_dev,
 extern void edac_device_handle_ce(struct edac_device_ctl_info *edac_dev,
 				int inst_nr, int block_nr, const char *msg);
 extern int edac_device_alloc_index(void);
+extern const char *edac_layer_name[];
 
 /*
  * edac_pci APIs
diff --git a/drivers/edac/edac_mc.c b/drivers/edac/edac_mc.c
index 6ec967a..2614658 100644
--- a/drivers/edac/edac_mc.c
+++ b/drivers/edac/edac_mc.c
@@ -44,9 +44,25 @@ static void edac_mc_dump_channel(struct rank_info *chan)
 	debugf4("\tchannel = %p\n", chan);
 	debugf4("\tchannel->chan_idx = %d\n", chan->chan_idx);
 	debugf4("\tchannel->csrow = %p\n\n", chan->csrow);
-	debugf4("\tdimm->ce_count = %d\n", chan->dimm->ce_count);
-	debugf4("\tdimm->label = '%s'\n", chan->dimm->label);
-	debugf4("\tdimm->nr_pages = 0x%x\n", chan->dimm->nr_pages);
+	debugf4("\tchannel->dimm = %p\n", chan->dimm);
+}
+
+static void edac_mc_dump_dimm(struct dimm_info *dimm)
+{
+	int i;
+
+	debugf4("\tdimm = %p\n", dimm);
+	debugf4("\tdimm->label = '%s'\n", dimm->label);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
+	debugf4("\tdimm location ");
+	for (i = 0; i < dimm->mci->n_layers; i++) {
+		printk(KERN_CONT "%d", dimm->location[i]);
+		if (i < dimm->mci->n_layers - 1)
+			printk(KERN_CONT ".");
+	}
+	printk(KERN_CONT "\n");
+	debugf4("\tdimm->grain = %d\n", dimm->grain);
+	debugf4("\tdimm->nr_pages = 0x%x\n", dimm->nr_pages);
 }
 
 static void edac_mc_dump_csrow(struct csrow_info *csrow)
@@ -70,6 +86,8 @@ static void edac_mc_dump_mci(struct mem_ctl_info *mci)
 	debugf4("\tmci->edac_check = %p\n", mci->edac_check);
 	debugf3("\tmci->nr_csrows = %d, csrows = %p\n",
 		mci->nr_csrows, mci->csrows);
+	debugf3("\tmci->tot_dimms = %d, dimms = %p\n",
+		mci->tot_dimms, mci->dimms);
 	debugf3("\tdev = %p\n", mci->dev);
 	debugf3("\tmod_name:ctl_name = %s:%s\n", mci->mod_name, mci->ctl_name);
 	debugf3("\tpvt_info = %p\n\n", mci->pvt_info);
@@ -157,10 +175,12 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
 }
 
 /**
- * edac_mc_alloc: Allocate a struct mem_ctl_info structure
- * @size_pvt:	size of private storage needed
- * @nr_csrows:	Number of CWROWS needed for this MC
- * @nr_chans:	Number of channels for the MC
+ * new_edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @mc_num:		Memory controller number
+ * @n_layers:		Number of MC hierarchy layers
+ * @layers:		Describes each layer as seen by the Memory Controller
+ * @sz_pvt:		Size of private storage needed
+ *
  *
  * Everything is kmalloc'ed as one big chunk - more efficient.
  * Only can be used if all structures have the same lifetime - otherwise
@@ -168,22 +188,49 @@ void *edac_align_ptr(void **p, unsigned size, int n_elems)
  *
  * Use edac_mc_free() to free mc structures allocated by this function.
  *
+ * NOTE: drivers handle multi-rank memories in different ways: in some
+ * drivers, one multi-rank memory stick is mapped as one entry, while, in
+ * others, a single multi-rank memory stick would be mapped into several
+ * entries. Currently, this function will allocate multiple struct dimm_info
+ * in such scenarios, as grouping the multiple ranks requires driver changes.
+ *
  * Returns:
  *	NULL allocation failed
  *	struct mem_ctl_info pointer
  */
-struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
-				unsigned nr_chans, int edac_index)
+struct mem_ctl_info *new_edac_mc_alloc(unsigned mc_num,
+				       unsigned n_layers,
+				       struct edac_mc_layer *layers,
+				       unsigned sz_pvt)
 {
-	void *ptr = NULL;
 	struct mem_ctl_info *mci;
-	struct csrow_info *csi, *csrow;
+	struct edac_mc_layer *layer;
+	struct csrow_info *csi, *csr;
 	struct rank_info *chi, *chp, *chan;
 	struct dimm_info *dimm;
-	void *pvt;
-	unsigned size;
-	int row, chn;
-	int err;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+	unsigned pos[EDAC_MAX_LAYERS];
+	void *pvt, *ptr = NULL;
+	unsigned size, tot_dimms = 1, count = 1;
+	unsigned tot_csrows = 1, tot_channels = 1, tot_errcount = 0;
+	int i, j, err, row, chn;
+	bool per_rank = false;
+
+	BUG_ON(n_layers > EDAC_MAX_LAYERS || n_layers == 0);
+	/*
+	 * Calculate the total amount of dimms and csrows/cschannels while
+	 * in the old API emulation mode
+	 */
+	for (i = 0; i < n_layers; i++) {
+		tot_dimms *= layers[i].size;
+		if (layers[i].is_virt_csrow)
+			tot_csrows *= layers[i].size;
+		else
+			tot_channels *= layers[i].size;
+
+		if (layers[i].type == EDAC_MC_LAYER_CHIP_SELECT)
+			per_rank = true;
+	}
 
 	/* Figure out the offsets of the various items from the start of an mc
 	 * structure.  We want the alignment of each item to be at least as
@@ -191,12 +238,27 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 * hardcode everything into a single struct.
 	 */
 	mci = edac_align_ptr(&ptr, sizeof(*mci), 1);
-	csi = edac_align_ptr(&ptr, sizeof(*csi), nr_csrows);
-	chi = edac_align_ptr(&ptr, sizeof(*chi), nr_csrows * nr_chans);
-	dimm = edac_align_ptr(&ptr, sizeof(*dimm), nr_csrows * nr_chans);
+	layer = edac_align_ptr(&ptr, sizeof(*layer), n_layers);
+	csi = edac_align_ptr(&ptr, sizeof(*csi), tot_csrows);
+	chi = edac_align_ptr(&ptr, sizeof(*chi), tot_csrows * tot_channels);
+	dimm = edac_align_ptr(&ptr, sizeof(*dimm), tot_dimms);
+	for (i = 0; i < n_layers; i++) {
+		count *= layers[i].size;
+		debugf4("%s: errcount layer %d size %d\n", __func__, i, count);
+		ce_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		ue_per_layer[i] = edac_align_ptr(&ptr, sizeof(u32), count);
+		tot_errcount += 2 * count;
+	}
+
+	debugf4("%s: allocating %d error counters\n", __func__, tot_errcount);
 	pvt = edac_align_ptr(&ptr, sz_pvt, 1);
 	size = ((unsigned long)pvt) + sz_pvt;
 
+	debugf1("%s(): allocating %u bytes for mci data (%d %s, %d csrows/channels)\n",
+		__func__, size,
+		tot_dimms,
+		per_rank ? "ranks" : "dimms",
+		tot_csrows * tot_channels);
 	mci = kzalloc(size, GFP_KERNEL);
 	if (mci == NULL)
 		return NULL;
@@ -204,42 +266,87 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	/* Adjust pointers so they point within the memory we just allocated
 	 * rather than an imaginary chunk of memory located at address 0.
 	 */
+	layer = (struct edac_mc_layer *)(((char *)mci) + ((unsigned long)layer));
 	csi = (struct csrow_info *)(((char *)mci) + ((unsigned long)csi));
 	chi = (struct rank_info *)(((char *)mci) + ((unsigned long)chi));
 	dimm = (struct dimm_info *)(((char *)mci) + ((unsigned long)dimm));
+	for (i = 0; i < n_layers; i++) {
+		mci->ce_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ce_per_layer[i]));
+		mci->ue_per_layer[i] = (u32 *)((char *)mci + ((unsigned long)ue_per_layer[i]));
+	}
 	pvt = sz_pvt ? (((char *)mci) + ((unsigned long)pvt)) : NULL;
 
 	/* setup index and various internal pointers */
-	mci->mc_idx = edac_index;
+	mci->mc_idx = mc_num;
 	mci->csrows = csi;
 	mci->dimms  = dimm;
+	mci->tot_dimms = tot_dimms;
 	mci->pvt_info = pvt;
-	mci->nr_csrows = nr_csrows;
+	mci->n_layers = n_layers;
+	mci->layers = layer;
+	memcpy(mci->layers, layers, sizeof(*layer) * n_layers);
+	mci->nr_csrows = tot_csrows;
+	mci->num_cschannel = tot_channels;
+	mci->mem_is_per_rank = per_rank;
 
 	/*
-	 * For now, assumes that a per-csrow arrangement for dimms.
-	 * This will be latter changed.
+	 * Fill the csrow struct
 	 */
-	dimm = mci->dimms;
-
-	for (row = 0; row < nr_csrows; row++) {
-		csrow = &csi[row];
-		csrow->csrow_idx = row;
-		csrow->mci = mci;
-		csrow->nr_channels = nr_chans;
-		chp = &chi[row * nr_chans];
-		csrow->channels = chp;
-
-		for (chn = 0; chn < nr_chans; chn++) {
+	for (row = 0; row < tot_csrows; row++) {
+		csr = &csi[row];
+		csr->csrow_idx = row;
+		csr->mci = mci;
+		csr->nr_channels = tot_channels;
+		chp = &chi[row * tot_channels];
+		csr->channels = chp;
+
+		for (chn = 0; chn < tot_channels; chn++) {
 			chan = &chp[chn];
 			chan->chan_idx = chn;
-			chan->csrow = csrow;
+			chan->csrow = csr;
+		}
+	}
+
+	/*
+	 * Fill the dimm struct
+	 */
+	memset(&pos, 0, sizeof(pos));
+	row = 0;
+	chn = 0;
+	debugf4("%s: initializing %d %s\n", __func__, tot_dimms,
+		per_rank ? "ranks" : "dimms");
+	for (i = 0; i < tot_dimms; i++) {
+		chan = &csi[row].channels[chn];
+		dimm = EDAC_DIMM_PTR(layer, mci->dimms, n_layers,
+			       pos[0], pos[1], pos[2]);
+		dimm->mci = mci;
+
+		debugf2("%s: %d: %s%zd (%d:%d:%d): row %d, chan %d\n", __func__,
+			i, per_rank ? "rank" : "dimm", (dimm - mci->dimms),
+			pos[0], pos[1], pos[2], row, chn);
+
+		/* Copy DIMM location */
+		for (j = 0; j < n_layers; j++)
+			dimm->location[j] = pos[j];
+
+		/* Link it to the csrows old API data */
+		chan->dimm = dimm;
+		dimm->csrow = row;
+		dimm->cschannel = chn;
+
+		/* Increment csrow location */
+		row++;
+		if (row == tot_csrows) {
+			row = 0;
+			chn++;
+		}
 
-			mci->csrows[row].channels[chn].dimm = dimm;
-			dimm->csrow = row;
-			dimm->csrow_channel = chn;
-			dimm++;
-			mci->nr_dimms++;
+		/* Increment dimm location */
+		for (j = n_layers - 1; j >= 0; j--) {
+			pos[j]++;
+			if (pos[j] < layers[j].size)
+				break;
+			pos[j] = 0;
 		}
 	}
 
@@ -263,6 +370,46 @@ struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
 	 */
 	return mci;
 }
+EXPORT_SYMBOL_GPL(new_edac_mc_alloc);
+
+/**
+ * edac_mc_alloc: Allocate and partially fill a struct mem_ctl_info structure
+ * @sz_pvt:		Size of private storage needed
+ * @nr_csrows:		Number of csrows needed for this MC
+ * @nr_chans:		Number of channels for the MC
+ * @mc_num:		Memory controller number
+ *
+ *
+ * FIXME: drivers handle multi-rank memories in different ways: some
+ * drivers map multi-ranked DIMMs as one DIMM while others
+ * as several DIMMs.
+ *
+ * Everything is kmalloc'ed as one big chunk - more efficient.
+ * It can only be used if all structures have the same lifetime - otherwise
+ * you have to allocate and initialize your own structures.
+ *
+ * Use edac_mc_free() to free mc structures allocated by this function.
+ *
+ * Returns:
+ *	On failure: NULL
+ *	On success: struct mem_ctl_info pointer
+ */
+
+struct mem_ctl_info *edac_mc_alloc(unsigned sz_pvt, unsigned nr_csrows,
+				   unsigned nr_chans, int mc_num)
+{
+	unsigned n_layers = 2;
+	struct edac_mc_layer layers[n_layers];
+
+	layers[0].type = EDAC_MC_LAYER_CHIP_SELECT;
+	layers[0].size = nr_csrows;
+	layers[0].is_virt_csrow = true;
+	layers[1].type = EDAC_MC_LAYER_CHANNEL;
+	layers[1].size = nr_chans;
+	layers[1].is_virt_csrow = false;
+
+	return new_edac_mc_alloc(mc_num, ARRAY_SIZE(layers), layers, sz_pvt);
+}
 EXPORT_SYMBOL_GPL(edac_mc_alloc);
 
 /**
@@ -528,7 +675,6 @@ EXPORT_SYMBOL(edac_mc_find);
  * edac_mc_add_mc: Insert the 'mci' structure into the mci global list and
  *                 create sysfs entries associated with mci structure
  * @mci: pointer to the mci structure to be added to the list
- * @mc_idx: A unique numeric identifier to be assigned to the 'mci' structure.
  *
  * Return:
  *	0	Success
@@ -555,6 +701,8 @@ int edac_mc_add_mc(struct mem_ctl_info *mci)
 				edac_mc_dump_channel(&mci->csrows[i].
 						channels[j]);
 		}
+		for (i = 0; i < mci->tot_dimms; i++)
+			edac_mc_dump_dimm(&mci->dimms[i]);
 	}
 #endif
 	mutex_lock(&mem_ctls_mutex);
@@ -712,261 +860,307 @@ int edac_mc_find_csrow_by_page(struct mem_ctl_info *mci, unsigned long page)
 }
 EXPORT_SYMBOL_GPL(edac_mc_find_csrow_by_page);
 
-/* FIXME - setable log (warning/emerg) levels */
-/* FIXME - integrate with evlog: http://evlog.sourceforge.net/ */
-void edac_mc_handle_ce(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, unsigned long syndrome,
-		int row, int channel, const char *msg)
+const char *edac_layer_name[] = {
+	[EDAC_MC_LAYER_BRANCH] = "branch",
+	[EDAC_MC_LAYER_CHANNEL] = "channel",
+	[EDAC_MC_LAYER_SLOT] = "slot",
+	[EDAC_MC_LAYER_CHIP_SELECT] = "csrow",
+};
+EXPORT_SYMBOL_GPL(edac_layer_name);
+
+static void edac_inc_ce_error(struct mem_ctl_info *mci,
+				    bool enable_per_layer_report,
+				    const int pos[EDAC_MAX_LAYERS])
 {
-	unsigned long remapped_page;
-	char *label = NULL;
-	u32 grain;
+	int i, index = 0;
 
-	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
+	mci->ce_count++;
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	if (!enable_per_layer_report) {
+		mci->ce_noinfo_count++;
 		return;
 	}
 
-	if (channel >= mci->csrows[row].nr_channels || channel < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range "
-			"(%d >= %d)\n", channel,
-			mci->csrows[row].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ce_per_layer[i][index]++;
+
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
+
+static void edac_inc_ue_error(struct mem_ctl_info *mci,
+				    bool enable_per_layer_report,
+				    const int pos[EDAC_MAX_LAYERS])
+{
+	int i, index = 0;
+
+	mci->ue_count++;
+
+	if (!enable_per_layer_report) {
+		mci->ue_noinfo_count++;
 		return;
 	}
 
-	label = mci->csrows[row].channels[channel].dimm->label;
-	grain = mci->csrows[row].channels[channel].dimm->grain;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			break;
+		index += pos[i];
+		mci->ue_per_layer[i][index]++;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE page 0x%lx, offset 0x%lx, grain %d, syndrome "
-			"0x%lx, row %d, channel %d, label \"%s\": %s\n",
-			page_frame_number, offset_in_page,
-			grain, syndrome, row, channel,
-			label, msg);
+		if (i < mci->n_layers - 1)
+			index *= mci->layers[i + 1].size;
+	}
+}
 
-	mci->ce_count++;
-	mci->csrows[row].ce_count++;
-	mci->csrows[row].channels[channel].dimm->ce_count++;
-	mci->csrows[row].channels[channel].ce_count++;
+static void edac_ce_error(struct mem_ctl_info *mci,
+			  const int pos[EDAC_MAX_LAYERS],
+			  const char *msg,
+			  const char *location,
+			  const char *label,
+			  const char *detail,
+			  const char *other_detail,
+			  const bool enable_per_layer_report,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  u32 grain)
+{
+	unsigned long remapped_page;
+
+	if (edac_mc_get_log_ce()) {
+		if (other_detail && *other_detail)
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s - %s)\n",
+				       msg, label, location,
+				       detail, other_detail);
+		else
+			edac_mc_printk(mci, KERN_WARNING,
+				       "CE %s on %s (%s%s)\n",
+				       msg, label, location,
+				       detail);
+	}
+	edac_inc_ce_error(mci, enable_per_layer_report, pos);
 
 	if (mci->scrub_mode & SCRUB_SW_SRC) {
 		/*
-		 * Some MC's can remap memory so that it is still available
-		 * at a different address when PCI devices map into memory.
-		 * MC's that can't do this lose the memory where PCI devices
-		 * are mapped.  This mapping is MC dependent and so we call
-		 * back into the MC driver for it to map the MC page to
-		 * a physical (CPU) page which can then be mapped to a virtual
-		 * page - which can then be scrubbed.
-		 */
+		 * Some memory controllers (called MCs below) can remap
+		 * memory so that it is still available at a different
+		 * address when PCI devices map into memory.
+		 * MCs that can't do this lose the memory where PCI
+		 * devices are mapped. This mapping is MC-dependent
+		 * and so we call back into the MC driver for it to
+		 * map the MC page to a physical (CPU) page which can
+		 * then be mapped to a virtual page - which can then
+		 * be scrubbed.
+		 */
 		remapped_page = mci->ctl_page_to_phys ?
 			mci->ctl_page_to_phys(mci, page_frame_number) :
 			page_frame_number;
 
-		edac_mc_scrub_block(remapped_page, offset_in_page, grain);
+		edac_mc_scrub_block(remapped_page,
+					offset_in_page, grain);
 	}
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce);
 
-void edac_mc_handle_ce_no_info(struct mem_ctl_info *mci, const char *msg)
+static void edac_ue_error(struct mem_ctl_info *mci,
+			  const int pos[EDAC_MAX_LAYERS],
+			  const char *msg,
+			  const char *location,
+			  const char *label,
+			  const char *detail,
+			  const char *other_detail,
+			  const bool enable_per_layer_report)
 {
-	if (edac_mc_get_log_ce())
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE - no information available: %s\n", msg);
+	if (edac_mc_get_log_ue()) {
+		if (other_detail && *other_detail)
+			edac_mc_printk(mci, KERN_WARNING,
+				       "UE %s on %s (%s%s - %s)\n",
+			               msg, label, location, detail,
+				       other_detail);
+		else
+			edac_mc_printk(mci, KERN_WARNING,
+				       "UE %s on %s (%s%s)\n",
+			               msg, label, location, detail);
+	}
 
-	mci->ce_noinfo_count++;
-	mci->ce_count++;
+	if (edac_mc_get_panic_on_ue()) {
+		if (other_detail && *other_detail)
+			panic("UE %s on %s (%s%s - %s)\n",
+			      msg, label, location, detail, other_detail);
+		else
+			panic("UE %s on %s (%s%s)\n",
+			      msg, label, location, detail);
+	}
+
+	edac_inc_ue_error(mci, enable_per_layer_report, pos);
 }
-EXPORT_SYMBOL_GPL(edac_mc_handle_ce_no_info);
 
-void edac_mc_handle_ue(struct mem_ctl_info *mci,
-		unsigned long page_frame_number,
-		unsigned long offset_in_page, int row, const char *msg)
+#define OTHER_LABEL " or "
+void edac_mc_handle_error(const enum hw_event_mc_err_type type,
+			  struct mem_ctl_info *mci,
+			  const unsigned long page_frame_number,
+			  const unsigned long offset_in_page,
+			  const unsigned long syndrome,
+			  const int layer0,
+			  const int layer1,
+			  const int layer2,
+			  const char *msg,
+			  const char *other_detail,
+			  const void *mcelog)
 {
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chan;
-	int chars;
-	char *label = NULL;
+	/* FIXME: too much for stack: move it to some pre-allocated area */
+	char detail[80], location[80];
+	char label[(EDAC_MC_LABEL_LEN + 1 + sizeof(OTHER_LABEL)) * mci->tot_dimms];
+	char *p;
+	int row = -1, chan = -1;
+	int pos[EDAC_MAX_LAYERS] = { layer0, layer1, layer2 };
+	int i;
 	u32 grain;
+	bool enable_per_layer_report = false;
 
 	debugf3("MC%d: %s()\n", mci->mc_idx, __func__);
 
-	/* FIXME - maybe make panic on INTERNAL ERROR an option */
-	if (row >= mci->nr_csrows || row < 0) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range "
-			"(%d >= %d)\n", row, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	grain = mci->csrows[row].channels[0].dimm->grain;
-	label = mci->csrows[row].channels[0].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	for (chan = 1; (chan < mci->csrows[row].nr_channels) && (len > 0);
-		chan++) {
-		label = mci->csrows[row].channels[chan].dimm->label;
-		chars = snprintf(pos, len + 1, ":%s", label);
-		len -= chars;
-		pos += chars;
+	/*
+	 * Check if the event report is consistent and if the memory
+	 * location is known. If it is known, enable_per_layer_report will be
+	 * true, the DIMM(s) label info will be filled and the per-layer
+	 * error counters will be incremented.
+	 */
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] >= (int)mci->layers[i].size) {
+			if (type == HW_EVENT_ERR_CORRECTED)
+				p = "CE";
+			else
+				p = "UE";
+
+			edac_mc_printk(mci, KERN_ERR,
+				       "INTERNAL ERROR: %s value is out of range (%d >= %d)\n",
+				       edac_layer_name[mci->layers[i].type],
+				       pos[i], mci->layers[i].size);
+			/*
+			 * Instead of just returning, let's use what's
+			 * known about the error. The increment routines and
+			 * the DIMM filter logic will do the right thing by
+			 * pointing to the likely damaged DIMMs.
+			 */
+			pos[i] = -1;
+		}
+		if (pos[i] >= 0)
+			enable_per_layer_report = true;
 	}
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE page 0x%lx, offset 0x%lx, grain %d, row %d, "
-			"labels \"%s\": %s\n", page_frame_number,
-			offset_in_page, grain, row, labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: UE page 0x%lx, offset 0x%lx, grain %d, "
-			"row %d, labels \"%s\": %s\n", mci->mc_idx,
-			page_frame_number, offset_in_page,
-			grain, row, labels, msg);
-
-	mci->ue_count++;
-	mci->csrows[row].ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue);
-
-void edac_mc_handle_ue_no_info(struct mem_ctl_info *mci, const char *msg)
-{
-	if (edac_mc_get_panic_on_ue())
-		panic("EDAC MC%d: Uncorrected Error", mci->mc_idx);
+	/*
+	 * Get the dimm label/grain that applies to the match criteria.
+	 * As the error algorithm may not be able to point to just one memory
+	 * stick, the logic here will get all possible labels that could
+	 * potentially be affected by the error.
+	 * On FB-DIMM memory controllers, for uncorrected errors, it is common
+	 * to have only the MC channel and the MC dimm (also called "branch")
+	 * but the channel is not known, as the memory is arranged in pairs,
+	 * where each memory belongs to a separate channel within the same
+	 * branch.
+	 */
+	grain = 0;
+	p = label;
+	*p = '\0';
+	for (i = 0; i < mci->tot_dimms; i++) {
+		struct dimm_info *dimm = &mci->dimms[i];
 
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_WARNING,
-			"UE - no information available: %s\n", msg);
-	mci->ue_noinfo_count++;
-	mci->ue_count++;
-}
-EXPORT_SYMBOL_GPL(edac_mc_handle_ue_no_info);
+		if (layer0 >= 0 && layer0 != dimm->location[0])
+			continue;
+		if (layer1 >= 0 && layer1 != dimm->location[1])
+			continue;
+		if (layer2 >= 0 && layer2 != dimm->location[2])
+			continue;
 
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process UE events
- */
-void edac_mc_handle_fbd_ue(struct mem_ctl_info *mci,
-			unsigned int csrow,
-			unsigned int channela,
-			unsigned int channelb, char *msg)
-{
-	int len = EDAC_MC_LABEL_LEN * 4;
-	char labels[len + 1];
-	char *pos = labels;
-	int chars;
-	char *label;
-
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
+		/* get the max grain, over the error match range */
+		if (dimm->grain > grain)
+			grain = dimm->grain;
 
-	if (channela >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-a out of range "
-			"(%d >= %d)\n",
-			channela, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+		/*
+		 * If the error is memory-controller wide, there's no need to
+		 * search for the affected DIMMs because the whole
+		 * channel/memory controller/...  may be affected.
+		 * Also, don't show errors for empty DIMM slots.
+		 */
+		if (enable_per_layer_report && dimm->nr_pages) {
+			if (p != label) {
+				strcpy(p, OTHER_LABEL);
+				p += strlen(OTHER_LABEL);
+			}
+			strcpy(p, dimm->label);
+			p += strlen(p);
+			*p = '\0';
+
+			/*
+			 * get csrow/channel of the DIMM, in order to allow
+			 * incrementing the compat API counters
+			 */
+			debugf4("%s: %s csrows map: (%d,%d)\n",
+				__func__,
+				mci->mem_is_per_rank ? "rank" : "dimm",
+				dimm->csrow, dimm->cschannel);
+
+			if (row == -1)
+				row = dimm->csrow;
+			else if (row >= 0 && row != dimm->csrow)
+				row = -2;
+
+			if (chan == -1)
+				chan = dimm->cschannel;
+			else if (chan >= 0 && chan != dimm->cschannel)
+				chan = -2;
+		}
 	}
 
-	if (channelb >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel-b out of range "
-			"(%d >= %d)\n",
-			channelb, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ue_no_info(mci, "INTERNAL ERROR");
-		return;
+	if (!enable_per_layer_report) {
+		strcpy(label, "any memory");
+	} else {
+		debugf4("%s: csrow/channel to increment: (%d,%d)\n",
+			__func__, row, chan);
+		if (p == label)
+			strcpy(label, "unknown memory");
+		if (type == HW_EVENT_ERR_CORRECTED) {
+			if (row >= 0) {
+				mci->csrows[row].ce_count++;
+				if (chan >= 0)
+					mci->csrows[row].channels[chan].ce_count++;
+			}
+		} else
+			if (row >= 0)
+				mci->csrows[row].ue_count++;
 	}
 
-	mci->ue_count++;
-	mci->csrows[csrow].ue_count++;
-
-	/* Generate the DIMM labels from the specified channels */
-	label = mci->csrows[csrow].channels[channela].dimm->label;
-	chars = snprintf(pos, len + 1, "%s", label);
-	len -= chars;
-	pos += chars;
-
-	chars = snprintf(pos, len + 1, "-%s",
-			mci->csrows[csrow].channels[channelb].dimm->label);
-
-	if (edac_mc_get_log_ue())
-		edac_mc_printk(mci, KERN_EMERG,
-			"UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela, channelb,
-			labels, msg);
-
-	if (edac_mc_get_panic_on_ue())
-		panic("UE row %d, channel-a= %d channel-b= %d "
-			"labels \"%s\": %s\n", csrow, channela,
-			channelb, labels, msg);
-}
-EXPORT_SYMBOL(edac_mc_handle_fbd_ue);
-
-/*************************************************************
- * On Fully Buffered DIMM modules, this help function is
- * called to process CE events
- */
-void edac_mc_handle_fbd_ce(struct mem_ctl_info *mci,
-			unsigned int csrow, unsigned int channel, char *msg)
-{
-	char *label = NULL;
+	/* Fill the RAM location data */
+	p = location;
+	for (i = 0; i < mci->n_layers; i++) {
+		if (pos[i] < 0)
+			continue;
 
-	/* Ensure boundary values */
-	if (csrow >= mci->nr_csrows) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: row out of range (%d >= %d)\n",
-			csrow, mci->nr_csrows);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
+		p += sprintf(p, "%s:%d ",
+			     edac_layer_name[mci->layers[i].type],
+			     pos[i]);
 	}
-	if (channel >= mci->csrows[csrow].nr_channels) {
-		/* something is wrong */
-		edac_mc_printk(mci, KERN_ERR,
-			"INTERNAL ERROR: channel out of range (%d >= %d)\n",
-			channel, mci->csrows[csrow].nr_channels);
-		edac_mc_handle_ce_no_info(mci, "INTERNAL ERROR");
-		return;
-	}
-
-	label = mci->csrows[csrow].channels[channel].dimm->label;
 
-	if (edac_mc_get_log_ce())
-		/* FIXME - put in DIMM location */
-		edac_mc_printk(mci, KERN_WARNING,
-			"CE row %d, channel %d, label \"%s\": %s\n",
-			csrow, channel, label, msg);
+	/* Memory type dependent details about the error */
+	if (type == HW_EVENT_ERR_CORRECTED) {
+		snprintf(detail, sizeof(detail),
+			"page:0x%lx offset:0x%lx grain:%d syndrome:0x%lx",
+			page_frame_number, offset_in_page,
+			grain, syndrome);
+		edac_ce_error(mci, pos, msg, location, label, detail,
+			      other_detail, enable_per_layer_report,
+			      page_frame_number, offset_in_page, grain);
+	} else {
+		snprintf(detail, sizeof(detail),
+			"page:0x%lx offset:0x%lx grain:%d",
+			page_frame_number, offset_in_page, grain);
 
-	mci->ce_count++;
-	mci->csrows[csrow].ce_count++;
-	mci->csrows[csrow].channels[channel].dimm->ce_count++;
-	mci->csrows[csrow].channels[channel].ce_count++;
+		edac_ue_error(mci, pos, msg, location, label, detail,
+			      other_detail, enable_per_layer_report);
+	}
 }
-EXPORT_SYMBOL(edac_mc_handle_fbd_ce);
+EXPORT_SYMBOL_GPL(edac_mc_handle_error);
diff --git a/include/linux/edac.h b/include/linux/edac.h
index 3b8798d..c8f507d 100644
--- a/include/linux/edac.h
+++ b/include/linux/edac.h
@@ -412,18 +412,20 @@ struct edac_mc_layer {
 /* FIXME: add the proper per-location error counts */
 struct dimm_info {
 	char label[EDAC_MC_LABEL_LEN + 1];	/* DIMM label on motherboard */
-	unsigned memory_controller;
-	unsigned csrow;
-	unsigned csrow_channel;
+
+	/* Memory location data */
+	unsigned location[EDAC_MAX_LAYERS];
+
+	struct mem_ctl_info *mci;	/* the parent */
 
 	u32 grain;		/* granularity of reported error in bytes */
 	enum dev_type dtype;	/* memory device type */
 	enum mem_type mtype;	/* memory dimm type */
 	enum edac_type edac_mode;	/* EDAC mode for this dimm */
 
-	u32 nr_pages;			/* number of pages in csrow */
+	u32 nr_pages;			/* number of pages on this dimm */
 
-	u32 ce_count;		/* Correctable Errors for this dimm */
+	unsigned csrow, cschannel;	/* Points to the old API data */
 };
 
 /**
@@ -443,9 +445,10 @@ struct dimm_info {
  */
 struct rank_info {
 	int chan_idx;
-	u32 ce_count;
 	struct csrow_info *csrow;
 	struct dimm_info *dimm;
+
+	u32 ce_count;		/* Correctable Errors for this csrow */
 };
 
 struct csrow_info {
@@ -541,13 +544,18 @@ struct mem_ctl_info {
 	unsigned long (*ctl_page_to_phys) (struct mem_ctl_info * mci,
 					   unsigned long page);
 	int mc_idx;
-	int nr_csrows;
 	struct csrow_info *csrows;
+	unsigned nr_csrows, num_cschannel;
+
+	/* Memory Controller hierarchy */
+	unsigned n_layers;
+	struct edac_mc_layer *layers;
+	bool mem_is_per_rank;
 
 	/*
 	 * DIMM info. Will eventually remove the entire csrows_info some day
 	 */
-	unsigned nr_dimms;
+	unsigned tot_dimms;
 	struct dimm_info *dimms;
 
 	/*
@@ -562,12 +570,16 @@ struct mem_ctl_info {
 	const char *dev_name;
 	char proc_name[MC_PROC_NAME_MAX_LEN + 1];
 	void *pvt_info;
-	u32 ue_noinfo_count;	/* Uncorrectable Errors w/o info */
-	u32 ce_noinfo_count;	/* Correctable Errors w/o info */
-	u32 ue_count;		/* Total Uncorrectable Errors for this MC */
-	u32 ce_count;		/* Total Correctable Errors for this MC */
 	unsigned long start_time;	/* mci load start time (in jiffies) */
 
+	/*
+	 * drivers shouldn't access those fields directly, as the core
+	 * already handles that.
+	 */
+	u32 ce_noinfo_count, ue_noinfo_count;
+	u32 ue_count, ce_count;
+	u32 *ce_per_layer[EDAC_MAX_LAYERS], *ue_per_layer[EDAC_MAX_LAYERS];
+
 	struct completion complete;
 
 	/* edac sysfs device control */
@@ -580,7 +592,7 @@ struct mem_ctl_info {
 	 * by the low level driver.
 	 *
 	 * Set by the low level driver to provide attributes at the
-	 * controller level, same level as 'ue_count' and 'ce_count' above.
+	 * controller level.
 	 * An array of structures, NULL terminated
 	 *
 	 * If attributes are desired, then set to array of attributes
-- 
1.7.8
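
For readers following the DIMM filter in edac_mc_handle_error() above,
here is a minimal userspace sketch of the same idea: a negative layer
position acts as a wildcard, empty slots are skipped, and the labels of
all remaining candidates are joined with " or ". This is not the kernel
code. struct demo_dimm, demo_dimm_matches() and demo_build_label() are
hypothetical names made up for illustration, and the sketch uses a
bounded snprintf() into a caller-supplied buffer where the patch uses
strcpy() into an on-stack array.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_LAYERS 3	/* stands in for EDAC_MAX_LAYERS */
#define DEMO_LABEL_LEN 31	/* stands in for EDAC_MC_LABEL_LEN */
#define DEMO_OTHER_LABEL " or "

/* Hypothetical, trimmed-down stand-in for struct dimm_info. */
struct demo_dimm {
	char label[DEMO_LABEL_LEN + 1];
	unsigned location[DEMO_MAX_LAYERS];
	unsigned nr_pages;
};

/* A negative pos[i] is a wildcard: that layer is unknown, so it matches. */
static int demo_dimm_matches(const struct demo_dimm *dimm,
			     const int pos[DEMO_MAX_LAYERS])
{
	int i;

	for (i = 0; i < DEMO_MAX_LAYERS; i++)
		if (pos[i] >= 0 && (unsigned)pos[i] != dimm->location[i])
			return 0;
	return 1;
}

/*
 * Join the labels of all populated DIMMs that match pos[] with " or ".
 * Returns the number of labels written. If the next label would not fit,
 * it is dropped and the loop stops, so buf is never overrun.
 */
static int demo_build_label(char *buf, size_t len,
			    const struct demo_dimm *dimms, int n,
			    const int pos[DEMO_MAX_LAYERS])
{
	size_t used = 0;
	int i, count = 0;

	assert(len > 0);
	buf[0] = '\0';

	for (i = 0; i < n; i++) {
		int w;

		/* Skip empty slots and DIMMs outside the match range */
		if (!dimms[i].nr_pages || !demo_dimm_matches(&dimms[i], pos))
			continue;

		w = snprintf(buf + used, len - used, "%s%s",
			     count ? DEMO_OTHER_LABEL : "", dimms[i].label);
		if (w < 0 || (size_t)w >= len - used) {
			buf[used] = '\0';	/* drop the partial label */
			break;
		}
		used += (size_t)w;
		count++;
	}
	return count;
}
```

With only the first layer known, e.g. pos = { 0, -1, -1 }, every
populated DIMM whose location[0] is 0 ends up in the label, which is the
situation the FB-DIMM comment in the patch describes.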

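Two smaller pieces of the same function can be sketched in isolation:
the range check that resets a bogus layer position to -1 while deciding
whether a per-layer report is possible, and the way row/chan collapse to
-2 once the candidate DIMMs disagree on their compat csrow/channel. The
helper names demo_validate_pos() and demo_fold_index() are assumptions
introduced here; the patch does both steps inline.

```c
#include <assert.h>
#include <stdbool.h>

#define DEMO_MAX_LAYERS 3	/* stands in for EDAC_MAX_LAYERS */

/*
 * Reset out-of-range positions to -1 and report whether at least one
 * layer position is still known (the role enable_per_layer_report
 * plays in the patch).
 */
static bool demo_validate_pos(int pos[DEMO_MAX_LAYERS],
			      const unsigned sizes[], unsigned n_layers)
{
	bool known = false;
	unsigned i;

	assert(n_layers <= DEMO_MAX_LAYERS);

	for (i = 0; i < n_layers; i++) {
		if (pos[i] >= (int)sizes[i])
			pos[i] = -1;	/* keep going with what is known */
		if (pos[i] >= 0)
			known = true;
	}
	return known;
}

/*
 * Fold one more matching DIMM's index into the running value:
 *   -1   nothing seen yet
 *   >= 0 every DIMM seen so far agrees on this index
 *   -2   the DIMMs disagree, so no single counter can be charged
 */
static int demo_fold_index(int cur, int next)
{
	if (cur == -1)
		return next;
	if (cur >= 0 && cur != next)
		return -2;
	return cur;
}
```

A caller would fold dimm->csrow and dimm->cschannel of every matching
DIMM and bump the compat counters only when the result is >= 0, which
matches the "if (row >= 0)" checks in the patch.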
end of thread, other threads:[~2012-05-08 13:52 UTC | newest]

Thread overview: 206+ messages
2012-03-29 16:45 [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 01/13] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
2012-03-30 10:50   ` Borislav Petkov
2012-03-30 13:26     ` Mauro Carvalho Chehab
2012-03-30 15:38       ` Borislav Petkov
2012-04-16  8:41     ` Mauro Carvalho Chehab
2012-04-16 11:02       ` Borislav Petkov
2012-04-16 11:44         ` Mauro Carvalho Chehab
2012-04-16 13:21           ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 02/13] edac: move dimm properties to struct memset_info Mauro Carvalho Chehab
2012-03-30 13:10   ` Borislav Petkov
2012-03-30 13:22     ` Mauro Carvalho Chehab
2012-03-30 17:03   ` Borislav Petkov
2012-04-16  8:56     ` Mauro Carvalho Chehab
2012-04-16 13:31       ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 03/13] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
2012-04-02 12:33   ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 04/13] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
2012-04-02 13:18   ` Borislav Petkov
2012-03-29 16:45 ` [PATCH 05/13] edac: Fix core support for MC's that see DIMMS instead of ranks Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 06/13] edac: Initialize the dimm label with the known information Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 07/13] edac: Cleanup the logs for i7core and sb edac drivers Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 08/13] i5400_edac: improve debug messages to better represent the filled memory Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 09/13] events/hw_event: Create a Hardware Events Report Mecanism (HERM) Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 10/13] i5000_edac: Fix the logic that retrieves memory information Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 11/13] e752x_edac: provide more info about how DIMMS/ranks are mapped Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 12/13] edac: Rename the parent dev to pdev Mauro Carvalho Chehab
2012-03-29 16:45 ` [PATCH 13/13] edac: use Documentation-nano format for some data structs Mauro Carvalho Chehab
2012-03-29 20:46 ` [PATCH 00/13] Convert EDAC internal strutures to support all types of Memory Controllers Aristeu Rozanski Filho
2012-04-02 13:59 ` Borislav Petkov
2012-04-16 12:58   ` Mauro Carvalho Chehab
2012-04-16 14:06     ` Borislav Petkov
2012-04-16 20:12 ` [EDAC PATCH v13 0/7] Convert EDAC core to work with non-csrow-based memory controllers Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 1/7] edac: Create a dimm struct and move the labels into it Mauro Carvalho Chehab
2012-04-26 14:26     ` Borislav Petkov
2012-04-16 20:12   ` [EDAC PATCH v13 2/7] edac: move dimm properties to struct dimm_info Mauro Carvalho Chehab
2012-04-26 14:26     ` Borislav Petkov
2012-04-16 20:12   ` [EDAC PATCH v13 3/7] edac: Don't initialize csrow's first_page & friends when not needed Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 4/7] edac: move nr_pages to dimm struct Mauro Carvalho Chehab
2012-04-16 20:12     ` Mauro Carvalho Chehab
2012-04-17 18:48     ` Borislav Petkov
2012-04-17 18:48       ` Borislav Petkov
2012-04-17 19:28       ` Mauro Carvalho Chehab
2012-04-17 19:28         ` Mauro Carvalho Chehab
2012-04-17 21:40         ` Borislav Petkov
2012-04-17 21:40           ` Borislav Petkov
2012-04-18 12:58           ` Mauro Carvalho Chehab
2012-04-18 12:58             ` Mauro Carvalho Chehab
2012-04-18 17:53           ` [PATCH] " Mauro Carvalho Chehab
2012-04-18 17:53             ` Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 5/7] edac: rewrite edac_align_ptr() Mauro Carvalho Chehab
2012-04-18 14:06     ` Borislav Petkov
2012-04-18 15:25       ` Borislav Petkov
2012-04-18 18:15       ` Mauro Carvalho Chehab
2012-04-18 18:19       ` [PATCH] " Mauro Carvalho Chehab
2012-04-23 14:05         ` Borislav Petkov
2012-04-23 15:19           ` Mauro Carvalho Chehab
2012-04-23 15:26             ` Mauro Carvalho Chehab
2012-04-16 20:12   ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Mauro Carvalho Chehab
2012-04-23 17:49     ` Borislav Petkov
2012-04-23 18:30       ` Mauro Carvalho Chehab
2012-04-23 18:56         ` Mauro Carvalho Chehab
2012-04-23 19:19           ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
2012-04-23 20:07             ` Mauro Carvalho Chehab
2012-04-24 10:46               ` Borislav Petkov
2012-04-24 10:40         ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
2012-04-24 11:46           ` Mauro Carvalho Chehab
2012-04-24 12:42             ` Mauro Carvalho Chehab
2012-04-24 12:49               ` [PATCH] edac.h: Add generic layers for describing a memory location Mauro Carvalho Chehab
2012-04-24 13:09                 ` Borislav Petkov
2012-04-24 13:22                   ` Mauro Carvalho Chehab
2012-04-24 13:38                     ` Borislav Petkov
2012-04-24 16:39                       ` Mauro Carvalho Chehab
2012-04-24 16:49                         ` Borislav Petkov
2012-04-24 17:38                           ` Mauro Carvalho Chehab
2012-04-24 18:15                             ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-04-24 18:15                               ` [PATCH EDACv16 2/2] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-04-27 10:42                                 ` Mauro Carvalho Chehab
2012-04-27 13:33                               ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Borislav Petkov
2012-04-27 13:33                                 ` Borislav Petkov
2012-04-27 14:11                                 ` Joe Perches
2012-04-27 14:11                                   ` Joe Perches
2012-04-27 15:12                                   ` Borislav Petkov
2012-04-27 15:12                                     ` Borislav Petkov
2012-04-27 16:07                                   ` Mauro Carvalho Chehab
2012-04-27 16:07                                     ` Mauro Carvalho Chehab
2012-04-28  8:52                                     ` Borislav Petkov
2012-04-28  8:52                                       ` Borislav Petkov
2012-04-28 20:38                                       ` Joe Perches
2012-04-28 20:38                                         ` Joe Perches
2012-04-29 14:25                                       ` Mauro Carvalho Chehab
2012-04-29 14:25                                         ` Mauro Carvalho Chehab
2012-04-29 15:11                                         ` Mauro Carvalho Chehab
2012-04-29 15:11                                           ` Mauro Carvalho Chehab
2012-04-29 16:03                                           ` Joe Perches
2012-04-29 16:03                                             ` Joe Perches
2012-04-29 17:18                                             ` Mauro Carvalho Chehab
2012-04-29 17:18                                               ` Mauro Carvalho Chehab
2012-04-29 16:20                                           ` Mauro Carvalho Chehab
2012-04-29 16:43                                             ` Joe Perches
2012-04-29 17:39                                               ` Mauro Carvalho Chehab
2012-04-30  7:47                                                 ` Borislav Petkov
2012-04-30 11:09                                                   ` Mauro Carvalho Chehab
2012-04-30 11:15                                                     ` Borislav Petkov
2012-04-30 11:46                                                       ` Mauro Carvalho Chehab
2012-04-27 15:36                                 ` Mauro Carvalho Chehab
2012-04-27 15:36                                   ` Mauro Carvalho Chehab
2012-04-28  9:05                                   ` Borislav Petkov
2012-04-28  9:05                                     ` Borislav Petkov
2012-04-29 13:49                                     ` Mauro Carvalho Chehab
2012-04-29 13:49                                       ` Mauro Carvalho Chehab
2012-04-30  8:15                                       ` Borislav Petkov
2012-04-30  8:15                                         ` Borislav Petkov
2012-04-30 10:58                                         ` Mauro Carvalho Chehab
2012-04-30 10:58                                           ` Mauro Carvalho Chehab
2012-04-30 11:11                                           ` Borislav Petkov
2012-04-30 11:11                                             ` Borislav Petkov
2012-04-30 11:45                                             ` Mauro Carvalho Chehab
2012-04-30 11:45                                               ` Mauro Carvalho Chehab
2012-04-30 12:38                                               ` Borislav Petkov
2012-04-30 12:38                                                 ` Borislav Petkov
2012-04-30 13:00                                                 ` Mauro Carvalho Chehab
2012-04-30 13:00                                                   ` Mauro Carvalho Chehab
2012-04-30 13:53                                                   ` Mauro Carvalho Chehab
2012-04-30 13:53                                                     ` Mauro Carvalho Chehab
2012-04-30 15:02                                                     ` [PATCH v2] edac_mc: Cleanup per-dimm_info debug messages Mauro Carvalho Chehab
2012-04-30 15:10                                                       ` Mauro Carvalho Chehab
2012-04-30 15:20                                                         ` Borislav Petkov
2012-04-30 15:33                                                           ` Mauro Carvalho Chehab
2012-04-30 16:16                                                       ` Joe Perches
2012-04-30 16:47                                                         ` Mauro Carvalho Chehab
2012-04-30 16:44                                                       ` [PATCHv3] " Mauro Carvalho Chehab
2012-04-30 11:37                                         ` [PATCH EDACv16 1/2] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-04-30 11:37                                           ` Mauro Carvalho Chehab
2012-04-27 17:52                                 ` Mauro Carvalho Chehab
2012-04-27 17:52                                   ` Mauro Carvalho Chehab
2012-04-27 18:11                                   ` Luck, Tony
2012-04-27 19:24                                     ` Mauro Carvalho Chehab
2012-04-28  8:58                                       ` Borislav Petkov
2012-04-28  9:16                                   ` Borislav Petkov
2012-04-28  9:16                                     ` Borislav Petkov
2012-04-28 17:07                                     ` Joe Perches
2012-04-28 17:07                                       ` Joe Perches
2012-04-29 14:02                                       ` Mauro Carvalho Chehab
2012-04-29 14:02                                         ` Mauro Carvalho Chehab
2012-04-29 14:16                                     ` Mauro Carvalho Chehab
2012-04-29 14:16                                       ` Mauro Carvalho Chehab
2012-04-30  7:59                                       ` Borislav Petkov
2012-04-30  7:59                                         ` Borislav Petkov
2012-04-30 11:23                                         ` Mauro Carvalho Chehab
2012-04-30 11:23                                           ` Mauro Carvalho Chehab
2012-04-30 12:51                                           ` Borislav Petkov
2012-04-30 12:51                                             ` Borislav Petkov
2012-04-24 12:55             ` [EDAC PATCH v13 6/7] edac.h: Prepare to handle with generic layers Borislav Petkov
2012-04-24 13:11               ` Mauro Carvalho Chehab
2012-04-24 13:32                 ` Borislav Petkov
2012-04-24 14:24                   ` Mauro Carvalho Chehab
2012-04-24 16:27                     ` Borislav Petkov
2012-04-24 17:24                       ` Mauro Carvalho Chehab
2012-04-25 17:19                         ` Borislav Petkov
2012-04-25 17:47                           ` Mauro Carvalho Chehab
2012-04-25 18:32                             ` Luck, Tony
2012-04-25 18:44                               ` Mauro Carvalho Chehab
2012-04-26 14:11                             ` Borislav Petkov
2012-04-26 14:25                               ` Mauro Carvalho Chehab
2012-04-26 14:59                                 ` Mauro Carvalho Chehab
2012-04-25 17:55                           ` Luck, Tony
2012-04-24 17:31                       ` Luck, Tony
2012-04-16 20:12   ` [EDAC PATCH v13 7/7] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-04-18 18:22     ` [PATCH] " Mauro Carvalho Chehab
2012-04-16 20:21   ` [EDAC_ABI PATCH v13 00/26] Use the new EDAC kernel ABI on drivers Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-05-07 14:31       ` Borislav Petkov
2012-05-07 16:12         ` Mauro Carvalho Chehab
2012-05-07 16:17           ` Borislav Petkov
2012-05-07 16:59             ` Mauro Carvalho Chehab
2012-05-07 19:49               ` Borislav Petkov
2012-05-08 13:51                 ` [PATCH] edac: Change internal representation to work with layers Mauro Carvalho Chehab
2012-05-07 16:24           ` [EDAC_ABI PATCH v13 01/26] amd64_edac: convert driver to use the new edac ABI Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 02/26] amd76x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 03/26] cell_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 04/26] cpc925_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 05/26] e752x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 06/26] e7xxx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 07/26] i3000_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 08/26] i3200_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 09/26] i5000_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 10/26] i5100_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 11/26] i5400_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 12/26] i7300_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 13/26] i7core_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 14/26] i82443bxgx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 15/26] i82860_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 16/26] i82875p_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 17/26] i82975x_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 18/26] mpc85xx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 19/26] mv64x60_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 20/26] pasemi_edac: " Mauro Carvalho Chehab
2012-04-16 20:21       ` Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 21/26] ppc4xx_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 22/26] r82600_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 23/26] sb_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 24/26] tile_edac: " Mauro Carvalho Chehab
2012-04-26 19:47       ` Chris Metcalf
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 25/26] x38_edac: " Mauro Carvalho Chehab
2012-04-16 20:21     ` [EDAC_ABI PATCH v13 26/26] edac: Remove the legacy EDAC ABI Mauro Carvalho Chehab
