linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
@ 2016-07-22 23:44 Lukasz Odzioba
  2016-07-29  4:15 ` Odzioba, Lukasz
  2016-08-03  6:44 ` Borislav Petkov
  0 siblings, 2 replies; 11+ messages in thread
From: Lukasz Odzioba @ 2016-07-22 23:44 UTC (permalink / raw)
  To: linux-kernel, linux-edac, bp, dougthompson, mchehab
  Cc: tony.luck, hubert.chrzaniuk, lukasz.anaczkowski, lukasz.odzioba

On Intel Xeon Phi Knights Landing processor family the channels
of memory controller have untypical arrangement - MC0 is mapped to
CH3,4,5 and MC1 is mapped to CH0,1,2. This causes EDAC driver to
report the channel name incorrectly.

We missed this change earlier, so the code already contains
similar comment, but the translation function is incorrect.

Without this patch:
  errors in DIMM_A and DIMM_D were reported in DIMM_D
  errors in DIMM_B and DIMM_E were reported in DIMM_E
  errors in DIMM_C and DIMM_F were reported in DIMM_F

Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
---
 drivers/edac/sb_edac.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 6744d88..61e2c52 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -552,9 +552,9 @@ static const struct pci_id_table pci_dev_descr_haswell_table[] = {
 /* Knight's Landing Support */
 /*
  * KNL's memory channels are swizzled between memory controllers.
- * MC0 is mapped to CH3,5,6 and MC1 is mapped to CH0,1,2
+ * MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2
  */
-#define knl_channel_remap(channel) ((channel + 3) % 6)
+#define knl_channel_remap(mc, chan) (mc ? chan : chan + 3)
 
 /* Memory controller, TAD tables, error injection - 2-8-0, 2-9-0 (2 of these) */
 #define PCI_DEVICE_ID_INTEL_KNL_IMC_MC       0x7840
@@ -1286,7 +1286,7 @@ static u32 knl_get_mc_route(int entry, u32 reg)
 	mc = GET_BITFIELD(reg, entry*3, (entry*3)+2);
 	chan = GET_BITFIELD(reg, (entry*2) + 18, (entry*2) + 18 + 1);
 
-	return knl_channel_remap(mc*3 + chan);
+	return knl_channel_remap(mc, chan);
 }
 
 /*
@@ -3003,9 +3003,16 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 				mscod, errcode,
 				m->bank);
 		} else {
-			char A = *("A");
+			char mc, A = *("A");
 
-			channel = knl_channel_remap(channel);
+			/*
+			 * Reported channel is in range 0-2, so we can't map it
+			 * back to mc. To figure out mc we check machine check
+			 * bank register that reported this error.
+			 * bank15 means mc0 and bank16 means mc1.
+			 */
+			mc = m->bank == 16;
+			channel = knl_channel_remap(mc, channel);
 			channel_mask = 1 << channel;
 			snprintf(msg, sizeof(msg),
 				"%s%s err_code:%04x:%04x channel:%d (DIMM_%c)",
-- 
1.8.3.1

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-07-22 23:44 [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing Lukasz Odzioba
@ 2016-07-29  4:15 ` Odzioba, Lukasz
  2016-08-03  6:44 ` Borislav Petkov
  1 sibling, 0 replies; 11+ messages in thread
From: Odzioba, Lukasz @ 2016-07-29  4:15 UTC (permalink / raw)
  To: linux-kernel, linux-edac, bp, dougthompson, mchehab
  Cc: Luck, Tony, Chrzaniuk, Hubert, Anaczkowski, Lukasz

On Saturday, July 23, 2016 1:45 AM, Lukasz Odzioba wrote:
> On Intel Xeon Phi Knights Landing processor family the channels
> of memory controller have untypical arrangement - MC0 is mapped to
> CH3,4,5 and MC1 is mapped to CH0,1,2. This causes EDAC driver to
> report the channel name incorrectly.
>
> We missed this change earlier, so the code already contains
> similar comment, but the translation function is incorrect.
>
> Without this patch:
>  errors in DIMM_A and DIMM_D were reported in DIMM_D
>  errors in DIMM_B and DIMM_E were reported in DIMM_E
>  errors in DIMM_C and DIMM_F were reported in DIMM_F
>
> Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")

Hi,
I would greatly appreciate it if you kindly give me some feedback on this patch.

Thanks,
Lukas

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-07-22 23:44 [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing Lukasz Odzioba
  2016-07-29  4:15 ` Odzioba, Lukasz
@ 2016-08-03  6:44 ` Borislav Petkov
  2016-08-03  7:06   ` Chrzaniuk, Hubert
  1 sibling, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-08-03  6:44 UTC (permalink / raw)
  To: Lukasz Odzioba
  Cc: linux-kernel, linux-edac, dougthompson, mchehab, tony.luck,
	hubert.chrzaniuk, lukasz.anaczkowski

On Sat, Jul 23, 2016 at 01:44:49AM +0200, Lukasz Odzioba wrote:
> On Intel Xeon Phi Knights Landing processor family the channels
> of memory controller have untypical arrangement - MC0 is mapped to
> CH3,4,5 and MC1 is mapped to CH0,1,2. This causes EDAC driver to
> report the channel name incorrectly.
> 
> We missed this change earlier, so the code already contains
> similar comment, but the translation function is incorrect.
> 
> Without this patch:
>   errors in DIMM_A and DIMM_D were reported in DIMM_D
>   errors in DIMM_B and DIMM_E were reported in DIMM_E
>   errors in DIMM_C and DIMM_F were reported in DIMM_F
> 
> Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
> Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
> Signed-off-by: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>

What is that SOB supposed to mean?

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  6:44 ` Borislav Petkov
@ 2016-08-03  7:06   ` Chrzaniuk, Hubert
  2016-08-03  7:12     ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Chrzaniuk, Hubert @ 2016-08-03  7:06 UTC (permalink / raw)
  To: Borislav Petkov, Odzioba, Lukasz
  Cc: linux-kernel, linux-edac, dougthompson, mchehab, Luck, Tony,
	Anaczkowski, Lukasz

From: Borislav Petkov [mailto:bp@alien8.de] 
> What is that SOB supposed to mean?

The original patch has been prepared by Lukas, then I [Hubert] rebased it, added commentary and cleaned up the code.
In the end, Lukas modified it a bit once again according to suggestions from Tony. My SOB is unnecessary here, it only makes it more confusing.

Hubert

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  7:06   ` Chrzaniuk, Hubert
@ 2016-08-03  7:12     ` Borislav Petkov
  2016-08-03  7:44       ` Chrzaniuk, Hubert
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-08-03  7:12 UTC (permalink / raw)
  To: Chrzaniuk, Hubert
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 07:06:56AM +0000, Chrzaniuk, Hubert wrote:
> The original patch has been prepared by Lukas, then I [Hubert] rebased
> it, added commentary and cleaned up the code. In the end, Lukas
> modified it a bit once again according to suggestions from Tony. My
> SOB is unnecessary here, it only makes it more confusing.

So aspects like that can (and should!) be documented in the commit
message in free text, before the SOB chain. Something like this, for
example:

"...

Hubert:
 - rebased to 4.8
 - comments and code cleanup
"

You can send me just the exact free text you want in there as a reply to
this message and I'll integrate it.

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  7:12     ` Borislav Petkov
@ 2016-08-03  7:44       ` Chrzaniuk, Hubert
  2016-08-03  7:58         ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Chrzaniuk, Hubert @ 2016-08-03  7:44 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 07:06:56AM +0000, Chrzaniuk, Hubert wrote:
>> The original patch has been prepared by Lukas, then I [Hubert] rebased 
>> it, added commentary and cleaned up the code. In the end, Lukas 
>> modified it a bit once again according to suggestions from Tony. My 
>> SOB is unnecessary here, it only makes it more confusing.

> So aspects like that can (and should!) be documented in the commit message in free text, before the SOB chain. Something like this, for
> example:
> "...
> Hubert:
>  - rebased to 4.8
>  - comments and code cleanup
> "
> You can send me just the exact free text you want in there as a reply to this message and I'll integrate it.

What you have suggested is simple and perfect, let’s leave it this way :) 

Hubert:
 - rebased to 4.8
 - comments and code cleanup

Thanks a lot.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  7:44       ` Chrzaniuk, Hubert
@ 2016-08-03  7:58         ` Borislav Petkov
  2016-08-03  8:54           ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-08-03  7:58 UTC (permalink / raw)
  To: Chrzaniuk, Hubert
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 07:44:22AM +0000, Chrzaniuk, Hubert wrote:
> What you have suggested is simple and perfect, let’s leave it this way :) 

"Simple" is my speciality! :-)))

Thanks.

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  7:58         ` Borislav Petkov
@ 2016-08-03  8:54           ` Borislav Petkov
  2016-08-03  9:25             ` Chrzaniuk, Hubert
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-08-03  8:54 UTC (permalink / raw)
  To: Chrzaniuk, Hubert
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 09:58:38AM +0200, Borislav Petkov wrote:
> "Simple" is my speciality! :-)))

...and so I did simplify it a bit more. Here's the final result:

---
From: Lukasz Odzioba <lukasz.odzioba@intel.com>
Date: Sat, 23 Jul 2016 01:44:49 +0200
Subject: [PATCH] EDAC, sb_edac: Fix channel reporting on Knights Landing

On Intel Xeon Phi Knights Landing processor family the channels of the
memory controller have untypical arrangement - MC0 is mapped to CH3,4,5
and MC1 is mapped to CH0,1,2. This causes the EDAC driver to report the
channel name incorrectly.

We missed this change earlier, so the code already contains similar
comment, but the translation function is incorrect.

Without this patch:
  errors in DIMM_A and DIMM_D were reported in DIMM_D
  errors in DIMM_B and DIMM_E were reported in DIMM_E
  errors in DIMM_C and DIMM_F were reported in DIMM_F

Correct this.

Hubert Chrzaniuk:
 - rebased to 4.8
 - comments and code cleanup

Fixes: d0cdf9003140 ("sb_edac: Add Knights Landing (Xeon Phi gen 2) support")
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: Mauro Carvalho Chehab <mchehab@kernel.org>
Cc: Hubert Chrzaniuk <hubert.chrzaniuk@intel.com>
Cc: linux-edac <linux-edac@vger.kernel.org>
Cc: lukasz.anaczkowski@intel.com
Cc: lukasz.odzioba@intel.com
Cc: mchehab@kernel.org
Cc: <stable@vger.kernel.org> # v4.5..
Link: http://lkml.kernel.org/r/1469231089-22837-1-git-send-email-lukasz.odzioba@intel.com
Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
[ Boris: Simplify a bit by removing char mc. ]
Signed-off-by: Borislav Petkov <bp@suse.de>
---
 drivers/edac/sb_edac.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/drivers/edac/sb_edac.c b/drivers/edac/sb_edac.c
index 4fb2eb7c800d..ce0067b7a2f6 100644
--- a/drivers/edac/sb_edac.c
+++ b/drivers/edac/sb_edac.c
@@ -552,9 +552,9 @@ static const struct pci_id_table pci_dev_descr_haswell_table[] = {
 /* Knight's Landing Support */
 /*
  * KNL's memory channels are swizzled between memory controllers.
- * MC0 is mapped to CH3,5,6 and MC1 is mapped to CH0,1,2
+ * MC0 is mapped to CH3,4,5 and MC1 is mapped to CH0,1,2
  */
-#define knl_channel_remap(channel) ((channel + 3) % 6)
+#define knl_channel_remap(mc, chan) ((mc) ? (chan) : (chan) + 3)
 
 /* Memory controller, TAD tables, error injection - 2-8-0, 2-9-0 (2 of these) */
 #define PCI_DEVICE_ID_INTEL_KNL_IMC_MC       0x7840
@@ -1286,7 +1286,7 @@ static u32 knl_get_mc_route(int entry, u32 reg)
 	mc = GET_BITFIELD(reg, entry*3, (entry*3)+2);
 	chan = GET_BITFIELD(reg, (entry*2) + 18, (entry*2) + 18 + 1);
 
-	return knl_channel_remap(mc*3 + chan);
+	return knl_channel_remap(mc, chan);
 }
 
 /*
@@ -2997,8 +2997,15 @@ static void sbridge_mce_output_error(struct mem_ctl_info *mci,
 		} else {
 			char A = *("A");
 
-			channel = knl_channel_remap(channel);
+			/*
+			 * Reported channel is in range 0-2, so we can't map it
+			 * back to mc. To figure out mc we check machine check
+			 * bank register that reported this error.
+			 * bank15 means mc0 and bank16 means mc1.
+			 */
+			channel = knl_channel_remap(m->bank == 16, channel);
 			channel_mask = 1 << channel;
+
 			snprintf(msg, sizeof(msg),
 				"%s%s err_code:%04x:%04x channel:%d (DIMM_%c)",
 				overflow ? " OVERFLOW" : "",
-- 
2.8.4

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  8:54           ` Borislav Petkov
@ 2016-08-03  9:25             ` Chrzaniuk, Hubert
  2016-08-03  9:31               ` Borislav Petkov
  0 siblings, 1 reply; 11+ messages in thread
From: Chrzaniuk, Hubert @ 2016-08-03  9:25 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wednesday, August 3, 2016 10:55 AM, Borislav Petkov wrote: 
> ...and so I did simplify it a bit more. [..] 

Thanks a lot, next time I am gonna do it the right way from the begging :)

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  9:25             ` Chrzaniuk, Hubert
@ 2016-08-03  9:31               ` Borislav Petkov
  2016-08-03 10:23                 ` Chrzaniuk, Hubert
  0 siblings, 1 reply; 11+ messages in thread
From: Borislav Petkov @ 2016-08-03  9:31 UTC (permalink / raw)
  To: Chrzaniuk, Hubert
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 09:25:13AM +0000, Chrzaniuk, Hubert wrote:
> Thanks a lot, next time I am gonna do it the right way from the begging :)

Does "next time" mean you guys will be introducing more bugs to sb_edac?

:-))))

-- 
Regards/Gruss,
    Boris.

ECO tip #101: Trim your mails when you reply.
--

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing
  2016-08-03  9:31               ` Borislav Petkov
@ 2016-08-03 10:23                 ` Chrzaniuk, Hubert
  0 siblings, 0 replies; 11+ messages in thread
From: Chrzaniuk, Hubert @ 2016-08-03 10:23 UTC (permalink / raw)
  To: Borislav Petkov
  Cc: Odzioba, Lukasz, linux-kernel, linux-edac, dougthompson, mchehab,
	Luck, Tony, Anaczkowski, Lukasz

On Wed, Aug 03, 2016 at 09:25:13AM +0000, Chrzaniuk, Hubert wrote:
>> Thanks a lot, next time I am gonna do it the right way from the begging :)
> Does "next time" mean you guys will be introducing more bugs to sb_edac?
> :-))))

I try to maintain my job security in other ways, introducing bugs on purpose would be short-sighted :D

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2016-08-03 13:51 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2016-07-22 23:44 [PATCH 1/1] EDAC, sb_edac: Fix channel reporting on Knights Landing Lukasz Odzioba
2016-07-29  4:15 ` Odzioba, Lukasz
2016-08-03  6:44 ` Borislav Petkov
2016-08-03  7:06   ` Chrzaniuk, Hubert
2016-08-03  7:12     ` Borislav Petkov
2016-08-03  7:44       ` Chrzaniuk, Hubert
2016-08-03  7:58         ` Borislav Petkov
2016-08-03  8:54           ` Borislav Petkov
2016-08-03  9:25             ` Chrzaniuk, Hubert
2016-08-03  9:31               ` Borislav Petkov
2016-08-03 10:23                 ` Chrzaniuk, Hubert

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).