Linux-EDAC Archive on lore.kernel.org
From: Borislav Petkov <bp@alien8.de>
To: "Ghannam, Yazen" <Yazen.Ghannam@amd.com>
Cc: "linux-edac@vger.kernel.org" <linux-edac@vger.kernel.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH v3 0/5] AMD64 EDAC: Check for nodes without memory, etc.
Date: Thu, 7 Nov 2019 11:38:57 +0100
Message-ID: <20191107103857.GC19501@zn.tnic>
In-Reply-To: <20191106195417.GF28380@zn.tnic>

On Wed, Nov 06, 2019 at 08:54:17PM +0100, Borislav Petkov wrote:
> which are also two attempts.
> 
> Anyway, I'll queue your set and I'll try to debug that thing because it
> is getting on my nerves slowly...

Yah, the problem is that because we have:

MODULE_DEVICE_TABLE(x86cpu, amd64_cpuids);

it gets tried on each CPU because an uevent gets dispatched for each
device, and each CPU is a device.

That's why I see it twice on this box - it has two CPUs.
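
A rough userspace sketch of that effect, not kernel code: it assumes one
uevent per CPU device, one load attempt per uevent while the module is
not loaded, and an init function that fails when ECC is off. The names
fake_amd64_edac_init() and simulate_cpu_uevents() are made up for
illustration and do not exist in the kernel.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for the driver's init path: fails when ECC is off. */
static int fake_amd64_edac_init(bool ecc_enabled, int *init_calls)
{
	(*init_calls)++;
	return ecc_enabled ? 0 : -ENODEV;
}

/*
 * Hypothetical model of the uevent flow: one uevent per CPU device, and
 * each one leads to a load attempt for as long as the module is not
 * loaded. Returns the number of init attempts that were made.
 */
static int simulate_cpu_uevents(int nr_cpus, bool ecc_enabled)
{
	int init_calls = 0;
	bool loaded = false;
	int cpu;

	for (cpu = 0; cpu < nr_cpus; cpu++) {
		if (loaded)
			continue;	/* module already present, nothing to retry */
		if (!fake_amd64_edac_init(ecc_enabled, &init_calls))
			loaded = true;
	}

	return init_calls;
}
```

Under these assumptions, a two-CPU box with ECC disabled sees two failed
attempts, while a box with ECC enabled sees a single successful one.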

And Greg says that making it attempt only once per system can't be done,
unless we start doing hacks like sending uevents only for the BSP, which
is too much. Or we could remember the previous return value of the module
init function in edac_core, but that's nasty too.
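
For illustration only, the "remember the previous return value" idea
could look roughly like the userspace sketch below. It assumes the cached
state lives somewhere that survives the failed module load (edac_core in
the text above); struct init_cache, probe_ecc() and cached_init() are
hypothetical names, not existing kernel interfaces.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical state that would have to live outside the failing module. */
struct init_cache {
	bool valid;	/* true once an init attempt has been recorded */
	int ret;	/* return value of that attempt */
	int probes;	/* how many times the real probe actually ran */
};

/* Hypothetical stand-in for the hardware check done at init time. */
static int probe_ecc(bool ecc_enabled)
{
	return ecc_enabled ? 0 : -ENODEV;
}

/* Run the probe once; later attempts reuse a recorded failure. */
static int cached_init(struct init_cache *c, bool ecc_enabled)
{
	if (c->valid && c->ret)
		return c->ret;	/* previous attempt failed: fail again quietly */

	c->probes++;
	c->ret = probe_ecc(ecc_enabled);
	c->valid = true;

	return c->ret;
}
```

With this shape, repeated load attempts on a box without ECC still fail,
but the probe and its messages would only run for the first one.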

I'm thinking we should simply kill this fat ecc_msg thing which is not
very useful and be done with it:

[    5.697275] EDAC MC: Ver: 3.0.0
[    5.909530] EDAC amd64: F10h detected (node 0).
[    6.345231] EDAC amd64: Node 0: DRAM ECC disabled.
[    6.370815] EDAC amd64: F10h detected (node 0).
[    6.370929] EDAC amd64: Node 0: DRAM ECC disabled.

That's probably still a bit annoying on a large machine but better than
nothing.

---
diff --git a/drivers/edac/amd64_edac.c b/drivers/edac/amd64_edac.c
index 3aeb5173e200..0738237e3f09 100644
--- a/drivers/edac/amd64_edac.c
+++ b/drivers/edac/amd64_edac.c
@@ -3188,18 +3188,6 @@ static void restore_ecc_error_reporting(struct ecc_settings *s, u16 nid,
 		amd64_warn("Error restoring NB MCGCTL settings!\n");
 }
 
-/*
- * EDAC requires that the BIOS have ECC enabled before
- * taking over the processing of ECC errors. A command line
- * option allows to force-enable hardware ECC later in
- * enable_ecc_error_reporting().
- */
-static const char *ecc_msg =
-	"ECC disabled in the BIOS or no ECC capability, module will not load.\n"
-	" Either enable ECC checking or force module loading by setting "
-	"'ecc_enable_override'.\n"
-	" (Note that use of the override may cause unknown side effects.)\n";
-
 static bool ecc_enabled(struct amd64_pvt *pvt)
 {
 	u16 nid = pvt->mc_node_id;
@@ -3246,11 +3234,10 @@ static bool ecc_enabled(struct amd64_pvt *pvt)
 	amd64_info("Node %d: DRAM ECC %s.\n",
 		   nid, (ecc_en ? "enabled" : "disabled"));
 
-	if (!ecc_en || !nb_mce_en) {
-		amd64_info("%s", ecc_msg);
+	if (!ecc_en || !nb_mce_en)
 		return false;
-	}
-	return true;
+	else
+		return true;
 }
 
 static inline void

-- 
Regards/Gruss,
    Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Thread overview: 16+ messages
2019-11-06  1:24 Ghannam, Yazen
2019-11-06  1:24 ` [PATCH v3 1/5] EDAC/amd64: Make struct amd64_family_type global Ghannam, Yazen
2019-11-06  1:25 ` [PATCH v3 2/5] EDAC/amd64: Gather hardware information early Ghannam, Yazen
2019-11-06  1:25 ` [PATCH v3 3/5] EDAC/amd64: Save max number of controllers to family type Ghannam, Yazen
2019-11-06  1:25 ` [PATCH v3 4/5] EDAC/amd64: Use cached data when checking for ECC Ghannam, Yazen
2019-11-06  1:25 ` [PATCH v3 5/5] EDAC/amd64: Check for memory before fully initializing an instance Ghannam, Yazen
2019-11-06 16:06 ` [PATCH v3 0/5] AMD64 EDAC: Check for nodes without memory, etc Borislav Petkov
2019-11-06 18:16   ` Ghannam, Yazen
2019-11-06 19:54     ` Borislav Petkov
2019-11-07 10:38       ` Borislav Petkov [this message]
2019-11-07 13:47         ` Ghannam, Yazen
2019-11-07 15:40           ` Borislav Petkov
2019-11-07 19:20             ` Ghannam, Yazen
2019-11-07 19:34               ` Borislav Petkov
2019-11-07 19:41                 ` Ghannam, Yazen
2019-11-09  9:08                   ` [PATCH] EDAC/amd64: Get rid of the ECC disabled long message Borislav Petkov
